Binary Tree

F

Foodbank

Hi all,

I'm trying to do a binary search and collect some stats from a text
file in order to compare the processing times of this program (binary
searching) versus an old program using linked lists. I'm totally new
to binary searches by the way. Can anyone help me with the commented
sections below? Much of the code such as functions and printfs has
already been completed. Any help is greatly appreciated.

Thanks,
James

Code:
#include <stdio.h>
#include <malloc.h>
struct tnode {			// specify the "shape" of a tnode structure ...
	struct tnode *left;	// the left and right branch pointers
	struct tnode *right;
	int count;		// the word count as before
	char *word;		// a pointer to the word
} *root;			// declare the root pointer variable

struct tnode **tree_search(struct tnode **, char *);
void tree_stats(struct tnode *);
int get_word(char *);

int total_nodes, total_words, high;
struct tnode *most_frequent;

int main(int argc, char *argv[]) {
	struct tnode **tpp;
	char word_buff[100];	// the reusable word buffer
	int i;
	while(get_word(word_buff)) {
		tpp = tree_search(&root, word_buff);
		/****CODE TO ADD NEW NODES AND COUNT REPEATS *****/






	}
	tree_stats(root);
	printf("total_nodes %d\n", total_nodes);
	printf("total_words %d\n", total_words);
	if(most_frequent)
		printf("most frequent word <%s> count is %d\n",
			most_frequent->word, most_frequent->count);
	for(i = 1; i < argc; i++) {
		tpp = tree_search(&root, argv[i]);
		if((*tpp) == NULL)
			printf("%s is NOT in the tree\n", argv[i]);
		else
			printf("<%s> count is %d\n", argv[i], (*tpp)->count);
	}
	return(0);
}

// binary tree search returning a pointer to the pointer leading to a
hit
struct tnode **tree_search(struct tnode **tpp, char *w) {
	/***** CODE TO DO THE BINARY SRCH *****/
	return(tpp);
}

// gather stats to check tree correctness
void tree_stats(struct tnode *tp) {
	/***** RECURSIVE CODE TO COLLECT ALL OF THE STATISTICS *****/





}

#include <ctype.h>
/* Leave this routine EXACTLY as it stands */
int get_word(char *s) {
	int c;
	do {
		c = getchar();
		if(c == EOF)
			return(0);
	} while(!isalpha(c) && !isdigit(c));
	do {
		if(isupper(c))
			c = tolower(c);
		*s++ = c;
		c = getchar();
	} while(isalpha(c) || isdigit(c));
	*s = 0;
	return(1);
}
 
K

kleuske

Foodbank schreef:
Hi all,

I'm trying to do a binary search and collect some stats from a text
file in order to compare the processing times of this program (binary
searching) versus an old program using linked lists.

A "binary search" does not involve any trees, it's a way of searching
through *sorted* sequential lists.
I'm totally new
to binary searches by the way. Can anyone help me with the commented
sections below? Much of the code such as functions and printfs has
already been completed.

Yes. All but the actual implementation of the tree-building and
searching algo's, which are left to the student. I bet that's what the
teacher gave you as an assignment.
Any help is greatly appreciated.
/****CODE TO ADD NEW NODES AND COUNT REPEATS *****/
/***** CODE TO DO THE BINARY SRCH *****/
/***** RECURSIVE CODE TO COLLECT ALL OF THE STATISTICS *****/

Do your own homework, or at least let us see you trying.
 
M

mensanator

Foodbank said:
Hi all,

I'm trying to do a binary search and collect some stats from a text
file in order to compare the processing times of this program (binary
searching) versus an old program using linked lists. I'm totally new
to binary searches by the way. Can anyone help me with the commented
sections below?

Here's a couple links to how binary trees work. It's about numbers
in a Collatz sequence instead of words but the same principles apply.
In fact, I learned how to do this from the O'Reilly book "Practical
C Programming" where the example processes a list of words. The
solution
I talk about uses a three pointer structure that allows me to create
a binary tree that is simultaneously a linked list. Ignore that third
pointer feature if it's not applicable to your problem.

This is a quickie tutorial on binary trees

<http://groups.google.com/group/Dyma...6be54/cb42098abc4cf84e?hl=en#cb42098abc4cf84e>

And this is how a binary tree is applicable to the problem of
finding a duplicate number in a Collatz sequence.

<http://groups.google.com/group/Dyma...fb30c/6f35e1f30e0a3b43?hl=en#6f35e1f30e0a3b43>

If none of this makes any sense, I can dig up my code examples,
but right now they are coded to use GMP unlimited precision integers
and won't directly plug into your program, but the principles should
apply.
Much of the code such as functions and printfs has
already been completed. Any help is greatly appreciated.

Thanks,
James

Code:
#include <stdio.h>
#include <malloc.h>
struct tnode {			// specify the "shape" of a tnode structure ...
	struct tnode *left;	// the left and right branch pointers
	struct tnode *right;
	int count;		// the word count as before
	char *word;		// a pointer to the word
} *root;			// declare the root pointer variable

struct tnode **tree_search(struct tnode **, char *);
void tree_stats(struct tnode *);
int get_word(char *);

int total_nodes, total_words, high;
struct tnode *most_frequent;

int main(int argc, char *argv[]) {
	struct tnode **tpp;
	char word_buff[100];	// the reusable word buffer
	int i;
	while(get_word(word_buff)) {
		tpp = tree_search(&root, word_buff);
		/****CODE TO ADD NEW NODES AND COUNT REPEATS *****/






	}
	tree_stats(root);
	printf("total_nodes %d\n", total_nodes);
	printf("total_words %d\n", total_words);
	if(most_frequent)
		printf("most frequent word <%s> count is %d\n",
			most_frequent->word, most_frequent->count);
	for(i = 1; i < argc; i++) {
		tpp = tree_search(&root, argv[i]);
		if((*tpp) == NULL)
			printf("%s is NOT in the tree\n", argv[i]);
		else
			printf("<%s> count is %d\n", argv[i], (*tpp)->count);
	}
	return(0);
}

// binary tree search returning a pointer to the pointer leading to a
hit
struct tnode **tree_search(struct tnode **tpp, char *w) {
	/***** CODE TO DO THE BINARY SRCH *****/
	return(tpp);
}

// gather stats to check tree correctness
void tree_stats(struct tnode *tp) {
	/***** RECURSIVE CODE TO COLLECT ALL OF THE STATISTICS *****/





}

#include <ctype.h>
/* Leave this routine EXACTLY as it stands */
int get_word(char *s) {
	int c;
	do {
		c = getchar();
		if(c == EOF)
			return(0);
	} while(!isalpha(c) && !isdigit(c));
	do {
		if(isupper(c))
			c = tolower(c);
		*s++ = c;
		c = getchar();
	} while(isalpha(c) || isdigit(c));
	*s = 0;
	return(1);
}
 
M

mensanator

Here's a couple links to how binary trees work. It's about numbers
in a Collatz sequence instead of words but the same principles apply.
In fact, I learned how to do this from the O'Reilly book "Practical
C Programming" where the example processes a list of words. The
solution
I talk about uses a three pointer structure that allows me to create
a binary tree that is simultaneously a linked list. Ignore that third
pointer feature if it's not applicable to your problem.

This is a quickie tutorial on binary trees

<http://groups.google.com/group/Dyma...6be54/cb42098abc4cf84e?hl=en#cb42098abc4cf84e>

And this is how a binary tree is applicable to the problem of
finding a duplicate number in a Collatz sequence.

<http://groups.google.com/group/Dyma...fb30c/6f35e1f30e0a3b43?hl=en#6f35e1f30e0a3b43>

These links take you to the end of the threads. I meant for you to read
the
first message in each thread (the others drift away from the subject
although
you're certainly welcome to read them).

If none of this makes any sense, I can dig up my code examples,
but right now they are coded to use GMP unlimited precision integers
and won't directly plug into your program, but the principles should
apply.
Much of the code such as functions and printfs has
already been completed. Any help is greatly appreciated.

Thanks,
James

Code:
#include <stdio.h>
#include <malloc.h>
struct tnode {			// specify the "shape" of a tnode structure ...
	struct tnode *left;	// the left and right branch pointers
	struct tnode *right;
	int count;		// the word count as before
	char *word;		// a pointer to the word
} *root;			// declare the root pointer variable

struct tnode **tree_search(struct tnode **, char *);
void tree_stats(struct tnode *);
int get_word(char *);

int total_nodes, total_words, high;
struct tnode *most_frequent;

int main(int argc, char *argv[]) {
	struct tnode **tpp;
	char word_buff[100];	// the reusable word buffer
	int i;
	while(get_word(word_buff)) {
		tpp = tree_search(&root, word_buff);
		/****CODE TO ADD NEW NODES AND COUNT REPEATS *****/






	}
	tree_stats(root);
	printf("total_nodes %d\n", total_nodes);
	printf("total_words %d\n", total_words);
	if(most_frequent)
		printf("most frequent word <%s> count is %d\n",
			most_frequent->word, most_frequent->count);
	for(i = 1; i < argc; i++) {
		tpp = tree_search(&root, argv[i]);
		if((*tpp) == NULL)
			printf("%s is NOT in the tree\n", argv[i]);
		else
			printf("<%s> count is %d\n", argv[i], (*tpp)->count);
	}
	return(0);
}

// binary tree search returning a pointer to the pointer leading to a
hit
struct tnode **tree_search(struct tnode **tpp, char *w) {
	/***** CODE TO DO THE BINARY SRCH *****/
	return(tpp);
}

// gather stats to check tree correctness
void tree_stats(struct tnode *tp) {
	/***** RECURSIVE CODE TO COLLECT ALL OF THE STATISTICS *****/





}

#include <ctype.h>
/* Leave this routine EXACTLY as it stands */
int get_word(char *s) {
	int c;
	do {
		c = getchar();
		if(c == EOF)
			return(0);
	} while(!isalpha(c) && !isdigit(c));
	do {
		if(isupper(c))
			c = tolower(c);
		*s++ = c;
		c = getchar();
	} while(isalpha(c) || isdigit(c));
	*s = 0;
	return(1);
}
 
F

Foodbank

Thanks guys for the links. That made me laugh when you said this was
homework. I wish it were and I was still in school. However, I'm
self-teaching myself C in order to switch from my electrical
engineering based job (my degree in) into more of a programming based
job (I know, I have a long way to go, haha). The program is an
exercise from a book I'm reading.

Thanks,
Jams
 
M

Malcolm

Foodbank said:
I'm trying to do a binary search and collect some stats from a text
file in order to compare the processing times of this program (binary
searching) versus an old program using linked lists. I'm totally new
to binary searches by the way. Can anyone help me with the commented
sections below? Much of the code such as functions and printfs has
already been completed. Any help is greatly appreciated.
The idea of a binary search is this.

We have a telephone directory, and want to find an arbitrary name - Gubbins.
Open the directory in the middle. The middle entry will be a name like
"Marks". If it is equal to "Gubbins" you have found your entry, if not, you
know the name must be somewhere between the start of the directory and
"Marks". Now go a quarter way through. This name is "Henry". Still above
"Gubbins", so we know "Gubbins" is in the top quarter. When we check an
eighth of the way through, we find "Evans". So Gubbins is in the second
eighth of the directory. Split on the third sixteenth and you get "Groves".
Gubbins is in the fourth sixteenth, between "Groves" and "Henry". Continue
doing this until you have narrowed to a single entry.

You will see that you eliminate half the possibilities each time, so you
need log2(N) operations to complete the search. This is much faster than
going through the directory sequentially.

Unfortunately this algorithm only works if the directory entries are held in
a sorted array. If we want to insert and delete entries, we have to move the
array about to keep it sorted, which is expensive.

The solution is to use a tree structure. We select an arbitrary entry about
halfway along as the root. All the entries lower than it go in the left
sub-tree, all the entries higher go in the right sub-tree. leaves have null
children. We can easily add new members by growing new leaves, without
rebuilding the whole tree. (The tree doesn't have to be perfectly balanced
for the algorithm to work in reasonable time. Keeping the tree balanced is a
whole new story we won't go into).

The first thing to do is to write the routine that builds the tree. The
first word you encounter goes in the root, the second becomes the right or
the left leaf, and you just continue until you run out of leaves.
Test the tree on a tiny dataset with only half a dozen words, and print it
out so that you can see it is correct.

Then write the function to search the tree, starting at the top and going
down until it either hits a match or encounters a null pointer indicating it
has fallen off a leaf.
 
F

Foodbank

Hi,

Am I on the right path in the section of the code that adds new nodes?
I'm not finished, but trying to make an attempt. It's under the
/****CODE TO ADD NEW NODES AND COUNT REPEATS *****/ section. I'm
pretty sure that it's supposed to check if it's null and if so,
allocates memory for a new node. Syntax is probably wrong, but was
hoping for some input.

Thanks,
James

Code:
#include <stdio.h>
#include <malloc.h>
struct tnode {                  // specify the "shape" of a tnode
structure ...
        struct tnode *left;     // the left and right branch pointers
        struct tnode *right;
        int count;              // the word count as before
        char *word;             // a pointer to the word

} *root;                        // declare the root pointer variable

struct tnode **tree_search(struct tnode **, char *);
void tree_stats(struct tnode *);
int get_word(char *);

int total_nodes, total_words, high;
struct tnode *most_frequent;


int main(int argc, char *argv[]) {
        struct tnode **tpp;
        char word_buff[100];    // the reusable word buffer
        int i;
        while(get_word(word_buff)) {
                tpp = tree_search(&root, word_buff);
                /****CODE TO ADD NEW NODES AND COUNT REPEATS *****/
               ///new code below here
               if(root==NULL)
			if(*tpp==NULL){
				tpp=malloc(sizeof(struct tnode));
				tpp->word = strdup(word_buff);
				tpp->*left = NULL;
				tpp->*right = NULL;
			}
		else statement here if there's a node there, increments count I
think, not sure which variables to use
			
        } 
[code]
 
M

mensanator

Foodbank said:
Hi,

Am I on the right path in the section of the code that adds new nodes?
I'm not finished, but trying to make an attempt. It's under the
/****CODE TO ADD NEW NODES AND COUNT REPEATS *****/ section. I'm
pretty sure that it's supposed to check if it's null and if so,
allocates memory for a new node. Syntax is probably wrong, but was
hoping for some input.

I've include my version of the example binary tree program.
It simply builds a tree from the command line arguments.

A key difference is a recursive function to enter a word
onto the tree. It starts at the root and follows left/right
pointers until it either finds the word already on the tree
or finds a NULL. Calling enter() with a NULL will add a node
with count=1 since adding a node means this is the first
occurence.

When the word is found to be already on the tree, we just need
to increment the count.

Also in the program are a pair of print routines, one to print
all the words in the tree with their frequency and another
that shows how the path through the binary tree.

And you will, of course, want to add a routine to free the
allocated memory.

Sample output:

$ ./a the ice was here the ice was there the ice was all around
it cracked and groaned and shrieked and howled like noises in a
swound

a (1) all (1) and (3) around (1) cracked (1) groaned (1)
here (1) howled (1) ice (3) in (1) it (1) like (1) noises (1)
shrieked (1) swound (1) the (3) there (1) was (3)

L L L L L [a - 1]
R U [all - 1]
R L L [and - 3]
R U [around - 1]
R L [cracked - 1]
R L [groaned - 1]
R U U U U [here - 1]
R L [howled - 1]
R U U [ice - 3]
R L L [in - 1]
R U [it - 1]
R L L [like - 1]
R L [noises - 1]
R U U [shrieked - 1]
R L [swound - 1]
R U U U U [the - 3]
R L L [there - 1]
R U [was - 3]
R U U
Thanks,
James

Code:
#include <stdio.h>
#include <malloc.h>
struct tnode {                  // specify the "shape" of a tnode
structure ...
struct tnode *left;     // the left and right branch pointers
struct tnode *right;
int count;              // the word count as before
char *word;             // a pointer to the word

} *root;                        // declare the root pointer variable

struct tnode **tree_search(struct tnode **, char *);
void tree_stats(struct tnode *);
int get_word(char *);

int total_nodes, total_words, high;
struct tnode *most_frequent;


int main(int argc, char *argv[]) {
struct tnode **tpp;
char word_buff[100];    // the reusable word buffer
int i;
while(get_word(word_buff)) {
tpp = tree_search(&root, word_buff);
/****CODE TO ADD NEW NODES AND COUNT REPEATS *****/
///new code below here
if(root==NULL)
			if(*tpp==NULL){
				tpp=malloc(sizeof(struct tnode));
				tpp->word = strdup(word_buff);
				tpp->*left = NULL;
				tpp->*right = NULL;
			}
		else statement here if there's a node there, increments count I
think, not sure which variables to use

}
[code][/QUOTE]




#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>

//  gcc -g -I. c/foodbank.c

struct node {
  struct node    *left;              // tree to the left
  struct node    *right;             // tree to the right
  unsigned int   count;              // # of occurences...
  char           *word;              // ...of this word
};

static struct node  *root = NULL;

void memory_error(void)
{
  fprintf(stderr, "Error:Out of memory\n");
  exit(8);
}

char *save_w(char *string)         // copy string to heap
{
  char *new_string;
  new_string = malloc((unsigned) (strlen(string)+1));
  if (new_string == NULL)
     memory_error();
  strcpy(new_string,string);
  return (new_string);
}

void enter(struct node **node, char *word)
{
  char          *save_w();
  int           comp;

  if ((*node) == NULL)
    {
      (*node) = malloc(sizeof(struct node));
      if ((*node) == NULL)
          memory_error();
      (*node)->left  = NULL;
      (*node)->right = NULL;
      (*node)->count = 1;             // new node = first occurence
      (*node)->word  = save_w(word);  // of this word
      return;
    }
  comp = strcmp((*node)->word,word);
  if (comp==0 )
    {
      (*node)->count++;              // word already in tree
      return;
    }
  if (comp<0)
      enter(&(*node)->right,word);  // recursively try to enter word
  else                              // until a NULL is found
      enter(&(*node)->left,word);
}

void print_tree(struct node *top)
{
  if (top == NULL)
      return;
  print_tree(top->left);
  printf("%s (%d)  ", top->word,top->count);
  print_tree(top->right);
}

void print_tree_debug(struct node *top)
{
  if (top == NULL)
      return;
  printf("L ");                                // go left
  print_tree_debug(top->left);
  printf("[%s - %d]\n", top->word,top->count);
  printf("R ");                                // go right
  print_tree_debug(top->right);
  printf("U ");                                // go up
}

int main(int argc, char *argv[])
{
  unsigned int i;

  for (i=1; i<argc; i++)
      enter(&root,argv[i]);

  printf("\n");
  print_tree(root);
  printf("\n");

  printf("\n");
  print_tree_debug(root);
  printf("\n");

 return (0);
}
 
B

Barry Schwarz

On 11 Oct 2005 15:26:31 -0700, "(e-mail address removed)"

snip
I've include my version of the example binary tree program.
It simply builds a tree from the command line arguments.

A key difference is a recursive function to enter a word
onto the tree. It starts at the root and follows left/right
pointers until it either finds the word already on the tree
or finds a NULL. Calling enter() with a NULL will add a node
with count=1 since adding a node means this is the first
occurence.

When the word is found to be already on the tree, we just need
to increment the count.

Also in the program are a pair of print routines, one to print
all the words in the tree with their frequency and another
that shows how the path through the binary tree.

And you will, of course, want to add a routine to free the
allocated memory.
snip
Code:
[/QUOTE]




#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>

//  gcc -g -I. c/foodbank.c

struct node {
struct node    *left;              // tree to the left
struct node    *right;             // tree to the right
unsigned int   count;              // # of occurences...
char           *word;              // ...of this word
};

static struct node  *root = NULL;[/QUOTE]

Since you pass root to every function that needs it, there is no need
for it to be at file scope.  This will prevent an unintended function
from modifying it.
[QUOTE]
void memory_error(void)
{
fprintf(stderr, "Error:Out of memory\n");
exit(8);[/QUOTE]

EXIT_FAILURE would be more portable than 8.[QUOTE]
}

char *save_w(char *string)         // copy string to heap
{
char *new_string;
new_string = malloc((unsigned) (strlen(string)+1));
if (new_string == NULL)
memory_error();
strcpy(new_string,string);
return (new_string);
}

void enter(struct node **node, char *word)
{
char          *save_w();[/QUOTE]

Marginally incorrect and error prone.  save_w was defined above.  That
will serve as the prototype of any subsequent calls to the function.
If you are going to repeat a declaration, repeat it exactly.
[QUOTE]
int           comp;

if ((*node) == NULL)
{
(*node) = malloc(sizeof(struct node));
if ((*node) == NULL)
memory_error();
(*node)->left  = NULL;
(*node)->right = NULL;
(*node)->count = 1;             // new node = first occurence
(*node)->word  = save_w(word);  // of this word
return;
}
comp = strcmp((*node)->word,word);
if (comp==0 )
{
(*node)->count++;              // word already in tree
return;
}
if (comp<0)
enter(&(*node)->right,word);  // recursively try to enter word
else                              // until a NULL is found
enter(&(*node)->left,word);
}

void print_tree(struct node *top)
{
if (top == NULL)
return;
print_tree(top->left);
printf("%s (%d)  ", top->word,top->count);
print_tree(top->right);
}

void print_tree_debug(struct node *top)
{
if (top == NULL)
return;
printf("L ");                                // go left
print_tree_debug(top->left);
printf("[%s - %d]\n", top->word,top->count);
printf("R ");                                // go right
print_tree_debug(top->right);
printf("U ");                                // go up
}

int main(int argc, char *argv[])
{
unsigned int i;

for (i=1; i<argc; i++)
enter(&root,argv[i]);

printf("\n");
print_tree(root);
printf("\n");

printf("\n");
print_tree_debug(root);
printf("\n");

return (0);
}[/QUOTE]


<<Remove the del for email>>
 
M

mensanator

Barry said:
On 11 Oct 2005 15:26:31 -0700, "(e-mail address removed)"

snip
I've include my version of the example binary tree program.
It simply builds a tree from the command line arguments.

A key difference is a recursive function to enter a word
onto the tree. It starts at the root and follows left/right
pointers until it either finds the word already on the tree
or finds a NULL. Calling enter() with a NULL will add a node
with count=1 since adding a node means this is the first
occurence.

When the word is found to be already on the tree, we just need
to increment the count.

Also in the program are a pair of print routines, one to print
all the words in the tree with their frequency and another
that shows how the path through the binary tree.

And you will, of course, want to add a routine to free the
allocated memory.
snip
Code:
[/QUOTE]




#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>

//  gcc -g -I. c/foodbank.c

struct node {
struct node    *left;              // tree to the left
struct node    *right;             // tree to the right
unsigned int   count;              // # of occurences...
char           *word;              // ...of this word
};

static struct node  *root = NULL;[/QUOTE]

Since you pass root to every function that needs it, there is no need
for it to be at file scope.  This will prevent an unintended function
from modifying it.[/QUOTE]

Where does it go? FWIW, I copied (and modified) this code out
of the O'Reilly book "Practical C Programming" by Steve Oualline.
I don't know enough to make a mistake that sophisticated.
[QUOTE]
EXIT_FAILURE would be more portable than 8.[/QUOTE]

Ditto, didn't know there was such a constant.
[QUOTE]
Marginally incorrect and error prone.  save_w was defined above.  That
will serve as the prototype of any subsequent calls to the function.
If you are going to repeat a declaration, repeat it exactly.[/QUOTE]

That one's partially my fault. The prototype is in the book,
but it got inexplicably changed when I was modifying it.

Thanks for pointing those out. I'm not a C programmer but
occaisionally need to create C programs. I'm completely
dependant on examples from books. Good to know that even
the books aren't perfect.
[QUOTE]
[QUOTE]
int           comp;

if ((*node) == NULL)
{
(*node) = malloc(sizeof(struct node));
if ((*node) == NULL)
memory_error();
(*node)->left  = NULL;
(*node)->right = NULL;
(*node)->count = 1;             // new node = first occurence
(*node)->word  = save_w(word);  // of this word
return;
}
comp = strcmp((*node)->word,word);
if (comp==0 )
{
(*node)->count++;              // word already in tree
return;
}
if (comp<0)
enter(&(*node)->right,word);  // recursively try to enter word
else                              // until a NULL is found
enter(&(*node)->left,word);
}

void print_tree(struct node *top)
{
if (top == NULL)
return;
print_tree(top->left);
printf("%s (%d)  ", top->word,top->count);
print_tree(top->right);
}

void print_tree_debug(struct node *top)
{
if (top == NULL)
return;
printf("L ");                                // go left
print_tree_debug(top->left);
printf("[%s - %d]\n", top->word,top->count);
printf("R ");                                // go right
print_tree_debug(top->right);
printf("U ");                                // go up
}

int main(int argc, char *argv[])
{
unsigned int i;

for (i=1; i<argc; i++)
enter(&root,argv[i]);

printf("\n");
print_tree(root);
printf("\n");

printf("\n");
print_tree_debug(root);
printf("\n");

return (0);
}[/QUOTE]


<<Remove the del for email>>[/QUOTE]
 
F

Foodbank

Thanks mensanator and everyone else who helped. I appreciate it. I'll
try this out and see how things go.

Regards,
James
 
B

Barry Schwarz

Barry said:
On 11 Oct 2005 15:26:31 -0700, "(e-mail address removed)"

snip
I've include my version of the example binary tree program.
It simply builds a tree from the command line arguments.

A key difference is a recursive function to enter a word
onto the tree. It starts at the root and follows left/right
pointers until it either finds the word already on the tree
or finds a NULL. Calling enter() with a NULL will add a node
with count=1 since adding a node means this is the first
occurence.

When the word is found to be already on the tree, we just need
to increment the count.

Also in the program are a pair of print routines, one to print
all the words in the tree with their frequency and another
that shows how the path through the binary tree.

And you will, of course, want to add a routine to free the
allocated memory.
snip

Code:
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>

//  gcc -g -I. c/foodbank.c

struct node {
struct node    *left;              // tree to the left
struct node    *right;             // tree to the right
unsigned int   count;              // # of occurences...
char           *word;              // ...of this word
};

static struct node  *root = NULL;[/QUOTE]

Since you pass root to every function that needs it, there is no need
for it to be at file scope.  This will prevent an unintended function
from modifying it.[/QUOTE]

Where does it go? FWIW, I copied (and modified) this code out
of the O'Reilly book "Practical C Programming" by Steve Oualline.
I don't know enough to make a mistake that sophisticated.[/QUOTE]

In this case, I would suggest it belongs in main.



<<Remove the del for email>>
 
M

mensanator

Barry said:
Barry said:
On 11 Oct 2005 15:26:31 -0700, "(e-mail address removed)"

snip

I've include my version of the example binary tree program.
It simply builds a tree from the command line arguments.

A key difference is a recursive function to enter a word
onto the tree. It starts at the root and follows left/right
pointers until it either finds the word already on the tree
or finds a NULL. Calling enter() with a NULL will add a node
with count=1 since adding a node means this is the first
occurence.

When the word is found to be already on the tree, we just need
to increment the count.

Also in the program are a pair of print routines, one to print
all the words in the tree with their frequency and another
that shows how the path through the binary tree.

And you will, of course, want to add a routine to free the
allocated memory.

snip

Code:
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>

//  gcc -g -I. c/foodbank.c

struct node {
struct node    *left;              // tree to the left
struct node    *right;             // tree to the right
unsigned int   count;              // # of occurences...
char           *word;              // ...of this word
};

static struct node  *root = NULL;

Since you pass root to every function that needs it, there is no need
for it to be at file scope.  This will prevent an unintended function
from modifying it.[/QUOTE]

Where does it go? FWIW, I copied (and modified) this code out
of the O'Reilly book "Practical C Programming" by Steve Oualline.
I don't know enough to make a mistake that sophisticated.[/QUOTE]

In this case, I would suggest it belongs in main. 
Thanks.




<<Remove the del for email>>[/QUOTE]
 
F

Foodbank

Hey guys, I'm almost there, but getting a seg. fault. I have no idea
why. I've commented where it's happening at the location that says:

//*** SEG FAULT HAPPENING BELOW WHEN CMP IS <, >, OR == 0 ********

As a refresher, this is the binary tree that counts words from a text
file on the command line.

Thank you in advance,
Jim

Code:
#include <stdio.h>
#include <malloc.h>
struct tnode {			// specify the "shape" of a tnode structure
	struct tnode *left;	// the left and right branch pointers
	struct tnode *right;
	int count;		// the word count as before
	char *word;		// a pointer to the word
} *root;			// declare the root pointer variable

struct tnode **tree_search(struct tnode **, char *);
void tree_stats(struct tnode *);
int get_word(char *);

int total_nodes, total_words, high;
struct tnode *most_frequent;

int main(int argc, char *argv[]) {
	struct tnode **tpp;
	char word_buff[100];	// the reusable word buffer
	int i;
	while(get_word(word_buff)) {
		tpp = tree_search(&root, word_buff);
		/****CODE TO ADD NEW NODES AND COUNT REPEATS *****/

		if(*tpp!=NULL){
			(*tpp)->count++;
		}
		else{
			(*tpp)=malloc(sizeof(struct tnode));
			(*tpp)->left=NULL;
			(*tpp)->right=NULL;
			(*tpp)->count=1;
			(*tpp)->word=strdup(word_buff);
		}

	}
	tree_stats(root);
	printf("total_nodes %d\n", total_nodes);
	printf("total_words %d\n", total_words);
	if(most_frequent)
		printf("most frequent word <%s> count is %d\n",
			most_frequent->word, most_frequent->count);
	for(i = 1; i < argc; i++) {
		tpp = tree_search(&root, argv[i]);
		if((*tpp) == NULL)
			printf("%s is NOT in the tree\n", argv[i]);
		else
			printf("<%s> count is %d\n", argv[i], (*tpp)->count);
	}
	return(0);
}

// binary tree search returning a pointer to the pointer leading to a
hit
struct tnode **tree_search(struct tnode **tpp, char *w) {
	/***** CODE TO DO THE BINARY SRCH *****/
	int cmp;

	while(*tpp){
		cmp=strcmp(w,(*tpp)->word);
   //*****SEG FAULT HAPPENING BELOW WHEN CMP IS <, >, OR == 0  ********
			if (cmp>0) {
				tpp=(*tpp)->right;
				}
			else if (cmp<0){
				tpp=(*tpp)->left;
				}
			else if (cmp==0){
				break;
		    }
}


return(tpp);
}

// gather stats to check tree correctness
void tree_stats(struct tnode *tp) {
	/***** RECURSIVE CODE TO COLLECT ALL OF THE STATISTICS *****/
if(tp=NULL)
	return;
	total_words+=tp->count;
	total_nodes++;
	if(tp->count > high){
		high=tp->count;
		most_frequent=tp;
	}
}

#include <ctype.h>
/* Leave this routine EXACTLY as it stands */
int get_word(char *s) {
	int c;
	do {
		c = getchar();
		if(c == EOF)
			return(0);
	} while(!isalpha(c) && !isdigit(c));
	do {
		if(isupper(c))
			c = tolower(c);
		*s++ = c;
		c = getchar();
	} while(isalpha(c) || isdigit(c));
	*s = 0;
	return(1);
}
 
M

mensanator

Foodbank said:
Hey guys, I'm almost there, but getting a seg. fault. I have no idea
why.

I think a seg. fault means you have a bad pointer.

I thought, perhaps, someone else would have posted the answer by now,
but since that hasn't happened, I'll make a stab at it. Keep in mind
I'm not an expert, so what follows are guesses (hopefully educated).
I've commented where it's happening at the location that says:

//*** SEG FAULT HAPPENING BELOW WHEN CMP IS <, >, OR == 0 ********

As a refresher, this is the binary tree that counts words from a text
file on the command line.

Thank you in advance,
Jim

Code:
#include <stdio.h>
#include <malloc.h>
struct tnode {			// specify the "shape" of a tnode structure
	struct tnode *left;	// the left and right branch pointers
	struct tnode *right;
	int count;		// the word count as before
	char *word;		// a pointer to the word
} *root;			// declare the root pointer variable[/QUOTE]

Root HAS to start out as a NULL pointer. I don't know if the
preceeding initializes the pointer or not. I think you either
need to initialize it here by

} *root = NULL;

assuming that's correct usage or else set it to null at the
start of main()

root = NULL

[QUOTE]
struct tnode **tree_search(struct tnode **, char *);
void tree_stats(struct tnode *);
int get_word(char *);

int total_nodes, total_words, high;
struct tnode *most_frequent;

int main(int argc, char *argv[]) {
	struct tnode **tpp;
	char word_buff[100];	// the reusable word buffer
	int i;
	while(get_word(word_buff)) {
		tpp = tree_search(&root, word_buff);
		/****CODE TO ADD NEW NODES AND COUNT REPEATS *****/

		if(*tpp!=NULL){
			(*tpp)->count++;
		}
		else{
			(*tpp)=malloc(sizeof(struct tnode));
			(*tpp)->left=NULL;
			(*tpp)->right=NULL;
			(*tpp)->count=1;
			(*tpp)->word=strdup(word_buff);
		}

	}
	tree_stats(root);
	printf("total_nodes %d\n", total_nodes);
	printf("total_words %d\n", total_words);
	if(most_frequent)
		printf("most frequent word <%s> count is %d\n",
			most_frequent->word, most_frequent->count);
	for(i = 1; i < argc; i++) {
		tpp = tree_search(&root, argv[i]);
		if((*tpp) == NULL)
			printf("%s is NOT in the tree\n", argv[i]);
		else
			printf("<%s> count is %d\n", argv[i], (*tpp)->count);
	}
	return(0);
}

// binary tree search returning a pointer to the pointer leading to a
hit
struct tnode **tree_search(struct tnode **tpp, char *w) {
	/***** CODE TO DO THE BINARY SRCH *****/
	int cmp;

	while(*tpp){[/QUOTE]

The other thing I don't know is whether NULL pointers evaluate
to FALSE. If not, then you need to explicitly test for NULL in
the while statement.
[QUOTE]
cmp=strcmp(w,(*tpp)->word);
//*****SEG FAULT HAPPENING BELOW WHEN CMP IS <, >, OR == 0  ********
			if (cmp>0) {
				tpp=(*tpp)->right;
				}
			else if (cmp<0){
				tpp=(*tpp)->left;
				}
			else if (cmp==0){
				break;
		    }
}


return(tpp);
}

// gather stats to check tree correctness
void tree_stats(struct tnode *tp) {
	/***** RECURSIVE CODE TO COLLECT ALL OF THE STATISTICS *****/
if(tp=NULL)
	return;
	total_words+=tp->count;
	total_nodes++;
	if(tp->count > high){
		high=tp->count;
		most_frequent=tp;
	}
}

#include <ctype.h>
/* Leave this routine EXACTLY as it stands */
int get_word(char *s) {
	int c;
	do {
		c = getchar();
		if(c == EOF)
			return(0);
	} while(!isalpha(c) && !isdigit(c));
	do {
		if(isupper(c))
			c = tolower(c);
		*s++ = c;
		c = getchar();
	} while(isalpha(c) || isdigit(c));
	*s = 0;
	return(1);
}
 
F

Flash Gordon

I think a seg. fault means you have a bad pointer.

Or overrunning the end of a buffer. Or a number of other things.
I thought, perhaps, someone else would have posted the answer by now,

I didn't because I could see it retained a number of errors I had
previously (and not many days before) posted corrections to. I can't
remember if it was the same person, but it was obviously based on the
same homework question. However, since someone else is showing some
interest I decided to provide you with some feedback on lots of problems
with it to help you learn what to avoid. I am almost certain there are
further problems I have not pointed out.
but since that hasn't happened, I'll make a stab at it. Keep in mind
I'm not an expert, so what follows are guesses (hopefully educated).
I've commented where it's happening at the location that says:

//*** SEG FAULT HAPPENING BELOW WHEN CMP IS <, >, OR == 0 ********

As a refresher, this is the binary tree that counts words from a text
file on the command line.

Thank you in advance,
Jim

Code:
#include <stdio.h>
#include <malloc.h>[/QUOTE][/QUOTE]

malloc.h is not a standard header, the standard header is stdlib.h
[QUOTE]
Root HAS to start out as a NULL pointer. I don't know if the
preceeding initializes the pointer or not. I think you either
need to initialize it here by

} *root = NULL;[/QUOTE]

It is declared at file scope and therefor will therefore be initialised 
to a null pointer if no explicit initialisation is provided.
[QUOTE]
assuming that's correct usage or else set it to null at the
start of main()

root = NULL[/QUOTE]

Not needed, your initialisation syntax was correct.

All this use of pointers to pointers is confusing and pointless.

struct tnode *tree_search(struct tnode *, char *);

[QUOTE][QUOTE]
void tree_stats(struct tnode *);
int get_word(char *);

int total_nodes, total_words, high;
struct tnode *most_frequent;

int main(int argc, char *argv[]) {
struct tnode **tpp;[/QUOTE][/QUOTE]

     struct tnode *tpp;
[QUOTE][QUOTE]
char word_buff[100];	// the reusable word buffer[/QUOTE][/QUOTE]

Don't use // syle comments on news groups, they won't survive line wrapping.

Don't use tabs on code posted to news groups. You have no idea how they 
will be rendered and sometimes they even get stripped. I've replaces 
them with spaces.

You are not trying to modify the value of root, so why pass it's address?
         tpp = tree_search(root, word_buff);

Why the parenthesis? Also, why yous sizeof(type) when you can use sizeof 
object and reduce the chance of error?
               tpp=malloc(sizeof *tpp);
With my changes you also need to remove a number of other dereferences.

strdup is not a standard function.

Also you need to add this node you've just created to your tree. It does 
not happen by magic.
[QUOTE][QUOTE]
}

}
tree_stats(root);
printf("total_nodes %d\n", total_nodes);
printf("total_words %d\n", total_words);
if(most_frequent)
printf("most frequent word <%s> count is %d\n",
most_frequent->word, most_frequent->count);
for(i = 1; i < argc; i++) {
tpp = tree_search(&root, argv[i]);
if((*tpp) == NULL)
printf("%s is NOT in the tree\n", argv[i]);
else
printf("<%s> count is %d\n", argv[i], (*tpp)->count);
}
return(0);[/QUOTE][/QUOTE]

You don't need parenthesis here, return is not a function.
       return 0;
See what I meant about // comments not surviving line wrapping?
[QUOTE]
The other thing I don't know is whether NULL pointers evaluate
to FALSE. If not, then you need to explicitly test for NULL in
the while statement.[/QUOTE]

You don't need to compare against NULL, although some people consider it 
to be better style.

Then you have a problem building your tree.

Again, no need for the parenthesis.
       return(tpp);

Functions modifying globals are horrible things. There are many far 
better solutions.

No protection against overflowing your buffer. Very bad.
[QUOTE][QUOTE]
c = getchar();
} while(isalpha(c) || isdigit(c));
*s = 0;
return(1);
}
 
B

Barry Schwarz

Hey guys, I'm almost there, but getting a seg. fault. I have no idea
why. I've commented where it's happening at the location that says:

This should not be your real code. It doesn't compile. Are you
ignoring the warnings? If you don't see the warnings, up the warning
level of your compiler.
//*** SEG FAULT HAPPENING BELOW WHEN CMP IS <, >, OR == 0 ********

As a refresher, this is the binary tree that counts words from a text
file on the command line.

Thank you in advance,
Jim

Code:
#include <stdio.h>
#include <malloc.h>[/QUOTE]

Not a standard header.  malloc() is in stdlib.h.  You need string.h
and ctype.h.
[QUOTE]
struct tnode {			// specify the "shape" of a tnode structure
	struct tnode *left;	// the left and right branch pointers
	struct tnode *right;
	int count;		// the word count as before
	char *word;		// a pointer to the word
} *root;			// declare the root pointer variable[/QUOTE]

root is initialized by default to NULL.
[QUOTE]
struct tnode **tree_search(struct tnode **, char *);
void tree_stats(struct tnode *);
int get_word(char *);

int total_nodes, total_words, high;
struct tnode *most_frequent;

int main(int argc, char *argv[]) {
	struct tnode **tpp;
	char word_buff[100];	// the reusable word buffer
	int i;
	while(get_word(word_buff)) {
		tpp = tree_search(&root, word_buff);
		/****CODE TO ADD NEW NODES AND COUNT REPEATS *****/

		if(*tpp!=NULL){
			(*tpp)->count++;
		}
		else{
			(*tpp)=malloc(sizeof(struct tnode));
			(*tpp)->left=NULL;
			(*tpp)->right=NULL;
			(*tpp)->count=1;
			(*tpp)->word=strdup(word_buff);[/QUOTE]

Not a standard function.
[QUOTE]
}

	}
	tree_stats(root);
	printf("total_nodes %d\n", total_nodes);
	printf("total_words %d\n", total_words);
	if(most_frequent)
		printf("most frequent word <%s> count is %d\n",
			most_frequent->word, most_frequent->count);
	for(i = 1; i < argc; i++) {
		tpp = tree_search(&root, argv[i]);
		if((*tpp) == NULL)
			printf("%s is NOT in the tree\n", argv[i]);
		else
			printf("<%s> count is %d\n", argv[i], (*tpp)->count);
	}
	return(0);
}

// binary tree search returning a pointer to the pointer leading to a
hit
struct tnode **tree_search(struct tnode **tpp, char *w) {
	/***** CODE TO DO THE BINARY SRCH *****/
	int cmp;

	while(*tpp){
		cmp=strcmp(w,(*tpp)->word);
//*****SEG FAULT HAPPENING BELOW WHEN CMP IS <, >, OR == 0  ********[/QUOTE]

Do you expect your compiler to parse this as / /*... (divide followed
by start of comment) or // *... (one line comment)?  Don't use //
comments for posting.  If the comment overflows the line in your
newsreader, others cannot assemble your code.
[QUOTE]
if (cmp>0) {
				tpp=(*tpp)->right;[/QUOTE]

tpp is a pointer to pointer to struct.  (*tpp)->right is a pointer to
struct.  The two are not compatible for implicit conversion.
[QUOTE]
}
			else if (cmp<0){
				tpp=(*tpp)->left;
				}
			else if (cmp==0){
				break;
		    }
}


return(tpp);
}

// gather stats to check tree correctness
void tree_stats(struct tnode *tp) {
	/***** RECURSIVE CODE TO COLLECT ALL OF THE STATISTICS *****/
if(tp=NULL)[/QUOTE]

You probably meant == here.
[QUOTE]
return;
	total_words+=tp->count;
	total_nodes++;
	if(tp->count > high){
		high=tp->count;
		most_frequent=tp;
	}
}

#include <ctype.h>
/* Leave this routine EXACTLY as it stands */
int get_word(char *s) {
	int c;
	do {
		c = getchar();
		if(c == EOF)
			return(0);
	} while(!isalpha(c) && !isdigit(c));
	do {
		if(isupper(c))
			c = tolower(c);
		*s++ = c;
		c = getchar();
	} while(isalpha(c) || isdigit(c));
	*s = 0;
	return(1);
}


<<Remove the del for email>>
 
F

Foodbank

Hi everyone,

Just wanted to say thanks for the help on this one. Finally got it
working.

Take care,
James
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Similar Threads

Compiler error 10
Radix Tree 1
K&R2, Binary-Tree, section 6.5 7
doubly-linked list & sorting 5
PR-Tree 10
A Binary Tree in C 18
Command Line Arguments 0
realloc pointer array - binary tree 2

Members online

No members online now.

Forum statistics

Threads
473,954
Messages
2,570,116
Members
46,704
Latest member
BernadineF

Latest Threads

Top