Break ifstream input into words?

dmurray14

Hey guys,

I'm a C++ newbie here - I've messed with VB, but I mostly stick to web
languages, so I find C++ to be very confusing at times. Basically, I am
trying to import a text file, but I want to do it word by word. I am
confused as to how to do this. Typically, I would think it would make
sense to try and input the words into strings, but for this application
I need to use character arrays and pointers. So what's the best way to
go about this? I know what I need to do - go character by character and
dump into an array until we get to either a space or some other form of
punctuation, but I'm having trouble getting this into code. If any of
you could share some ideas on how to go about this, it would be much
appreciated! I'm assuming it's going to be something like a while loop
that imports characters while != a space, comma, period, etc, then
stops when it gets to that. But again - not sure how to do this by
character - I'm used to strings.

Thanks a lot, much appreciated!

Dan
 
Julián Albo

trying to import a text file, but I want to do it word by word. I am
confused as to how to do this. Typically, I would think it would make
sense to try and input the words into strings, but for this application
I need to use character arrays and pointers. So what's the best way to
go about this?

Input the words into strings, and then copy them to the character arrays, or
allocate space for the pointers and copy into it. Or just pass the c_str ()
of the strings to the functions that take constant c-style strings, if
appropriate.
 
Salt_Peter

Here is a suggestion. Before trying something like this, get familiar
with std::string and containers like std::vector. It's considerably
more difficult and error prone to deal with char arrays and pointers.
I'd avoid pointers altogether and leave new/delete allocations to
ancient history.
A good book would help; consult this newsgroup for some recommended
titles.

#include <iostream>
#include <string>
#include <vector>

int main()
{
    std::string s("a short string");

    std::vector< std::string > vs; // a vector of strings
    vs.push_back(s);
    vs.push_back("another string");
    vs.push_back("the last string");

    // print each element; vs[i] is the i-th string
    for(size_t i = 0; i < vs.size(); ++i)
    {
        std::cout << vs[i] << "\n";
    }
    return 0;
}
 
dmurray14

Thanks, I really appreciate it. I would love to use strings, and I
understand everyone knows it's the better way to go, however the
guidelines I was given to work with included using character arrays and
assigning pointers to them. Basically, the program I create needs to
gather a list of words from an input file and do an analysis of how and
when they appear. The current plan is to put these words into a linked
list, alphabetize them, and go from there. However, I'm stuck at
getting the words separated. Once I can get them into char arrays, I
should be fine. It's just the breaking down of the file into words that
I'm having huge problems with. I'm guessing that if I go with the
character array, the best way to do this will be to go through the file
character by character, looking for spaces and punctuation and putting
words into their nodes in a linked list. However, I don't know how to
do this - I'm not familiar with how to look at each character in a
file, and then how to create a character array ("word") with the data.

Again, thanks for all your help, hopefully this makes it a little more
clear what I'm going for!

Dan
 
Gianni Mariani

Dan - what don't you get, it's simple. (massive tongue in cheek here).

Hope this helps.

#include <string>
#include <iostream>
#include <vector>
#include <algorithm>

// Returns a pointer to the string's internal C-style buffer.
const char * GetStr( const std::string & i_str )
{
    return i_str.c_str();
}

int main()
{
    std::string str;
    std::vector<std::string> vec;

    // operator>> skips whitespace, so each read yields one word
    while ( std::cin >> str )
    {
        vec.push_back( str );
    }

    std::vector<const char *> vec2( vec.size() );

    std::transform( vec.begin(), vec.end(), vec2.begin(), GetStr );

    // & vec2[0] is only valid when at least one word was read
    const char ** array = vec2.empty() ? 0 : & vec2[0];
    const int num_array = static_cast<int>( vec2.size() );

    // stuff is in array - don't touch vec *or* vec2 while you use
    // "array" otherwise you could be using dangling pointers

    for ( int i = 0; i < num_array; ++ i )
    {
        std::cout << array[i] << "\n";
    }
}
 
Jim Langston

If you really have to, just read each word into a std::string, then copy it
to a c-string.

std::string MyString;
MyIStream >> MyString;

That will get one "word". operator>> skips leading whitespace, newlines
included, so it will not tell you where one line ends and the next begins.
If you need to work a line at a time, read each line, put it into a
stringstream, then read the words out.

std::string Line;
while ( std::getline( MyIStream, Line ) )
{
    std::stringstream LineStream;
    LineStream << Line;
    std::string Word;
    while ( LineStream >> Word )
    {
        // Do something with the word contained in the std::string here.
        // Since you need C-style strings...
        char CWord[100];
        // strcpy does no bounds checking: a word of 100 or more
        // characters would overflow CWord
        strcpy( CWord, Word.c_str() );
        // Now CWord has the word in it, what do you want to do with it?
    }
}
 
Adrian

Here is an example using strings to split out the words. C++ has some nice
features to make it easy. If you really need char * after then you can
convert them once you have the separate word. Otherwise there is lots of
checking involved. If you really want to use char * have a look at the
standard library function strtok()

Otherwise this works quite well
#include <fstream>
#include <iostream>
#include <vector>
#include <string>

void split(const std::string &line, const std::string &separators,
           std::vector<std::string> &words);

int main()
{
    std::ifstream in("file.txt");
    std::string line;
    std::vector<std::string> word_list;
    const std::string word_separators(" ,.;:?!");

    // getline returns the stream, which tests false at end of file
    while(std::getline(in, line))
    {
        split(line, word_separators, word_list);
    }

    for(std::vector<std::string>::const_iterator i=word_list.begin();
        i!=word_list.end(); ++i)
    {
        std::cout << "Word: [" << *i << "]" << std::endl;
    }
}

// Appends each run of non-separator characters in line to words.
void split(const std::string &line, const std::string &separators,
           std::vector<std::string> &words)
{
    // the find_* functions return std::string::npos when nothing is found;
    // npos is unsigned, so compare against it rather than against 0
    std::string::size_type start = line.find_first_not_of(separators);
    while(start != std::string::npos)
    {
        std::string::size_type stop = line.find_first_of(separators, start);
        if(stop == std::string::npos)
        {
            stop = line.length();
        }
        words.push_back(line.substr(start, stop - start));
        start = line.find_first_not_of(separators, stop);
    }
}
 
Gianni Mariani

BTW - someone else is going to tell you to stop "TOP POSTING". Not
me, I don't mind but others will, be ready !
Thanks, I really appreciate it. I would love to use strings, and I
understand everyone knows it's the better way to go, however the
guidelines I was given to work with included using character arrays and
assigning pointers to them.

That sentence makes no sense - you can't assign pointers to a character
array. It might sound like I'm being pedantic but it seems like there may
be some confusion so you need to ask.
.... Basically, the program I create needs to
gather a list of words from an input file and do an analysis of how and
when they appear. The current plan is to put these words into a linked
list, alphabetize them, and go from there. However, I'm stuck at

a) why linked list ? Why not insert them directly into a std::map ?
b) exactly what information are you looking for ? Positional information
as well (like line number? location in file ? etc ..)
getting the words separated. Once I can get them into char arrays, I
should be fine. It's just the breaking down of the file into words that
I'm having huge problems with. I'm guessing that if I go with the
character array, the best way to do this will be to go through the file
character by character, looking for spaces and punctuation and putting
words into their nodes in a linked list. However, I don't know how to
do this - I'm not familiar with how to look at each character in a
file, and then how to create a character array ("word") with the data.

std::istream "knows" how to do this when you read into a std::string.
 
Gianni Mariani

Jim Langston wrote:
....
// Do something with the word contained in the std::string here.
Since you need CStyle strings...
char Word[100];

****** WARNING ******* - Security hole - right here!
strcpy( Word, Word.c_str() );

It can't be stressed enough. Don't do an unbounded copy into any array
variable, especially one on the stack and even more especially from user
input. You will be 0wn3d.

Never use "strcpy" or "sprintf" or the like.
 
dmurray14

Wow, you guys are a huge help! Seriously, this is great.. I really
appreciate it. Let me respond to Gianni's post:

Gianni said:
BTW - someone else is going to tell you to stop ""TOP POSTING"". Not
me, I don't mind but others will, be ready !

I don't know what "TOP POSTING" is, so sorry if I did something wrong.
That sentence makes no sense - you can't assign pointers to a character
array. It might sound like I'm being pedant but it seems like there may
be some confusion so you need to ask.

OK, sorry if it didn't make sense. Basically I need to store the words
in a character array inside a node of a linked list.
a) why linked list ? Why not insert them directly into a std::map ?
b) exactly what information are you looking for ? Positional information
as well (like line number? location in file ? etc ..)

Those were the requirements for the assignment. After the words are
stored, I need to figure out how many times the words appear, which
should be something that I can handle. Just stuck on getting from the
ifstream down to words in a node, in a character array.
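
A rough sketch of the node layout described here, assuming a fixed maximum word length and an alphabetically sorted, singly linked list. The names Node, MAX_WORD, add_word and free_list are made up for illustration; the assignment may prescribe something different.

```cpp
#include <cstddef>
#include <cstring>

const std::size_t MAX_WORD = 32;   // assumed upper bound, terminator included

struct Node
{
    char  word[MAX_WORD];  // the word, always null-terminated
    int   count;           // how many times it has appeared so far
    Node* next;
};

// Records one occurrence of `word` in the alphabetically sorted list that
// starts at `head`, and returns the (possibly new) head of the list.
// Words longer than MAX_WORD - 1 characters are truncated.
Node* add_word(Node* head, const char* word)
{
    char key[MAX_WORD];
    std::strncpy(key, word, MAX_WORD - 1);   // pads the rest with '\0'
    key[MAX_WORD - 1] = '\0';

    // walk to the first node that does not sort before key
    Node** link = &head;
    while (*link != 0 && std::strcmp((*link)->word, key) < 0)
        link = &(*link)->next;

    if (*link != 0 && std::strcmp((*link)->word, key) == 0)
    {
        ++(*link)->count;                    // seen before: bump the count
        return head;
    }

    Node* node = new Node;                   // new word: splice in a node
    std::memcpy(node->word, key, MAX_WORD);
    node->count = 1;
    node->next = *link;
    *link = node;
    return head;
}

// Releases every node in the list.
void free_list(Node* head)
{
    while (head != 0)
    {
        Node* next = head->next;
        delete head;
        head = next;
    }
}
```

Keeping the list sorted on insertion means no separate alphabetizing pass is needed, and a repeated word is found during the same walk that locates the insertion point.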

std::istream "knows" how to do this when you read into a std::string.

Which is why I would love to be able to use strings, but again, that
wasn't the assignment. Probably on purpose, unfortunately.
 
Daniel T.

Based on the conversation so far, I get the impression that this is a
homework assignment, hence the limit on what you are allowed to use of
the language.

There are a couple of ways you can attack this problem depending on what
parts of the language you *can* use. For example, you can't use
std::string, but can you use std::vector? (A vector<char> can make a
handy string replacement.) Can you write your own classes? (You can hack
out a small string class of your own.)

Do this: write the program so it works on a text file that contains only
one word. Let me see what you end up with and I'll help you extend it
from there.
 
ma740988

Gianni said:
It can't be stressed enough. Don't do an unbounded copy into any array
variable, especially one on the stack and even more especially from user
input. You will be 0wn3d.

Never use "strcpy" or "sprintf" or the like.

Interesting. Assume for the moment that two processors ( one card with
two processors on it ) communicate with each other via a struct called
test. The struct test occupies an area of memory called 'shared memory' that
both processors see. Details aside, here's my ( ignoring the shared
memory business ) current approach to this.

struct test {
    char input1 [ max ];
    char input2 [ max ];
    char input3 [ max ];
    char input4 [ max ];
};
# include <sstream>
int main()
{
    double const velocity ( -44.222 ) ;
    std::ostringstream oss;
    oss << " SNR value is " << velocity << std::endl;
    test t;
    strcpy ( t.input1, oss.str().c_str() ) ;
    std::cout << t.input1 << std::endl;
}

How would I copy the contents to input1 without the use of strcpy?


An aside:
In my application the code - on processor A - is akin to:

int const address_of_test ( 0x3F000000 );

test * ptr_test ( 0 );
ptr_test = ( test *)( address_of_test ) ;
double const velocity ( -44.222 ) ;
std::ostringstream oss;
oss << " SNR value is " << velocity << std::endl;
strcpy ( ptr_test->input1, oss.str().c_str() ) ;

NOTE: address_of_test is where the struct is created and both a and b
have access to said location. I'm thinking placement new would be
better here, but I need to do some more reading.
 
dmurray14

Daniel said:
Based on the conversation so far, I get the impression that this is a
homework assignment, hence the limit on what you are allowed to use of
the language.

There are a couple of ways you can attack this problem depending on what
parts of the language you *can* use. For example, you can't use
std::string, but can you use std::vector? (A vector<char> can make a
handy string replacement.) Can you write your own classes? (You can hack
out a small string class of your own.)

Do this, write the program so it works on a text file that contains only
one word. Let me see what you end up with and I'll help you extend it
from there.

It is in fact work related to a class I'm taking, yes. So far, my plan
of attack is as follows: use getline to dump each line of the file into
a character array. Then, I'm going to run through the character array
looking for spaces, and pull out words every time I get to a space,
sending them into their own nodes. From there it should be easy to make
the framework to check to see whether the word has appeared before and
if so, just add to the count.

I don't think we're supposed to be making classes, no. The idea is
likely to stick to the basics so we get the concepts. Hopefully this
will work out...
 
Daniel T.

dmurray14 said:
It is in fact work related to a class I'm taking, yes. So far, my plan
of attack is as follows: use getline to dump each line of the file into
a character array. Then, I'm going to run through the character array
looking for spaces, and pull out words every time I get to a space,
sending them into their own nodes. From there it should be easy to make
the framework to check to see whether the word has appeared before and
if so,just add to the count.

OK, so start with a file that has only one word in it. You need to read
in the word and output what word it is and that it was used once. Can
you get that working?

If you do the above, that will (a) give me an idea of what you are
allowed to use of the language, and (b) give me an idea of how good you
are and therefore what you probably need help with.
 
BobR

dmurray14 wrote in message ...
Wow, you guys are a huge help! Seriously, this is great.. I really
appreciate it. Let me respond to Gianni's post:

That part in your post was top-posting. The rest of that post was ok
"format".

We don't like to see the cart before the horse, the answer before the
question, etc..

Mr. Steinbach puts it:
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Another thing to mention:
Trim (delete) anything in a prior post that you are not responding to. Like:
if you are posting your newly corrected program, delete the old posted
program. We can simply refer to that post up-thread if we need to review it.
It wastes bandwidth and needlessly takes up space on our hard drives.

These are not laws, just courtesy.
 
BobR

ma740988 wrote in message ...
How would I copy the contents to input1 without the use of strcpy?

Use 'strncpy' or 'std::copy'.
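
One possible shape for the strncpy route, as a sketch only; the helper name bounded_copy is invented here. Note that strncpy by itself does not write a terminating '\0' when the source is too long, so the last element is set explicitly.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Copies `src` into `dest`, which holds `dest_size` characters, truncating
// if necessary. The result is always null-terminated.
// Returns true when the whole string fit, false when it was truncated.
bool bounded_copy(char* dest, std::size_t dest_size, const std::string& src)
{
    if (dest_size == 0)
        return false;
    std::strncpy(dest, src.c_str(), dest_size - 1);
    dest[dest_size - 1] = '\0';   // strncpy does not terminate on truncation
    return src.size() < dest_size;
}
```

With the struct from the question, the call would be something like bounded_copy( t.input1, sizeof t.input1, oss.str() ), and the return value says whether the message had to be cut short.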
 
dmurray14

BobR said:
Use 'strncpy' or 'std::copy'.

Thanks...seems like it would work but I'm going crazy, it isn't.

I have it copying into a character array just fine. I can print it out
and everything is exactly as it should be. Now all that's left to do is
split the lines into words, and I JUST CAN'T figure it out for the life
of me! I've tried strcmp, it tells me it needs Char* (which I thought
was what I was giving it, but apparently no good), and I've tried even
doing something like this, which doesn't work:

//cp = "Where in the world is carmen sandiego"

char char1[] = "W";
char char2[] = cp[0];

strcmp(char1, char2);

Even that won't work. I can't even get it to compare the first letter,
let alone the whole word! I am now completely lost as to how to break
these strings apart into words. Please help!

Dan
 
Daniel T.

You are having problems because you are trying to solve the problem the
wrong way. You need to take a "vertical slice" of the problem instead.
Get it to work when the file only has one word in it first, from
beginning to end, including loading the word in the linked list (was
that class provided by your teacher?) Show us the code and we can help
you from there.
 
BobR

dmurray14 wrote in message ...
//cp = "Where in the world is carmen sandiego"
char char1[] = "W";
char char2[] = cp[0];
strcmp(char1, char2);

Even that won't work. I can't even get it to compare the first letter,
let alone the whole word! I am now completely lost as to how to break
these strings apart into words. Please help!
Dan

You can compare two chars directly.

// needs <iostream> and <cstring>
{
    char cp[] = "Where in the world is carmen sandiego";
    char char1 = 'W'; // note singlequote
    if( cp[0] == char1 ){
        std::cout <<" cp[0] == char1 "<<std::endl;
    }
    else{
        std::cout <<" cp[0] != char1 "<<std::endl;
    }
    // out: cp[0] == char1

    char cp2[] = "Where in the world is Carmen Sandiego";
    if( std::strcmp( cp, cp2) == 0 ){
        std::cout <<" cp == cp2 "<<std::endl;
    }
    else{
        std::cout <<" cp != cp2 "<<std::endl;
    }
    // out: cp != cp2

    // compare only the first 5 characters ("Where")
    int dif = std::strncmp( cp, cp2, 5);
    std::cout <<" strncmp( cp, cp2, 5) ="<<dif<<std::endl;
    // out: strncmp( cp, cp2, 5) =0
}

Some other things that may help you (in header <cctype>)
std::isalpha
std::isdigit
std::ispunct
std::isspace
std::tolower
std::toupper
 
