Any way to take a word as input from stdin ?

arnuld

Could you identify the std::string feature that implements this? I
couldn't find any use of the word "word" anywhere in section 21 of the
C++ standard, which describes std::string.


I think we have to look at the source code of std::string library and see
how it is implemented. I am sure it is done using C way, arrays and
pointers ;)
 
Ian Collins

arnuld said:
I think std::string in C++ defines exactly what the *definition* of a word is.
Look at my code and see how std::string works and perhaps we can settle on
some common and standard meaning of "word".


Is "token" what you are looking for?

you see even if you put a line as input, std::string will automatically
dissect it into separate words.
No, it will not.

<OT> The input stream tokenises the input. The C++ standard defines how
formatted input is tokenised. </OT>
 
arnuld

Even if we could, it would only be *our* definition, not a universal
definition.

Let me give you an example from ordinary English, where whitespace
delimiters are not sufficient:

Problem: design an algorithm for removing punctuation from arbitrary
English sentences, *without* removing punctuation that actually belongs to
the word (example: "will-o'-the-wisp" must retain its three hyphens and
its apostrophe).


That means we will also need a function containing all of the words with
intended hyphens and apostrophes, against which we will compare the input
words. That function will be used at run time and will hold millions of
words, so it will be very expensive to run. If the user wants to enter
comp.lang.3c as words then it is his choice or stupidity. Let him do it that
way; why do we need to think about it?



As Knuth would say: [50]

I don't know what that means .
 
Richard

Richard Heathfield said:
arnuld said:
That means, we will also have a function containing all of the words with
intended hyphens and apostrophes to which we will compare the input
words.

Not necessarily a function, but yes, we would need some kind of dictionary
- and even then, we wouldn't be done, because some French or German or
Spanish or Czech or Polish or Slovakian or Turkish geezer would come along
and say "you call those words? Those aren't words - THESE are words...",
and give you a whole new set of problems.

The lesson here is that there is no single answer that will satisfy
everyone.
As Knuth would say: [50]

I don't know what that means .

<sigh> I know.

I wonder how many people do?

But then 90% of SW Engineers never read Knuth or possibly they tried it
and found it impenetrable. Only in c.l.c is it recommended as a "great
way to learn programming". I still smile when I remember that thread.

So, basically, I don't know what that means either.
 
CBFalconer

arnuld said:
.... snip ...

I think std::string in C++ defines exactly what the *definition* of
a word is. Look at my code and see how std::string works and perhaps
we can settle on some common and standard meaning of "word". I don't
like to put C++ code in a C group, but I think I don't have any other
way to define what a word is:

/* A program that will ask user for input and then will print
* them in an alphabetical order
*
* VERSION 1.1
*/

#include <iostream>
#include <string>
#include <vector>
#include <algorithm>

Please restrain your C++ writings to comp.lang.c++. This is c.l.c
and they are off-topic here. C++ is a different language.
 
arnuld

..SNIP...
If you want
to discuss which definition of "word" (or "token", or whatever) is
correct, that's not a C question, nor is it really an answerable
question.

Okay, that seems a good reply. Let's make it topical to C again, as I got
a little lost in the confusion. So *my* definition of a word will be the
same one you gave earlier:

A "word" is a non-empty contiguous sequence of characters other
than space, tab, or newline, preceded or followed either by a
space, tab, or newline or by the start or end of the input.


Now your earlier questions, down here:
It would also be good to specify whether the input is a string, a line
of text, or an entire text file.

in the current case, it is a "word" from terminal, the word we just
defined.


so lets code it :)
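
A minimal sketch of reading one such word, under the definition quoted above (delimiters are exactly space, tab and newline). The names is_delim and read_word, the FILE * parameter and the truncation rule are assumptions made for illustration; nothing in the thread prescribes them.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: the three delimiters named in the definition. */
static int is_delim(int c)
{
    return c == ' ' || c == '\t' || c == '\n';
}

/* Hypothetical function: reads the next word from `in` into buf.
 * bufsize is the capacity of buf and must be at least 2.
 * Returns the number of characters stored, or 0 if the input ended
 * before a word started. Characters of an over-long word that do
 * not fit are read and dropped. */
size_t read_word(FILE *in, char *buf, size_t bufsize)
{
    int c;
    size_t len = 0;

    /* Skip any delimiters in front of the word. */
    while ((c = getc(in)) != EOF && is_delim(c))
        ;

    /* Collect characters until the next delimiter or end of input. */
    while (c != EOF && !is_delim(c)) {
        if (len + 1 < bufsize)
            buf[len++] = (char)c;
        c = getc(in);
    }
    buf[len] = '\0';
    return len;
}
```

Called as read_word(stdin, buf, sizeof buf) in a loop, it hands back one word per call until it returns 0.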
 
Richard

CBFalconer said:
Please restrain your C++ writings to comp.lang.c++. This is c.l.c
and they are off-topic here. C++ is a different language.

Please get lost. This was on topic and in a discussion with how best to
approach certain issues. Only a complete idiot like you would try to do
that without considering already researched (partial) solutions.
 
arnuld

Now since *my* definition of word is done. Here is the outline of the
program:



/* It will ask the user for input and will print the input
* in alphabetical order when the user hits EOF (Ctrl-D on Linux)
*
* VERSION 1.0
*
*/


#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void get_words( char** );
void sort_words( char** );
void print_words( char** );


int main( int argc, char* argv[] )
{
char** pda;

get_words( pda );

sort_words( pda );

print_words( pda );

return EXIT_SUCCESS;
}



SOLUTION: pda is an array of pointers, where pointers are pointing to
different words input by the user ( which are in fact arrays of characters
terminated by null, which means they are string literals of C, which means
it is still inherently confusing to me )

So we have an array of arrays. when we want to sort the input, we will
just sort the pointers to words, rather than sorting the arrays
themselves. That will be much more efficient and is an idea i learned from
K&R2 :) . We don't sort the string literals, we will sort the pointers
pointing to them.
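
The pointer-sorting idea can be sketched with the standard qsort and strcmp. Only the pointers in the array are rearranged; the characters of each word stay where they are. The count parameter is an assumption added here, since the one-argument sort_words in the outline has no way to know how many words there are.

```c
#include <stdlib.h>
#include <string.h>

/* qsort passes the comparator pointers to the array elements.
 * Each element is a char *, so the arguments point at char *. */
static int compare_words(const void *a, const void *b)
{
    const char *const *wa = a;
    const char *const *wb = b;

    return strcmp(*wa, *wb);
}

/* Sorts the array of pointers in place. The strings themselves
 * are never moved or copied. */
void sort_words(char **words, size_t count)
{
    qsort(words, count, sizeof *words, compare_words);
}
```

Note that strcmp orders by character code, so in ASCII every uppercase letter sorts before every lowercase one.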


2nd we don't have any idea on how many words a user will enter, so we will
use dynamic memory allocation, which I am going to learn for the first time,
so please correct me where I go wrong ;) . 3rd, we do have an idea on the
maximum length of the word. Wikipedia says the longest English word is
189819 characters long, a chemical name for some sort of protein:

http://en.wikipedia.org/wiki/Longest_word_in_English


which I *assume* the user is not going to enter. I will limit the longest
words to what we call the "Longest word in Shakespeare's work", which is 27
characters long, hence limiting the array size used to store the
words to 28.

good idea ?
 
arnuld

Please get lost. This was on topic and in a discussion with how best to
approach certain issues. Only a complete idiot like you would try to do
that without considering already researched (partial) solutions.


Though I really appreciate that you supported me, I think you are
disrespecting him by calling him an idiot. It's not what I thought of him
when he replied. Chuck said so because he respects clc like all of us.
Though he could have added something like "it is ok for this time but no
C++ next time. you know better", to his reply. If he did not add that it
does not mean anyone should disrespect him. He is looking for the
well-being of clc , like me and I understood his reasoning.

Only trolls should be disrespected by not replying to their posts ;)
 
arnuld

I don't think this is going to cut it.

But what if something goes wrong? You'll need to be able to report an
error. The natural way to do this is via a return value, which means we
can't use that value for either the list or the count, and that leads us
to:

int get_words(char ***, size_t *);

why *** , 3 levels of indirection ? when we pass an array of characters
as an argument to a function, it becomes a pointer, single * . Hence when
we will pass an array of pointers, it will become **.



Not string literals - just strings.

Aren't string literal, string and string constant 3 names for a single
thing?


Up to you, but I wouldn't bother setting a limit (or, if I did, I'd set
it at a million or so, and treat any string longer than that as a
reportable error). With dynamic allocation, you don't /need/ to set a
limit; you simply allocate as you go, and reallocate if necessary.


so you want to dynamically allocate both the single word and the array of
words.
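
A rough sketch of the "allocate as you go, reallocate if necessary" approach for a single word. The name get_single_word, its signature, the return codes and the starting capacity of 16 are all assumptions for illustration. No upper limit is enforced here, so a cap such as the suggested million would need an extra check.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical function: reads one whitespace-delimited word of any
 * length from `in`. Returns 1 and stores a malloc'd string in *out,
 * 0 at end of input, or -1 if allocation fails. On 0 and -1, *out
 * is NULL. The caller frees *out. */
int get_single_word(FILE *in, char **out)
{
    size_t cap = 16, len = 0;
    char *buf;
    int c;

    *out = NULL;

    /* Skip leading space, tab and newline. */
    do {
        c = getc(in);
    } while (c == ' ' || c == '\t' || c == '\n');
    if (c == EOF)
        return 0;

    buf = malloc(cap);
    if (buf == NULL)
        return -1;

    while (c != EOF && c != ' ' && c != '\t' && c != '\n') {
        if (len + 1 == cap) {           /* keep one byte for '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {          /* old block is still valid */
                free(buf);
                return -1;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
        c = getc(in);
    }
    buf[len] = '\0';
    *out = buf;
    return 1;
}
```

Assigning the result of realloc to a temporary first matters: if realloc fails it returns NULL and leaves the original block allocated, so overwriting buf directly would leak it.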
 
James Kuyper

arnuld wrote:
....
I think std::string in C++ defines exactly what the *definition* of a word is.
Where?

Look at my code and see how std::string works and perhaps we can settle on
some common and standard meaning of "word".

The C++ standard does not provide a name to describe what it is that
operator>> extracts into a string; but the most generally used term for
that kind of thing is "token", not "word".

The std::string operator>> overload reads in delimited tokens. By
default, the set of delimiters is the set of characters that are
considered to be spacing characters under the currently imbued locale.
This default can be overridden.
 
arnuld

Yes, but you're not passing an array of pointers. You're trying to pass a
pointer to a pointer to char - which is fine, but it means that any
changes made to the pointer value within the function (and there *will* be
changes) will be local to that function. That isn't what you want.


I don't see how that is true. You can never pass an array by value,
arrays are *always* passed by reference. It means when I
pass the name of an array of characters to a function as an argument, then
any changes made to the array will be made to the original array because
when you pass an array to a function as an argument, it will be changed to
a pointer to its first element:


char arrc[3] = { 'a', 'z', '\0'};
char* pc;

pc = arrc;

some_function( arrc );
some_function( pc );

both calls are same, right ?


Now when we will pass an array of pointers to some function, then it will
be converted as pointer to its first element ( which in fact is already a
pointer) hence it will be passed as pointer to pointer to char and with
that we can modify the original elements:


#include <stdio.h>
#include <ctype.h>

enum { ARRSIZE = 2 };

void edit_first_element_arrp( char** ppc );


int main( void )
{
char* p1;
char* p2;
char** p_arrp;


char* arrp[ARRSIZE] = { 0 };

p1 = p2 = NULL;

arrp[0] = p1;
arrp[1] = p2;

p_arrp = arrp;

edit_first_element_arrp( p_arrp );

/* pointer has moved, so take it to the original position */
p_arrp = arrp;

printf("arrp[0] = %c\n", **p_arrp++);
printf("arrp[1] = %c\n", **p_arrp);

return 0;
}



void edit_first_element_arrp( char** ppc )
{
int idx;

for( idx = 0; idx != ARRSIZE; ++idx )
{
if( ! (idx) )
{
**ppc++ = 'Z';
}
}
}


Hence we can change the values p1 and p2 are pointing to, but this function
segfaults :(




Two of them are two names for a single thing. Although "string literal"
is the formal term for a string literal, people will know what you mean
if you say "string constant". But consider this:

char foo[3];
foo[0] = 'H';
foo[1] = 'i';
foo[2] = '\0';

foo now contains a string, but no string literals are involved.

So what is a string literal ?



Yes. I think that's the best approach.


okay, first I will try to test the dynamic version of get_single_word
function. which will just make a single word out of some input characters.
 
James Kuyper

CBFalconer said:
arnuld wrote:
... snip ...

Please restrain your C++ writings to comp.lang.c++. This is c.l.c
and they are off-topic here. C++ is a different language.

The only way he knows how to clearly describe what he wants his code to
do is by providing a C++ example; this has been made abundantly clear by
his failed attempts to clearly describe it in English. However, the code
he wants to write should be in C.

If he were to post this same question to comp.lang.c++, and there were a
C++BFalconer on comp.lang.c++, C++BFalconer would certainly respond by
saying that this C question is off-topic in comp.lang.c++. Should arnuld
then simply remain silent about his question?
 
James Kuyper

arnuld said:
I think we have to look at the source code of std::string library and see
how it is implemented. I am sure it is done using C way, arrays and
pointers ;)

No, that will only tell you what std::string actually does. It will not
tell you what the meaning of the word "word" is. For that, you have to
search the relevant documentation, the C++ standard - and that
documentation never uses the word "word" to describe what std::string does.
 
Ben Bacarisse

arnuld said:
I don't see how that is true. You can never pass an array by value,
arrays are *always* passed by reference. It means when I
pass the name of an array of characters to a function as an argument, then
any changes made to the array will be made to the original array because
when you pass an array to a function as an argument, it will be changed to
a pointer to its first element:

Richard is talking about the pointer to the whole array. A function
that takes: void get_word(char **words); can change words[32] to point
to some new string just found. It can change words[32][0] to be 'x',
but it can't change words itself. Well, it can, but the effect will
be lost when the function returns.

The most important change you need to make is that you will have to
realloc the space for the char * array. This is, of course, a char
**, but if the function is to change a char ** that lives outside and
is passed in, that parameter must be a char ***.
char arrc[3] = { 'a', 'z', '\0'};
char* pc;

pc = arrc;

some_function( arrc );
some_function( pc );

both calls are same, right ?

Yes, but some_function can't make pc point to a bigger array if
needed. pc will point to the same place after the call.
Now when we will pass an array of pointers to some function, then it will
be converted as pointer to its first element ( which in fact is already a
pointer) hence it will be passed as pointer to pointer to char and with
that we can modify the original elements:


#include <stdio.h>
#include <ctype.h>

enum { ARRSIZE = 2 };

void edit_first_element_arrp( char** ppc );


int main( void )
{
char* p1;
char* p2;
char** p_arrp;


char* arrp[ARRSIZE] = { 0 };

p1 = p2 = NULL;

arrp[0] = p1;
arrp[1] = p2;

All these last three lines make no changes. Both elements of arrp are
already NULL.
p_arrp = arrp;

edit_first_element_arrp( p_arrp );

/* pointer has moved, so take it to the original position */
p_arrp = arrp;

No. p_arrp can't be changed by the call. This is a key thing about C
and applies to all types:

void f(int x);
...
int x = 42;
f(x);

x is guaranteed to be unchanged here. The same applies if x is a
pointer or a pointer to a pointer or a pointer to a pointer to a
pointer or...
printf("arrp[0] = %c\n", **p_arrp++);
printf("arrp[1] = %c\n", **p_arrp);

return 0;
}



void edit_first_element_arrp( char** ppc )
{
int idx;

for( idx = 0; idx != ARRSIZE; ++idx )
{
if( ! (idx) )
{
**ppc++ = 'Z';

*ppc is NULL -- you it to be NULL before the call. You can write any
value into **ppc.
}
}
}


Hence we can change the values p1 and p2 are pointing to, but this function
segfaults :(

See above.
Two of them are two names for a single thing. Although "string literal"
is the formal term for a string literal, people will know what you mean
if you say "string constant". But consider this:

char foo[3];
foo[0] = 'H';
foo[1] = 'i';
foo[2] = '\0';

foo now contains a string, but no string literals are involved.

So what is a string literal ?

It is a sequence of characters (and escaped characters) between ""s.
I.e. it is there, literally, in your program's text.
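
To make the distinction concrete, here is a small sketch with two hypothetical functions (the names are invented for this example). Both yield the string "Hi", but only the first involves a string literal.

```c
/* "Hi" here is a string literal: the characters appear between
 * double quotes in the source text. The array it denotes must not
 * be modified. */
const char *hi_from_literal(void)
{
    return "Hi";
}

/* Builds the same string with no string literal in sight: three
 * character constants are stored in the caller's buffer, which
 * must have room for at least 3 chars. */
void hi_from_chars(char *out)
{
    out[0] = 'H';
    out[1] = 'i';
    out[2] = '\0';
}
```

After hi_from_chars(foo), strcmp(foo, hi_from_literal()) returns 0: the two strings compare equal even though only one of them came from a literal.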
 
arnuld

Richard is talking about the pointer to the whole array.

pointer to the whole array ? char* is a pointer to char, int** is a
pointer to pointer to int. How you get pointer to array, I mean what type
it is?


A function
that takes: void get_word(char **words); can change words[32] to point
to some new string just found.

Right. And words++ will take us to the 2nd element of the array.

It can change words[32][0] to be 'x',
but it can't change words itself. Well, it can, but the effect will
be lost when the function returns.

Now here is the problem where my understanding about pointers and arrays
blows away:

get_word( char* words[3] )

so we can change where words[0], [1] and [2] point because array will be
converted to pointer to first element and pointer *always* changes the
original element.




Yes, but some_function can't make pc point to a bigger array if needed.
pc will point to the same place after the call.

yes, it means I can understand arrays and pointers :)



arrp[0] = p1;
arrp[1] = p2;
All these last three lines make no changes. Both elements of arrp are
already NULL.

There is a difference. At first the array had NULL elements. Now the array
has pointers which point to NULL. There is a difference.



No. p_arrp can't be changed by the call. This is a key thing about C and
applies to all types:

void f(int x);
...
int x = 42;
f(x);

x is guaranteed to be unchanged here. The same applies if x is a
pointer or a pointer to a pointer or a pointer to a pointer to a pointer
or...


That I know, x is a variable in the example and variables are passed as
value. Pointers and arrays are passed as references, hence we can change
the original elements.

*ppc is NULL -- you it to be NULL before the call. You can write any
value into **ppc.


then why does that value not appear?


It is a sequence of characters (and escaped characters) between ""s. I.e.
it is there, literally, in your program's text.

I got it. What we pass to printf() is a string literal.
 
Andrew Poelstra

As Knuth would say: [50]

I don't know what that means .

In the series /The Art of Computer Programming/ by Donald
Knuth, which is probably the greatest book on mathematical
computing ever written, problems are given at the end of
each section with a numerical code indicating their difficulty.

A code of [01], for example, you should be able to answer
in your head without pausing. A code of [50] means that,
if you solve the problem, you will have been the first
in the history of mathematics to do so.

The point is that I highly recommend you pick up a copy of
at least the first three volumes of this work, and when
you are able, read though them all.
 
Ben Bacarisse

arnuld said:
pointer to the whole array ? char* is a pointer to char, int** is a
pointer to pointer to int. How you get pointer to array, I mean what type
it is?

I was being a bit vague. Let's leave actual array pointers out of
this. I mean that Richard was talking about changing the char ** as
seen from the calling function. The thing you are intending to pass,
a char **, is in some sense a pointer to the whole array: from it all
of the array's data is accessible. The trouble is you can't
change this char ** inside the function -- not in a way that has any
effect outside. All you can do is change the various things it points
to.

If a function needs to change an int, you pass an int *. If it needs
to change an int *, you pass an int **. If it needs to change an int **
you must pass an int ***.
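
The rule can be seen with a pointer in place of the int. The two functions below are hypothetical names chosen for this sketch: one receives a copy of the pointer, the other receives the address of the caller's pointer.

```c
/* Receives a copy of the caller's pointer. Advancing the copy
 * changes nothing the caller can observe. */
void advance_copy(const char *p)
{
    p++;
    (void)p;    /* silence "value never used" warnings */
}

/* Receives the address of the caller's pointer, so the caller's
 * own pointer is the one that moves. */
void advance_caller(const char **pp)
{
    (*pp)++;
}
```

Given const char *p = "abc";, the call advance_copy(p) leaves p pointing at 'a', while advance_caller(&p) leaves it pointing at 'b'.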
A function
that takes: void get_word(char **words); can change words[32] to point
to some new string just found.

Right. And words++ will take us to the 2nd element of the array.

Right. With no visible effect outside. Just as:

void f(int x)
{
x++; /* changes x but has no effect on anything passed */
}
It can change words[32][0] to be 'x',
but it can't change words itself. Well, it can, but the effect will
be lost when the function returns.

Now here is the problem where my understanding about pointers and arrays
blows away:

get_word( char* words[3] )

(First, the declaration is confusing because the 3 has no effect.
Pretend you wrote get_word(char **words);).
so we can change where words[0], [1] and [2] point because array will be
converted to pointer to first element and pointer *always* changes the
original element.

Absolutely. Now, having set words[0], words[1] and words[2], what
happens when you need to set words[3]? You can't. You need to
realloc some more space (always assuming that this is how the function
is supposed to work). That means changing words:

char **new_space = realloc(words, new_size * sizeof *new_space);
if (new_space) {
/* set up new space with all the right pointer in it... */
words = new_space;
}

Now what? words has more space and you can set words[3], but the
calling function will never see it. The calling function will still
have the old value that was passed (we can't even say what it is
called since it is just a pointer value) and, worse, that pointer now
points to storage invalidated by the realloc call.
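
One way around this, sketched under assumptions: pass the address of the caller's char ** (hence char ***) together with the address of the count, so that the result of realloc is stored where the caller can see it. The name add_word and its return convention are invented for this example, loosely following the int get_words(char ***, size_t *) shape mentioned earlier in the thread.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical function: appends a copy of `word` to the array
 * *words, which currently holds *count entries (*words may be NULL
 * when *count is 0). Returns 0 on success, or -1 on allocation
 * failure, in which case the array and count are left unchanged. */
int add_word(char ***words, size_t *count, const char *word)
{
    char **grown;
    char *copy = malloc(strlen(word) + 1);

    if (copy == NULL)
        return -1;
    strcpy(copy, word);

    grown = realloc(*words, (*count + 1) * sizeof *grown);
    if (grown == NULL) {
        free(copy);
        return -1;
    }
    grown[*count] = copy;
    *words = grown;     /* the caller's own pointer is updated here */
    (*count)++;
    return 0;
}
```

The caller starts with char **words = NULL; size_t n = 0; and calls add_word(&words, &n, "text"). Growing by one element per call keeps the sketch short; growing the capacity in larger steps would mean fewer realloc calls.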
Yes, but some_function can't make pc point to a bigger array if needed.
pc will point to the same place after the call.

yes, it means I can understand arrays and pointers :)



arrp[0] = p1;
arrp[1] = p2;
All these last three lines make no changes. Both elements of arrp are
already NULL.

There is a difference. At first the array had NULL elements. Now the array
has pointers which point to NULL. There is a difference.

No. I don't know how to explain this because I can't see the source
of your confusion. Writing:

char* arrp[ARRSIZE] = { 0 };

p1 = p2 = NULL;

arrp[0] = p1;
arrp[1] = p2;

as you did, is just like writing:

int arr[ARRSIZE] = { 42, 42 };

i1 = i2 = 42;
arr[0] = i1;
arr[1] = i2;

All the elements were 42 to start with and they are 42 after the
assignments. All I did was change the type. Everything is an int
rather than a char *.
That I know, x is a variable in the example and variables are passed as
value. Pointers and arrays are passed as references, hence we can change
the original elements.

Excellent! It works the same with a pointer -- provided you think
about the value of the pointer itself.
then why does that value not appear?

Typo! I meant you *can't* write any value into **ppc! Sorry. There
are two typos, I now see. It should have read: "*ppc is NULL -- you
set it to be NULL before the call. You can't write any value into
**ppc."
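
For contrast, a sketch of a variant that does work, assuming the intent was to write a character through each pointer in the array: every element must point at real, writable storage before anything is stored through it. The name edit_first_chars and the explicit count parameter are assumptions for this example.

```c
#include <stddef.h>

/* Hypothetical function: writes 'Z' over the first character of
 * each of the n strings. Every ppc[i] must point to writable
 * storage. A null pointer or a string literal will not do. */
void edit_first_chars(char **ppc, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        ppc[i][0] = 'Z';
}
```

With char a[] = "ab", b[] = "cd"; and char *arrp[2] = { a, b };, the call edit_first_chars(arrp, 2) turns the arrays into "Zb" and "Zd". The pointers stored in arrp are not changed; only the characters they point to are.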
 
James Kuyper

arnuld wrote:
....

I won't address most of your questions, because I'm short of time and
the answers are complicated; I'll let Richard or Ben take care of that.
I'll just address one thing where the answer is simple:
I got it. What we pass to printf() is a string literal.

The format string passed to printf is often a string literal; the other
arguments can be string literals, but often aren't. However, it's quite
feasible to call printf() without using any string literals.

The following code is simplified for purposes of exposition by failing to
check for the validity, or even the presence, of command line
arguments in any way. This is NOT recommended.

#include <stdio.h>
int main(int argc, char *argv[])
{
printf(argv[1], argv[2]);
return 0;
}


What is passed to printf in that case is two pointers to char. No string
literals are involved in any way.
 
Keith Thompson

arnuld said:
Though I really appreciate that you supported me and I think you are
disrespecting him by calling him an idiot. Its not what I think of him
when he replied. Chuck said so because he respects clc like all of us.
Though he could have added something like "it is ok for this time but no
C++ next time. you know better", to his reply. If he did not add that it
does not mean anyone should disrespect him. He is looking for the
well-being of clc , like me and I understood his reasoning.

Only trolls should be disrespected by not replying to their posts ;)

Richard no-last-name has made a hobby out of insulting Chuck Falconer
at every opportunity, even dragging his name into discussions in which
Chuck has not participated.
 
