Do I understand pointers?

Rob Morris

Hi, I'm teaching myself C for fun. I wrote the little program listed
below to convert rot13 text. It reads one char at a time and converts
it via pointers.

The constant char* letters holds the alphabet. I subtract the pointer
returned from strchr from the address of letters to get the location
within the alphabet, then rot13 it. My question is, is this safe and
legal? (It works on my Windows machine BTW.) I googled for programs to
do this and they mostly just subtracted 'a' from the input, which I
gather is ASCII only - so is my program any more portable?

Would you recommend another method for things like this?

Many thanks,
Rob Morris

#include <stdio.h>
#include <string.h>

int main (void)
{
    int in, out;
    char *loc;
    char *letters="AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz";

    while ((in = getchar()) != EOF) {
        loc = strchr(letters, in);
        if (loc != NULL)
            out = *(((loc-letters+26)%52)+letters);
        else
            out = in;
        putchar(out);
    }
    return 0;
}
 
Martin Dickopp

Rob Morris said:
Hi, I'm teaching myself C for fun. I wrote the little program listed
below to convert rot13 text. It reads one char at a time and converts
it via pointers.

The constant char* letters holds the alphabet.

The only (minor) criticism I have about your program is that you did not
tell the compiler that these pointers point to something unmodifiable,
i.e. I would recommend

const char *loc;
const char *letters = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz";

Since the latter pointer itself never changes its value, you could also
const-qualify the pointer itself:

const char *const letters = "AaBb...";
I subtract the pointer returned from strchr from the address of letters
to get the location within the alphabet, then rot13 it. My question
is, is this safe and legal? (It works on my windows machine BTW.)

Yes, that's safe and valid. You can subtract two pointers if they point
to the same object. In this case, both point to the same string literal
(`letters' to its initial element, `loc' somewhere "into" it) when you
subtract them.
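
A minimal sketch of that rule, assuming a hypothetical helper named index_of (it is not part of the original program). Both pointers refer to the same array, so their difference is well defined and has type ptrdiff_t:

```c
#include <stddef.h>
#include <string.h>

/* Returns the index of ch within the string s, or -1 if ch is absent.
   index_of is an illustrative name only. Note that strchr() also matches
   the terminating '\0', in which case the result equals strlen(s). */
ptrdiff_t index_of(const char *s, int ch)
{
    const char *p = strchr(s, ch);

    /* p points into the array that s points to, so p - s is valid */
    return p != NULL ? p - s : -1;
}
```

For example, index_of("AaBb", 'B') yields 2, and a character that does not occur in the string yields -1.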
I googled for programs to do this and they mostly just subtracted 'a'
from the input, which I gather is ASCII only

That's not ASCII only, but it assumes that the letters are consecutive
and ordered. This is the case for ASCII and possibly other encodings,
but there are still other encodings for which it isn't true. Anyway,
the C standard doesn't guarantee consecutiveness and order of the
letters, so one shouldn't rely on it in portable programming.
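
For comparison, the subtract-'a' approach mentioned in the question might look like the sketch below. The function name rot13_lower_contiguous is made up for illustration, and the code deliberately bakes in the assumption discussed above: that 'a' through 'z' are consecutive and ordered. That holds for ASCII but not, for instance, for EBCDIC, where the lowercase letters are not contiguous.

```c
/* Rotates lowercase letters by 13 places.
   ASSUMPTION: 'a'..'z' have consecutive, increasing values. The C standard
   guarantees this only for the digits '0'..'9', not for letters. */
int rot13_lower_contiguous(int ch)
{
    if (ch >= 'a' && ch <= 'z')
        return 'a' + (ch - 'a' + 13) % 26;
    return ch;   /* everything else passes through unchanged */
}
```

The table-plus-strchr version avoids this assumption because it never does arithmetic on character values, only on positions within its own string.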
- so is my program any more portable?
Yes.

Would you recommend another method for things like this?

No, this one is fine.

Martin
 
Stephen L.

Rob,

I built your program and provided
the following input -

AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz

and received the following output -

NnOoPpQqRrSsTtUuVvWwXxYyZzAaBbCcDdEeFfGgHhIiJjKkLlMm

I'd say you achieved your objective.
Everything you've done is, IMHO, quite
legal and portable.

I'd also say that your understanding
far exceeds pointers, and you'll be
one hell-of-a C programmer someday.
What programming background are you
coming from?

BTW, I built it on Sun Solaris 8 using gcc.


Stephen
 
Rob Morris

Gee, thanks. My programming background so far has been the barest
amount needed for a mechanical engineering degree. Hundred-line programs.

Thanks for the critique. I hope to someday reach the point where
pointers no longer scare me.

OTOH I'm quite tempted to turn to the dark side and learn the Windows
API, whereupon portability will become a foreign concept to me. Oh well :)
 
Malcolm

Rob Morris said:
I googled for programs to do this and they mostly just subtracted 'a'
from the input, which I gather is ASCII only - so is my program any
more portable?
Your way is more portable. ASCII is so common that many programs assume it,
but C doesn't require it.
Would you recommend another method for things like this?
The program is perfectly ok.

I would write a function

char rot13(char ch)
{
}

since the process of rot13 encoding is naturally a function. Armed with
this, we can then apply it to streams or strings as we desire.
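
One way the empty body above could be filled in, reusing the lookup string from the original program. This is only a sketch: the helper rot13_string is a hypothetical addition that shows the "apply it to strings" part, and the explicit '\0' test is there because strchr() would otherwise match the string terminator.

```c
#include <string.h>

/* Returns the rot13 counterpart of ch, or ch itself for non-letters. */
char rot13(char ch)
{
    static const char letters[] =
        "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz";
    const char *loc;

    if (ch == '\0')          /* strchr() would find the terminator */
        return ch;
    loc = strchr(letters, ch);
    if (loc == NULL)
        return ch;
    /* 26 positions in the interleaved table equal 13 letters */
    return letters[(loc - letters + 26) % 52];
}

/* Hypothetical helper: applies rot13() to a string in place. */
void rot13_string(char *s)
{
    for (; *s != '\0'; ++s)
        *s = rot13(*s);
}
```

Applying rot13_string to a buffer holding "Hello" leaves "Uryyb" in it; a stream version would call rot13() on each value returned by getchar() before passing it to putchar().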
 
Chris Torek

[much snippage]

while ((in = getchar()) != EOF) {
loc = strchr(letters, in);
if (loc != NULL)
out = *(((loc-letters+26)%52)+letters);
else
out = in;

There is one minor "bug-ette", as it were, that may even be impossible
to trigger depending on one's system. If you can somehow arrange for
getchar() to return 0 -- the '\0' character -- from a text stream
(stdin), the strchr() call will find the 0 at offset 52, so that
"loc-letters" is 52. 52 + 26 is of course 78, and 78 % 52 is 26,
and letters[26] is 'N', so a 0 byte is translated into an uppercase
N, as if it were an uppercase A originally.

This only occurs because strchr()'s specification says that if you
look for '\0', it should return a pointer to the '\0' marking the
end of the string. If you decide that translating '\0' to 'N' is
not desired -- even though no '\0' should appear in text in the
first place -- you can just check for it explicitly, e.g.:

loc = in ? strchr(letters, in) : NULL;

or:

if (loc != NULL && in != '\0')

for instance.
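
The two behaviours can be put side by side in a small sketch. The function names rot13_unchecked and rot13_checked are invented for this illustration; the lookup expression is the one from the original program.

```c
#include <string.h>

static const char letters[] =
    "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz";

/* Original lookup: strchr(letters, 0) returns a pointer to the
   terminator at offset 52, and (52 + 26) % 52 == 26 selects 'N'. */
int rot13_unchecked(int in)
{
    const char *loc = strchr(letters, in);
    return loc != NULL ? letters[(loc - letters + 26) % 52] : in;
}

/* Same lookup with the explicit test for 0 in front of strchr(). */
int rot13_checked(int in)
{
    const char *loc = in ? strchr(letters, in) : NULL;
    return loc != NULL ? letters[(loc - letters + 26) % 52] : in;
}
```

With these definitions rot13_unchecked(0) returns 'N', while rot13_checked(0) returns 0; for every other input the two functions agree.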
 
Rob Morris

Wow - that's one hard-to-spot bug! Got to test this:

K:\>more writebin.c
#include <stdio.h>

int main(void)
{
FILE *fp;
char buf[3] = {'a', 0, 'a'};

if ((fp=fopen("rot.bin", "wb")) != NULL) {
fwrite(buf, sizeof(char), 3, fp);
fclose(fp);
}
return 0;
}

K:\>gcc -o writebin.exe writebin.c

K:\>writebin

K:\>more rot.bin
a
a

K:\>rot13 <rot.bin
nNn

You're right (of course). Let's fix this:

Better code:

K:\>more rot13-2.c
#include <stdio.h>
#include <string.h>

int main (void)
{
    int in, out;
    char *loc;
    const char *const letters="AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz";

    while ((in = getchar()) != EOF) {
        loc = in ? strchr(letters, in) : NULL;
        if (loc != NULL)
            out = *(((loc-letters+26)%52)+letters);
        else
            out = in;
        putchar(out);
    }
    return 0;
}


K:\>gcc -o rot13-2.exe rot13-2.c

K:\>rot13-2 <rot.bin
n n

There we go. Not sure how much more useful the program is now, but I
guess the story has an important moral: always think about every
possible input you could get.

Thanks!
Rob
 
Neil Cerutti

Hi, I'm teaching myself C for fun. I wrote the little program
listed below to convert rot13 text. It reads one char at a
time and converts it via pointers.

It's a good program, but it's not enough to indicate that you
necessarily understand everything about pointers. However, you
used them correctly in this case.
The constant char* letters holds the alphabet. I subtract the
pointer returned from strchr from the address of letters to get
the location within the alphabet, then rot13 it. My question
is, is this safe and legal? (It works on my windows machine
BTW.) I googled for programs to do this and they mostly just
subtracted 'a' from the input, which I gather is ASCII only -
so is my program any more portable?

Would you recommend another method for things like this?

As reported, it's more portable, but it's slower. You can avoid
the overhead of strchr by converting your function into data: a
lookup table. However, a charset-independent table will use more
memory and requires run-time initialization to be portable.

For example, this program writes a header file that allows you to
convert to rot13. As long as your build dependencies are set up
correctly, I think it's portable.

#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
FILE *f;
char rot13[CHAR_MAX];
int i;

/* Init the table */
for (i = 0; i < CHAR_MAX; ++i) {
rot13[i] = i;
}
/* Works with all char sets for the english alphabet. */
rot13['a'] = 'n';
rot13['b'] = 'o';
rot13['c'] = 'p';
rot13['d'] = 'q';
rot13['e'] = 'r';
/* etc... 46 rote lines deleted to save bandwidth. */
rot13['Z'] = 'M';

/* write the header file */
f = fopen("rot13.h", "w");
if (f == NULL) return EXIT_FAILURE;

fprintf(f, "/* rot13.h: machine generated. Do not edit. */\n\n");
fprintf(f, "char rot13[] = {\n 0");
for (i = 1; i < CHAR_MAX; ++i) {
if (i%15 == 0) fprintf(f, ",\n %d", i);
else fprintf(f, ", %d", rot13[i]);
}
fprintf(f, "\n};\n");
fclose(f);
return 0;
}
 
Martin Dickopp

Neil Cerutti said:
As reported, it's more portable, but it's slower. You can avoid
the overhead of strchr by converting your function into data: a
lookup table. However, a charset-independent table will use more
memory and requires run-time initialization to be portable.

For example, this program writes a header file that allows you to
convert to rot13. As long as your build dependencies are set up
correctly, I think it's portable.

Not necessarily: The standard doesn't guarantee that an object can be
larger than 65535 bytes, but `CHAR_MAX' could well be much larger than
that.

Also, users with (e.g.) 32 bit bytes will probably not appreciate it
if a simple rot13 program takes up 2 gigabytes of memory.

Martin
 
Neil Cerutti

It's a good program, but it's not enough to indicate that you
necessarily understand everything about pointers. However, you
used them correctly in this case.


As reported, it's more portable, but it's slower. You can avoid
the overhead of strchr by converting your function into data: a
lookup table. However, a charset-independent table will use more
memory and requires run-time initialization to be portable.

However, there's no run-time initialization required for the
solution below.
For example, this program writes a header file that allows you
to convert to rot13. As long as your build dependencies are
set up correctly, I think it's portable.

#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
FILE *f;
char rot13[CHAR_MAX];
int i;

/* Init the table */
for (i = 0; i < CHAR_MAX; ++i) {
rot13[i] = i;
}
/* Works with all char sets for the english alphabet. */
rot13['a'] = 'n';
rot13['b'] = 'o';
rot13['c'] = 'p';
rot13['d'] = 'q';
rot13['e'] = 'r';
/* etc... 46 rote lines deleted to save bandwidth. */
rot13['Z'] = 'M';

/* write the header file */
f = fopen("rot13.h", "w");
if (f == NULL) return EXIT_FAILURE;

fprintf(f, "/* rot13.h: machine generated. Do not edit. */\n\n");
fprintf(f, "char rot13[] = {\n 0");
for (i = 1; i < CHAR_MAX; ++i) {
if (i%15 == 0) fprintf(f, ",\n %d", i);


Oops! Better get out my swatter.

if (i%15 == 0) fprintf(f, ",\n %d", rot13[i]);

else fprintf(f, ", %d", rot13[i]);
}
fprintf(f, "\n};\n");
fclose(f);
return 0;
}
 
Neil Cerutti

Not necessarily: The standard doesn't guarantee that an object
can be larger than 65535 bytes, but `CHAR_MAX' could well be much
larger than that.

Also, users with (e.g.) 32 bit bytes will probably not
appreciate it if a simple rot13 program takes up 2 gigabytes of
memory.

Thanks for the corrections. I hadn't considered such large char
values.
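
A run-time-initialized table that takes the size concern into account could be sketched as follows. The names rot13_table, rot13_table_init and rot13_lookup are hypothetical, and the sketch simply refuses to compile when bytes are wider than 8 bits, on the assumption that the strchr() version is the better choice there.

```c
#include <limits.h>
#include <stddef.h>

#if UCHAR_MAX > 255
#error "sketch assumes 8-bit bytes; use the strchr() version instead"
#endif

/* One entry per possible unsigned char value (256 under the assumption). */
static unsigned char rot13_table[UCHAR_MAX + 1];

/* Fills the table; must be called once before rot13_lookup(). */
void rot13_table_init(void)
{
    static const char letters[] =
        "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz";
    size_t i;

    for (i = 0; i <= UCHAR_MAX; ++i)      /* identity mapping by default */
        rot13_table[i] = (unsigned char)i;
    for (i = 0; i < 52; ++i)              /* then overwrite the letters */
        rot13_table[(unsigned char)letters[i]] =
            (unsigned char)letters[(i + 26) % 52];
}

/* ch is expected to be a value as returned by getchar(), other than EOF. */
int rot13_lookup(int ch)
{
    return rot13_table[(unsigned char)ch];
}
```

Because the table is built from the character values the implementation actually uses, it makes no assumption about the character set, and a 0 byte maps to itself since only the 52 letters are overwritten.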
 
hugo27

hugo 27, May 13, 2004
Re Mr. Morris's program, I have three comments.
1.) The *letters pointer is certainly const, but *loc must be
variable, since each new char input can result in new value
for loc.

2.) If input is non-alphabet, then loc=NULL and ELSE executes.
Character then prints unchanged. As intended?

3.) How does the program exit the While loop? As I understand it,
getchar() is line buffered, so here it's reading keyboard buffer.
What error or condition would cause getchar() to return EOF?
When user presses ENTER, that sends a '\n' code into the buffer,
but that's not EOF is it?
 
Martin Dickopp

while ((in = getchar()) != EOF) {
[...]

3.) How does the program exit the While loop? As I understand it,
getchar() is line buffered, so here it's reading keyboard buffer.

It's reading from whatever standard input is associated with. This may
or may not be a keyboard.
What error or condition would cause getchar() to return EOF?

End-of-file condition or a read error.
When user presses ENTER, that sends a '\n' code into the buffer,
but that's not EOF is it?

When standard input reads from a file, it's relatively obvious when
end-of-file condition occurs. When standard input reads from a
keyboard, there may be a system-specific way to make the terminal
driver signal end-of-file (e.g. by pressing Ctrl+D at the start of a
line on many Unix systems). When standard input reads from something
else, that device may also have a device-specific definition of when
end-of-file condition occurs.
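
The distinction between the two causes of EOF can be sketched in code. The function name count_until_eof is hypothetical, and it takes a FILE * so that the same loop works on stdin or on any other stream.

```c
#include <stdio.h>

/* Reads stream until getc() returns EOF and returns the number of
   characters read, or -1 if the EOF was caused by a read error
   rather than by end-of-file. */
long count_until_eof(FILE *stream)
{
    long n = 0;
    int c;   /* int, not char, so EOF stays distinct from every character */

    while ((c = getc(stream)) != EOF)
        ++n;   /* '\n' is counted like any other character */

    return ferror(stream) ? -1 : n;
}
```

Calling count_until_eof(stdin) in an interactive session keeps reading across presses of ENTER, since '\n' is an ordinary character; it returns only once the terminal or file reports end-of-file, or a read error occurs.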

Martin
 
