special character to strings and vice versa


Peter Monsson

Hi all,

I have a problem: some special characters have been "stringified",
meaning that instead of having, for example, a '\n' or '\x20' I have
"\n" and "\x20", and I have to convert them back to a normal char
(and actually be able to reverse this). Now I've made an
ugly switch statement to make this work, but at the same time I feel
that there must be a better way. Is there anything obvious I've
missed?

Thanks
Peter
 

Mike Wahler

Peter Monsson said:
Hi all,

I have a problem: some special characters have been "stringified",
meaning that instead of having, for example, a '\n' or '\x20' I have
"\n" and "\x20", and I have to convert them back to a normal char

char s[] = "\x20\n";
char c1 = s[0];
char c2 = s[1];

(and actually be able to reverse this). Now I've made an
ugly switch statement to make this work, but at the same time I feel
that there must be a better way. Is there anything obvious I've
missed?

You missed posting your code.

-Mike
 

sathya

Peter said:
Hi all,

I have a problem: some special characters have been "stringified",
meaning that instead of having, for example, a '\n' or '\x20' I have
"\n" and "\x20", and I have to convert them back to a normal char
(and actually be able to reverse this). Now I've made an
ugly switch statement to make this work, but at the same time I feel
that there must be a better way. Is there anything obvious I've
missed?

Thanks
Peter

I don't know what you mean by "f. x. a". But using gets() might be an idea.
gets() returns a char *. Assign the return value to a char variable.
You might have tried the above method in your switch case.
But unless you post some code, this group's
experts might not be able to help you.



--
"Combination is the heart of chess"
A.Alekhine
Mail to:
sathyashrayan25 AT yahoo DOT com
(AT = @ and DOT = .)
 

Jack Klein

sathya said:
I don't know what you mean by "f. x. a". But using gets() might be an idea.
gets() returns a char *. Assign the return value to a char variable.
You might have tried the above method in your switch case.
But unless you post some code, this group's
experts might not be able to help you.

Please don't post here if you do not know what the heck you are
talking about, and you most surely do not in this case.

1. Never, never, NEVER use gets(), or recommend its use. It is the
most dangerous function in the entire C standard library, bar none,
because there is absolutely no way to use it safely.

2. Even though gets() should NEVER be used, it does indeed return a
pointer to character. Suggesting that a pointer to character should
be assigned to a char is bad advice, first because it makes no sense
in this (or hardly any other) context, second because a cast is
required, and third because there is a good chance it will cause
undefined behavior in most cases.
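
For reference, the usual replacement for gets() is fgets(), which takes the size of the buffer and so cannot write past its end. A minimal sketch follows; read_line is a hypothetical helper name chosen for illustration, not something from this thread or from the standard library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from fp into buf (at most size-1
   characters) and strip the trailing newline, if there is one.
   Returns buf, or NULL at end of file or on a read error. */
char *read_line(char *buf, int size, FILE *fp)
{
    assert(buf != NULL && size > 0 && fp != NULL);

    if (fgets(buf, size, fp) == NULL)
        return NULL;

    /* strcspn gives the index of the first '\n', or of the
       terminating '\0' if the line had no newline. */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

Note that the result is still a pointer to the first character of the buffer, so it is used as a string and never assigned to a plain char.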
 

sathya

Jack said:
Please don't post here if you do not know what the heck you are
talking about, and you most surely do not in this case.

Yes, point taken as per faq 12.23.

 

Marcus Lessard

Jack Klein wrote:

1. Never, never, NEVER use gets(), or recommend its use. It is the
most dangerous function in the entire C standard library, bar none,
because there is absolutely no way to use it safely.

How does it remain in libraries? Do the standards people ever decide that
a function should be removed from all implementations to be compliant?
 

Joona I Palaste

How does it remain in libraries? Do the standards people ever decide that
a function should be removed from all implementations to be compliant?

Very rarely. They are too afraid to break existing code.
 

Peter Monsson

Mike Wahler said:
You missed posting your code.

Sorry about that.

Maybe I should clarify it a bit. I'm trying to read a frequency table
in the format:
character space number newline
Unfortunately, the characters may be of the form "a", "\\n" or "\\x20".
The backslash is a backslash character and that's what makes it so
annoying.


while (!feof(filePointer))
{
    fgets(buffer, BUFFER_LENGTH, filePointer);
    sscanf(buffer, "%s %s\n", smallbuffer, buffer);

    if (smallbuffer[0] != '\\')
        c = smallbuffer[0];
    else
        switch (smallbuffer[1])
        {
        case 'n':
            c = '\n';
            break;
        case 't':
            c = '\t';
            break;
        case 'v':
            c = '\v';
            break;
        case 'b':
            c = '\b';
            break;
        case 'r':
            c = '\r';
            break;
        case 'f':
            c = '\f';
            break;
        case 'a':
            c = '\a';
            break;
        default:
            /* Hex left out */
            break;
        }
    addEntry(table, c, buffer, &pos, &size);
}
 

Michael Mair

Peter said:
Sorry about that.

Maybe I should clarify it a bit. I'm trying to read a frequency table
in the format:
character space number newline
Unfortunately, the characters may be of the form "a", "\\n" or "\\x20".
The backslash is a backslash character and that's what makes it so
annoying.

Remarks:
- It would have been nice to repeat your original request,
  which was "Is there a more elegant way to do it than the following?"
- It is considered better to post code without tabs.
  Just replace the tabs with a decent number (>= 2) of spaces
  before pasting it.

while (!feof(filePointer))
{
    fgets(buffer, BUFFER_LENGTH, filePointer);
    sscanf(buffer, "%s %s\n", smallbuffer, buffer);

    if (smallbuffer[0] != '\\')
        c = smallbuffer[0];
    else
        switch (smallbuffer[1])
        {
        case 'n':
            c = '\n';
            break;
        case 't':
            c = '\t';
            break;
        case 'v':
            c = '\v';
            break;
        case 'b':
            c = '\b';
            break;
        case 'r':
            c = '\r';
            break;
        case 'f':
            c = '\f';
            break;
        case 'a':
            c = '\a';
            break;
        default:
            /* Hex left out */
            break;

Make this case 'x' and give an error as the default.

        }
    addEntry(table, c, buffer, &pos, &size);
}

Leave the code as blunt as it is. Everything more elegant
probably gives you just a headache or is harder to maintain.
I would write a function with the prototype
int GetType(char *smallbuffer)
which essentially does what you want and hides the ugliness.

C99 gives you a nice alternative due to designated
initializers:
-----
#include <limits.h>

/* UCHAR_MAX + 1 elements, so that index UCHAR_MAX itself is valid */
const char ctable[UCHAR_MAX + 1] = {
    ['n'] = '\n',
    ['t'] = '\t',
    .....
    ['a'] = '\a',
    ['x'] = 'x'
};
.....

if (smallbuffer[0] != '\\')
    c = smallbuffer[0];
else {
    if ( !(c = ctable[(unsigned char)smallbuffer[1]]) ) {
        /* c==0: same as default above; error treatment */
    } else if (c == 'x') {
        /* hex treatment */
    }
}
-----
(untested but hopefully illustrates the point)

In C89, building the table leads effectively to the same
code you already have, so you gain not much in clarity
and pay with memory.

Note: I do not know how you do your hex treatment but
wanted to make you aware of strtoul() (unsigned long
instead of long in order to be on the safe side) called
with base 16.
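
A rough sketch of what the hex treatment could look like with strtoul(). The helper name decode_hex_escape and the -1 error value are assumptions made for this illustration. The digits are checked with strspn() first because strtoul() on its own would also accept leading whitespace, a sign, or a "0x" prefix.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: decode a token of the form "\xHH" (a backslash,
   an 'x', then one or more hex digits and nothing else).
   Returns the character value, or -1 if the token is malformed or the
   value does not fit in an unsigned char. */
int decode_hex_escape(const char *token)
{
    const char *digits;
    size_t n;
    unsigned long value;

    assert(token != NULL);
    if (token[0] != '\\' || token[1] != 'x')
        return -1;

    /* Require at least one hex digit and nothing but hex digits. */
    digits = token + 2;
    n = strspn(digits, "0123456789abcdefABCDEF");
    if (n == 0 || digits[n] != '\0')
        return -1;

    errno = 0;
    value = strtoul(digits, NULL, 16);
    if (errno == ERANGE || value > UCHAR_MAX)
        return -1;
    return (int)value;
}
```

With this, "\x20" decodes to 32, while "\x", "\xZZ" and "\x0x20" are all rejected.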


Cheers
Michael
 

Michael Mair

Michael said:
C99 gives you a nice alternative due to designated
initializers:
-----
#include <limits.h>

/* UCHAR_MAX + 1 elements, so that index UCHAR_MAX itself is valid */
const char ctable[UCHAR_MAX + 1] = {
    ['n'] = '\n',
    ['t'] = '\t',
    ....
    ['a'] = '\a',
    ['x'] = 'x'
};
....

if (smallbuffer[0] != '\\')
    c = smallbuffer[0];
else {
    if ( !(c = ctable[(unsigned char)smallbuffer[1]]) ) {
        /* c==0: same as default above; error treatment */
    } else if (c == 'x') {
        /* hex treatment */
    }
}
-----

Addendum: Use this only for small enough values of UCHAR_MAX
(that is, of CHAR_BIT), or you are in for nasty surprises
with, say, 32-bit characters, where the table becomes huge.
In that case you can still size the table by the maximum of
'z', 'Z' and '9' (I don't know whether the standard says which
of these has the largest int value).
If you are very sure that nobody will ever use a wrong
character in the file, then you can go for

const char ctable['x'-'a'+1] = {
    ['n'-'a'] = '\n',
    ['t'-'a'] = '\t',
    .....
    ['a'-'a'] = '\a',
    ['x'-'a'] = 'x'
};

-Michael
 

Herbert Rosenau

Sorry about that.

Maybe I should clarify it a bit. I'm trying to read a frequency table
in the format:
character space number newline
Unfortunately, the characters may be of the form "a", "\\n" or "\\x20".
The backslash is a backslash character and that's what makes it so
annoying.


while (!feof(filePointer))

BUG! feof() can't detect EOF until _after_ an attempt to read past the
end of the file has failed. So the following read is attempted anyway.

{
    fgets(buffer, BUFFER_LENGTH, filePointer);

fgets() can hit EOF here, but its return value is not checked, so the
following code works on stale data.

    sscanf(buffer, "%s %s\n", smallbuffer, buffer);

When the current fgets() fails at EOF, it leaves buffer unchanged, so
buffer still holds data that was already handled: the last feof() had
not yet seen EOF. The number of items sscanf() read is NOT checked
either.
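
A minimal sketch of a loop without these problems: the return value of fgets() is the loop condition, and the return value of sscanf() is checked. The function name read_table, the buffer sizes, and reading the count with %lu are assumptions for illustration; the real code would decode the token and call addEntry() where this sketch only counts lines.

```c
#include <assert.h>
#include <stdio.h>

#define LINE_LENGTH 128   /* assumption: input lines are shorter than this */

/* Reads "token number" lines from fp and returns how many were
   well formed. Counting stands in for the real per-line work. */
long read_table(FILE *fp)
{
    char line[LINE_LENGTH];
    char token[8];          /* room for "\x20" and similar, plus '\0' */
    unsigned long count;
    long entries = 0;

    assert(fp != NULL);

    /* fgets returns NULL at end of file or on error, so the body
       never runs with a stale buffer. */
    while (fgets(line, (int)sizeof line, fp) != NULL) {
        /* %7s keeps the token inside its array; sscanf returns the
           number of items it converted, which must be 2 here. */
        if (sscanf(line, "%7s %lu", token, &count) != 2)
            continue;       /* malformed line: skip it */
        entries++;
    }
    return entries;
}
```

Unlike the feof() version, the last line of the file is processed exactly once, whether or not it ends with a newline.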
 

Mike Wahler

Marcus Lessard said:
How does it remain in libraries?

If it were removed, existing code which uses it would stop working.

Do the standards people ever decide that
a function should be removed from all implementations to be compliant?

I've never heard of such a case.

-Mike
 

Peter Monsson

Michael Mair said:
Remarks:
- It would have been nice to repeat your original request,
  which was "Is there a more elegant way to do it than the following?"
- It is considered better to post code without tabs.
  Just replace the tabs with a decent number (>= 2) of spaces
  before pasting it.

OK, I'll remember it for next time.

Leave the code as blunt as it is. Everything more elegant
probably gives you just a headache or is harder to maintain.
I would write a function with the prototype
int GetType(char *smallbuffer)
which essentially does what you want and hides the ugliness.

Hmm, yeah, that's probably the only way out. Thanks for your input though.

C99 gives you a nice alternative due to designated
initializers:
-----
#include <limits.h>

/* UCHAR_MAX + 1 elements, so that index UCHAR_MAX itself is valid */
const char ctable[UCHAR_MAX + 1] = {
    ['n'] = '\n',
    ['t'] = '\t',
    ....
    ['a'] = '\a',
    ['x'] = 'x'
};
....

if (smallbuffer[0] != '\\')
    c = smallbuffer[0];
else {
    if ( !(c = ctable[(unsigned char)smallbuffer[1]]) ) {
        /* c==0: same as default above; error treatment */
    } else if (c == 'x') {
        /* hex treatment */
    }
}
-----
(untested but hopefully illustrates the point)

In C89, building the table leads effectively to the same
code you already have, so you gain not much in clarity
and pay with memory.

Well, I'll just stick with the other version then.

Note: I do not know how you do your hex treatment but
wanted to make you aware of strtoul() (unsigned long
instead of long in order to be on the safe side) called
with base 16.

I'll look into it, thanks a lot Michael.

Cheers
Peter
 

Christopher Benson-Manica

Mike Wahler said:
If it were removed, existing code which uses it would stop working.

Well, it might be argued that that would be an improvement in the case
of gets() :)
 
