K&R 2 exercise 2-3

P

Peter Pichler

nrk said:
Nitpick, ITYM:
static const char * const ...

Here, and in other places as well :)

Correct, though I would use:
static const char uppercase[] = ...
static const char lowercase[] = ...

There are two advantages:
1. Saving 2*sizeof(char*) bytes, and
2. Being able to apply sizeof to 'uppercase' and 'lowercase'.

I am not aware of any disadvantages ;-)

Peter
 
I

Irrwahn Grausewitz

nrk said:
Nitpick, ITYM:
static const char * const ...

Here, and in other places as well :)

Umm, yes, don't know what I was thinking.
Thanks for correction. :)

Regards
 
M

Martin O'Brien

Hi~ i've studied C for a few months myself,

and i'd appreciate it if anyone could improve my coding or correct it.

the following is my solution to the K&R exercise 2-3

"Write the function htoi(s), which converts a string of hexademical digits
(including an optional 0x or 0X) into its equivalent integer value.
The allowable digits are 0 through 9, a through f, and A throught F."

Others have already provided valuable suggestions. I don't remember
seeing a mention of oveflow though. Anyway, this is Plauger's solution
from THE STANDARD C LIBRARY, a very useful C resource. Plauger calls
it hex2int rather than htoi.

/* hex2int.c: Convert hexadecimal number in any locale */
/* Source: THE STANDARD C LIBRARY by P J Plauger, p34 */
/* NOTE: This code does not check for overflow. That requires */
/* additional complexity. */

#include <ctype.h>
#include <string.h>

int hex2int(const char *s)
{
int value;
static const char xd[] =
{"0123456789abcdefABCDEF"};

static const char xv[] =
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 13, 14, 15,
10, 11, 12, 13, 14, 15};

for (value = 0; isxdigit(*s); ++s)
value = (value << 4) + xv[strchr(xd, *s) - xd];

return value;

}

Martin
http://martinobrien.co.uk/
 
D

David Rubin

Martin said:
/* hex2int.c: Convert hexadecimal number in any locale */
/* Source: THE STANDARD C LIBRARY by P J Plauger, p34 */
/* NOTE: This code does not check for overflow. That requires */
/* additional complexity. */

#include <ctype.h>
#include <string.h>

int hex2int(const char *s)
{
int value;
static const char xd[] =
{"0123456789abcdefABCDEF"};

static const char xv[] =
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 13, 14, 15,
10, 11, 12, 13, 14, 15};

for (value = 0; isxdigit(*s); ++s)
value = (value << 4) + xv[strchr(xd, *s) - xd];

return value;

I believe htoi() should return an unsigned int (perhaps unsigned long).
AFAIK, you cannot portably represent signed values in hex. In any case,
Plauger's code exhibits undefined behavior, so this is definitely not
the implementation you want to use.

/david
 
M

Martin O'Brien

David said:
I believe htoi() should return an
unsigned int (perhaps unsigned long). AFAIK,
you cannot portably represent signed values in hex.
In any case, Plauger's code exhibits undefined
behavior, so this is definitely not the
implementation you want to use.

The mistake is mine. I wrapped Plauger's code in a function definition
and wrongly put the return type as int.

Plauger says this:

---
To convert a hexadecimal number in any locale, write:

#include <ctype.h>
#include <string.h>
....
static const char xd[] =
{"0123456789abcdefABCDEF"};

static const char xv[] =
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 13, 14, 15,
10, 11, 12, 13, 14, 15};

for (value = 0; isxdigit(*s); ++s)
value = (value << 4) + xv[strchr(xd, *s) - xd];

Note that this code does not check for overflow. That requires
additional complexity.
 
C

CBFalconer

Martin said:
I believe htoi() should return an
unsigned int (perhaps unsigned long). AFAIK,
you cannot portably represent signed values in hex.
In any case, Plauger's code exhibits undefined
behavior, so this is definitely not the
implementation you want to use.

The mistake is mine. I wrapped Plauger's code in a function
definition and wrongly put the return type as int.

Plauger says this:

---
To convert a hexadecimal number in any locale, write:

#include <ctype.h>
#include <string.h>
...
static const char xd[] =
{"0123456789abcdefABCDEF"};

static const char xv[] =
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 13, 14, 15,
10, 11, 12, 13, 14, 15};

for (value = 0; isxdigit(*s); ++s)
value = (value << 4) + xv[strchr(xd, *s) - xd];

Note that this code does not check for overflow. That requires
additional complexity.

It still exhibits undefined behaviour. You need to define value.
Also s, although we can make some assumptions about that.
 
M

Martin

CBFalconer said:
It still exhibits undefined behaviour. You need to define value.
Also s, although we can make some assumptions about that.

I'm puzzled. How can it exhibit anything if it doesn't compile?

Martin
 
R

RoRsOaMrPiEo

David said:
I believe htoi() should return an
unsigned int (perhaps unsigned long). AFAIK,
you cannot portably represent signed values in hex.
In any case, Plauger's code exhibits undefined
behavior, so this is definitely not the
implementation you want to use.

The mistake is mine. I wrapped Plauger's code in a function definition
and wrongly put the return type as int.

Plauger says this:

---
To convert a hexadecimal number in any locale, write:

#include <ctype.h>
#include <string.h>
...
static const char xd[] =
{"0123456789abcdefABCDEF"};

static const char xv[] =
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 13, 14, 15,
10, 11, 12, 13, 14, 15};

for (value = 0; isxdigit(*s); ++s)
value = (value << 4) + xv[strchr(xd, *s) - xd];

Note that this code does not check for overflow. That requires
additional complexity.

htoi1 "check for overflow" ? Is it right?
How checking for overflow?

#include <stdio.h>


int lower(int c)
{if(c>='A' && c<='Z') return c+'a'-'A';
else return c;
}

int isdigit(int c)
{if(c>='0' && c<='9') return 1; else return 0;}

int ispace(int c)
{ if(c==' ' || c=='\t' || c=='\n') return 1; else return 0;}


int htoi1(char s[])
{unsigned i=0, n=0, d, h;
char c;

while( (c=s)==' ' || c=='\t' || c=='\n' )
i++;
if( c=='0' && ( s[i+1]=='x' || s[i+1]=='X') )
{i+=2; c=s;}
while(1)
{d= ('0'<=c && c<='9' ) ? c - '0' :
( c>='A' && c<='F' ) ? c - 'A' + 10:
( c>='a' && c<='f' ) ? c - 'a' + 10: 16;
if(d==16)
break;
if( (h=16 * n) < n || (n=h+d)<h || n<d )
{n= ((int)n)>0 ? ((~0u)>>1) : ((~0u)>>1)+1; break;}
c=s[++i];
}
return (int) n;
}


int htoi(char s[])
{int i=0,n=0;
while(ispace(s)) i++;
if(s=='0'&& lower(s[i+1])=='x') i+=2;
for( ; ;++i)
if( isdigit(s) ) n=16*n + (s-'0');
else if((lower(s)>='a' && lower(s)<='f'))
n=16*n+lower(s)-'a'+ 10;
else break;
return n;
}

main()
{char b[30];
int c, ch, j;
do{printf("\nLista di numeri esadecinmali(1 termina)>");
do{scanf("%s", &b);
c=htoi1(b);
printf("\nPartenza=%s\n",b);
printf("Arrivo =%d\n", c);
c=htoi(b);
printf("\nPartenza=%s\n",b);
printf("Arrivo =%d\n", c);
if((ch=fgetc(stdin) )=='\n') break;
else ungetc(ch,stdin);
} while(ch!=EOF );
}while(c!=1);
return 0;
}
 
M

Martin

Hello RoRsOaMrPiEo.

Your code is very busy. Why are you writing functions lower(), isdigit() and
isspace() when the Standard Library has them? (in the C Library lower() is
called tolower()).

And this thread should have convinced you that although you can assume the
characters '0' to '9' have consecutive values, you can not make that
assumption about upper and lower case letters. Which is why Plauger uses a
table of values xv, to define the values of the characters in xd.

Also, Chuck has already said that my function wrapper around Plauger's
suggestion was faulty because it returned int, which was an oversight on my
part, and your function now does the same, and even casts the results of its
calculations (in an unsigned int) to int before returning it.

I think what Plauger means about additional complexity for overflow is to
flag a warning if overflow is indicated and stop the conversion. It's what
I'd do anyway.

--
Martin
http://martinobrien.co.uk/


RoRsOaMrPiEo said:
htoi1 "check for overflow" ? Is it right?
How checking for overflow?

#include <stdio.h>
[...]
} while(ch!=EOF );
 
M

Martin Dickopp

Martin said:
I'm puzzled. How can it exhibit anything if it doesn't compile?

"Undefined behavior" means that the standard poses no requirements on
how a conforming implementation behaves. If a compiler is part of the
implementation, not compiling the code at all is therefore just as
allowed as generating executable code.

(FWIW, there are cases of undefined behavior which only occurs at
run-time for some input data. An obvious example is any program which
calls the `gets' function. In such cases, the compiler must compile the
program.)
 
C

CBFalconer

RoRsOaMrPiEo said:
.... snip ...

#include <stdio.h>

int lower(int c)
{if(c>='A' && c<='Z') return c+'a'-'A';
else return c;
}

Already non-portable code, with foul formatting. I see at least
four blanks you could have saved.
 
R

RoRsOaMrPiEo

Hello RoRsOaMrPiEo.

Your code is very busy. Why are you writing functions lower(), isdigit() and
isspace() when the Standard Library has them? (in the C Library lower() is
called tolower()).

ok, ok
And this thread should have convinced you that although you can assume the
characters '0' to '9' have consecutive values, you can not make that
assumption about upper and lower case letters. Which is why Plauger uses a
table of values xv, to define the values of the characters in xd.

I assume ascii or ebcdic and a two's complement pc
Also, Chuck has already said that my function wrapper around Plauger's
suggestion was faulty because it returned int, which was an oversight on my
part, and your function now does the same, and even casts the results of its
calculations (in an unsigned int) to int before returning it.

Do you show some numeric examples?
 
R

RoRsOaMrPiEo

I assume that CHAR_MAX < 512

#include <stdio.h>
#include <assert.h>
#include <limits.h>

int htoi1(char s[])
{unsigned i=0, n=0, d, h;
char c;

assert(s!=NULL);
while( (c=s)==' ' || c=='\t' || c=='\n' )
i++;
if( c=='0' && ( s[i+1]=='x' || s[i+1]=='X') )
{i+=2; c=s;}
while(1)
{d= ('0'<=c && c<='9' ) ? c - '0' :
( c>='A' && c<='F' ) ? c - 'A' + 10:
( c>='a' && c<='f' ) ? c - 'a' + 10: 16;
if(d==16)
break;
if( (h=16 * n) < n || (n=h+d)<h || n<d )
{n= ((int)n)>0 ? ((~0u)>>1) : ((~0u)>>1)+1; break;}
c=s[++i];
}
return (int) n;
}


char* volte1(void)
{static char xd[512] = {16};
xd['0']= 0; xd['1']= 1; xd['2']= 2; xd['3']= 3;
xd['4']= 4; xd['5']= 5; xd['6']= 6; xd['7']= 7;
xd['8']= 8; xd['9']= 9;
xd['a']=10; xd['b']=11; xd['c']=12; xd['d']=13;
xd['e']=14; xd['f']=15;
xd['A']=10; xd['B']=11; xd['C']=12; xd['D']=13;
xd['E']=14; xd['F']=15;
return xd;
}


int htoi2(char* s)
{unsigned i=0, n=0, d, h;
static char *p=0;
char c;

assert(s!=NULL);
if(p==0) p=volte1();

while( (c=s)==' ' || c=='\t' || c=='\n' )
i++;
if( c=='0' && ( s[i+1]=='x' || s[i+1]=='X') )
{i+=2; c=s;}
while(1)
{d= c>0 ? p[c]: 16;
if(d==16)
break;
if( n>(UINT_MAX/16) || ( h=(n*16) )> UINT_MAX - d )
{n= ((int)n)>0 ? INT_MAX : INT_MIN; break;}
n= h + d;
c=s[++i];
}
return (int) n;
}


int main(void)
{char b[30]={0};
int c, ch, j;

assert(CHAR_MAX<=512);
do{printf("\nLista di numeri esadecinmali(1 termina)>");
do{scanf("%s", &b);
c=htoi1(b);
printf("\nPartenza=%s\n",b);
printf("Arrivo =%d\n", c);
c=htoi2(b);
printf("\nPartenza=%s\n",b);
printf("Arrivo =%d\n", c);
if((ch=fgetc(stdin) )=='\n') break;
else ungetc(ch,stdin);
} while(ch!=EOF );
}while(c!=1);
return 0;
}

Do you like it?
 
R

RoRsOaMrPiEo

I assume that CHAR_MAX < 512

I assume '0'...'9','a'...'f','A'...'F' < 512


#include <stdio.h>
#include <assert.h>
#include <limits.h>

/* It is wrong ...*/
int htoi1(char s[])
{unsigned i=0, n=0, d, h;
char c;

assert(s!=NULL);
while( (c=s)==' ' || c=='\t' || c=='\n' )
i++;
if( c=='0' && ( s[i+1]=='x' || s[i+1]=='X') )
{i+=2; c=s;}
while(1)
{d= ('0'<=c && c<='9' ) ? c - '0' :
( c>='A' && c<='F' ) ? c - 'A' + 10:
( c>='a' && c<='f' ) ? c - 'a' + 10: 16;
if(d==16)
break;
if( (h=16 * n) < n || (n=h+d)<h || n<d )
{n= ((int)n)>0 ? ((~0u)>>1) : ((~0u)>>1)+1; break;}
c=s[++i];
}
return (int) n;
}


char* volte1(void)
{static char xd[512] = {16};
xd['0']= 0; xd['1']= 1; xd['2']= 2; xd['3']= 3; xd['4']= 4;
xd['5']= 5;
xd['6']= 6; xd['7']= 7; xd['8']= 8; xd['9']= 9;
xd['a']=10; xd['b']=11; xd['c']=12; xd['d']=13; xd['e']=14;
xd['f']=15;
xd['A']=10; xd['B']=11; xd['C']=12; xd['D']=13; xd['E']=14;
xd['F']=15;
return xd;
}

int ci_siamo_o_no(void)
{if('9'>= 512 ||
'a'>= 512 || 'b'>= 512 || 'c'>= 512 || 'd'>= 512 || 'e'>= 512 ||
'f'>= 512 ||
'A'>= 512 || 'B'>= 512 || 'C'>= 512 || 'D'>= 512 || 'E'>= 512 ||
'F'>= 512
) return 0;
return 1;
}

int htoi2(char* s)
{unsigned i=0, n=0, d, h;
static char *p=0;
char c;

assert(s!=NULL);
if(p==0) p=volte1();

while( (c=s)==' ' || c=='\t' || c=='\n' )
i++;
if( c=='0' && ( s[i+1]=='x' || s[i+1]=='X') )
{i+=2; c=s;}
while(1)
{d= c>0 && (unsigned) c < 512u ? p[c]: 16;
if(d==16)
break;
if( n>(UINT_MAX/16) || ( h=(n*16) )> UINT_MAX - d )
{n= ((int)n)>0 ? INT_MAX : INT_MIN; break;}
n= h + d;
c=s[++i];
}
return (int) n;
}


int main(void)
{char b[30]={0};
int c, ch, j;

c=ci_siamo_o_no();
assert(c==1);

do{printf("\nLista di numeri esadecinmali(1 termina)>");
do{scanf("%29s", &b);
c=htoi1(b);
printf("\nPartenza=%s\n",b);
printf("Arrivo =%d\n", c);
c=htoi2(b);
printf("\nPartenza=%s\n",b);
printf("Arrivo =%d\n", c);
if((ch=fgetc(stdin) )=='\n') break;
else ungetc(ch,stdin);
} while(ch!=EOF );
}while(c!=1);
return 0;
}
 
R

Richard Bos

RoRsOaMrPiEo said:
Do you say that you can make a "htoi()" better than my "htoi2()" ?

Easily. Why do work which the library will do for you?

#include <limits.h>
#include <stdlib.h>

unsigned int htoi(char *txt)
{
unsigned long tmp=strtoul(txt, 0, 16);
if (tmp>UINT_MAX) return UINT_MAX;
return tmp;
}

Shorter, sweeter, doesn't waste value space on negative numbers which
you don't parse anyway, doesn't invoke undefined behaviour on overflow,
and is portable no matter what your character set it. Feel free.

Richard
 
R

RoRsOaMrPiEo

Already non-portable code, with foul formatting. I see at least
four blanks you could have saved.

Do you say that you can make a "htoi()" better than my "htoi2()" ?
 
R

Richard Bos

RoRsOaMrPiEo said:
but here htoi(8eeeeeeeeeeeeeeeeeeeeeee)==-1

this seems htoi(8eeeeeeeeeeeeeeeeeeeeeee)== INT_MAX

Imprimis, htoi(8eeeeeeeeeeeeeeeeeee) is a syntax error.
Secundis, htoi("8eeeeeeeeeeeeeeeeeeee") invokes undefined behaviour for
your implementation (and thus has a license to return anything,
including garbage, and even crash the machine), and returns UINT_MAX
(not INT_MAX) for my implementation, without being undefined.
but my htoi2 seems 2*faster than library htoi()

There is no library htoi() (not in Standard C, at least), and some
slight slowness, while not ideal, is at least to be preferred above a
license to crash your machine at any inconvenient moment.

Richard
 
R

RoRsOaMrPiEo

Easily. Why do work which the library will do for you?

#include <limits.h>
#include <stdlib.h>

unsigned int htoi(char *txt)
{
unsigned long tmp=strtoul(txt, 0, 16);
if (tmp>UINT_MAX) return UINT_MAX;
return tmp;
}

but here htoi(8eeeeeeeeeeeeeeeeeeeeeee)==-1

this seems htoi(8eeeeeeeeeeeeeeeeeeeeeee)== INT_MAX

unsigned int htoi(char *txt)
{unsigned long tmp=strtoul(txt, 0, 16);
if (tmp>=UINT_MAX)
return INT_MAX;
return tmp;
}
Shorter, sweeter, doesn't waste value space on negative numbers which
you don't parse anyway, doesn't invoke undefined behaviour on overflow,
and is portable no matter what your character set it. Feel free.

Richard
but my htoi2 seems 2*faster than library htoi()

C:>2_7_c
test is ok
Differenza libreria =6.000000 s
Differenza mia =2.000000 s

C:>2_7_c
test is ok
Differenza libreria =6.000000 s
Differenza mia =3.000000 s

C:>2_7_c
test is ok
Differenza libreria =6.000000 s
Differenza mia =3.000000 s

C:>2_7_c
test is ok
Differenza libreria =6.000000 s
Differenza mia =3.000000 s

C:>2_7_c
test is ok
Differenza libreria =6.000000 s
Differenza mia =3.000000 s


#include <stdio.h>
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <time.h>

char* riempi_hex(char* a, int lunghezza)
{static char ary[]="0123456789abcdefABCDEF";
int i, j, lungh;

assert( a!=NULL && lunghezza>13 );
lungh = rand() % 12;
for( j=0; j<lungh; ++j )
a[j]=ary[ rand() % 21 ];

a[j]=0;
return a;
}

unsigned int htoi(char *txt)
{unsigned long tmp=strtoul(txt, 0, 16);
if (tmp>=UINT_MAX)
return INT_MAX;
return tmp;
}

char* volte1(void)
{static char xd[512] = {16};
xd['0']= 0; xd['1']= 1; xd['2']= 2; xd['3']= 3; xd['4']= 4; xd['5']=
5;
xd['6']= 6; xd['7']= 7; xd['8']= 8; xd['9']= 9;
xd['a']=10; xd['b']=11; xd['c']=12; xd['d']=13; xd['e']=14;
xd['f']=15;
xd['A']=10; xd['B']=11; xd['C']=12; xd['D']=13; xd['E']=14;
xd['F']=15;
return xd;
}

int ci_siamo_o_no(void)
{if('9'>= 512 ||
'a'>= 512 || 'b'>= 512 || 'c'>= 512 || 'd'>= 512 || 'e'>= 512 ||
'f'>= 512 ||
'A'>= 512 || 'B'>= 512 || 'C'>= 512 || 'D'>= 512 || 'E'>= 512 ||
'F'>= 512
) return 0;
return 1;
}

int htoi2(char* s)
{unsigned i=0, n=0, d, h;
static char *p=0;
char c;

assert(s!=NULL);
if(p==0) p=volte1();

while( (c=s)==' ' || c=='\t' || c=='\n' )
i++;
if( c=='0' && ( s[i+1]=='x' || s[i+1]=='X') )
{i+=2; c=s;}
while(1)
{d= c>0 && (unsigned) c < 512u ? p[c]: 16;
if(d==16)
break;
if( n>(UINT_MAX/16) || ( h=(n*16) )> UINT_MAX - d )
{n= ((int)n)>0 ? INT_MAX : INT_MIN; break;}
n= h + d;
c=s[++i];
}
return (int) n;
}


int main(void)
{char b[30]={0};
int c, ch, j;
time_t x, y;

c=ci_siamo_o_no();
assert(c==1);
srand(time(0));

for(j=0; j<700000; ++j)
{riempi_hex(b, 30);
c=htoi(b); ch=htoi2(b);
if(c!=ch && c-ch!=-1)
{printf("\nPartenza=%s\n",b);
printf("Arrivo =%d\n", c );
printf("Arrivo1 =%d\nTest fails \n", ch);
}
}

printf("test is ok \n");
x=time(0);
for(j=0; j<7000000; ++j)
{riempi_hex(b, 30);
c=htoi(b);
if(c==9999999)
{printf("\nPartenza=%s\n",b);
printf("Arrivo =%d\n", c );
}
}

y=time(0);
printf("Differenza libreria =%f s \n", difftime(y,x) );

x=time(0);
for(j=0; j<7000000; ++j)
{riempi_hex(b, 30);
c=htoi2(b);
if(c==9999999)
{printf("\nPartenza=%s\n",b);
printf("Arrivo =%d\n", c );
}
}
y=time(0);
printf("Differenza mia =%f s \n", difftime(y,x) );

return 0;
}
 
R

RoRsOaMrPiEo

Do you say that you can make a "htoi()" better than my "htoi2()" ?

Excuse me if I gave offence to you
It would be easy if we use pointer instead of index
but

int htoi3(char* s)
{unsigned n=0, d, h;
static char *p=0;
char c;

assert(s!=NULL);
if(p==0) p=volte1();

while( (c=*s)==' ' || c=='\t' || c=='\n' )
s++;
if( c=='0' && ( s[1]=='x' || s[1]=='X') )
{s += 2; c = *s;}
while(1)
{d= c>0 && (unsigned) c < 512u ? p[c]: 16;
if(d==16)
break;
if( n>(UINT_MAX/16) || ( h=(n*16) )> UINT_MAX - d )
{n= ((int)n)>0 ? INT_MAX : INT_MIN; break;}
n= h + d;
c=*++s;
}
return (int) n;
}

seems slower of htoi2 in my sys.
 
R

RoRsOaMrPiEo

Imprimis, htoi(8eeeeeeeeeeeeeeeeeee) is a syntax error.
Secundis, htoi("8eeeeeeeeeeeeeeeeeeee") invokes undefined behaviour for
your implementation (and thus has a license to return anything,
including garbage, and even crash the machine), and returns UINT_MAX
(not INT_MAX) for my implementation, without being undefined.

so your is htou and not htoi
There is no library htoi() (not in Standard C, at least),

ok, htoi that use library (strtoul(txt, 0, 16))
and some
slight slowness, while not ideal, is at least to be preferred above a
license to crash your machine at any inconvenient moment.

crash where in code?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,141
Messages
2,570,813
Members
47,357
Latest member
sitele8746

Latest Threads

Top