A small program for filtering non-Chinese characters out of files

jvyyuie

// My email is (e-mail address removed)
// I want to make some friends and discuss programming.

#include "stdafx.h"
#include <malloc.h>
#include <string.h>
#define sec1 (c=buffer[i])&&(c>0x80&&c<0xA1||c>0xA9&&c<0xFF)
#define sec2 (c=buffer[i])&&(c>0xA0&&c<0xAA)

void Convert2PureChinese(unsigned char* buffer)
{
    unsigned char c;
    for(int i=0;i<(int)strlen((char*)buffer);)
        if(sec1)
        {
            printf("%c%c", buffer[i], buffer[i+1]);
            i+=2;
        }
        else if(sec2)
            i+=2;
        else
            i++;
}

void getFileBuffer(FILE* fp)
{
    if(fp==NULL)
        return;
    fseek(fp, 0L, SEEK_END);
    long len=ftell(fp);
    rewind(fp);
    unsigned char* buffer=(unsigned char*)malloc(len);
    fread(buffer, len, 1, fp);
    fclose(fp);
    Convert2PureChinese(buffer);
    free(buffer);
}

int main(int argc, char* argv[])
{
    getFileBuffer(fopen("test.txt", "r+b"));
    return 0;
}
 
Gianni Mariani

// My email is (e-mail address removed)
// I want to make some friends and discuss programming.

Great - nice to meet you. Have a C++ question?

Nice "C" program, not C++.
#include "stdafx.h"
Microsoft precompiled header nonsense - why do you need it here
exactly? Turn off precompiled headers and get rid of it if you want to
write truly portable code.
#include <malloc.h>

malloc.h is not commonly used in C++; you should probably use

#include <cstdlib>
#include <string.h>

In C++ you should include:

#include <cstring>
#define sec1 (c=buffer[i])&&(c>0x80&&c<0xA1||c>0xA9&&c<0xFF)
#define sec2 (c=buffer[i])&&(c>0xA0&&c<0xAA)


MUCH MUCH better if you wrote inline functions instead of macros.
void Convert2PureChinese(unsigned char* buffer)
{
    unsigned char c;
    for(int i=0;i<(int)strlen((char*)buffer);)

Why do you call strlen for every character?
        if(sec1)
        {
            printf("%c%c", buffer[i], buffer[i+1]);
            i+=2;
        }
        else if(sec2)
            i+=2;
        else
            i++;
}


Use std::istream instead of FILE* for C++
void getFileBuffer(FILE* fp)
{
    if(fp==NULL)
        return;
    fseek(fp, 0L, SEEK_END);
    long len=ftell(fp);
    rewind(fp);
    unsigned char* buffer=(unsigned char*)malloc(len);

C-style cast instead of C++-style casts. It also looks like you need to
learn about std::string (or std::basic_string).
    fread(buffer, len, 1, fp);
    fclose(fp);
    Convert2PureChinese(buffer);
    free(buffer);

Memory management is best done by the compiler. Use std::string.
}

int main(int argc, char* argv[])
{
    getFileBuffer(fopen("test.txt", "r+b"));

No error report if fopen fails?
    return 0;
}

The code below would look more like C++.

#include <iostream>
#include <ostream>
#include <istream>
#include <fstream>

inline bool Sec1( unsigned char c )
{
    return ( c > 0x80 && c < 0xA1 ) || ( c > 0xA9 && c < 0xFF );
}

inline bool Sec2( unsigned char c )
{
    return ( c > 0xA0 ) && ( c < 0xAA );
}

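std::ostream & Convert2PureChinese(
    std::istream & i_i,
    std::ostream & i_o
) {

    char c;

    // Test the read itself, so a failed get() at end of file
    // does not cause the previous byte to be processed again.
    while ( i_i.get( c ) )
    {
        if ( Sec1( c ) )
        {
            i_o << c;
            // Only emit the trailing byte if it was actually read.
            if ( i_i.get( c ) )
            {
                i_o << c;
            }
        }
        else if ( Sec2( c ) )
        {
            // Skip the trailing byte of a two-byte symbol.
            i_i.get( c );
        }
    }

    return i_o;
}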

int main(int argc, char* argv[])
{
    std::ifstream i_file( "test.txt", std::ios_base::binary );

    if ( i_file.fail() )
    {
        std::cerr << "Failed to open test.txt\n";
        return 1;
    }

    Convert2PureChinese( i_file, std::cout );
}
 
Jon Slaughter

Gianni Mariani said:
Great - nice to meet you. Have a C++ question ?

Nice "C" program, not C++.

Microsoft precompiled header nonsense - why do you need it here exactly?
Turn off precompiled headers and get rid of it if you want to write truly
portable code.

Why bitch about it? If it makes no difference then why make an issue? Just
because you don't like Microsoft doesn't mean you need to make this an issue
when it's not. It definitely speeds up the compiling process, so it is
needed... If he wants to port the code (which is usually never the case
anyway), you know what? It takes about 2 seconds to turn off precompiled
headers (if that) and about 1 second to remove the #include directive. (And if it's
a very large project, I'm sure he can make it so he can remove this without
ill effect.)

And I promise you that just turning off precompiled headers isn't "going
to make it truly portable".


It seems like one should bitch more about the fact that it is a C program and this is
a C++ newsgroup than about including an MS-specific header? (Actually, I
just rename my precompiled headers to Headers.h ;)

Jon
 
Julián Albo

Jon said:
And I promise you that just turning off precompiled headers isn't "going
to make it truly portable".

Bad logic.

"Programs are better if you don't use many gotos"

"Hey, just by limiting the use of gotos you don't have a good program"
 
Jon Slaughter

Julián Albo said:
Bad logic.

"Programs are better if you don't use many gotos"

"Hey, just by limiting the use of gotos you don't have a good program"

Um, actually not. Your logic must be worse than mine then, because you're not
following the logic of his comment.

He said that if you want to make your code "truly" portable then you must
remove the stdafx header... True enough, but he is implying this is basically
the only cause, that it is so significant that all other issues of
portability are inconsequential. This is simply not true: if it were the
only portability issue, then it's not even an issue, since it's so easy to fix.

I.e., it's very simple. Why would someone make such an issue out of
something that's so easy to change? I'm sure if the OP ever wanted to make
his code portable he wouldn't have that big a deal with the stdafx... It would
take a few minutes to fix at most. Oh, but I'm sure it's not possible he could have
some real portability problems that might take him hours to fix or even
longer?

The fact is that he's only bitching about it because he doesn't like
Microsoft and he wants to make it an issue with everyone that does (not that
I do), instead of ignoring it, since it has nothing to do with the actual
question (well, it could, say, if there are a lot of headers included in the
stdafx that were not mentioned, but still).

So, here's some logic that I guess you won't understand? Since you are
"sticking up" for him you too must hate Microsoft? (And I bet you my
conclusion is right whether you agree with my logic or not.)


I mean, I just don't really see why you guys, who surely must hold your time
valuable, would bitch about something that is irrelevant to the problem asked
when it can simply be ignored, or a 5 word sentence can be used to say it's
not a good idea. What this implies is that there is some ulterior reason why
someone would take some extra time to make it an issue.

Now I could understand if the OP asked "MY CODE ISN'T COMPILING IN GC++
PLEASE HELP!!", but when he is asking for something completely different, and
the solution to the problem is irrelevant to his code being portable, and
given the "tone" that is used, then I can only come to three conclusions:
either he's just very arrogant and has a huge ego and likes to find ways to
let it be known, he is obsessed with portability, or he just hates anything
that has to do with Microsoft and anyone that uses a Microsoft product.
I'll bet it's a combination of the first and last, but that's just my guess.

The problem I have with it, and it's not ulterior, is that I assume this
group is supposed to be a support group and the whole point is to help if you
want. Help != throwing in your own egotistical comments that are irrelevant
to the solution. So I think it's fair game to be bitched at if you think it's
good to bitch at someone. If you are going to combine your help with an
attitude then I think you deserve what you get. I don't know what happened
to this NG but it used to be so much better.

Anyways, I got better things to do than talk about this crap.
 
Jon Slaughter

Jon Slaughter said:
Um, actually not. Your logic must be worse than mine then, because you're not
following the logic of his comment.
<snip>


I guess I should say that if I'm wrong then I apologize... I don't want to
misinterpret what someone says, but usually if it walks like a duck, swims
like a duck, quacks like a duck, craps like a duck, smells like a duck,
etc... then it's usually a duck.

Jon
 
Gianni Mariani

Jon said:
Why bitch about it?

I have run into more problems using stdafx.h than any other header file.
For most applications, it does not make any noticeable difference to
compilation times. It also usually includes non-portable header files.
Ever since I adopted the practice of removing stdafx.{h,cpp} and
turning off precompiled headers, I have never had to deal with those
issues. Time not dealing with those issues is time saved.

If it makes no difference then why make an issue? Just
because you don't like Microsoft doesn't mean you need to make this an issue
when it's not.

Asserting what I like and don't like is a practice that usually leads
you to the wrong conclusions.

It definitely speeds up the compiling process, so it is
needed...

I have never seen it do so in any appreciable way, and certainly not for
the OP's code.

If he wants to port the code (which is usually never the case
anyway),

The practice of writing portable code usually speeds up the development
process. In my current job we do Win32 and
Linux (both IA32 and AMD64) builds as automated builds. The number of
times that a bug introduced in the Win32 code was not caught by
the MS compiler (and vice versa) is astonishing, right down to race
conditions that appeared on only one platform but not the others.
Finding bugs early means lower development cost.

Doing this means you need a compatibility library.

you know what? It takes about 2 seconds to turn off precompiled
headers (if that) and about 1 second to remove the #include directive.

Yep - so do it.

(And if it's
a very large project, I'm sure he can make it so he can remove this without
ill effect.)

And I promise you that just turning off precompiled headers isn't "going
to make it truly portable".

It is if it contains OS-specific headers.

It seems like one should bitch more about the fact that it is a C program and this is
a C++ newsgroup than about including an MS-specific header? (Actually, I
just rename my precompiled headers to Headers.h ;)

I suspect the OP's code would probably compile just as well in C++ as it
would in C, so technically, it's "standard" C++ except for stdafx.h and
malloc.h.
 
Julián Albo

Jon said:
(snip)

So, here's some logic that I guess you won't understand? Since you are
"sticking up" for him you too must hate Microsoft? (And I bet you my
conclusion is right whether you agree with my logic or not.)

I must correct myself. Your logic is not bad... it is not logic at all.
 
Bill Thompson

// My email is (e-mail address removed)
// I want to make some friends and discuss programming.

#include "stdafx.h"
#include <malloc.h>
#include <string.h>
#define sec1 (c=buffer[i])&&(c>0x80&&c<0xA1||c>0xA9&&c<0xFF)
#define sec2 (c=buffer[i])&&(c>0xA0&&c<0xAA)


It appears that sec2 is a subset of sec1; if sec1 is false, sec2 will be
false as well. Thus, the else condition marked below will never be
executed.
void Convert2PureChinese(unsigned char* buffer)
{
    unsigned char c;
    for(int i=0;i<(int)strlen((char*)buffer);)
        if(sec1)
        {
            printf("%c%c", buffer[i], buffer[i+1]);
            i+=2;
        }


never executed?
        else if(sec2)
            i+=2;
        else
            i++;
}

void getFileBuffer(FILE* fp)
{
    if(fp==NULL)
        return;
    fseek(fp, 0L, SEEK_END);
    long len=ftell(fp);
    rewind(fp);
    unsigned char* buffer=(unsigned char*)malloc(len);
    fread(buffer, len, 1, fp);
    fclose(fp);
    Convert2PureChinese(buffer);
    free(buffer);
}

int main(int argc, char* argv[])
{
    getFileBuffer(fopen("test.txt", "r+b"));
    return 0;
}
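
Whether the sec2 branch can ever run is easy to check mechanically. The sketch below is only an illustration: it writes the two macro conditions as predicates (the same Sec1/Sec2 shape used earlier in the thread) and counts the byte values for which both hold; CountOverlap is a hypothetical helper name. Read literally, sec1 covers 0x81..0xA0 and 0xAA..0xFE while sec2 covers 0xA1..0xA9, so the two ranges appear to be disjoint rather than nested, which would mean a byte such as 0xA5 fails sec1 and still reaches the else-if.

```cpp
// The two macro conditions written as predicates.
inline bool Sec1( unsigned char c )
{
    return ( c > 0x80 && c < 0xA1 ) || ( c > 0xA9 && c < 0xFF );
}

inline bool Sec2( unsigned char c )
{
    return ( c > 0xA0 ) && ( c < 0xAA );
}

// Hypothetical helper: count byte values (0..255) that satisfy
// both predicates at once.
int CountOverlap()
{
    int overlap = 0;
    for ( int v = 0; v < 256; ++v )
    {
        const unsigned char c = static_cast<unsigned char>( v );
        if ( Sec1( c ) && Sec2( c ) )
        {
            ++overlap;
        }
    }
    return overlap;
}
```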
 
