c++ conversion files

K

kalio80

Hi everyone, I am trying to create a program that converts text files
from Unix to Windows and from Windows to Unix.
I understand the general concept:
Unix uses a line feed (LF);
Windows uses CRLF, a carriage return followed by a line feed.
My program prompts the user for a file to open and then
prompts for a destination file in which to save the newly formatted file.
I have had a few problems that I have been trying to solve for a few
hours.
Here is the code I created:
#include <iostream>
#include <fstream>
using namespace std;

const string unix = "/uw";
const string windows = "/wu";
const string help = "/?";
int uw();
int wu();
int helpfile();

int main()
{


string option;
cout << "\n";
cout << "************************ Menu *********************\n";
cout << "\n";
cout << " Please choose from the following list\n";
cout << "\n";
cout << " * type /uw to convert a file from Unix to Windows\n";
cout << " * type /wu to convert a file from Windows to Unix\n";
cout << " * type Help to Display the help file\n";
getline(cin,option);


while((option != unix) && (option!= windows) && (option != help) ){
cout << " that's not a valid option! Try again\n";
getline(cin,option);
}

if (option == unix){
uw();
}else if(option == windows){
wu();
}else if(option == help){
helpfile();
}

return 0;
}



int uw(){
char FileName[20];
char Destination[20];
cout << "please enter the name of the source file: \n";
cin >> FileName;

ifstream in(FileName,ios::binary | ios:: in);
if(!in){
cout << "Error opening source\n";
return 0;
}

cout << "please enter the name of the destination file: \n";
cin >> Destination;
while(FileName == Destination){
cout << "Source and Destination file names must be different please try another:\n";
cin >> Destination;
}
ofstream out(Destination,ios::binary | ios::out);
if(!out){
cout << "Error creating destination file \n";
return 0;
}


char c;

while(in.get(c)){
out.put(c);
if(c == 13){
cout << "Carriage return, not a Unix text file please reconsider option\n";
}
if(c == 10){
cout << "Line feed\n";
}
}

in.close();
out.close();



return 0;
}




int wu(){
char FileName[20];
char Destination[20];
cout << "please enter the name of the source file (i.e.) file1.txt\n";
cin >> FileName;
ifstream in(FileName ,ios::binary | ios:: in);

if(!in){
cout << "Error opening source\n";
return 1;
}
cout << "please enter the name of the destination file: \n";
cin >> Destination;

while(FileName == Destination){
cout << "Source and Destination file names must be different please try another:\n";
cin >> Destination;
}
ofstream out(Destination,ios::binary | ios::out);
if(!out){
cout << "Error creating destination file \n";
return 0;
}

char c;

while(in.get(c)){
out.put(c);
if(c == 13){
cout << "Carriage return\n";
}
if(c == 10){
cout << "Line feed\n";
}
}

in.close();
out.close();



return 0;


}



int helpfile(){

//system("pause");
cout << "\n\nHere is c:\\help.txt \n\n";
ifstream inf("c:\\help.txt",ios::in);
if(!inf){
cout << "Error reading file\n";
return 1;
}
string theLine = "";
while(getline(inf,theLine)){
cout << theLine << endl;
}
inf.close();

return 0;

}
 
Z

zentara

Hi everyone, I am trying to create a program that converts text files
from Unix to Windows and from Windows to Unix.
I understand the general concept:
Unix uses a line feed (LF);
Windows uses CRLF, a carriage return followed by a line feed.
My program prompts the user for a file to open and then
prompts for a destination file in which to save the newly formatted file.
I have had a few problems that I have been trying to solve for a few
hours.

Hi, I'm just learning C++ myself. I generally use Perl, which would
make this a simple one-liner.

But for the sake of my learning, I tried to get your program to run, and
below is a working version. I removed the troublesome strings
you used in your menu and just went to ints, so you need to work
on your strings.

#include <iostream>
#include <fstream>
using namespace std;

#define CR 0x0d
#define LF 0x0a

int uw();
int wu();
int helpfile();

int main()
{
int option;
cout << "\n";
cout << "************************ Menu *********************\n";
cout << "\n";
cout << " Please choose from the following list\n";
cout << "\n";
cout << " * type 1 to convert a file from Unix to Windows\n";
cout << " * type 2 to convert a file from Windows to Unix\n";
cout << " * type 3 to Display the help file\n";

cin >> option;

while((option != 1) && (option != 2) && (option != 3) )
{
cout << " that's not a valid option! Try again\n";

cin >> option;

}
if (option == 1)
{
uw();
}
else if(option == 2)
{
wu();
}
else if(option == 3)
{
helpfile();
}
return 0;
}


int uw()
{
char FileName[20];
char Destination[20];
cout << "please enter the name of the source file: \n";
cin >> FileName;
ifstream in(FileName,ios::binary | ios:: in);
if(!in)
{
cout << "Error opening source\n";
return 0;
}
cout << "please enter the name of the destination file: \n";
cin >> Destination;
while(FileName == Destination)
{
cout << "Source and Destination file names must be different please try another:\n";
cin >> Destination;
}
ofstream out(Destination,ios::binary | ios::out);
if(!out)
{
cout << "Error creating destination file \n";
return 0;
}
char c;
while(in.get(c))
{
if(c == LF)
{
out.put(CR);
out.put(LF);
}
else
{
out.put(c);
}

}
in.close();
out.close();
return 0;
}

int wu()
{
char FileName[20];
char Destination[20];
cout << "please enter the name of the source file (i.e.) file1.txt\n";
cin >> FileName;
ifstream in(FileName ,ios::binary | ios:: in);
if(!in)
{
cout << "Error opening source\n";
return 1;
}
cout << "please enter the name of the destination file: \n";
cin >> Destination;
while(FileName == Destination)
{
cout << "Source and Destination file names must be different please try another:\n";
cin >> Destination;
}
ofstream out(Destination,ios::binary | ios::out);
if(!out)
{
cout << "Error creating destination file \n";
return 0;
}
char c;
while(in.get(c))
{
if(c != CR)
{
out.put(c);
}
}
in.close();
out.close();
return 0;
}


int helpfile()
{
//system("pause");
cout << "\n\nHere is help \n\n";
ifstream inf("help",ios::in);
if(!inf)
{
cout << "Error reading file\n";
return 1;
}
string theLine = "";
while(getline(inf,theLine))
{
cout << theLine << endl;
}
inf.close();
return 0;
}
 
J

Jonathan Turkanis

kalio80 said:
Hi everyone, I am trying to create a program that converts text files
from Unix to Windows and from Windows to Unix.
I understand the general concept:
Unix uses a line feed (LF);
Windows uses CRLF, a carriage return followed by a line feed.
My program prompts the user for a file to open and then
prompts for a destination file in which to save the newly formatted file.
I have had a few problems that I have been trying to solve for a few
hours.

You can use the newline filter from the Boost Iostreams library:

http://home.comcast.net/~jturkanis/iostreams/libs/io/doc/?path=5.9.2.2


The library will be part of the 1.33 release and is available here:

http://home.comcast.net/~jturkanis/iostreams/

Best Regards,
Jonathan Turkanis
 
J

Jerry Coffin

Hi everyone, I am trying to create a program that converts text files
from Unix to Windows and from Windows to Unix.

[ ... ]
I have had a few problems that I have been trying to solve for a few
hours.
Here is the code I created:

We can usually help solve problems better if you tell us what those
problems ARE. Looking through your code a little bit, I see a number
of problems. Most of them are minor, but repeated frequently throughout
the code. First and most obvious, your code seems to assume that the
standard library is in the global namespace, when it's actually in the
std namespace. This may reflect a problem in your compiler.

Second, I think the way you've split the code up into functions
could be improved -- for example, note that both uw() and wu() contain
(essentially identical) code to retrieve the names of the input and
output files.

Your code also contains some false assumptions: while it's true that
UNIX doesn't require a CR to signal the end of a line, it's also true
that CRs are sometimes included in text files under UNIX. Where they
occur, you probably want to pass them through unhindered.

Given that this is essentially a filter, I think making it interactive
is a mistake -- I'd have the user pass the input file, output file,
and conversion to be done on the command line. Programs like this that
insist on interaction generally get tiresome very quickly. Actually,
come to that, I'd probably also separate the functionality out into
two separate programs, one for each direction of conversion to make
them easier to use.

With those given, I'd write the code something like this:

// wu.c
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
int c;
FILE *infile, *outfile;

if (argc != 3) {
fprintf(stderr, "Usage: wu <infile> <outfile>\n");
return EXIT_FAILURE;
}

if (NULL == (infile=fopen(argv[1], "rb"))) {
fprintf(stderr, "Unable to open input file.\n");
return EXIT_FAILURE;
}

if (NULL == (outfile=fopen(argv[2], "wb"))) {
fprintf(stderr, "Unable to create output file.\n");
return EXIT_FAILURE;
}

while (EOF != (c=fgetc(infile)))
if ( c != '\r') /* Windows to Unix: drop CRs, pass everything else (including LF) */
fputc(c, outfile);

fclose(infile);
fclose(outfile);
return 0;
}

uw.c would be the same, except that the inner loop would look
something like:

while ( EOF != (c=fgetc(infile))) {
if ( c == '\n')
fputc('\r', outfile);
fputc(c, outfile);
}

I've used the C I/O operators -- for this task, I see no advantage to
iostreams.
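
One thing to watch with the uw loop above: it puts a CR in front of every LF, so a file that is already in Windows format comes out with CR CR LF line ends. A rough sketch of one way around that follows. It is written in C++ and works on an in-memory string rather than on FILE pointers, which is a choice made here for brevity and not something from the post, and the names unix_to_windows and windows_to_unix are hypothetical.

```cpp
#include <string>

// LF -> CRLF, but an LF that already follows a CR is left alone.
std::string unix_to_windows(const std::string& text)
{
    std::string result;
    result.reserve(text.size());
    char previous = '\0';
    for (char c : text) {
        if (c == '\n' && previous != '\r')
            result += '\r';
        result += c;
        previous = c;
    }
    return result;
}

// CRLF -> LF; a CR that is not followed by LF passes through unchanged.
std::string windows_to_unix(const std::string& text)
{
    std::string result;
    result.reserve(text.size());
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        if (text[i] == '\r' && i + 1 < text.size() && text[i + 1] == '\n')
            continue;  // drop only the CR of a CRLF pair
        result += text[i];
    }
    return result;
}
```

With these, running a file through the same direction twice gives the same result as running it once, and stray CRs in the middle of a line survive the Windows to Unix direction.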

If I was going to do both operations in a single program, I'd have a
common function for opening files, and then wu() and uw() would ONLY
do the conversion from one file to another. If you really want to do
things this way, I'd still make it look to the user like two separate
programs. Specifically, I'd create two separate hard links to the same
executable file, and inside the executable look at argv[0] to see
which name it was invoked under and act appropriately.
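
The argv[0] idea could look roughly like the sketch below. The names Direction and direction_from_program_name are invented for this example, and what argv[0] actually contains (full path, bare name, with or without an extension) varies between systems, so the helper strips a directory part and an extension before comparing.

```cpp
#include <string>

enum class Direction { UnixToWindows, WindowsToUnix, Unknown };

// Hypothetical helper: decides the conversion from the name the program
// was started under ("uw" or "wu").
Direction direction_from_program_name(const std::string& argv0)
{
    // Strip any directory part ('/' on Unix, '\\' on Windows).
    std::string::size_type slash = argv0.find_last_of("/\\");
    std::string name =
        (slash == std::string::npos) ? argv0 : argv0.substr(slash + 1);

    // Strip an extension such as ".exe".
    std::string::size_type dot = name.rfind('.');
    if (dot != std::string::npos)
        name.erase(dot);

    if (name == "uw") return Direction::UnixToWindows;
    if (name == "wu") return Direction::WindowsToUnix;
    return Direction::Unknown;
}
```

main would call direction_from_program_name(argv[0]) and print the usage message when the result is Direction::Unknown.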
 
J

Jonathan Turkanis

Given that this is essentially a filter,

Right. So why not encapsulate it in a reusable component, like the newline
filter I posted?
I've used the C I/O operators -- for this task, I see no advantage to
iostreams.

The advantage is that iostreams is an extensible framework. In particular, the
newline filter can be combined with any number of other filters; this allows
user-defined streams which perform some other sort of filtering to support a
text-mode.

Jonathan
 
K

kalio80

zentara said:
On 3 Dec 2004 20:56:23 -0800, (e-mail address removed) (kalio80) wrote:
*******************
KALIO80 Wrote
*******************
****************
Zentara Wrote
******************
Hi, I'm just learning C++ myself. I generally use Perl, which would
make this a simple one-liner.

But for the sake of my learning, I tried to get your program to run, and
below is a working version. I removed the troublesome strings
you used in your menu and just went to ints, so you need to work
on your strings.
****************
Kalio80 Wrote
****************
Thanks Zentara, this code compiles without any errors, but I can't
verify whether it's actually doing the conversion. I have a txt
file that was written with vi on Linux and stored in a Windows XP folder,
but it still reads as one long line to the end of the file. Even
after being copied and converted it's still the same. I think there
might be a problem with the way I am trying to solve this.
 
Z

zentara

Thanks Zentara, this code compiles without any errors, but I can't
verify whether it's actually doing the conversion. I have a txt
file that was written with vi on Linux and stored in a Windows XP folder,
but it still reads as one long line to the end of the file. Even
after being copied and converted it's still the same. I think there
might be a problem with the way I am trying to solve this.

Look at the file with a hex editor, and you will see for sure what
is in the original file and the output.

The program I wrote works on Linux; I can take a file and switch
it back and forth between DOS and Unix format. If you
are using Linux, you will see "control-M" at the end of each line of a DOS file
in most text editors, like vi. I use mcedit, from Midnight Commander.
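
If no hex editor is at hand, counting the line-ending bytes gives the same information. The following is only a sketch: LineEndingCounts and count_line_endings are names made up here, and the bytes are assumed to have been read from the file in binary mode so that nothing is translated on the way in.

```cpp
#include <cstddef>
#include <string>

struct LineEndingCounts {
    std::size_t crlf = 0;     // CR immediately followed by LF (DOS/Windows)
    std::size_t lone_lf = 0;  // LF not preceded by CR (Unix)
    std::size_t lone_cr = 0;  // CR not followed by LF
};

// Hypothetical helper: classifies every line ending found in `bytes`.
LineEndingCounts count_line_endings(const std::string& bytes)
{
    LineEndingCounts counts;
    for (std::string::size_type i = 0; i < bytes.size(); ++i) {
        if (bytes[i] == '\r') {
            if (i + 1 < bytes.size() && bytes[i + 1] == '\n') {
                ++counts.crlf;
                ++i;  // skip the LF that belongs to this pair
            } else {
                ++counts.lone_cr;
            }
        } else if (bytes[i] == '\n') {
            ++counts.lone_lf;
        }
    }
    return counts;
}
```

Running it on the source and on the converted file shows at a glance whether the conversion changed anything; three zeros mean the file has no line endings at all.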

I don't use windows, so maybe windows is trying to do something
"auto-magically" for you?
 
Z

zentara

As an afterthought, the reason your file probably stays the same
before and after conversion is that it may be completely devoid of
line endings of any kind. It is just one big long line.

In that case you may want to use a "line wrap" function on it, where
you would count chars, and if you haven't seen a LF in, say, x number
of words, you insert one. How to determine "words" can be as simple as
looking for spaces.
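
A rough sketch of such a line-wrap function, counting characters rather than words and breaking at the first space once the count is reached. The name wrap_at_spaces and the width parameter are assumptions made for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: once at least `width` characters have been written
// since the last LF, the next space is replaced by an LF.
std::string wrap_at_spaces(const std::string& text, std::size_t width)
{
    std::string result;
    result.reserve(text.size());
    std::size_t column = 0;  // characters written since the last LF
    for (char c : text) {
        if (c == '\n') {
            result += c;     // existing line ends are kept
            column = 0;
        } else if (c == ' ' && column >= width) {
            result += '\n';  // break here instead of writing the space
            column = 0;
        } else {
            result += c;
            ++column;
        }
    }
    return result;
}
```

Lines can come out longer than width, because a break only happens at a space; a word that is itself longer than width is never split.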

Once again, look at the file with a hex editor.
 
K

Konstantin Litvinenko

Hello, Jonathan!
You wrote on Sat, 4 Dec 2004 14:36:59 -0700:

JT> You can use the newline filter from the Boost Iostreams library:

JT> http://home.comcast.net/~jturkanis/iostreams/libs/io/doc/?path=5.9.2.2


JT> The library will be part of the 1.33 release and is available here:

JT> http://home.comcast.net/~jturkanis/iostreams/

The library is superior, but it won't compile with Boost CVS :((( -
something is wrong with mpl/apply_if; actually, there is no such file. Where can I
get working sources?

With best regards, Konstantin Litvinenko. E-mail: (e-mail address removed)
 
J

Jonathan Turkanis

Konstantin Litvinenko said:
Hello, Jonathan!
You wrote on Sat, 4 Dec 2004 14:36:59 -0700:

JT> You can use the newline filter from the Boost Iostreams library:

JT> http://home.comcast.net/~jturkanis/iostreams/libs/io/doc/?path=5.9.2.2


JT> The library will be part of the 1.33 release and is available here:

JT> http://home.comcast.net/~jturkanis/iostreams/

The library is superior

Thanks!


, but it won't compile with Boost CVS :((( -

Whoops! I forgot to mention that I haven't updated it to use
<boost/mpl/eval_if.hpp> and boost::mpl::eval_if instead of the old apply_if.

I'll do this soon. I guess I should stop posting links to the library until this
is done.
something is wrong with mpl/apply_if; actually, there is no such file. Where can I
get working sources?

With best regards, Konstantin Litvinenko. E-mail: (e-mail address removed)

Best Regards,
Jonathan
 
J

Jerry Coffin

[ ... ]
Right. So why not encapsulate it in a reusable component, like the newline
filter I posted?

Because I think this is a cure worse than the disease -- I'm
reasonably certain your component makes the code harder to read and
write.

Worse, your code doesn't look particularly extensible to me, so in quite
a few cases seemingly minor changes in the filtration to be done would
require tossing the existing code entirely, and starting over again
from the beginning.

The code I posted was more or less typical of small filters: two or
three lines of actual filtration code, surrounded by 30+ lines doing
the bare minimum necessary to ensure against things like accessing
beyond the end of argv or attempting to read/write a file that didn't
open properly. Frankly, its error handling is sufficiently poor that
it qualifies as "barely acceptable" only if we assume that the user is
a programmer or somebody on that order.

IMO, if you want to assist in writing filters, this is the area to
address -- checking argv, opening files, checking for nonexistent
input files, pre-existing output files, etc. Even doing it poorly is
tedious, repetitious and rarely has much need to vary much from one
filter to the next. Doing it well is so rare it's hardly worth
discussing.

In the end, improving this part of filter writing is likely to produce
much greater improvements for not only the coder, but also (more
importantly) for the user as well.
 
J

Jonathan Turkanis

Jerry Coffin said:
"Jonathan Turkanis" <[email protected]> wrote in message
[ ... ]
Right. So why not encapsulate it in a reusable component, like the newline
filter I posted?

Because I think this is a cure worse than the disease -- I'm
reasonably certain your component makes the code harder to read and
write.

To convert a file from, say, classic mac to windows you would write (leaving
aside error handling, which is orthogonal to line-ending conversion):

using namespace std;
using namespace boost::io;

ifstream in("src", ios::binary | ios::in);
ofstream out("dest", ios::binary | ios::out);

filtering_istream fin;
fin.push(newline_filter(newline::windows));
fin.push(in);
boost::io::copy(fin, out);

It seems to me this code is self-explanatory. Furthermore it allows you to
choose, e.g., whether to pass through stray LF's or to suppress them.
Worse, your code doesn't look particularly extensible to me,

It's extensible in the sense that conforming components can be mixed and
matched.
so in quite
a few cases seemingly minor changes in the filtration to be done would
require tossing the existing code entirely, and starting over again
from the beginning.

Are you talking about the implementation of newline filter? Yes, if you want it
to do something different you have to rewrite it. But it's a fairly flexible
library component which can do most types of line-ending conversion
out-of-the-box.
The code I posted was more or less typical of small filters: two or
three lines of actual filtration code, surrounded by 30+ lines doing
the bare minimum necessary to ensure against things like accessing
beyond the end of argv or attempting to read/write a file that didn't
open properly.

You're addressing the problem of writing chainable command-line tools. I'm
addressing line-ending conversions. I guess both problems are relevant since the
OP didn't say what he/she was having trouble with.

Jonathan
 
J

jcoffin

Hi Jonathan,

To me, your code looks far more impenetrable than self-explanatory.
If I had to guess at what it did, I'd have guessed at it treating
UNIX-style new-lines as the default. It associates Windows-style
new-lines with the input, and nothing in particular with the output, so
from reading the code, I'd have guessed this was supposed to do
Windows->UNIX conversion.

I've reread your code a number of times, and I'm _utterly_ lost as to
what you think would give the reader even the faintest hint that it
does anything related to MacOS at all.

By contrast, looking at:

FILE *fin = fopen("src", "rb");
FILE *fout = fopen("dest", "wb");
int ch;

while ( EOF!=(ch=getc(fin))) {
putc(ch, fout);
if (ch == '\r')
putc('\n', fout);
}

The actual conversion seems (to me) almost impossible to miss. The only
part open to any question at all is whether the reader realizes what
OSes are associated with CR-only vs. CR/LF line-ends. I'd have to
guess, however, that any programmer who cares to read Mac text files
under Windows would immediately recognize what it does.

As far as extensibility goes, consider transferring a document from text
format into a word processor. For the sake of discussion, assume we
want to convert each paragraph into (in essence) one long line so the
word processor can change line breaks as needed for the margins being
used. If there are two or more new-lines in a row, one is retained to
mark the end of the paragraph, but all other new-lines are deleted.

Now, I don't understand your code well enough to say it can't do this
job. OTOH, if it takes more than about a minute to figure out how to
get your code to do it, doing so is more trouble than directly writing
the code to do so.
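
For reference, the paragraph rule described above might be written directly along these lines. This is a sketch in C++ on an in-memory string; the name join_paragraph_lines is hypothetical, the input is assumed to use LF-only line ends already, and a single line break is turned into a space instead of being deleted outright so that the words on either side do not run together (that last part is an assumption, not part of the rule as stated).

```cpp
#include <string>

// Hypothetical helper: a run of two or more LFs becomes one LF (paragraph
// break); a single LF becomes a space.
std::string join_paragraph_lines(const std::string& text)
{
    std::string result;
    result.reserve(text.size());
    std::string::size_type i = 0;
    while (i < text.size()) {
        if (text[i] != '\n') {
            result += text[i++];
            continue;
        }
        // Measure the run of consecutive newlines starting here.
        std::string::size_type run = 0;
        while (i < text.size() && text[i] == '\n') {
            ++run;
            ++i;
        }
        if (run >= 2)
            result += '\n';  // paragraph break: keep exactly one newline
        else
            result += ' ';   // assumption: a lone line break becomes a space
    }
    return result;
}
```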
 
J

Jonathan Turkanis

Hi Jonathan,
Hi,

To me, your code

Which code?
looks far more impenetrable than self-explanatory.
If I had to guess at what it did, I'd have guessed at it treating
UNIX-style new-lines as the default. It associates Windows-style
new-lines with the input, and nothing in particular with the output, so
from reading the code, I'd have guessed this was supposed to do
Windows->UNIX conversion.

I assume you mean this:
using namespace std;
using namespace boost::io;

ifstream in("src", ios::binary | ios::in);
ofstream out("dest", ios::binary | ios::out);

filtering_istream fin;
fin.push(newline_filter(newline::windows));
fin.push(in);
boost::io::copy(fin, out);
I've reread your code a number of times, and I'm _utterly_ lost as to
what you think would give the reader even the faintest hint that it
does anything related to MacOS at all.

Well,

- filtering_istream is an input stream which can have zero or more
filters added to it.
- newline_filter is a filter which, quoting from the documentation I cited,
"converts between the text file formats used by various operating systems. Its
sole constructor takes an integral flag parameter used to specify the source and
target formats "
- newline::windows is an integral flag, which is "Useful for converting data to
the Windows format"

Maybe you would be happier if the constant newline::windows were renamed
newline::convert_to_windows_format or newline::convert_to_crlf? Perhaps that's
not a bad idea, but I like shorter names.
By contrast, looking at:

FILE *fin = fopen("src", "rb");
FILE *fout = fopen("dest", "wb");
int ch;

while ( EOF!=(ch=getc(fin))) {
putc(ch, fout);
if (ch == '\r')
putc('\n', fout);
}

The actual conversion seems (to me) almost impossible to miss. The only
part open to any question at all is whether the reader realizes what
OSes are associated with CR-only vs. CR/LF line-ends. I'd have to
guess, however, that any programmer who cares to read Mac text files
under Windows would immediately recognize what it does.

First, your code is hard-wired to use FILEs. My library works for any types
modeling the concepts Source and Sink.

http://home.comcast.net/~jturkanis/iostreams/libs/io/doc/?path=5.1

Second, converting line endings is common enough that it's worth encapsulating.
As far as extensibility goes, consider transferring a document from text
format into a word processor. For the sake of discussion, assume we
want to convert each paragraph into (in essence) one long line so the
word processor can change line breaks as needed for the margins being
used. If there are two or more new-lines in a row, one is retained to
mark the end of the paragraph, but all other new-lines are deleted.

Right, newline filter doesn't do this, and doesn't pretend to.
Now, I don't understand your code well enough to say it can't do this
job. OTOH, if it takes more than about a minute to figure out how to
get your code to do it, doing so is more trouble than directly writing
the code to do so.

Basically, you can take whatever code you would write in your "while (
EOF!=(ch=getc(fin)))" loop, replace getc, putc, etc. with their generic
counterparts boost::io::get, boost::io::put, etc., stick the code into the body
of a filter member function, and Voila! --you have a reusable component.

E.g., for converting from mac to windows, based on your code:

using namespace boost::io;

struct convert_to_windows : boost::io::output_filter {
template<typename Sink>
void put(Sink& sink, char c)
{
boost::io::put(sink, c);
if (c == '\r')
boost::io::put(sink, '\n');
}
};

For the word-processor example, assuming all newlines have been converted to
'\n' using the (chainable) newline_filter, we can write:

struct remove_excess_newlines : boost::io::output_filter {
remove_excess_newlines() : last_char_was_newline_(false) { }

template<typename Sink>
void put(Sink& sink, char c)
{
bool current_char_is_newline = c == '\n';
if (!current_char_is_newline || !last_char_was_newline_)
boost::io::put(sink, c);
last_char_was_newline_ = current_char_is_newline;
}

bool last_char_was_newline_;
};

This is untested, but I'm sure something equally trivial will do.
Later,
Jerry.

Jonathan
 
J

jcoffin

Making a reusable component is only worthwhile when/if the result is
easier to read and/or write than working without the component. So far,
it appears to me these components do precisely the opposite.

Just for example, you still haven't explained what part of the example
code you posted was supposed to give the reader even the slightest clue
that MacOS format was intended for any file involved. You haven't
addressed the issue that in nearly every case using your components
makes the code longer. You haven't addressed the fact that your
components change the basic nature of understanding the code: instead
of applying rules (that are already, of necessity, well known to all C
and C++ programmers) it now consists of memorizing a (rather strange)
interface, a large number of constants, etc. Without the component, the
primary requirement for doing the job is one of logic: deciding what
you're really trying to do. IOW, the component changes it from logic to
memorization. There are certainly people who are good at memorization
and poor at logic -- but they're NOT programmers.

When/if such memorization really reduces complexity elsewhere quite a
bit, it can be worthwhile, but that's not the case here -- quite the
contrary, even if we ignore the memorization issues, the examples
you've posted seem to add memorization AND extra complexity.

That leaves only one possible issue, the point of reading/writing
generic sources and sinks instead of assuming FILE *'s. So far, I've
rarely (if ever) seen a reason to worry about that for this particular
kind of job, but if it is useful to allow for more generic input and
output, I think using iterators is at least as good a way to handle it.
I haven't bothered posting code like this simply because I think it's
pointless, but for anybody who understands iterators, the change should
be fairly trivial.
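
For anyone curious what the iterator version might look like, here is one possible sketch of the Mac-to-Windows loop written against an iterator pair and an output iterator. The function name cr_to_crlf is made up for this example; it is not code from either poster.

```cpp
#include <iterator>  // for the iterator adaptors mentioned in the note below

// CR -> CRLF, same logic as the getc/putc loop, but reading from any
// input iterator range and writing to any output iterator.
template <typename InputIt, typename OutputIt>
OutputIt cr_to_crlf(InputIt first, InputIt last, OutputIt out)
{
    for (; first != last; ++first) {
        *out++ = *first;
        if (*first == '\r')
            *out++ = '\n';
    }
    return out;
}
```

The same function works on a std::string with std::back_inserter, or on files opened in binary mode with std::istreambuf_iterator<char> and std::ostreambuf_iterator<char>.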

I'd also be interested in hearing what your classes accomplish that
couldn't be done in the existing iostreams framework using codecvt
facets.
 
J

Jonathan Turkanis

> Making a reusable component is only worthwhile when/if the result is
> easier to read and/or write than working without the component.

Agreed.

> So far,
> it appears to me these components do precisely the opposite.

It's hard for me to believe we're talking about the same code. Just to be clear,
here it is:

ifstream in("src", ios::binary | ios::in);
ofstream out("dest", ios::binary | ios::out);

filtering_istream fin;
fin.push(newline_filter(newline::windows));
fin.push(in);
boost::io::copy(fin, out);

The pattern is simple. Create a filtering stream, add any number of filters, and
pump data through it using boost::io::copy.

The exact same pattern could be used to compress, to encrypt, to perform regex
substitutions or an infinite number of other filtering operations. Only the
filters vary. All that someone reading the code needs to know is what each
filter does.

In this case, I've tried to pick a helpful name. newline_filter performs
line-ending conversions. As for the integral constants, I would have thought it
would be easy enough to remember that "windows" is for converting to windows
format, "mac" for converting to Mac format, and "unix" for converting to unix
format. For more advanced uses, consult the documentation.

> Just for example, you still haven't explained what part of the example
> code you posted was supposed to give the reader even the slightest clue
> that MacOS format was intended for any file involved.

It's not.

> You haven't
> addressed the issue that in nearly every case using your components
> makes the code longer.

Slightly. If this becomes an issue, you can hand-code it. Remember the saying
about premature optimization.

> You haven't addressed the fact that your
> components change the basic nature of understanding the code: instead
> of applying rules (that are already, of necessity, well known to all C
> and C++ programmers) it now consists of memorizing a (rather strange)
> interface, a large number of constants, etc. Without the component, the
> primary requirement for doing the job is one of logic: deciding what
> you're really trying to do.

The code I posted is at a much higher level of abstraction. Instead of having to
read through low-level looping constructs, the reader sees immediately that a
filtering operation is being performed, and only needs to note the type and
order of filters.

> IOW, the component changes it from logic to
> memorization. There are certainly people who are good at memorization
> and poor at logic -- but they're NOT programmers.

To me, remembering that a newline_filter performs line-ending conversions is
not a spectacular feat of memorization. YMMV.

> That leaves only one possible issue, the point of reading/writing
> generic sources and sinks instead of assuming FILE *'s. So far, I've
> rarely (if ever) seen a reason to worry about that for this particular
> kind of job, but if it is useful to allow for more generic input and
> output, I think using iterators is at least as good a way to handle it.
> I haven't bothered posting code like this simply because I think it's
> pointless, but for anybody who understands iterators, the change should
> be fairly trivial.
>
> I'd also be interested in hearing what your classes accomplish that
> couldn't be done in the existing iostreams framework using codecvt
> facets.

Regarding both the iterators and the codecvts, see

http://lists.boost.org/MailArchives/boost/msg70761.php

Basically, using iterators as filters gives up some important optimizations that
are possible when an entire buffer of characters is present at one time, forces
filters to manage the lifetime of downstream filters and makes construction of
filter chains more difficult.
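
As a rough illustration of the buffer-at-a-time point (a sketch only, not taken from the library; `strip_cr_block` is a made-up name), a filter that sees a whole block can search for line endings and copy the text between them in bulk instead of testing one character per iterator step:

```cpp
#include <string>

// Hypothetical block-oriented helper: replaces each CRLF pair in buf with LF.
// Runs of ordinary text between line endings are appended in one call each.
std::string strip_cr_block(const std::string& buf)
{
    std::string out;
    out.reserve(buf.size());                    // output is never longer than input
    std::string::size_type pos = 0;
    while (pos < buf.size()) {
        const std::string::size_type crlf = buf.find("\r\n", pos);
        if (crlf == std::string::npos) {
            out.append(buf, pos, std::string::npos);   // no more CRLF: copy the rest
            break;
        }
        out.append(buf, pos, crlf - pos);       // bulk copy up to the line ending
        out += '\n';
        pos = crlf + 2;                         // skip both CR and LF
    }
    return out;
}
```

A real filter would also have to carry state across calls for a CRLF pair that straddles two buffers; this sketch assumes the whole input is in one block.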

As for codecvts, they are notoriously tricky to write correctly, whereas writing
the type of filters I demonstrated above is a snap. Furthermore, there is
currently no component in the standard library, other than the file streams,
which actually *use* codecvts to perform conversions. (P.J. Plauger has proposed
a remedy: http://tinyurl.com/5hal5) This leaves users with the non-trivial task
of pumping data through a codecvt -- a task which some standard library
implementations get wrong, BTW.

If you really like the codecvt style of programming (you might be the first),
you can use the generic version -- a Symmetric Filter: http://tinyurl.com/3lz82.

Jonathan
 
