Nasty code...but please critique it anyway :-)

J. Campbell

Martijn Lievaart said:
Jerry Coffin said:
[ ... ]

I must be pretty dense...I had the hardest time understanding what I
was doing wrong. I think I get it now. Will you take a look at the
(revised) code from my first link in this thread and tell me if my use
of headers is now more or less normal?

If I still had the link handy I would, but I don't...

http://home.bellsouth.net/p/PWP-brightwave

Sorry I didn't include the link before...I didn't want people to think
that I was enthusiastically pimping my novice code, and the link was
just up the thread, at least on my newsreader ;-) Anyway, thanks for
responding.

unencrypt.h:15: `Filespec' was not declared in this scope

(snip a lot more errors due to missing headers)

make: *** [J_Crypt.o] Error 1

Compilation exited abnormally with code 2 at Fri Nov 7 10:18:11

It really is better to watch your case. :)

Martijn, thanks for the reply.

Sorry about the case errors with compilation...It's a hard error to
catch when there is no (local) penalty for making it (on my DOS
machine).
A quick look at some randomly chosen files.

- hashSclass.h
You only need <string>, move all the other #includes to the C++ files.

Is this a style issue? I moved all #includes to the header because I
read in a different thread (while trying to understand appropriate use
of headers), "put everything that will be needed for the class in its
header", which makes some sense. Is there a reason to put the
#includes in the *.cpp file? I'm not arguing a point, just trying to
learn ;-)
- J_Crypt.h
A file that includes all other headers. A style issue, but I don't like
it. Also, including <iostream> may lead to code bloat and longer
compilation times. Include it only where needed.

Any pointers to where I can learn about using <iosfwd> instead of
<iostream>?

Can you demonstrate how to use <iosfwd> rather than <iostream> to
avoid code bloat in the following, trivial example?

#include <iostream>
using namespace std;
int main(){
cout << "Thanks Martijn!" << endl;
return 0;
}

- Encrypt.cpp
Looks good. Good comments, very readable. Only thing is that you do user
interaction here. Better to decouple this. A way to do that would be to
pass a std::ostream& or a function pointer to a function that can write
strings. The latter technique makes sure you can use the same encrypt
routine in a GUI program as well.

Thanks for the advice...I'll need to ponder this a bit before it is
clear...I suspect it may not be until I start writing code for GUI
use. However, it does make sense. If I understand, you are saying
that I should remove *all* commands that output information to the
screen from this function, because it makes the function less
reusable?
- menu.cpp
* Get text by line and parse that, good!
* You may want to use a std::stringstream to parse the integer, else just
use atoi.
* Why not print the menutext the first time from menu1 as well?
Inconsistent.
* You may want to flush the output stream after printing menutext, in
this function you cannot be sure it is terminated with a newline.
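
The menu.cpp suggestions above (read a whole line, parse it with a string stream, flush after the prompt) could be sketched roughly as follows. The names parse_choice and read_choice are hypothetical and are not taken from the posted code.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: parse a menu choice from one line of input.
// Returns true and stores the value in `choice` only if the line holds
// exactly one integer (surrounding whitespace is allowed).
bool parse_choice(const std::string& line, int& choice) {
    std::istringstream in(line);
    int value = 0;
    if (!(in >> value)) return false;   // no leading integer
    in >> std::ws;                      // skip trailing whitespace
    if (!in.eof()) return false;        // trailing junk such as "3x"
    choice = value;
    return true;
}

// Hypothetical prompt routine: print the menu text, flush, read one line.
bool read_choice(std::istream& in, std::ostream& out,
                 const std::string& menutext, int& choice) {
    out << menutext << std::flush;      // flush: the text may lack a newline
    std::string line;
    if (!std::getline(in, line)) return false;
    return parse_choice(line, choice);
}
```

Unlike atoi, this approach lets the caller tell "0" apart from input that is not a number at all.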

Thanks for the advice

BTW, What's HTH?? "Hate to harp"?
 
J. Campbell

Jerry Coffin said:
Well, I've taken a look at the code, and I still think it could use some
work. One obvious point would be that there are quite a few places that
use std::cout (for one example) directly (e.g. HashS::showhex and
Slicer::SeedMe).

IMO, these should take a stream as a parameter, and use that stream,
rather than always using std::cout and std::cin.

Thanks for the time. Sorry to keep dragging this out. I agree that
outputting to stream would offer more flexibility. I made it as is
because if the string of the hash (a "gettable value") is printed it
can contain non-printable characters...and I wanted the class to have
an in-built way to *display* itself to the user. So...you'd advise
outputting the hex-formatted output to a stringstream then pass that
back from the function as a string? Makes sense but makes it slightly
harder to use, since the user would then have 2 different ways of
getting the hash...the displayable but not usable form, and the usable
but not displayable form. I was trying to avoid that by making a
"show" and a "get" function, rather than 2 different "get" functions.
Anyway, I realize that I'm novice enough that I'm quite likely trying
to justify a poor design decision. Any further thoughts on this?
As-is, your HashS class looks to me like it's a function in hiding (so
to speak).

good point. I think I'll change it, since there is no reason to keep
the object once it has returned its value, and the:
class instance(data);
instance.hash();
motif that needs to be used with the current setup is awkward. a
simple:
string hashfunction(data);
function would be less awkward to use.
File_spec looks a bit similar -- you basically just
construct a File_spec from a string, and then use (public, no less) data
members of the object. I see little advantage to this design.

What's the better way? I was basically trying to avoid parsing the
filespec multiple times, or passing so many strings (eg path + root +
ext) into functions...seemed easier (and less error-prone) to pass the
class.
It's also somewhat confusing that encrypt.cpp contains only a driver,
and the encryption proper is done in slicer_class -- the latter name, in
particular, gives not even a hint as to the real use or point of the
class.

Well...I was writing PRNG's with names that were descriptive (to me
anyway) of how they worked. Slicerclass happened to be a pretty good
one at generating solid random data that passes all tests for
randomness that I'm aware of. I decided to keep the PRNG as a
stand-alone class that spits out random data, rather than to write it
as an encrypter that accepts files and spits out encrypted files.
(other than having a more useful name like My_PRNG) do you have a
suggestion about how to use the PRNG? Would you write a "super-class"
that includes all the functionality of a PRNG, and all the
functionality of the hasher, and all the functionality of the
encrypter, and the associated helper functions? I thought that I had
essentially done this with my "driver" function, while keeping my PRNG
class (slicer_class) usable as a general purpose PRNG.
Though I've been _trying_ to stick to commenting on how the code is
written, I can't help pointing out that your design for the file format
makes a ciphertext-only attack _dramatically_ easier than it should be.

Hey...I'll take free criticism where I can get it...I appreciate it
;-)
To get any hope of security, you want to make it as difficult as
possible to even guess at whether a decryption was successful or not --
if at all possible, you'd like almost any key to produce something
that's reasonable, so it's as difficult as possible for the attacker to
decide whether a given key is correct. Your design does the opposite:
it tells him immediately whether his guess at a key was correct.

Well, since I built authentication into the system, the user *needs*
to know if he correctly unencrypted the file. While I agree that
telling him that he got it correct would make a dictionary attack
easier (if I understand the attack that you imply), I thought it
wasn't really an issue since I wrote the prog so that such an attack
would be costly at circa 1+ second / try. At this rate (using a single
modern computer) it would take something like 150 million years to
test all 8-character passwords (92 characters (u-case, l-case, nums &
punctuation)), and over a trillion yrs if a 10 char pass is used...
(in case you are reading and say, "bull...all 8 char combinations can
be generated in *much* less time, I need to tell you that the program
uses an initialization vector whereby each PRNG is stepped forward as
fast as the computer can muster for at least 0.5 seconds before it is
used. And 2 independent PRNGs must be initialized to unencrypt.)
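
The estimate above can be checked with a few lines of arithmetic. This is only a sketch of the calculation as stated (92 possible characters, one try per second, every password tried); the function name is hypothetical, and it says nothing about whether an attacker is actually bound to one try per second.

```cpp
#include <cmath>

// Seconds in a Julian year (365.25 days).
const double kSecondsPerYear = 365.25 * 24.0 * 3600.0;

// Years needed to try every password of `length` characters drawn from
// an alphabet of `alphabet` symbols, at `seconds_per_try` per attempt.
double exhaustive_search_years(double alphabet, int length,
                               double seconds_per_try) {
    return std::pow(alphabet, length) * seconds_per_try / kSecondsPerYear;
}
```

With these inputs, exhaustive_search_years(92, 8, 1.0) comes to roughly 1.6e8 years and exhaustive_search_years(92, 10, 1.0) to roughly 1.4e12 years, which agrees with the orders of magnitude quoted.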

Thanks, SO MUCH, for the input. Last week when I posted the code, my
use of header files was whacked...I was including *.cpp files into the
main-containing file and compiling the whole thing into a single
object file...You helped kick my ass to at least get me stumbling down
a smoother road.

Joe
 
lilburne

J. Campbell said:
Is this a style issue? I moved all #includes to the header because I
read in a different thread (while trying to understand appropriate use
of headers), "put everything that will be needed for the class in its
header", which makes some sense. Is there a reason to put the
#includes in the *.cpp file? I'm not arguing a point, just trying to
learn ;-)

The recommendation is to #include all the types that are
required by a header but no more.

Take the following header files:

fred.h
======

class Fred {
};


wilma.h
=======

#include "fred.h"

class Wilma {
Fred partner;
};


pebbles.h
=========

#include "wilma.h"

class Pebbles {
Fred father;
Wilma mother;
};

now if ever Wilma decides that Fred is superfluous to
requirements and removes the include of fred.h then pebbles
will no longer know about her father. So pebbles.h should
include fred.h not rely on the indirect inclusion from
wilma.h That is what is meant by "put everything that will
be needed for the class in its header".

Having made the changes so that you now have:

wilma.h
=======

class Wilma {
};


pebbles.h
=========

#include "wilma.h"
#include "fred.h"

class Pebbles {
Fred father;
Wilma mother;
};

milkman.h
=========

#include "wilma.h"

class Milkman {
Wilma customer;
};

if wilma.h also included pebbles.h because it was used in
wilma.cpp then the Milkman would become dependent on both
pebbles and fred too.
 
Martijn Lievaart

unencrypt.h:15: `Filespec' was not declared in this scope

(snip a lot more errors due to missing headers)

make: *** [J_Crypt.o] Error 1

Compilation exited abnormally with code 2 at Fri Nov 7 10:18:11

It really is better to watch your case. :)

Sorry about the case errors with compilation...It's a hard error to
catch when there is no (local) penalty for making it (on my DOS
machine).

Yes I know. I found out the hard way when porting 300KLOC. That was not
fun, but a good opportunity to better my scripting skills.
Is this a style issue? I moved all #includes to the header because I
read in a different thread (while trying to understand appropriate use
of headers), "put everything that will be needed for the class in its
header", which makes some sense. Is there a reason to put the
#includes in the *.cpp file? I'm not arguing a point, just trying to
learn ;-)

Decoupling. It makes maintenance easier in the end. Include in the header
only what you need to compile the header. Put in the cpp file what you
need to compile that cpp file. It also lessens compile times (except when
you have a broken precompiled header implementation).
Any pointers to where I can learn about using <iosfwd> instead of
<iostream>?

Ahhh, Google? Scott Meyers "Effective STL" and Josuttis "The C++ Standard
Library" come to mind. Those are books every C++ programmer should have
anyhow.
Can you demonstrate how to use <iosfwd> rather than <iostream> to avoid
code bloat in the following, trivial example?

#include <iostream>
using namespace std;
int main(){
cout << "Thanks Martijn!" << endl;
return 0;
}

No, you need <iostream> to compile this, you would use <iosfwd> in headers
where you only need the forward declarations to classes like ostream. E.g:

-- file.h

#include <iosfwd>

class X {
public:
void print(std::ostream&);
};
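
To round out the example above, here is a sketch of how the header and the source files might divide the includes. The three "files" are shown in one listing for brevity; the member value_ and the helper to_text are hypothetical additions for illustration.

```cpp
// --- file.h (sketch) ---
#include <iosfwd>      // forward declarations only: enough for std::ostream&

class X {
public:
    explicit X(int value) : value_(value) {}
    void print(std::ostream& os) const;   // declaration needs no <ostream>
private:
    int value_;                           // hypothetical member
};

// --- file.cpp (sketch) ---
#include <ostream>     // the definition uses operator<<, so the full type is needed

void X::print(std::ostream& os) const {
    os << "X(" << value_ << ")";
}

// --- caller.cpp (sketch) ---
#include <sstream>
#include <string>

std::string to_text(const X& x) {
    std::ostringstream out;   // any std::ostream works, not just std::cout
    x.print(out);
    return out.str();
}
```

Only the files that actually write to or create a stream pay for the heavier headers; code that merely includes file.h sees forward declarations.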
Thanks for the advice...I'll need to ponder this a bit before it is
clear...I suspect it may not be until I start writing code for GUI use.
However, it does make sense. If I understand, you are saying that I
should remove *all* commands that output information to the screen from
this function, because it makes the function less reusable?
Exactly.
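
One way to picture the function-pointer approach is sketched below. The callback type, the function names, and the single-byte XOR standing in for the real encryption are all hypothetical; the point is only that the routine reports through whatever sink the caller supplies instead of writing to the screen itself.

```cpp
#include <string>
#include <vector>

// Hypothetical callback type: anything that can accept a status message.
typedef void (*MessageSink)(const std::string& message, void* context);

// Placeholder "encryption" (XOR with one byte), standing in for the real
// routine. It reports progress through `sink` and never touches the console.
std::string encrypt_with_reporting(const std::string& data, unsigned char key,
                                   MessageSink sink, void* context) {
    if (sink) sink("encrypting", context);
    std::string out = data;
    for (std::string::size_type i = 0; i < out.size(); ++i)
        out[i] = static_cast<char>(static_cast<unsigned char>(out[i]) ^ key);
    if (sink) sink("done", context);
    return out;
}

// One possible sink: collect messages so a GUI or a test can show them later.
void collect_message(const std::string& message, void* context) {
    static_cast<std::vector<std::string>*>(context)->push_back(message);
}
```

A console program would pass a sink that writes to std::cout, while a GUI program would pass one that updates a status bar; the encryption routine stays the same.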

BTW, What's HTH?? "Hate to harp"?

www.acronymfinder.com is your friend.

HTH,
M4
 
Jerry Coffin

[ ... ]
Thanks for the time. Sorry to keep dragging this out. I agree that
outputting to stream would offer more flexibility. I made it as is
because if the string of the hash (a "gettable value") is printed it
can contain non-printable characters...and I wanted the class to have
an in-built way to *display* itself to the user.

Separating "display" from "store" is perfectly fine, and irrelevant to
what I'm talking about here.
So...you'd advise
outputting the hex-formatted output to a stringstream then pass that
back from the function as a string?

I suppose you could do that, but it's not what I had in mind at all. I
was thinking of something a bit simpler. Right now, showhex looks
roughly like this:

void HashS::showhex() {

std::cout << various_stuff;
}

I was talking about changing that to something like this:

void HashS::showhex(std::ostream& os) {

os << various_stuff;
}
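
A fuller sketch of the stream-parameter version follows. It is written as a free function standing in for HashS::showhex, and it assumes the hash is held as a string of raw bytes, as the thread describes; hex_string is a hypothetical convenience wrapper.

```cpp
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>

// Stand-in for HashS::showhex: writes each byte of `hash` as two hex digits
// to whatever stream the caller supplies.
void showhex(std::ostream& os, const std::string& hash) {
    std::ios_base::fmtflags saved_flags = os.flags();
    char saved_fill = os.fill('0');
    os << std::hex;
    for (std::string::size_type i = 0; i < hash.size(); ++i)
        os << std::setw(2)
           << static_cast<unsigned>(static_cast<unsigned char>(hash[i]));
    os.fill(saved_fill);       // restore the caller's formatting state
    os.flags(saved_flags);
}

// The same function can feed a string when the caller wants one.
std::string hex_string(const std::string& hash) {
    std::ostringstream out;
    showhex(out, hash);
    return out.str();
}
```

Calling showhex(std::cout, hash) reproduces the original behaviour, and the same code can write to a file or a string stream without modification. Note the reference parameter: streams cannot be copied, so they are passed by reference.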
Makes sense but makes it slightly
harder to use, since the user would then have 2 different ways of
getting the hash...the displayable but not usable form, and the usable
but not displayable form. I was trying to avoid that by making a
"show" and a "get" function, rather than 2 different "get" functions.
Anyway, I realize that I'm novice enough that I'm quite likely trying
to justify a poor design decision. Any further thoughts on this?

It's open to some question, but it's not the question I was asking. A
hash _is_ a numeric value, so it might also be reasonable to just have a
function to get the value (probably as some user-defined long-integer
type) and then separately have functions to deal with formatting that
type for display, storage, etc. That's a bit beyond what I had in mind
at the moment though.
good point. I think I'll change it, since there is no reason to keep
the object once it has returned its value, and the:
class instance(data);
instance.hash();
motif that needs to be used with the current setup is awkward. a
simple:
string hashfunction(data);
function would be less awkward to use.

I quite agree -- it looks (to me) like the only other time you use the
hash object after creation is the display function we discussed above.
It looks to me like you may have had something somewhat different in
mind to start with, but it ended up being simplified out of existence
before all was said and done.

One thing that might be worth considering is defining your own type to
hold the hash value -- it doesn't look to me like a string is the ideal
type for the job.
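
As a rough illustration of "a function plus a dedicated value type", consider the sketch below. HashValue and hash_of are hypothetical names, and the algorithm is 32-bit FNV-1a used purely as a placeholder; it is not the hash from the posted code, which is described as 256 bytes wide.

```cpp
#include <cstdint>
#include <string>

// Hypothetical value type for a hash: it can be compared and copied,
// while formatting for display is left to separate functions.
class HashValue {
public:
    explicit HashValue(std::uint32_t bits) : bits_(bits) {}
    std::uint32_t bits() const { return bits_; }
    bool operator==(const HashValue& other) const { return bits_ == other.bits_; }
    bool operator!=(const HashValue& other) const { return bits_ != other.bits_; }
private:
    std::uint32_t bits_;
};

// Placeholder algorithm (32-bit FNV-1a), NOT the hash from the posted code.
HashValue hash_of(const std::string& data) {
    std::uint32_t h = 2166136261u;            // FNV offset basis
    for (std::string::size_type i = 0; i < data.size(); ++i) {
        h ^= static_cast<unsigned char>(data[i]);
        h *= 16777619u;                       // FNV prime
    }
    return HashValue(h);
}
```

The caller writes HashValue h = hash_of(data); and no object has to be kept around merely to call a member function once.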
What's the better way? I was basically trying to avoid parsing the
filespec multiple times, or passing so many strings (eg path + root +
ext) into functions...seemed easier (and less error-prone) to pass the
class.

It depends -- as-is, you might about as well have a single function that
manipulates data in a normal struct. For a class to make much sense,
the class should encapsulate _something_. As-is, your class has almost
no intelligence, no encapsulation, etc.
Well...I was writing PRNG's with names that were descriptive (to me
anyway) of how they worked. Slicerclass happened to be a pretty good
one at generating solid random data that passes all tests for
randomness that I'm aware of. I decided to keep the PRNG as a
stand-alone class that spits out random data, rather than to write it
as an encrypter that accepts files and spits out encrypted files.
(other than having a more useful name like My_PRNG) do you have a
suggestion about how to use the PRNG? Would you write a "super-class"
that includes all the functionality of a PRNG, and all the
functionality of the hasher, and all the functionality of the
encrypter, and the associated helper functions?

No -- rather the contrary. I'd probably split things up quite a bit
more finely. Right now, your random number generator has one function
that's really related to generating a random number, which is good, but
also has things to give direct access to the key (almost certainly NOT
good), and to get an init vector and system size (apparently both of
which are somehow related to repeating a sequence, but only in some
rather ill-defined way).

I'd have a couple of classes. First would be a PRNG state. It would
incorporate enough to be able to store a state of the PRNG, and restore
it later -- rather than your get_system_size, get_init_vector, etc.,
being stored (and manipulated?) by the user in some unknown way, have a
PRNG state that can be created, stored, assigned, and used to create a
stream.

Separate from that, I'd have the encryptor that combines the random
stream with the data to produce the encrypted stream.
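
The split described above might look something like this sketch. ToyPrng is a deliberately trivial linear congruential generator used as a placeholder (it is not the slicer PRNG and offers no security), and xor_with_stream is a hypothetical name for the combining step.

```cpp
#include <cstdint>
#include <string>

// Placeholder generator (a small LCG), NOT the slicer PRNG and not secure.
// Its entire state is one value, so copying the object saves the state.
class ToyPrng {
public:
    explicit ToyPrng(std::uint32_t seed) : state_(seed) {}
    std::uint32_t next() {
        state_ = state_ * 1664525u + 1013904223u;
        return state_;
    }
private:
    std::uint32_t state_;
};

// The encryptor knows only that Prng has next(); any generator with that
// member can be plugged in. It works on its own copy of the generator.
template <typename Prng>
std::string xor_with_stream(const std::string& data, Prng prng) {
    std::string out = data;
    for (std::string::size_type i = 0; i < out.size(); ++i) {
        unsigned char k = static_cast<unsigned char>(prng.next() >> 24);
        out[i] = static_cast<char>(static_cast<unsigned char>(out[i]) ^ k);
    }
    return out;
}
```

Because the generator state is an ordinary copyable object, decrypting is just a matter of applying the same function again with a copy of the starting state.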
I thought that I had
essentially done this with my "driver" function, while keeping my PRNG
class (slicer_class) usable as a general purpose PRNG.

Perhaps I missed something, but right now the separation doesn't look
very clean, and your encryption seems to know quite a bit about the
internals of the PRNG, to the point that you probably could not (for one
example) plug in some totally different PRNG at a moment's notice.
Well, since I built authentication into the system, the user *needs*
to know if he correctly unencrypted the file. While I agree that
telling him that he got it correct would make a dictionary attack
easier (if I understand the attack that you imply), I thought it
wasn't really an issue since I wrote the prog so that such an attack
would be costly at circa 1+ second / try.

That would sort of apply in the case of a dictionary attack, but really
would not to a brute-force attack. A PRNG is basically an algorithmic
way of creating a large, but still finite, sequence of numbers. The
seed to the PRNG simply selects a point in that sequence at which to
start using numbers.

When you do your second of looping, you still end up selecting _some_
point in that sequence. If I skip over the second of looping, I pick
some point in the sequence as well. If I'm doing a brute-force attack,
my plan is to simply look at _every_ starting point in the sequence
until I find the one you used -- and the fact that I got there by a
somewhat different route than you did is utterly irrelevant.

IOW, a brute force attack has no reason whatsoever to use one second per
candidate key.

I haven't tried to figure out the details of a dictionary attack, but in
your bounce(), you seem to use only linear operations. Assuming that's
correct, any number of iterations can be collapsed down to a single,
constant-time operation, meaning your 1-second loop can be reduced to a
matter of a few nanoseconds.
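
To illustrate the collapsing argument, here is a sketch for the simplest case: a step of the form x -> a*x + c (mod 2^32). That bounce() has this form is an assumption made for the example only; the thread does not show its code. Composing affine maps gives another affine map, so n steps reduce to about log2(n) compositions.

```cpp
#include <cstdint>

// An affine step x -> a*x + c (mod 2^32).
struct Affine {
    std::uint32_t a;
    std::uint32_t c;
    std::uint32_t apply(std::uint32_t x) const { return a * x + c; }
};

// Composition: apply `first`, then `second`. The result is still affine.
Affine compose(const Affine& second, const Affine& first) {
    Affine r = { second.a * first.a, second.a * first.c + second.c };
    return r;
}

// The map equal to `step` applied n times, built by repeated squaring.
Affine power(Affine step, std::uint64_t n) {
    Affine result = { 1u, 0u };          // identity map
    while (n != 0) {
        if (n & 1u) result = compose(step, result);
        step = compose(step, step);
        n >>= 1;
    }
    return result;
}
```

Once power(step, n) has been computed, every candidate seed can be advanced n steps with a single multiply and add, so a fixed warm-up loop adds essentially nothing to the attacker's cost if the step really is linear.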

Anyhow, this is getting wildly off-topic, so I'll leave it alone for
now.
 
Jerry Coffin

[ ... ]
The recommendation is to #include all the types that are
required by a header but no more.

Anybody who says anything about "the recommendation" on this point
clearly hasn't a clue of what he's talking about.

There are at least two entirely separate theories on this: one says that
no header should include any other header -- if a header needs to use
types from other headers, you include those headers before you include
this one. This style was (and to a lesser extent remains) quite common
on, for one example, UNIX and similar OSes. Just for example, many of
the sys/* headers require that you include sys/types.h first. The same
is true in the Windows world, where most other system-oriented headers
require that you include windows.h first.

Another theory runs that a header should be independent, so you can
include it without explicitly including any other headers first; any
other headers needed for it to work correctly, it should include itself.

There is, to some extent, a middle ground between these as well: make a
header independent by including only forward declarations of other types
needed for that header to work. This is arguably cleaner than either
extreme, but almost certainly the most difficult to do well also -- in
particular, it more or less requires generating one complete header, and
another containing only the forward declarations. Theoretically, this
saves you from including a lot of unnecessary "stuff" (especially extra
names that could conflict) from the full-blown header. IMO, it's rarely
worth it: reasonable use of namespaces removes the conflict, so at best
you're talking about going to a fair amount of extra work to reduce
compile times. A header large enough for this to make a real difference
is a good clue that you have other problems. This is usually roughly
equivalent to giving a band-aid and an aspirin to somebody who's just
stepped on a land-mine.
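
For readers unfamiliar with the forward-declaration approach, a sketch using the Wilma and Milkman classes from earlier in the thread is given below. The file split (wilma_fwd.h in particular) and the member functions are hypothetical, and everything is shown in one listing for brevity.

```cpp
// --- wilma_fwd.h (sketch): forward declaration only ---
class Wilma;

// --- milkman.h (sketch): a pointer member needs only the declaration ---
class Milkman {
public:
    explicit Milkman(const Wilma* customer) : customer_(customer) {}
    int bottles_to_deliver() const;      // defined where Wilma is complete
private:
    const Wilma* customer_;
};

// --- wilma.h (sketch): the full definition ---
class Wilma {
public:
    explicit Wilma(int household_size) : household_size_(household_size) {}
    int household_size() const { return household_size_; }
private:
    int household_size_;
};

// --- milkman.cpp (sketch): includes wilma.h because it uses Wilma's members ---
int Milkman::bottles_to_deliver() const {
    return customer_->household_size();  // hypothetical rule: one bottle each
}
```

This only works while the header holds Wilma by pointer or reference; a by-value member such as "Wilma customer;" needs the complete type and therefore the full header.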
 
J. Campbell

Thanks Jerry for the intelligent commentary. The goal was to get my
feet wet with C++, and seeing that I'm now considering getting some
waders, it looks like success! No need to reply to this...you've
given me enough of your time on this project...I really appreciate it.
I'm going to make a couple comments, but agree that this is way off
topic to the c++ newsgroup.
If I'm doing a brute-force attack,
my plan is to simply look at _every_ starting point in the sequence
until I find the one you used -- and the fact that I got there by a
somewhat different route than you did is utterly irrelevant.

I agree. I guess my assumption (always dangerous) is that it would be
easier to try to brute the pass, rather than the hash of the pass,
considering that the hash is 256 bytes and it would be unlikely that a
pass would be that big.
I haven't tried to figure out the details of a dictionary attack, but in
your bounce(), you seem to use only linear operations. Assuming that's
correct, any number of iterations can be collapsed down to a single,
constant-time operation, meaning your 1-second loop can be reduced to a
matter of a few nanoseconds.

If my read of S. Wolfram's book is correct, I don't believe that this
system (the "Rule 30" 1-D binary cellular automaton) can be reduced. You
can certainly make a single equation that will produce the output
after n-iterations, but that equation will get longer and more complex
for each step that you want to bypass...and in the end is
computationally equivalent to letting the system develop according to
the rule.

It's not all that apparent from the code that my PRNG *is* equivalent
to this system, because for speed, the system has been turned on its
ear so that it can be updated in parallel with only 2 shift operations
required (the first and last word of the array holding the system)
rather than shifting every word at each iteration when implemented the
"natural" way. The result of doing this is that the bits coming out
of the system are not in the "natural" order, but, since this system
is non-reversible, and since it produces a good 'random-appearing'
sequence, it's the way I choose to do it.
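
For reference, the textbook Rule 30 update on a small ring of cells can be written in a bit-parallel way as sketched below. This is the "natural" layout on a single 32-bit word with wraparound, an assumption chosen for brevity; it is not the rearranged multi-word layout described above.

```cpp
#include <cstdint>

// Rotate helpers for a 32-cell ring stored one cell per bit.
inline std::uint32_t rotl1(std::uint32_t x) { return (x << 1) | (x >> 31); }
inline std::uint32_t rotr1(std::uint32_t x) { return (x >> 1) | (x << 31); }

// One Rule 30 update of all 32 cells at once:
//   new cell = left XOR (centre OR right)
// Here the left neighbour of bit i is bit i+1, and the ring wraps around.
std::uint32_t rule30_step(std::uint32_t cells) {
    std::uint32_t left  = rotr1(cells);   // bit i now holds old bit i+1
    std::uint32_t right = rotl1(cells);   // bit i now holds old bit i-1
    return left ^ (cells | right);
}
```

Starting from a single live cell, two steps give the familiar rows 111 and 11001. The OR in the rule is what keeps the update from being linear over XOR.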

Also, regarding the use of a plug-in PRNG...actually the PRNG has a
get_rand() function that returns a 4-byte random number...and the
Encrypter() function can trivially make use of this, so,
theoretically, another PRNG should be usable...although the details of
how to seed it would need to be coded...

The point of returning a pointer to the key was to give access to
blocks of random data, rather than just a word at a time. Since,
using default values, the system is 8kb, returning 1kb data between
update cycles, you can get this data all at once rather than using 256
calls to get_rand(). This doesn't really reveal any more of the
internal state than simply getting the numbers 1 at a time.

The reason for having the public functions get_system_size(),
get_init_vector(), etc is because this information is needed to
unencrypt the file...it'd be easy to fix these values to constants so
that a user would be unaware that they even exist...however, I thought
it made a better PRNG to have the system parameters specifiable. This
way, you can use the PRNG in "non-deterministic mode" (when
time_initialize is used) when you don't want the same sequence on
subsequent runs, or you can specify the initialization vector to get
the identical "random appearing" sequence for subsequent runs.

Anyway, I suspect that your interests are more in general
math/programming/science/anything stimulating, than in the details of
some lame home-brew crypto system. ;-) However, if you are interested
in poking more holes in, discussing, or in better understanding my
PRNG or crypto-system, I'll be a willing participant in the discourse.

Ciao,

Joe
jkc8289 at symbol bellsouth dot net

Anyhow, this is getting wildly off-topic, so I'll leave it alone for
now.

I don't blame you. Thanks again for the help.
 
