Java vs C++ speed (IO & Sorting)

zionztp

No, in that case we would be comparing the C++ library against the Java library.

I meant that we don't know how good the numbers produced by Java and C++
are. "More random" numbers usually take more time to generate, so we
would need to know that both C++ and Java are producing equally good
numbers before trying to compare speed.

I deleted the code I used to test, so I've just rewritten it using
std::string (I was using plain arrays before). This one runs 2x slower
but is still faster than the original code:

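#include <cstdlib>
#include <cstring>
#include <ctime>
#include <string>

const int len = 50000000;

// Appends len letters to s, four per rand() call (one per byte of the int).
void randomString(std::string &s)
{
    char cr[len]; // unused leftover from the plain-array version
    int n;
    unsigned char tn[4];
    srand((unsigned)std::time(0));
    for (int i = 0; i < len; i += 4) {
        n = rand();
        memcpy(tn, &n, 4); // split the int into its four bytes
        s += (char)(tn[0] % 26 + 97);
        s += (char)(tn[1] % 26 + 97);
        s += (char)(tn[2] % 26 + 97);
        s += (char)(tn[3] % 26 + 97);
    }
}

int main()
{
    std::string s;
    randomString(s);
}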

compiled with:
g++ code.cpp -O3

By the way, could you provide the full Java code so I can test both?
 
Ian Collins

n = rand();
memcpy(tn, &n, 4);
s+= (char) (tn[0] % 26 + 97);
s+= (char) (tn[1] % 26 + 97);
s+= (char) (tn[2] % 26 + 97);
s+= (char) (tn[3] % 26 + 97);

Take care, the standard only guarantees RAND_MAX shall be at least
32767. You could end up with rather a lot of 'a'.
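
To make the warning concrete, here is a small sketch that is not code from the thread. `lettersFromBytes` is a hypothetical helper that applies the same per-byte mapping as the loop above to one given value, using shifts instead of memcpy so the result does not depend on byte order. For any value that fits in 15 bits (the minimum RAND_MAX the standard guarantees), the two high bytes are zero, and 0 % 26 + 97 is 'a'.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper, not from the thread: maps the four bytes of n to
// lowercase letters the same way the posted loop does (byte % 26 + 97).
// Shifts are used instead of memcpy so the result is endian-independent.
std::string lettersFromBytes(std::uint32_t n)
{
    std::string out;
    for (int shift = 0; shift < 32; shift += 8) {
        const unsigned byte = (n >> shift) & 0xFFu;  // low byte first
        out += static_cast<char>(byte % 26 + 97);
    }
    return out;
}
```

Calling `lettersFromBytes(32767u)` gives "vxaa": the largest 15-bit value still leaves the last two letters as 'a', so with RAND_MAX equal to 32767 at least half of the generated letters would be 'a'.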
 
Razii

btw could you provide the full java code so i can test both?

use server VM and give it 256m
C:\>java -Xmx256m -server Find2


import java.util.*;

public class Find{

public static void main(String[] arg)
{
final int len = 50000000;

long start = System.currentTimeMillis();
String s = randomString(len);
long end = System.currentTimeMillis();
System.out.println("Time: " + (end - start) + " ms");
}

//returns a String of length l with lowercase letters from a to z
static String randomString(int len)
{
char[] cr = new char[len];
Random rd = new Random();
for (int i = 0; i < len; i++){
int num = rd.nextInt(26) + 97;
cr[i] = (char) num;
}

return new String(cr);

}

}
 
Razii

const int len = 50000000;
void randomString(std::string &s)
{
char cr[len];
int n;
unsigned char tn[4];
srand((unsigned)std::time(0));
for( int i = 0 ; i < len; i+=4 ){
n = rand();
memcpy(tn, &n, 4);
s+= (char) (tn[0] % 26 + 97);
s+= (char) (tn[1] % 26 + 97);
s+= (char) (tn[2] % 26 + 97);
s+= (char) (tn[3] % 26 + 97);
}
}

int main()
{
std::string s;
randomString(s);
}

Even this version is slightly slower than the Java version, and that is
without any changes to the version I posted.

C:\>cl /O2 /GL Find.cpp /link /ltcg

Time: 1906 ms (for your version)

I was getting 1700 ms for the Java version (no changes yet; I will
see if something can be done to make it faster).
 
kwikius

btw could you provide the full java code so i can test both?

use server VM and give it 256m
C:\>java -Xmx256m -server Find2

import java.util.*;

public class Find{

  public static void main(String[] arg)
  {
    final int len = 50000000;

     long start = System.currentTimeMillis();
         String s = randomString(len);
     long end = System.currentTimeMillis();  
         System.out.println("Time: " + (end - start) + " ms");
 }

 //returns a String of length l with lowercase letters from a to z
  static String randomString(int len)
  {
          char[] cr = new char[len];
          Random rd = new Random();
          for (int i = 0; i < len; i++){
             int num =  rd.nextInt(26) + 97;
              cr[i] = (char) num;
         }

     return new String(cr);

  }



}


What's this shit ?

regards
Andy Little
 
Razii

Why don't you address the points in the rest of my post?

Anyway, I must be bored, this should be quicker:

Yeah, this was quicker BUT IT FAILED THE QUALITY TEST..

The "a" in your string is always more than "z"

C:\>Find a
Number of a: 2333262

C:\>Find z
Number of z: 1794645

C:\>Find z
Number of z: 1794954

C:\>Find a
Number of a: 2336319

You FAILED the requirement. Try again.
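
The gap between the 'a' and 'z' counts above is about what modulo bias predicts for a scheme that takes three letters from each rand() call as n % 26, (n >> 5) % 26 and (n >> 10) % 26. The sketch below is an illustration, not code from the thread: `expectedCountShift5` is a hypothetical name, and it assumes rand() is uniform over 0..32767 (RAND_MAX == 32767, the value used by Microsoft's C runtime). It enumerates every possible value and counts how often each letter comes out.

```cpp
#include <cassert>
#include <cstddef>

// Expected number of times `letter` appears in a string of `len` letters
// built three at a time as n % 26, (n >> 5) % 26, (n >> 10) % 26,
// ASSUMING n is uniform over 0..32767 (i.e. RAND_MAX == 32767).
double expectedCountShift5(char letter, std::size_t len)
{
    assert(letter >= 'a' && letter <= 'z');
    const unsigned target = static_cast<unsigned>(letter - 'a');
    unsigned long hits = 0;
    for (unsigned n = 0; n < 32768u; ++n) {
        hits += (n % 26 == target);
        hits += ((n >> 5) % 26 == target);
        hits += ((n >> 10) % 26 == target);
    }
    // hits / (3 * 32768) is the probability that one letter equals `letter`.
    return static_cast<double>(len) * hits / (3.0 * 32768.0);
}
```

For 50 million letters this gives roughly 2,334,086 for 'a' and 1,796,468 for 'z'. The third letter is the main culprit: (n >> 10) has only 32 possible values, so the residues 0 to 5 ('a' to 'f') occur twice as often as the rest.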
 
Razii

const int len = 50000000;
void randomString(std::string &s)
{
char cr[len];
int n;
unsigned char tn[4];
srand((unsigned)std::time(0));
for( int i = 0 ; i < len; i+=4 ){
n = rand();
memcpy(tn, &n, 4);
s+= (char) (tn[0] % 26 + 97);
s+= (char) (tn[1] % 26 + 97);
s+= (char) (tn[2] % 26 + 97);
s+= (char) (tn[3] % 26 + 97);
}
}

int main()
{
std::string s;
randomString(s);
}


Not ONLY is your version still slower than Java, it also failed the
quality test. There are always more 'a's than 'z's in your version.

C:\>Find a
Number of a: 25977124
Time: 1875 ms

C:\>Find z
Number of z: 829753
Time: 1921 ms

C:\>Find a
Number of a: 25975915
Time: 1890 ms

C:\>Find z
Number of z: 830666
Time: 1890 ms

FAILURE!!

Compare that with java version

C:\>java -Xmx256m -server Find a
Number of a: 191701
Time: 203 ms

C:\>java -Xmx256m -server Find z
Number of z: 192765
Time: 203 ms

C:\>java -Xmx256m -server Find d
Number of d: 192420
Time: 218 ms

C:\>java -Xmx256m -server Find k
Number of k: 192557
Time: 203 ms

Your version failed the test..
 
Razii

Yeah, this was quicker BUT IT FAILED THE QUALITY TEST..

Your version failed the test. It's not random. There are more 'a's
than 'z's

C:\>Find a
Number of a: 2333262

C:\>Find z
Number of z: 1794645

C:\>Find z
Number of z: 1794954

C:\>Find a
Number of a: 2336319


C:\>java -Xmx256m -server Find d
Number of d: 192420
Time: 218 ms

C:\>java -Xmx256m -server Find k
Number of k: 192557
Time: 203 ms

C:\>java -Xmx256m -server Find a
Number of a: 1923888
Time: 1781 ms

C:\>java -Xmx256m -server Find z
Number of z: 1924369
Time: 1750 ms

C:\>java -Xmx256m -server Find a
Number of a: 1922051
Time: 1813 ms

C:\>java -Xmx256m -server Find z
Number of z: 1923912
Time: 1734 ms
 
zionz

const int len = 50000000;
void randomString(std::string &s)
{
char cr[len];
int n;
unsigned char tn[4];
srand((unsigned)std::time(0));
for( int i = 0 ; i < len; i+=4 ){
n = rand();
memcpy(tn, &n, 4);
s+= (char) (tn[0] % 26 + 97);
s+= (char) (tn[1] % 26 + 97);
s+= (char) (tn[2] % 26 + 97);
s+= (char) (tn[3] % 26 + 97);
}
}
int main()
{
std::string s;
randomString(s);
}

Not ONLY your version is still slower than Java, it failed the
quallity test. the 'a's in your version are always more than 'z'

C:\>Find a
Number of a: 25977124
Time: 1875 ms

C:\>Find z
Number of z: 829753
Time: 1921 ms

C:\>Find a
Number of a: 25975915
Time: 1890 ms

C:\>Find z
Number of z: 830666
Time: 1890 ms

FAILURE!!

Compare that with java version

C:\>java -Xmx256m -server Find a
Number of a: 191701
Time: 203 ms

C:\>java -Xmx256m -server Find z
Number of z: 192765
Time: 203 ms

C:\>java -Xmx256m -server Find d
Number of d: 192420
Time: 218 ms

C:\>java -Xmx256m -server Find k
Number of k: 192557
Time: 203 ms

Your version failed the test..

What??? We are talking about RANDOM numbers. Are you sure that's a
valid way of determining randomness? Maybe the problem is related to
the size of RAND_MAX, though. I'll take a look at it and see if I'm
getting the same results here.
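
Counting one or two letters is only a rough check. A slightly more systematic version of the same idea compares all 26 counts with the uniform expectation using Pearson's chi-square statistic. This is a sketch, not something posted in the thread; `letterChiSquare` is a hypothetical name, it uses C++11 syntax, and a good score only says the single-letter frequencies look plausible, not that the string is random in any stronger sense.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Pearson chi-square statistic of the letter counts in s against a uniform
// distribution over 'a'..'z'. Hypothetical helper, not from the thread.
double letterChiSquare(const std::string& s)
{
    assert(!s.empty());
    std::array<std::size_t, 26> counts{};  // zero-initialised
    for (char c : s) {
        assert(c >= 'a' && c <= 'z');
        ++counts[static_cast<std::size_t>(c - 'a')];
    }
    const double expected = static_cast<double>(s.size()) / 26.0;
    double chi2 = 0.0;
    for (std::size_t n : counts) {
        const double diff = static_cast<double>(n) - expected;
        chi2 += diff * diff / expected;
    }
    return chi2;
}
```

With 26 categories there are 25 degrees of freedom, so a uniform source should score around 25. A value far above roughly 44 (the 1% critical value for 25 degrees of freedom) suggests the letters are not evenly distributed.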

Anyway, I just tested with the code you posted, and there's a noticeable
difference here:

Java 1650ms avg
C++ 1093ms avg

The C++ time was measured with the 'time' command, which also counts
process startup, so the actual generation time should be a bit less.
 
Razii

C:\>java -Xmx256m -server Find a
Number of a: 191701
Time: 203 ms

I was using 5 million.. that's why time is 203 ms. With 50 million..

C:\>java -Xmx256m -server Find a
Number of a: 1923974
Time: 1750 ms

C:\>java -Xmx256m -server Find z
Number of z: 1924369
Time: 1750 ms


note how the number of a's and z's are distributed well. Your version
doesn't work.
 
zionz

I was using 5 million.. that's why time is 203 ms. With 50 million..

C:\>java -Xmx256m -server Find a
Number of a: 1923974
Time: 1750 ms

C:\>java -Xmx256m -server Find z
Number of z: 1924369
Time: 1750 ms

note how the number of a's and z's are distributed well. Your version
doesn't work.

I just tested it here with:

int main()
{
string s;
getRandomString(s);
int i,ac,zc;
ac=zc=0;
for(i=0; i<s.size(); i++){
if(s[i]=='a'){ac++;}
if(s[i]=='z'){zc++;}
}
cout << ac << endl << zc << endl;
}

output:
Results:
1952956
1709172

So it's not that bad here...
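
The two sets of counts in this exchange (about 26 million 'a's on one machine, about 1.95 million on the other) are consistent with the same byte-splitting code running against different RAND_MAX values. The sketch below is an illustration under stated assumptions, not code from the thread: `expectedCountPerByte` is a hypothetical name, rand() is assumed uniform, and the two bit widths tried are assumptions about the posters' platforms (15 bits for RAND_MAX == 32767 as in Microsoft's C runtime, 31 bits for RAND_MAX == 2147483647 as in glibc).

```cpp
#include <cassert>
#include <cstddef>

// Expected number of times `letter` appears among `len` letters when each
// rand() call contributes four letters, one per byte (byte % 26 + 97), and
// the value is ASSUMED uniform over 0 .. 2^randBits - 1.
double expectedCountPerByte(char letter, int randBits, std::size_t len)
{
    assert(letter >= 'a' && letter <= 'z');
    assert(randBits >= 1 && randBits <= 32);
    const unsigned target = static_cast<unsigned>(letter - 'a');
    double probSum = 0.0;  // sum of the per-byte probabilities
    for (int byteIndex = 0; byteIndex < 4; ++byteIndex) {
        // Number of random bits that land in this byte (0 to 8).
        int bits = randBits - 8 * byteIndex;
        if (bits < 0) bits = 0;
        if (bits > 8) bits = 8;
        const unsigned values = 1u << bits;  // byte is uniform on 0..values-1
        unsigned hits = 0;
        for (unsigned v = 0; v < values; ++v)
            hits += (v % 26 == target);
        probSum += static_cast<double>(hits) / values;
    }
    return static_cast<double>(len) / 4.0 * probSum;
}
```

For 50 million letters, 15 random bits give about 25,976,562 'a's and 830,078 'z's, while 31 random bits give about 1,953,125 'a's and 1,708,984 'z's. Both pairs are close to the outputs posted above, which suggests the code behaves identically and only the range of rand() differs. Even with 31 bits the letters are not uniform, because 256 is not a multiple of 26 and the top byte only has 7 random bits.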
 
Razii

What??? we are talking about RANDOM numbers

Your version is not random... it has more a's than z's. Also, time
doesn't matter here because your version is not working. I could make
the Java version like yours and it would be faster, but what would be
the point? Your version is not even working.

C:\>Find a
Number of a: 25976908
Time: 1906 ms

C:\>Find z
Number of z: 831122
Time: 1890 ms

C:\>Find a
Number of a: 25976872
Time: 1890 ms

C:\>Find z
Number of z: 828853
Time: 1906 ms

Compare that with mine

C:\>java -Xmx256m -server Find a
Number of a: 1924228
Time: 1765 ms

C:\>java -Xmx256m -server Find z
Number of z: 1923019
Time: 1766 ms

C:\>java -Xmx256m -server Find a
Number of a: 1922127
Time: 1750 ms

C:\>java -Xmx256m -server Find z
Number of z: 1921560
Time: 1954 ms

Also, note time .. your version is slower on my computer and doesn't
even work.
 
Razii

I just tested it here with:

Let me post the whole program so you can test both. Your version
doesn't work.


=== C++ ===
#include <iostream>
#include <cstdlib>
#include <cstring>
#include <ctime>
#include <string>

using namespace std;

const int len = 50000000;
void getRandomString( std::string& s, size_t len);
void randomString( std::string& s);

int main( int argc, char *argv[])
{
if (argc < 2) { cout << "Fool enter text to search"; exit(0);}

string s;

string toSearch = argv[1];

clock_t start=clock();
randomString (s);
//getRandomString(s, len); version by Ian
clock_t endt=clock();

string::size_type found = s.find (toSearch);
int count = 0;
while (found!=string::npos)
{
count++;
found=s.find(toSearch,found+1);
}

std::cout <<"Number of " << toSearch << ": " << count << "\n";
std::cout <<"Time: " <<
double(endt-start)/CLOCKS_PER_SEC * 1000 << " ms\n";
}

/*
//my version
void getRandomString( std::string& s, size_t len )
{
srand((unsigned)std::time(0));

std::string cr( len, '\0' );
for( int i = 0 ; i < len ; ++i )
{
int iNumber;
iNumber = rand() % 26 + 97;
cr[i] = (char) iNumber;
}

s.swap( cr );
}

*/

//version by zionz
void randomString(std::string &s)
{
char cr[len];
int n;
unsigned char tn[4];
srand((unsigned)std::time(0));
for( int i = 0 ; i < len; i+=4 ){
n = rand();
memcpy(tn, &n, 4);
s+= (char) (tn[0] % 26 + 97);
s+= (char) (tn[1] % 26 + 97);
s+= (char) (tn[2] % 26 + 97);
s+= (char) (tn[3] % 26 + 97);
}
}


//version by Ian Collins
void getRandomString( std::string& s, size_t len )
{
srand(std::time(0));

std::string cr( len, '\0' );

// The following code is based on the requirement that RAND_MAX
// shall be at least 32767.
//
const size_t byThree = (len/3)*3;

size_t i(0);

while( i < byThree )
{
const int iNumber( rand() );

cr[i++] = iNumber % 26 + 97;
cr[i++] = (iNumber>>5) % 26 + 97;
cr[i++] = (iNumber>>10) % 26 + 97;
}

while( i < len )
{
const int iNumber( rand() );

cr[i++] = iNumber % 26 + 97;
}

s.swap( cr );
}

=== Java ===

import java.util.*;

public class Find{

public static void main(String[] arg)
{
if (arg == null)
{
System.out.println("Fool! Enter text on command line to search");
System.exit(-1);
}

final int len = 50000000;
String toSearch = arg[0];

//create a string with 50 million chars from a-z

long start = System.currentTimeMillis();
String s = randomString(len);
long end = System.currentTimeMillis();

int count = 0;
int index = s.indexOf(toSearch);
while (index != -1)
{
index++;
count++;
index = s.indexOf(toSearch, index);
}


System.out.println("Number of " + toSearch + ": " + count);
System.out.println("Time: " + (end - start) + " ms");
}

//returns a String of length l with lowercase letters from a to z
static String randomString(int len)
{
char[] cr = new char[len];
Random rd = new Random();
for (int i = 0; i < len; i++){
int num = rd.nextInt(26) + 97;
cr[i] = (char) num;
}

return new String(cr);

}

}
 
Tim Smith

I always post the complete thing that people can cut and paste. What's
with posting snippets?

Anyway, his example didn't compile so until he does it, c++ was
slower.

Your code doesn't always compile, either.
 
Razii

if (arg == null)
{
System.out.println("Fool! Enter text on command line to search");
System.exit(-1);
}

Oops, this should be

if (arg.length == 0)
{
System.out.println("Fool! Enter text on command line to search");
System.exit(-1);
}
 
Razii

So its not that bad here...

Want to see more proof that your version is a failure?

WATCH THIS

C:\>Find zion
Number of zion: 0
Time: 1890 ms

C:\>Find zion
Number of zion: 0
Time: 1890 ms

C:\>Find zion
Number of zion: 0
Time: 2062 ms

that's because you have so few z's:

Now watch this:

C:\>java -Xmx256m -server Find zion
Number of zion: 127
Time: 1812 ms

C:\>java -Xmx256m -server Find zion
Number of zion: 121
Time: 1750 ms

C:\>java -Xmx256m -server Find zion
Number of zion: 106
Time: 1797 ms


EVEN YOUR OWN PROGRAM DOESN'T RECOGNIZE YOU

:)
 
zionz

want to see another proof your version is failure?

WATCH THIS

C:\>Find zion
Number of zion: 0
Time: 1890 ms

C:\>Find zion
Number of zion: 0
Time: 1890 ms

C:\>Find zion
Number of zion: 0
Time: 2062 ms

that's because you have so few z's:

Now watch this:

C:\>java -Xmx256m -server Find zion
Number of zion: 127
Time: 1812 ms

C:\>java -Xmx256m -server Find zion
Number of zion: 121
Time: 1750 ms

C:\>java -Xmx256m -server Find zion
Number of zion: 106
Time: 1797 ms

EVEN YOUR OWN PROGRAM DOESN'T RECOGNIZE YOU

:)

Could you verify this one:

void getRandomString(std::string &s)
{
int n,i;
srand((unsigned)std::time(0));
s.reserve(len);
for(i = 0 ; i < len; i+=4){
n = rand();
s+= (char) (n % 26 + 97);
s+= (char) ( (n>>8) % 26 + 97);
s+= (char) ( (n>>16) % 26 + 97);
s+= (char) ( (n>>24) % 26 + 97);
}
}

By the way, it seems the C++ rand() is slower than the Java equivalent,
and in this test there's no real way of optimizing things since
everything depends on that function.
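
Since everything hinges on rand(), one option the thread does not try is to replace it. The sketch below uses the C++11 `<random>` header, which is an alternative swapped in here for illustration and not what any poster used. The function name `randomLetters` and the explicit seed parameter are assumptions made for the example. Whether this is faster than rand() on a given platform would have to be measured; what it does address is the distribution, because `std::uniform_int_distribution` returns each value in 0..25 with equal probability regardless of the engine's range.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Sketch using the C++11 <random> facilities instead of rand().
// The function name and the seed parameter are illustrative.
std::string randomLetters(std::size_t len, unsigned seed)
{
    std::mt19937 engine(seed);                       // Mersenne Twister engine
    std::uniform_int_distribution<int> pick(0, 25);  // unbiased 0..25
    std::string s(len, '\0');
    for (char& c : s)
        c = static_cast<char>('a' + pick(engine));
    return s;
}
```

A fixed seed makes runs repeatable, which helps when comparing timings; a value derived from the clock, as in the srand calls above, gives a different string on each run.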
 
Lew

In my system the rand() function is taking most of the time in this
test, and since it generates a 4 bytes random number i tried the
following:

n = rand();
memcpy(nt, &n, 4);
b[i+0] = nt[0] % 26 + 97;
b[i+1] = nt[1] % 26 + 97;
b[i+2] = nt[2] % 26 + 97;
b[i+3] = nt[3] % 26 + 97;

This resulted into 6x faster execution.

But not necessarily in equally random bytes.

The trouble is that randomness of each byte is not guaranteed by randomness of
the whole int.
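
One way to take several letters from a single value without introducing bias is to treat the value as a base-26 number and discard values that do not fit. The sketch below is an illustration, not code from the thread: `threeLetters` is a hypothetical name, and it assumes the input is uniform over 0..32767 (the minimum range the standard guarantees for rand()). Because 26^3 = 17576, every value below 17576 corresponds to exactly one three-letter combination.

```cpp
#include <cassert>

// Writes three letters derived from n into out and returns true, or returns
// false if n must be discarded. ASSUMES n is uniform over 0..32767.
// 26^3 = 17576, so values below 17576 map one-to-one onto all three-letter
// combinations; larger values are rejected to avoid bias.
bool threeLetters(unsigned n, char out[3])
{
    assert(n <= 32767u);
    if (n >= 17576u)
        return false;  // reject: keeping it would favour some letters
    out[0] = static_cast<char>('a' + n % 26);
    out[1] = static_cast<char>('a' + (n / 26) % 26);
    out[2] = static_cast<char>('a' + (n / 676) % 26);
    return true;
}
```

About 46% of the values are rejected, so this yields roughly 1.6 letters per call on average instead of 3 or 4. It removes the bias added by the extraction step only; any weakness in the generator itself still carries through, which is the point made above.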
 
Razii

Could you verify this one:

I would, but I already posted the source, so you can verify it
yourself.


=== C++ ===
#include <iostream>
#include <cstdlib>
#include <cstring>
#include <ctime>
#include <string>

using namespace std;

const int len = 50000000;
void getRandomString( std::string& s, size_t len);
void randomString( std::string& s);

int main( int argc, char *argv[])
{
if (argc < 2) { cout << "Fool enter text to search"; exit(0);}

string s;

string toSearch = argv[1];

clock_t start=clock();
randomString (s);
//getRandomString(s, len); version by Ian
clock_t endt=clock();

string::size_type found = s.find (toSearch);
int count = 0;
while (found!=string::npos)
{
count++;
found=s.find(toSearch,found+1);
}

std::cout <<"Number of " << toSearch << ": " << count << "\n";
std::cout <<"Time: " <<
double(endt-start)/CLOCKS_PER_SEC * 1000 << " ms\n";
}

/*
//my version
void getRandomString( std::string& s, size_t len )
{
srand((unsigned)std::time(0));

std::string cr( len, '\0' );
for( int i = 0 ; i < len ; ++i )
{
int iNumber;
iNumber = rand() % 26 + 97;
cr[i] = (char) iNumber;
}

s.swap( cr );
}

*/

//version by zionz
void randomString(std::string &s)
{
char cr[len];
int n;
unsigned char tn[4];
srand((unsigned)std::time(0));
for( int i = 0 ; i < len; i+=4 ){
n = rand();
memcpy(tn, &n, 4);
s+= (char) (tn[0] % 26 + 97);
s+= (char) (tn[1] % 26 + 97);
s+= (char) (tn[2] % 26 + 97);
s+= (char) (tn[3] % 26 + 97);
}
}


//version by Ian Collins
void getRandomString( std::string& s, size_t len )
{
srand(std::time(0));

std::string cr( len, '\0' );

// The following code is based on the requirement that RAND_MAX
// shall be at least 32767.
//
const size_t byThree = (len/3)*3;

size_t i(0);

while( i < byThree )
{
const int iNumber( rand() );

cr[i++] = iNumber % 26 + 97;
cr[i++] = (iNumber>>5) % 26 + 97;
cr[i++] = (iNumber>>10) % 26 + 97;
}

while( i < len )
{
const int iNumber( rand() );

cr[i++] = iNumber % 26 + 97;
}

s.swap( cr );
}

=== Java ===

import java.util.*;

public class Find{

public static void main(String[] arg)
{
if (arg.length == 0)
{
System.out.println("Fool! Enter text on command line to search");
System.exit(-1);
}

final int len = 50000000;
String toSearch = arg[0];

//create a string with 50 million chars from a-z

long start = System.currentTimeMillis();
String s = randomString(len);
long end = System.currentTimeMillis();

int count = 0;
int index = s.indexOf(toSearch);
while (index != -1)
{
index++;
count++;
index = s.indexOf(toSearch, index);
}


System.out.println("Number of " + toSearch + ": " + count);
System.out.println("Time: " + (end - start) + " ms");
}

//returns a String of length l with lowercase letters from a to z
static String randomString(int len)
{
char[] cr = new char[len];
Random rd = new Random();
for (int i = 0; i < len; i++){
int num = rd.nextInt(26) + 97;
cr[i] = (char) num;
}

return new String(cr);

}

}
 
Razii

Could you verify this one:

void getRandomString(std::string &s)
{
int n,i;
srand((unsigned)std::time(0));
s.reserve(len);
for(i = 0 ; i < len; i+=4){
n = rand();
s+= (char) (n % 26 + 97);
s+= (char) ( (n>>8) % 26 + 97);
s+= (char) ( (n>>16) % 26 + 97);
s+= (char) ( (n>>24) % 26 + 97);
}
}


Nope...

C:\>Find a
Number of a: 25970426
Time: 1422 ms

C:\>Find z
Number of z: 872062
Time: 1438 ms

C:\>Find zion
Number of zion: 0
Time: 1437 ms

C:\>Find adam
Number of adam: 0
Time: 1421 ms

In Java, adam returns:

C:\>java -Xmx256m -server Find adam
Number of adam: 116
Time: 1766 ms

(Note that it's a similar number to zion because both words are 4
letters long. On average, a truly random string returns a similar count
for any 4-letter word.)
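
The expected count can be worked out directly. For a string of N independent, uniformly random lowercase letters, a fixed word of k letters can start at N - k + 1 positions and matches at each with probability 1/26^k. The sketch below just evaluates that formula; `expectedOccurrences` is a hypothetical name, not something from the thread.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Expected number of (possibly overlapping) occurrences of one fixed word of
// wordLen letters in a string of textLen independent, uniformly random
// lowercase letters: (textLen - wordLen + 1) / 26^wordLen.
double expectedOccurrences(std::size_t textLen, std::size_t wordLen)
{
    assert(wordLen >= 1 && wordLen <= textLen);
    const double positions = static_cast<double>(textLen - wordLen + 1);
    return positions / std::pow(26.0, static_cast<double>(wordLen));
}
```

For 50 million letters this gives about 109.4 for a 4-letter word and about 1,923,077 for a single letter, in line with the Java counts above (106 to 127 for "zion" and "adam", about 1.92 million for 'a' and 'z'). The search loops in both programs restart one position after each hit, so they count overlapping matches, as the formula does.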
 
