Benchmarking

Doing a cursory Google or Yahoo search, I found some tests that claim Java is faster than C++.
One such is at www.kano.net

I looked at just one of the worst cases to see what was going on, and found the test and results to be pretty bogus. I'll concentrate on the first thing I looked at, the hash test...


Taken directly from the site:
Times in seconds (smaller is better)

                        G++ 3.3.1               Java 1.4.2
Test                    Intel 386   Intel 686   Server JVM   Client JVM
Ackermann               60.95       39.03       34.53        N/A*
Fibonacci               49.41       43.13       33.12        49.78
Hash2                   21.11       21.62       9.96         14.56
Hash                    16.43       16.51       7.93         86.76
Heapsort                20.04       19.63       21.59        20.47
Matrix                  10.58       10.76       14.95        30.55
Method call             24          20.49       2.47         39.18
Nested loop             12.63       15.08       23.42        32.57
Object creation         24.88       23.47       6.4          7.17
Random no. gen.         21.29       15.12       29.99        57.08
Sieve                   16.54       16.53       15.08        16.66
String concatenation    2.49        1.79        2.6          2.99
Sumcol                  9.11        8.51        6.69         12.75
Word count              4.37        4.27        3.02         3.89

BUT take a look at the raw command log for the hash results:
(I put the time command just before each result for clarity.)
------------------------------
[keith@leak bench]$ time cpp/hash 3000000; 
299999

real 0m17.624s
user 0m16.510s
sys 0m0.590s

time cpp/hash-386 3000000;
299999


real 0m17.623s
user 0m16.430s
sys 0m0.540s

time java -server -Xmx512M -cp java hash 3000000
299999

real 0m26.478s
user 0m7.930s
sys 0m0.150s

[keith@leak bench]$ time java -Xmx512M -cp java hash 3000000
299999

real 1m32.083s
user 1m26.760s
sys 0m2.840s

Please note that the 'user' times are the ones used in the table,
which is erroneous, since the JVM time apparently is
not included for the java -server invocation.
That run took 26.478s of real time but shows only 8.08s of
user plus sys time, leaving more than 18 seconds unaccounted for.
The two C++ runs show a gap of about half a second.
I would strongly argue that real time should be used.

If the computer isn't tasked with anything else but
the benchmark (which is only reasonable for a benchmark!), the real question
is: how long did it take?

You can't always trust the user number if the 'program' is just
a proxy client for the JVM!

Now on to the C++ code that was used for the benchmark...



For Comparison - The Java code:

// $Id: hash.java,v 1.3 2001/03/02 02:17:29 doug Exp $
// http://www.bagley.org/~doug/shootout/

// this program is modified from:
// http://cm.bell-labs.com/cm/cs/who/bwk/interps/pap.html
// Timing Trials, or, the Trials of Timing: Experiments with Scripting
// and User-Interface Languages by Brian W. Kernighan and
// Christopher J. Van Wyk.

import java.io.*;
import java.util.*;

public class hash {

    public static void main(String args[]) throws IOException {
        int n = Integer.parseInt(args[0]);
        int i, c;
        String s = "";
        Integer ii;
        // the original program used:
        // Hashtable ht = new Hashtable();
        // John Olsson points out that Hashtable is for synchronized access
        // and we should use instead:
        HashMap ht = new HashMap();

        c = 0;
        for (i = 1; i <= n; i++)
            ht.put(Integer.toString(i, 16), new Integer(i));
        for (i = 1; i <= n; i++)
            // The original code converted to decimal string this way:
            // if (ht.containsKey(i+""))
            if (ht.containsKey(Integer.toString(i, 10)))
                c++;

        System.out.println(c);
    }
}

The Original C++ Code:


// -*- mode: c++ -*-
// $Id: hash.g++,v 1.2 2001/06/20 03:20:02 doug Exp $
// http://www.bagley.org/~doug/shootout/

#include <stdio.h>
#include <iostream>
#include <hash_map.h>

using namespace std;

struct eqstr {
    bool operator()(const char* s1, const char* s2) const {
        return strcmp(s1, s2) == 0;
    }
};

int
main(int argc, char *argv[]) {
    int n = ((argc == 2) ? atoi(argv[1]) : 1);
    char buf[16];
    typedef hash_map<const char*, int, hash<const char*>, eqstr> HM;
    HM X;

    for (int i=1; i<=n; i++) {
        sprintf(buf, "%x", i);
        X[strdup(buf)] = i;
    }

    int c = 0;
    for (int i=n; i>0; i--) {
        sprintf(buf, "%d", i);
        if (X[strdup(buf)]) c++;
    }

    cout << c << endl;
}



And this is pretty bad, for several reasons. For large iteration counts the benchmark becomes bogged down in managing a fragmented heap rather than performing hash operations: every key is a separate strdup allocation, and none of them is ever freed. I noticed that our 'impartial tester' upped the iterations of the original benchmark test from 20000 to 3000000!

Also notice that the second loop counts down from n to 1, whereas the Java code counts up from 1 to n. The logic of the program has been altered, which may affect the validity of the test.

Also, the strdup call in the second loop is not necessary, and adds to the overhead. A better hash test would eliminate strdup altogether! That's exactly what I did, along with other cleanups, in the following modified code:


/*
 hash-v2.cpp modified hash benchmark by D.K.McCombs
 http://www.w3sys.com
 Original Code hash.cpp
 http://www.bagley.org/~doug/shootout/
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <iostream>
#include <hash_map.h>

using namespace std;

struct eqstr {
    bool operator()(const char* s1, const char* s2) const {
    return strcmp(s1, s2) == 0;
    }
};

int
main(int argc, char *argv[]) {
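    int n = ((argc == 2) ? atoi(argv[1]) : 1);
    // One block holds every key; 10 bytes per key is enough for the
    // hex digits of an int plus the terminating NUL.
    char * buffer = new char[10*n];
    char * buf = buffer;
    char tbuf[16];
    typedef hash_map<const char*, int, hash<const char*>, eqstr> HM;
    HM X;

    // Insert the hex strings for 0..n-1, packed back to back in buffer
    // (note: the Java version runs from 1 to n).
    for (int i=0; i<n; i++) {
      sprintf(buf, "%x", i);
      X[buf] = i;
      buf+=strlen(buf)+1;
    }

    // Look up the decimal strings; find() neither allocates nor inserts.
    int c = 0;
    for (int i=0; i<n; i++) {
      sprintf(tbuf, "%d", i);
      if (X.find(tbuf) != X.end() ) c++;
    }
    delete [] buffer;
    cout << c << endl;
}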




This code is twice as fast.  Applied to the real times in the original kano test results, that would yield:
about 9 seconds for C++ (half of 17.6)
26 seconds for java -server

C++ is well over twice as fast as java -server.

As a parting shot, while researching whether the JVM was in or out of process, which would nail down the bogus user metric, I came upon this FAQ.  These are questions by real developers, trying to use Java.  Note the dissatisfying responses!  This is why C++ is so important.  The paradigm trusts the developer to be a professional and program competently.  C++ allows you to get to the bottom of any problems.

I really didn't have time for this, but this uncollected garbage needed to be rebutted.

Copyright (C) 2008 - W3 Systems Design