Overview
How you access data can make a difference to speed. Whether you unroll loops by hand or let the JIT do it for you can also make a difference to performance. I have included C++ and Java tests doing the same thing for comparison.
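To make "manual loop unrolling" concrete, here is a minimal sketch of a plain loop next to a version unrolled four times by hand. The class and method names (`UnrollSketch`, `sumSimple`, `sumUnrolled4`) are made up for illustration and are not taken from the test code.

```java
final class UnrollSketch {
    // Plain loop: the JIT is free to unroll this itself.
    static long sumSimple(long[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    // Hand-unrolled four times: each iteration handles four elements.
    static long sumUnrolled4(long[] data) {
        long sum = 0;
        int i = 0;
        int limit = data.length - 3; // last index at which four elements still fit
        for (; i < limit; i += 4) {
            sum += data[i] + data[i + 1] + data[i + 2] + data[i + 3];
        }
        // Tail loop for lengths that are not a multiple of four.
        for (; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] data = new long[1003];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        // Both approaches must agree on the result.
        System.out.println(sumSimple(data) == sumUnrolled4(data));
    }
}
```

Both methods compute the same sum; the only difference is how much work each loop iteration does, which is the variable the "unrolled" column in the tests is about.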
Tests
In the following tests I compared different approaches to storing 16 GB of data. The comparison covers:
- allocating, writing to, reading from and total GC times
- byte[] (smallest primitive) and long[] (largest primitive)
- arrays, direct ByteBuffer and Unsafe
- JIT optimised and hand unrolled four times
store | type | size | unrolled | allocate | writing | reading | GC time |
---|---|---|---|---|---|---|---|
C++ char[] | native | 8-bit char | no | 31 μs | 12.0 s | 8.7 s | N/A |
C++ char[] | native | 8-bit char | yes | 5 μs | 8.8 s | 6.6 s | N/A |
C++ long long[] | native | 64-bit int | no | 11 μs | 4.6 s | 1.4 s | N/A |
C++ long long[] | native | 64-bit int | yes | 12 μs | 4.2 s | 1.2 s | N/A |
byte[] | heap | byte | no | 4.9 s | 20.7/7.8 s | 7.4 s | 51 ms |
byte[] | heap | byte | yes | 4.9 s | 7.1 s | 8.5 s | 44 ms |
long[] | heap | long | no | 4.7 s | 1.6 s | 1.5 s | 37 ms |
long[] | heap | long | yes | 4.7 s | 1.5 s | 1.4 s | 45 ms |
ByteBuffer | direct | byte | no | 4.8 s | 18.1/10.0 s | 14.0 s | 6.1 ms |
ByteBuffer | direct | byte | yes | 4.8 s | 12.2/10.0 s | 16.7 s | 6.1 ms |
ByteBuffer | direct | long | no | 4.7 s | 6.0/3.9 s | 2.4 s | 6.1 ms |
ByteBuffer | direct | long | yes | 4.6 s | 4.7/2.3 s | 7.9 s | 6.1 ms |
Unsafe | direct | byte | no | 10 μs | 18.2 s | 13.8 s | 6.0 ms |
Unsafe | direct | byte | yes | 10 μs | 8.7 s | 8.3 s | 6.0 ms |
Unsafe | direct | long | no | 10 μs | 5.2 s | 1.9 s | 6.0 ms |
Unsafe | direct | long | yes | 10 μs | 4.2 s | 1.3 s | 6.0 ms |
In each case, this is the time to perform 8-bit byte or 64-bit long operations on 16 GB of data in the structure shown. In C++ and using Unsafe, a single array/block of memory was used. For Java arrays and ByteBuffers, multiple objects were used to create the same total amount of space.
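Java arrays are indexed by int, so one array cannot cover 16 GB, which is why multiple objects are needed. One way to do this, as a sketch, is to split a long index into a chunk number and an offset. The class `ChunkedLongs` and its `chunkBits` parameter are hypothetical and not necessarily how MemoryTest.java does it.

```java
final class ChunkedLongs {
    private final int chunkBits;   // log2 of the chunk size; assumed to be between 1 and 30
    private final int chunkMask;   // low bits select the offset within a chunk
    private final long[][] chunks; // several ordinary arrays acting as one large store
    private final long length;

    ChunkedLongs(long length, int chunkBits) {
        this.length = length;
        this.chunkBits = chunkBits;
        this.chunkMask = (1 << chunkBits) - 1;
        int chunkSize = 1 << chunkBits;
        // Round up so a partially used final chunk is still allocated.
        int count = (int) ((length + chunkSize - 1) >>> chunkBits);
        this.chunks = new long[count][chunkSize];
    }

    long length() {
        return length;
    }

    long get(long index) {
        return chunks[(int) (index >>> chunkBits)][(int) (index & chunkMask)];
    }

    void set(long index, long value) {
        chunks[(int) (index >>> chunkBits)][(int) (index & chunkMask)] = value;
    }

    public static void main(String[] args) {
        // Small sizes so the sketch runs anywhere; the real tests used 16 GB.
        ChunkedLongs store = new ChunkedLongs(1000, 4);
        for (long i = 0; i < store.length(); i++) {
            store.set(i, i * 3);
        }
        System.out.println(store.get(999));
    }
}
```

A power-of-two chunk size keeps the index arithmetic to a shift and a mask. A timed loop would more likely walk each inner array directly than call `get` per element.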
C++ test configuration
All tests were performed with gcc 4.5.2 on Ubuntu 11.04, compiled with -O2.
Java test configuration
All tests were performed with Java 6 update 26 and Java 7 update 0, on a fast PC with 24 GB of memory. Timings are given as Java 6/Java 7. Where there is only one value, they were the same. All tests were run with the options -mx23g -XX:MaxDirectMemorySize=20g -verbosegc
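The MaxDirectMemorySize option matters because direct ByteBuffers live outside the Java heap. As a rough sketch of the kind of access being timed, the following allocates one small direct buffer and writes and reads it as 64-bit values. The names (`DirectBufferSketch`, `writeLongs`, `sumLongs`) are illustrative assumptions, and the real tests used many buffers to reach 16 GB, since `allocateDirect` takes an int capacity.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class DirectBufferSketch {
    // Allocate off-heap memory; native byte order avoids byte swapping on access.
    static ByteBuffer allocate(int bytes) {
        return ByteBuffer.allocateDirect(bytes).order(ByteOrder.nativeOrder());
    }

    // Write one long per 8 bytes, using absolute (index-based) puts.
    static void writeLongs(ByteBuffer buf) {
        int limit = buf.capacity() - 7; // last offset at which a whole long still fits
        for (int i = 0; i < limit; i += 8) {
            buf.putLong(i, i);
        }
    }

    // Read the longs back and add them up so the reads cannot be skipped.
    static long sumLongs(ByteBuffer buf) {
        long sum = 0;
        int limit = buf.capacity() - 7;
        for (int i = 0; i < limit; i += 8) {
            sum += buf.getLong(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        ByteBuffer buf = allocate(1 << 20); // 1 MB here, not 16 GB
        writeLongs(buf);
        System.out.println(sumLongs(buf));
    }
}
```

Absolute `putLong(int, long)` and `getLong(int)` are used here so the buffer's position never has to be reset between the write and read passes.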
Curiosity
For me, the most curious result was the performance of long[], which was very fast in Java. Writes in particular (1.5 to 1.6 s) were faster than using C++ (4.2 to 4.6 s) or Unsafe directly (4.2 to 5.2 s).
The code
C++ tests - memorytest/main.cpp
Java tests - MemoryTest.java