Wednesday, September 7, 2011

Reading/writing GC-less memory

Overview

How you access data can make a difference to the speed. Whether you use manual loop unrolling or let the JIT do it for you can also make a difference to performance.

I have included C++ and Java tests doing the same thing for comparison.

Tests

In each case, different approaches to storing 16 GB of data were compared.

In the following tests I compared storing data
  • allocating, writing to, reading from and total GC times
  • byte[] (smallest primitive) and long[] (largest primitive)
  • arrays, direct ByteBuffer and Unsafe
  • JIT optimised and hand unrolled four times

storetypesizeunrolledallocatewritingreadingGC time
C++ char[]native8-bit charno31 μs12.0 s8.7 sN/A
C++ char[]native8-bit charyes5 μs8.8 s6.6 sN/A
C++ long long[]native64-bit intno11 μs4.6 s1.4 sN/A
C++ long long[]native64-bit intyes12 μs4.2 s1.2 sN/A
byte[]heapbyteno4.9 s20.7/7.8 s7.4 s51 ms
byte[]heapbyteyes4.9 s7.1 s8.5 s44 ms
long[]heaplongno4.7 s1.6 s1.5 s37 ms
long[]heaplongyes4.7 s1.5 s1.4 s45 ms
ByteBufferdirectbyteno4.8 s18.1/10.0 s14.0 s6.1 ms
ByteBufferdirectbyteyes4.8 s12.2/10.0 s16.7 s6.1 ms
ByteBufferdirectlongno4.7 s6.0/3.9 s2.4 s6.1 ms
ByteBufferdirectlongyes4.6 s4.7/2.3 s7.9 s6.1 ms
Unsafedirectbyteno10 μs18.2 s13.8 s6.0 ms
Unsafedirectbyteyes10 μs8.7 s8.3 s6.0 ms
Unsafedirectlongno10 μs5.2 s1.9 s6.0 ms
Unsafedirectlongyes10 μs4.2 s1.3 s6.0 ms

In each case, this is the time to perform 8-bit byte or 64-bit long operations on 16 GB of data in different structures as required. In C++ and using Unsafe, I single array/block memory was used. For Java array and ByteBuffer multiple objects were use to create the same total amount of space.

C++ test configuration

All tests were performed with gcc 4.5.2 on ubuntu 11.04, compiled with -O2

Java test configuration

All test were performed with Java 6 update 26 and Java 7 update 0, on a fast PC with 24 GB of memory. Timings are for 6/7. Where there one value they were the same.

All tests were run with the options -mx23g -XX:MaxDirectMemorySize=20g -verbosegc

Curiosity

For me the most curious result was the performance of the long[] which was very fast in Java, faster than using C++ or Unsafe directly.

The code

C++ tests - memorytest/main.cpp

Java tests - MemoryTest.java

No comments:

Post a Comment