Single core performance, Free benchmark, SandyBridge
Performance in millions of particle-timesteps / second

Nparticles CPU (mpi) Kokkos/OMP (mpi) Kokkos/serial (mpi)
1000 36.19 (1) 22.02 (1) 36 (1)
2000 50.58 (1) 15.31 (1) 35.06 (1)
4000 39.86 (1) 29.13 (1) 36.92 (1)
8000 38.18 (1) 42.76 (1) 44.15 (1)
16000 43.89 (1) 16.82 (1) 37.8 (1)

Run commands and logfile links for column CPU

1000 mpirun -n 1 -N 1 --bind-to core spa_chama_cpu -v x 4 -v y 5 -v z 5 -v t 100 -in in.free.steps -log log.sparta.date=23Dec17.model=free.machine=chama.pkg=cpu.kind=core.size=1K.node=1.mpi=1
2000 mpirun -n 1 -N 1 --bind-to core spa_chama_cpu -v x 8 -v y 5 -v z 5 -v t 100 -in in.free.steps -log log.sparta.date=23Dec17.model=free.machine=chama.pkg=cpu.kind=core.size=2K.node=1.mpi=1
4000 mpirun -n 1 -N 1 --bind-to core spa_chama_cpu -v x 8 -v y 5 -v z 10 -v t 100 -in in.free.steps -log log.sparta.date=23Dec17.model=free.machine=chama.pkg=cpu.kind=core.size=4K.node=1.mpi=1
8000 mpirun -n 1 -N 1 --bind-to core spa_chama_cpu -v x 8 -v y 10 -v z 10 -v t 100 -in in.free.steps -log log.sparta.date=23Dec17.model=free.machine=chama.pkg=cpu.kind=core.size=8K.node=1.mpi=1
16000 mpirun -n 1 -N 1 --bind-to core spa_chama_cpu -v x 16 -v y 10 -v z 10 -v t 100 -in in.free.steps -log log.sparta.date=23Dec17.model=free.machine=chama.pkg=cpu.kind=core.size=16K.node=1.mpi=1

Run commands and logfile links for column Kokkos/OMP

1000 mpirun -n 1 -N 1 --bind-to socket spa_chama_kokkos_omp -sf kk -k on t 1 -pk kokkos reduction parallel/reduce comm classic -v x 4 -v y 5 -v z 5 -v t 100 -in in.free.steps -log log.sparta.date=23Dec17.model=free.machine=chama.pkg=kokkos_omp.kind=core.size=1K.node=1.mpi=1.thread=1
2000 mpirun -n 1 -N 1 --bind-to socket spa_chama_kokkos_omp -sf kk -k on t 1 -pk kokkos reduction parallel/reduce comm classic -v x 8 -v y 5 -v z 5 -v t 100 -in in.free.steps -log log.sparta.date=23Dec17.model=free.machine=chama.pkg=kokkos_omp.kind=core.size=2K.node=1.mpi=1.thread=1
4000 mpirun -n 1 -N 1 --bind-to socket spa_chama_kokkos_omp -sf kk -k on t 1 -pk kokkos reduction parallel/reduce comm classic -v x 8 -v y 5 -v z 10 -v t 100 -in in.free.steps -log log.sparta.date=23Dec17.model=free.machine=chama.pkg=kokkos_omp.kind=core.size=4K.node=1.mpi=1.thread=1
8000 mpirun -n 1 -N 1 --bind-to socket spa_chama_kokkos_omp -sf kk -k on t 1 -pk kokkos reduction parallel/reduce comm classic -v x 8 -v y 10 -v z 10 -v t 100 -in in.free.steps -log log.sparta.date=23Dec17.model=free.machine=chama.pkg=kokkos_omp.kind=core.size=8K.node=1.mpi=1.thread=1
16000 mpirun -n 1 -N 1 --bind-to socket spa_chama_kokkos_omp -sf kk -k on t 1 -pk kokkos reduction parallel/reduce comm classic -v x 16 -v y 10 -v z 10 -v t 100 -in in.free.steps -log log.sparta.date=23Dec17.model=free.machine=chama.pkg=kokkos_omp.kind=core.size=16K.node=1.mpi=1.thread=1

Run commands and logfile links for column Kokkos/serial

1000 mpirun -n 1 -N 1 --bind-to core spa_chama_kokkos_serial -sf kk -k on -pk kokkos reduction parallel/reduce comm classic -v x 4 -v y 5 -v z 5 -v t 100 -in in.free.steps -log log.sparta.date=23Dec17.model=free.machine=chama.pkg=kokkos_serial.kind=core.size=1K.node=1.mpi=1
2000 mpirun -n 1 -N 1 --bind-to core spa_chama_kokkos_serial -sf kk -k on -pk kokkos reduction parallel/reduce comm classic -v x 8 -v y 5 -v z 5 -v t 100 -in in.free.steps -log log.sparta.date=23Dec17.model=free.machine=chama.pkg=kokkos_serial.kind=core.size=2K.node=1.mpi=1
4000 mpirun -n 1 -N 1 --bind-to core spa_chama_kokkos_serial -sf kk -k on -pk kokkos reduction parallel/reduce comm classic -v x 8 -v y 5 -v z 10 -v t 100 -in in.free.steps -log log.sparta.date=23Dec17.model=free.machine=chama.pkg=kokkos_serial.kind=core.size=4K.node=1.mpi=1
8000 mpirun -n 1 -N 1 --bind-to core spa_chama_kokkos_serial -sf kk -k on -pk kokkos reduction parallel/reduce comm classic -v x 8 -v y 10 -v z 10 -v t 100 -in in.free.steps -log log.sparta.date=23Dec17.model=free.machine=chama.pkg=kokkos_serial.kind=core.size=8K.node=1.mpi=1
16000 mpirun -n 1 -N 1 --bind-to core spa_chama_kokkos_serial -sf kk -k on -pk kokkos reduction parallel/reduce comm classic -v x 16 -v y 10 -v z 10 -v t 100 -in in.free.steps -log log.sparta.date=23Dec17.model=free.machine=chama.pkg=kokkos_serial.kind=core.size=16K.node=1.mpi=1