Single-core performance, Collide benchmark, SandyBridge
Performance in millions of particle-timesteps / second

Nparticles   CPU (mpi)    Kokkos/OMP (mpi)   Kokkos/serial (mpi)
1000         24.79 (1)    13.67 (1)          19.76 (1)
2000         26.75 (1)    9.111 (1)          22.18 (1)
4000         20.92 (1)    13.05 (1)          19.24 (1)
8000         18.48 (1)    18.96 (1)          18.27 (1)
16000        18.41 (1)    8.998 (1)          16.78 (1)
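
The figures above are rates, so comparing columns reduces to simple ratios. The sketch below is in Python, chosen only for illustration; it is not part of the benchmark suite. It copies the table values and computes each Kokkos column relative to the CPU column. The helper rate_mpts shows how the metric is formed from a particle count, a step count, and an elapsed time; its name and arguments are hypothetical.

```python
# Table values copied from above: millions of particle-timesteps / second.
# Mapping: Nparticles -> (CPU, Kokkos/OMP, Kokkos/serial)
RATES = {
    1000: (24.79, 13.67, 19.76),
    2000: (26.75, 9.111, 22.18),
    4000: (20.92, 13.05, 19.24),
    8000: (18.48, 18.96, 18.27),
    16000: (18.41, 8.998, 16.78),
}


def rate_mpts(nparticles, nsteps, seconds):
    """Hypothetical helper: millions of particle-timesteps per second."""
    return nparticles * nsteps / seconds / 1.0e6


def relative_to_cpu(rates):
    """Return {Nparticles: (Kokkos/OMP over CPU, Kokkos/serial over CPU)}."""
    return {n: (omp / cpu, ser / cpu) for n, (cpu, omp, ser) in rates.items()}


if __name__ == "__main__":
    for n, (omp, ser) in relative_to_cpu(RATES).items():
        print(f"{n:>6}  Kokkos/OMP {omp:.2f}x  Kokkos/serial {ser:.2f}x")
```

A ratio below 1 means the Kokkos build was slower than the CPU build at that size. In this table only the 8000-particle Kokkos/OMP entry exceeds 1.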

Run commands and logfile links for column CPU

1000 mpirun -n 1 -N 1 --bind-to core spa_chama_cpu -v x 4 -v y 5 -v z 5 -v t 100 -in in.collide.steps -log log.sparta.date=23Dec17.model=collide.machine=chama.pkg=cpu.kind=core.size=1K.node=1.mpi=1
2000 mpirun -n 1 -N 1 --bind-to core spa_chama_cpu -v x 8 -v y 5 -v z 5 -v t 100 -in in.collide.steps -log log.sparta.date=23Dec17.model=collide.machine=chama.pkg=cpu.kind=core.size=2K.node=1.mpi=1
4000 mpirun -n 1 -N 1 --bind-to core spa_chama_cpu -v x 8 -v y 5 -v z 10 -v t 100 -in in.collide.steps -log log.sparta.date=23Dec17.model=collide.machine=chama.pkg=cpu.kind=core.size=4K.node=1.mpi=1
8000 mpirun -n 1 -N 1 --bind-to core spa_chama_cpu -v x 8 -v y 10 -v z 10 -v t 100 -in in.collide.steps -log log.sparta.date=23Dec17.model=collide.machine=chama.pkg=cpu.kind=core.size=8K.node=1.mpi=1
16000 mpirun -n 1 -N 1 --bind-to core spa_chama_cpu -v x 16 -v y 10 -v z 10 -v t 100 -in in.collide.steps -log log.sparta.date=23Dec17.model=collide.machine=chama.pkg=cpu.kind=core.size=16K.node=1.mpi=1
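
The -v x, -v y, and -v z values in the commands above grow in step with the particle count. A minimal Python sketch that checks this, using only the numbers shown in the commands. What the three variables mean inside in.collide.steps is not shown here, so treating their product as a size measure is an assumption.

```python
# (Nparticles, x, y, z) copied from the run commands above.
RUNS = [
    (1000, 4, 5, 5),
    (2000, 8, 5, 5),
    (4000, 8, 5, 10),
    (8000, 8, 10, 10),
    (16000, 16, 10, 10),
]


def particles_per_unit(nparticles, x, y, z):
    """Particles divided by the product x*y*z (assumed size measure)."""
    return nparticles / (x * y * z)


if __name__ == "__main__":
    for n, x, y, z in RUNS:
        print(n, x * y * z, particles_per_unit(n, x, y, z))
```

For all five runs the ratio is the same, 10 particles per unit of x*y*z, and each doubling of the particle count doubles exactly one of the three variables.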

Run commands and logfile links for column Kokkos/OMP

1000 mpirun -n 1 -N 1 --bind-to socket spa_chama_kokkos_omp -sf kk -k on t 1 -pk kokkos reduction parallel/reduce comm classic -v x 4 -v y 5 -v z 5 -v t 100 -in in.collide.steps -log log.sparta.date=23Dec17.model=collide.machine=chama.pkg=kokkos_omp.kind=core.size=1K.node=1.mpi=1.thread=1
2000 mpirun -n 1 -N 1 --bind-to socket spa_chama_kokkos_omp -sf kk -k on t 1 -pk kokkos reduction parallel/reduce comm classic -v x 8 -v y 5 -v z 5 -v t 100 -in in.collide.steps -log log.sparta.date=23Dec17.model=collide.machine=chama.pkg=kokkos_omp.kind=core.size=2K.node=1.mpi=1.thread=1
4000 mpirun -n 1 -N 1 --bind-to socket spa_chama_kokkos_omp -sf kk -k on t 1 -pk kokkos reduction parallel/reduce comm classic -v x 8 -v y 5 -v z 10 -v t 100 -in in.collide.steps -log log.sparta.date=23Dec17.model=collide.machine=chama.pkg=kokkos_omp.kind=core.size=4K.node=1.mpi=1.thread=1
8000 mpirun -n 1 -N 1 --bind-to socket spa_chama_kokkos_omp -sf kk -k on t 1 -pk kokkos reduction parallel/reduce comm classic -v x 8 -v y 10 -v z 10 -v t 100 -in in.collide.steps -log log.sparta.date=23Dec17.model=collide.machine=chama.pkg=kokkos_omp.kind=core.size=8K.node=1.mpi=1.thread=1
16000 mpirun -n 1 -N 1 --bind-to socket spa_chama_kokkos_omp -sf kk -k on t 1 -pk kokkos reduction parallel/reduce comm classic -v x 16 -v y 10 -v z 10 -v t 100 -in in.collide.steps -log log.sparta.date=23Dec17.model=collide.machine=chama.pkg=kokkos_omp.kind=core.size=16K.node=1.mpi=1.thread=1

Run commands and logfile links for column Kokkos/serial

1000 mpirun -n 1 -N 1 --bind-to core spa_chama_kokkos_serial -sf kk -k on -pk kokkos reduction parallel/reduce comm classic -v x 4 -v y 5 -v z 5 -v t 100 -in in.collide.steps -log log.sparta.date=23Dec17.model=collide.machine=chama.pkg=kokkos_serial.kind=core.size=1K.node=1.mpi=1
2000 mpirun -n 1 -N 1 --bind-to core spa_chama_kokkos_serial -sf kk -k on -pk kokkos reduction parallel/reduce comm classic -v x 8 -v y 5 -v z 5 -v t 100 -in in.collide.steps -log log.sparta.date=23Dec17.model=collide.machine=chama.pkg=kokkos_serial.kind=core.size=2K.node=1.mpi=1
4000 mpirun -n 1 -N 1 --bind-to core spa_chama_kokkos_serial -sf kk -k on -pk kokkos reduction parallel/reduce comm classic -v x 8 -v y 5 -v z 10 -v t 100 -in in.collide.steps -log log.sparta.date=23Dec17.model=collide.machine=chama.pkg=kokkos_serial.kind=core.size=4K.node=1.mpi=1
8000 mpirun -n 1 -N 1 --bind-to core spa_chama_kokkos_serial -sf kk -k on -pk kokkos reduction parallel/reduce comm classic -v x 8 -v y 10 -v z 10 -v t 100 -in in.collide.steps -log log.sparta.date=23Dec17.model=collide.machine=chama.pkg=kokkos_serial.kind=core.size=8K.node=1.mpi=1
16000 mpirun -n 1 -N 1 --bind-to core spa_chama_kokkos_serial -sf kk -k on -pk kokkos reduction parallel/reduce comm classic -v x 16 -v y 10 -v z 10 -v t 100 -in in.collide.steps -log log.sparta.date=23Dec17.model=collide.machine=chama.pkg=kokkos_serial.kind=core.size=16K.node=1.mpi=1
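
The logfile names in all three command lists encode the run parameters as dot-separated key=value fields. A minimal Python sketch for recovering those fields from a name; the function name parse_log_name is hypothetical, and the approach assumes no value itself contains a dot, which holds for every name listed on this page.

```python
def parse_log_name(name):
    """Split a logfile name into a dict of its key=value fields.

    Parts without '=' (such as the leading 'log' and 'sparta') are skipped.
    """
    fields = {}
    for part in name.split("."):
        key, sep, value = part.partition("=")
        if sep:  # keep only parts that actually contain '='
            fields[key] = value
    return fields


if __name__ == "__main__":
    example = (
        "log.sparta.date=23Dec17.model=collide.machine=chama."
        "pkg=kokkos_serial.kind=core.size=16K.node=1.mpi=1"
    )
    for key, value in parse_log_name(example).items():
        print(key, value)
```

Parsing the names this way makes it straightforward to group logs by pkg or size when collecting results across the three columns.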