Single core performance, Sphere benchmark, Haswell
Performance in millions of particle-timesteps / second

Nparticles CPU (mpi) Kokkos/OMP (mpi) Kokkos/serial (mpi)
8000 20.65 (1) 17.41 (1) 17.81 (1)
16000 27.04 (1) 23 (1) 23.29 (1)

Run commands and logfile links for column CPU

8000 srun -n 1 -C haswell --ntasks-per-node 1 --cpu_bind=cores -c 2 ./spa_mutrino_cpu -v x 8 -v y 10 -v z 10 -v t 100 -in in.sphere.steps -log log.sparta.date=23Dec17.model=sphere.machine=mutrino.pkg=cpu.kind=core.size=8K.node=1.mpi=1.hyper=1
16000 srun -n 1 -C haswell --ntasks-per-node 1 --cpu_bind=cores -c 2 ./spa_mutrino_cpu -v x 16 -v y 10 -v z 10 -v t 100 -in in.sphere.steps -log log.sparta.date=23Dec17.model=sphere.machine=mutrino.pkg=cpu.kind=core.size=16K.node=1.mpi=1.hyper=1

Run commands and logfile links for column Kokkos/OMP

8000 setenv OMP_NUM_THREADS 1; srun -n 1 -C haswell --ntasks-per-node 1 --cpu_bind=cores -c 2 ./spa_mutrino_kokkos_omp -sf kk -k on t 1 -pk kokkos reduction parallel/reduce comm classic -v x 8 -v y 10 -v z 10 -v t 100 -in in.sphere.steps -log log.sparta.date=23Dec17.model=sphere.machine=mutrino.pkg=kokkos_omp.kind=core.size=8K.node=1.mpi=1.thread=1.hyper=1
16000 setenv OMP_NUM_THREADS 1; srun -n 1 -C haswell --ntasks-per-node 1 --cpu_bind=cores -c 2 ./spa_mutrino_kokkos_omp -sf kk -k on t 1 -pk kokkos reduction parallel/reduce comm classic -v x 16 -v y 10 -v z 10 -v t 100 -in in.sphere.steps -log log.sparta.date=23Dec17.model=sphere.machine=mutrino.pkg=kokkos_omp.kind=core.size=16K.node=1.mpi=1.thread=1.hyper=1

Run commands and logfile links for column Kokkos/serial

8000 srun -n 1 -C haswell --ntasks-per-node 1 --cpu_bind=cores -c 2 ./spa_mutrino_kokkos_serial -sf kk -k on -pk kokkos reduction parallel/reduce comm classic -v x 8 -v y 10 -v z 10 -v t 100 -in in.sphere.steps -log log.sparta.date=23Dec17.model=sphere.machine=mutrino.pkg=kokkos_serial.kind=core.size=8K.node=1.mpi=1.hyper=1
16000 srun -n 1 -C haswell --ntasks-per-node 1 --cpu_bind=cores -c 2 ./spa_mutrino_kokkos_serial -sf kk -k on -pk kokkos reduction parallel/reduce comm classic -v x 16 -v y 10 -v z 10 -v t 100 -in in.sphere.steps -log log.sparta.date=23Dec17.model=sphere.machine=mutrino.pkg=kokkos_serial.kind=core.size=16K.node=1.mpi=1.hyper=1