SPARTA (23 Dec 2017)
KOKKOS mode is enabled (../kokkos.cpp:39)
  using 0 GPU(s) per MPI task
  using 1 thread(s) per MPI task
package kokkos
package kokkos reduction parallel/reduce comm classic
# advect particles via VSS collisional flow on a uniform grid
# particles reflect off global box boundaries

variable            x index 10
variable            y index 10
variable            z index 10

variable            lx equal $x*1.0e-5
variable            lx equal 256*1.0e-5
variable            ly equal $y*1.0e-5
variable            ly equal 160*1.0e-5
variable            lz equal $z*1.0e-5
variable            lz equal 160*1.0e-5

variable            n equal 10*$x*$y*$z
variable            n equal 10*256*$y*$z
variable            n equal 10*256*160*$z
variable            n equal 10*256*160*160

seed	    	    12345
dimension   	    3
global              gridcut 1.0e-5

boundary	    rr rr rr

create_box  	    0 ${lx} 0 ${ly} 0 ${lz}
create_box  	    0 0.00256 0 ${ly} 0 ${lz}
create_box  	    0 0.00256 0 0.0016 0 ${lz}
create_box  	    0 0.00256 0 0.0016 0 0.0016
Created orthogonal box = (0 0 0) to (0.00256 0.0016 0.0016)
create_grid 	    $x $y $z
create_grid 	    256 $y $z
create_grid 	    256 160 $z
create_grid 	    256 160 160
WARNING: Could not acquire nearby ghost cells b/c grid partition is not clumped (../grid.cpp:381)
Created 6553600 child grid cells
  parent cells = 1
  CPU time = 0.107178 secs
  create/ghost percent = 17.503 82.497

balance_grid        rcb part
Balance grid migrated 6508484 cells
  CPU time = 0.436825 secs
  reassign/sort/migrate/ghost percent = 44.2139 0.814015 29.3637 25.6083

species		    ar.species Ar
mixture		    air Ar vstream 0.0 0.0 0.0 temp 273.15

global              nrho 7.07043E22
global              fnum 7.07043E6

collide		    vss air ar.vss

create_particles    air n $n
create_particles    air n 65536000
Created 65536000 particles
  CPU time = 0.389204 secs

stats		    100
compute             temp temp
stats_style	    step cpu np nattempt ncoll c_temp

# first equilibrate with large timestep to unsort particles
# then benchmark with normal timestep

timestep 	    7.00E-8
run                 30
Memory usage per proc in Mbytes:
  particles (ave,min,max) = 68.9175 68.9175 68.9175
  grid      (ave,min,max) = 10.5178 9.63888 11.5139
  surf      (ave,min,max) = 0 0 0
  total     (ave,min,max) = 79.4353 78.5564 80.4314
Step CPU Np Natt Ncoll c_temp 
       0            0 65536000        0        0    273.13318 
      30    16.930848 65536000 65760304 46290531    273.13318 
Loop time of 16.934 on 144 procs for 30 steps with 65536000 particles

Particle moves    = 1966080000 (1.97B)
Cells touched     = 9790138957 (9.79B)
Particle comms    = 180622352 (181M)
Boundary collides = 42966463 (43M)
Boundary exits    = 0 (0K)
SurfColl checks   = 0 (0K)
SurfColl occurs   = 0 (0K)
Surf reactions    = 0 (0K)
Collide attempts  = 1903678966 (1.9B)
Collide occurs    = 1382123700 (1.38B)
Reactions         = 0 (0K)
Particles stuck   = 0

Particle-moves/CPUsec/proc: 806267
Particle-moves/step: 6.5536e+07
Cell-touches/particle/step: 4.97952
Particle comm iterations/step: 3.03333
Particle fraction communicated: 0.0918693
Particle fraction colliding with boundary: 0.0218539
Particle fraction exiting boundary: 0
Surface-checks/particle/step: 0
Surface-collisions/particle/step: 0
Surf-reactions/particle/step: 0
Collision-attempts/particle/step: 0.968261
Collisions/particle/step: 0.702984
Reactions/particle/step: 0

Move  time (%) = 10.1926 (60.1901)
Coll  time (%) = 3.85419 (22.76)
Sort  time (%) = 1.63415 (9.65011)
Comm  time (%) = 1.21508 (7.17538)
Modfy time (%) = 0 (0)
Outpt time (%) = 0.0354956 (0.209611)
Other time (%) = 0.00249466 (0.0147316)

Particles: 455111 ave 456718 max 453556 min
Histogram: 1 9 16 24 23 25 24 13 7 2
Cells:      45511.1 ave 45512 max 45510 min
Histogram: 3 0 0 0 0 122 0 0 0 19
GhostCell: 15387.2 ave 21275 max 7325 min
Histogram: 2 2 8 18 18 26 26 23 13 8
EmptyCell: 4148.56 ave 6540 max 0 min
Histogram: 3 1 2 15 27 11 18 39 0 28
timestep 	    7.00E-9
run 		    $t
run 		    100
Memory usage per proc in Mbytes:
  particles (ave,min,max) = 68.9175 68.9175 68.9175
  grid      (ave,min,max) = 10.5178 9.63888 11.5139
  surf      (ave,min,max) = 0 0 0
  total     (ave,min,max) = 79.4353 78.5564 80.4314
Step CPU Np Natt Ncoll c_temp 
      30            0 65536000 65760304 46290531    273.13318 
     100    14.725935 65536000  6289563  4626067    273.13318 
     130    21.081372 65536000  6373884  4628319    273.13318 
Loop time of 21.083 on 144 procs for 100 steps with 65536000 particles

Particle moves    = 6553600000 (6.55B)
Cells touched     = 9158087012 (9.16B)
Particle comms    = 69710913 (69.7M)
Boundary collides = 14322520 (14.3M)
Boundary exits    = 0 (0K)
SurfColl checks   = 0 (0K)
SurfColl occurs   = 0 (0K)
Surf reactions    = 0 (0K)
Collide attempts  = 613997697 (614M)
Collide occurs    = 459899402 (460M)
Reactions         = 0 (0K)
Particles stuck   = 0

Particle-moves/CPUsec/proc: 2.15866e+06
Particle-moves/step: 6.5536e+07
Cell-touches/particle/step: 1.39741
Particle comm iterations/step: 1
Particle fraction communicated: 0.010637
Particle fraction colliding with boundary: 0.00218544
Particle fraction exiting boundary: 0
Surface-checks/particle/step: 0
Surface-collisions/particle/step: 0
Surf-reactions/particle/step: 0
Collision-attempts/particle/step: 0.0936886
Collisions/particle/step: 0.0701751
Reactions/particle/step: 0

Move  time (%) = 13.0139 (61.7269)
Coll  time (%) = 2.4464 (11.6037)
Sort  time (%) = 5.12473 (24.3074)
Comm  time (%) = 0.443562 (2.10388)
Modfy time (%) = 0 (0)
Outpt time (%) = 0.0531622 (0.252156)
Other time (%) = 0.0012552 (0.00595362)

Particles: 455111 ave 456580 max 453452 min
Histogram: 3 7 10 23 18 25 23 23 7 5
Cells:      45511.1 ave 45512 max 45510 min
Histogram: 3 0 0 0 0 122 0 0 0 19
GhostCell: 15387.2 ave 21275 max 7325 min
Histogram: 2 2 8 18 18 26 26 23 13 8
EmptyCell: 4148.56 ave 6540 max 0 min
Histogram: 3 1 2 15 27 11 18 39 0 28
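
The derived per-run figures reported above (Particle-moves/CPUsec/proc, Cell-touches/particle/step, Particle fraction communicated) appear to follow directly from the raw counters, the loop time, and the processor count. The sketch below is a cross-check under that assumption, not SPARTA's own code. The function names are hypothetical, and the inputs are the rounded values printed in this log, so the results agree with the reported figures only to within rounding.

```python
# Cross-check of derived statistics using values copied from the log above.
# Assumption: rate = particle moves / (loop time * number of procs).

def per_proc_rate(particle_moves, loop_time_s, nprocs):
    """Particle moves per CPU second per processor."""
    return particle_moves / (loop_time_s * nprocs)

def per_particle_step(count, particle_moves):
    """Normalize a raw counter by the total number of particle moves."""
    return count / particle_moves

NPROCS = 144

# Equilibration run: 30 steps, timestep 7.00E-8
equil_rate = per_proc_rate(1_966_080_000, 16.934, NPROCS)      # log reports 806267
equil_touch = per_particle_step(9_790_138_957, 1_966_080_000)  # log reports 4.97952
equil_comm = per_particle_step(180_622_352, 1_966_080_000)     # log reports 0.0918693

# Benchmark run: 100 steps, timestep 7.00E-9
bench_rate = per_proc_rate(6_553_600_000, 21.083, NPROCS)      # log reports 2.15866e+06
bench_touch = per_particle_step(9_158_087_012, 6_553_600_000)  # log reports 1.39741

# Problem size: 10 particles per cell on a 256 x 160 x 160 grid
cells = 256 * 160 * 160
particles = 10 * cells

print(f"equilibration: {equil_rate:.0f} moves/CPUsec/proc")
print(f"benchmark:     {bench_rate:.0f} moves/CPUsec/proc")
print(f"cells = {cells}, particles = {particles}")
```

The per-processor rate is higher in the benchmark run, which is consistent with the smaller timestep: each particle touches fewer cells per step (about 1.4 versus about 5.0) and far fewer particles are communicated.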

