This is a shadow page. Here is the original.

N-Body / Particle Simulation Methods

Amara's Recap of Particle Simulation Methods ("ARPSM" !!)

My own experience with N-body simulation methods are with schemes using direct integration of forces ("Particle Particle" methods), and also with the symplectic methods. When a discussion in August 1995 on the sci.math.num-analysis newsgroup about N-body/Particle simulation methods caught my attention, I wrote this article to keep track of the different schemes (too many acronyms!), and to learn what researchers in the field are using today.

Amara Graps

Last Revision: 10 March 1996.
Please send comments, additions, and/or corrections to:
ASCII version available from:
WWW version: here.


Particle-Particle (PP)

The Particle-Particle method is the simplest. The method is to:

For example, in a gravitational N-body simulation, a particle of mass M attracts another particle of mass m with a force: -(GMm/r^3)*r. You have N particles, computing the force (N-1) times. Separate the equation into two first-order differential equations involving acceleration and involving velocity. Then, use an integration scheme like Euler or Runge-Kutta to get the positions and velocities. (Simple, eh?)

As particles approach each other, the forces get much bigger, and so do the velocities. If a constant time-step is used, the particles will travel too far and the errors can get unacceptably large when two particles come close together. To avoid this, you will want to consider a numerical integration scheme that uses variable time-steps. These schemes automatically cut the time step down when the particles are near each other, and increase the time step when the particles are far away.

The Particle-Particle direct integration approach is flexible but has a high computational cost: O(N^2) operations are required to evaluate the forces on all N particles.

If you have less than about N=1000 particles, and are interested in the close-range dynamics of the particles, then this method is the most straight-forward.

Some WWW N-Body Particle-Particle Simulation Examples


Any numerical analysis text worth anything will have chapters on integration methods. My two favorite references that highlight physics applications are listed below.

Particle-Mesh (PM)

The Particle-Mesh method treats the force as a field quantity by approximating it on a mesh. Differential operators, such as the laplacian, are replaced by finite difference approximations. Potentials and forces at particle positions are obtained by interpolating on the array of mesh-defined values. Mesh-defined densities are calculated by assigning particle attributes (e.g. "charge") to nearby mesh points in order to create the mesh-defined values (e.g. "charge density").

So the principle steps of the particle mesh calculation are:

  1. Assign "charge" to the mesh ("particle mass" becomes "grid density"),
  2. Solve the field potiential equation (e.g. Poisson's) on the mesh,
  3. Calculate the force field from the mesh-defined potential,
  4. Interpolate the force on the grid to find forces on the particles.
  5. Now like the PP: integrate the forces to get particle positions and velocities.
  6. Update the time counter.

Some details.

For Step 1), many schemes exist to assign charges to the mesh. The best ones are those that reduce the fluctuations of the force when the particles are near each other. This reduction in force fluctuation can be quantified with a constraint called: "smoothness." Smoothness is the continuity of the derivatives of the approximated function used in assigning charges to the mesh.

For Step 2), the PM method solves the Poisson equation:

del^2(phi) = 4 * pi * G * rho

for the gravitational potential phi, (usually) using an FFT technique. The transform is is computed by multiplication with a suitable Green's function for the Poisson equation (which contains a spatially-varying part of the mass density field). Periodic boundary conditions are usually used to simulate an "infinite universe."

The simplest type of PM method is the "Nearest-Grid-Point" (NGP) particle-mesh method where densities at a mesh point are assigned by the total amount of "charge" in the cell surrounding the mesh point, divided by the cell volume. However, this method is rarely used. One of its drawbacks are that it gives forces that are discontinuous in value.

The "Cloud-in-a-Cell" (CIC) PM scheme is a better approximation to the force by replacing the nearest grid point assignment by a scheme involving 2^N nearest neighbors (where N is dimensionality of the problem). The CIC method gives forces continuous in value, but discontinous in their first derivative.

A more accurate PM scheme is the "Triangular-Shaped-Cloud" (TSC) method. This scheme has an assignment interpolation function that is piecewise quadratic. In one dimension it employs three mesh points. (see Hockney and Eastwood, pg. 136)

The main advantage of the PM methods is speed. For N particles and Ng grid points, the slowest steps are the FFTs, requiring O(Ng log Ng) operations. But because only a limited number of basis functions can be used, the local spatial resolution is usually restricted. However, "staggered mesh schemes" have been found to significantly improve the resolution. (See, Melott, 1986; also in your literature searches on staggered mesh schemes, look for references by Park.)

A big reason for using the PM method is for situations when you want to suppress two-body collisionality. Some tests documenting this are in Melott et al, 1989.

The PM method is basically unacceptable for studying close encounters between particles ("correlated systems") because the algorithm, in effect, treats particles as being fuzzy, nearly round balls (varies depending on which assignment scheme is used). This method is good for simulations where you want a "softening" of the inverse square law force. In general, the mesh spacing should be smaller than the wavelengths of importance in the physical system.


Some Research That Uses This Technique

Particle-Particle/Particle-Mesh (P3M)

The P3M method solves the major shortcoming of the PM method- low resolution forces computed for particles near each other. This method supplements the interparticle forces with a direct sum over pairs separated by less than about 3*del_x, where del_x is the grid spacing. The interparticle forces are split into a rapidly-varying short-range part and a slowly-varying long-range part. The PP method is used to find the total short-range contribution to the force on each particle and the PM method is used to find the total slowly-varying force contributions.

Two meshes are used in these algorithms: a "charge-potential" mesh, and a "chaining" (coarser) mesh. The charge-potential mesh is used at different stages of the PM calculation to store charge density values, charge harmonics, potential harmonics, and potential values. The chaining mesh is an array of cells to locate pairs of neighboring particles in the short-range calculation.

The scheme looks like:

  1. Start with the position and momenta of the particles,
  2. Update time step,
  3. Calculate PM forces (assign charge to mesh, solve for potentials, interpolate forces, increment momenta),
  4. Calculate PP forces (fill chaining mesh and update momenta),
  5. Create equations of motion (update positions, apply particle boundary conditions, update energy accumulators),
  6. Integrate with leapfrog scheme to get position and momenta.

The number of operations is of order (N+Ng) (N = #particles, Ng=#grids). It has been widely used in cosmological simulations and is easiest to use when the forces can be readily split between short-range and long-range.

Some work in applying wavelet bases and using Fast Wavelet Transforms has extended the resolution and cut down the order of computation to be closer to O(Ng).

The disadvantage of the P3M algorithm is that it too easily becomes dominated by the direct summation part (the inter-particle force "supplement").

A solution to this problem are various adaption methods explored by many researchers in the mid 1980s through the mid 90s. For example: Pen's (1993) "Linear Moving Adaptive PM," a nested-grid adapted PM code by Chan (1986), and Anninos (1993), and Villumsen's (1987) Hierarchical Particle Mesh (HPM) code. See Splinter's 1996 paper for a description of these. In the later 1980s, Hugh Couchman constructed his algorithm, called the "mesh-refined" P3M algorithm, that is used by many people working with PM-type methods now. The algorithm uses subgrids to speed up the direct summation of nearby neighbors by shiftng some of the burden to high-resolution FFTs. This solution may well be faster than some of the hierarchical tree codes, described next.


Nested Grid Particle-Mesh (NGPM)

The Nested Grid Particle-Mesh is particle-mesh scheme by several researchers: Anninos, et al (1994), Villumsen, (1988), and others; and most recently by Randall Splinter (University of Kentucky) which uses a series of nested lattices in which the force and mass resolution increases for the more finely spaced sub-lattices. This latter point is an important distinction between this and the Tree / P3M methods. Those other methods increase the force resolution without increasing the mass resolution. This NGPM method does both, thereby avoiding the phase errors that may arise due to the growth of spurious perturbations in the simulation.

  1. Assume a parent grid. Assign "charge" to the mesh ("particle mass" becomes "grid density").

  2. Define a sub-grid anywhere within the parent grid consisting of n^3 sub-grid particles. The sub-grid particle mass becomes m_p/n_s^3 where p = the parent grid and s = the sub-grid.

  3. One can use a arbitrary number of parent grid cells and sub-grid cells, with a uniform spacing on both grids. Also, the sub-grid is fixed and doesn't move during the simulation. A buffer zone (which includes both an inner and outer zone) is included around the sub-grid to help smooth any force or density transients.

  4. Calculate the forces on the parents from the parent-grid mesh-defined potential.

  5. Calculate the forces on the sub-grid particles:

  6. Integrate the forces to get particle positions and velocities. Use a leap-frog technique on both the subgrid and the parent grid. Asynchronous time steps are used.

The asynchronous time-steps are an unusual feature and an improvement to other nested grid methods. Splinter gives reasons in his paper for being able to evolve the sub-grid distribution separately from the parent grid safely.

The reference paper doesn't list the order of computation, but it is probably somewhere between O(log Ng) and O(Ng log Ng).


Tree Code (TC)- Top Down

The tree codes approach the N-body simulation with the same force superposition concept as the P3M algorithm:

Force = external_force + nearest_neighbor_force + far_field_force

The external forces can be computed for each particle independently. The nearest neighbor forces only require interactions with nearest neighbors. The far-field forces, like gravity, are more expensive to compute because the force on each particle depends on all the other particles. The simplest expression for a far field force F(i) on particle i is

     for i = 1 to n
         F(i) = sum_{j=1, j != i}^n F(i,j)
     end for

where F(i,j) is the force on particle i due to particle j. Thus, the cost seems to rise as O(n^2), whereas the external and nearest neighbor forces appear to grow like O(n). Even with parallelism, the far-field forces will be much more expensive.

The Barnes-Hut Tree Code (top down) algorithm was first presented in 1986, and is based on an earlier algorithm of A. Appel in 1985. It is widely used in astrophysics, and has been thoroughly parallelized. It addresses the expensive far-field force calculations in the following clever "divide-and-conquer" way.

  1. Build a "quadtree,"
  2. For each subsquare in the quadtree, compute the center of mass and total mass for all the particles it contains,
  3. For each particle, traverse the tree to compute the force on it.

Some Details.

For Step 1), building a quadtree means to build a hierarchy of boxes which refine the computational domain into smaller and smaller regions. At the coarsest level, we have the entire domain. Refinement level {l+1} is obtained recursively from level l by subdivision of each box into two equal parts in each direction (eight total, for 3D). And so on.

For Steps 2) and 3), the tree structures partition the mass distribution into a hierarchy of localized regions, so that when calculating the force on a given particle, the "tree" region near the particle in question is explored in detail, and the more distant regions are explored more coarsely by treating distant clumps of particles as single massive pseudo-particles.

Tree Codes are gridless, have no preferred geometry and can incorporate either vacuum or periodic boundary conditions. In addition, they waste no time simulating regions devoid of matter. Hence, Tree Codes are particularly effective for modeling collisions between galaxies.

Forces on all particles are obtained with O(N log N) operations. The down side is that tree codes require a large amount of auxillary storage.

Note: The standard Barnes Tree Code is not as accurate, however, as the Fast Multipole Method (FMM), to be discussed later. The Barnes-Hut type algorithms are essentially dumb FMM algorithms with order-0 multipole expansions. FMM would only be slower than the Barnes Hut if you want more accuracy (but for a given accuracy, FMM is likely faster).

McMillan and Aarseth (1993) remedy the limited accuracy shortcoming by using a higher order integration scheme, along with block time steps and multipole expansions. The tree is not reconstructed at every step, but is instead predicted along with the particles with indifidual nodes rebuilt as needed.


Tree Code (TC)- Bottom Up

In this scheme, sometimes referred to as the "Press" Tree, the tree is built from the bottom up, rather than the top down, by taking the two particles nearest each other, and calling calling them leaf nodes. Take the next two particles that are nearest each other and tie them together, too. Keep tying particles together in pairs like this until there are less than two particles left. Then do the same thing, except do it with the centers of mass (CM) of each of these particle pairs. Then do the same thing with the CM of the particle quadruplets. Keep going until finally you connect the two CM's of the two halfs of the system.

One very nice feature of this algorithm is that you can give individual time steps to each of the particles and each of the CM's all the way up the tree such that every timestep is some power-of-two multiple of the smallest time step. Giving individual time-steps allows you to easily synchronize the timesteps of all the particles, while simultaneously allowing you to use small timesteps in subsections of the system where interesting close encounters are occuring.

This algorithm is designed to handle close approaches correctly, and can even handle tight binaries at the same time it handles long-range interactions. It can also use perturbation techniques to even more quickly and accurately handle close binaries. The drawback of this code is that it's probably not as simple as the Barnes-Hut algorithm to understand and code.

The computational order is O(n log n).


Fast Multipole Method (FMM)

The Fast Multipole Method (FMM) is a tree code that uses two representations of the potential field. The two representations are: far field (multipole) and local expansions. The two representations are referred to as the "duality principle."

This method uses a very fast calculation of the potential field. Why would one be interested in that? It's computationally easier to deal with the potential than that of the force. The force is a vector. The potential phi(x,y,z) is a scalar. (Remember from basic physics that the force is just the negative of the gradient of the potential.)

The strategy of the FMM is to compute a compact expression for the potential phi(x,y,z), which can be easily evaluated along with its derivative, at any point. It achieves this by evaluating the potential as a "multipole expansion," a kind of Taylor expansion, which is accurate when x^2+y^2+z^2 is large.

The FMM method shares the quadtree and divide-and-conquer paradigm of Barnes-Hut. It differs in the following ways:

McMillan and S. J. Aarseth (1993) briefly mention this method saying that this algorithm does not appear to be suitable for studies of collisional systems because 1) it is slow compared with competing methods, and 2) because it derives its O(1) CPU time per force evaluation by determining the forces on an arbitrary number of particles in essentially constant time. The FMM is thus well suited to applications where all particles have the same or similar time steps.

FMM is capable of achieving full machine precision (or less, depending on an accuracy tuning parameter). The FMM is an O(N) algorithm.


Tree-Code-Particle-Mesh (TPM)

The Tree-Code-Particle-Mesh is a hybrid scheme by Guohong Xu of Princeton University. This method uses a multiple tree-code (TC) approach in regions where the mass-density is high (above some threshold), and a particle-mesh (PM) algorithm in all other areas. PM codes have much higher speed than the Tree Codes, but with limited spatial resolution. The Tree codes have much higher spatial resolution, but the mass resolution is often poor with todays computers.

So, looking at the separation of forces:

F = F(short range) + F(long range),

we can separate the techniques used in each.

F(short range). Particles in overdense regions, use *primarily* the Tree Code. The forces on Tree particles are the sum of an external force, which is due to the particles outside the Tree, and is calculated by PM method, and an internal force, which is due to the particles in the Tree containing the particle, and is calculated by the Tree Code method.

F(long range). Particles in low density regions, use the PM method.

The TPM method solves the Poisson equation, using the same techniques as the PM method (usually FFT). The equations of motion are integrated from the potential and the tree-forces:

F = Sum(w_i * w_j * w_k * del(phi)) + F(tree)

using a leap-frog scheme, where the w are "Cloud-In-Cell" weights for the PM.

Multiple time scale integrations are implemented in this scheme because the particles in high density regions have different dynamic time scales from the particles in low density regions. In the simulation "box", the Tree particles have much shorter time steps than PM particles with each time step, optimized for each Tree.

The reference paper doesn't list the order of computation, but it is probably between O(log N) and O(N log N). This method has more free variables than the other N-body methods, which could be one big drawback.


Self-Consistent Field (SCF)

The Self-Consistent Field (SCF) method is an algorithm for evolving collisionless stellar systems. The particles in the SCF technique do not directly interact with one another, but rather, through their contribution to the combined gravitational field of the system. In other words, the potential in the SCF method does not depend on the relative coordinates of particles.

The justification for this approach is that the fundamental components of a collisionless systems, such as stars in galaxies, or electrons and ions in dense plasmas, move along orbits which are not perturbed by individual two-body interations. The distinguishing physics characteristic is that the gravitational potential is mostly determined by its continuum mass distribution.

In this scheme, the orbits in a given (time-dependent) potential are computed to find the evolving density distribution. With an improved density, as determined from these orbits, a new potential can be found, and the method continues to iterate to a "collisonless limit." After the potential is determined, one integrates N one body orbits in a given gravitational field. One advantage is that, since all orbits are computed in given potentials, one solves, iteratively, N one-body problems, rather than one N-body problem, which can be easily optimized on parallel hardware.

Hernquist & Ostriker (1992) improved on the original mean-field algorithm by expanding the density and potential of Poisson's equation with a set of basis functions that fully expands the angular and radial dependence. Other N-body Mean Field schemes that relied on expansions typically used only the angular part. Hernquist & Ostriker's set of basis functions, called "ultraspherical polynomials," resembled the system being studied- for example, spherical galaxies.

The SCF method is similar to Tree Codes in time-integration and round-off errors. It is an O(N) method. It retains tightly-bound particles, and may be capable of resolving steeper density gradients than direct N-body techniques.

It is inflexible in that it relies on expansions that are highly tuned to the problem. In Hernquist & Ostriker's case, it is to galaxies obeying the physical structure of galaxies exhibiting the de Vaucouleurs R^{1/4} law in projection.

This method might be considered as a kind of link between the fluid hydrodynamics codes (smoothed particle hydrodynamics) and particle methods.


Symplectic Method

Another N-body method that is actively being used in solar sytem dynamical integrations today, is very different from the direct integration methods, the mesh methods, and the tree code methods. It is based on Hamiltonian mechanics and goes back to Poincare's work in the late 1800's. It is called the Symplectic Method.

The following description came from Ben Leimkuhler's Web site:

Hamiltonian systems of differential equations model conservative
natural phenomena such as motion in molecular systems, in multibody
mechanical systems, and of the heavenly bodies. Hamiltonian systems
preserve more than just the energy: their flows posess a family of
symplectic invariants that characterize the long term dynamics of a
patch of nearby points in phase (position,momentum) space.

Since the flow (i.e. the mapping that takes a point in phase space to
it's evolution through t units of time) of a Hamiltonian system is
symplectic, it seems not unreasonable to ask the same of a discrete
(numerical) approximation. Thus we are led to the concept of
symplectic discretizations.

What's all the buzz about symplectic discretizations? A symplectic
method has been used to answer the question "is the solar system
chaotic?'' Symplectic methods are standard tools for designing
particle accelerators. Symplecticness explains the unreasonable
effectiveness of the leapfrog=Verlet=St"ormer scheme compared to
other second order integrators used in applications such as molecular
dynamics. In short, symplecticness is one of the few principles we
have for evaluating the behavior of schemes for conservative
nonlinear evolutionary phenomena on physically interesting time

The people working in this field feel that standard numerical integration schemes do not respect the global geometry present from the Hamiltonian nature of the dynamics. Some global structures are missed. [Franklin, Lecar and Quinn (1990) found that, in the case of circular planetary orbits, the direct integrations showed no evidence of instability for a few orbits that the symplectic mapping claimed were unstable.] Therefore, even if energy conservation is built into the integrations, numerical integration techniques for solar system dynamics is limited to the investigation of short-time quantitative phenomena only, and not to long-time qualitative phenomena investigations.

Numerical methods have been devised to approximate the time-delta_time map of the exact dynamics to any desired order in the timestep, and are symplectic. Algorithms used so far have included Runge-Kutta techniques, Adams techniques, and Pade approximants. The relative efficiency and/or simplicity of the methods vary from one problem to another and there appears to be no "best" approach, in general. These algorithms give the exact evolution of the Hamiltonian system and preserve the Poincare invariants, namely the integrals:

int int sum{over_i} dq_i dp_i


int int int int sim{i ne k} dq_i dp_i dp_k dp_k

(Look at the Channell and Scovel 1990 reference, the Sussman and Wisdom 1992 reference, and the Wisdom and Holman 1991 reference for specific details on how the symplectic method works.)

Two leaders in this field are Jacques Laksar at Bureau des Longitudes in Paris, and Jack Wisdom at MIT.

Laskar has performed integrations simulating many billions of years (longer than the age of the Solar System.. He said that it was to explore the limits of the chaotic zones). He said that the integration error was measured by integrating the equations back and forth over 10 Myr, and amount to ~10^{-13} after 10^7 years and behaves like order time^{1.4}. His integrations were performed on an IBM RS6000 and took 1 day of CPU time per Gyr.

Wisdom has performed the integrations with a special computer called the "Supercomputer Toolkit", which is a small multiprocessor computer built and optimized for the numerical solutions of systems of ordinary differential equations. His integrations have spanned close to a billion years. Each 100 million years takes about 1000 hours of Toolkit CPU time.

The results of both researchers is that they have found that the whole Solar Sytem is chaotic. Hyperion and other irregularly shaped satellites are "tumbling." Pluto's orbit is chaotic. The motion of the Jovian planet subsystem is chaotic. Miranda's high inclination and chaotic motions account for its tidal heating. Planet-crossing asteroid orbits were found to be unstable, the motion of comet Halley is chaotic. The chaotic diffusion (and the eccentricity) of Mercury is so large that it will be ejected from the Solar System upon a close encounter with Venus in less than 3.5 billion year (!).

Some other results: Duncan and Quinn (1993), and Mikkola and Innanen, (1995) describe and review computer simulations of large numbers of test particles placed between the giant planets and near/within the terrestrial zone. Duncan and Quinn show that test particles that encounter one or more of the giant planets are ejected to the Kuiper belt, the Oort cloud, or beyond to interstellar space. Mikkola and Innanen show that the asteroidal zone trajectories are chaotic, except within the region of the terrestrial planets.

D. Syer and S. Tremaine (1995) describe a method based on symplectic equations of motion that solve the combined collisionless Boltzmann and Poisson equations in a discreted, or lattice, phase space. The time and the positions and velocities of "particles" take on integer values, and the forces are rounded to the nearest integer. The lattice equations become the usual integro-differential equations of stellar dynamics. Equilibria are found n a variety of shapes and sizes that do not evolve with time, even slowly, unlike existing N-body approximations to stellar systems.


Other N-Body WWW Sources


I received valuable information on these N-Body/Particle Simulation concepts and/or the article from the following people: Sky Coyote, Andrew Bennett, Philippe Brieu, Mike Gross, Wayne Hayes, Joel Phillips, Jun Makino, Gordon Sande, Peter Teuben, Ton Dammers, Randall Splinter, and Adrian Melott.

Go To:

Last Modified by Amara Graps on 10 March 1996.
Mail to:

Current page access count = 2455

© Copyright Amara Graps, 1995, 1996.