The output spacing for group catalogues and snaphots has been selected as follows:
- constant in dlog(a)= 0.0081 below z=3 (171 dumps)
- constant in dlog(a)= 0.0162 between z=3 and z=10 (62 dumps)
- constant in dlog(a)= 0.0325 between z=10 and z=30 (32 dumps)
All together 265 dump times. At these dump times, the FOF and SUBFIND-HBT group finders have always been run, and the merger tree builder routine establishes on-the-fly links to previous output times. Also, matter power spectra are computed and stored. (For some runs, a few extra dumps at intermediate times were actually produced, due to a hick-up in the routine that selects the output times after a code restart - this means that for some simulations we have actually 266 or even 269 output times. This unfortunately slightly complicates the matching of equal-time output numbers between runs, but the intended 265 times are always available for all runs.)
Classical snapshot dumps with full phase-space information for all particles are however not stored for the majority of these output times, but have only been created for a reduced subset of output times. The selected redshifts for these dumps were: z = 0, 0.25, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0, and 7.0 (10 full dumps), with corresponding dump numbers 264, 237, 214, 179, 151, 129, 094, 080, 069, and 051.
Full snapshot data is stored in files with names
XXX is the snapshot number, and
Y enumerates the different files making up the full dump in GADGET/AREPO multi-file output format.
For all output times, the code additionally creates snapshot files named
which contain only those particles that used to be a most-bound particle of a subhalo sometime in the past. These particles can be used to approximately track in semi-analytic models so-called orphaned galaxies whose dark matter substructure has been disrupted. After the merger trees have been created, the code converts the files into
which represent a reshuffling of the particle data in the order of the
trees to which each particle belongs.
'Belonging' here means that the subhalo(s) in
which a given particle was most bound are associated with
the particle, and in turn the particle is associated with the tree
these subhalos are in. (Note that the tree construction makes sure that
all subhalos a give particle belongs to are always in the same tree,
hence assignment of a particle ID to a tree ID is unique.) When the
L-Galaxies semi-analytic model processes a tree, it can retrieve
the particle data of just its tree (any nothing more) from the
snapshot-prevmostboundonly-treeorder files in order to update
the phase-space coordinates of orphaned galaxies.
Group catalogues are saved in the majority of cases without explicit particle data, i.e. all available group and subhalo information must be computed on the fly.
Group data is stored in
files, which hold both the FOF-group and subhalo information. In case a corresponding full dump is stored, the particle data is stored in a nested fashion, biggest FOF halo first (and then in descending order), with the particles making up each subhalo within a FOF group being stored in descending order as well. Within each subhalo, the particles are ordered by increasing binding energy. It is thus possible to selectively load the particles making up each group or subhalo.
The following properties are currently stored in the group/subhalo catalogues.
|SubhaloPos||position (minimum of potential)|
|SubhaloCM||center of mass position|
|SubhaloVmaxRad||radius where this is attained|
|SubhaloHalfmassRad||radius containing half the mass|
|SubhaloIDMostbound||ID of most bound particle|
|SubhaloLen||number of particles|
|SubhaloLenType||type-specific length (for hydro/neutrino runs)|
|SubhaloMassType||type-specific mass (for hydro/neutrino runs)|
|SubhaloHalfmassRadType||type specific half-mass radius|
|SubhaloGroupNr||FOF group number of this subhalo|
|SubhaloRankInGr||rank of subhalo in this group|
|SubhaloParentRank||for nested subhalos, gives parent subhalo in group|
|SubhaloLenPrevMostBnd||needed to retrieve previously most bound particles in this subhalo|
|SubhaloOffsetType||first particle in output snapshot in this subhalo|
For FOF groups:
|GroupPos||position (minimum potential)|
|GroupLenType||type specific length|
|GroupMassType||type specific mass|
|Group_M_Crit200||spherical overdensity mass/radius|
|Group_M_Crit500||(accounting for all mass around GroupPos)|
|GroupFirstSub||index of first subhalo in FOF group|
|GroupNsubs||number of bound subhalos in group|
|GroupOffsetType||gives index of first particle in output belong to this group|
|GroupLenPrevMostBnd||needed to retrieve previously most bound particles in this group|
Descendant and progenitor information
While the simulation code is running, two subsequent group/subhalo catalogues are linked together
by tracking the central most bound particles of each subhalo. For every group catalogue
XXX, there are
giving the progenitor subhalo(s) in the previous output
XXX-1. Likewise, for every group catalogue, there are
files that give access to the descendant(s) of a subhalo in the next output
XXX+1. The meaning of these pointers is illustrated in the following diagram:
At the end of a simulation run, the code is restarted in a special mode to combine all this linking information into more easily accessible merger trees, but in principle these files can also be accessed, if desired, to hop from one instant forward or backwards in time.
The merger tree data combines all the group/subhalo catalogue information into a set of merger trees that can be read individually. Conceptually, the tree is organized as sketched below (described in detail in the GADGET-4 code paper):
The trees are stored in files with name
Y enumerates the files making up the full catalogue when split over several files in the multi-file format.
When the trees are made, the code also creates further auxiliary files named
These tell for each subhalo in which tree and at which place within the tree the corresponding subhalo can be found. One can then selectively load this tree if desired and examine the history and fate of the subhalo.
A set of different lightcones is produced by each run (periodic box replication is automatically applied to the extent necessary to fill the geometry of the specified cone):
|Cone Number||Lightcone Type||Redshift Range||Outer Comoving Distance|
|Cone 0||full sky particle lightcone||z = 0 - 0.4||~1090 Mpc/h|
|Cone 1||one octant particle lightcone||z = 0 - 1.5||~3050 Mpc/h|
|Cone 2||pencil beam particle lightcone, square footprint (10 degree)^2||z = 0 - 5.0||~5390 Mpc/h|
|Cone 3||disk-like particle lightcone, 15.0 Mpc/h thickness||z = 0 - 2.0||~3600 Mpc/h|
|Cone 4||full sky, only most-bound particles (for SAMs)||z = 0 - 5.0||~5390 Mpc/h|
Each lightcone is stored in its separate directory, and the individual lightcones are broken up into onion-like shells with a numbering increasing from high redshift to low redshift. This is because the code buffers lightcone data internally, and outputs a new shell whenever the buffer is full. The redshift boundaries of these onion shells have no particular meaning, and vary for different simulations. The lightcone data is stored in files
NN denotes the cone number,
XXXX is the shell number, and
Y the filenumber within the mult-file output for the current
shell. The actual particle data of the lightcone output are stored in
a format closely analogous to snapshot files, but within each onion-like
shell, the particles are stored in healpix order (in principle, the code could
also find groups and subhalos directly among the lightcone particles,
but we have not enabled this feature because it is not necessary for the semi-analytics). Combined with an offset table to the first particle in each healpix pixel, this allows
selective reads of certain regions of the sky, if desired. Besides
positions and velocities, the total peculiar gravitational accleration
is also stored for particles crossing the lightcone.
For cone #4, the lightcone data is available in a secondary reshuffled order, with filenames of the form
Here, the particle data within each shell (which consists only of
previously most bound particles in subhalos) is reshuffled such that
they are stored in sequence of the trees to which the particles belong
(see also the discussion for the
snapshot-prevmostboundonly-treeorder files). Similar to
snapshot-prevmostboundonly-treeorder, this serves to allow the
semi-analytic code update in an exact fashion the phase-space coordinates of
subhalos and orphan galaxies when they cross the lightcone.
For weak-lensing applications, we create onion-like shells of the full particle light cone onto a healpix tesselation of the sky. The comoving depth of these shells is 25 Mpc/h, and they go out to redshift z=5, giving 216 such maps in total.
The number of pixels in each maps is
Npix = 12 * Nside^2
and in the highest resolution maps we go up to
Nside = 12288, yielding 1.8 billion pixels and
a 0.28 arcmin angular resolution of the mass maps. The filenames of
these mass projections are
XXX is the shell number, and
Y is the number of the subfile
in multi-file decomposition of the full healpix map. angular resolution
As mentioned above, matter power spectra are measured for all defined
dump times (both for the total matter, and where available separately
for baryons and dark matter). Three power spectra measurements are
done each time, applying the folding technique twice with folding factors of
16^2. Combination of the measurements allows coverage of
the full spatial dynamic range of the simulations, for k-ranges that reach well below the softening
The power spectra are stored in the files named
and for the hydrodynamical runs in addition in
powerspecs/powerspec_type0_XXX.txt powerspecs/powerspec_type1_XXX.txt powerspecs/powerspec_type4_XXX.txt powerspecs/powerspec_type5_XXX.txt
for gas, dark matter, stars and black holes separately.
While the merger trees in the HDF5 files
the most-bound particle data in
the semi-analytic code L-Galaxies to access all input data for
processing of a tree in a convenient way (and in particular without
the need to read any extra data from other trees), the I/O is nevertheless slowed down
tremendously by the overhead of having to open a large number of files to collect the data
(due to the distribution of the data across a larger number of output
times, and across multiple files each time), and then fast forwarding in the
files to a number of different positions. The HDF5 library, in particular, adds
significant additional overhead to this, which is further amplified
by our decision to store some of the fields with lossless
on-the-fly compression (transparently applied by HDF5) to save disk space. This overhead is so large that running the semi-analytic code is completely dominated by it. We are talking of a slowdown by more than a factor of 100 here.
In order to circumvent this issue, the HDF5 input files for L-Galaxies have additionally been recast into a binary flat file structure, where all the data needed for one tree is stored in a contiguous block and can thus be read with a single read-statement. This eliminates the HDF5 overhead and only one file opening operation is needed at start-up of the code. These flat files are called
PartData PartCount PartOffset TreeData TreeCount TreeTimes
and can be viewed as (pretty large) scratch files that do not need to be archived long-term, but can always be recomputed as needed to make L-Galaxies run faster.