Using fermipy.jobs to analyze multiple ROIS

The fermipy.jobs sub-package includes a few scripts and tools that can be used to parallelize standard analysis of region of interest as well as both positive and negative control tests.

Overview

This package implements an analysis pipeline to perform as standard analysis on multiple ROIs. This involves a lot of bookkeeping and loops over various things. It is probably easiest to first describe this with a bit of pseudo-code that represents the various analysis steps.

The various loop variable are:

rosters

The list of all lists of targets to analyze. This might seem a bit like overkill, but it is a way to allow you to define multiple versions of the same target, e.g., to test different models.
targets

The list of all the analysis targets. This is generated by merging all the targets from the input rosters.
target.profiles

The list of all the spatial profiles to analyze for a particular targeet. This is generated by finding all the versions of a target from all the input rosters.
sims

This is of all the simulation scenarios to analyze. This is provided by the user.
first, last

The first and last seeds to use the random number genreator (for simulations), or the first and last random directions to use, (for random direction control studies).

# Initialization, prepare the analysis directories and build the list of targets
PrepareTargets(rosters)

# Data analysis

# Loop over targets
for target in targets:
    AnalyzeROI(target)

    for profile in target.profiles:
        AnalyzeSED(target, profile)
        PlotCastro(target, profile)


# Simulation analysis

# Loop over simulation scenarios
for sim in sims:

    # Loop over targets
    for target in targets:
        CopyBaseROI(sim, target)

        for profile in target.profiles:
            SimulateROI(sim, target, profile) # This loops over simulation seeds and produces SEDS
            CollectSED(sim, target, profile)


# Random direction control analysis

# Loop over targets
for target in targets:
    CopyBaseROI(target)
    RandomDirGen(target)

    for profile in target.profiles:
        for seed in range(first, last)
            AnalyzeSED(target, profile, seed)

        CollectSED('random', target, profile)

Master Configuration File

fermipy.jobs uses YAML files to read and write its configuration in a persistent format. The configuration file has a hierarchical structure that groups parameters into dictionaries that are keyed to a section name.

:caption: Sample Configuration

ttype: dSphs
rosters : ['test']
spatial_models: ['point']

sim_defaults:
    seed : 0
    nsims : 20
    profile : ['point']

sims:
    'null' : {}
    'pl2_1em9' : {}

random: {}

data_plotting:
    plot-castro : {}

Options at the top level apply to all parts of the analysis pipeline

Sample top level Configuration

  # Top level
  ttype : 'dSphs'
  rosters : ['test']
  spatial_models: ['point']

ttype: str Target tpye. This is used for bookkeeping mainly, to give the name of the top-level directory, and to call out specfic configuration files.
rosters: list List of rosters of targets to analyze. Each roster represents a self-consistent set of targets. Different versions of the same target can be on several different rosters. But no target should appear on a single roster more than once.
spatial_models: : list List of types of spatial model to use when fitting the DM. Options are * point : A point source

Note

If multiple rosters include the same target and profile, that target will only be analyzed once, and those results will be re-used when combining each roster.

Simulation configuration

The sim_defaults, sims and random sections can be used to define analysis configurations for control studies with simulations and random sky directions.

Sample simulation Configuration

sim_defaults:
    seed : 0
    nsims : 20
    profile : point

sims:
    'null' : {}
    'pl2_1em9' : {}

random: {}

sim_defaults : dict This is a dictionary of the parameters to use for simulations. This can be overridden for specific type of simulation.
- seedint
  Random number seed to use for the first simulation
- nsimsint
  Number of simulations
- profilestr
  Name of the spatial profile to use for simulations. This must match a profile defined in the roster for each target. The ‘alias_dict’ file can be used to remap longer profile names, or to define a common name for all the profiles in a roster.
sims : dict This is a dictionary of the simulation scenarious to consider, and of any option overrides for some of those scenarios.

Each defined simulation needs a ‘config/sim_{sim_name}.yaml’ to define the injected source to use for that simulation.
random: dict This is a dictionary of the options to use for random sky direction control studies.

Plotting configuration

Sample plotting Configuration

data_plotting:
    plot-castro : {}

data_plotting : dict

Dictionaries of which types of plots to make for data, simulations and random direction controls. These dictionaries can be used to override the default set of channels for any particular set of plots.

The various plot types are:
- plot-castro : SED plots of a particular target, assuming a particular spatial profile.

Additional Configuration files

In addition to the master configuration file, the pipeline needs a few additional files.

Fermipy Analysis Configuration Yaml

This is simply a template of the fermipy configuration file to be used for the baseline analysis and SED fitting in each ROI. Details of the syntax and options are here _ The actual direction and name of the target source in this file will be over written for each target.

Profile Alias Configuration Yaml

This is an optional small file that remaps the target profile names to shorter names (without underscores in them). Removing the underscores helps keep the file name fields more logical, and fermipy.jobs generally uses underscores as a field seperator. This also keeps file names shorter, and allows us to use roster with a mixed set of profile versions to do simulations. Here is an example:

ackermann2016_photoj_0.6_nfw : ack2016
geringer-sameth2015_nfw : gs2015

Simulation Scenario Configuration Yaml

This file specifies the signal to inject in the analysis (if any). Here is a example, note that everything inside the ‘injected_source’ tag is in the format that fermipy expects to see source defintions.

# For positive control tests we with injected source.
# In this case it is a powerlaw specturm
injected_source:
  name : testpl
  source_model :
    SpatialModel : PointSource
    SpectrumType : Powerlaw
  Prefactor :
    value : 1e-9
  index :
    value: 2.
  scale :
    value : 1000.

For null simulations, you should include the ‘injected_source’ tag, but leave it blank

# For positive control tests we with injected source.
# In this case it is a DM annihilation spectrum.
injected_source:

Random Direction Control Sample Configuration Yaml

The file define how we select random directions for the random direction control studies. Here is an example:

# These are the parameters for the random direction selection
# The algorithm picks points on a grid

# File key for the first direction
seed : 0
# Number of directions to select
nsims : 20

# Step size between grid points (in deg)
step_x : 1.0
step_y : 1.0
# Max distance from ROI center (in deg)
max_x : 3.0
max_y : 3.0

Preparing the analysis areas

The initial setup can be done either by using the PrepareTargets link directly from python, or by running the fermipy-prepare-targets executable. This will produce a number of analysis directories and populate them with the needed configuration files.

link = PrepareTargets()
link.update_args(dict(ttype=dSphs,
                      rosters=['dsph_roster.yaml'],
                      spatial_models=['point'],
                      sims=['null', 'random', 'pl2_1em9']))
link.run()

fermipy-prepare-targets --ttype dSphs --rosters dsph_roster.yaml --spatial_models point --sims random --sims pl2_1em9 --sims null

Additional Arguments
- alias_dict [None] Optional path to a file that remaps the target profile name to shorter names.

Baseline target analysis

The first step of the analysis chain is to perform a baseline re-opimization of each ROI. This is done by using AnalyzeROI_SG to run the AnalyzeROI link on each ROI in the target list generated by PrepareTargets. This can be done directly from python, or from the shell using the fermipy-analyze-roi-sg executable.

  link = AnalyzeROI_SG()
  link.update_args(dict(ttype=dSphs,
                        targetlist='dSphs/target_list.yaml'))
  link.run()

.. code-block:: shell

  fermipy-analyze-roi-sg --ttype dSphs --targetlist dSphs/target_list.yaml --config config.yaml

Additional Arguments
- config [‘config.yaml’] Name of the fermipy configuration file to use.
- roi_baseline [‘fit_baseline’] Prefix to use for the output files from the baseline fit to the ROI
- make_plots [False] Produce the standard plots for an ROI analysis.

Target SED analysis

The next step of the analysis chain is to perform extract the SED each spatial profile for each target. This is done by using AnalyzeSED_SG to run the AnalyzeSED link on each ROI in the target list. This uses the baseline fits as a starting point for the SED fits. This can be done directly from python, or from the shell using the fermipy-analyze-sed-sg executable.

  link = AnalyzeSED_SG()
  link.update_args(dict(ttype=dSphs,
                        targetlist='dSphs/target_list.yaml'))
  link.run()

.. code-block:: shell

  fermipy-analyze-sed-sg --ttype dSphs --targetlist dSphs/target_list.yaml --config config.yaml

Additional Arguments
- config [‘config.yaml’] Name of the fermipy configuration file to use.
- roi_baseline [‘fit_baseline’] Prefix to use for the output files from the baseline fit to the ROI
- make_plots [False] Produce the standard plots for a SED analysis.
- non_null_src [False] If set to True, the analysis will zero out the source before computing the SED. This is needed for positive control simulations.
- skydirs [None] Optional file with a set of directions to build SEDs for. This is used from random direction control samples.

Simulated realizations of ROI analysis

This module provides tools to perform simulated realizations of the ROIs. This is done by copying the baseline ROI, using it as a starting point, and simulating realizations of the analysis by throwing Poisson fluctuations on the expected models counts of the ROI and then fitting those simulated data. These simulations are done for each target and can optionally include injecting a signal source. This can be done directly from python, or from the shell using executables. Here is an example of how to generate negative control (“null”) simulations. This requires having ‘config/sim_null.yaml’ consist of just a single empty tag ‘injected_source’. To run positive control sample you would just change “null” to, for example “pl2_1em9”, where ‘config/sim_pl2_1em9.yaml’ is the yaml file with the spectral model described above.

# Copy the base line ROI
copy_link = CopyBaseROI_SG()
copy_link.update_args(dict(ttype=dSphs,
                           targetlist='dSphs_sim/sim_null/target_list.yaml',
                           sim='null'))
copy_link.run()

# Run simulations of the ROI
sim_link = SimulateROI_SG()
sim_link.update_args(dict(ttype=dSphs,
                          targetlist='dSphs_sim/sim_null/target_list.yaml',
                          sim='null'))
sim_link.run()

# Collect the results of the simulations
col_link = CollectSED_SG()
col_link.update_args(dict(ttype=dSphs,
                          targetlist='dSphs_sim/sim_null/target_list.yaml',
                          sim='null'))
col_link.run()

fermipy-copy-base-roi-sg --ttype dSphs --targetlist dSphs_sim/sim_null/target_list.yaml --sim null
fermipy-simulate-roi-sg --ttype dSphs --targetlist dSphs_sim/sim_null/target_list.yaml --sim null
fermipy-collect-sed-sg --ttype dSphs --targetlist dSphs_sim/sim_null/target_list.yaml --sim null

Additional Arguments
- extracopy [] Extra files to copy from basline fit directory.
- config [‘config.yaml’] Name of the fermipy configuration file to use.
- roi_baseline [‘fit_baseline’] Prefix to use for the output files from the baseline fit to the ROI
- non_null_src [False] If set to True, the analysis will zero out the source before computing the SED. This is needed for positive control simulations.
- do_find_src [False] Do an additional setup of source finding in the ROI.
- sim_profile [‘default’] Name of the profile to use to produce the simulations
- nsims [20] Number of simulations to run
- seed [0] Starting random number seed. Also used as in bookkeeping.
- nsims_job [0] Number of simulations per job. 0 means to run all the simulations in a single job.

Random Direction Control Studies

This module also provides tools to perform analyses of random directions in the ROI as a control sample. This done by copying the baseline ROI, using it as a starting point, and then picked directions away from the center of the ROI and treating them as the target. Here is an example of how to generate random direction control simulations. Defined by having ‘config/sim_null.yaml’ consist of just a single empty tag ‘injected_source’. To run positive control sample you would just change “null” to, for example “pl2_1em9”, where ‘config/sim_pl2_1em9.yaml’ is the yaml file with the spectral model described above.

# Copy the base line ROI
copy_link = CopyBaseROI_SG()
copy_link.update_args(dict(ttype=dSphs,
                           targetlist='dSphs_sim/sim_random/target_random.yaml',
                           sim='random'))
copy_link.run()

# Make a set of random directions
dir_link = RandomDirGen_SG()
dir_link.update_args(dict(ttype=dSphs,
                          targetlist='dSphs_sim/sim_random/target_list.yaml',
                          sim='random',
                          rand_config='config/random_dSphs.yaml'))
dir_link.run()

# Construct the SED for each random direction
sed_link = AnalyzeSED_SG()
sed_link.update_args(dict(ttype=dSphs,
                          targetlist='dSphs_sim/sim_random/target_list.yaml',
                          skydirs='skydirs.yaml'))
sed_link.run()

# Collect the results for the random directions
col_link = CollectSED_SG()
col_link.update_args(dict(ttype=dSphs,
                          targetlist='dSphs_sim/sim_random/target_list.yaml',
                          sim='random'))
col_link.run()

fermipy-copy-base-roi-sg --ttype dSphs --targetlist dSphs_sim/sim_random/target_list.yaml --sim random
fermipy-random-dir-gen-sg --ttype dSphs --targetlist dSphs_sim/sim_random/target_list.yaml --sim random --rand_config config/random_dSphs.yaml
fermipy-analyze-sed-sg --ttype dSphs --targetlist dSphs_sim/sim_random/target_list.yaml --skydirs skydirs.yaml
fermipy-collect-sed-sg --ttype dSphs --targetlist dSphs_sim/sim_random/target_list.yaml --sim random

Additional Arguments
- extracopy [] Extra files to copy from basline fit directory.
- config [‘config.yaml’] Name of the fermipy configuration file to use.
- roi_baseline [‘fit_baseline’] Prefix to use for the output files from the baseline fit to the ROI
- non_null_src [False] If set to True, the analysis will zero out the source before computing the SED. This is needed for positive control simulations.
- do_find_src [False] Do an additional setup of source finding in the ROI.
- write_full [False Write a full description of all the collected SED results
- write_summary [False]

Plotting Results

The module also includes code to plot the SED for each target. Note that this can also be done with the make_plots=True option in AnalyzeSED_SG.

link = PlotCastro_SG()
link.update_args(dict(ttype=dSphs,
                      targetlist='dSphs/target_list.yaml'))
link.run()

fermipy-plot-castro-sg --ttype dSphs --targetlist dSphs/target_list.yaml

One of the resulting plots would look something like this: