.. _fermipy_jobs_multiple_ROIs:

Using fermipy.jobs to analyze multiple ROIs
===========================================

The fermipy.jobs sub-package includes a few scripts and tools that can be used to parallelize the standard analysis of regions of interest (ROIs), as well as both positive and negative control tests.

Overview
=========

This package implements an analysis pipeline to perform a standard analysis on multiple ROIs. This involves a lot of bookkeeping and loops over various things. It is probably easiest to first describe this with a bit of pseudo-code that represents the various analysis steps.

The various loop variables are:

* rosters

  The list of all lists of targets to analyze. This might seem like overkill, but it allows you to define multiple versions of the same target, e.g., to test different models.

* targets

  The list of all the analysis targets. This is generated by merging all the targets from the input rosters.

* target.profiles

  The list of all the spatial profiles to analyze for a particular target. This is generated by finding all the versions of a target from all the input rosters.

* sims

  The list of all the simulation scenarios to analyze. This is provided by the user.

* first, last

  The first and last seeds to use for the random number generator (for simulations), or the first and last random directions to use (for random direction control studies).

.. code-block:: python

    # Initialization, prepare the analysis directories and build the list of targets
    PrepareTargets(rosters)

    # Data analysis

    # Loop over targets
    for target in targets:
        AnalyzeROI(target)
        for profile in target.profiles:
            AnalyzeSED(target, profile)
            PlotCastro(target, profile)

    # Simulation analysis

    # Loop over simulation scenarios
    for sim in sims:
        # Loop over targets
        for target in targets:
            CopyBaseROI(sim, target)
            for profile in target.profiles:
                SimulateROI(sim, target, profile)  # This loops over simulation seeds and produces SEDs
                CollectSED(sim, target, profile)

    # Random direction control analysis

    # Loop over targets
    for target in targets:
        CopyBaseROI(target)
        RandomDirGen(target)
        for profile in target.profiles:
            for seed in range(first, last):
                AnalyzeSED(target, profile, seed)
            CollectSED('random', target, profile)

Master Configuration File
=========================

fermipy.jobs uses YAML files to read and write its configuration in a persistent format. The configuration file has a hierarchical structure that groups parameters into dictionaries that are keyed to a section name.

.. code-block:: yaml
    :caption: Sample Configuration

    ttype: dSphs
    rosters : ['test']
    spatial_models: ['point']

    sim_defaults:
        seed : 0
        nsims : 20
        profile : ['point']

    sims:
        'null' : {}
        'pl2_1em9' : {}

    random: {}

    data_plotting:
        plot-castro : {}

Options at the top level apply to all parts of the analysis pipeline.

.. code-block:: yaml
    :caption: Sample *top level* Configuration

    # Top level
    ttype : 'dSphs'
    rosters : ['test']
    spatial_models: ['point']

* ttype: str

  Target type. This is used mainly for bookkeeping: it gives the name of the top-level directory and is used to call out specific configuration files.

* rosters: list

  List of rosters of targets to analyze. Each roster represents a self-consistent set of targets. Different versions of the same target can be on several different rosters, but no target should appear on a single roster more than once.
* spatial_models: list

  List of types of spatial model to use when fitting the DM. Options are:

  * point : A point source

.. note::

    If multiple rosters include the same target and profile, that target will only be analyzed once, and those results will be re-used when combining each roster.

Simulation configuration
------------------------

The *sim_defaults*, *sims* and *random* sections can be used to define analysis configurations for control studies with simulations and random sky directions.

.. code-block:: yaml
    :caption: Sample *simulation* Configuration

    sim_defaults:
        seed : 0
        nsims : 20
        profile : point

    sims:
        'null' : {}
        'pl2_1em9' : {}

    random: {}

* sim_defaults : dict

  This is a dictionary of the parameters to use for simulations. These can be overridden for specific types of simulation.

  * seed : int

    Random number seed to use for the first simulation.

  * nsims : int

    Number of simulations.

  * profile : str

    Name of the spatial profile to use for simulations. This must match a profile defined in the roster for each target. The 'alias_dict' file can be used to remap longer profile names, or to define a common name for all the profiles in a roster.

* sims : dict

  This is a dictionary of the simulation scenarios to consider, and of any option overrides for some of those scenarios. Each defined simulation needs a 'config/sim_{sim_name}.yaml' file to define the injected source to use for that simulation.

* random: dict

  This is a dictionary of the options to use for random sky direction control studies.

Plotting configuration
----------------------

.. code-block:: yaml
    :caption: Sample *plotting* Configuration

    data_plotting:
        plot-castro : {}

* data_plotting : dict

  Dictionaries of which types of plots to make for data, simulations and random direction controls. These dictionaries can be used to override the default set of channels for any particular set of plots.

The various plot types are:

* plot-castro : SED plots of a particular target, assuming a particular spatial profile.
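The way *sim_defaults* and the per-scenario entries in *sims* combine can be sketched in a few lines of Python. This is only an illustration, assuming that the options of each scenario act as a shallow override of the defaults. The helper name ``resolve_sim_configs`` is hypothetical and is not part of fermipy.jobs, and the ``nsims`` override for ``pl2_1em9`` is made up for the example.

```python
# Illustrative sketch only: resolve_sim_configs is a hypothetical helper,
# not a fermipy.jobs function. It assumes that per-scenario options are a
# shallow override of sim_defaults.

# The simulation part of the master configuration, written as a Python dict.
master_config = {
    'sim_defaults': {'seed': 0, 'nsims': 20, 'profile': 'point'},
    'sims': {
        'null': {},
        'pl2_1em9': {'nsims': 50},  # made-up override, for illustration only
    },
}


def resolve_sim_configs(config):
    """Return a dict mapping each scenario name to its effective options."""
    defaults = config.get('sim_defaults', {})
    resolved = {}
    for sim_name, overrides in config.get('sims', {}).items():
        options = dict(defaults)         # start from a copy of the defaults
        options.update(overrides or {})  # scenario-specific values win
        resolved[sim_name] = options
    return resolved


if __name__ == '__main__':
    for name, options in resolve_sim_configs(master_config).items():
        print(name, options)
```

Copying the defaults before updating keeps *sim_defaults* itself unchanged, so one scenario's override cannot leak into another.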
Additional Configuration files
==============================

In addition to the master configuration file, the pipeline needs a few additional files.

Fermipy Analysis Configuration Yaml
-----------------------------------

This is simply a template of the `fermipy` configuration file to be used for the baseline analysis and SED fitting in each ROI. Details of the syntax and options are described in the fermipy configuration documentation. The actual direction and name of the target source in this file will be overwritten for each target.

Profile Alias Configuration Yaml
--------------------------------

This is an optional small file that remaps the target profile names to shorter names (without underscores in them). Removing the underscores helps keep the file name fields more logical, since fermipy.jobs generally uses underscores as a field separator. This also keeps file names shorter, and allows us to use rosters with a mixed set of profile versions to do simulations. Here is an example:

.. code-block:: yaml

    ackermann2016_photoj_0.6_nfw : ack2016
    geringer-sameth2015_nfw : gs2015

Simulation Scenario Configuration Yaml
--------------------------------------

This file specifies the signal to inject in the analysis (if any). Here is an example. Note that everything inside the 'injected_source' tag is in the format that `fermipy` expects for source definitions.

.. code-block:: yaml

    # For positive control tests we inject a source.
    # In this case it is a power-law spectrum.
    injected_source:
        name : testpl
        source_model :
            SpatialModel : PointSource
            SpectrumType : Powerlaw
            Prefactor :
                value : 1e-9
            index :
                value: 2.
            scale :
                value : 1000.

For null simulations, you should include the 'injected_source' tag, but leave it blank.

.. code-block:: yaml

    # For null (negative control) simulations no source is injected,
    # so the tag is present but empty.
    injected_source:

Random Direction Control Sample Configuration Yaml
--------------------------------------------------

This file defines how we select random directions for the random direction control studies. Here is an example:

.. code-block:: yaml

    # These are the parameters for the random direction selection
    # The algorithm picks points on a grid

    # File key for the first direction
    seed : 0
    # Number of directions to select
    nsims : 20

    # Step size between grid points (in deg)
    step_x : 1.0
    step_y : 1.0
    # Max distance from ROI center (in deg)
    max_x : 3.0
    max_y : 3.0

Preparing the analysis areas
----------------------------

The initial setup can be done either by using the `PrepareTargets` link directly from Python, or by running the fermipy-prepare-targets executable. This will produce a number of analysis directories and populate them with the needed configuration files.

.. code-block:: python

    link = PrepareTargets()
    link.update_args(dict(ttype='dSphs', rosters=['dsph_roster.yaml'],
                          spatial_models=['point'],
                          sims=['null', 'random', 'pl2_1em9']))
    link.run()

.. code-block:: shell

    fermipy-prepare-targets --ttype dSphs --rosters dsph_roster.yaml --spatial_models point --sims random --sims pl2_1em9 --sims null

* Additional Arguments

  * alias_dict [None]: Optional path to a file that remaps the target profile names to shorter names.

Baseline target analysis
------------------------

The first step of the analysis chain is to perform a baseline re-optimization of each ROI. This is done by using `AnalyzeROI_SG` to run the `AnalyzeROI` link on each ROI in the target list generated by `PrepareTargets`. This can be done directly from Python, or from the shell using the fermipy-analyze-roi-sg executable.

.. code-block:: python

    link = AnalyzeROI_SG()
    link.update_args(dict(ttype='dSphs', targetlist='dSphs/target_list.yaml'))
    link.run()

.. code-block:: shell

    fermipy-analyze-roi-sg --ttype dSphs --targetlist dSphs/target_list.yaml --config config.yaml

* Additional Arguments

  * config ['config.yaml']: Name of the fermipy configuration file to use.
  * roi_baseline ['fit_baseline']: Prefix to use for the output files from the baseline fit to the ROI.
  * make_plots [False]: Produce the standard plots for an ROI analysis.

Target SED analysis
-------------------

The next step of the analysis chain is to extract the SED for each spatial profile of each target. This is done by using `AnalyzeSED_SG` to run the `AnalyzeSED` link on each ROI in the target list. This uses the baseline fits as a starting point for the SED fits. This can be done directly from Python, or from the shell using the fermipy-analyze-sed-sg executable.

.. code-block:: python

    link = AnalyzeSED_SG()
    link.update_args(dict(ttype='dSphs', targetlist='dSphs/target_list.yaml'))
    link.run()

.. code-block:: shell

    fermipy-analyze-sed-sg --ttype dSphs --targetlist dSphs/target_list.yaml --config config.yaml

* Additional Arguments

  * config ['config.yaml']: Name of the fermipy configuration file to use.
  * roi_baseline ['fit_baseline']: Prefix to use for the output files from the baseline fit to the ROI.
  * make_plots [False]: Produce the standard plots for a SED analysis.
  * non_null_src [False]: If set to True, the analysis will zero out the source before computing the SED. This is needed for positive control simulations.
  * skydirs [None]: Optional file with a set of directions to build SEDs for. This is used for random direction control samples.

Simulated realizations of ROI analysis
--------------------------------------

This module provides tools to perform simulated realizations of the ROIs. This is done by copying the baseline ROI, using it as a starting point, and simulating realizations of the analysis by throwing Poisson fluctuations on the expected model counts of the ROI and then fitting those simulated data.
These simulations are done for each target and can optionally include injecting a signal source. This can be done directly from Python, or from the shell using executables.

Here is an example of how to generate negative control ("null") simulations. This requires that 'config/sim_null.yaml' consist of just a single empty tag 'injected_source'. To run a positive control sample you would just change "null" to, for example, "pl2_1em9", where 'config/sim_pl2_1em9.yaml' is the yaml file with the spectral model described above.

.. code-block:: python

    # Copy the baseline ROI
    copy_link = CopyBaseROI_SG()
    copy_link.update_args(dict(ttype='dSphs',
                               targetlist='dSphs_sim/sim_null/target_list.yaml',
                               sim='null'))
    copy_link.run()

    # Run simulations of the ROI
    sim_link = SimulateROI_SG()
    sim_link.update_args(dict(ttype='dSphs',
                              targetlist='dSphs_sim/sim_null/target_list.yaml',
                              sim='null'))
    sim_link.run()

    # Collect the results of the simulations
    col_link = CollectSED_SG()
    col_link.update_args(dict(ttype='dSphs',
                              targetlist='dSphs_sim/sim_null/target_list.yaml',
                              sim='null'))
    col_link.run()

.. code-block:: shell

    fermipy-copy-base-roi-sg --ttype dSphs --targetlist dSphs_sim/sim_null/target_list.yaml --sim null
    fermipy-simulate-roi-sg --ttype dSphs --targetlist dSphs_sim/sim_null/target_list.yaml --sim null
    fermipy-collect-sed-sg --ttype dSphs --targetlist dSphs_sim/sim_null/target_list.yaml --sim null

* Additional Arguments

  * extracopy []: Extra files to copy from the baseline fit directory.
  * config ['config.yaml']: Name of the fermipy configuration file to use.
  * roi_baseline ['fit_baseline']: Prefix to use for the output files from the baseline fit to the ROI.
  * non_null_src [False]: If set to True, the analysis will zero out the source before computing the SED. This is needed for positive control simulations.
  * do_find_src [False]: Do an additional step of source finding in the ROI.
  * sim_profile ['default']: Name of the profile to use to produce the simulations.
  * nsims [20]: Number of simulations to run.
  * seed [0]: Starting random number seed. Also used in bookkeeping.
  * nsims_job [0]: Number of simulations per job. 0 means to run all the simulations in a single job.

Random Direction Control Studies
--------------------------------

This module also provides tools to perform analyses of random directions in the ROI as a control sample. This is done by copying the baseline ROI, using it as a starting point, and then picking directions away from the center of the ROI and treating them as the target.

Here is an example of how to generate a random direction control sample. The selection of the directions is configured by the file passed as `rand_config`, in the format described above.

.. code-block:: python

    # Copy the baseline ROI
    copy_link = CopyBaseROI_SG()
    copy_link.update_args(dict(ttype='dSphs',
                               targetlist='dSphs_sim/sim_random/target_list.yaml',
                               sim='random'))
    copy_link.run()

    # Make a set of random directions
    dir_link = RandomDirGen_SG()
    dir_link.update_args(dict(ttype='dSphs',
                              targetlist='dSphs_sim/sim_random/target_list.yaml',
                              sim='random',
                              rand_config='config/random_dSphs.yaml'))
    dir_link.run()

    # Construct the SED for each random direction
    sed_link = AnalyzeSED_SG()
    sed_link.update_args(dict(ttype='dSphs',
                              targetlist='dSphs_sim/sim_random/target_list.yaml',
                              skydirs='skydirs.yaml'))
    sed_link.run()

    # Collect the results for the random directions
    col_link = CollectSED_SG()
    col_link.update_args(dict(ttype='dSphs',
                              targetlist='dSphs_sim/sim_random/target_list.yaml',
                              sim='random'))
    col_link.run()

.. code-block:: shell

    fermipy-copy-base-roi-sg --ttype dSphs --targetlist dSphs_sim/sim_random/target_list.yaml --sim random
    fermipy-random-dir-gen-sg --ttype dSphs --targetlist dSphs_sim/sim_random/target_list.yaml --sim random --rand_config config/random_dSphs.yaml
    fermipy-analyze-sed-sg --ttype dSphs --targetlist dSphs_sim/sim_random/target_list.yaml --skydirs skydirs.yaml
    fermipy-collect-sed-sg --ttype dSphs --targetlist dSphs_sim/sim_random/target_list.yaml --sim random

* Additional Arguments

  * extracopy []: Extra files to copy from the baseline fit directory.
  * config ['config.yaml']: Name of the fermipy configuration file to use.
  * roi_baseline ['fit_baseline']: Prefix to use for the output files from the baseline fit to the ROI.
  * non_null_src [False]: If set to True, the analysis will zero out the source before computing the SED. This is needed for positive control simulations.
  * do_find_src [False]: Do an additional step of source finding in the ROI.
  * write_full [False]: Write a full description of all the collected SED results.
  * write_summary [False]

Plotting Results
----------------

The module also includes code to plot the SED for each target. Note that this can also be done with the make_plots=True option in `AnalyzeSED_SG`.

.. code-block:: python

    link = PlotCastro_SG()
    link.update_args(dict(ttype='dSphs', targetlist='dSphs/target_list.yaml'))
    link.run()

.. code-block:: shell

    fermipy-plot-castro-sg --ttype dSphs --targetlist dSphs/target_list.yaml

One of the resulting plots would look something like this:

.. image:: sed_default_point.png
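As a recap, the data analysis part of the chain can be written down as an ordered list of the shell commands shown in this document. The sketch below only assembles and prints the command lines; it does not run them. The helper ``build_data_chain`` is hypothetical and is not part of fermipy.jobs. It uses only the executables and flags that appear above, and it assumes that the target list lives at ``<ttype>/target_list.yaml``, as in the examples.

```python
# Illustrative sketch only: build_data_chain is a hypothetical helper that
# assembles the command lines shown in this document. It does not run them.
import shlex


def build_data_chain(ttype, roster, spatial_model, config='config.yaml'):
    """Return the data-analysis commands, in order, as argument lists."""
    # Assumption: the target list path follows the pattern used in the examples.
    targetlist = '%s/target_list.yaml' % ttype
    common = ['--ttype', ttype, '--targetlist', targetlist]
    return [
        # 1. Prepare the analysis directories and the target list
        ['fermipy-prepare-targets', '--ttype', ttype,
         '--rosters', roster, '--spatial_models', spatial_model],
        # 2. Baseline fit of each ROI
        ['fermipy-analyze-roi-sg'] + common + ['--config', config],
        # 3. SED extraction for each target and profile
        ['fermipy-analyze-sed-sg'] + common + ['--config', config],
        # 4. Castro plots of the SEDs
        ['fermipy-plot-castro-sg'] + common,
    ]


if __name__ == '__main__':
    for cmd in build_data_chain('dSphs', 'dsph_roster.yaml', 'point'):
        print(' '.join(shlex.quote(arg) for arg in cmd))
```

Keeping each command as a list of arguments, rather than one string, means the same lists could be handed to a job launcher without any re-parsing.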