Steps in making efficiency plots, no input optimization

  1. Without input parameter optimization, you can run the analysis once and then work with that output. Use the TopTag.py or TopTagNew.py script in the SVN under code/. The latter implements the same tools, but is moving toward "FastJet3-ization". TopTagNew.py requires the very latest SpartyJet SVN trunk and FastJet 3.0.0. TopTag.py still requires a version of SpartyJet newer than the latest release, but I'm not quite sure how recent. To run the analysis, do "./TopTag.py inputfile outputfile". The output file is a ROOT file storing information about found jets and jet measurements. You can use multiple input files by placing them in quotes; shell globbing expressions like "/foo/bar/*.hepmc" are allowed.
  2. Once you have .root files for signal and background samples, you can run RunAnalysisSimple.py, which optimizes cuts for a particular analysis. The syntax is "./RunAnalysisSimple.py name nVars signalFile BGFile". "name" is the code name of an analysis; current choices are {ATLAS, CMS, HEP, JH20, JH50, NSub, Pruned20, Pruned50, TW, Trimmed}. "nVars" is the number of output variables to use; -2 selects the "best" set, which I've chosen with a small amount of discernment.
  3. Step 2 produces several files, including an efficiencies file ending in .dat. (Because TMVA produces a bunch of output files and directories, it is best to do step 2 in a temporary directory.) The efficiencies file contains rows of "signalEff BGEff {cut and parameter info}". You can make simple plots of multiple efficiency files with the script EffPlots.py in code/results/efficiency/. Just call "./EffPlots.py foo.eff.dat bar.eff.dat baz.eff.dat". The script will use "foo", "bar" and "baz" as legend labels. You will need the matplotlib Python package to use this script. The efficiency files are pretty easy to parse, so you can of course make plots however you like.
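
If you would rather parse the efficiency files yourself, here is a minimal sketch. It is not the code in EffPlots.py. It assumes only the row layout described in step 3 (two leading numbers, then cut and parameter info) and skips any row that does not start with two numbers. The function names, the output file name "eff.png" and the axis labels are made up for illustration.

```python
import os


def parse_efficiencies(lines):
    """Return (signalEff, BGEff) pairs from rows of 'signalEff BGEff {...}'."""
    points = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or too-short row
        try:
            points.append((float(fields[0]), float(fields[1])))
        except ValueError:
            continue  # assumption: rows not starting with two numbers are not data
    return points


def legend_label(path):
    """Map 'dir/foo.eff.dat' to 'foo' (assumed to mirror the legend labels)."""
    return os.path.basename(path).split(".")[0]


def plot_files(paths, out="eff.png"):
    """Plot background vs. signal efficiency, one curve per file."""
    # Imported here so the parsing helpers work without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    for path in paths:
        with open(path) as f:
            points = sorted(parse_efficiencies(f))
        if not points:
            continue
        signal_eff, bg_eff = zip(*points)
        plt.plot(signal_eff, bg_eff, label=legend_label(path))
    plt.xlabel("signal efficiency")
    plt.ylabel("background efficiency")
    plt.legend()
    plt.savefig(out)
```

For example, plot_files(["foo.eff.dat", "bar.eff.dat"]) would write one curve per file to eff.png.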

Making efficiency plots, with input optimization

  1. In this case, you let the RunAnalysis.py script run things repeatedly for you. The syntax is "./RunAnalysis.py name nVars nPoints sigfile_pattern bgfile_pattern". "name" is a choice of analysis that is defined in Analyses.py; {CMS, HEP, JH, NSub, Pruned} are available. "nVars" is as above. "nPoints" determines how many values of each input parameter to scan (you will have nPoints^nVars analysis runs). The file patterns can be either file names or globbing expressions (protected by quotes).
  2. This will eventually produce the same output as the procedure without input optimization, so you can continue with step 3 of that procedure above.
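
To get a feel for how quickly the number of runs grows with nPoints, here is a minimal sketch of a grid scan. It is not how RunAnalysis.py or Analyses.py actually implement the scan. The parameter names "R" and "fcut", their ranges, and the even spacing are all hypothetical. The only point is that nPoints values for each of k scanned parameters gives nPoints^k combinations.

```python
import itertools


def scan_grid(param_ranges, n_points):
    """Yield one dict of parameter values per grid point.

    param_ranges maps a (hypothetical) parameter name to (low, high);
    each parameter takes n_points evenly spaced values in that range.
    """
    names = sorted(param_ranges)
    axes = []
    for name in names:
        low, high = param_ranges[name]
        if n_points == 1:
            axes.append([low])
        else:
            step = (high - low) / (n_points - 1)
            axes.append([low + i * step for i in range(n_points)])
    # Cartesian product: every combination of values across all parameters.
    for combo in itertools.product(*axes):
        yield dict(zip(names, combo))


# Two hypothetical input parameters scanned at 3 points each: 3**2 = 9 runs.
grid = list(scan_grid({"R": (0.8, 1.2), "fcut": (0.0, 0.1)}, 3))
print(len(grid))
```

Each grid point corresponds to one full analysis run, so it is worth keeping nPoints small when an analysis has several input parameters.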

Task List

  1. Implement Trimmed in Analyses.py, so it can be optimized. (Just copying from TopTag.py to Analyses.py...) (CV)
  2. Run analyses for Sherpa events, all pT
  3. Run detector simulator on all Sherpa events, all pT (CV)
  4. Run analyses on Herwig++ events, 500-600
  5. Run analyses on Herwig++ events, all pT
  6. Run detector on all Herwig++ events (CV)
  7. Run analyses on Sherpa-detector events
  8. Run analyses on Herwig++-detector events
  9. Do any/all of the above with input optimization for CMS, HEP, JH, Pruned, Trimmed.
-- ChristopherVermilion - 27 Oct 2011
Topic revision: r1 - 27 Oct 2011, ChristopherVermilion