Analyzing Events with CMS Software
These are instructions for analyzing many events in parallel and merging the output.
Setting Up
- Figure out the exact dataset you want to run over from the list of officially produced data stored at Wisconsin. The string you need looks something like these:
/QCD_Pt30/Summer09-MC_31X_V3_7TeV_AODSIM-v1/AODSIM
/PhotonJet_Pt15/Summer09-MC_31X_V3_7TeV_TrackingParticles_AODSIM-v1/AODSIM
/ZeeJet_Pt0to15/Summer09-MC_31X_V3_7TeV_AODSIM-v1/AODSIM
- The configuration file is the same as what would normally be used, but two lines must be edited to look like this:
...
process.source = cms.Source("PoolSource",
    fileNames = cms.untracked.vstring( $inputFileNames )
)
...
process.SimpleAnalyzer.OutputFile = '$outputFileName'
...
The “$inputFileNames” and “$outputFileName” variables will be replaced by a script later.
Here is an example configuration file for reference: MultiPhotonAnalyzer_cfg.py. A minimal sketch of a full configuration follows below.
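To make the placement of those two lines concrete, here is a minimal sketch of a farmout-ready configuration. It is only an illustration: the "SimpleAnalyzer" module and its OutputFile parameter are placeholders borrowed from the snippet above, and the file only becomes valid Python after farmoutAnalysisJobs substitutes the two $-variables.

import FWCore.ParameterSet.Config as cms

process = cms.Process("ANALYSIS")

# Run over every event in each input file
process.maxEvents = cms.untracked.PSet( input = cms.untracked.int32(-1) )

# $inputFileNames is replaced with a quoted list of input files for each job
process.source = cms.Source("PoolSource",
    fileNames = cms.untracked.vstring( $inputFileNames )
)

# Hypothetical analyzer module; $outputFileName is likewise replaced per job
process.SimpleAnalyzer = cms.EDAnalyzer("SimpleAnalyzer",
    OutputFile = cms.untracked.string('$outputFileName')
)

process.p = cms.Path(process.SimpleAnalyzer)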
Running the Analysis
- Go to your CMSSW folder and set up the environment:
cd ~/CMSSW_2_1_0_pre6/src/Analysis/
eval `scramv1 runtime -sh`
- Get a valid grid proxy and make it valid for a decent number of hours:
voms-proxy-init -valid 40:00
- Run a script to farm out the analysis jobs to Condor. If you’re running on data you produced yourself:
farmoutAnalysisJobs FolderName ~/CMSSW_2_1_0_pre6/ ~/CMSSW_2_1_0_pre6/src/Analysis/exampleConfig.cfg
This will submit an analysis job for every ROOT file contained in
/hdfs/store/user/$USER/FolderName/
Or, if you’re running on a dataset known to DBS, use a command like
farmoutAnalysisJobs --input-dbs-path=/ph1j_20_60-alpgen/CMSSW_1_6_7-CSA07-1201165474/RECO ph1j_20_60 ~/CMSSW_1_6_8/ ~/CMSSW_1_6_8/src/Analysis/analgen.cfg
This will also submit a job for every ROOT file contained in the input dataset.
Here’s an example of a successful submit:
farmoutAnalysisJobs --input-files-per-job=40 PhotonJet500-1000 ~/CMSSW_2_1_0_pre6/ ~/CMSSW_2_1_0_pre6/src/Analysis/PhotonJetAnalyzer/photonjetanalyzer.cfg
Generating submit files in /scratch/mbanderson/PhotonJet500-1000-photonjetanalyzer...
.............
Submitting job(s).............
Logging submit event(s).............
13 job(s) submitted to cluster 11892.
Jobs for PhotonJet500-1000 are created in /scratch/mbanderson/PhotonJet500-1000-photonjetanalyzer
Monitor your jobs at http://www.hep.wisc.edu/~mbanderson/jobMonitor.php
Wait about 2 minutes, then visit http://www.hep.wisc.edu/~YourScreenName/jobMonitor.php to see an auto-generated graph of the status of your jobs.
- If you want more information about your jobs, type
condor_q YourScreenName
to see your jobs in the queue. This will return something like:
-- Submitter: login02.hep.wisc.edu : <144.92.180.5:58221> : login02.hep.wisc.edu
 ID      OWNER          SUBMITTED     RUN_TIME ST PRI SIZE   CMD
11892.0   mbanderson    7/22 08:28   0+00:04:05 R  0   1220.7 cmsRun.sh photonje
11892.1   mbanderson    7/22 08:28   0+00:04:05 R  0   1220.7 cmsRun.sh photonje
11892.2   mbanderson    7/22 08:28   0+00:04:06 R  0   1220.7 cmsRun.sh photonje
3 jobs; 0 idle, 3 running, 0 held
For more information on a single job, use its ID number:
condor_q -l 11892.2
- Finally, when your jobs finish, your files will all be located in a folder in HDFS:
/hdfs/store/user/$USER/AnyName/
To delete, rename, or move files there, type:
gsido
This should give you a shell running as the same Unix account that owns your files in HDFS. You can then cd to your directory and do what you wish with your files.
Merging the Final ROOT Files
- When your jobs are all finished, go to the scratch space on your machine
cd /scratch/
and type
mergeFiles --copy-timeout=10 final.root /hdfs/store/user/$USER/AnyName/
This will create a merged ROOT file in your current directory. (Note: we suggest doing this in your scratch space; if the final ROOT file is very large, it may take up too much of your AFS space.) Type
mergeFiles --help
to see a list of other options.
Merging Files with Different Cross-sections
Use the mergeFiles command to merge files with the SAME cross-section. Do that until you have a small set of ROOT files, each with a different cross-section, and then merge or plot from them in one of two ways:
- To correctly merge histograms only, download the ROOT macro hadd.C to your current directory, edit it, and use it from the root command line like so:
root [0] .x hadd.C
This will combine the HISTOGRAMS, taking the cross-sections into account, into one final ROOT file (a short sketch of the weighting idea follows).
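The weighting amounts to scaling each input histogram by its sample's cross-section divided by the number of generated events before adding. The following PyROOT lines are only an illustrative sketch of that idea, not the hadd.C macro itself; the file names, numbers, and the histogram name "photonEt" are made up.

import ROOT

# (cross-section, number of generated events) per input file -- illustrative values
samples = {
    "phtn_jets_20_30-NEW.root": (1.319E5, 49961.),
    "phtn_jets_30_50-NEW.root": (4.114E4, 34646.),
}

combined = None
for fname, (xsec, nevents) in samples.items():
    f = ROOT.TFile.Open(fname)
    h = f.Get("photonEt")        # hypothetical histogram name
    h.SetDirectory(0)            # detach so the histogram survives f.Close()
    h.Scale(xsec / nevents)      # weight the sample by sigma / N_generated
    if combined is None:
        combined = h.Clone("photonEt_combined")
    else:
        combined.Add(h)
    f.Close()

out = ROOT.TFile("final-weighted.root", "RECREATE")
combined.Write()
out.Close()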
- To plot ntuples from multiple files with different cross-sections, download PlotFromFiles.C, then edit the following code and place it in a file called rootlogon.C in your local directory:
G__loadfile("/afs/hep.wisc.edu/home/YourUserName/Folder/PlotFromFiles.C");

// Create a default canvas and histogram.
// These are used by PlotFromNtuple.C, Plot2hists1D.C, and PlotFromFiles.C
TCanvas *c1 = new TCanvas("c1","testCanvas",640,480);
TH1F *h1 = new TH1F("h1","Blah",20,-5,5);

// *******************************************************
// Specify files for "PlotFromFiles"
const int NUM_OF_FILES = 6;
TFile *fileArray[NUM_OF_FILES];
fileArray[0] = new TFile( "phtn_jets_20_30-NEW.root" );
fileArray[1] = new TFile( "phtn_jets_30_50-NEW.root" );
fileArray[2] = new TFile( "phtn_jets_50_80-NEW.root" );
fileArray[3] = new TFile( "phtn_jets_80_120-NEW.root" );
fileArray[4] = new TFile( "phtn_jets_120_170-NEW.root" );
fileArray[5] = new TFile( "phtn_jets_170_300-NEW.root" );

// List of cross-sections divided by the number of events produced
double crossSections[NUM_OF_FILES] = { 1.319E5/49961., 4.114E4/34646.,
                                       7.210E3/45295., 1.307E3/8874.,
                                       2.578E2/9281.,  8.709E1/23867. };
// *******************************************************
Then, when you open root, you can use it at the command line like so:
root [0] PlotFromFiles("HiEtRecoPhtn","eta","deltaEt>0.08&&deltaEta<0.3",-3.4,3.4,61)
Saving Images/HiEtRecoPhtn-eta-deltaEtGT0.08-deltaEtaLT0.3.gif
Info in : GIF file Images/HiEtRecoPhtn-eta-deltaEtGT0.08-deltaEtaLT0.3.gif has been created
root [1]
The parameters that must be provided are:
PlotFromFiles("Ntuple Name","Variable Name","Cuts", x-min, x-max, number-of-bins)