ProjectGoals

dlast44 edited this page May 21, 2021 · 5 revisions

The 2021 MINERvA 101 Tutorial

Goals

  • Extract a working single-differential cross section: Dan's ME inclusive cross section in muon pT
    • Why Inclusive?
      • Easiest cross section to extract
      • No sidebands
      • Cuts useful to all CC muon analyses
    • Steps:
      • Event loop handles all data and MC histograms at same time
      • Program to extract cross section given file from event loop
      • Modify Amy/Dan's GENIEXSecExtract for closure test
      • Use TransWarp to run warping studies and choose number of unfolding iterations
    • Validation: Compare to Dan's 2D inclusive results at
      • OLD: /minerva/data/users/drut1186/ME_Inclusive/PubProcessing_CollablJan2021_MuonEnergyScan/Inclusive_CV
      • NEW (recommended by Dan to David): /minerva/data/users/drut1186/ME_Inclusive/PubProcessing_v102_Rerun_NoBubbleConstraint_AddedStatUnc/Inclusive_CV
    • Status:
      • Event loop making good progress
        • (2D) efficiency denominator matches Dan to 1e-3
        • (2D) efficiency numerator is off by about 5%. Possibly the MINOS efficiency weight?
        • Migration matrix filled in same place as efficiency numerator
        • TODO: Data/selected events, backgrounds
      • Copy my cross section extraction program?
      • Hoping to reuse Amy/Dan's GENIEXSecExtract
      • TransWarp ready
  • Plotting scripts
  • NSF Validation Suite
  • Available energy study
  • Suggestions for intentional bugs to introduce?
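
The extraction step under "Steps" above can be sketched in plain C++. This is a minimal sketch of the standard flux-integrated form (unfolded, background-subtracted data divided by efficiency, integrated flux, number of targets, and bin width), not the tutorial's actual program. The struct, the function name, and the units are illustrative assumptions, and std::vector stands in for the MnvH1D bin contents. Unfolding and background subtraction are assumed to have happened already.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Per-bin inputs in muon pT. All names and units here are illustrative assumptions.
struct ExtractionInputs
{
  std::vector<double> unfolded;       // background-subtracted, unfolded data counts
  std::vector<double> effNumerator;   // selected signal MC in true bins
  std::vector<double> effDenominator; // all signal MC in true bins
  std::vector<double> binWidths;      // width of each pT bin
  double integratedFlux;              // flux integrated over the exposure
  double nTargets;                    // number of nucleons in the fiducial volume
};

// dsigma/dpT in each bin = unfolded / efficiency / (flux * targets * bin width)
std::vector<double> differentialCrossSection(const ExtractionInputs& in)
{
  const std::size_t nBins = in.unfolded.size();
  if(in.effNumerator.size() != nBins || in.effDenominator.size() != nBins
     || in.binWidths.size() != nBins)
  {
    throw std::invalid_argument("All inputs must have the same number of bins");
  }

  std::vector<double> xsec(nBins, 0.);
  for(std::size_t bin = 0; bin < nBins; ++bin)
  {
    // Bins with no signal MC have no defined efficiency, so leave them at 0
    if(in.effNumerator[bin] <= 0 || in.effDenominator[bin] <= 0) continue;

    const double efficiency = in.effNumerator[bin] / in.effDenominator[bin];
    xsec[bin] = in.unfolded[bin] / efficiency
                / (in.integratedFlux * in.nTargets * in.binWidths[bin]);
  }
  return xsec;
}
```

In a closure test, feeding the selected signal MC through the same function should reproduce what GENIEXSecExtract gives for the same binning.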

We Have the Technology

  • Goals:
    • Uses MAD tuples at:
      • Data: TODO
      • MC: TODO
    • Works on GPVMs
    • Works on personal computers running:
      • OS X
      • Linux (Ubuntu and probably SL7 so far)
    • Works with only me1A from AnaTuple USB drives
    • Uses Satyajit's github PlotUtils repository
    • Stretch Goal: Automatic setup via CMake's ExternalProject_Add()
    • Stretch Goal: On the grid would be awesome
    • Stretch Goal: Dan requested 2D cross section
    • Stretch Goal: Nuclear targets. Maybe Aneczka's analysis now?
  • We'll be using:
    • Source code on public github: https://github.com/MinervaExpt/MAT_IncPions
    • Build system: CMake
    • MAT for everything it's worth:
      • MnvH1D/2D
      • Systematic universes
      • MacroUtil
      • Cutter
      • TODO: Upgrade/replace HistFolio with Categorized. Currently in project-specific code.
      • TODO: Reweighter
    • Start with PlotUtils from CVS. I've built it on Zubair's MacBook Pro before with some tweaks. Eventually switch to Satyajit's github repo.
    • MParamFiles, UnfoldUtils from CVS. Maybe put MParamFiles on USB drives if room?
    • ROOT 6 if we can swing it. Falling back to ROOT 5.34 shouldn't be a show stopper for at least CVS PlotUtils.
    • gcc, maybe clang for OS X support?
    • Markdown for documentation? Easy for github to turn into web page. Laura: Put all documentation in github wiki!

Starting Tasks

  • Andrew
    • Distribute MAD AnaTuples. Drives almost ready to be mailed.
    • Improve runEventLoop interface and general clean up
      • Pick playlist on the command line. How does that interface with flux used? Should there be a default?
      • Read playlist from STDIN?
      • Put everything in one .root file. Change the file name so it makes more sense in this context. Make sure histogram names match what cross section extraction expects. Add in flux and number of nucleons. Add in metadata about what was run?
      • How are warping studies going to work?
      • Write USAGE details in error messages. Standardize return code and check for failed file accesses while I'm at it.
    • Copy my cross section program into the tutorial. Make sure tutorial histograms have the names it expects.
    • Get GENIEXSecExtract to compile without cmt
      • Install a CMake build system like I did for PlotUtils and UnfoldUtils
      • This leads to writing a GENIEXSecExtract for a 1D inclusive cross section
      • Probably copy Amy and Dan's GENIEXSecExtract and take out the higher dimensionality parts?
    • Use PlotUtils::Model for all reweights in the tutorial. Hopefully speeds things up a bit when using systematics.
  • Christian
    • Only OS X tester for now? Hopefully Sean will catch up in about 1 week. Not sure if he has a Mac.
    • Compare 2D inclusive data to Dan
  • David
    • Compare 2D inclusive backgrounds to Dan
  • Sean
    • Run NSFValidationSuite against the tutorial. This will show us whether the standard systematics are working.
  • On the horizon:
    • Validate data/selection against Dan's CCQENu AnaTuples. Someone has to ask Dan which file/histogram to compare to selection.
    • Install migration matrix. Should just work once efficiency numerator and data selection work.
    • Test against Rob's NSF Validation Suite (in PlotUtils). Probably means adding new histograms to the event loop or rewriting it.
    • Background histograms via Categorized. Dan says we just need GENIE categories. I at least have those set up in Mehreen's macro.
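
The "USAGE details in error messages" and "standardize return code" items above could look something like the following. This is only a sketch: the return code values, the usage text, and the `checkArguments` helper are all hypothetical, and the real runEventLoop argument list may differ.

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical return codes: 0 means success, anything else tells a script what failed
enum ReturnCode
{
  success = 0,
  badCmdLine = 1,
  badInputFile = 2,
  badOutputFile = 3
};

// Hypothetical usage text. The real argument list may differ.
const std::string USAGE =
  "USAGE: runEventLoop <dataPlaylist.txt> <mcPlaylist.txt>\n"
  "Returns 0 on success, 1 for a bad command line, 2 for an unreadable input file,\n"
  "and 3 for an output file that could not be written.\n";

// Check the command line arguments (program name already removed).
// Prints the problem followed by USAGE to err when something is wrong.
ReturnCode checkArguments(const std::vector<std::string>& args, std::ostream& err)
{
  if(args.size() != 2)
  {
    err << "Expected 2 arguments, but got " << args.size() << ".\n" << USAGE;
    return badCmdLine;
  }
  return success;
}
```

A `main()` would build `args` from `argv + 1`, pass `std::cerr` as `err`, and return the code directly so that shell scripts can check `$?`.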

Reference

  • Result based on Amy's LE and Dan's ME 2D inclusive analyses. Amy has a great paper explaining the LE analysis at TODO
  • This tutorial is a heavily modified fork of Mehreen's macro.
  • Key to CCQENu comparison:
    • Tuples to compare to:
      • Official CCQENu reference: https://cdcvs.fnal.gov/redmine/projects/ccqenu-management/wiki
      • In playlists directory as CCQENu_*.txt. Ignore the Extended 2p2h playlist though.
      • Data: /pnfs/minerva/persistent/users/drut1186/CCQENu_Anatuples/MuonKludge_ProtonLLR_UpdatedNeutron/Data_Merged/minervame1Apass1/*.root
      • MC: /pnfs/minerva/persistent/users/drut1186/CCQENu_Anatuples/MuonKludge_ProtonLLR_UpdatedNeutron/MC_Merged/minervame1Apass1/*.root
    • Efficiency numerator: h_pzmu_ptmu_CC in /minerva/data/users/drut1186/ME_Inclusive/PubProcessing_CollablJan2021_MuonEnergyScan/Inclusive_CV/EffPurity_MakeFlux-1_minervame1A.root
    • Efficiency denominator: h_pzmu_ptmu_truth_CC in /minerva/data/users/drut1186/ME_Inclusive/PubProcessing_CollablJan2021_MuonEnergyScan/Inclusive_CV/EffPurity_MakeFlux-1_minervame1A.root
    • Selected data: TODO in TODO
    • Backgrounds: TODO in TODO
  • MAD AnaTuples can be found according to: https://docs.google.com/spreadsheets/d/1t7AR6FuA6klaxo8lqJfm_ULlVNKQnZgG8UMRB6XCSxs/edit?ts=60806baf#gid=0. I might copy them to /minerva/data soon.
  • GENIEXSecExtract program: TODO
  • My repository for cross section extraction program and plotting scripts: https://github.com/MinervaExpt/NucCCNeutrons/tree/develop and https://github.com/aolivier23/MINERvANeutronMultiplicity
  • Markdown guide for github: https://guides.github.com/features/mastering-markdown/
  • CMake setup instructions: In README.md for now.
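
For the CCQENu comparison above, a statement like "matches Dan to 1e-3" can be made concrete as the largest bin-by-bin relative difference between a tutorial histogram and the reference histogram. A minimal sketch, assuming the bin contents have already been copied out of the histograms into vectors; `maxRelativeDifference` is a hypothetical helper, not part of PlotUtils.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Largest relative bin-by-bin difference between a test histogram and a reference.
// Bins where the reference is empty fall back to the absolute difference.
double maxRelativeDifference(const std::vector<double>& test,
                             const std::vector<double>& reference)
{
  if(test.size() != reference.size())
  {
    throw std::invalid_argument("Histograms have different binning");
  }

  double worst = 0;
  for(std::size_t bin = 0; bin < reference.size(); ++bin)
  {
    const double diff = std::fabs(test[bin] - reference[bin]);
    const double scale = (reference[bin] != 0) ? std::fabs(reference[bin]) : 1.;
    worst = std::max(worst, diff / scale);
  }
  return worst;
}
```

With this, the efficiency denominator check would be `maxRelativeDifference(mine, dans) < 1e-3`, and a roughly 5% offset in the numerator would show up as a value near 0.05.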

runEventLoop User Interface Brainstorming

  • runEventLoop needs these modes:
    • Systematics on/off. Off is for faster debugging.
    • 1D cross section mode for the tutorial
    • 2D cross section from CCQENu for validating the CV
    • NSF Validation Suite mode for validating systematics
    • Alternative models for warping studies?
  • Can I combine them?
    • TLDR: Make extra histograms if and only if the tree name is "CCQE"
    • 1D versus 2D CCQE mode: check tree name
      • If it's "CCQE", make 2D histograms on top of others.
      • Else, just do 1D cross section
      • Important because I suspect that filling 2D histograms takes forever compared to 1D and I'm impatient
    • NSF Validation suite: only do this in CCQE mode?
    • Systematics definitely need to be on in CCQE mode for the validation suite. assert() this? Does it run fast enough in 1D mode?
    • Alternative models will require changing the code anyway.
  • User interface itself will just be files, following Gtk's Command Line Interface (CLI) design philosophy from the good old days: Gtk documentation
  • Application returns 0 if and only if all files were opened and closed properly. If any file fails to open, it returns a nonzero value. This way, scripts can decide whether to stop when runEventLoop fails.
  • Application prints warnings and errors to stderr. Printout to stdout will summarize the configuration and results. I might print things like event numbers to std::clog so I can easily switch between the two later.
  • Can I get the application to support reading file names from stdin?
    • If successful, this would let me use both playlist files and individual files (for e.g. debugging) with no if statement.
    • Downside: I need to separate data and MC files
      • I don't want to look for "data" in the file name because that seems like a code smell.
      • I could look for a tree named "Truth", but then I'd have to open every file an extra time. Seems like this might kill startup performance when reading via xrootd (which is the default plan for the GPVMs).
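
Two of the ideas above are small enough to sketch in standard C++: reading file names from stdin and the "extra histograms if and only if the tree name is CCQE" rule. Both function names are hypothetical, and the sketch assumes one file name per line. It does not solve the data versus MC separation problem.

```cpp
#include <cassert>
#include <istream>
#include <sstream> // std::istringstream, handy for testing without a real stdin
#include <string>
#include <vector>

// Read one file name per line from any input stream, skipping blank lines.
// Leading and trailing whitespace is trimmed from each name.
std::vector<std::string> readFileNames(std::istream& input)
{
  std::vector<std::string> names;
  std::string line;
  while(std::getline(input, line))
  {
    const std::string::size_type first = line.find_first_not_of(" \t\r");
    if(first == std::string::npos) continue; // blank line
    const std::string::size_type last = line.find_last_not_of(" \t\r");
    names.push_back(line.substr(first, last - first + 1));
  }
  return names;
}

// The TLDR rule: 2D and validation histograms only when reading CCQENu tuples
bool makeExtraCCQEHistograms(const std::string& treeName)
{
  return treeName == "CCQE";
}
```

Calling `readFileNames(std::cin)` would let a playlist (`runEventLoop < playlist.txt`) and a single debugging file (`echo oneFile.root | runEventLoop`) go through the same code path with no if statement.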