Exercise A: The Event Loop
An event loop reads independent neutrino interactions and reduces them to some simplified format. Every data preservation analysis starts with one or more event loops over so-called AnaTuples. There are actually other event loops that were used to produce these AnaTuples, but they have already been run by the production team.
Our macro stage event loop will reduce AnaTuples to the ingredients we need for a cross section. We're going to learn to run and edit a simple 1-stage macro stage event loop. Analyses that produce multi-dimensional cross sections or have other unique computing needs might split their event loops into more stages to manage memory requirements or run time. At the end of this exercise, you'll have a .root file with cross section ingredients from the runEventLoop program you updated.
Alex's talk yesterday explained what a cross section is, why cross sections are important to measure, and one common procedure for extracting a cross section from our data using a Monte Carlo simulation. How do we turn his formula into a program for reducing AnaTuples to the histograms we'll need to extract a cross section?
Since i and j are true and reco bin indices, each symbol in Alex's figure is a histogram. The efficiency * acceptance correction will turn out to be the ratio of two different histograms. We can sort these histograms along two axes:
- whether reco or true variables are on their axes
- which cuts are applied
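For reference, one commonly used form of the extraction that matches the ingredients named in this exercise is written out below. It is a sketch under assumptions, not a reproduction of the figure from the talk: the notation is chosen here, with i a true bin and j a reco bin, U_ij standing for the unfolding derived from the migration matrix, T for the number of targets, and Δx_i for the bin width. T and Δx_i are not discussed in this exercise and are included only as typical normalization factors.

```latex
\left(\frac{d\sigma}{dx}\right)_i
  = \frac{\sum_j U_{ij}\,\bigl(N^{\mathrm{data}}_j - N^{\mathrm{bkg}}_j\bigr)}
         {\epsilon_i\,\Phi_i\,T\,\Delta x_i},
\qquad
\epsilon_i = \frac{N^{\mathrm{eff.\ numerator}}_i}{N^{\mathrm{eff.\ denominator}}_i}
```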
The DATA histogram will also be unique because it comes from the data sample. All other cross section ingredients in this tutorial will be derived from a Monte Carlo simulation of the MINERvA experiment.
- The data can only be measured and selected in reco variables.
- The backgrounds are subtracted from the data, so they must also be binned in reco variables and pass the reco selection.
- The efficiency numerator characterizes signal events that pass the reco selection. It is applied to the data after the migration matrix, so it must be calculated in true variables.
- The efficiency denominator counts all events that pass the signal definition, even if they fail the reco selection. The AnaTool that produced the MasterAnaDev AnaTuples we'll be using already threw out some signal events, so the efficiency denominator has its own larger tuple of events that we could have detected.
- The migration matrix converts reco variables to true variables, so it is a 2D histogram with one axis of each type. We're going to apply it before the efficiency * acceptance correction.
- The flux is simulated and constrained independently from our analysis. It is provided in true variables.
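To make the order of operations in the list above concrete, here is a minimal Python sketch with plain lists standing in for histograms. Everything in it is an assumption for illustration: the function names, the 3-bin toy numbers, the number of targets, and the bin widths are made up, and a simple matrix multiplication stands in for whatever unfolding procedure a real analysis derives from the migration matrix.

```python
# Toy cross section extraction with 3 bins. All numbers are invented.

def efficiency(numerator, denominator):
    """Bin-by-bin ratio of two histograms binned in true variables."""
    return [n / d if d > 0 else 0.0 for n, d in zip(numerator, denominator)]

def unfold(unfolding, reco_hist):
    """Map a reco-binned histogram to true bins: true_i = sum_j U[i][j] * reco_j."""
    return [sum(u * r for u, r in zip(row, reco_hist)) for row in unfolding]

def extract_cross_section(data, background, unfolding, eff_num, eff_den,
                          flux, n_targets, bin_widths):
    # 1. Subtract the background from the data, both in reco bins.
    signal_reco = [d - b for d, b in zip(data, background)]
    # 2. Move from reco bins to true bins before any efficiency correction.
    signal_true = unfold(unfolding, signal_reco)
    # 3. Correct for efficiency, then normalize by flux, targets, and bin width.
    eff = efficiency(eff_num, eff_den)
    return [s / (e * f * n_targets * w) if e > 0 else 0.0
            for s, e, f, w in zip(signal_true, eff, flux, bin_widths)]

data = [120.0, 200.0, 80.0]        # reco bins, from the data sample
background = [20.0, 50.0, 30.0]    # reco bins, from MC
# unfolding[i][j]: assumed probability that an event in reco bin j belongs to true bin i
unfolding = [[0.9, 0.1, 0.0],
             [0.1, 0.8, 0.1],
             [0.0, 0.1, 0.9]]
eff_num = [50.0, 60.0, 20.0]       # true bins, signal that passes the reco selection
eff_den = [100.0, 100.0, 50.0]     # true bins, all signal
flux = [2.0, 2.0, 2.0]             # true bins, arbitrary units
xsec = extract_cross_section(data, background, unfolding, eff_num, eff_den,
                             flux, n_targets=10.0, bin_widths=[1.0, 1.0, 2.0])
print(xsec)
```

The point of the sketch is the bookkeeping: every input to steps 1 and 2 is binned in reco variables, and every input to step 3 is binned in true variables.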
runEventLoop calculates the cross section ingredients in one pass over the data and MC samples. The flux our detector received changed throughout data taking, so we split our data sample, and our MC sample to match, into flux periods called playlists. We're going to be analyzing the minervame1A playlist throughout this tutorial. runEventLoop is designed to process only 1 playlist at a time, and we have to tell it which files to process on the command line like this:
runEventLoop data.txt mc.txt
If you hit the TAB key after typing runEventLoop, the shell will list suitable file lists. This is a feature of the bash shell called auto-completion. If you run runEventLoop with no arguments, it will tell you about its input and output. This is typical behavior for programs in UNIX-based operating systems. If you read its "help text" closely, you'll notice that runEventLoop also looks for some environment variables.
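The command line conventions described above can be illustrated with a small Python sketch. It is not the runEventLoop source. The program name, the file list format (one path per line, with # comments), and the environment variable SKETCH_SKIP_SYSTEMATICS are all hypothetical stand-ins chosen for this example.

```python
import os
import sys
import tempfile

USAGE = (
    "usage: run_event_loop_sketch <data file list> <mc file list>\n"
    "Each file list is a text file with one input file path per line.\n"
    "Environment: SKETCH_SKIP_SYSTEMATICS=1 turns off systematics (hypothetical name).\n"
)

def read_file_list(path):
    """Return the paths listed in a text file, skipping blank lines and # comments."""
    with open(path) as handle:
        lines = [line.strip() for line in handle]
    return [line for line in lines if line and not line.startswith("#")]

def main(argv, environ):
    """Return a shell-style exit code: 0 on success, 1 when the arguments are wrong."""
    if len(argv) != 3:
        # With missing or extra arguments, explain the interface instead of guessing.
        sys.stderr.write(USAGE)
        return 1
    data_files = read_file_list(argv[1])
    mc_files = read_file_list(argv[2])
    skip_systematics = environ.get("SKETCH_SKIP_SYSTEMATICS", "0") == "1"
    print(f"{len(data_files)} data files, {len(mc_files)} MC files, "
          f"systematics {'off' if skip_systematics else 'on'}")
    return 0

# Demo: write two tiny file lists to a temporary directory, then call main() both ways.
with tempfile.TemporaryDirectory() as tmp:
    data_list = os.path.join(tmp, "data.txt")
    mc_list = os.path.join(tmp, "mc.txt")
    with open(data_list, "w") as handle:
        handle.write("# toy data file list\n/path/to/data_1.root\n/path/to/data_2.root\n")
    with open(mc_list, "w") as handle:
        handle.write("/path/to/mc_1.root\n\n")
    no_args_code = main(["run_event_loop_sketch"], {})
    ok_code = main(["run_event_loop_sketch", data_list, mc_list],
                   {"SKETCH_SKIP_SYSTEMATICS": "1"})
    parsed_data = read_file_list(data_list)
```

In a real script the last step would be `sys.exit(main(sys.argv, os.environ))`. Passing `argv` and `environ` in as arguments keeps the sketch easy to try without touching the actual shell environment.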
The event loop itself is split into loops over 3 chains of TTrees for data, Monte Carlo, and the so-called "Truth Tree" that's used for the efficiency denominator.
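The three-loop structure can be sketched in Python with lists of dictionaries standing in for the chains of TTrees. This is a toy under stated assumptions, not the runEventLoop implementation: the event fields (`reco`, `true`, `reco_ok`, `signal`, `weight`), the bin edges, and the function names are invented here, and systematic uncertainties are ignored entirely.

```python
EDGES = [0.0, 1.0, 2.0, 3.0]   # hypothetical bin edges, shared by reco and true axes
N_BINS = len(EDGES) - 1

def bin_index(value):
    """Return the bin containing value, or None for underflow and overflow."""
    for i in range(N_BINS):
        if EDGES[i] <= value < EDGES[i + 1]:
            return i
    return None

def loop_data(events):
    """Data loop: selected events, binned in reco variables."""
    hist = [0.0] * N_BINS
    for evt in events:
        r = bin_index(evt["reco"])
        if evt["reco_ok"] and r is not None:
            hist[r] += 1.0
    return hist

def loop_mc(events):
    """MC loop: background (reco), efficiency numerator (true), migration[true][reco]."""
    background = [0.0] * N_BINS
    eff_num = [0.0] * N_BINS
    migration = [[0.0] * N_BINS for _ in range(N_BINS)]
    for evt in events:
        if not evt["reco_ok"]:
            continue  # every histogram in this loop requires the reco selection
        w = evt.get("weight", 1.0)
        r, t = bin_index(evt["reco"]), bin_index(evt["true"])
        if evt["signal"]:
            if t is not None:
                eff_num[t] += w
                if r is not None:
                    migration[t][r] += w
        elif r is not None:
            background[r] += w
    return background, eff_num, migration

def loop_truth(events):
    """Truth loop: all signal events, whether or not they were reconstructed."""
    eff_den = [0.0] * N_BINS
    for evt in events:
        t = bin_index(evt["true"])
        if evt["signal"] and t is not None:
            eff_den[t] += evt.get("weight", 1.0)
    return eff_den

data_events = [{"reco": 0.5, "reco_ok": True}, {"reco": 1.5, "reco_ok": True},
               {"reco": 1.7, "reco_ok": False}, {"reco": 2.2, "reco_ok": True}]
mc_events = [{"reco": 0.4, "true": 0.6, "signal": True, "reco_ok": True},
             {"reco": 1.2, "true": 0.9, "signal": True, "reco_ok": True},
             {"reco": 1.5, "true": 1.4, "signal": False, "reco_ok": True},
             {"reco": 2.5, "true": 2.5, "signal": True, "reco_ok": False}]
truth_events = [{"true": 0.6, "signal": True}, {"true": 0.9, "signal": True},
                {"true": 2.5, "signal": True}, {"true": 0.2, "signal": True},
                {"true": 1.4, "signal": False}]

data_hist = loop_data(data_events)
background, eff_num, migration = loop_mc(mc_events)
eff_den = loop_truth(truth_events)
```

Note that the truth loop never looks at reco quantities, which is why it can count signal events that the reco selection, or the tuple-making step before it, threw away.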