Storage of job execution environment in output files [2/4] #497
base: develop
…bncode GIT version
- need to include the GIT extraction macros from `sbnobj`
- added a new job configuration and new directory
- added a new job configuration and new directory
- a new job configuration for dumping the SBN metadata
- metadata "framework":
  - added algorithm to extract SBN metadata
  - added output plugin to write the metadata into art output
  - added the output module to dump that metadata from files to screen
- package metadata:
  - added macros for extraction of the version of this repository
  - added plugin for reporting versions of `sbncode` and `sbnobj`
- output module dumping SBN metadata from an input file to screen
- template source for `sbncode` repository version library
- template interface for `sbncode` repository version library
- implementation of the SBN metadata extraction algorithm
- interface of the algorithm for extracting SBN metadata
- helpers to add information to the data product class (which we keep simple...r)
- helpers to add information to the data product class (which we keep simple...r)
- interface of the art tool to collect version information from the various repositories
- complete documentation of the system in Doxygen format
- output module plugin to save SBN metadata into art output
- `sbncode`-specific tool reporting the version of `sbncode` and `sbnobj` repositories
These were written when I thought that libraries would register the repository versions as soon as they were loaded. Eventually I moved to dynamically loading plugin objects when the information is needed.
I love this! But I need to digest... so give me a bit of time...
Hi @PetrilloAtWork, @SFBayLaser, and @absolution1 - just pinging to check in on this PR. I've merged develop into the branch - does this still look good to you? Would be great if we could merge this. Thanks!
This PR implements a system that stores information about the execution environment of each job ("SBN metadata") into some of the output files.
From the user's perspective, adding a "plugin" to the configuration of the job output module (`RootOutput`) will have this metadata saved into all the art/ROOT output files and, if available, into the `TFileService` file.
Users can read back the metadata in art/ROOT files with a special output module (configuration `dump_sbnjobmetadata.fcl` provided in `sbncode`), and can unpack the one in the `TFileService` file with `EvtInfo->dump(std::cout)` provided that the dictionary of the class is available.
What code is where
The system includes in `sbnobj` the data class holding the metadata (`sbn::JobEnvironmentInfo`) and the small CMake library to extract the GIT branch version, and in `sbncode` the art-based modules and plugins and the job configuration file for dumping the metadata from art/ROOT files, plus extensive documentation of the whole system in Doxygen format.
The system requires a modification in each repository that wants the GIT branch version extracted.
This PR provides those modifications for `icarusalg` and `icaruscode`. In addition, since ICARUS uses a template `RootOutput` configuration, that configuration has been changed to include the plugin that will save the information.
PR summary
What is included in the metadata
This PR includes in the metadata:
I attach an example of output from `dump_sbnmetadata.fcl` on a test file created by two "empty" jobs in chain.
If art/ROOT input files contain SBN metadata, that will also be replicated in the art/ROOT output file(s), but not in the `TFileService` one.
Some details of how this works and its limits
The system writes its metadata in a Results-level data product, which is unlike the ones we are used to in that it is accessible basically only by the output modules. While this makes some sense, it also makes the programmatic usage of the metadata much, much harder.
The extraction of the execution environment information is pretty straightforward.
Conversely, the extraction of the GIT branch information is not.
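In outline, the GIT part is a collection pattern: each repository contributes a small reporter that returns its recorded version, and one algorithm calls every reporter it was configured with. A minimal, art-free sketch of that idea follows. `RepositoryVersion`, `VersionReporter`, `VersionCollector`, `makeExampleCollector` and the repository names and version strings are all hypothetical illustrations, not the interfaces introduced by this PR.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record: one repository and the version string stored for it
// at build time (for example, the output of `git describe`).
struct RepositoryVersion {
  std::string repository;
  std::string version;
};

// Hypothetical stand-in for the per-repository tool: a plain callable.
using VersionReporter = std::function<RepositoryVersion()>;

// Hypothetical stand-in for the extraction algorithm: it holds the list of
// reporters it was told about and queries each of them in turn.
class VersionCollector {
public:
  void addReporter(VersionReporter reporter) {
    fReporters.push_back(std::move(reporter));
  }

  // Returns a repository-name -> version map filled from all reporters.
  std::map<std::string, std::string> collect() const {
    std::map<std::string, std::string> versions;
    for (auto const& report : fReporters) {
      RepositoryVersion const info = report();
      versions[info.repository] = info.version;
    }
    return versions;
  }

private:
  std::vector<VersionReporter> fReporters;
};

// Builds a collector with two made-up reporters; names and versions are
// placeholders, not real repository versions.
VersionCollector makeExampleCollector() {
  VersionCollector collector;
  collector.addReporter([] { return RepositoryVersion{ "repoA", "v1_00_00" }; });
  collector.addReporter([] { return RepositoryVersion{ "repoB", "v1_00_00-3-gabc1234" }; });
  return collector;
}
```

Calling `makeExampleCollector().collect()` yields a two-entry map keyed by repository name. The sketch replaces the dynamically loaded plugin objects with plain callables, so it shows only the collection step and none of the loading or storage.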
The pattern of the system is that the CMake build files of a repository need to include instructions to extract the branch information from GIT (`git describe`) and to put that into a C++ shared library (it might have been something else, with different pros and cons). This is a fairly simple set of instructions, but long enough that the PR provides it in a `SBNutil.cmake` library in `sbnobj` that the repository `CMakeLists.txt` needs to include.
Then, the art-aware repositories (so, for example, `sbncode` but not `sbnobj`) need to define an art tool that links to the metadata and returns it.
An algorithm class, `sbn::JobEnvironmentInfoExtractor`, is provided in `sbncode` which calls all the tools it knows (from its configuration) and fills the list of metadata.
Finally, the `RootOutput` plugin `SaveJobEnvironment` (`sbncode`) is the front-end executing that algorithm (and passing it the list of known repositories/tools) and storing the result into the output.
Another important limit is that the CMake macros used here are stored in `sbnobj`, which is the lowest level repository we have in SBN. For one, it is questionable that they belong here; and, more fatally, there are repositories which do not depend on `sbnobj` (e.g. `sbnanaobj`) and that as a consequence can't use them. `sbnana` would have been a natural candidate for inclusion in the system, but it seems unlikely given that it does not depend on `sbnobj` either (it does depend on `sbndata`, which is a questionable workaround but still one; `sbnanaobj`, on the other hand, depends almost only on ROOT, and by design).
This system is extensively described in the `SBNsourceMetadataSystem.dox` file.
Testing
The system, in the final incarnation in this PR, has been tested with eight combinations of builds including or not including `sbnobj`, `sbncode`, `icarusalg` and `icaruscode`.
In the process, a few defects have been found and corrected in the build scripts (typically, missing stuff that was overlooked because it was being fortuitously provided by some other package in the build).
Review
I am calling for the review:
This system is complicated, and honestly should have been introduced at art level.
A lot of design was involved, and there were a lot of choices made in the process.
The sooner these choices are pondered, tested, discussed and challenged, the better: it is unlikely that design reconsideration will result in a backward-compatible change. Backward compatibility here is not a strong requirement, but it does not hurt.
I don't know how badly this interacts with the Spack-based build system. Provided that CMake is still there, this system should still work, although not necessarily in a satisfactory way.
For example, the dump of the environment implies that all UPS products are trackable; this might not be the case with Spack any more. However, when it is the time, the system can be tuned to the new build system.
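To illustrate what a dump of the environment amounts to, here is a minimal sketch that copies the process environment into a map, optionally keeping only the variables whose name starts with a given prefix. The function names `splitEnvEntry` and `captureEnvironment` are hypothetical and this is not the code in the PR; it assumes a POSIX system, where `environ` is available. Which variables the actual system records is not stated here; under UPS one might filter on a prefix such as `SETUP_`, but treat that choice as an assumption.

```cpp
#include <map>
#include <string>
#include <utility>

// POSIX: the process environment, a null-terminated array of "NAME=value".
extern char** environ;

// Splits one "NAME=value" entry at the first '=';
// an entry without '=' gets an empty value.
std::pair<std::string, std::string> splitEnvEntry(std::string const& entry) {
  auto const pos = entry.find('=');
  if (pos == std::string::npos) return { entry, "" };
  return { entry.substr(0, pos), entry.substr(pos + 1) };
}

// Copies the environment variables into a map (hypothetical helper).
// Only variables whose name starts with `prefix` are kept;
// an empty prefix keeps all of them.
std::map<std::string, std::string> captureEnvironment(std::string const& prefix = "") {
  std::map<std::string, std::string> vars;
  for (char** p = environ; p && *p; ++p) {
    auto entry = splitEnvEntry(*p);
    if (entry.first.compare(0, prefix.size(), prefix) == 0)
      vars.insert(std::move(entry));
  }
  return vars;
}
```

A snapshot like this records only what the shell exported, so it tracks software versions just as far as the build and setup system publishes them through environment variables. That is the part that would need tuning for a different build system.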