Replies: 2 comments
One of the big advantages of having A side-effect of this might be the invention of many "data products" that are not intended for persistence, but just for communication from a "track building algorithm" to an algorithm like |
I wonder if, instead of implementing histogram-filling as a fold, it might make more sense in this case to implement it as an observer. These histograms would then no longer be data products. For example, the

    // Job-level histograms; the pointers are owned by the file service.
    struct histograms {
      TH1F* fLength_1stTrack;
      TH1F* fLength_2ndTrack;
      TH1F* fLength_3rdTrack;
      TH1F* fLength_4thTrack;
      TH1F* fLength_5thTrack;
    };

    // Invoked once per job: books every histogram through the file service.
    histograms initialize_histograms(tfile_service* tfs)
    {
      return {
        .fLength_1stTrack = tfs->make<TH1F>("fLength_Track1", "Muon Track Length", 100, 0, 100),
        .fLength_2ndTrack = tfs->make<TH1F>("fLength_Track2", "2nd Track Length", 100, 0, 100),
        .fLength_3rdTrack = tfs->make<TH1F>("fLength_Track3", "3rd Track Length", 100, 0, 100),
        .fLength_4thTrack = tfs->make<TH1F>("fLength_Track4", "4th Track Length", 100, 0, 100),
        .fLength_5thTrack = tfs->make<TH1F>("fLength_Track5", "5th Track Length", 100, 0, 100)
      };
    }

But the

    // Invoked once per data product: fills only the ranks that are present.
    void fill_histograms(TrackPair const& trackpair, histograms& hs)
    {
      if (trackpair.size() > 0) hs.fLength_1stTrack->Fill(trackpair[0].second);
      if (trackpair.size() > 1) hs.fLength_2ndTrack->Fill(trackpair[1].second);
      if (trackpair.size() > 2) hs.fLength_3rdTrack->Fill(trackpair[2].second);
      if (trackpair.size() > 3) hs.fLength_4thTrack->Fill(trackpair[3].second);
      if (trackpair.size() > 4) hs.fLength_5thTrack->Fill(trackpair[4].second);
    }

    REGISTER(m) {
      auto job_level_histograms = m.with(initialize_histograms).using_resource<tfile_service>();
      m.with(fill_histograms).using_resource(job_level_histograms).observe("best_track_pairs_ever");
    }

This way, the registration-code author does not need to know about the allowed thread-safety |
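The observer idea in this comment can be tried out without the framework or ROOT. The sketch below is only an illustration under stated assumptions: `Hist1D` is a stand-in for `TH1F`, `TrackPair` is assumed to be a vector of (id, length) pairs because the code above indexes it and reads `.second`, and `observe_all` is a hypothetical driver standing in for whatever the framework would do. It also swaps the five named members for an array, which is a change from the code above rather than a restatement of it.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for TH1F (assumption): counts entries and keeps a running sum.
struct Hist1D {
  std::size_t entries = 0;
  double sum = 0.0;
  void Fill(double x) { ++entries; sum += x; }
};

// Assumed shape of TrackPair: (track id, track length) pairs.
using TrackPair = std::vector<std::pair<int, double>>;

// One histogram per track rank, replacing the five named members.
struct histograms {
  std::array<Hist1D, 5> length_by_rank;
};

// Observer-style function: reads a data product, updates the resource,
// and returns nothing, so it creates no new data product.
void fill_histograms(TrackPair const& trackpair, histograms& hs)
{
  std::size_t const n = std::min(trackpair.size(), hs.length_by_rank.size());
  for (std::size_t i = 0; i < n; ++i) {
    hs.length_by_rank[i].Fill(trackpair[i].second);
  }
}

// Hypothetical driver: applies the observer to every event in turn.
histograms observe_all(std::vector<TrackPair> const& events)
{
  histograms hs;
  for (auto const& trackpair : events) {
    fill_histograms(trackpair, hs);
  }
  // Each event contributes at most one entry per rank.
  assert(hs.length_by_rank[0].entries <= events.size());
  return hs;
}
```

Because `fill_histograms` returns `void`, nothing downstream can depend on it, which is the property that separates an observer from a fold in this discussion.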
Our approach thus far has been to pursue a functional programming design, i.e. physics algorithms should be expressed as functions with no side effects. This is a different programming paradigm than the HEP community is used to, but it has many benefits regarding thread-safety, clarity of what each function is supposed to do, etc. However, past frameworks have provided utilities that rely on the management of global state (e.g. TFileService). The new framework must be able to provide similar capabilities that are consistent with a functional programming approach.

One avenue toward solving this problem is to use a token-based messaging system, which we're already considering based on oneTBB's flow graph.

art-based approach
Consider the module PrimaryVertexFinder in LArSoft that uses the art::TFileService to create histograms at the beginning of the job:

During the event loop, the histograms are filled:
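The two phases just described (booking at the start of the job, filling per event) can be illustrated with a small stand-alone sketch. This is not the LArSoft source: `Hist1D`, `FileService`, and `VertexFinderSketch` are hypothetical stand-ins for `TH1F`, `art::TFileService`, and the module, and the signatures are simplified so the example compiles without art or ROOT.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Stand-in for a ROOT histogram (assumption): records the filled values.
struct Hist1D {
  std::vector<double> values;
  void Fill(double x) { values.push_back(x); }
};

// Stand-in for the file service: owns every histogram it hands out,
// so it is shared, mutable state across all modules that use it.
class FileService {
public:
  Hist1D* make(std::string const& name)
  {
    auto& slot = owned_[name];
    if (!slot) slot = std::make_unique<Hist1D>();
    return slot.get();
  }
  std::size_t size() const { return owned_.size(); }

private:
  std::map<std::string, std::unique_ptr<Hist1D>> owned_;
};

// Hypothetical module mirroring the beginJob/produce split.
class VertexFinderSketch {
public:
  // Once per job: books histograms through the shared service.
  void beginJob(FileService& tfs)
  {
    fLength_1stTrack = tfs.make("fLength_Track1");
    fLength_2ndTrack = tfs.make("fLength_Track2");
  }

  // Once per event: mutates module state through the stored pointers.
  void produce(std::vector<double> const& track_lengths)
  {
    assert(fLength_1stTrack && fLength_2ndTrack);  // beginJob must run first
    if (track_lengths.size() > 0) fLength_1stTrack->Fill(track_lengths[0]);
    if (track_lengths.size() > 1) fLength_2ndTrack->Fill(track_lengths[1]);
  }

  Hist1D const& first() const { return *fLength_1stTrack; }

private:
  Hist1D* fLength_1stTrack = nullptr;
  Hist1D* fLength_2ndTrack = nullptr;
};
```

Both member functions write to state that outlives the call: `beginJob` changes the service, and `produce` changes histograms reached through data members.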
This use case illustrates two thread-safety issues:
- The beginJob() function is invoked once per job, but the TFileService::make function calls in it cannot be executed in parallel with calls from other modules that use the TFileService.
- The produce(art::Event&) member function is invoked perhaps many times per job. Although each module of type PrimaryVertexFinder contains its own data members such as fLength_1stTrack, the function call fLength_1stTrack->Fill(...) cannot be executed in parallel for multiple events. In other words, PrimaryVertexFinder::produce may not be executed concurrently for multiple events in flight.

To handle these multithreading issues, any member functions of art modules using TFileService must be serialized with each other, even across all modules using the TFileService. This is overkill, but it was a reasonable approach to ensure thread-safety.

Token-based approach
With a token-based system, it's possible this serialization could be relaxed. For example, assuming that a tbb::flow::resource_limiter_node exists (RFC forthcoming), we could create some type of policy:

The user code might look like a reduction whose result is initialized with the tfile_service:

The above implies that tfile_service would be a limited resource, allowing only one thread to access it at a time. The fill_histograms function would also be serialized, but not with respect to any other module, which is an improvement compared to art.

I'm not entirely happy with this approach yet, but wanted to get it down before I forgot.
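The serialization policy described in this post can be approximated with plain mutexes to show what "limited resource" means in practice. The sketch below is an assumption-heavy illustration, not the proposed design: `limited_resource` is a hypothetical helper (it is not a oneTBB API, and `tbb::flow::resource_limiter_node` is not used because it does not exist yet), while `Hist1D`, `tfile_service`, and the two-member `histograms` are simplified stand-ins.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: hands out exclusive access to one resource at a time.
template <typename T>
class limited_resource {
public:
  limited_resource() = default;
  explicit limited_resource(T value) : value_(std::move(value)) {}

  // Runs f while holding the only "token" for this resource.
  template <typename F>
  auto access(F&& f)
  {
    std::lock_guard<std::mutex> token(mutex_);
    return std::forward<F>(f)(value_);
  }

private:
  std::mutex mutex_;
  T value_{};
};

// Stand-ins (assumptions) for the histogram and file-service types.
struct Hist1D {
  std::vector<double> values;
  void Fill(double x) { values.push_back(x); }
};

struct tfile_service {
  std::vector<std::string> names;
  Hist1D make(std::string name)
  {
    assert(!name.empty());
    names.push_back(std::move(name));
    return {};
  }
};

struct histograms {
  Hist1D first;
  Hist1D second;
};

// Booking is serialized on the service shared by all modules.
histograms initialize_histograms(limited_resource<tfile_service>& tfs)
{
  return tfs.access([](tfile_service& s) {
    return histograms{s.make("fLength_Track1"), s.make("fLength_Track2")};
  });
}

// Filling is serialized only on this module's own histograms.
void fill_histograms(std::vector<double> const& lengths,
                     limited_resource<histograms>& hs)
{
  hs.access([&](histograms& h) {
    if (lengths.size() > 0) h.first.Fill(lengths[0]);
    if (lengths.size() > 1) h.second.Fill(lengths[1]);
  });
}
```

Two modules that each hold their own `limited_resource<histograms>` never wait on each other while filling; they contend only while booking through the shared `limited_resource<tfile_service>`. A single lock shared by every module, by contrast, would serialize all of these calls against one another.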