Replies: 6 comments 3 replies
-
SubsystemsRequired
Nice to have
|
-
Related to: |
-
Responsible developers: @marcpaterno and @sabasehrish. |
-
Here are some types of chunking needed for DUNE and the implications chunking may have. The focus is on FD charge data and Wire-Cell implementations, so the list is definitely not comprehensive.
Terms
Wire-Cell charge waveform simulation
Basic transformation:
The
There are several types of chunking relevant to this simulation:
Wire-Cell charge waveform signal processing
Basic transformation:
Signal waveforms represent a reconstruction of the distribution of drifted ionization charge in (transverse) space vs time dimensions of each tomographic wire-plane view. The samples of a signal waveform are in units of number of (drifted) ionization electrons per tick per channel. The signal waveforms are highly sparse and can be represented in a space-efficient way either with sparse arrays or as compressed dense arrays (zero padding the sparse regions). There are two types of chunking that are relevant:
Wire-Cell charge sim+sigproc
As a special case, when both simulation and signal processing are needed, it is desired (at least for large scale production) to NOT expose
Wire-Cell 3D charge imaging
This process reconstructs, with coarse resolution, locations in space/time likely to contain ionization electron signal. It is a per-APA transformation and essentially a streaming algorithm. It is therefore robust against space-chunking at the APA level and against any reasonable time-chunk.
Wire-Cell charge cluster stitching
WC (and other) reconstruction chains form "clusters" of some type that represent a high resolution reconstruction of ionization locations. In WC, and for the case of compact (nominal, not extended) data, clusters are constructed first on a per-TPC basis. They are then "stitched" across the two TPCs of one APA and then across neighboring APAs. Each type of stitching requires assembling the chunk-level clusters so that the boundaries are spanned. This can be pair-wise for the 2TPC->APA stitching, and then all APA-level clusters can be assembled for the cross-APA stitching. Finding clusters from extended data poses a problem in the face of chunking, because a set of blobs that should become a single cluster may land on a chunk boundary. Some possible solutions:
Wire-Cell Charge-Light matching
Charge clusters and "flashes" reconstructed from the optical detection system must be matched in space and time in order to absolutely locate the cluster.
The DUNE FD design does not include optical boundaries at the TPC or APA level, so the matching is done with whole-detector charge and light information. Any prior chunking of these data must allow the required assembly. As with clustering, chunking in time may be required for input clusters and/or flashes, and similar solutions can be considered ("chunk and hope" vs "streaming alg").
Cross-chain merging
DUNE has multiple, independent reco chains. E.g., Wire-Cell and Pandora both split off after signal processing in order to implement different strategies. It is necessary to allow data products from one chain to "cross over" to another. This is needed for performing comparisons and so that one chain can simply take results from the other as input to form a subsequent hybrid chain.
Each consumer at the merge will impose some requirements related to the chunk boundaries of the data products from each stream. Even in the unlikely case that identical chunk boundaries exist on both streams, the node consuming the two streams may have special needs. E.g., it may need to consume a FIFO queue of some depth of data products from each stream. |
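To make the FIFO idea at the merge concrete, here is a minimal C++ sketch, not a description of any existing Wire-Cell or framework component. It assumes each data product can be summarized by a half-open time interval and that both streams arrive in time order. The names `Chunk`, `MergeNode`, `push_a` and `push_b` are hypothetical and introduced only for illustration.

```cpp
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical descriptor of one chunked data product:
// a half-open time interval [begin, end) in ticks.
struct Chunk {
  long begin;
  long end;
};

// True if the two half-open intervals share at least one tick.
inline bool overlaps(Chunk const& a, Chunk const& b)
{
  return a.begin < b.end && b.begin < a.end;
}

// Hypothetical merge node. It keeps a bounded FIFO per input stream and,
// for each newly arrived chunk, reports the buffered chunks of the other
// stream that overlap it in time.
class MergeNode {
public:
  using Pairs = std::vector<std::pair<Chunk, Chunk>>;

  explicit MergeNode(std::size_t depth) : depth_{depth} {}

  // New chunk from stream A; pairs are returned as (A chunk, B chunk).
  Pairs push_a(Chunk c)
  {
    Pairs out;
    for (auto const& b : b_) {
      if (overlaps(c, b)) out.emplace_back(c, b);
    }
    enqueue(a_, c);
    return out;
  }

  // New chunk from stream B; pairs are returned as (A chunk, B chunk).
  Pairs push_b(Chunk c)
  {
    Pairs out;
    for (auto const& a : a_) {
      if (overlaps(a, c)) out.emplace_back(a, c);
    }
    enqueue(b_, c);
    return out;
  }

private:
  // Append to the FIFO and drop the oldest entry once the depth is exceeded.
  void enqueue(std::deque<Chunk>& q, Chunk c)
  {
    q.push_back(c);
    if (q.size() > depth_) q.pop_front();
  }

  std::size_t depth_;
  std::deque<Chunk> a_;
  std::deque<Chunk> b_;
};
```

In this sketch an overlapping pair is reported when the later of its two chunks arrives, so a pair is lost if the earlier chunk has already been evicted. The required depth therefore depends on how far the two streams can drift apart, which is one example of a consumer-specific requirement on chunk boundaries.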
-
To provide some context, we discussed these slides at yesterday's meeting to start the discussion. |
-
Hi @brettviren . Saba and I are just getting back to looking at the |
-
DUNE US S&C R&D item 101
Data chunking is intended to allow processing of a logical data product that is too large to fit in memory at once. This demonstrator requires several things:
std::span<T> vs. std::vector<T> could imply that some data can be chunked for an algorithm and some cannot.
To produce a demonstrator, we are introducing the concept of a chunk-able data product (e.g. a sequence of waveforms). In general, a chunk-able data product will be a sequence of something.