CLu98/2024-07-24-codas-hep-columnar-data-analysis

 
 


Materials for CoDaS-HEP 2024 Columnar Data Analysis Tutorial

Abstract:

Data analysis languages and libraries like NumPy, MATLAB, R, IDL, ADL, and Julia are primarily interactive, with an array-at-a-time interface. Instead of running an entire analysis in a single loop, users perform each calculation step separately on whole arrays, which lets them inspect distributions at each stage.
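As a minimal sketch of the array-at-a-time style, the NumPy snippet below computes a transverse momentum for every entry in one step and then applies a cut in a second step. The arrays `px` and `py`, their values, and the threshold are made up for illustration and are not taken from the tutorial notebooks.

```python
import numpy as np

# Hypothetical momentum components, one entry per particle (illustrative values)
px = np.array([3.0, 1.0, -6.0, 0.5])
py = np.array([4.0, 1.0, 8.0, 0.5])

# Step 1: one expression computes pT for all particles at once
pt = np.sqrt(px**2 + py**2)

# Step 2: a boolean mask selects entries above an arbitrary threshold
high_pt = pt[pt > 2.0]
```

Because `pt` exists as a full array between the two steps, it can be histogrammed or printed before deciding on the cut.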

However, these languages are typically limited to primitive data types, such as numbers and booleans. Variable-length and nested data structures, like varying numbers of particles per event, don’t fit well into this model. Fortunately, this limitation can be overcome.
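One common way around this limitation is to store all particles in a single flat array and keep a second array of offsets that marks where each event begins and ends. The sketch below uses plain NumPy to show the idea; the variable names and values are illustrative assumptions, and it is not meant to describe the exact internals of any particular library.

```python
import numpy as np

# Three hypothetical events with 3, 0, and 2 particles (illustrative pT values)
content = np.array([50.1, 32.4, 12.0, 41.7, 8.3])  # all particles, flattened
offsets = np.array([0, 3, 3, 5])                   # event i spans offsets[i]:offsets[i+1]

# Number of particles in each event
counts = np.diff(offsets)

# Label every particle with the index of the event it belongs to
event_index = np.repeat(np.arange(len(counts)), counts)

# Per-event sum of pT, computed without a Python loop over events
sum_pt = np.bincount(event_index, weights=content, minlength=len(counts))
```

The empty second event needs no special handling: it contributes a count of zero and a sum of zero.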

In this tutorial, we will introduce awkward-array, explore the concepts of columnar data structures, and demonstrate how to leverage them in data analysis. For example, we’ll show how to compute combinatorics (quantities depending on combinations of particles) without using any for loops.
