This repository was archived by the owner on Jun 21, 2022. It is now read-only.

Request for .frompandas() function #215

@NumesSanguis

Your documentation often mentions awkward.topandas(), but what about the other direction, an awkward.frompandas()?

  • Say I create a Pandas DataFrame from a .csv file, do some filtering, and then want to continue with an awkward array (for example, to add a named NumPy array). Is this possible?
  • Or say a Table (loaded from HDF5) is nested inside a Jagged Array. I want to give this Table as a DataFrame to a colleague who is only familiar with Pandas. After he/she is done, I want to store it back in the HDF5 file as a Jagged Array. Can this be done?

I looked in the Python file where .topandas() is defined:
https://github.com/scikit-hep/awkward-array/blob/d942fb8d4fae5e1dec35c70938e24c05207b3f31/awkward/util.py#L213
but found nothing there about loading DataFrames.

I also tried the following code, but it did not give the result I expected:

import awkward
import pandas as pd

df = pd.DataFrame({"foo": [2, 8], "bar": [0.3, -0.9]})
print(type(df))
# <class 'pandas.core.frame.DataFrame'>
print(df.head())
#    foo  bar
# 0    2  0.3
# 1    8 -0.9

af = awkward.fromiter(df)
print(af)
# ['foo' 'bar']

df_awk = awkward.topandas(af, flatten=True)
print(type(df_awk))
# <class 'pandas.core.series.Series'>
print(df_awk.head())
# 0    foo
# 1    bar
# dtype: object

Applying .fromiter() only gets the column names.
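
The result above is consistent with how iteration over a DataFrame works: iterating yields the column labels, not the rows. A minimal sketch using only pandas illustrates this and shows two row- or column-oriented views of the same data. Whether awkward.fromiter() or awkward.Table accepts either of these views is an assumption not verified here; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"foo": [2, 8], "bar": [0.3, -0.9]})

# Iterating over a DataFrame yields its column labels, which is
# what any generic "from iterable" constructor will see.
column_names = list(df)

# Row-oriented view: one dict per row.
records = df.to_dict("records")

# Column-oriented view: one NumPy array per column name.
columns = {name: df[name].to_numpy() for name in df.columns}

# The column-oriented view is enough to rebuild the DataFrame.
df_roundtrip = pd.DataFrame(columns)

print(column_names)
print(records)
print(df_roundtrip.equals(df))
```

If awkward can build a table from a list of per-row dicts or from named NumPy arrays, one of these two views may serve as the input; that part is the open question of this issue.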

TL;DR: How do I convert a Pandas DataFrame to an awkward-array and vice versa?
