reduce on GroupedDTable of DTable of DataFrames returns NamedTuple #65

@schlichtanders

I think it should return a DataFrame, preserving the inner type.

Here is an example:

using Distributed
# add two further julia processes which could run on other machines
addprocs(2, exeflags="--threads=2")
# Distributed.@everywhere execute code on all machines
@everywhere using Dagger  # needed for all_processors
# Dagger uses both Threads and Machines as processes
Dagger.all_processors()

using DTables, DataFrames, CSV

url = "https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv"
files = [url, url, url, url, url]

d = DTable(DataFrame ∘ CSV.File ∘ download, files)
g = DTables.groupby(d, :species)
r = reduce(+, g, cols=[:sepal_width])
fetch(r)
# returns
# (species = String15["virginica", "setosa", "versicolor"], result_sepal_width = [743.5, 856.9999999999998, 692.4999999999995])
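
A possible stopgap, as a sketch only and not something DTables documents for this case: the value returned above is a NamedTuple of equal-length vectors, which DataFrames accepts as a column table, so it can be wrapped by hand. The `nt` literal below is a stand-in for `fetch(r)`, with plain `String` in place of `String15`.

```julia
using DataFrames

# Stand-in for the NamedTuple returned by fetch(r) in the example above
# (assumption: plain String instead of String15, values copied from the output)
nt = (species = ["virginica", "setosa", "versicolor"],
      result_sepal_width = [743.5, 856.9999999999998, 692.4999999999995])

# A NamedTuple of equal-length vectors is a valid column table,
# so the DataFrame constructor takes it directly
df = DataFrame(nt)
```

This only converts the result after the fact; it does not change what `reduce` itself returns.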
