-
I want to perform heaps of Mann-Whitney U tests given two DataArrays. Here's a minimal example:

    import numpy as np
    import scipy.stats
    import xarray as xr

    # Set a seed for reproducibility
    np.random.seed(0)

    # Define the dimensions and shape
    samples_dim = 20
    season_dim = 4
    category_dim = 3

    # Coordinates
    samples_coords = np.arange(samples_dim)
    season_coords = ['winter', 'spring', 'summer', 'autumn']
    category_coords = ['fish', 'mammals', 'birds']

    # Create the data for the arrays
    data1 = np.random.rand(samples_dim, season_dim, category_dim)
    data2 = np.random.rand(samples_dim, season_dim, category_dim)

    # Create the xarray DataArrays
    da1 = xr.DataArray(data1, dims=("samples", "season", "category"), coords=[samples_coords, season_coords, category_coords])
    da2 = xr.DataArray(data2, dims=("samples", "season", "category"), coords=[samples_coords, season_coords, category_coords])

    def MWU(x, y, dim):
        ''' xarray wrapper for scipy.stats.mannwhitneyu '''
        return xr.apply_ufunc(
            scipy.stats.mannwhitneyu,
            x,
            y,
            input_core_dims=[[dim], [dim]],
            output_core_dims=[[], []],
            vectorize=False  # I don't want to vectorize because speed is important
        )

    MWU(da1, da2, dim='samples')

This does not work as expected.
Note that the following works:

    scipy.stats.mannwhitneyu(da1.values, da2.values, axis=0)

This would probably be my fallback solution ;) Thanks for your help! :)
-
@nicrie The core dimensions would need to be specified this way:

    input_core_dims=[["samples", "season", "category"], ["samples", "season", "category"]],
    output_core_dims=[["season", "category"], ["season", "category"]],

As the target function works on all dimensions, they are actually all input core dimensions. For the output you just have to specify the remaining dimensions.
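For completeness, here is a minimal sketch of how those two arguments could be plugged into a full call. The helper name `mwu_all_core` and the `default_rng` test data are assumptions made for illustration, not part of the original question. The test dimension is listed first so that SciPy's default `axis=0` reduces over it.

```python
import numpy as np
import scipy.stats
import xarray as xr

# Illustrative random data with the same layout as in the question.
rng = np.random.default_rng(0)
dims = ("samples", "season", "category")
coords = {
    "samples": np.arange(20),
    "season": ["winter", "spring", "summer", "autumn"],
    "category": ["fish", "mammals", "birds"],
}
da1 = xr.DataArray(rng.random((20, 4, 3)), dims=dims, coords=coords)
da2 = xr.DataArray(rng.random((20, 4, 3)), dims=dims, coords=coords)


def mwu_all_core(x, y, dim):
    """Hypothetical helper: declare every dimension as a core dimension."""
    # Core dimensions are passed to the function in the order listed here,
    # so putting `dim` first lets scipy's default axis=0 reduce over it.
    other = [d for d in x.dims if d != dim]
    all_dims = [dim, *other]
    return xr.apply_ufunc(
        scipy.stats.mannwhitneyu,
        x,
        y,
        input_core_dims=[all_dims, all_dims],
        output_core_dims=[other, other],  # one entry per returned array
    )


# Two DataArrays come back: the U statistic and the p-value.
stat, pvalue = mwu_all_core(da1, da2, dim="samples")
print(stat.dims, pvalue.dims)
```

Both outputs should keep the `season` and `category` coordinates, since only `samples` is dropped.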
-
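The issue is this (from the documentation of the `input_core_dims` parameter): core dimensions are automatically moved to the last axes of the input variables before `func` is applied, which means that we can't just use the default `axis=0` but have to set `axis=-1`. This should work:

    def MWU(x, y, dim):
        """ xarray wrapper for scipy.stats.mannwhitneyu """
        return xr.apply_ufunc(
            scipy.stats.mannwhitneyu,
            x,
            y,
            input_core_dims=[[dim], [dim]],
            output_core_dims=[[], []],
            vectorize=False,  # I don't want to vectorize because speed is important
            kwargs={"axis": -1},
        )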
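As a quick sanity check, the `axis=-1` wrapper can be compared against the plain SciPy call from the question. This is only a sketch: the helper name `mwu` and the `default_rng` test data are assumptions for illustration.

```python
import numpy as np
import scipy.stats
import xarray as xr

# Illustrative random data with the same layout as in the question.
rng = np.random.default_rng(0)
dims = ("samples", "season", "category")
coords = {
    "samples": np.arange(20),
    "season": ["winter", "spring", "summer", "autumn"],
    "category": ["fish", "mammals", "birds"],
}
da1 = xr.DataArray(rng.random((20, 4, 3)), dims=dims, coords=coords)
da2 = xr.DataArray(rng.random((20, 4, 3)), dims=dims, coords=coords)


def mwu(x, y, dim):
    """xarray wrapper for scipy.stats.mannwhitneyu (sketch)."""
    return xr.apply_ufunc(
        scipy.stats.mannwhitneyu,
        x,
        y,
        input_core_dims=[[dim], [dim]],
        output_core_dims=[[], []],
        # apply_ufunc moves `dim` to the last axis, so reduce over axis=-1.
        kwargs={"axis": -1},
    )


stat, pvalue = mwu(da1, da2, dim="samples")

# Reference result computed directly on the NumPy arrays,
# where "samples" is still the first axis.
reference = scipy.stats.mannwhitneyu(da1.values, da2.values, axis=0)
print(np.allclose(stat.values, reference.statistic))
print(np.allclose(pvalue.values, reference.pvalue))
```

Unlike the plain SciPy call, the wrapped version returns DataArrays that keep the `season` and `category` labels.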
-
Thanks to both of you, @kmuehlbauer & @keewis!