Open
Description
Environment Details
Please indicate the following details about the environment in which you found the bug:
- SDV version: 1.24
- Python version: 3.12
- Operating System: darwin
Error Description
When creating an HMASynthesizer for a dataset that includes a grandparent-parent-child relationship, changing the default distribution of a table synthesizer to 'norm' or 'uniform' causes an error when sampling.
Steps to reproduce
Example
import numpy as np
import pandas as pd
from sdv.metadata import Metadata
from sdv.multi_table import HMASynthesizer

# Three tables chained as grandparent -> parent -> child.
data = {
    'child': pd.DataFrame({'id': range(10), 'parent_id': range(10)}),
    'parent': pd.DataFrame({
        'id': range(10),
        'grandparent_id': range(10),
        'categories': list(np.random.choice(['T', 'F'], size=10)),
    }),
    'grandparent': pd.DataFrame({'id': range(10)}),
}
metadata = Metadata.load_from_dict({
    'tables': {
        'child': {
            'primary_key': 'id',
            'columns': {'id': {'sdtype': 'id'}, 'parent_id': {'sdtype': 'id'}},
        },
        'parent': {
            'primary_key': 'id',
            'columns': {
                'id': {'sdtype': 'id'},
                'grandparent_id': {'sdtype': 'id'},
                'categories': {'sdtype': 'categorical'},
            },
        },
        'grandparent': {'primary_key': 'id', 'columns': {'id': {'sdtype': 'id'}}},
    },
    'relationships': [
        {
            'parent_table_name': 'parent',
            'child_table_name': 'child',
            'parent_primary_key': 'id',
            'child_foreign_key': 'parent_id',
        },
        {
            'parent_table_name': 'grandparent',
            'child_table_name': 'parent',
            'parent_primary_key': 'id',
            'child_foreign_key': 'grandparent_id',
        },
    ],
    'METADATA_SPEC_VERSION': 'V1',
})
synthesizer = HMASynthesizer(metadata)
# Override the default distribution for the middle ('parent') table only.
synthesizer.set_table_parameters('parent', {'default_distribution': 'norm'})
synthesizer.fit(data)
synthesizer.sample(1)  # raises the TypeError shown in the traceback
Traceback
sdv/multi_table/base.py:675: in sample
sampled_data = self._sample(scale=scale)
sdv/sampling/hierarchical_sampler.py:324: in _sample
self._sample_children(table_name=table, sampled_data=sampled_data, scale=scale)
sdv/sampling/hierarchical_sampler.py:211: in _sample_children
self._add_child_rows(
sdv/sampling/hierarchical_sampler.py:101: in _add_child_rows
sampled_rows = self._sample_rows(child_synthesizer, num_rows)
sdv/sampling/hierarchical_sampler.py:76: in _sample_rows
return synthesizer._sample_batch(round(num_rows), keep_extra_columns=True)
sdv/single_table/base.py:960: in _sample_batch
sampled, num_valid = self._sample_rows(
sdv/single_table/base.py:863: in _sample_rows
raw_sampled = self._sample(num_rows)
sdv/single_table/copulas.py:194: in _sample
return self._model.sample(num_rows, conditions=conditions)
../Copulas/copulas/utils.py:50: in wrapper
return function(self, *args, **kwargs)
../Copulas/copulas/multivariate/gaussian.py:299: in sample
output[column_name] = univariate.percent_point(cdf)
../Copulas/copulas/univariate/base.py:593: in percent_point
return self.MODEL_CLASS.ppf(U, **self._params)
> args, loc, scale = self._parse_args(*args, **kwds)
E TypeError: _parse_args() got an unexpected keyword argument 'a'
../miniconda3/envs/sdv/lib/python3.12/site-packages/scipy/stats/_distn_infrastructure.py:2293: TypeError
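The last frames of the traceback show a parameter dictionary being unpacked into a SciPy distribution's `ppf` call, which then rejects the keyword `a`. The snippet below is a minimal sketch of that failure mode using SciPy alone, outside of SDV and Copulas. The `params` dictionary is a hypothetical stand-in, not the actual contents of `self._params`; it assumes the stored parameters were fitted for a distribution with shape parameters (such as `beta`) while the distribution being evaluated is `norm`.

```python
from scipy import stats

# Hypothetical parameter dict with beta-style shape parameters ('a', 'b')
# next to loc/scale. Assumed for illustration; not taken from SDV internals.
params = {'a': 1.0, 'b': 1.0, 'loc': 0.0, 'scale': 1.0}

# beta defines the shape parameters 'a' and 'b', so this call is accepted.
beta_value = stats.beta.ppf(0.5, **params)

# norm has no shape parameters (only loc and scale), so the same dict
# is rejected with a TypeError while SciPy parses the arguments.
try:
    stats.norm.ppf(0.5, **params)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(beta_value)
print(error_message)
```

If this is what happens inside the sampler, the parameters and the distribution class used for at least one column have gone out of sync somewhere between fitting and sampling; the sketch does not identify where.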
Labels
bug (Something isn't working)