Local Geary loses order #192
Comments
Hi! Thank you for the report. Yes, we've been aware of issues with w/dataframe/weights alignment for a while now (pysal/libpysal#184). This is indeed a bug, and not intended behavior. The output should always be aligned with X, and X should ideally be aligned with the spatial weights. The "right" way to work around this is to compute the spatial weights as close to the computation as possible, and to ensure that the data is sorted by id before the weights are constructed. I think as a fix, this should be addressed by re-ordering |
Thank you for the quick answer. I took a look at pysal/libpysal#184. This is a significant change in the structure of libpysal and downstream repercussions are hard to foresee. The reordering of
OK, this should be addressed in #195. For example:
import libpysal, geopandas, numpy, esda
guerry = geopandas.read_file("./guerry.shp")
guerry_shuffled = guerry.sample(frac=1, replace=False)
guerry_abet = guerry.sort_values('Dprtmnt')
w = libpysal.weights.Queen.from_dataframe(guerry)
w_shuffled = libpysal.weights.Queen.from_dataframe(guerry_shuffled)
w_abet = libpysal.weights.Queen.from_dataframe(guerry, ids='Dprtmnt')
lG = esda.Geary_Local(connectivity=w).fit(guerry['Donatns'])
lG_shuffled = esda.Geary_Local(connectivity=w_shuffled).fit(guerry_shuffled['Donatns'])
lG_abet = esda.Geary_Local(connectivity=w_abet).fit(guerry_abet['Donatns'])
guerry['localG'] = lG.localG
guerry_shuffled['localG'] = lG_shuffled.localG
guerry_abet['localG'] = lG_abet.localG
numpy.testing.assert_allclose(guerry.sort_values('Dprtmnt').localG, guerry_abet.localG)
numpy.testing.assert_allclose(guerry_shuffled.sort_values('Dprtmnt').localG, guerry_abet.localG)
This indicates that the "right" |
When running:
The order of values in local_geary_contiguity_ratio does not correspond to the order of values in the input array x.
The loss of order occurs at line 167 in local_geary.py:
adj_list_gs = adj_list_gs.groupby(by="ID").sum()
The groupby function returns the values in the lexicographic order of the weight ids. While the spatial weights class W stores data in lexicographic order by default, a user may impose a different ordering by setting the id_order parameter. In that case, the order of the localG attribute differs from the order of the input, which is quite misleading and, frankly speaking, a bug.
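To illustrate the behavior described above, here is a minimal sketch using pandas only. The frame `adj_list`, the column `gs`, and the list `id_order` are made-up stand-ins for illustration, not the actual variables in local_geary.py. It shows that `groupby(...).sum()` sorts the group keys by default, and two possible ways to keep a custom order: reindexing the result, or passing `sort=False` (which keeps the order of first appearance).

```python
import pandas as pd

# Hypothetical adjacency list: one row per (focal, neighbor) pair,
# listed in a custom id order ("c", "a", "b") rather than sorted order.
adj_list = pd.DataFrame({
    "ID": ["c", "c", "a", "b", "b"],
    "gs": [1.0, 2.0, 3.0, 4.0, 5.0],
})
id_order = ["c", "a", "b"]  # assumed custom ordering of the weights

# Default behavior: group keys come back sorted ("a", "b", "c"),
# so the custom order is lost.
sorted_sums = adj_list.groupby(by="ID").sum()

# Option 1: reindex the grouped result back to the desired order.
restored = sorted_sums.reindex(id_order)

# Option 2: sort=False keeps groups in order of first appearance.
unsorted = adj_list.groupby(by="ID", sort=False).sum()
```

Reindexing is the safer of the two when the desired order is known explicitly, since `sort=False` only preserves the order in which ids first appear in the adjacency list.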
It would be quite useful for the fit function to return the values in the same order as the spatial weights. If you choose not to do this, please indicate in the documentation that the order of values in the localG attribute may change, and that it follows the lexicographic order of the spatial weights rather than the order of the input.
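Until the order is fixed or documented, a user-side realignment could look like the sketch below. The helper `realign_to_input` is hypothetical and not part of esda. It assumes, as described in this report, that the returned values are in the sorted order of the ids, that the ids are unique, and that Python's `sorted` produces the same ordering as the groupby step (true for plain strings or plain integers, not guaranteed for mixed types).

```python
import pandas as pd

def realign_to_input(values, ids):
    """Hypothetical helper: map values given in sorted-id order
    back to the order in which `ids` were originally supplied."""
    ids = list(ids)
    sorted_ids = sorted(ids)  # assumed order of the returned values
    # Label each value with the id it belongs to, then reorder.
    return pd.Series(list(values), index=sorted_ids).reindex(ids).to_numpy()

# Example: ids were supplied as c, a, b, but the values came back
# in sorted order (a=10, b=20, c=30).
input_ids = ["c", "a", "b"]
values_in_sorted_order = [10.0, 20.0, 30.0]
realigned = realign_to_input(values_in_sorted_order, input_ids)
```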