In the cleaned_hm.csv file, I believe the modified column is the opposite of what it should be. You can see this by example with:
> df.loc[[50, 995],:]
     original_hm      cleaned_hm        modified
50   I went shopping  I went shopping   True
995  I ate chikfila   I ate chik-fil-a  False
I confirmed it by recreating this column and counting the rows where it matches the existing one:
> (df.modified == (df.cleaned_hm != df.original_hm)).sum()
0
This seems plausible, since modified is currently True about 98% of the time (98329 of 100535 rows):
> df.modified.value_counts()
True 98329
False 2206
Name: modified, dtype: int64
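For reference, here is a minimal sketch of the check and one possible fix, run on a toy frame rather than the real file. The two rows are copied from the example above; the frame construction and the names `expected`, `n_agree`, and `n_inverted` are only for illustration. It assumes modified is meant to be True when cleaned_hm differs from original_hm, which may not be what the maintainers intended.

```python
import pandas as pd

# Toy frame holding only the two rows shown above (index labels 50 and 995).
df = pd.DataFrame(
    {
        "original_hm": ["I went shopping", "I ate chikfila"],
        "cleaned_hm": ["I went shopping", "I ate chik-fil-a"],
        "modified": [True, False],
    },
    index=[50, 995],
)

# Assumed meaning: modified should be True when the text was changed.
expected = df["cleaned_hm"] != df["original_hm"]

# Rows where the stored flag agrees with the assumed meaning.
n_agree = int((df["modified"] == expected).sum())

# Rows where the stored flag is the exact negation of the assumed meaning.
n_inverted = int((df["modified"] == ~expected).sum())

print(n_agree, n_inverted)

# One possible fix: recompute the column from the two text columns.
df["modified"] = expected
```

If every row lands in `n_inverted` on the full file, simply negating the column (`~df.modified`) would give the same result as recomputing it.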
Or am I misunderstanding the data?