modified column is inverted? #2

@wetchler

In the cleaned_hm.csv file, I believe the modified column is the opposite of what it should be. You can see this by example with:

> df.loc[[50, 995], :]
         original_hm        cleaned_hm  modified
50   I went shopping   I went shopping      True
995   I ate chikfila  I ate chik-fil-a     False

I confirmed it by recomputing the column and counting the rows where the result matches the stored value:

> (df.modified == (df.cleaned_hm != df.original_hm)).sum()
0
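
The same check can be reproduced on a toy frame. This is only a minimal sketch: the two rows are copied from the example above, and the names `expected`, `agree`, and `inverted` are made up for illustration. It assumes that modified is meant to be True exactly when cleaned_hm differs from original_hm.

```python
import pandas as pd

# Toy data mirroring the two rows quoted above (index values 50 and 995).
df = pd.DataFrame(
    {
        "original_hm": ["I went shopping", "I ate chikfila"],
        "cleaned_hm": ["I went shopping", "I ate chik-fil-a"],
        "modified": [True, False],
    },
    index=[50, 995],
)

# Assumed meaning of the flag: True only where cleaning changed the text.
expected = df["cleaned_hm"] != df["original_hm"]

# Rows where the stored flag agrees with the assumed meaning.
agree = int((df["modified"] == expected).sum())

# Rows where the stored flag is the exact negation of the assumed meaning.
inverted = int((df["modified"] == ~expected).sum())

print(agree, inverted)  # 0 2
```

If that assumption holds, `df["modified"] = expected` would be one way to rebuild the column locally.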

An inverted column also seems plausible, since modified is currently True about 98% of the time:

> df.modified.value_counts()
True     98329
False     2206
Name: modified, dtype: int64
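
For reference, the share of True can be worked out from the counts above. A small sketch using only the two numbers shown; the variable names are illustrative:

```python
# Counts copied from the value_counts() output above.
counts = {True: 98329, False: 2206}

total = sum(counts.values())
share_true = counts[True] / total

# On the real frame, df.modified.value_counts(normalize=True)
# reports these fractions directly.
print(total, f"{share_true:.1%}")  # 100535 97.8%
```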

Or am I misunderstanding the data?
