Creating dummy columns #21

@ammaryh92

In chapter 26 - Reshaping DataFrames with Dummies, we wanted to turn the values in the "job.role" columns into a categorical series, which we would then reshape into a dummy matrix.

This is the code from the book:

job = (jb
    .filter(like=r'job.role')
    .where(jb.isna(), 1)
    .fillna(0)
    .idxmax(axis='columns')
    .str.replace('job.role.', '', regex=False))

job

However, many rows have multiple jobs, and the above code captures only the first one, because idxmax returns just the first column that holds the row maximum.
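A minimal sketch of the problem, using a small made-up frame rather than the real survey data (the column names and values below are assumptions for illustration). It uses `notna()` in place of the book's `where`/`fillna` steps to build the 0/1 matrix, then applies the same `idxmax` step:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: each job.role.* column holds the role name or NaN.
jb = pd.DataFrame({
    'job.role.Developer': ['Developer', np.nan, 'Developer'],
    'job.role.DBA': [np.nan, 'DBA', 'DBA'],
})

roles = jb.filter(like='job.role')

# 1 where a role is present, 0 otherwise, then take the first max per row.
first_only = (roles
    .notna()
    .astype('int64')
    .idxmax(axis='columns')
    .str.replace('job.role.', '', regex=False))

# Number of roles each respondent actually selected.
n_roles = roles.notna().sum(axis='columns')

print(first_only)
print(n_roles)
```

In this toy frame the last row has two roles, but `first_only` reports a single one for it.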

I think the following code captures all jobs and converts them into a dummy matrix.

(jb
     .filter(like='job.role')
     .fillna('')
     .apply(lambda ser: ','.join([i for i in ser if i]), axis=1)
     .str.get_dummies(sep=',')
)
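For comparison, here is a sketch of a possible alternative that skips the string join, again on made-up data (the frame and its column names are assumptions, not the survey itself). It relies on each column's suffix matching the role name stored in its cells, which may not hold for the real dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: the last row has no role selected at all.
jb = pd.DataFrame({
    'job.role.Developer': ['Developer', np.nan, 'Developer', np.nan],
    'job.role.DBA': [np.nan, 'DBA', 'DBA', np.nan],
})

roles = jb.filter(like='job.role')

# Approach from this issue: join the roles per row, then split into dummies.
joined = (roles
    .fillna('')
    .apply(lambda ser: ','.join([i for i in ser if i]), axis=1)
    .str.get_dummies(sep=',')
)

# Alternative sketch: the presence mask already is the dummy matrix.
direct = (roles
    .notna()
    .astype('int64')
    .rename(columns=lambda c: c.replace('job.role.', ''))
)

# Compare after aligning the column order (get_dummies sorts its columns).
same = (joined.astype('int64').values == direct[joined.columns].values).all()
print(same)
```

One design note: the comma-join version would split a role in two if its name contained a comma, whereas the mask version does not depend on the cell text at all.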
