Open
Description
In chapter 26 (Reshaping DataFrames with Dummies), we wanted to turn the values in the "job.role" columns into a categorical series, which we would then reshape into a dummy matrix.
This is the code from the book:
job = (jb
    .filter(like=r'job.role')
    .where(jb.isna(), 1)
    .fillna(0)
    .idxmax(axis='columns')
    .str.replace('job.role.', '', regex=False))
job
However, many rows have multiple jobs, and the code above captures only the first one, because idxmax returns the first column that holds the row maximum.
I think the following code captures all jobs and converts them into a dummy matrix.
(jb
    .filter(like='job.role')
    .fillna('')
    .apply(lambda ser: ','.join([i for i in ser if i]), axis=1)
    .str.get_dummies(sep=',')
)
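To make the difference concrete, here is a minimal sketch on made-up data. The toy `jb` frame, its column names, and the role labels are assumptions for illustration only, not the survey data from the book; I assume each `job.role.*` column holds the role name when selected and NaN otherwise. The `first_only` pipeline is a simplified stand-in for the book's `where`/`fillna` steps (it builds the same 0/1 flags with `notna().astype(int)`), and `dummies` is one possible alternative that skips the string join, which might matter if a role label ever contained a comma.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the survey data (assumed layout: role name or NaN).
jb = pd.DataFrame({
    'job.role.Developer': ['Developer', 'Developer', np.nan],
    'job.role.Architect': [np.nan, 'Architect', np.nan],
    'job.role.Team Lead': [np.nan, 'Team Lead', 'Team Lead'],
    'age': [30, 40, 50],
})

roles = jb.filter(like='job.role')

# Simplified version of the book's idea: flag selected roles as 1, then
# idxmax picks the FIRST column holding the row maximum, so ties are lost.
first_only = (roles
    .notna()
    .astype(int)
    .idxmax(axis='columns')
    .str.replace('job.role.', '', regex=False))

# Approach proposed in this issue: join the role names, then split into dummies.
joined = (roles
    .fillna('')
    .apply(lambda ser: ','.join([i for i in ser if i]), axis=1)
    .str.get_dummies(sep=','))

# Possible alternative: the presence flags already form a dummy matrix,
# so only the column prefix needs to be stripped.
dummies = (roles
    .notna()
    .astype(int)
    .rename(columns=lambda c: c.replace('job.role.', '')))

print(first_only)
print(joined)
print(dummies)
```

On this toy frame, row 1 has three roles: `first_only` keeps a single label for it, while both `joined` and `dummies` flag all three. The two dummy matrices differ only in column order, since `str.get_dummies` sorts its columns.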