File name sort is incorrect. #5

@Manslow

Thanks for the fantastic embedding approach and implementation! I'm currently using it for a project I'm working on.

I found a possible bug in your code. In the get_files function of the utils module you have the following lines:

all_files = list(set(all_files))
all_files.sort()

I'm assuming that you want to sort the graph files from low ids to high ids. However, .sort() compares the file names as strings, character by character, not by the integer value of the id, so for example "10" sorts before "2". In my work this led to an incorrect mapping between embeddings and graphs, as I had assumed the embeddings would be in numeric graph-id order.
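A minimal sketch of the problem, assuming file names of the form `<id>.json` in a `dataset/` folder (both the folder name and the extension are made up here for illustration):

```python
# Hypothetical file names; "dataset/" and ".json" are assumptions for illustration.
all_files = ["dataset/10.json", "dataset/2.json", "dataset/1.json", "dataset/2.json"]

# Same two steps as in get_files: de-duplicate, then sort as plain strings.
all_files = list(set(all_files))
all_files.sort()

# String comparison puts "10" before "2" because "1" < "2" character-wise.
print(all_files)
# ['dataset/1.json', 'dataset/10.json', 'dataset/2.json']
```

With more than nine graphs, the position of a file in this list no longer matches its graph id.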

I changed this to the rather ugly:

all_files = sorted(all_files, key=lambda file_name: int(file_name.split('/')[-1].split('.')[0]))

which extracts the file name from the path, takes the graph number before the first '.', and sorts by that number as an int.
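A slightly more readable variant of the same idea might look like the sketch below. The helper names `graph_id` and `sort_graph_files` are hypothetical and not part of the repository, and it still assumes every file name starts with an integer id followed by a '.':

```python
from pathlib import Path


def graph_id(file_name):
    """Return the integer id encoded in a name such as 'dataset/12.json'."""
    # Path(...).name drops the directories; the part before the first '.' is the id.
    return int(Path(file_name).name.split(".")[0])


def sort_graph_files(all_files):
    """De-duplicate the paths and sort them by numeric graph id."""
    return sorted(set(all_files), key=graph_id)


# Hypothetical paths for illustration only.
files = ["dataset/10.json", "dataset/2.json", "dataset/1.json", "dataset/2.json"]
print(sort_graph_files(files))
# ['dataset/1.json', 'dataset/2.json', 'dataset/10.json']
```

Using `pathlib` avoids hard-coding '/' as the separator. A file name that does not start with an integer will raise a `ValueError`, which seems preferable to silently mis-ordering the graphs.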

I'm not sure if this makes a difference for your paper but it might be worth reviewing.
