SQLAlchemy + Pandas very slow when compared to AWS Wrangler #254

@avibrazil

I tested the same query, which returns more than 2 million rows, with both libraries.

AWS Wrangler takes 1m34s to return a DataFrame.
SQLAlchemy + PyAthena takes 16m37s to return a DataFrame.

See attached notebook for proof and methods.
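
For readers without the notebook at hand, here is a minimal sketch of how a comparison like this can be timed. It is not the notebook's code. `time_loader` and `to_seconds` are hypothetical helpers, and `QUERY`, `DATABASE`, and `ENGINE` in the comments are placeholders. The two timings are the ones reported above.

```python
import time
from typing import Any, Callable, Tuple


def time_loader(loader: Callable[[], Any]) -> Tuple[Any, float]:
    """Run loader() once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = loader()
    return result, time.perf_counter() - start


def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds pair to total seconds."""
    return minutes * 60 + seconds


# Timings reported in this issue.
wrangler_s = to_seconds(1, 34)      # AWS Wrangler
pyathena_s = to_seconds(16, 37)     # SQLAlchemy + PyAthena
slowdown = pyathena_s / wrangler_s  # how many times slower the second path is

# Hypothetical usage (needs AWS credentials, so it is left commented out;
# QUERY, DATABASE and ENGINE are placeholders, not values from the notebook):
#   import awswrangler as wr
#   import pandas as pd
#   df_wr, t_wr = time_loader(lambda: wr.athena.read_sql_query(QUERY, database=DATABASE))
#   df_pa, t_pa = time_loader(lambda: pd.read_sql(QUERY, ENGINE))
#   # In-memory sizes can be compared with df.memory_usage(deep=True).sum()
```

With the reported numbers, `slowdown` comes out at roughly 10.6, meaning the SQLAlchemy + PyAthena path took about ten times as long for the same query.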

Also, PyAthena apparently returns an object almost 9% bigger, but this may be due to Pandas, SQLAlchemy, data types, and other minor things that I don’t care about right now.
