Installing tangled-up-in-unicode takes up 1.8GB of space #10

@Julian

⊙  python3.10 -m venv venv && venv/bin/python -m pip install tangled-up-in-unicode
Collecting tangled-up-in-unicode
  Using cached tangled_up_in_unicode-0.2.0-py3-none-any.whl (4.7 MB)
Installing collected packages: tangled-up-in-unicode
Successfully installed tangled-up-in-unicode-0.2.0

~/Desktop 
⊙  du -sh venv
1.8G	venv

The worst offenders are:

318M	venv/lib/python3.10/site-packages/tangled_up_in_unicode/u13_0_0_data/__pycache__/unicode_data_to_name_start.cpython-310.pyc
318M	venv/lib/python3.10/site-packages/tangled_up_in_unicode/u14_0_0_data/__pycache__/unicode_data_to_name_start.cpython-310.pyc
540M	venv/lib/python3.10/site-packages/tangled_up_in_unicode/__pycache__
540M	venv/lib/python3.10/site-packages/tangled_up_in_unicode/__pycache__/tangled_up_in_unicode_12_0_1.cpython-310.pyc
588M	venv/lib/python3.10/site-packages/tangled_up_in_unicode/u13_0_0_data/__pycache__
588M	venv/lib/python3.10/site-packages/tangled_up_in_unicode/u14_0_0_data/__pycache__
602M	venv/lib/python3.10/site-packages/tangled_up_in_unicode/u13_0_0_data
602M	venv/lib/python3.10/site-packages/tangled_up_in_unicode/u14_0_0_data
1.8G	venv
1.8G	venv/lib
1.8G	venv/lib/python3.10
1.8G	venv/lib/python3.10/site-packages
1.8G	venv/lib/python3.10/site-packages/tangled_up_in_unicode
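For anyone who wants to check their own environment, here is a minimal Python sketch that lists the largest files under a directory. It is not the command used to produce the listing above: `largest_files` is a hypothetical helper, and it reports apparent file sizes for files only, so the numbers can differ slightly from `du`, which reports disk usage and also totals directories.

```python
import os
from pathlib import Path


def largest_files(root, limit=10):
    """Return up to `limit` (size_in_bytes, path) pairs under `root`, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                # lstat() so that symlinks are measured, not followed.
                sizes.append((path.lstat().st_size, str(path)))
            except OSError:
                # Skip files that disappear or cannot be read.
                continue
    sizes.sort(reverse=True)
    return sizes[:limit]


# Example usage (assumes a virtualenv directory named "venv" exists):
# for size, path in largest_files("venv"):
#     print(f"{size / 2**20:8.1f}M\t{path}")
```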

This seems to be due to a huge blowup in the .pyc files generated for the data modules, each of which contains just one big dict.

Even simply converting the data to JSON and loading it at import time would be faster and produce much smaller files.

Here, changing e.g. unicode_data_to_name_start to read its dict from a JSON file brings import time down from ~1.7 seconds to ~300ms, and brings the .pyc file size down from ~400MB to ~6MB.
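A rough sketch of what the JSON approach could look like, not the package's actual code. The helper names `dump_table` and `load_table` are hypothetical, and so is the assumption that the dict is keyed by integer code points (JSON object keys are always strings, so they are converted back on load).

```python
import json
from functools import lru_cache


def dump_table(table, path):
    """Write a dict to `path` as compact JSON (int keys are stored as strings)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(table, f, separators=(",", ":"))


@lru_cache(maxsize=None)
def load_table(path):
    """Load the table from JSON once and cache the result.

    Assumes the original dict was keyed by int, so keys are converted back.
    """
    with open(path, encoding="utf-8") as f:
        raw = json.load(f)
    return {int(key): value for key, value in raw.items()}
```

With this layout the data module itself stays tiny, so its .pyc stays tiny too. The table is parsed on the first `load_table` call and reused afterwards.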

#5 looks like it was hinting at the issue here.
