Open
Description
To get around file size limits for uploads to and downloads from the TDAI Data Commons, large files can be split into smaller files. For example, a 5 GB file called dummyfile_5GB.txt can be split into 100 MB chunks with:
split -b 100M dummyfile_5GB.txt dummyfile_5GB_part_
mkdir dummyfile_5GB_parts
mv dummyfile_5GB_part_* dummyfile_5GB_parts
Then upload the directory containing the split file parts to Data Commons:
dva upload dummyfile_5GB_parts <doi> --url https://datacommons.tdai.osu.edu/ # with API token as env variable
(The original file's MD5 checksum should also be uploaded, so that consumers can verify the reassembled file.)
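For environments where `split` is not available (the traceback below comes from a Windows machine), the split-and-checksum step could be approximated in Python. This is a minimal sketch, not part of `dva`: the function names, the zero-padded numeric part suffixes (GNU `split` would produce `aa`, `ab`, ...), and the `.md5` file name are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path

CHUNK_SIZE = 100 * 1024 * 1024  # 100 MiB, the size GNU `split -b 100M` uses


def md5_of_file(path, block_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in blocks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def split_file(source, parts_dir, chunk_size=CHUNK_SIZE):
    """Split `source` into numbered parts in `parts_dir` and store its MD5 there."""
    source = Path(source)
    parts_dir = Path(parts_dir)
    parts_dir.mkdir(parents=True, exist_ok=True)
    part_paths = []
    with open(source, "rb") as handle:
        index = 0
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            # Hypothetical naming: a zero-padded suffix keeps parts in sorted order.
            part_path = parts_dir / f"{source.stem}_part_{index:04d}"
            part_path.write_bytes(chunk)
            part_paths.append(part_path)
            index += 1
    # Hypothetical checksum file: the MD5 of the original, unsplit file.
    checksum_path = parts_dir / f"{source.name}.md5"
    checksum_path.write_text(md5_of_file(source) + "\n")
    return part_paths, checksum_path
```

Calling `split_file("dummyfile_5GB.txt", "dummyfile_5GB_parts")` would fill the directory with the parts plus `dummyfile_5GB.txt.md5`, so the checksum travels with the parts in the same upload.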
After a dataset consumer downloads the parts, they can be rejoined into the original file with cat dummyfile_5GB_part_* > dummyfile_5GB.txt, and a checksum can be calculated and compared against the original file's checksum.
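On the consumer side, rejoining and verifying could look roughly like the following. This is a sketch under assumptions: `join_parts` and `md5_matches` are hypothetical helper names, and the parts are assumed to sort into the correct order by file name (true for the suffixes `split` generates).

```python
import hashlib
import shutil
from pathlib import Path


def join_parts(parts_dir, pattern, output_path):
    """Concatenate the files matching `pattern` in `parts_dir`, in sorted name order."""
    part_paths = sorted(Path(parts_dir).glob(pattern))
    if not part_paths:
        raise FileNotFoundError(f"no files matching {pattern!r} in {parts_dir}")
    with open(output_path, "wb") as output:
        for part_path in part_paths:
            with open(part_path, "rb") as part:
                # Stream each part so a large file never sits fully in memory.
                shutil.copyfileobj(part, output)
    return part_paths


def md5_matches(path, expected_hex, block_size=1024 * 1024):
    """Return True if the file's MD5 digest equals `expected_hex` (case-insensitive)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest() == expected_hex.strip().lower()
```

For the example above, this would be `join_parts("dummyfile_5GB_parts", "dummyfile_5GB_part_*", "dummyfile_5GB.txt")` followed by `md5_matches("dummyfile_5GB.txt", published_md5)`, where `published_md5` stands for whatever checksum the dataset owner uploaded.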
In a test, this led to the following uninformative error after 2 parts had been uploaded:
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\thompson.4509\AppData\Local\miniconda3\envs\data-commons-tests\Scripts\dva.exe\__main__.py", line 7, in <module>
  File "C:\Users\thompson.4509\AppData\Local\miniconda3\envs\data-commons-tests\Lib\site-packages\dva\cli.py", line 93, in main
    cli()
  File "C:\Users\thompson.4509\AppData\Local\miniconda3\envs\data-commons-tests\Lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thompson.4509\AppData\Local\miniconda3\envs\data-commons-tests\Lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "C:\Users\thompson.4509\AppData\Local\miniconda3\envs\data-commons-tests\Lib\site-packages\click\core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thompson.4509\AppData\Local\miniconda3\envs\data-commons-tests\Lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thompson.4509\AppData\Local\miniconda3\envs\data-commons-tests\Lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thompson.4509\AppData\Local\miniconda3\envs\data-commons-tests\Lib\site-packages\dva\cli.py", line 74, in upload
    api.upload_file(doi, path)
  File "C:\Users\thompson.4509\AppData\Local\miniconda3\envs\data-commons-tests\Lib\site-packages\dva\api.py", line 67, in upload_file
    raise APIException(f"Uploading failed with status {status}.")
dva.api.APIException: Uploading failed with status ERROR.