There should only be a single download endpoint #119

Open
@Bjarke42

A user on migrid sees several different download endpoint URLs, depending on how much data is being downloaded.

For files smaller than 64MB you will use:
https://erda.dk/wsgi-bin/cat.py?path=file1&output_format=file

But if the file is larger than 64MB, this URL is used instead:
https://erda.dk/cert_redirect/file1
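
To make the split concrete, here is a minimal sketch of the selection as it appears from the outside. It is not taken from the migrid code: the function name, the binary interpretation of "64MB", and the handling of a file of exactly 64MB are all assumptions.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://erda.dk"
# Assumption: the issue only says "64MB"; binary megabytes are a guess.
SIZE_THRESHOLD = 64 * 1024 * 1024


def current_download_url(path: str, size_bytes: int) -> str:
    """Return the URL a user currently ends up with for a file of this size.

    Hypothetical helper that mirrors the behaviour described above.
    Files of exactly 64MB are sent to cert_redirect here, which is a guess.
    """
    if size_bytes < SIZE_THRESHOLD:
        # Small files go through cat.py.
        query = urlencode({"path": path, "output_format": "file"})
        return f"{BASE_URL}/wsgi-bin/cat.py?{query}"
    # Larger files are served via cert_redirect.
    return f"{BASE_URL}/cert_redirect/{quote(path)}"
```

A script that builds the cat.py URL by hand skips this selection, so nothing on the URL side stops it from requesting a large file through cat.py.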

This is all fine as long as the user uses the migrid web interface as intended, but as soon as anyone writes automation scripts that download files via the web, the migrid server can fail. Such scripts are needed when, for example, a user only has share link access to the system, or a specific application only allows web access, but automation is still required.
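
For illustration, the kind of automation script meant here could look roughly like the sketch below. The `download` helper is hypothetical, and authentication (client certificate, share link ID, etc.) is left out entirely.

```python
import shutil
import urllib.request


def download(url: str, dest_path: str, chunk_size: int = 1024 * 1024) -> int:
    """Stream the response body at url to dest_path; return bytes written.

    Hypothetical helper: authentication is omitted on purpose.
    """
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        # Copy in fixed-size chunks so the client never holds the whole file.
        shutil.copyfileobj(response, out, chunk_size)
        return out.tell()
```

Note that streaming on the client side does not help if the server buffers the whole file before sending the first byte.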

We are seeing the migrid server become unresponsive because of excessive RAM usage in the HTTP process, which loads the entire file into memory via cat.py before the download begins. If you are lucky, the OOM killer terminates the HTTP process, but in most cases we just see the migrid server become unresponsive for anywhere from 5 minutes to several hours.
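
As a contrast to loading the whole file first, a WSGI handler can hand the file back as an iterator of fixed-size chunks, which keeps memory use bounded by the chunk size regardless of file size. The following is a minimal generic sketch, not the migrid implementation: `iter_file_chunks`, `make_download_app`, and the 1 MiB chunk size are made up for the example.

```python
import os
from typing import Iterator

CHUNK_SIZE = 1024 * 1024  # hypothetical chunk size, 1 MiB


def iter_file_chunks(path: str, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield the file at path in pieces of at most chunk_size bytes."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


def make_download_app(path: str):
    """Build a tiny WSGI app that streams one fixed file (illustration only)."""

    def app(environ, start_response):
        headers = [
            ("Content-Type", "application/octet-stream"),
            ("Content-Length", str(os.path.getsize(path))),
        ]
        start_response("200 OK", headers)
        # Returning an iterator lets the server send chunk by chunk
        # instead of building the full body in memory.
        return iter_file_chunks(path)

    return app
```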

I suggest changing this so that there is only one way to download, and that it is always the correct way: one that cannot, through misuse or otherwise, cause the migrid server to become unresponsive.

If the download succeeds, which does not always happen, you can see the following in mig.log:

2024-09-12 13:42:56,227 INFO WSGI cat yielding 8 output parts (1920000000b)
2024-09-12 13:43:23,302 INFO WSGI cat finished yielding all 8 output parts
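
For reference, the numbers in these two lines can be pulled out with a short script. This is only a sketch: it assumes the trailing `b` means bytes and that the format of the two lines is exactly as quoted.

```python
import re
from datetime import datetime

LOG_LINES = [
    "2024-09-12 13:42:56,227 INFO WSGI cat yielding 8 output parts (1920000000b)",
    "2024-09-12 13:43:23,302 INFO WSGI cat finished yielding all 8 output parts",
]


def parse_timestamp(line: str) -> datetime:
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' timestamp of a log line."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def summarize(lines):
    """Return part count, total size, average part size and elapsed seconds."""
    match = re.search(r"yielding (\d+) output parts \((\d+)b\)", lines[0])
    parts, total = int(match.group(1)), int(match.group(2))
    elapsed = (parse_timestamp(lines[1]) - parse_timestamp(lines[0])).total_seconds()
    return {
        "parts": parts,
        "total_bytes": total,  # assumption: 'b' stands for bytes
        "avg_part_bytes": total // parts,
        "elapsed_seconds": elapsed,
    }


summary = summarize(LOG_LINES)
```

For the lines above this gives 8 parts, 1920000000 bytes in total (240000000 per part on average), and about 27 seconds between the two log entries.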

There is no real workaround, but if it happens and you are already logged in on the server, you can run kill -9 on all tini processes, which ends the mayhem.

Labels: bug, enhancement
