[Metadata] Deal with anti-scraper measures #362

@nephros

SailfishOS VERSION: N/A

HARDWARE: N/A

SailfishOS:Chum GUI application VERSION: <= 0.6.11

BUG DESCRIPTION

In the age of AI companies, many Git hosting services and other sites have adopted tools like Anubis to thwart irresponsible scraping.

These measures break Chum GUI app descriptions wherever content is fetched from a URL, e.g. via the DescriptionMD tag, and sometimes screenshot and icon links as well.

STEPS TO REPRODUCE

As an example, look at the app page of the waypipe package.

  1. Have a package which uses DescriptionMD in the Chum Metadata
  2. Browse to the App description page in Chum GUI
  3. Observe some raw HTML snippets instead of the Markdown content
  4. Alternatively, run curl -A "Mozilla/5.0 actually curl" -L https://gitlab.freedesktop.org/mstoeckl/waypipe/-/raw/v0.10.4/README.md to see the Anubis output.
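
For illustration only, here is a minimal Python sketch of how a client might notice that a URL expected to return Markdown returned an HTML page instead, as in step 3. It is not taken from the Chum GUI code base. The function name looks_like_html and both heuristics (checking the Content-Type header, then sniffing the start of the body) are assumptions. So is the premise that the challenge page is served as HTML.

```python
def looks_like_html(content_type: str, body: str) -> bool:
    """Heuristic: True if a response expected to be Markdown looks like an HTML page.

    Hypothetical helper for illustration; not part of Chum GUI.
    """
    # Raw Markdown is usually served as text/plain, so an HTML media type
    # is a strong hint that something other than the README came back.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type in ("text/html", "application/xhtml+xml"):
        return True
    # Fall back to sniffing the beginning of the body for an HTML preamble.
    head = body.lstrip()[:256].lower()
    return head.startswith("<!doctype html") or head.startswith("<html")


if __name__ == "__main__":
    # Offline examples; no network access is needed.
    print(looks_like_html("text/plain; charset=utf-8", "# waypipe\n"))
    print(looks_like_html("text/html; charset=utf-8", "<!doctype html><html>"))
```

With a check like this, a client could skip rendering the fetched content rather than display raw HTML. What to show instead is left open here.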

ADDITIONAL INFORMATION
