Only grab single version of full size image url #13

@patcon

Re-ticketed from Slack msg

Currently, this gets scraped:

        "PageImages": [
            "https://www.services.rcmp-grc.gc.ca/missing-disparus/showImage?id=2064&thumb=medium",
            "https://www.services.rcmp-grc.gc.ca/missing-disparus/showImage?id=2064&thumb=small",
            "https://www.services.rcmp-grc.gc.ca/missing-disparus/showImage?id=2065&thumb=medium",
            "https://www.services.rcmp-grc.gc.ca/missing-disparus/showImage?id=2065&thumb=small"
        ],

but this would actually return the full image:

https://www.services.rcmp-grc.gc.ca/missing-disparus/showImage?id=2065

We should strip the thumb query param and dedup the resulting URLs, so each image is listed once at full size.
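
A minimal sketch of that cleanup in Python, assuming the scraper holds the image URLs as a list of strings. The scraper's actual language and the function name full_size_urls are assumptions for illustration, not taken from the codebase:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def full_size_urls(urls):
    """Drop the `thumb` query param from each URL, then dedup in first-seen order.

    Hypothetical helper: the name and signature are illustrative only.
    """
    seen = {}  # dicts preserve insertion order, so this doubles as an ordered set
    for url in urls:
        parts = urlsplit(url)
        # Keep every query param except `thumb` (e.g. `id` stays untouched).
        query = [
            (key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key != "thumb"
        ]
        cleaned = urlunsplit(parts._replace(query=urlencode(query)))
        seen.setdefault(cleaned, None)
    return list(seen)
```

Applied to the four URLs above, this should leave one entry per image id (2064 and 2065), each without the thumb param.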
