Bug with ScannedPDQHashes.from_dict() #1

@doghouch

Calling scan_pdq_hashes(config=cfg) returns a response like this (per the API):

{'scanned_hashes': {'...': {'classification': 'no-known-match', 'match_type': None, 'near_match_details': None}}}

However, even though the response JSON is perfectly valid, the MatchType enum only has two possible values:

  • exact
  • near

This means that a match_type of None raises a ValueError (ValueError: None is not a valid MatchType). This can be fixed by handling a missing match_type in scanned_pdq.py:

https://github.com/CdnCentreForChildProtection/arachnid-shield-sdk-python/blob/16642d2c883af01b27f23b683c4bdb13701e5e9f/arachnid_shield_sdk/models/scanned_pdq.py#L57C1-L71C1

where:

 def from_dict(cls, src_dict: typing.Dict[str, typing.Any]) -> "ScannedPDQHashes":
     scanned_hashes = {}
     for key, value in src_dict["scanned_hashes"].items():
         near_match_details = value.get('near_match_details')
+        match_type = MatchType(value['match_type']) if value.get('match_type') else None
         if near_match_details is not None:
             near_match_details = NearMatchDetail.from_dict(near_match_details)
         scanned_hashes[key] = PDQMatch(
             classification=value['classification'],
-            match_type=MatchType(value['match_type']),
+            match_type=match_type,
             near_match_details=near_match_details
         )
     return cls(
         scanned_hashes=scanned_hashes
     )
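
To make the failure and the guard easier to try in isolation, here is a minimal sketch that does not import the SDK. The MatchType class below is a stand-in that assumes a plain Enum with the two values listed above, and parse_match_type is a hypothetical helper, not something that exists in the library. It checks for None explicitly, which is slightly stricter than the truthiness check in the diff (that one would also map an empty string to None).

```python
from enum import Enum
from typing import Optional


class MatchType(Enum):
    """Stand-in for the SDK enum (assumption: only these two values exist)."""
    EXACT = "exact"
    NEAR = "near"


def parse_match_type(raw: Optional[str]) -> Optional[MatchType]:
    """Hypothetical helper: pass None through instead of feeding it to the enum."""
    if raw is None:
        return None
    return MatchType(raw)


# Shape of one entry from the response shown earlier in this issue.
entry = {
    "classification": "no-known-match",
    "match_type": None,
    "near_match_details": None,
}

# Direct conversion fails, which is the bug described above.
try:
    MatchType(entry["match_type"])
except ValueError as exc:
    print(f"unguarded conversion failed: {exc}")

# The guarded conversion returns None for a no-match entry.
print(parse_match_type(entry["match_type"]))
```

With this approach the match_type field on PDQMatch would need to accept None, so its type annotation would presumably become Optional.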

I'm not sure about the best course of action here (adding a new value in the enum? checking match_type?), so I've held back on opening a PR.
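
For comparison, the enum-based option might look something like the sketch below. MatchTypeWithNone is a hypothetical stand-in, not the SDK's class, and it assumes a plain Enum. If the real MatchType mixes in str, a None-valued member would likely behave differently, so treat this as an illustration of the trade-off rather than a drop-in change.

```python
from enum import Enum


class MatchTypeWithNone(Enum):
    """Hypothetical variant of the enum with an explicit member for 'no match'."""
    EXACT = "exact"
    NEAR = "near"
    NONE = None  # assumption: the API sends null when there is no match


# Value lookup now succeeds for None, so from_dict() would need no guard.
no_match = MatchTypeWithNone(None)
exact = MatchTypeWithNone("exact")

print(no_match is MatchTypeWithNone.NONE)
print(exact is MatchTypeWithNone.EXACT)
```

The trade-off is that callers would compare against a NONE member instead of checking for None, which changes what match_type means for existing users of the SDK.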
