Report ingestSummary even if AnnData extraction fails (SCP-5973) #2243
BACKGROUND & CHANGES
This fixes a corner case in Mixpanel reporting where AnnData files that fail during the initial extraction phase do not report an `ingestSummary` event. These events are the main indication SCP has in Mixpanel for AnnData ingest, as only one event is ever sent per AnnData file. The lack of these extraction events means we are underreporting on AnnData ingest attempts. Now, if the initial extraction fails and is not going to be retried due to an OOM exception, the summary is sent.

Also, this fixes a regression introduced in #2242 where metadata properties in AnnData files that have names ending in values that match ontology-based properties (e.g. `author_cell_type` matching to `cell_type`) incorrectly attempt to validate that column. Now, exact name matches are enforced rather than relying on `endsWith()`.

MANUAL TESTS
`raw_location: does_not_exist` in the Expression tab to ensure the extraction fails
`ingestSummary` event
`jobStatus: failed` and `numFilesExtracted: 0`
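
To illustrate the regression described under BACKGROUND & CHANGES, here is a minimal sketch of suffix matching versus exact matching on metadata column names. It is not the code from this PR: the `ONTOLOGY_PROPS` list and both function names are hypothetical, chosen only to show why `author_cell_type` was picked up by an `endsWith()` check and is skipped by an exact comparison.

```javascript
// Hypothetical list of ontology-based property names (an assumption for this sketch).
const ONTOLOGY_PROPS = ['cell_type', 'disease', 'species', 'organ']

// Suffix matching: flags any column whose name merely ends with an ontology property name.
function isOntologyColumnBySuffix(columnName) {
  return ONTOLOGY_PROPS.some(prop => columnName.endsWith(prop))
}

// Exact matching: flags only columns whose name equals an ontology property name.
function isOntologyColumnExact(columnName) {
  return ONTOLOGY_PROPS.includes(columnName)
}

console.log(isOntologyColumnBySuffix('author_cell_type')) // true: would wrongly be validated
console.log(isOntologyColumnExact('author_cell_type'))    // false: skipped
console.log(isOntologyColumnExact('cell_type'))           // true: still validated
```

With exact matching, author-supplied columns such as `author_cell_type` are left alone, while the canonical `cell_type` column is still validated.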