Multi-instance DOI support complexity #307

@asmacdo

Summary

Currently, we inject a fake DOI before running Pydantic validation because DOI is a required field. However, as we move toward supporting multiple instances of dandi-archive, this becomes more complex.

The problem

Using the 294 branch with any configuration other than the dandiarchive.org schema config, validation of a PublishedDandiset will fail. By default, the DOI must be the empty string, so validation fails because we've injected the fake DOI. With the ember configuration, validation fails because the fake DOI follows the dandi pattern instead of the ember pattern.

@candleindark, @CodyCBakerPhD, and I discussed this, and we see 2 options to move forward:

Option 1: Inject "smart" DOIs

If we continue to inject a fake DOI, we will have to construct that DOI to follow the pattern specified by the dandi-schema instance config. In my opinion this adds complexity without value: our validation would be testing that our fake DOI is well-formed, which has no bearing on the user's data.
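To illustrate why this check is circular, here is a minimal plain-Python sketch (not the actual dandischema code). The function names `build_fake_doi` and `doi_pattern`, the prefix `10.12345`, and the DOI layout are all hypothetical; the real pattern would come from the instance config.

```python
import re


def doi_pattern(doi_prefix: str, instance_name: str) -> str:
    """Hypothetical per-instance DOI regex derived from config values."""
    name = re.escape(instance_name.lower())
    return rf"{re.escape(doi_prefix)}/{name}\.\d{{6}}/\d+\.\d+\.\d+"


def build_fake_doi(doi_prefix: str, instance_name: str,
                   dandiset_id: str, version: str) -> str:
    """Build a placeholder DOI that mirrors the same (assumed) layout."""
    return f"{doi_prefix}/{instance_name.lower()}.{dandiset_id}/{version}"


# The fake DOI is built from the same config the pattern is built from,
# so a passing check only confirms that these two helpers agree.
fake = build_fake_doi("10.12345", "EMBER", "000001", "0.250101.1200")
matches_own_instance = re.fullmatch(doi_pattern("10.12345", "EMBER"), fake) is not None
matches_other_instance = re.fullmatch(doi_pattern("10.12345", "DANDI"), fake) is not None
```

Every new instance config would need the builder kept in sync with the pattern, and none of that work validates anything the user supplied.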

Option 2: Allow DOIs to be empty string in Pydantic Models

This would allow a "multistep" validation pattern:
Step 1, publication time: no DOI.
Step 2, post-publication: verify that the DOI matches the pattern for the specific instance (it can be the empty string for instances that do not support DOIs).

At the time of publication there is no DOI, and under this option that will not prevent successful validation, so we can avoid injecting a fake DOI at all.
After validation passes and a dandiset version is published, dandi-archive will create the DataCite DOI and add it to the version. At that point, we can run Pydantic validation against the published version with its DOI. Now, if the DOI is anything other than the empty string, it must conform to the pattern set by the dandischema config.
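The rule described above could be sketched roughly as follows. This is plain Python rather than the real Pydantic models, and the `DOI_PATTERNS` table, the instance keys, and the `10.12345` prefix are hypothetical stand-ins for whatever the dandischema instance config provides.

```python
import re
from typing import Dict, Optional

# Hypothetical patterns; None marks an instance without DOI support.
DOI_PATTERNS: Dict[str, Optional[str]] = {
    "dandi": r"10\.12345/dandi\.\d{6}/\d+\.\d+\.\d+",
    "ember": r"10\.12345/ember\.\d{6}/\d+\.\d+\.\d+",
    "no-doi": None,
}


def validate_doi(doi: str, instance: str) -> None:
    """Raise ValueError unless the DOI is empty or matches the instance pattern."""
    if doi == "":
        # Step 1 (publication time), or an instance that never mints DOIs.
        return
    pattern = DOI_PATTERNS[instance]
    if pattern is None:
        raise ValueError(f"instance {instance!r} does not support DOIs")
    if re.fullmatch(pattern, doi) is None:
        # Step 2 (post-publication): a real DOI must match the instance pattern.
        raise ValueError(f"DOI {doi!r} does not match the {instance!r} pattern")


validate_doi("", "ember")  # publication time: passes without a fake DOI
validate_doi("10.12345/ember.000001/0.250101.1200", "ember")  # post-publication
```

In an actual Pydantic model the same check would presumably live in a field validator, with the pattern read from the active instance config instead of a module-level table.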
