Skip to content

first internal draft for the SIA -> DAP transition. Includes SIA2 upgrades #3

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

Bonnarel
Copy link
Contributor

This PR to show changes proposals for the transition between SIA2 to DAP
Includes also specific SIA2 changes as in https://github.com/ivoa-std/SIA/ , PR #23
Doesn't include #PR 22 yet

Corresponding internal draft is here :

https://wiki.ivoa.net/internal/IVOA/SIAP-2_0-Next/DAP.pdf

@gpdf
Copy link

gpdf commented May 21, 2024

For future editing / reviewing clarity, I'd like to suggest that we get the existing approved errata to SIAv2 merged into main for DAP as separate commits, and then rebase this PR accordingly.

Bonnarel added 2 commits May 22, 2024 21:36
changing the date
Adding Gregory Dubois-Felsmann as author and editor
@gmantele
Copy link

What is the status of this PullRequest? It seems to be the basis of any editing to be started in DAP.
Are all editors OK with it? If yes, should it approved and merged?

@pdowler
Copy link
Collaborator

pdowler commented Oct 10, 2024 via email

Copy link
Collaborator

@pdowler pdowler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of very long lines in the original so the diffs are somewhat hard to evaluate. Fixing that will be impossible to evaluate so I will do that in a separate PR that only fixes line folding.

@@ -328,6 +365,9 @@ \subsubsection{POL}

The POL parameter constrains values of the pol\_states column of the ObsCore data model; possible values for the POL parameter are also defined by ObsCore.

This parameter is case insensitive.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure that this really helps anyone. The set of allowed pol states is well known/documented and uses specific case (upper case). This probably just means clients can be careless, services have to convert the case of values, and there is another way for people to get things wrong.

The ID parameter is a string-valued parameter that specifies the identifier of dataset(s). Values of the ID parameter are compared to the obs\_publisher\_did column of the ObsCore data model. Note that IVOIDs MUST be compared case-insensitively. As publisher dataset identifiers in the VO generally are IVOIDs, implementations will usually have to use case-insensitive comparisons here.

The ID parameter is a string-valued parameter that specifies the identifier of dataset(s). Values of the ID parameter are compared to the obs\_publisher\_did column of the ObsCore data model. Note that IVOIDs MUST be compared case-insensitively. As publisher dataset identifiers in the VO generally are IVOIDs, implementations will usually have to use case-insensitive comparisons here. When wildcarding of the end of the ID is needed the expression "extensionof ivo://bla" SHOULD be used.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe, from a quick, search of the doc, that this is the only wildcard searching. I think this is something where we want to discuss a more general solution.

Here I also don't like case-insensitive but that's not entirely in our control. What if an implementor has some other URI scheme (not ivo) -- are those also case-insensitive or just if the param value is an ivo URI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had some doubts : should we add it also for COLLECTION/INSTRUMENT/FACILITY ?

For the solution, the classical "*" seemed to introduce encoding issues. So I imagined this dirty fix. Ready to reconsider that.



The DPTYPE parameter is a string-valued parameter that specifies the type of data. This parameter is case-insensitive. The value is compared with the dataproduct\_type from the ObsCore data model. In contrast to SIA2.0 which allows only dataproduct\_types \textit{image} and \textit{cube}, all the other dataproduct\_types are allowed, in such a way that we can constrain the service to retrieve visibility data, timeseries and event lists. Using DPTYPE with the "spectrum" value can be considered as an upgrade of the SSA protocol.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the accepted values come from ObsCore (or the DataProductType vocabulary) I don't really see what case-insensitive adds here. It's harder to implement and will likely lead to inconsistent implementations. It will get even worse if/when the vocabulary of terms has mixed case terms in it.

DAP.tex Outdated
The RELEASEDATE parameter specifies the range of release dates to be searched for data.
The limits are compared to the obs\_release\_date optional attribute of the ObsCore data model.
They are expressed as 2 ISO 8601 dates in the general case. A single value is searched for an exact match.
As the obs\_release\_date attribute is optional, the service self description (\ref{sec:selfdesc}) informs the user of the availability of this parameter.
RELEASEDATE queries for services not providing the release\_date attribute SHOULD provide an empty response.



\subsubsection{RETRIEVEMODE (2 solutions)}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would have been better to keep the SIA->DAP transition changes separate from upgrades like this. The main issue I see is that implementation is going to be complex for anyone who relies on DAP -> DataLink to handle 1 product to N files cardinality... if that leads to "only some simple DAP services implement this" then that will be bad.

Prefer: extract this from PR and create an issue to discuss further.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont' think it's the only change independant from SIA-> DAP transition. For example the introduction of RELEASEDATe and MOCs are others.

About the discussion on this point I hope to load today on the appropriate the Sydney discussion as I said last friday.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used an audio transcription on line facility and revisited it to assign to participants their names. Its' there https://wiki.ivoa.net/internal/IVOA/InterOpMay2024DAL/DAL-2-notes or on the DAL sessions page of the last interop. Discussion was very rich. I see 4 points.

  • initial controversy : Direct access to cutout using cutout (SODA) full URL in access_url or SODA service descriptors.
  • if cutout direct computation is chosen in combination with DataLink, is "this" item the cutout or the whole image ?
  • can we reintroduce with that change the different "INTERSECT" modes we had in SIAV1 instead of the unique OVERLAP mode. Specially the COVERS mode where the full ROI is included in the image.
  • When we force one output format for the datasets with the FORMAT= parameter how can this appear in combination with DataLink usage because access_format will contain "datalink" and not the format which has been chosen.

I personnaly think for the first controversy there is none : we can have the citout retrievmode forcing a soda access_url and the SODA service descriptor in the response for intercation with the service at the same time.

For the three other items I am going to write issues



The DALI UPLOAD parameter is not used by this version of DAP. The use case of uploading lists of coordinates is covered by the multiple-valued parameters values.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue created to discuss UPLOAD

@gmantele
Copy link

Thanks Pat for looking into this 🙂

I will try to look at the PR today. I just finished work on a prototype DAP implementation that I'll share shortly as well. -- Patrick Dowler Canadian Astronomy Data Centre Victoria, BC, Canada

@msdemlei
Copy link

msdemlei commented Oct 11, 2024 via email

@gmantele
Copy link

About case insensitive parameters

DALI-1.1 establishes clearly that parameter names are case-insensitive. For the parameter values are case sensitive, unless the contrary is specified in the document.

That being said, one of the first issue raised by the P3T is the case-insensitive parameters. The P3T argues all parameters should be case-sensitive, in order to avoid ambiguous behaviors and errors on both sides (client and server).

Personally, I also think that all parameter names should be case-sensitive. For the parameters values, this is up to the standard defining them, but the general rule probably tends toward case-sensitive as well.

About IVO IDs

If I read IVOA Identifiers 2.0:

In addition to scheme and authority as in RFC 3986, in IVOIDs the resource key is also compared case-insensitively. This means that Registry references can be case-folded for processing.
[...]
For instance, given the IVOID
ivo://example.com/res/key1?par=U%20Pic#Part1,
the IVOID
IVO://EXAMPLE.COM/RES/KEY1?par=U%20Pic#Part1
must compare equal

According to me, the recommandation for standard editors should be to write the parts of IVOIDs being compared case-insensitively (e.g. scheme, host) in lower-case. Parameters of course should be written with the appropriate case-folding (as said above). Implementers should be encouraged to do so as well.

@gmantele
Copy link

About DALI saying that parameters are case-INsensitive, I kind of remember that the plan of the P3T is to make this statement change: parameters should be case-sensitive. But let's see how hard it will be to change this major point in DALI (2.0?).

@Bonnarel
Copy link
Contributor Author

About introducing case insensitive Parameter values.
Before P3T, the initial demand came from Adrian Damian (see my slides at May 2020 Interop : https://wiki.ivoa.net/internal/IVOA/InterOpMay2020DAL/SIA2-SODA-nextPyVO-support.pdf and discussion https://wiki.ivoa.net/internal/IVOA/InterOpMay2020DAL/IVOA_Virtual2020_sia2sodapy.txt)

Moreover I think PyVO or Python programmers were complaining about that

For me case sensitiveness was natural, but facing the demand I tried to gigure out where it could be avoided

This is the list of string valued parameters

MOC --> I think the ascii syntax is case sensitive
POL --> Case sensitive was not a must because there seems to be no ambiguity.
ID --> I agree with the comments above
COLLECTION , FACILITY and INSTRUMENT : It seemed diificult to allow case insensitive because these names are strongly dependant from data providers and capital or small letters may be important to them.
DPTYPE : seemed to be possible to relax the case sensitive constraint
TARGET : absolutly impossible to relax there.
FORMAT : seemed to be possible to relax there
RELEASEDATE : Is ISO8601 string. I don't know if capital letters for T and Z are mandatory.

In conclusion, I don't see any problem to force everything to be case sensitive except the ID (for reasons already explained by you above) but this should be rediscussed with PyVO and P3T people.

@pdowler
Copy link
Collaborator

pdowler commented Oct 15, 2024

For the ID param, I think advice to use lower case for the scheme and authority is the best way to reduce pain in the long term. I could/would certainly enforce that in CADC software. After that, I don't think we can/should do any conversion of the path, query, etc. because those are often strings that need to be consistent with other values in use and we can't foresee the consequences. For CADC, the name of the collection and the observations within it are not really in our control, although we can/do give data providers advice on how to avoid pain.

In addition, wouldn't the use case of ID parameter to be to assign a value you found somewhere else? In that case, you just use the value as you found it and it should work... that has been my experience anyway.

@pdowler
Copy link
Collaborator

pdowler commented Oct 15, 2024

For any string params where values are a vocabulary, being case insensitive and letting callers ignore the known vocabulary doesn't buy us any value, so those are pretty easy.


COLLECTION , FACILITY and INSTRUMENT : the service self-descriptor can list the finite set of local values and callers can use them as is, so case-insenstive doesn't seem as valuable as self-descriptor. In addition, there is work on a facility vocabulary and maybe we'd want to accept those values... but I foresee some pain if the facility vocab uses a form that doesn't match what the data provider uses internally (to be discussed in the context of that vocabulary).


TARGET : this is a tough one because self-descriptor isn't really viable for listing all acceptable values, data providers aren't very consistent, and lots of objects have multiple names. I don't know where responsibility for making this work best for users really belongs...

@msdemlei
Copy link

msdemlei commented Oct 16, 2024 via email

@gmantele
Copy link

Oh, talking about normalisation: Do we want to specify that path components like /./ or foo/../bar are forbidden? Or do we just say "No path normalisations are admitted"?

I think there is nothing good to have such path components. If you say, as registry expert, that it is already not that easy to deal with IVOIDs, then, I would say that complicating them even more does not sound like a good idea. So, my opinion is that they should be forbidden.

If we have to say that somewhere, where should it be? My guess would be in the IVOA Identifiers document.

@msdemlei
Copy link

msdemlei commented Oct 17, 2024 via email

@Bonnarel
Copy link
Contributor Author

I created another Pull request (#12) on the main in order to follow Gregory DF suggestion to start DAP discussion on top of SIA version fixed by errata. I also added him as Author/editor and upgraded the title
Pat can you review and merge that one before we go on, on the main one ?
I plan to make some of the changes asked in the last two weeks today.

@Bonnarel
Copy link
Contributor Author

Bonnarel commented Nov 8, 2024

As I wrote 2 weeks ago I created another pull request #12 which is SIA2 + mofied title and authors/editors + 2 SIA errata.
It is the required ground level to figure out the changes we discuss to move to DAP.
Pat , please can you merge that #12 one ?

By the way the CI exceution are now failing. Some library changed and makes the old CI Obsolete. Some you probably know what to do to solve this, because it surely happened for other specs.

For the current PR, I made the following changes in my fork :

+  Cut the lines in the latex source  which were too long
+ Comment out the RETRIEVEMODE section in the source. I didn't remove it because, contrary to Markus I'm not confident that going back to a previous githuIb version will be done. I did in such a way that we can still discuss and decomment / modify to go to the final solution later
 + For FORMAT I made a more self-consistent change but did not remove it. Fill free to do it if this is the final decision
 + for case sensitiveness, I don't care. Fill free to do what is consistent with DALI, PyVO and P3T

I wish you all good ADASS and IVOA meetings

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants