-
-
Notifications
You must be signed in to change notification settings - Fork 54
Description
Summary of problem
Currently TAPService.search()
is just an alias to run_sync()
, which means that all queries use sync execution regardless of service capabilities.
I think for many/most service providers this may not be the optimal choice, and at least in our case we'd prefer if search uses run_async.
# Current implementation
search = run_sync
Proposed Solutions
I see two potential approaches for implementing this enhancement:
Approach 1: Automatic async-first (capability-based default)
Make search()
automatically check capabilities and use async when available:
def search(self, query, *, language="ADQL", maxrec=None, uploads=None,
force_sync=False, **keywords):
if force_sync or not self._supports_async():
return self.run_sync(query, language=language, maxrec=maxrec, uploads=uploads, **keywords)
else:
return self.run_async(query, language=language, maxrec=maxrec, uploads=uploads, **keywords)
This approach does change the default-behaviour (though I would say it is backward compatible) so that queries will be run async but that shouldn't be visible to users.
With this option we get the benefits and stability of asynchronous job execution and the benefit is that it does not require any changes to the API.
Approach 2: Opt-in via service parameter
Add prefer_async
parameter to TAPService.__init__()
:
class TAPService:
def __init__(self, baseurl, *, capability_description=None, session=None,
prefer_async=False):
# ...
self._prefer_async = prefer_async
def search(self, query, *, language="ADQL", maxrec=None, uploads=None,
force_sync=False, **keywords):
if self._prefer_async and not force_sync and self._supports_async():
return self.run_async(query, language=language, maxrec=maxrec, uploads=uploads, **keywords)
else:
return self.run_sync(query, language=language, maxrec=maxrec, uploads=uploads, **keywords)
The benefits here is that users will get the exact same behavour (zero-risk) and it is an opt-in feature, while leaving the door open for a future change to the default behaviour.
Downside is probably that users need to know about the option to enable it so it may have slower adoption.
Both approaches would check the /capabilities to see whether the service supports async:
Usage Examples
Approach 1: Automatic async-first
service = TAPService("http://example.com/tap")
results = service.search("SELECT * FROM table") # Uses async if available
# Can still force sync when needed
results = service.search("SELECT * FROM table", force_sync=True)
Approach 2: Opt-in parameter
# Default behavior (unchanged)
service = TAPService("http://example.com/tap")
results = service.search("SELECT * FROM table") # Always sync
# Opt-in to async-prefer behavior
service = TAPService("http://example.com/tap", prefer_async=True)
results = service.search("SELECT * FROM table") # Uses async if available
# Can still force sync
results = service.search("SELECT * FROM table", force_sync=True)
Any thoughts on this? If it makes sense I'm happy to work on a PR to enable it.