Description
I'm searching a STAC catalog and then iterating over the results with the items() generator:
from pystac_client import Client as stac  # assumed import; `stac` is not defined in the original snippet

def get_search_result(bbox, start, end):
    catalog = stac.open("https://earth-search.aws.element84.com/v1")
    return catalog.search(
        max_items=None,
        collections=['sentinel-2-l2a'],
        bbox=bbox,
        datetime=[start + 'T00:00:00Z', end + 'T00:00:00Z'],
    )

search = get_search_result(bbox, start, end)
for item in search.items():
    # download needed assets
    # process them into product
    ...
It's quite a lengthy loop, as each iteration takes about a minute (I don't know whether that is relevant).
The other day, about 20 minutes into the loop, my worker crashed with a RemoteDisconnected error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.12/threading.py", line 1010, in run
    self._target(*self._args, **self._kwargs)
  File "/home/hsnb/./server-worker.py", line 385, in run_worker
    for item in search.items():
  File "/usr/local/lib/python3.12/site-packages/pystac_client/item_search.py", line 691, in items
    for item in self.items_as_dicts():
  File "/usr/local/lib/python3.12/site-packages/pystac_client/item_search.py", line 702, in items_as_dicts
    for page in self.pages_as_dicts():
  File "/usr/local/lib/python3.12/site-packages/pystac_client/item_search.py", line 734, in pages_as_dicts
    for page in self._stac_io.get_pages(
  File "/usr/local/lib/python3.12/site-packages/pystac_client/stac_api_io.py", line 307, in get_pages
    page = self.read_json(link, parameters=parameters)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/pystac/stac_io.py", line 205, in read_json
    txt = self.read_text(source, *args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/pystac_client/stac_api_io.py", line 162, in read_text
    return self.request(
           ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/pystac_client/stac_api_io.py", line 218, in request
    raise APIError(str(err))
pystac_client.exceptions.APIError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Apparently something went wrong during the communication with the server. Until today, I didn't even know that iterating over the items issues further HTTP requests along the way. Judging by the traceback (pages_as_dicts, get_pages), this happens whenever the next page of results has to be fetched, which makes sense.
This one time it failed, which can happen.
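To illustrate why a request can fail in the middle of the loop: the traceback suggests that requests are issued lazily, one per page of results, while the loop is already running. Below is a minimal stand-alone sketch of that paging pattern. It does not use pystac_client; `fetch_page`, `pages`, `items` and the fake data are hypothetical names invented for this illustration.

```python
from typing import Iterator

# Fake "server": three pages of item IDs (made-up data for illustration).
FAKE_PAGES = [["a", "b"], ["c", "d"], ["e"]]
request_log: list[int] = []  # records which page was requested, in order


def fetch_page(index: int) -> list[str]:
    """Stand-in for one HTTP request that returns one page of results."""
    request_log.append(index)
    return FAKE_PAGES[index]


def pages() -> Iterator[list[str]]:
    """Yield pages one at a time; each page is fetched only when needed."""
    for index in range(len(FAKE_PAGES)):
        yield fetch_page(index)


def items() -> Iterator[str]:
    """Flatten the pages into a stream of single items."""
    for page in pages():
        yield from page


stream = items()
first = next(stream)    # triggers the request for page 0
second = next(stream)   # served from page 0, no new request
requests_after_two = list(request_log)
third = next(stream)    # page 0 is used up, so page 1 is requested now
requests_after_three = list(request_log)
```

If each item takes about a minute to process, the next page request in such a pattern is only sent many minutes after the previous one.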
But how should I handle this? Wrapping the loop in a try ... except would certainly be smart and would at least save my worker from a total crash, but it would still throw me out of the loop. I think it would be nice if pystac_client automatically retried failed requests once or twice.
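To make the problem concrete: once an exception propagates out of a Python generator, that generator is finished and cannot be resumed, so catching the error outside the loop loses the rest of the results. A retry therefore has to happen at the level of the individual request. The following is a rough stand-alone sketch of that idea, not pystac_client code; `call_with_retries`, `fetch_page`, `items` and the fake failure are hypothetical and only stand in for the real request logic.

```python
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def call_with_retries(func: Callable[[], T], attempts: int = 3, delay: float = 0.0) -> T:
    """Call func(); on ConnectionError retry, up to `attempts` calls in total."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            time.sleep(delay * attempt)  # simple linear backoff
    raise ValueError("attempts must be at least 1")


# Fake "server" (made-up data): page 1 fails once, then succeeds.
FAKE_PAGES = [["a", "b"], ["c", "d"]]
failures_left = {1: 1}


def fetch_page(index: int) -> list[str]:
    """Stand-in for one HTTP request; may raise like a dropped connection."""
    if failures_left.get(index, 0) > 0:
        failures_left[index] -= 1
        raise ConnectionError("Remote end closed connection without response")
    return FAKE_PAGES[index]


def items(retry: bool) -> Iterator[str]:
    """Yield items page by page, optionally retrying each page request."""
    for index in range(len(FAKE_PAGES)):
        if retry:
            page = call_with_retries(lambda: fetch_page(index))
        else:
            page = fetch_page(index)
        yield from page


# Without retries: try/except around the loop avoids the crash,
# but the generator is finished and the remaining items are lost.
plain = items(retry=False)
seen_plain = []
try:
    for item in plain:
        seen_plain.append(item)
except ConnectionError:
    pass
leftover = list(plain)  # the generator cannot be resumed

# With retries inside the generator, the loop sees every item.
failures_left[1] = 1  # re-arm the fake failure
seen_retry = list(items(retry=True))
```

In the real case the error surfaced as pystac_client.exceptions.APIError (see the traceback), so that would be the exception to handle rather than ConnectionError.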
Something similar seems to have been discussed recently in #680. That discussion ended with "not planned", because the issue was not seen on the pystac_client side. Maybe this example gives a new perspective on the topic?