Commit 40752b5

Add example stress test DID Finder that returns large numbers of files in each response
1 parent 5092f9c commit 40752b5

2 files changed: +82 -0 lines changed

README.md

Lines changed: 8 additions & 0 deletions
@@ -148,3 +148,11 @@ All the incoming DID's are expected to be URI's without the schema. As such, the
* `get` - If the value is `all` (the default) then all files in the dataset must be returned. If the value is `available`, then only files that are accessible need be returned.

As an example, if the following URI is given to ServiceX, "rucio://dataset_name?files=20&get=available", then the first 20 available files of the dataset will be processed by the rest of ServiceX.
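
To make the URI example concrete, here is a minimal sketch of how the `files` and `get` parameters could be pulled out of a DID URI using only the Python standard library. The helper name `parse_did_options` is hypothetical and is not part of ServiceX or its DID finder library, and treating a missing `files` parameter as "no limit" is an assumption made for illustration.

```python
from typing import Any, Dict, Optional, Tuple
from urllib.parse import parse_qs, urlsplit


def parse_did_options(did_uri: str) -> Tuple[str, Dict[str, Any]]:
    """Split a DID URI into its dataset name and the `files`/`get` options.

    Hypothetical helper for illustration only; not part of ServiceX.
    """
    parts = urlsplit(did_uri)
    query = parse_qs(parts.query)

    # Assumption: a missing `files` parameter means "no limit" (None).
    files: Optional[int] = int(query["files"][0]) if "files" in query else None

    # `get` defaults to `all`, as described in the parameter list above.
    get = query.get("get", ["all"])[0]
    if get not in ("all", "available"):
        raise ValueError(f"Unsupported value for get: {get!r}")

    # With a `//` after the scheme, the dataset name lands in netloc.
    dataset = parts.netloc + parts.path
    return dataset, {"files": files, "get": get}


dataset, options = parse_did_options("rucio://dataset_name?files=20&get=available")
print(dataset, options)
```

A real DID finder would then apply these options itself, for example by stopping after `files` entries.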

## Stressful DID Finder

As an example, this repo includes a simple DID finder, `stressful_did_finder.py`, that can be used to test the system. It returns a large number of files and takes a long time to run, which makes it useful for testing the system under load.
How best to use it is not yet worked out, but it should prove useful.

It accepts the following arguments:

* `--num-files` - The number of files to return as part of each request. Default is 10.
* `--file-path` - The DID Finder returns the same file over and over. This is the file to return in each response. Default is an empty string.
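
As a quick illustration of how these two arguments behave, the sketch below rebuilds only the `argparse` portion of the script in isolation, leaving out the DID finder library entirely. The function name `build_parser` and the example file URL are hypothetical. Note that `--num-files` arrives as a string and has to be converted with `int()` before use.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the two arguments documented above."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--num-files', dest='num_files', action='store',
                        default='10',
                        help='Number of files to generate for each dataset')
    parser.add_argument('--file-path', dest='file_path', action='store',
                        default='',
                        help='Path to a file to be returned in each response')
    return parser


# Hypothetical invocation: request 5000 copies of a single (made-up) file URL.
args = build_parser().parse_args(
    ['--num-files', '5000', '--file-path', 'root://example.org//store/sample.root'])

# With no arguments, the documented defaults apply.
defaults = build_parser().parse_args([])

print(args.num_files, args.file_path)
print(repr(defaults.num_files), repr(defaults.file_path))
```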
Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
# Copyright (c) 2024, IRIS-HEP
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# * Redistributions of source code must retain the above copyright notice, this
#   list of conditions and the following disclaimer.
#
# * Redistributions in binary form must reproduce the above copyright notice,
#   this list of conditions and the following disclaimer in the documentation
#   and/or other materials provided with the distribution.
#
# * Neither the name of the copyright holder nor the names of its
#   contributors may be used to endorse or promote products derived from
#   this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
import argparse
import logging
from typing import Any, Dict, Generator

from servicex_did_finder_lib import DIDFinderApp

__log = logging.getLogger(__name__)


def find_files(did_name: str,
               info: Dict[str, Any],
               did_finder_args: dict = None) -> Generator[Dict[str, Any], None, None]:
    for i in range(int(did_finder_args['num_files'])):
        yield {
            'paths': did_finder_args['file_path'],
            'adler32': 0,  # No clue
            'file_size': 0,  # Size in bytes if known
            'file_events': i,  # Include clue of how far we've come
        }


def run_open_data():
    # Parse the command line arguments
    parser = argparse.ArgumentParser()
    parser.add_argument('--num-files', dest='num_files', action='store',
                        default='10',
                        help='Number of files to generate for each dataset')

    parser.add_argument('--file-path', dest='file_path', action='store',
                        default='',
                        help='Path to a file to be returned in each response')

    DIDFinderApp.add_did_finder_cnd_arguments(parser)

    __log.info('Starting Stressful DID finder')
    app = DIDFinderApp('stressful_did_finder', parsed_args=parser.parse_args())

    @app.did_lookup_task(name="stressful_did_finder.lookup_dataset")
    def lookup_dataset(self, did: str, dataset_id: int, endpoint: str) -> None:
        self.do_lookup(did=did, dataset_id=dataset_id,
                       endpoint=endpoint, user_did_finder=find_files)

    app.start()


if __name__ == "__main__":
    run_open_data()
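
To see what the generator in this file produces without running ServiceX, the following sketch restates `find_files` as a standalone function and consumes it directly. It is a copy made for illustration, not an import of the committed module. The dataset name and file URL are made up, and passing the arguments as a plain dict with string values is an assumption about how the library hands the parsed command-line arguments to the callback.

```python
from typing import Any, Dict, Generator


def find_files(did_name: str,
               info: Dict[str, Any],
               did_finder_args: Dict[str, Any]) -> Generator[Dict[str, Any], None, None]:
    """Standalone copy of the callback: yield the same file num_files times."""
    for i in range(int(did_finder_args['num_files'])):
        yield {
            'paths': did_finder_args['file_path'],
            'adler32': 0,
            'file_size': 0,
            'file_events': i,  # Running index, so progress is visible downstream
        }


# Hypothetical arguments; values are strings, as they would be from argparse.
example_args = {'num_files': '3',
                'file_path': 'root://example.org//store/sample.root'}

records = list(find_files('any_dataset', {}, example_args))
for record in records:
    print(record['file_events'], record['paths'])
```

Every record carries the same path, and only `file_events` changes, so the counter is the one field that distinguishes one response entry from the next.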
