* Clarifications
* Fix up bad servicex.yaml file
* Add more robust funcadl query and clean up
* Enhance documentation for FuncADL sequences and Select call usage
* Clean up, filling things out a little bit.
* Updates
* Make it work for just the front end
@@ -6,65 +6,52 @@ Welcome to the ServiceX contributor guide, and thank you for your interest in co
6
6
Overview
7
7
--------
8
8
9
-
ServiceX uses a microservice architecture,
10
-
and is designed to be hosted on a Kubernetes cluster.
11
-
The ServiceX project uses a polyrepo strategy for source code management:
12
-
the source code for each microservice is located in a dedicated repo.
9
+
The ``servicex`` frontend code uses standard Python packaging and open-source development methodologies. The code is hosted on GitHub,
10
+
and we use the GitHub issue tracker to manage bugs and feature requests. We also use GitHub pull requests for code review and merging.
13
11
14
-
Below is a partial list of these repositories:
15
-
16
-
- `ServiceX <https://github.com/ssl-hep/ServiceX>`_ - Main repository, contains Helm charts for deployment to Kubernetes
17
12
- `ServiceX_frontend <https://github.com/ssl-hep/ServiceX_frontend>`_ - The ServiceX Python library, which enables users to send requests to ServiceX. Currently, this is the only ServiceX frontend client.
18
-
- `ServiceX_App <https://github.com/ssl-hep/ServiceX_App>`_ - The ServiceX API Server, written in Flask.
19
13
20
-
Additional repositories related to the project can be found in the `ssl-hep GitHub organization <https://github.com/ssl-hep>`_.
14
+
Additional repositories related to the ServiceX project can be found in the `ssl-hep GitHub organization <https://github.com/ssl-hep>`_.
21
15
22
-
Please read our `architecture document <https://servicex.readthedocs.io/en/latest/development/architecture/>`_ for more details.
16
+
Join us on Slack
17
+
-----------------
18
+
19
+
We coordinate our efforts on the `IRIS-HEP Slack <http://iris-hep.slack.com>`_.
20
+
Come join this intellectual hub!
21
+
22
+
Issues
23
+
------
24
+
25
+
All development work on the code should start with an issue. Please submit issues for bugs and feature
26
+
requests to the `repository <https://github.com/ssl-hep/ServiceX_frontend>`_.
23
27
24
28
Branching Strategy
25
29
-------------------
26
30
27
-
ServiceX uses a slightly modified GitLab flow. Each repository has a main branch, usually named `develop` (or `master` for the Python frontend). All changes should be made on feature branches and submitted as PRs to the main branch. Releases are frozen on dedicated release branches, e.g. `v1.0.0-RC.2`.
31
+
ServiceX uses a slightly modified GitLab flow. The ``master`` branch is used for releases, and
32
+
all development work occurs on feature branches.
28
33
29
34
Development Workflow
30
35
---------------------
31
36
32
37
1. Set up a local development environment:
33
-
- Decide which microservice (or Helm chart) you'd like to change, and locate the corresponding repository.
34
-
- If you are a not a member of the ``ssl-hep`` GitHub organization, fork the repository.
38
+
- Fork the ``ServiceX_frontend`` repository.
35
39
- Clone the (forked) repository to your local machine:
- Set up a new environment via ``conda`` or ``virtualenv``.
48
42
- Install dependencies, including test dependencies:
49
43
50
44
.. code-block:: bash
51
45
52
-
python3 -m pip install -e .[test]
53
-
54
-
- If the root directory contains a file named ``.pre-commit-config.yaml``, you can install the `pre-commit <https://pre-commit.com/>`_ hooks with:
55
-
56
-
.. code-block:: bash
57
-
58
-
pip install pre-commit
59
-
pre-commit install
46
+
python3 -m pip install -e .[develop]
60
47
61
48
2. Develop your contribution:
62
49
- Pull latest changes from upstream:
63
50
64
51
.. code-block:: bash
65
52
66
-
git checkout develop
67
-
git pull upstream develop
53
+
git checkout master
54
+
git pull upstream master
68
55
69
56
- Create a branch for the feature you want to work on:
70
57
@@ -77,115 +64,5 @@ Development Workflow
77
64
3. Test your changes:
78
65
- Run the full test suite with ``python -m pytest``, or target specific test files with ``python -m pytest tests/path/to/file.py``.
79
66
- Please write new unit tests to cover any changes you make.
80
-
- You can also manually test microservice changes against a full ServiceX deployment by building the Docker image, pushing it to DockerHub, and setting the `image` and `tag` values as follows:
81
-
82
-
.. code-block:: yaml
83
-
84
-
app:
85
-
image: <organization>/<image repository>
86
-
tag: my-feature-branch
87
-
88
-
- For more details, please read our full `deployment guide <https://servicex.readthedocs.io/en/latest/deployment/basic>`_.
89
67
90
68
4. Submit a pull request to the upstream repository
91
-
92
-
93
-
Issues
94
-
------
95
-
96
-
Please submit issues for bugs and feature requests to the `main ServiceX repository <https://github.com/ssl-hep/ServiceX>`_, unless the issue is specific to a single microservice.
97
-
98
-
We manage project priorities with a `ZenHub board <https://app.zenhub.com/workspaces/servicex-5caba4288d0ceb76ea94ae1f/board?repos=180217333,180236972,185614791,182823774,202592339>`_.
99
-
100
-
Join us on Slack
101
-
-----------------
102
-
103
-
We coordinate our efforts on the `IRIS-HEP Slack <http://iris-hep.slack.com>`_.
104
-
Come join this intellectual hub!
105
-
106
-
Running the Full ServiceX Chart Locally
107
-
----------------------------------------
108
-
109
-
You can run ServiceX on your laptop using ``docker`` or another similar tool that supports kubernetes.
110
-
111
-
Prerequisites
112
-
--------------
113
-
114
-
1. ``docker`` is installed and ``kubernetes`` is running (see configuration options).
115
-
2. Make sure ``kubectl`` and ``helm`` are both installed in the shell you'll be doing your development work.
116
-
3. Follow instructions in the deployment guide to install your x509 certificate if you are going to be using any `rucio` or GRID services for your testing.
117
-
118
-
Running the chart
119
-
------------------
120
-
121
-
122
-
1. In the ``Servicex/helm`` directory run ``helm dependency update servicex/``
123
-
2. And install the chart with ``helm install -f values.yaml servicex-testing .\servicex\``
124
-
3. As in the deployment guide, you can now port-forward your servicex ``app`` and ``minio``.
125
-
126
-
How you write your ``values.yaml`` will depend a lot on what you are testing. Here is an example of a minimal one that will load up the `develop` tag for all the container images, and expects an ATLAS GRID cert:
127
-
128
-
.. code-block:: yaml
129
-
130
-
postgres:
131
-
enabled: true
132
-
objectStore:
133
-
publicURL: localhost:9000
134
-
135
-
gridAccount: <your-user>
136
-
137
-
x509Secrets:
138
-
# For ATLAS
139
-
vomsOrg: atlas
140
-
141
-
app:
142
-
ingress:
143
-
host: localhost:5000
144
-
145
-
transformer:
146
-
cachePrefix: '""'
147
-
148
-
149
-
Making Changes
150
-
---------------
151
-
152
-
153
-
The best way to work on ServiceX is using the unit tests. That isn't always possible, of course. When it isn't your development cycle will require you to build any changed containers. A possible workflow is:
154
-
155
-
1. Redeploy the ``helm`` chart (or perhaps use ``upgrade`` rather than ``install`` in the ``helm`` command) and add ``pullPolicy: Never`` to the appropriate app section. For example, add it under ``app:`` in the example file above if you are working on ``servicex_app``.
156
-
2. Change your code (say, in ``servicex_app``).
157
-
3. In the directory for the app should be a ``Dockerfile``. Do the build, and pay attention to the tag. For example, ``docker build -t sslhep/servicex_app:develop .``.
158
-
4. Finally restart the pod, which should cause it to pick up the new build. This might kill a port-forward you have in place, so don't forget to restart that!
159
-
160
-
Debugging Tips
161
-
---------------
162
-
163
-
Microservice architectures can be difficult to test and debug. Here are some
164
-
helpful hints to make this easier.
165
-
166
-
1. Instead of relying on the DID Finder to locate some particular datafile, you
167
-
can mount one of your local directories into the transformer pod and then
168
-
instruct the DID Finder to always offer up the path to that file regardless of
169
-
the submitted DID. You can use the ``hostMount`` value to have a local directory
170
-
mounted into each transformer pod under ``/data``. You can use the
171
-
``didFinder.staticFile`` value to instruct DID Finder to offer up a file from that
172
-
directory.
173
-
2. You can use port-forwarding to expose port 15672 from the RabbitMQ pod to
174
-
your laptop and log into the Rabbit admin console using the username: ``user`` and
175
-
password ``leftfoot1``. From here you can monitor the queues, purge old messages
176
-
and inject your own messages
177
-
178
-
Notes for Maintainers
179
-
---------------------
180
-
181
-
Hotfixes
182
-
--------
183
-
184
-
If a critical bugfix or hotfix must be applied to a previous release, it should be merged to the main branch and then applied to each affected release branch using
185
-
186
-
.. code-block:: bash
187
-
188
-
git cherry-pick <merge commit hash> -m 1
189
-
190
-
Merge commits have 2 parents, so the ``-m 1`` flag is used to specify that the first parent (i.e. previous commit on the main branch) should be used.
docs/query_types.rst (28 additions & 4 deletions)
@@ -53,7 +53,7 @@ This table sumarizes the query types supported by ServiceX and the data formats
53
53
A brief introduction to the query languages
54
54
-------------------------------------------
55
55
56
-
* **FuncADL** is an Analysis Description Language inspired by functional languages. Sophisticated filtering and computation of new values can be expressed by chaining a series of simple functions. Because FuncADL is written independently of the underlying data libraries, it can run on many data formats.
56
+
* **FuncADL** is an Analysis Description Language inspired by functional languages and C#'s LINQ. Sophisticated filtering and computation of new values can be expressed by chaining a series of simple functions. Because FuncADL is written independently of the underlying data libraries, it can run on many data formats.
57
57
58
58
* **Uproot-Raw** passes user requests to the ``.arrays()`` function in ``uproot``. In particular, the branches of the input ``TTrees`` can be filtered, cuts can be specified to select events, and additional expressions can be computed. Additional non-``TTree`` objects can be copied from the inputs to the outputs.
59
59
@@ -107,15 +107,39 @@ Each dictionary either has a ``treename`` key (indicating that it is a query on
107
107
108
108
FuncADL Query Type
109
109
------------------
110
-
The FuncADL Query type is very powerful. It is based on functional programming concepts and allows
111
-
the user to specify complex queries in a very compact form. The query is written in a functional
110
+
FuncADL queries are based on functional programming concepts and allow
111
+
the user to specify complex queries in a compact form. The query is written in a functional
112
112
style, with a series of functions that are applied to the data in sequence. The query is written
113
113
in a string or as typed python objects. Depending on the source file format, the query is translated
114
114
into C++ `EventLoop <https://atlassoftwaredocs.web.cern.ch/analysis-software/AnalysisTools/el_intro/>`_
115
115
code, or uproot python code.
116
116
117
-
Full documentation on the func-adl query language can be found at this `JupyterBook <https://gordonwatts.github.io/xaod_usage/intro.html>`_.
117
+
An example that fetches the :math:`p_T`, :math:`\eta`, and EM fraction of jets from an ATLAS PHYSLITE file is as follows:
118
+
119
+
.. code-block:: python
120
+
121
+
from func_adl_servicex_xaodr22 import FuncADLQueryPHYSLITE, cpp_float
FuncADL is based on the concept of sequences. The events in a dataset are a sequence of events. The jets in an event are a sequence of jets.
134
+
The ``Select`` call applies a function that transforms the input sequence, element-by-element, into an output sequence. In the above example,
135
+
the first ``Select`` call transforms the sequence of events into a sequence of sequences of jets (i.e. a sequence of jets representing
136
+
the jets in an event - a 2D array, if you will). The lambda function passed to the ``Select`` call
137
+
is applied to each event in the input sequence, and the result is a sequence of jets for each event.
138
+
139
+
The dictionary defines the columns of the output file (e.g. the leaves in a ``TTree``). The three ``lambda`` functions in the dictionary are each applied
140
+
to each jet, transforming the sequence of jets into a sequence of :math:`p_T` values, a sequence of :math:`\eta` values, and a sequence of EM fractions.
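The sequence-of-sequences idea can be illustrated without ServiceX at all. The sketch below is a toy model in plain Python, not the ``func_adl`` API: the ``Seq`` class, the ``Jet`` and ``Event`` dataclasses, the ``em_frac`` attribute, and all the numbers are hypothetical names and values invented for illustration. It only mimics how a ``Select`` call maps a function over a sequence, and how a dictionary of ``lambda`` functions turns each per-event sequence of jets into named columns.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


class Seq:
    """Toy stand-in for a FuncADL sequence (hypothetical, not the real API)."""

    def __init__(self, items: Iterable):
        self.items = list(items)

    def Select(self, func: Callable) -> "Seq":
        # Apply func to every element, giving a new sequence of the same length.
        return Seq(func(item) for item in self.items)


@dataclass
class Jet:
    pt: float
    eta: float
    em_frac: float


@dataclass
class Event:
    jets: List[Jet]


# Two made-up events: the first has two jets, the second has one.
events = Seq([
    Event(jets=[Jet(50.0, 0.5, 0.10), Jet(30.0, -1.2, 0.25)]),
    Event(jets=[Jet(80.0, 2.1, 0.05)]),
])

# First Select: sequence of events -> sequence of (sequence of jets).
jets_per_event = events.Select(lambda e: Seq(e.jets))

# Second Select: for each event, build a dictionary of columns. Each column
# is produced by its own Select over that event's jets.
columns = jets_per_event.Select(
    lambda jets: {
        "pt": jets.Select(lambda j: j.pt).items,
        "eta": jets.Select(lambda j: j.eta).items,
        "em_frac": jets.Select(lambda j: j.em_frac).items,
    }
)

for row in columns.items:
    print(row)
```

Each printed row corresponds to one event, and each value in the row is a list with one entry per jet, which is the "2D array" shape described above. A real FuncADL query is not evaluated eagerly like this toy; as noted earlier, it is translated into C++ or uproot code depending on the source file format.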
118
141
142
+
Full documentation on the func-adl query language can be found at this `JupyterBook <https://gordonwatts.github.io/xaod_usage/intro.html>`_.