Commit f72d26a

Add support for tagging input files in CLI and UI #708 (#1069)
* Add support for tagging input files in CLI #708
  Signed-off-by: tdruez <tdruez@nexb.com>
* Add ability to update input source tag in UI #708
  Signed-off-by: tdruez <tdruez@nexb.com>
* Add changelog entry and unit tests #708
  Signed-off-by: tdruez <tdruez@nexb.com>
* Add support for tag in API #708
  Signed-off-by: tdruez <tdruez@nexb.com>
* Refine documentation for the tagging features #708
  Signed-off-by: tdruez <tdruez@nexb.com>
* Fix changelog #708
  Signed-off-by: tdruez <tdruez@nexb.com>

---------

Signed-off-by: tdruez <tdruez@nexb.com>
1 parent 6873047 commit f72d26a

22 files changed, +302 -59 lines changed

CHANGELOG.rst

Lines changed: 6 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -24,11 +24,15 @@ Unreleased
2424
- Improve the inspect_manifest pipeline to accept archives as inputs.
2525
https://github.com/nexB/scancode.io/issues/1034
2626

27-
- Add support for "tagging" download URL inputs using the "#<fragment>" section of the
28-
URL.
27+
- Add support for "tagging" download URL inputs using the "#<fragment>" section of URLs.
2928
This feature is particularly useful in the map_develop_to_deploy pipeline when
3029
download URLs are utilized as inputs. Tags such as "from" and "to" can be specified
3130
by adding "#from" or "#to" fragments at the end of the download URLs.
31+
Using the CLI, the uploaded files can be tagged using the "filename:tag" syntax
32+
while using the `--input-file` arguments.
33+
In the UI, tags can be edited from the Project details view "Inputs" panel.
34+
On the REST API, a new `upload_file_tag` field is available to use along the
35+
`upload_file`.
3236
https://github.com/nexB/scancode.io/issues/708
3337

3438
v33.0.0 (2024-01-16)

docs/built-in-pipelines.rst

Lines changed: 10 additions & 0 deletions
@@ -70,6 +70,16 @@ Load Inventory
7070

7171
Map Deploy To Develop
7272
---------------------
73+
74+
.. warning::
75+
This pipeline requires input files to be tagged with the following:
76+
77+
- "from": For files related to the source code (also known as "develop").
78+
- "to": For files related to the build/binaries (also known as "deploy").
79+
80+
Tagging your input files varies based on whether you are using the REST API,
81+
UI, or CLI. Refer to the :ref:`faq_tag_input_files` section for guidance.
82+
7383
.. autoclass:: scanpipe.pipelines.deploy_to_develop.DeployToDevelop()
7484
:members:
7585
:member-order: bysource

docs/command-line-interface.rst

Lines changed: 16 additions & 0 deletions
@@ -87,9 +87,17 @@ Optional arguments:
8787
- ``--input-file INPUTS_FILES`` Input file locations to copy in the :guilabel:`input/`
8888
work directory.
8989

90+
.. tip::
91+
Use the "filename:tag" syntax to **tag** input files:
92+
``--input-file path/filename:tag``
93+
9094
- ``--input-url INPUT_URLS`` Input URLs to download in the :guilabel:`input/` work
9195
directory.
9296

97+
.. tip::
98+
Use the "url#tag" syntax to tag downloaded files:
99+
``--input-url https://url.com/filename#tag``
100+
93101
- ``--copy-codebase SOURCE_DIRECTORY`` Copy the content of the provided source directory
94102
into the :guilabel:`codebase/` work directory.
95103

@@ -128,9 +136,17 @@ Adds input files in the project's work directory.
128136
- ``--input-file INPUTS_FILES`` Input file locations to copy in the :guilabel:`input/`
129137
work directory.
130138

139+
.. tip::
140+
Use the "filename:tag" syntax to **tag** input files:
141+
``--input-file path/filename:tag``
142+
131143
- ``--input-url INPUT_URLS`` Input URLs to download in the :guilabel:`input/` work
132144
directory.
133145

146+
.. tip::
147+
Use the "url#tag" syntax to tag downloaded files:
148+
``--input-url https://url.com/filename#tag``
149+
134150
- ``--copy-codebase SOURCE_DIRECTORY`` Copy the content of the provided source directory
135151
into the :guilabel:`codebase/` work directory.
136152

docs/faq.rst

Lines changed: 35 additions & 0 deletions
@@ -143,3 +143,38 @@ You can refer to the :ref:`automation` to automate your projects management.
143143
Also, A new GitHub action is available at
144144
`scancode-action repository <https://github.com/nexB/scancode-action>`_
145145
to run ScanCode.io pipelines from your GitHub Workflows.
146+
147+
.. _faq_tag_input_files:
148+
149+
How to tag input files?
150+
-----------------------
151+
152+
Certain pipelines, including the :ref:`pipeline_map_deploy_to_develop`, require input
153+
files to be tagged. This section outlines various methods to tag input files based on
154+
your project management context.
155+
156+
Using download URLs as inputs
157+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
158+
159+
You can provide tags using the "#<fragment>" section of URLs. This tagging method is
160+
universally applicable in the User Interface, REST API, and Command Line Interface.
161+
162+
Example:
163+
164+
.. code-block::
165+
166+
https://url.com/sources.zip#from
167+
https://url.com/binaries.zip#to
168+
169+
Uploading local files
170+
^^^^^^^^^^^^^^^^^^^^^
171+
172+
There are multiple ways to tag input files when uploading local files:
173+
174+
- **User Interface:** Utilize the "Edit flag" link in the "Inputs" panel of the Project
175+
details view.
176+
177+
- **REST API:** Use the "upload_file_tag" field in addition to the "upload_file" field.
178+
179+
- **Command Line Interface:** Tag uploaded files using the "filename:tag" syntax.
180+
Example: ``--input-file path/filename:tag``.
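
As an illustration of the "#<fragment>" convention described in the FAQ entry above, the following sketch separates a tag from a download URL using `urllib.parse.urldefrag` from the Python standard library. The helper name `split_url_tag` is hypothetical, and this is not necessarily how ScanCode.io parses the fragment internally.

```python
from urllib.parse import urldefrag


def split_url_tag(download_url):
    """
    Return a (url_without_fragment, tag) tuple for a download URL.

    Hypothetical helper for illustration only: the tag is whatever follows
    the "#" in the URL, or an empty string when no fragment is present.
    """
    url, fragment = urldefrag(download_url)
    return url, fragment


if __name__ == "__main__":
    for entry in (
        "https://url.com/sources.zip#from",
        "https://url.com/binaries.zip#to",
    ):
        print(split_url_tag(entry))
```

Under this reading, a URL without a fragment simply yields an empty tag.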

docs/rest-api.rst

Lines changed: 6 additions & 0 deletions
@@ -121,6 +121,11 @@ Using cURL:
121121
To upload more than one file, you can use the :ref:`rest_api_add_input` endpoint of
122122
the project.
123123

124+
.. tip::
125+
126+
To tag the ``upload_file``, you can provide the tag value using the
127+
``upload_file_tag`` field.
128+
124129
Using Python and the **"requests"** library:
125130

126131
.. code-block:: python
@@ -222,6 +227,7 @@ This action adds provided ``input_urls`` or ``upload_file`` to the ``project``.
222227
Data:
223228
- ``input_urls``: A list of URLs to download
224229
- ``upload_file``: A file to upload
230+
- ``upload_file_tag``: An optional tag to add on the uploaded file
225231

226232
Using cURL to provide download URLs:
227233
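
The `upload_file_tag` field documented above could be sent alongside `upload_file` roughly as follows. This is a minimal sketch, not the project's documented example: the function name `upload_tagged_file`, the `post` parameter (added so the request can be replaced in tests), and the endpoint URL in the usage section are assumptions. Only the field names `upload_file` and `upload_file_tag` come from the diff.

```python
def upload_tagged_file(add_input_url, file_path, tag="", post=None):
    """
    Upload `file_path` as `upload_file`, with an optional `upload_file_tag`.

    `post` defaults to `requests.post`; it is injectable (an assumption of
    this sketch) so the function can be exercised without a running server.
    """
    if post is None:
        import requests  # Third-party library, imported lazily.

        post = requests.post

    # The tag travels as a regular form field next to the uploaded file.
    data = {"upload_file_tag": tag} if tag else {}
    with open(file_path, "rb") as handle:
        return post(add_input_url, data=data, files={"upload_file": handle})


if __name__ == "__main__":
    # Hypothetical local URL; replace <uuid> with a real project UUID.
    url = "http://localhost/api/projects/<uuid>/add_input/"
    response = upload_tagged_file(url, "sources.zip", tag="from")
    print(response.status_code)
```

The tag is omitted from the form data when empty, which matches the serializer declaring the field with `required=False`.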

scanpipe/api/serializers.py

Lines changed: 4 additions & 1 deletion
@@ -156,6 +156,7 @@ class ProjectSerializer(
156156
help_text="Execute pipeline now",
157157
)
158158
upload_file = serializers.FileField(write_only=True, required=False)
159+
upload_file_tag = serializers.CharField(write_only=True, required=False)
159160
input_urls = StrListField(
160161
write_only=True,
161162
required=False,
@@ -182,6 +183,7 @@ class Meta:
182183
"url",
183184
"uuid",
184185
"upload_file",
186+
"upload_file_tag",
185187
"input_urls",
186188
"webhook_url",
187189
"created_date",
@@ -265,6 +267,7 @@ def create(self, validated_data):
265267
This ensures the Project data integrity before running any pipelines.
266268
"""
267269
upload_file = validated_data.pop("upload_file", None)
270+
upload_file_tag = validated_data.pop("upload_file_tag", "")
268271
input_urls = validated_data.pop("input_urls", [])
269272
pipeline = validated_data.pop("pipeline", [])
270273
execute_now = validated_data.pop("execute_now", False)
@@ -273,7 +276,7 @@ def create(self, validated_data):
273276
project = super().create(validated_data)
274277

275278
if upload_file:
276-
project.add_uploads([upload_file])
279+
project.add_upload(upload_file, tag=upload_file_tag)
277280

278281
for url in input_urls:
279282
project.add_input_source(download_url=url)

scanpipe/api/views.py

Lines changed: 7 additions & 1 deletion
@@ -273,14 +273,20 @@ def add_input(self, request, *args, **kwargs):
273273
return Response(message, status=status.HTTP_400_BAD_REQUEST)
274274

275275
upload_file = request.data.get("upload_file")
276+
upload_file_tag = request.data.get("upload_file_tag", "")
276277
input_urls = request.data.get("input_urls", [])
277278

278279
if not (upload_file or input_urls):
279280
message = {"status": "upload_file or input_urls required."}
280281
return Response(message, status=status.HTTP_400_BAD_REQUEST)
281282

282283
if upload_file:
283-
project.add_uploads([upload_file])
284+
project.add_upload(upload_file, tag=upload_file_tag)
285+
286+
# Add support for providing multiple URLs in a single string.
287+
if isinstance(input_urls, str):
288+
input_urls = input_urls.split()
289+
input_urls = [url for entry in input_urls for url in entry.split()]
284290

285291
for url in input_urls:
286292
project.add_input_source(download_url=url)
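
The `input_urls` handling added in this hunk can be read in isolation as the sketch below. The standalone function name `normalize_input_urls` is hypothetical; the two steps mirror the lines added to `add_input` above.

```python
def normalize_input_urls(input_urls):
    """Flatten `input_urls` into a list holding one URL per item."""
    # A single string may hold several whitespace-separated URLs.
    if isinstance(input_urls, str):
        input_urls = input_urls.split()
    # Each list entry may itself hold several whitespace-separated URLs.
    return [url for entry in input_urls for url in entry.split()]


if __name__ == "__main__":
    print(normalize_input_urls("https://url.com/a.zip#from https://url.com/b.zip#to"))
    print(normalize_input_urls(["https://url.com/a.zip#from\nhttps://url.com/b.zip#to"]))
```

Both a plain string and a list whose entries contain several URLs end up as the same flat list, so each URL reaches `add_input_source` individually.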

scanpipe/forms.py

Lines changed: 22 additions & 0 deletions
@@ -22,6 +22,7 @@
2222

2323
from django import forms
2424
from django.apps import apps
25+
from django.core.exceptions import ObjectDoesNotExist
2526
from django.core.exceptions import ValidationError
2627

2728
from taggit.forms import TagField
@@ -183,6 +184,27 @@ def save(self, project):
183184
return project
184185

185186

187+
class EditInputSourceTagForm(forms.Form):
188+
input_source_uuid = forms.CharField(
189+
max_length=50,
190+
widget=forms.widgets.HiddenInput,
191+
required=True,
192+
)
193+
tag = forms.CharField(
194+
widget=forms.TextInput(attrs={"class": "input"}),
195+
)
196+
197+
def save(self, project):
198+
input_source_uuid = self.cleaned_data.get("input_source_uuid")
199+
try:
200+
input_source = project.inputsources.get(uuid=input_source_uuid)
201+
except (ValidationError, ObjectDoesNotExist):
202+
return
203+
204+
input_source.update(tag=self.cleaned_data.get("tag", ""))
205+
return input_source
206+
207+
186208
class ArchiveProjectForm(forms.Form):
187209
remove_input = forms.BooleanField(
188210
label="Remove inputs",

scanpipe/management/commands/__init__.py

Lines changed: 29 additions & 23 deletions
@@ -150,7 +150,7 @@ def add_arguments(self, parser):
150150
parser.add_argument(
151151
"--input-file",
152152
action="append",
153-
dest="inputs_files",
153+
dest="input_files",
154154
default=list(),
155155
help="Input file locations to copy in the input/ work directory.",
156156
)
@@ -171,28 +171,45 @@ def add_arguments(self, parser):
171171
),
172172
)
173173

174-
def handle_input_files(self, inputs_files):
175-
"""Copy provided `inputs_files` to the project's `input` directory."""
174+
@staticmethod
175+
def extract_tag_from_input_files(input_files):
176+
"""
177+
Add support for the ":tag" suffix in file location.
178+
179+
For example: "/path/to/file.zip:tag"
180+
"""
181+
input_files_data = {}
182+
for file in input_files:
183+
if ":" in file:
184+
key, value = file.split(":", maxsplit=1)
185+
input_files_data.update({key: value})
186+
else:
187+
input_files_data.update({file: ""})
188+
return input_files_data
189+
190+
def handle_input_files(self, input_files_data):
191+
"""Copy provided `input_files` to the project's `input` directory."""
176192
copied = []
177193

178-
for file_location in inputs_files:
194+
for file_location, tag in input_files_data.items():
179195
self.project.copy_input_from(file_location)
180196
filename = Path(file_location).name
181197
copied.append(filename)
182-
self.project.add_input_source(filename=filename, is_uploaded=True)
198+
self.project.add_input_source(
199+
filename=filename,
200+
is_uploaded=True,
201+
tag=tag,
202+
)
183203

184-
msg = f"File{pluralize(inputs_files)} copied to the project inputs directory:"
204+
msg = f"File{pluralize(copied)} copied to the project inputs directory:"
185205
self.stdout.write(msg, self.style.SUCCESS)
186206
msg = "\n".join(["- " + filename for filename in copied])
187207
self.stdout.write(msg)
188208

189209
@staticmethod
190-
def validate_input_files(inputs_files):
191-
"""
192-
Raise an error if one of the provided `inputs_files` is not an existing
193-
file.
194-
"""
195-
for file_location in inputs_files:
210+
def validate_input_files(input_files):
211+
"""Raise an error if one of the provided `input_files` entry does not exist."""
212+
for file_location in input_files:
196213
file_path = Path(file_location)
197214
if not file_path.is_file():
198215
raise CommandError(f"{file_location} not found or not a file")
@@ -224,17 +241,6 @@ def handle_copy_codebase(self, copy_from):
224241
shutil.copytree(src=copy_from, dst=project_codebase, dirs_exist_ok=True)
225242

226243

227-
def validate_input_files(file_locations):
228-
"""
229-
Raise an error if one of the provided `file_locations` is not an existing
230-
file.
231-
"""
232-
for file_location in file_locations:
233-
file_path = Path(file_location)
234-
if not file_path.is_file():
235-
raise CommandError(f"{file_location} not found or not a file")
236-
237-
238244
def validate_copy_from(copy_from):
239245
"""Raise an error if `copy_from` is not an available directory"""
240246
if copy_from:
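
To make the "filename:tag" parsing concrete, here is a standalone version of the `extract_tag_from_input_files` logic from the hunk above, written as a plain function so it can be run outside the management command. The sample paths are made up for illustration.

```python
def extract_tag_from_input_files(input_files):
    """
    Map each file location to its tag, using the ":tag" suffix when present.

    Mirrors the logic in the diff: the entry is split on the first ":" and
    entries without a colon receive an empty tag.
    """
    input_files_data = {}
    for file in input_files:
        if ":" in file:
            location, tag = file.split(":", maxsplit=1)
            input_files_data[location] = tag
        else:
            input_files_data[file] = ""
    return input_files_data


if __name__ == "__main__":
    print(extract_tag_from_input_files(["/path/to/sources.zip:from", "plain.zip"]))
```

Because the split happens on the first colon, a location that itself contains a colon (for example a Windows drive path such as `C:\data\file.zip`) would be split at that colon rather than at the tag separator.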

scanpipe/management/commands/add-input.py

Lines changed: 6 additions & 5 deletions
@@ -32,7 +32,7 @@ class Command(AddInputCommandMixin, ProjectCommand):
3232

3333
def handle(self, *args, **options):
3434
super().handle(*args, **options)
35-
inputs_files = options["inputs_files"]
35+
input_files = options["input_files"]
3636
input_urls = options["input_urls"]
3737
copy_from = options["copy_codebase"]
3838

@@ -41,14 +41,15 @@ def handle(self, *args, **options):
4141
"Cannot add inputs once a pipeline has started to execute on a project."
4242
)
4343

44-
if not (inputs_files or input_urls or copy_from):
44+
if not (input_files or input_urls or copy_from):
4545
raise CommandError(
4646
"Provide inputs with the --input-file, --input-url, or --copy-codebase"
4747
)
4848

49-
if inputs_files:
50-
self.validate_input_files(inputs_files)
51-
self.handle_input_files(inputs_files)
49+
if input_files:
50+
input_files_data = self.extract_tag_from_input_files(input_files)
51+
self.validate_input_files(input_files=input_files_data.keys())
52+
self.handle_input_files(input_files_data)
5253

5354
if input_urls:
5455
self.handle_input_urls(input_urls)
