Commit 3097fc8

JonoYang and tdruez authored
Move MatchCode-related code (#1077)
* Rename match_to_purldb to match_to_matchcode
* Move matchcode related code from pipes/purldb.py to pipes/matchcode.py
* Update scancode.io settings to get matchcode API settings
    * This is done since matchcode and purldb are technically two separate services with their own users, etc.

Signed-off-by: Jono Yang <jyang@nexb.com>

* Update docstring in MatchToMatchCode pipeline

Signed-off-by: Jono Yang <jyang@nexb.com>

* Update test_matchcode.py

Signed-off-by: Jono Yang <jyang@nexb.com>

* Update test file directory name

Signed-off-by: Jono Yang <jyang@nexb.com>

* Create migration for match_to_matchcode pipeline renaming

Signed-off-by: Jono Yang <jyang@nexb.com>

* Add MatchCode.io integration docs
    * Update references to match_to_purldb to match_to_matchcode in docs

Signed-off-by: Jono Yang <jyang@nexb.com>

* Fix the documentation regarding MatchCode #1077

Signed-off-by: tdruez <tdruez@nexb.com>

---------

Signed-off-by: Jono Yang <jyang@nexb.com>
Signed-off-by: tdruez <tdruez@nexb.com>
Co-authored-by: tdruez <tdruez@nexb.com>
1 parent 99061b9 commit 3097fc8

16 files changed, with 527 additions and 370 deletions

CHANGELOG.rst

Lines changed: 3 additions & 0 deletions
@@ -23,6 +23,9 @@ v34.1.0 (unreleased)
   https://github.com/nexB/scancode.io/issues/1121
   https://github.com/nexB/scancode.io/issues/1122

+- Rename the ``match_to_purldb`` pipeline to ``match_to_matchcode``, and add
+  MatchCode.io API settings to ScanCode.io settings.
+
 v34.0.0 (2024-03-04)
 --------------------

docs/application-settings.rst

Lines changed: 20 additions & 0 deletions
@@ -314,6 +314,26 @@ you can provide the API key using ``VULNERABLECODE_API_KEY``::

     VULNERABLECODE_API_KEY=insert_your_api_key_here

+.. _scancodeio_settings_matchcodeio:
+
+MATCHCODE.IO
+^^^^^^^^^^^^
+
+There is currently no public instance of MatchCode.io.
+
+You can deploy your own instance of MatchCode.io by
+following the instructions provided in the documentation at
+https://purldb.readthedocs.io/.
+
+To configure your local environment, set the ``MATCHCODEIO_URL`` in your ``.env`` file::
+
+    MATCHCODEIO_URL=https://<Address to MatchCode.io host>/
+
+If authentication is enabled on your MatchCode.io instance, you can provide the
+API key using ``MATCHCODEIO_API_KEY``::
+
+    MATCHCODEIO_API_KEY=insert_your_api_key_here
+
 .. _scancodeio_settings_fetch_authentication:

 Fetch Authentication

docs/built-in-pipelines.rst

Lines changed: 7 additions & 7 deletions
@@ -122,17 +122,17 @@ Map Deploy To Develop
     :members:
     :member-order: bysource

-.. _pipeline_match_to_purldb:
+.. _pipeline_match_to_matchcode:

-Match to PurlDB (addon)
------------------------
+Match to MatchCode (addon)
+--------------------------

 .. warning::
-    This pipeline requires access to a PurlDB service.
-    Refer to :ref:`scancodeio_settings_purldb` to configure access to PurlDB in your
-    ScanCode.io instance.
+    This pipeline requires access to a MatchCode.io service.
+    Refer to :ref:`scancodeio_settings_matchcodeio` to configure access to
+    MatchCode.io in your ScanCode.io instance.

-.. autoclass:: scanpipe.pipelines.match_to_purldb.MatchToPurlDB()
+.. autoclass:: scanpipe.pipelines.match_to_matchcode.MatchToMatchCode()
     :members:
     :member-order: bysource

docs/faq.rst

Lines changed: 4 additions & 4 deletions
@@ -70,10 +70,10 @@ existing data, allowing for more comprehensive analysis and insights.
   Before executing this pipeline, make sure to set up
   :ref:`PurlDB <scancodeio_settings_purldb>`.

-- To **match your project codebase resources to PurlDB for Package matches**,
-  utilize the :ref:`match_to_purldb <pipeline_match_to_purldb>` pipeline.
-  It's essential to set up :ref:`PurlDB <scancodeio_settings_purldb>` before executing
-  this pipeline.
+- To **match your project codebase resources to MatchCode.io for Package matches**,
+  utilize the :ref:`match_to_matchcode <pipeline_match_to_matchcode>` pipeline.
+  It's essential to set up :ref:`MatchCode.io <scancodeio_settings_matchcodeio>` before
+  executing this pipeline.

 What is the difference between scan_codebase and scan_single_package pipelines?
 -------------------------------------------------------------------------------

scancodeio/settings.py

Lines changed: 7 additions & 0 deletions
@@ -403,3 +403,10 @@
 PURLDB_USER = env.str("PURLDB_USER", default="")
 PURLDB_PASSWORD = env.str("PURLDB_PASSWORD", default="")
 PURLDB_API_KEY = env.str("PURLDB_API_KEY", default="")
+
+# MatchCode.io integration
+
+MATCHCODEIO_URL = env.str("MATCHCODEIO_URL", default="")
+MATCHCODEIO_USER = env.str("MATCHCODEIO_USER", default="")
+MATCHCODEIO_PASSWORD = env.str("MATCHCODEIO_PASSWORD", default="")
+MATCHCODEIO_API_KEY = env.str("MATCHCODEIO_API_KEY", default="")
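
The four settings above all default to an empty string, so a caller has to decide what "configured" means. The sketch below is one possible reading, not the code moved into `pipes/matchcode.py` (which this hunk does not show). It treats a non-empty `MATCHCODEIO_URL` as "configured". The helper names `load_matchcode_settings`, `is_configured`, and `build_auth_headers` are hypothetical, and the `Token` authorization scheme is an assumption.

```python
import os

# The setting names come from the diff above; everything else is illustrative.
SETTING_NAMES = (
    "MATCHCODEIO_URL",
    "MATCHCODEIO_USER",
    "MATCHCODEIO_PASSWORD",
    "MATCHCODEIO_API_KEY",
)


def load_matchcode_settings(environ=None):
    """Read the MATCHCODEIO_* values, defaulting each one to an empty string."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name, "") for name in SETTING_NAMES}


def is_configured(settings):
    """Assumption: the service counts as configured once a URL is set."""
    return bool(settings["MATCHCODEIO_URL"])


def build_auth_headers(settings):
    """Assumption: the API key is sent as a "Token" authorization header."""
    api_key = settings["MATCHCODEIO_API_KEY"]
    if api_key:
        return {"Authorization": f"Token {api_key}"}
    return {}
```

Passing a plain dictionary instead of `os.environ` keeps the helpers easy to exercise without touching the real environment.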
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+# Generated by Django 5.0.3 on 2024-03-20 22:52
+
+from django.db import migrations
+
+
+pipeline_old_names_mapping = {
+    "match_to_purldb": "match_to_matchcode",
+}
+
+
+def rename_pipelines_data(apps, schema_editor):
+    Run = apps.get_model("scanpipe", "Run")
+    for old_name, new_name in pipeline_old_names_mapping.items():
+        Run.objects.filter(pipeline_name=old_name).update(pipeline_name=new_name)
+
+
+def reverse_rename_pipelines_data(apps, schema_editor):
+    Run = apps.get_model("scanpipe", "Run")
+    for old_name, new_name in pipeline_old_names_mapping.items():
+        Run.objects.filter(pipeline_name=new_name).update(pipeline_name=old_name)
+
+
+class Migration(migrations.Migration):
+    dependencies = [
+        ("scanpipe", "0053_restructure_pipelines_data"),
+    ]
+
+    operations = [
+        migrations.RunPython(
+            rename_pipelines_data,
+            reverse_code=reverse_rename_pipelines_data,
+        ),
+    ]
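
The migration above is reversible because the same mapping drives both directions. The sketch below reproduces that idea on a plain list of pipeline names so it can be checked without a database. The functions `rename_pipeline_names` and `reverse_rename_pipeline_names` are hypothetical stand-ins for the two `RunPython` callables, and the sample run names are illustrative.

```python
# Same mapping as in the migration above.
pipeline_old_names_mapping = {
    "match_to_purldb": "match_to_matchcode",
}


def rename_pipeline_names(names, mapping):
    """Forward direction: replace each old name with its new name."""
    return [mapping.get(name, name) for name in names]


def reverse_rename_pipeline_names(names, mapping):
    """Backward direction: invert the mapping, then apply it the same way."""
    reverse_mapping = {new: old for old, new in mapping.items()}
    return [reverse_mapping.get(name, name) for name in names]


# Names that are not in the mapping pass through unchanged in both directions.
runs = ["match_to_purldb", "scan_codebase"]
renamed = rename_pipeline_names(runs, pipeline_old_names_mapping)
restored = reverse_rename_pipeline_names(renamed, pipeline_old_names_mapping)
```

The round trip only restores the original data if no run already used the new name before the forward pass, which holds here because `match_to_matchcode` is introduced by this commit.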

scanpipe/pipelines/match_to_purldb.py renamed to scanpipe/pipelines/match_to_matchcode.py

Lines changed: 31 additions & 19 deletions
@@ -21,19 +21,26 @@
 # Visit https://github.com/nexB/scancode.io for support and download.

 from scanpipe.pipelines import Pipeline
-from scanpipe.pipes import purldb
+from scanpipe.pipes import matchcode


-class MatchToPurlDB(Pipeline):
+class MatchToMatchCode(Pipeline):
     """
-    Match the codebase resources of a project against PurlDB to identify packages.
+    Match the codebase resources of a project against MatchCode.io to identify packages.

     This process involves:

-    1. generating a JSON scan of the project codebase
-    2. transmitting it to MatchCode on PurlDB and awaiting match results
-    3. creating discovered packages from the package data obtained
-    4. associating the codebase resources with those discovered packages
+    1. Generating a JSON scan of the project codebase
+    2. Transmitting it to MatchCode.io and awaiting match results
+    3. Creating discovered packages from the package data obtained
+    4. Associating the codebase resources with those discovered packages
+
+    Currently, MatchCode.io can only match for archives, directories, and files
+    from Maven and npm Packages.
+
+    This pipeline requires a MatchCode.io instance to be configured and available.
+    There is currently no public instance of MatchCode.io. Reach out to nexB, Inc.
+    for other arrangements.
     """

     download_inputs = False
@@ -42,29 +49,34 @@ class MatchToPurlDB(Pipeline):
     @classmethod
     def steps(cls):
         return (
-            cls.check_purldb_service_availability,
+            cls.check_matchcode_service_availability,
             cls.send_project_json_to_matchcode,
             cls.poll_matching_results,
             cls.create_packages_from_match_results,
         )

-    def check_purldb_service_availability(self):
-        """Check if the PurlDB service if configured and available."""
-        if not purldb.is_configured():
-            raise Exception("PurlDB is not configured.")
+    def check_matchcode_service_availability(self):
+        """Check if the MatchCode.io service is configured and available."""
+        if not matchcode.is_configured():
+            msg = (
+                "MatchCode.io is not configured. Set the MatchCode.io "
+                "related settings to a MatchCode.io instance or reach out "
+                "to the maintainers for other arrangements."
+            )
+            raise Exception(msg)

-        if not purldb.is_available():
-            raise Exception("PurlDB is not available.")
+        if not matchcode.is_available():
+            raise Exception("MatchCode.io is not available.")

     def send_project_json_to_matchcode(self):
-        """Create a JSON scan of the project Codebase and send it to MatchCode."""
-        self.run_url = purldb.send_project_json_to_matchcode(self.project)
+        """Create a JSON scan of the project Codebase and send it to MatchCode.io."""
+        self.run_url = matchcode.send_project_json_to_matchcode(self.project)

     def poll_matching_results(self):
         """Wait until the match results are ready by polling the match run status."""
-        purldb.poll_until_success(self.run_url)
+        matchcode.poll_until_success(self.run_url)

     def create_packages_from_match_results(self):
         """Create DiscoveredPackages from match results."""
-        match_results = purldb.get_match_results(self.run_url)
-        purldb.create_packages_from_match_results(self.project, match_results)
+        match_results = matchcode.get_match_results(self.run_url)
+        matchcode.create_packages_from_match_results(self.project, match_results)
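
The `poll_matching_results` step above delegates to `matchcode.poll_until_success`, whose body is not part of this diff. The sketch below shows one common shape for such a polling loop and should not be read as the actual implementation. The status strings, the `get_status` callable, the `sleep` and `max_attempts` parameters, and the injectable `wait` function are all assumptions made for illustration.

```python
import time

# Hypothetical status values; the real MatchCode.io run statuses may differ.
SUCCESS = "success"
FAILED_STATUSES = ("failure", "stopped", "stale")


def poll_until_success(get_status, sleep=10, max_attempts=60, wait=time.sleep):
    """
    Call get_status() until it reports success.

    Raise RuntimeError on a terminal failure status and TimeoutError when
    max_attempts polls pass without reaching a terminal state.
    """
    for _ in range(max_attempts):
        status = get_status()
        if status == SUCCESS:
            return True
        if status in FAILED_STATUSES:
            raise RuntimeError(f"Match run ended with status: {status}")
        # Not finished yet: pause before asking again.
        wait(sleep)
    raise TimeoutError(f"Match run did not finish after {max_attempts} polls")
```

In a real pipe module, `get_status` would wrap an HTTP request to `run_url`. Injecting it, along with `wait`, keeps the loop testable without network access or real delays.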
