The data structure of the JSON output has changed for file-level licenses:
-
- - We now return ``license_detections`` information at the manifest file-level
-   rather than ``licenses``. This has three data attributes: ``license_expression``,
-   ``detection_log`` and ``matches``. Here the ``matches`` attribute is similar
-   to previous ``licenses`` with some additional changes.
-
- - We added a new file-level attribute for ``license_clues`` with license
-   matches that is the same as the ``matches`` field in ``license_detections``.
-   This has license matches that are mere clues and not proper conclusive
-   license detections.
-
- - We removed the ``license_expressions`` field and replaced this list of
-   license expressions with a single license expression attribute named
-   ``detected_license_expression``. Similarly removed and replaced the
-   ``spdx_license_expressions`` list with ``detected_license_expression_spdx``.
-
- - See `license updates documentation <https://scancode-toolkit.readthedocs.io/en/latest/explanations/license-detection-reference.html#change-in-license-data-format-resource>`_
-   for examples and details.
-
- Similarly, we updated the JSON data structure for license attributes for packages
- both at the file-level ``package_data`` and at the codebase level ``packages``:
-
- - We renamed the ``declared_license`` attribute to ``extracted_license_statement``.
-   This field is always a string now, encoded as YAML; before it could be a list,
-   an object/mapping or a string.
-
- - We added new ``license_detections`` and ``other_license_detections``
-   attributes. The ``license_detections`` attribute tracks detections for the
-   primary, top-level licensing of the package; we track other secondary
-   license detections in ``other_license_detections``.
-
- - We replaced the ``license_expression`` attribute by two new attributes:
-   ``declared_license_expression`` and ``other_license_expression`` with their
-   SPDX counterparts: ``declared_license_expression_spdx`` and
-   ``other_license_expression_spdx``. The meaning for declared vs. other is the
-   same as for license vs. other license
-
- See `license updates documentation <https://scancode-toolkit.readthedocs.io/en/latest/explanations/license-detection-reference.html#comparision-before-after-license-references>`_
- for examples and details.
-
- - There is a new ``--get-license-data`` command line option. This adds two
-   codebase-level attributes: ``license_references`` and ``rule_references``
-   that are respectively a list of licenses and a list of rules objects. These
-   objects hold one scan-wide copy of license metadata and license text.
-   We removed the now redundant license attributes from ``license_detections matches``.
-   See the `license updates documentation <https://scancode-toolkit.readthedocs.io/en/latest/explanations/license-detection-reference.html#change-in-license-data-format-package>`_
-   for examples and details.
-
- - We updated how we report license matches. Previously we were reporting a
-   license ``key`` for each license detected in the license expression of a match
-   leading to repetition and data duplication. We now return one match for each
-   detected ``license_expression``. The license match data structure is now a flat
-   and simpler list of attributes and we no longer nest a ``matched_rule``
-   attribute inside each ``match``. We added a new ``licenses`` attribute listing
-   all the license keys present in the matched license expression.
-   See `license updates documentation <https://scancode-toolkit.readthedocs.io/en/latest/explanations/license-detection-reference.html#licensematch-result-data>`_
- The data structure of the JSON output has changed for licenses at file level:
+
+ - The previously used ``licenses`` attribute is deleted.
+
+ - To replace the ``licenses`` attribute, a new ``license_detections`` attribute
+   is added at the file-level with the license detections in that file.
+   This has three data attributes: ``license_expression``, ``detection_log``
+   and ``matches``. Here ``matches`` is similar to previous ``licenses``
+   with some additional changes in data structure as detailed in the
+   following sections.
+
+ - A new attribute ``license_clues`` is added, which has license matches with the
+   same data structure as the ``matches`` field in ``license_detections``.
+   This has license matches which are mere clues and not proper detections.
+
+ - The ``license_expressions`` field is removed, which was a list of license
+   expressions and it is replaced with ``detected_license_expression`` which
+   is a single license expression. Similarly ``spdx_license_expressions`` was
+   removed and replaced by ``detected_license_expression_spdx``.
+
+ - See `license updates doc <https://scancode-toolkit.readthedocs.io/en/latest/explanations/license-detection-reference.html#change-in-license-data-format-resource>`_
+   for examples and more details.
+
+ - Similarly the data structure of license fields in ``package_data`` and the
+   codebase level ``packages`` has also changed:
+
+ - There is a ``license_detections`` attribute with the detections, same as the
+   file ``license_detections`` attribute, and there is also a
+   ``other_license_detections`` attribute. Here ``license_detections`` has
+   the detections for the primary/declared licenses, and the rest of the
+   secondary detections are at ``other_license_detections``.
+
+ - The ``license_expression`` field has been dropped, and instead we have
+   ``declared_license_expression`` and ``other_license_expression`` fields
+   with their SPDX counterparts: ``declared_license_expression_spdx`` and
+   ``other_license_expression_spdx``.
+
+ - The ``declared_license`` field also has been renamed to
+   ``extracted_license_statement``, and previously this ``declared_license``
+   field could be a list, a dict or a string, but now
+   ``extracted_license_statement`` is always a string.
+
+ See `license updates doc <https://scancode-toolkit.readthedocs.io/en/latest/explanations/license-detection-reference.html#change-in-license-data-format-package>`_
+ for examples and more details.
+
+ - The data structure of License matches has also changed: for every license match
+   we previously had the attribute ``key`` i.e. a license key, but now we have
+   ``license_expression`` instead. So we now return match details once for each
+   matched license expression rather than once for each license in a matched expression.
+   We also have a flat data structure inside ``matches`` instead of the ``matched_rule``
+   data dictionary, and the ``licenses`` attribute now contains data for all the licenses present in the
+   license expression. See `license updates doc <https://scancode-toolkit.readthedocs.io/en/latest/explanations/license-detection-reference.html#licensematch-result-data>`_
+   for examples and more details.
+
+ - There is a new command line option ``--licenses-reference`` which would add license
+   data as reference for all the license detections. This option would add two
+   codebase level attributes: ``license_references`` and ``rule_references``,
+   which are lists of licenses and rules respectively. This also removes the corresponding
+   fields from ``matches`` in ``license_detections`` as they are referenced in these
+   two codebase level fields. This also removes duplication as license/rule data is
+   given only once across the scan and not at every license match.
+   See `license updates doc <https://scancode-toolkit.readthedocs.io/en/latest/explanations/license-detection-reference.html#comparision-before-after-license-references>`_
+   for examples and more details.
+
+ - There is a new ``scancode-reindex-licenses`` command that replaces the
+   ``scancode --reindex-licenses`` command line option.
+
+ - The ``--reindex-licenses-for-all-languages`` CLI option is also moved to
+   the ``scancode-reindex-licenses`` command as an option ``--all-languages``.
+
+ - We can now detect licenses using custom license texts and license rules.
+   These can be provided as a one off in a directory or packaged as a plugin
+   for consistent reuse and deployment.
+
+ - There is an ``--additional-directory`` option with the ``scancode-reindex-licenses``
+   command to use the licenses from the directory.
+
+ - There is also a ``--only-builtin`` option added to only use the builtin
+   licenses to build the cache, once there are plugins installed with
+   additional licenses/rules.
+
+ - See https://github.com/nexB/scancode-toolkit/issues/480 for more details.

  - Scancode LICENSE and RULE files now also contain their data as YAML frontmatter,
    which previously used to be in their respective YAML files. This reduces number of
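To illustrate the new file-level layout described in the changelog entries above, here is a minimal Python sketch that reads the renamed attributes from a scan result. Only the attribute names (``license_detections``, ``license_expression``, ``detection_log``, ``matches``, ``license_clues``, ``detected_license_expression`` and ``detected_license_expression_spdx``) come from the changelog text. The sample values, the top-level ``files`` list, the ``path`` key and the ``summarize_file_licenses`` helper are assumptions made for this example, not output copied from a real scan.

```python
import json

# Hypothetical, hand-written sample shaped like the new file-level output.
# Only the license attribute names come from the changelog; the values,
# the top-level "files" key and "path" are assumptions for illustration.
SAMPLE_SCAN = json.loads("""
{
  "files": [
    {
      "path": "src/example.c",
      "detected_license_expression": "mit",
      "detected_license_expression_spdx": "MIT",
      "license_detections": [
        {
          "license_expression": "mit",
          "detection_log": [],
          "matches": [{"license_expression": "mit"}]
        }
      ],
      "license_clues": []
    }
  ]
}
""")


def summarize_file_licenses(file_entry):
    """Collect the new file-level license attributes into a small summary."""
    detections = file_entry.get("license_detections", [])
    return {
        # Single expressions replace the old license_expressions lists.
        "expression": file_entry.get("detected_license_expression"),
        "expression_spdx": file_entry.get("detected_license_expression_spdx"),
        # One entry per detection, each carrying its own matches list.
        "detection_expressions": [d.get("license_expression") for d in detections],
        "match_count": sum(len(d.get("matches", [])) for d in detections),
        # Clues share the match structure but are not proper detections.
        "clue_count": len(file_entry.get("license_clues", [])),
    }


summaries = {f["path"]: summarize_file_licenses(f) for f in SAMPLE_SCAN["files"]}
print(summaries["src/example.c"])
```

Code written against the old ``licenses`` and ``license_expressions`` attributes would need a similar accessor updated to the new names. Using ``dict.get`` with defaults keeps the sketch tolerant of entries where an attribute is absent.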