Commit a4a640e

Merge pull request #4 from blackducksoftware/dev
v1.7 - Minor doc changes and code cleanup
2 parents 70be5f3 + dbe44ac commit a4a640e

5 files changed, +59 -88 lines changed

README.md

Lines changed: 47 additions & 24 deletions
@@ -1,4 +1,4 @@
-# bd_sig_filter - v1.6
+# bd_sig_filter - v1.7
 BD Script to ignore components matched from Signature scan likely to be partial or invalid matches, and
 mark components reviewed which are definitive matches (dependency or component name and version in matched path for
 signature matches).
@@ -137,41 +137,64 @@ The list of components shows the name, matchtypes and current ignore/review stat
 (after running the script with the `--ignore` and `--review` options) in the `To Be Ignored` and `To Be Reviewed`
 columns with an explanation in the `Action` column.

-The following options can be specified:
-
---ignore: Ignore components as shown in the `To Be Ignored` column
---review: Mark components as reviewed as shown in the `To Be Reviewed` column
---no_ignore_test: Do not ignore components with signature paths within test folders
---no_ignore_synopsys: Do not ignore components with signature paths within Synopsys tools folders (for example '.synopsys')
---no_ignore_defaults: Do not ignore components with signature paths in cache/config folders (for example '.git', '.m2', '.local')
---version_match_required:
-Enforce search for component version string in signature paths for marking reviewed
-(Paths containing only the component name will be used for matching otherwise)
---ignore_no_path_matches:
-Components with no match in the signature path are left unreviewed by default, allowing
-manual review. Use this option to ignore these components instead but use with caution
-as it may exclude components which are legitimate (the Signature match path does not
-have to include the component name or version).
+The `Match Score` value shows the result of fuzzy match searching for component name and version strings (note that
+origin component ID is used where available as opposed to the textual name of the component). A score of 200 shows
+an exact match of both component name and version in Signature paths; a lower value shows the possibility of less
+accurate matching.
+
+Options can be used to modify the behaviour of the script as follows:
+
+`--no_ignore_test`:
+Stops components matched only by Signature scanning and containing test folders (test, tests,
+testsuite or testsuites - case insensitive) being marked for ignore (which happens by default).
+
+`--no_ignore_synopsys`:
+Stops components matched only by Signature scanning and containing Synopsys tools folders (.synopsys,
+synopsys-detect, .coverity, synopsys-detect.\*.jar, scan.cli.impl-standalone.jar, seeker-agent.\*,
+Black_Duck_Scan_Installation - case insensitive) being marked for ignore (which happens by default).
+
+`--no_ignore_defaults`:
+Stops components matched only by Signature scanning and containing default folders (.cache,
+.m2, .local, .config, .docker, .npm, .npmrc, .pyenv, .Trash, .git, node_modules - case insensitive)
+being marked for ignore (which happens by default).
+
+`--version_match_required`:
+Enforce search for component version string in signature paths for marking components reviewed
+(Paths containing only the component name will be used for matching otherwise)
+
+`--ignore_no_path_matches`:
+Components with no match in the signature path are left unreviewed by default, allowing
+manual review. Use this option to ignore these components instead but use with caution
+as it may exclude components which are legitimate (the Signature match path does not
+have to include the component name or version).
+
+`--report_unmatched`:
+Create a list of Signature components which will be left UNreviewed

 The options `--report_file` and `--logfile` can be used to output the tabular report and logging data to
 specified files.

 ## PROPOSED WORKFLOW
-The script can classify Signature scan results.
+The script can be used to classify Signature scan results.

-It can mark components as reviewed which are either Dependencies, or which have signature match paths containing
+It can mark components as reviewed which are either Dependencies, or which have Signature match paths containing
 the component name (and optionally component version) and which are therefore highly likely to be correctly identified
-by Signature matching. Fuzzy pattern matching is used so there is the possibility
-that components could be marked as reviewed where only a partial match exists, or components which should be matched
-are not identified meaning that some manual curation may still be required.
+by Signature matching.

-It will also ignore components only matched within extraneous folders (for example created by Synopsys tools,
+It can also ignore components only Signature matched within extraneous folders (for example created by Synopsys tools,
 config/cache folders or test folders).

 Components shown with `No action` are Signature matches where the component name or version
 could not be identified in the signature paths, so they are potential false matches and require manual review.
-Specify the `--ignore_no_path_matches` option to ignore these components automatically,
-however this should be used with caution as these components may be valid and should be manually reviewed.
+
+After running the script and ignoring/reviewing components (using options `--ignore --review`), review the reported
+list of components from the script focussing on those marked with `No Action`. Optionally use the option `--report_unmatched`
+to list the `No Action` components with the full list of Signature match paths to enable assessment whether they should
+be included in the BOM.
+
+If, after inspection, all `No Action` components can be removed from the BOM, the `--ignore_no_path_matches` option can be used to
+ignore these components automatically, however this should be used with caution as these components may be valid
+and should be manually reviewed.

 ## PROCESSING DUPLICATE COMPONENTS
 The script processes multiple versions of the same component in the BOM in several ways as described below:
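
The README change above describes `Match Score` as a value where 200 means an exact match of both name and version. A minimal sketch of how such a score could be read, assuming each of the two fuzzy scores lies on a 0-100 scale: the thresholds 45 and 80 are the ones that appear in `search_component()` in this commit, while `MatchResult` and `classify_scores` are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple


class MatchResult(NamedTuple):
    name_found: bool
    version_found: bool
    match_score: int


# Thresholds as they appear in search_component() in this commit.
NAME_THRESHOLD = 45
VERSION_THRESHOLD = 80


def classify_scores(name_score: int, version_score: int) -> MatchResult:
    """Combine two 0-100 fuzzy scores into a (name, version, total) result.

    Hypothetical helper: the script itself derives the two scores with
    fuzz.token_set_ratio (name) and fuzz.partial_ratio (version).
    """
    return MatchResult(
        name_found=name_score > NAME_THRESHOLD,
        version_found=version_score > VERSION_THRESHOLD,
        match_score=name_score + version_score,
    )


if __name__ == "__main__":
    print(classify_scores(100, 100))  # both found, total of 200
    print(classify_scores(60, 40))    # name only, lower total
```

Because both comparisons are strict, a name score of exactly 45 or a version score of exactly 80 does not count as found.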

bd_sig_filter/BOMClass.py

Lines changed: 3 additions & 3 deletions
@@ -55,20 +55,20 @@ def report_summary(self):
                  ['After', self.complist.count(), self.complist.count_to_be_ignored(),
                   self.complist.count_to_be_reviewed(), self.complist.count_not_to_be_reviewed_ignored()]]
         print("SUMMARY:")
-        print(tabulate(table, headers=["", "Components", "Ignored", "Reviewed", "Neither"], tablefmt="simple"))
+        print(tabulate(table, headers=["", "Components", "Ignored", "Reviewed", "Neither (No Action)"], tablefmt="simple"))
         print()

         if global_values.report_file != '':
             with open(global_values.report_file, "w") as rfile:
                 # Writing data to a file
                 rfile.writelines("SUMMARY:")
-                rfile.writelines(tabulate(table, headers=["", "Components", "Ignored", "Reviewed", "Neither"],
+                rfile.writelines(tabulate(table, headers=["", "Components", "Ignored", "Reviewed", "Neither (No Action)"],
                                           tablefmt="Simple"))
                 rfile.writelines("")

     def report_full(self):
         table = self.complist.get_component_report_data()
-        print(tabulate(table, headers=["Component", "Match Type", "Ignored", "Reviewed", "To be Ignored",
+        print(tabulate(table, headers=["Component/Version", "Match Type", "Ignored", "Reviewed", "To be Ignored",
                                        "To be Reviewed", "Action"]))
         print()
         if global_values.report_file != '':
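
In the report-file branch above, `rfile.writelines("SUMMARY:")` is given a plain string. `writelines` accepts any iterable of strings and adds no line separators, so the heading and the table end up back to back in the file. The sketch below shows the difference with an in-memory buffer; `table_text` is hypothetical sample data standing in for the string returned by `tabulate()`, and `write_with_newlines` is only one possible alternative, not the author's code.

```python
import io

# Stand-in for the string returned by tabulate(); hypothetical sample data.
table_text = "Components  Ignored\n----------  -------\n10          3"


def write_with_writelines(out: io.TextIOBase) -> None:
    # Mirrors the calls in the diff: no newline is added between the pieces.
    out.writelines("SUMMARY:")
    out.writelines(table_text)
    out.writelines("")


def write_with_newlines(out: io.TextIOBase) -> None:
    # Possible alternative: write() with explicit line endings.
    out.write("SUMMARY:\n")
    out.write(table_text + "\n")


joined = io.StringIO()
write_with_writelines(joined)

separated = io.StringIO()
write_with_newlines(separated)
```

With these inputs, `joined` holds the heading immediately followed by the table header on the same line, while `separated` keeps the heading on its own line.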

bd_sig_filter/SigEntryClass.py

Lines changed: 8 additions & 59 deletions
@@ -10,24 +10,21 @@ def __init__(self, src_entry):
             self.src_entry = src_entry
             self.path = src_entry['commentPath']
             elements = re.split(r"!|#|" + os.sep, self.path)
-            # self.elements = self.path.replace("!", os.sep).replace("#", os.sep).split(os.sep)
             self.elements = list(filter(None, elements))

         except KeyError:
             return

     def search_component(self, compname_arr, compver):
-        # logging.debug("")
-        # logging.debug(f"search_component() Checking Comp '{compname}/{compver}' - {self.path}:")
         # If component_version_reqd:
         # - folder matches compname and compver
         # - folder1 matches compname and folder2 matches compver
         # Else:
         # - folder matches compname
         # Returns:
-        # Bool1 - compname found
-        # Bool2 - version found
-        # Match_value - search result against both
+        # - Bool1 - compname found
+        # - Bool2 - version found
+        # - Match_value - search result against both


         best_match_name = 0
@@ -39,96 +36,48 @@ def search_component(self, compname_arr, compver):
             # compstring = f"{cname} {compver}"

             # test of path search
-            # newpath = self.path.replace(os.sep, " ")
             rep = f"[{os.sep}!#]"
             newpath = re.sub(rep, ' ', self.path).lower()
-            # comp_in_path = fuzz.token_set_ratio(compstring, newpath)
             compname_setratio = fuzz.token_set_ratio(cname, newpath)
-            compname_sortratio = fuzz.token_sort_ratio(cname, newpath)
-            compname_partialratio = fuzz.partial_ratio(cname, newpath)
-            compver_setratio = fuzz.token_set_ratio(compver, newpath)
-            compver_sortratio = fuzz.token_sort_ratio(compver, newpath)
+            # compname_sortratio = fuzz.token_sort_ratio(cname, newpath)
+            # compname_partialratio = fuzz.partial_ratio(cname, newpath)
+            # compver_setratio = fuzz.token_set_ratio(compver, newpath)
+            # compver_sortratio = fuzz.token_sort_ratio(compver, newpath)
             compver_partialratio = fuzz.partial_ratio(compver, newpath)

             if compname_setratio + compver_partialratio > best_match_name + best_match_ver:
                 best_match_name = compname_setratio
                 best_match_ver = compver_partialratio
-                # match_path = self.path
                 logging.debug(f"search_component(): TEST '{cname}/{compver}' - {compname_setratio,compver_partialratio}: path='{self.path}")

         if best_match_name > 45:
             name_bool = True
         if best_match_ver > 80:
             ver_bool = True
         return name_bool, ver_bool, best_match_name + best_match_ver
-        # compstring = f"{compname} {compver}"
-        # element_in_compname = 0
-        # compver_in_element = 0
-        # found_compname_only = False
-        # for element in self.elements:
-        # pos = re.search(r"\.dll|\.obj|\.o|\.a|\.lib|\.iso|\.qcow2|\.vmdk|\.vdi|\.ova|\.nbi|\.vib|\.exe|\.img|"
-        # "\.bin|\.apk|\.aac|\.ipa|\.msi|\.zip|\.gz|\.tar|\.xz|\.lz|\.bz2|\.7z|\.rar|"
-        # "\.cpio|\.Z|\.lz4|\.lha|\.arj|\.jar|\.ear|\.war|\.rpm|\.deb|\.dmg|\.pki", element)
-        # if pos is not None:
-        # element = element[:pos.start()]
-        # # How much of the element string is from the compname and version?
-        # # - for example acl-1.3.0.jar
-        # # - Value of 100 indicates either compname or version exists in element
-        # element_in_compstring = fuzz.token_set_ratio(element, compstring)
-        # element_in_compname = fuzz.token_set_ratio(element, compname)
-        # compver_in_element = fuzz.token_set_ratio(compver, element)
-        #
-        # if element_in_compstring > 80:
-        # if compver_in_element > 50:
-        # # element has both compname and version
-        # logging.debug(f"search_component() - MATCHED component name & version ({compstring}) in '{element}'")
-        # return True, True, element_in_compname + compver_in_element
-        # elif element_in_compname > 50 and len(element) > 2:
-        # found_compname_only = True
-        # logging.debug(f"search_component() - FOUND component name ONLY ({compname}) in '{element}'")
-        # elif found_compname_only:
-        # if compver_in_element > 50:
-        # logging.debug(f"search_component() - MATCHED component version ({compver}) in '{element}'")
-        # return True, True, element_in_compname + compver_in_element
-        # else:
-        # test = 1
-        #
-        # if found_compname_only:
-        # logging.debug("search_component() - MATCHED Compname only")
-        # return True, False, element_in_compname + compver_in_element
-        #
-        # logging.debug(f"search_component() - NOT MATCHED")


     def filter_folders(self):
         # Return True if path should be ignored + reason
         if not global_values.no_ignore_synopsys:
-            # syn_folders = ['.synopsys', 'synopsys-detect', '.coverity', 'synopsys-detect.jar',
-            # 'scan.cli.impl-standalone.jar', 'seeker-agent.tgz', 'seeker-agent.zip',
-            # 'Black_Duck_Scan_Installation']
-
             syn_folders_re = (f"\\{os.sep}(\.synopsys|synopsys-detect|\.coverity|synopsys-detect.*\.jar|scan\.cli\.impl-standalone\.jar|"
                               f"seeker-agent.*|Black_Duck_Scan_Installation)\\{os.sep}")
             res = re.search(syn_folders_re, os.sep + self.path + os.sep)
             if res:
                 return True, f"Found {res.group()} folder in Signature match path '{self.path}'"

         if not global_values.no_ignore_defaults:
-            # def_folders = ['.cache', '.m2', '.local', '.cache','.config', '.docker', '.npm', '.npmrc', '.pyenv',
-            # '.Trash', '.git', 'node_modules']
             def_folders_re = (f"\\{os.sep}(\.cache|\.m2|\.local|\.config|\.docker|\.npm|\.npmrc|"
                               f"\.pyenv|\.Trash|\.git|node_modules)\\{os.sep}")
             res = re.search(def_folders_re, os.sep + self.path + os.sep)
             if res:
                 return True, f"Found {res.group()} folder in Signature match path '{self.path}'"

         if not global_values.no_ignore_test:
-            test_folders = f"\\{os.sep}(test|tests|testsuite)\\{os.sep}"
+            test_folders = f"\\{os.sep}(test|tests|testsuite|testsuites)\\{os.sep}"
             res = re.search(test_folders, os.sep + self.path + os.sep, flags=re.IGNORECASE)
             if res:
                 return True, f"Found {res.group()} in Signature match path '{self.path}'"
-                # if e in test_folders:
-                # return True, f"Found '{e}' in Signature match path '{self.path}'"

         return False, ''

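
The folder filters in `filter_folders()` write `\.` inside non-raw f-strings, which recent Python versions report as an invalid escape sequence, and `re.split(r"!|#|" + os.sep, ...)` in `__init__` would produce a pattern ending in a lone backslash wherever `os.sep` is a backslash. One way to sidestep both, sketched under assumptions: escape the separator with `re.escape` and keep the folder fragments in raw strings. `find_folder` and its `sep` parameter are hypothetical names; the folder lists are copied from the diff.

```python
import os
import re

# Folder patterns copied from the diff; the dotted names are regex fragments.
DEFAULT_FOLDERS = [r"\.cache", r"\.m2", r"\.local", r"\.config", r"\.docker", r"\.npm",
                   r"\.npmrc", r"\.pyenv", r"\.Trash", r"\.git", r"node_modules"]
TEST_FOLDERS = ["test", "tests", "testsuite", "testsuites"]


def find_folder(path, folders, sep=os.sep, flags=0):
    """Return the first listed folder found as a whole path element, else None.

    Hypothetical helper; `sep` is a parameter so that both separator
    styles can be exercised on any platform.
    """
    esc = re.escape(sep)
    pattern = f"{esc}({'|'.join(folders)}){esc}"
    # Wrap the path in separators so the first and last elements can match too.
    res = re.search(pattern, sep + path + sep, flags=flags)
    return res.group(1) if res else None


if __name__ == "__main__":
    print(find_folder("src/.git/objects/x", DEFAULT_FOLDERS, sep="/"))
    print(find_folder("src/Tests/foo.c", TEST_FOLDERS, sep="/", flags=re.IGNORECASE))
```

As in the diff, only whole path elements match, so a folder such as `latest` is not treated as a test folder.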

bd_sig_filter/config.py

Lines changed: 0 additions & 1 deletion
@@ -118,7 +118,6 @@ def check_args():
     if args.report_unmatched:
         global_values.report_unmatched = True

-
     if terminate:
         sys.exit(2)
     return
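
Only the tail of `check_args()` is visible in this hunk. As a rough sketch of how boolean switches such as `--report_unmatched` could be declared with `argparse` and copied into settings: the option names come from the README in this commit, while the parser layout, the `Settings` class (standing in for the `global_values` module) and the function bodies are assumptions, not the project's actual code.

```python
import argparse
from dataclasses import dataclass


@dataclass
class Settings:
    """Hypothetical stand-in for the global_values module used in the diff."""
    ignore: bool = False
    review: bool = False
    no_ignore_test: bool = False
    report_unmatched: bool = False
    report_file: str = ""


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="bd_sig_filter")
    # Boolean switches default to False and become True when present.
    for flag in ("ignore", "review", "no_ignore_test", "report_unmatched"):
        parser.add_argument(f"--{flag}", action="store_true")
    parser.add_argument("--report_file", default="")
    return parser


def check_args(argv) -> Settings:
    # Parse an explicit argument list and copy the values into Settings.
    args = build_parser().parse_args(argv)
    return Settings(**vars(args))


if __name__ == "__main__":
    print(check_args(["--ignore", "--review", "--report_unmatched"]))
```

Passing the argument list explicitly rather than reading `sys.argv` inside the function keeps the parsing step easy to exercise in isolation.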

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "bd_sig_filter"
-version = "1.6"
+version = "1.7"
 authors = [
   { name="Matthew Brady", email="mbrad@synopsys.com" },
 ]
