-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Description
Hey! New checkov user here 👋
How I noticed the problem
While trying to run checkov's test suit on my local machine I noticed that some files were being (unexpectedly) ignored. After a little digging, I realised that this is because I cloned the repo under a path that looks something like: ~/my_git_forks/checkov
.
The reason this is a problem is because the following patterns are treated as regular expressions by checkov:
checkov/checkov/secrets/utils.py
Line 10 in 42d3178
EXCLUDED_PATHS = [*ignored_directories, DEFAULT_EXTERNAL_MODULES_DIR, ".idea", ".git", "venv"] |
and the .git
pattern obviously excludes any paths that include the word 'git' (incl. at least one preceding character)
The more fundamental issues
I think the more fundamental problem is that checkov documents --skip-path
as including regexes, documents CKV_IGNORED_DIRECTORIES
to include non-regex patterns (default values arenode_modules,.terraform,.serverless
instead of node_modules,\\.terraform,\\.serverless
), but then treats them equally in the code by merging them in some places, and explicitly treats them as both regexes and non-regex patterns in the same line of code.
I think someone has noticed something similar in the past because I see a few p.replace(".terraform", r"\.terraform")
sprinkled around the code base.
However this just seems like another inconsistency because if users set --skip-path="^.*\.terraform.*$"
, it will get converted internally to ^.*\\.terraform.*$
, which would not ignore paths like path/to/.terraform
.
How to fix it?
Since checkov allows users to pass regexes to --skip-path
and regards CKV_IGNORED_DIRECTORIES
as legacy, it makes sense to consistently treat all patterns as regular expressions. However, since I just started looking at this project a couple of days ago, I have no idea what the consequences of doing this are...
Some (naive?) suggestions
- Escape the default values for
CKV_IGNORED_DIRECTORIES
:
IGNORED_DIRECTORIES_ENV = os.getenv("CKV_IGNORED_DIRECTORIES", "node_modules,.terraform,.serverless") |
i.e.,:
-IGNORED_DIRECTORIES_ENV = os.getenv("CKV_IGNORED_DIRECTORIES", "node_modules,.terraform,.serverless")
+IGNORED_DIRECTORIES_ENV = os.getenv("CKV_IGNORED_DIRECTORIES", "node_modules,\.terraform,\.serverless")
Same needs to be done here:
self.IGNORED_DIRECTORIES = os.getenv("CKV_IGNORED_DIRECTORIES", "node_modules,.terraform,.serverless") |
- Same here:
checkov/checkov/secrets/utils.py
Line 10 in 42d3178
EXCLUDED_PATHS = [*ignored_directories, DEFAULT_EXTERNAL_MODULES_DIR, ".idea", ".git", "venv"] |
-EXCLUDED_PATHS = [*ignored_directories, DEFAULT_EXTERNAL_MODULES_DIR, ".idea", ".git", "venv"]
+EXCLUDED_PATHS = [*ignored_directories, re.escape(DEFAULT_EXTERNAL_MODULES_DIR), "\.idea", "\.git", "venv"]
- Remove any hard-coded references to
p.replace(".terraform", r"\.terraform")
- Consistently threat all references to
ignored_directories
,excluded_paths
, andskip_path
as regex patterns. This means removing all ~path in skip_path
invocations since this is not compatible with regex (e.g."\.hidden" not in ".hidden/secret.txt"
)- As a safeguard and to avoid ~
path in skip_path
code being committed in the first place, theexcluded_paths
variable should probably be a set ofre.pattern
s instead of strings. e.g.:excluded_paths = {re.compile(p) for p in self.config.skip_path}`
- As a safeguard and to avoid ~
Other things to consider
Currently, patterns set in the form of .hidden
will always exclude directories of the form .hidden-not
. However, this is a problem both in the non-regex ~path in skip_path
way of doing things and naive regex definitions. However, at least with regex the user can be explicit and say something like (^|.*\/)\.hidden\/?



However, this is not very user friendly and maybe there should be a way to allow users to define excluded directories in a .hidden,also_hidden
flavour rather than the verbose (and non-Windown-compatible) --skip-path='(^|.*\/)\.hidden\/?' --skip-path='(^|.*\/)also_hidden\/?'