[Fix] Line number issue for custom detector #3997

Merged 59 commits on May 16, 2025

Commits
f53135d
initial commit
kashifkhan0771 Mar 26, 2025
b4b506d
initial commit
kashifkhan0771 Mar 26, 2025
05c799c
Merge branch 'main' into fix/csm-864
kashifkhan0771 Mar 27, 2025
689f7e4
Merge branch 'main' into fix/csm-864
kashifkhan0771 Mar 28, 2025
9820bed
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 3, 2025
e314175
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 4, 2025
89b8676
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 7, 2025
07cc53e
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 7, 2025
0f7c93b
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 8, 2025
1cf60ce
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 8, 2025
4845160
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 9, 2025
ac805f7
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 9, 2025
215e79d
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 9, 2025
1a007ba
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 9, 2025
1a5a79d
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 10, 2025
e2b58a2
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 11, 2025
8c986bd
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 11, 2025
197dfcb
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 14, 2025
2a4dfcc
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 14, 2025
39558e9
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 14, 2025
a86c204
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 15, 2025
40837cf
added test cases
kashifkhan0771 Apr 15, 2025
aaa767c
fixed test cases
kashifkhan0771 Apr 15, 2025
557954c
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 17, 2025
a3c8c01
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 17, 2025
9282f23
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 17, 2025
f09fb67
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 18, 2025
206889b
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 18, 2025
8cfac10
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 21, 2025
e8e226c
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 22, 2025
05fd56d
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 22, 2025
20c865b
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 23, 2025
1c833fb
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 23, 2025
87bb0f6
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 23, 2025
e68f117
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 24, 2025
46aaf9a
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 25, 2025
0a7eb18
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 28, 2025
d1ea6d5
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 29, 2025
38bf996
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 29, 2025
b2ddeba
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 29, 2025
dc70cc0
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 29, 2025
917b3d9
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 30, 2025
0c9cbd1
Merge branch 'main' into fix/csm-864
kashifkhan0771 Apr 30, 2025
0cb2470
Merge branch 'main' into fix/csm-864
kashifkhan0771 May 2, 2025
b4be3d5
Merge branch 'main' into fix/csm-864
kashifkhan0771 May 5, 2025
bf4a88b
Merge branch 'main' into fix/csm-864
kashifkhan0771 May 5, 2025
df14a8b
Merge branch 'main' into fix/csm-864
kashifkhan0771 May 5, 2025
1b1fa83
Merge branch 'main' into fix/csm-864
kashifkhan0771 May 6, 2025
300a4f5
Merge branch 'main' into fix/csm-864
kashifkhan0771 May 8, 2025
367dd13
Merge branch 'main' into fix/csm-864
kashifkhan0771 May 8, 2025
161bcf5
Merge branch 'main' into fix/csm-864
kashifkhan0771 May 8, 2025
27a18b7
Merge branch 'main' into fix/csm-864
kashifkhan0771 May 9, 2025
bee7791
Merge branch 'main' into fix/csm-864
kashifkhan0771 May 12, 2025
94246c5
Merge branch 'main' into fix/csm-864
kashifkhan0771 May 13, 2025
40b8acc
Merge branch 'main' into fix/csm-864
kashifkhan0771 May 14, 2025
c38961e
removed primary secret from shopify
kashifkhan0771 May 14, 2025
efc3048
Merge branch 'main' into fix/csm-864
kashifkhan0771 May 15, 2025
b83bcf4
Merge branch 'main' into fix/csm-864
kashifkhan0771 May 16, 2025
e83e492
Merge branch 'main' into fix/csm-864
kashifkhan0771 May 16, 2025
1 change: 1 addition & 0 deletions pkg/custom_detectors/CUSTOM_DETECTORS.md
@@ -38,6 +38,7 @@ This guide will walk you through setting up a custom detector in TruffleHog to i
 - **`verify`**: An optional section to validate detected secrets. If you want to verify or unverify detected secrets, this section needs to be configured. If not configured, all detected secrets will be marked as unverified. Read [verification server examples](#verification-server-examples)
 
 **Other allowed parameters:**
+- **`primary_regex_name`**: This parameter allows you to designate the primary regex pattern when multiple regex patterns are defined in the regex section. If a match is found, the match for the designated primary regex will be used to determine the line number. The value must be one of the names specified in the regex section.
 - **`exclude_regexes_capture`**: This parameter allows you to define regex patterns to exclude specific parts of a detected secret. If a match is found within the detected secret, the portion matching this regex is excluded from the result.
 - **`exclude_regexes_match`**: This parameter enables you to define regex patterns to exclude entire matches from being reported as secrets.
 - **`entropy`**: This parameter is used to assess the randomness of detected strings. High entropy often indicates that a string is a potential secret, such as an API key or password, due to its complexity and unpredictability. It helps in filtering false-positives. While an entropy threshold of `3` can be a starting point, it's essential to adjust this value based on your project's specific requirements and the nature of the data you have.
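
The rule that `primary_regex_name` must be one of the names in the regex section can be illustrated with a small check. This is only a sketch: `validatePrimaryRegexName` is a hypothetical helper, not a TruffleHog function, and it assumes that an empty value means no primary regex was designated. Whether and where TruffleHog enforces the rule is not shown in this diff.

```go
package main

import "fmt"

// validatePrimaryRegexName is a hypothetical helper (not part of TruffleHog).
// It returns an error when primary is non-empty and is not a key of regexes.
// Assumption: an empty primary means "no primary regex designated".
func validatePrimaryRegexName(regexes map[string]string, primary string) error {
	if primary == "" {
		return nil
	}
	if _, ok := regexes[primary]; !ok {
		return fmt.Errorf("primary_regex_name %q is not defined in regex", primary)
	}
	return nil
}

func main() {
	regexes := map[string]string{
		"id":     "id_[A-Z0-9]{10}_yy",
		"secret": "secret_[A-Z0-9]{10}_yy",
	}
	fmt.Println(validatePrimaryRegexName(regexes, "secret"))       // <nil>
	fmt.Println(validatePrimaryRegexName(regexes, "token") != nil) // true
}
```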
22 changes: 15 additions & 7 deletions pkg/custom_detectors/custom_detectors.go
@@ -186,21 +186,29 @@ func (c *CustomRegexWebhook) createResults(ctx context.Context, match map[string
 		// TODO: Log we're possibly leaving out results.
 		return ctx.Err()
 	}
+
+	result := detectors.Result{
+		DetectorType: detectorspb.DetectorType_CustomRegex,
+		DetectorName: c.GetName(),
+		ExtraData:    map[string]string{},
+	}
+
 	var raw string
-	for _, values := range match {
+	for key, values := range match {
 		// values[0] contains the entire regex match.
 		secret := values[0]
 		if len(values) > 1 {
 			secret = values[1]
 		}
 		raw += secret
+
+		// if the match belongs to the primary regex, set its value as the primary secret value in the result
+		if c.PrimaryRegexName == key {
+			result.SetPrimarySecretValue(secret)
+		}
 	}
-	result := detectors.Result{
-		DetectorType: detectorspb.DetectorType_CustomRegex,
-		DetectorName: c.GetName(),
-		Raw:          []byte(raw),
-		ExtraData:    map[string]string{},
-	}
+
+	result.Raw = []byte(raw)
 
 	if !verify {
 		select {
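
The loop in this hunk can be tried in isolation with the sketch below. `pickSecrets` is a hypothetical name, and two details are assumptions added for illustration: the keys are sorted so the output is reproducible (the diff ranges over the map directly, and Go does not specify map iteration order), and empty value slices are skipped.

```go
package main

import (
	"fmt"
	"sort"
)

// pickSecrets is a simplified stand-in for the loop in the diff. For each
// named regex, values[0] is the entire match and values[1], when present,
// is the first capture group, which is preferred. It returns the
// concatenated raw value and the secret of the primary regex, if any.
func pickSecrets(match map[string][]string, primaryName string) (raw, primary string) {
	names := make([]string, 0, len(match))
	for name := range match {
		names = append(names, name)
	}
	sort.Strings(names) // assumption: sorted only to keep the sketch deterministic

	for _, name := range names {
		values := match[name]
		if len(values) == 0 {
			continue // assumption: guard added for the sketch
		}
		secret := values[0]
		if len(values) > 1 {
			secret = values[1]
		}
		raw += secret
		if name == primaryName {
			primary = secret
		}
	}
	return raw, primary
}

func main() {
	match := map[string][]string{
		"id":     {"id_ALPHA10100_yy"},
		"secret": {`"secret_YI7C90ACY1_yy"`, "secret_YI7C90ACY1_yy"},
	}
	raw, primary := pickSecrets(match, "secret")
	fmt.Println(raw)     // id_ALPHA10100_yysecret_YI7C90ACY1_yy
	fmt.Println(primary) // secret_YI7C90ACY1_yy
}
```

Because `raw` merges every match into one string, it generally does not appear verbatim in the scanned data, which is why a separate primary value is useful for locating a line.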
19 changes: 19 additions & 0 deletions pkg/custom_detectors/custom_detectors_test.go
@@ -208,6 +208,25 @@ func TestDetector(t *testing.T) {
 	assert.Equal(t, results[0].Raw, []byte(`123456`))
 }
 
+func TestDetectorPrimarySecret(t *testing.T) {
+	detector, err := NewWebhookCustomRegex(&custom_detectorspb.CustomRegex{
+		Name:             "test",
+		Keywords:         []string{"secret"},
+		Regex:            map[string]string{"id": "id_[A-Z0-9]{10}_yy", "secret": "secret_[A-Z0-9]{10}_yy"},
+		PrimaryRegexName: "secret",
+	})
+	assert.NoError(t, err)
+	results, err := detector.FromData(context.Background(), false, []byte(`
+	// getData returns id and secret
+	func getData()(string, string){
+		return "id_ALPHA10100_yy", "secret_YI7C90ACY1_yy"
+	}
+	`))
+	assert.NoError(t, err)
+	assert.Equal(t, 1, len(results))
+	assert.Equal(t, "secret_YI7C90ACY1_yy", results[0].GetPrimarySecretValue())
+}
+
 func BenchmarkProductIndices(b *testing.B) {
 	for i := 0; i < b.N; i++ {
 		_ = productIndices(3, 2, 6)
28 changes: 28 additions & 0 deletions pkg/detectors/detectors.go
@@ -112,6 +112,14 @@ type Result struct {
 	// analysis to run. The keys of the map are analyzer specific and
 	// should match what is expected in the corresponding analyzer.
 	AnalysisInfo map[string]string
+
+	// primarySecret is used when a detector has multiple secret patterns.
+	// This secret is designated to determine the line number.
+	// If set, the line number will correspond to this secret.
+	primarySecret struct {
+		Value string
+		Line  int64
+	}
 }
 
 // CopyVerificationInfo clones verification info (status and error) from another Result struct. This is used when
@@ -134,6 +142,26 @@ func (r *Result) VerificationError() error {
 	return r.verificationError
 }
 
+// SetPrimarySecretValue sets the given value as the primary secret in the result.
+func (r *Result) SetPrimarySecretValue(value string) {
+	if value != "" {
+		r.primarySecret.Value = value
+	}
+}
+
+// SetPrimarySecretLine sets the given line number as the primary secret's line number.
+func (r *Result) SetPrimarySecretLine(line int64) {
+	// the line number is only set if a value is set for the primary secret
+	if r.primarySecret.Value != "" {
+		r.primarySecret.Line = line
+	}
+}
+
+// GetPrimarySecretValue returns the primary secret match value.
+func (r *Result) GetPrimarySecretValue() string {
+	return r.primarySecret.Value
+}
+
 // redactSecrets replaces all instances of the given secrets with [REDACTED] in the error message.
 func redactSecrets(err error, secrets ...string) error {
 	lastErr := unwrapToLast(err)
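
The guards in the new accessors are easy to miss, so here is a self-contained sketch of their behaviour. The `result` type is a stand-in defined only for this example; it copies the `primarySecret` field and the three methods from the diff and omits everything else in `detectors.Result`.

```go
package main

import "fmt"

// result is a stand-in for detectors.Result that keeps only the
// primarySecret field added in the diff.
type result struct {
	primarySecret struct {
		Value string
		Line  int64
	}
}

// SetPrimarySecretValue stores value unless it is empty.
func (r *result) SetPrimarySecretValue(value string) {
	if value != "" {
		r.primarySecret.Value = value
	}
}

// SetPrimarySecretLine stores line only when a primary value is already set.
func (r *result) SetPrimarySecretLine(line int64) {
	if r.primarySecret.Value != "" {
		r.primarySecret.Line = line
	}
}

// GetPrimarySecretValue returns the stored primary value, or "" if unset.
func (r *result) GetPrimarySecretValue() string {
	return r.primarySecret.Value
}

func main() {
	var r result
	r.SetPrimarySecretLine(7) // ignored: no primary value yet
	fmt.Println(r.primarySecret.Line) // 0

	r.SetPrimarySecretValue("secret here")
	r.SetPrimarySecretValue("") // ignored: an empty value does not overwrite
	r.SetPrimarySecretLine(3)
	fmt.Println(r.GetPrimarySecretValue(), r.primarySecret.Line) // secret here 3
}
```

One consequence of these guards is call order: the line number has to be set after the value, otherwise it is silently dropped.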
9 changes: 8 additions & 1 deletion pkg/engine/engine.go
@@ -1254,11 +1254,18 @@ func SupportsLineNumbers(sourceType sourcespb.SourceType) bool {
 
 // FragmentLineOffset sets the line number for a provided source chunk with a given detector result.
 func FragmentLineOffset(chunk *sources.Chunk, result *detectors.Result) (int64, bool) {
-	before, after, found := bytes.Cut(chunk.Data, result.Raw)
+	// get the primary secret value from the result if set
+	secret := result.GetPrimarySecretValue()
+	if secret == "" {
+		secret = string(result.Raw)
+	}
+
+	before, after, found := bytes.Cut(chunk.Data, []byte(secret))
 	if !found {
 		return 0, false
 	}
 	lineNumber := int64(bytes.Count(before, []byte("\n")))
+	result.SetPrimarySecretLine(lineNumber)
 	// If the line contains the ignore tag, we should ignore the result.
 	endLine := bytes.Index(after, []byte("\n"))
 	if endLine == -1 {
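
The lookup in this hunk comes down to `bytes.Cut` followed by counting newlines in the part before the match. The sketch below isolates that step. `lineOffset` is a hypothetical function, and its second return value reports whether the needle was found, which differs from `FragmentLineOffset`, whose second return value reports whether the line carries the ignore tag.

```go
package main

import (
	"bytes"
	"fmt"
)

// lineOffset is a hypothetical, simplified version of the lookup in the diff.
// It searches data for primary when primary is non-empty, otherwise for raw,
// and returns the zero-based line on which the match starts.
func lineOffset(data, raw []byte, primary string) (int64, bool) {
	needle := raw
	if primary != "" {
		needle = []byte(primary)
	}
	before, _, found := bytes.Cut(data, needle)
	if !found {
		return 0, false
	}
	return int64(bytes.Count(before, []byte("\n"))), true
}

func main() {
	data := []byte("line1\nline2\nid here\nsecret here\nline5")
	raw := []byte("id heresecret here") // two matches merged, absent from data

	fmt.Println(lineOffset(data, raw, "secret here")) // 3 true
	fmt.Println(lineOffset(data, raw, ""))            // 0 false
}
```

The second call shows the failure mode this PR addresses: the merged raw value never occurs in the chunk, so no line can be located from it.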
51 changes: 51 additions & 0 deletions pkg/engine/engine_test.go
@@ -159,6 +159,57 @@ func TestFragmentLineOffset(t *testing.T) {
 	}
 }
 
+func TestFragmentLineOffsetWithPrimarySecret(t *testing.T) {
+	primarySecretResult1 := &detectors.Result{
+		Raw: []byte("id heresecret here"), // RAW has two secrets merged
+	}
+
+	primarySecretResult1.SetPrimarySecretValue("secret here") // set `secret here` as primary secret value for line number calculation
+
+	primarySecretResult2 := &detectors.Result{
+		Raw: []byte("idsecret"), // RAW has two secrets merged
+	}
+
+	tests := []struct {
+		name         string
+		chunk        *sources.Chunk
+		result       *detectors.Result
+		expectedLine int64
+		ignore       bool
+	}{
+		{
+			name: "primary secret line number - correct line number",
+			chunk: &sources.Chunk{
+				Data: []byte("line1\nline2\nid here\nsecret here\nline5"),
+			},
+			result:       primarySecretResult1,
+			expectedLine: 3,
+			ignore:       false,
+		},
+		{
+			name: "no primary secret set - wrong line number",
+			chunk: &sources.Chunk{
+				Data: []byte("line1\nline2\nid\nsecret\nline5"),
+			},
+			result:       primarySecretResult2,
+			expectedLine: 0,
+			ignore:       false,
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			lineOffset, isIgnored := FragmentLineOffset(tt.chunk, tt.result)
+			if lineOffset != tt.expectedLine {
+				t.Errorf("Expected line offset to be %d, got %d", tt.expectedLine, lineOffset)
+			}
+			if isIgnored != tt.ignore {
+				t.Errorf("Expected isIgnored to be %v, got %v", tt.ignore, isIgnored)
+			}
+		})
+	}
+}
+
 func setupFragmentLineOffsetBench(totalLines, needleLine int) (*sources.Chunk, *detectors.Result) {
 	data := make([]byte, 0, 4096)
 	needle := []byte("needle")
51 changes: 31 additions & 20 deletions pkg/pb/custom_detectorspb/custom_detectors.pb.go

2 changes: 2 additions & 0 deletions pkg/pb/custom_detectorspb/custom_detectors.pb.validate.go

1 change: 1 addition & 0 deletions proto/custom_detectors.proto
@@ -20,6 +20,7 @@ message CustomRegex {
   repeated string exclude_words = 7;
   float entropy = 8;
   repeated string exclude_regexes_match = 9;
+  string primary_regex_name = 10;
 }

