feat: allow input source from pipe #4088

Open: 7 commits to merge into base: main

Conversation

strazzere
Contributor

@strazzere strazzere commented Apr 23, 2025

While experimenting, I found it easier to pipe data in from other parts of my machine than to flush it to disk first. This can speed things up when pre-processing large amounts of data. It also allowed me to keep disk usage low and just use a large-RAM machine.

Description:

Allow stdin as an input source. This was helpful for dealing with large datasets while avoiding the disk (and keeping memory usage reasonable). It should also make it easier for other folks to use the tool when piping data in from other tools.

My specific use case was processing a rather large file from the web in memory and piping it straight to trufflehog. This was automated on an AWS machine which has a small disk but lots of CPU and memory.

Checklist:

  • Tests passing (make test-community)?
  • Lint passing (make lint; this requires golangci-lint)?

@strazzere strazzere requested review from a team as code owners April 23, 2025 21:35
@zricethezav
Collaborator

@strazzere good lookin' PR. Would you mind adding an example command in the README?

@strazzere
Contributor Author

Documentation fix added

@zricethezav
Collaborator

@strazzere had a discussion internally, wdyt about calling it stdin instead of pipe?

@strazzere
Contributor Author

It doesn't really matter to me. "pipe" felt better since it strongly implies that the input isn't inherently just readable data being processed, though that might just be how I think of it. I can change it over to stdin.

Collaborator

@rosecodym rosecodym left a comment

This looks pretty good! Once you've changed all the names over from "pipe" to "stdin" or "Standard Input" we'd be happy to get it in. (Please change them in both the code and the user-facing text.)

@strazzere
Contributor Author

Requested changes should all be done now, cheers.

@strazzere strazzere requested a review from rosecodym May 14, 2025 16:48
Collaborator

@rosecodym rosecodym left a comment

Very cool, thanks!

3 participants