Configure browsertrix proxies #1847
Merged
58 commits
f0e67c8 (vnznznz) backend: add ssh proxies configuration
d96fff4 (vnznznz) frontend: add wip ssh proxy selection
2d3e9ef (vnznznz) scripts: add minikube utilities
fca5886 (vnznznz) ssh proxy: fix changing proxy in workflow editor
25b813c (vnznznz) formatting
425bed6 (ikreymer) Merge branch 'main' into configure-socks-proxies
80542df (ikreymer) cleanup: various renaming / simplifications, remove 'ssh' from names,…
eb4f9f1 (ikreymer) fixes: ensure proxyId defaults to "" if none
ba07896 (ikreymer) version: bump to 1.12.0-beta.0
f0a3d11 (vnznznz) fixes: ssh proxy - allow multiline known_hosts file
e893f89 (ikreymer) add proxy support for profiles!
e59e1c8 (ikreymer) make proxies more generic, can support ssh://, socks5:// and http://
d575b87 (vnznznz) show default proxy in `select-crawler-proxy` + misc visual fixes
dbd51ed (ikreymer) Merge branch 'main' into configure-socks-proxies
3969513 (ikreymer) reformat
d96ee8c (ikreymer) Merge branch 'main' into configure-socks-proxies
bd43426 (ikreymer) Merge branch 'main' into configure-socks-proxies
ce71535 (ikreymer) fix ui post frontend refactor, remove authstate
c7b33fc (ikreymer) more removal of authstate, including from comments
310b647 (vnznznz) move proxy config to subchart, allow updating proxies without re-depl…
e48a074 (vnznznz) move passwd hack to main chart
7266d1d (vnznznz) add missing docstring
cfaa3b8 (vnznznz) fix lint error
8663875 (vnznznz) proxies: add shared flag, org proxy settings
c702ba7 (vnznznz) proxies: fix backend bugs
b63322c (vnznznz) frontend: add `proxy_not_found` error message
b3dbfe1 (vnznznz) frontend: add wip admin proxy gui
2e5fa5f (vnznznz) add missing docstring
379f0b7 (ikreymer) Merge branch 'main' into configure-socks-proxies
0cb5d0e (ikreymer) proxy UI fixes after merge
f591b4c (ikreymer) use proxyId from existing profile when running profile browser for ex…
e08500a (ikreymer) proxies subchart: default to 'crawlers' namespace
8d54e28 (ikreymer) Merge branch 'main' into configure-socks-proxies
9549123 (vnznznz) backend: unpin motor dependency, fixes ImportError on backend start
ca37b2b (vnznznz) backend: improve `get_all_crawler_proxies` endpoint path
d958fa6 (vnznznz) backend: disable org shared proxies by default
eaff240 (vnznznz) frontend: few more labels to org proxy admin modal
827023a (vnznznz) frontend: misc text changes
d0839b4 (ikreymer) ensure proxyId saved on Profile
4214572 (ikreymer) Merge branch 'main' into configure-socks-proxies
ae3e909 (ikreymer) ensure proxyId is passed through to profile creation
7b052b5 (ikreymer) add proxy selector to org defaults
81b07a6 (ikreymer) form name fix
8925a2b (ikreymer) fix proxy clearing
4477a1f (ikreymer) misc tweaks: fix workflow default, EmailStr cast, add comments for bt…
f94f31b (ikreymer) Merge branch 'main' into configure-socks-proxies
4dc72a9 (ikreymer) reextract strings
4fd3631 (tw4l) WIP: Start adding documentation
27c753e (ikreymer) adjust placement of socks proxy to be below profiles
74fa4a8 (ikreymer) ensure proxyId included in cronjob, skip cronjob if proxy is missing
68571db (ikreymer) lint fixes
1b3c5dc (tw4l) Update documentation based on review comments
f192bbd (tw4l) Wordsmith docs
a07b4c6 (tw4l) More wordsmithing
7214895 (ikreymer) update proxy docs
93feaf2 (ikreymer) update docs, add proxies subchart to release
c90bc0a (ikreymer) more docs tweaks
3e5302c (ikreymer) rename proxies-passwd-hack -> force-user-and-group-name for clarity
@@ -0,0 +1,27 @@
# Configuring Proxies

Browsertrix can be configured to direct crawling traffic through dedicated proxy servers, so that websites can be crawled from a specific geographic location regardless of where Browsertrix itself is deployed.

This guide covers how to set up proxy servers for use with Browsertrix, as well as how to configure Browsertrix to make those proxies available.

## Proxy Configuration

Browsertrix supports crawling through HTTP and SOCKS5 proxies, including through a SOCKS5 proxy over an SSH tunnel. For more information on what is supported in the underlying Browsertrix Crawler, see the [Browsertrix Crawler documentation](https://crawler.docs.browsertrix.com/user-guide/proxies/).

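As a rough illustration of the supported proxy types, the sketch below checks the scheme of a proxy connection string. The scheme names (`http`, `socks5`, `ssh`) are taken from the commit messages in this pull request; the function name and example hosts are hypothetical, and the Browsertrix Crawler documentation remains the authoritative reference for what is accepted.

```python
from urllib.parse import urlsplit

# Assumption: scheme names as listed in this PR's commit messages
# ("can support ssh://, socks5:// and http://").
SUPPORTED_SCHEMES = {"http", "socks5", "ssh"}


def proxy_scheme(url: str) -> str:
    """Return the scheme of a proxy URL, or raise if it is not a supported one."""
    scheme = urlsplit(url).scheme.lower()
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported proxy scheme: {scheme!r}")
    return scheme


if __name__ == "__main__":
    # Hypothetical hosts, for illustration only.
    for url in ("socks5://proxy.example.com:1080", "ssh://proxy-user@proxy.example.com"):
        print(url, "->", proxy_scheme(url))
```
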
Many commercial proxy services exist. If you plan to use commercially provided proxies, skip ahead to [Browsertrix Configuration](#browsertrix-configuration) below.

To set up your own proxy server for use with Browsertrix as SOCKS5 over SSH, you first need a physical or virtual server to act as the proxy. Once you have access to this remote machine, add the public key of a public/private key pair (we recommend generating a new ECDSA key pair) to it so that it accepts SSH connections authenticated with that key. You will supply the corresponding private key to Browsertrix in [Browsertrix Configuration](#browsertrix-configuration) below.

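One possible way to generate such a key pair is with OpenSSH's `ssh-keygen`. The sketch below only builds the command line; the helper name, key path, comment, and the choice of an empty passphrase are assumptions for illustration, not requirements stated by this guide.

```python
import shlex
from typing import List


def ecdsa_keygen_command(key_path: str, comment: str = "browsertrix-proxy") -> List[str]:
    """Build the argv for generating a new ECDSA key pair with OpenSSH."""
    return [
        "ssh-keygen",
        "-t", "ecdsa",   # key type recommended in the text above
        "-N", "",        # empty passphrase (assumption: the key is used non-interactively)
        "-C", comment,   # free-form comment stored alongside the public key
        "-f", key_path,  # private key path; the public key is written to key_path + ".pub"
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so it can be reviewed first.
    print(shlex.join(ecdsa_keygen_command("./proxy-key")))
```

The resulting argv can be passed to `subprocess.run(..., check=True)` on a machine with OpenSSH installed. The contents of the `.pub` file would then typically be appended to the proxy user's `~/.ssh/authorized_keys` on the remote machine, while the private key is the value later supplied to Browsertrix.
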
(TODO: More technical setup details as needed)

## Browsertrix Configuration

Proxies are configured in Browsertrix through a separate deployment and subchart. This enables easier updates to available proxy servers without needing to redeploy the entire Browsertrix application.

To add or update proxies in your Browsertrix deployment, modify the `btrix-proxies` section of the main Helm chart or your local override.

First, set `enabled` to `true` to enable deploying proxy servers.

Next, add each proxy server that you want to make available within Browsertrix to the `proxies` list. At a minimum, each proxy needs an id, a connection string URL, a label, and a two-letter country code. To share a proxy so that it is potentially available to all organizations on a Browsertrix deployment, set `shared` to `true`. SSH proxy servers also require an `ssh_private_key`, and the contents of a known hosts file can additionally be provided to help secure the connection.

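The minimum requirements above can be expressed as a small pre-deployment check. This is a sketch only: `shared` and `ssh_private_key` are named in this guide, but the other key names (`id`, `url`, `label`, `country_code`) are assumptions about how the `proxies` entries are spelled and should be checked against the chart's values file.

```python
from typing import Dict, List
from urllib.parse import urlsplit

# Assumption: hypothetical key names for the four required fields.
REQUIRED_FIELDS = ("id", "url", "label", "country_code")


def validate_proxy(entry: Dict[str, object]) -> List[str]:
    """Return a list of problems found in one proxy entry (empty if none)."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS if not entry.get(name)]

    code = str(entry.get("country_code") or "")
    if code and (len(code) != 2 or not code.isalpha()):
        problems.append("country_code must be two letters")

    # SSH proxies additionally need a private key.
    is_ssh = urlsplit(str(entry.get("url") or "")).scheme == "ssh"
    if is_ssh and not entry.get("ssh_private_key"):
        problems.append("ssh proxies need ssh_private_key")

    return problems


if __name__ == "__main__":
    # Hypothetical entry mirroring one item of the `proxies` list.
    example = {
        "id": "proxy-1",
        "url": "ssh://proxy-user@proxy.example.com",
        "label": "Example proxy",
        "country_code": "US",
        "shared": True,
        "ssh_private_key": "<contents of the private key file>",
    }
    print(validate_proxy(example))
```

Returning a list of problems instead of raising on the first one makes it easier to report every issue in a proxy entry at once.
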
Once all proxy details are set, deploy the proxies by (TODO: add these details)
tw4l marked this conversation as resolved.