A Python application to help automate monitoring for website masquerades using urlscan.io and other search platforms.
Masquerade Monitor can help you track potential phishing campaigns (or other campaigns) by searching for websites mimicking legitimate brands. The tool queries search platforms for new scans that match your monitoring criteria, saving the results as HTML reports with thumbnail previews.
The basic idea: it takes a list of monitoring pivots and makes API calls to check whether anything new has appeared since the last check.
It currently supports urlscan.io requests and the Silent Push API (as of April 2025, the results rendering is not yet optimized).
- Queries search platforms for potential masquerade websites (BYO-Queries)
- Multiple platform support (urlscan.io and Silent Push with both WHOIS and webscan data)
- Modular architecture with platform-specific client modules
- Modular template system with components for different report sections and result types
- Extensible template registry for automatically selecting the right template for each result type
- Saves screenshots of detected sites
- Generates standalone HTML reports with embedded screenshots
- Automatically extracts and saves IOCs (Indicators of Compromise) for both urlscan and Silent Push results
- Flexible IOC extraction supporting varied result formats from different endpoints
- CSV and JSON export of extracted IOCs (domains, IPs, URLs, etc.)
- Extension system for post-processing results with custom Python scripts
- CLI support for specifying extensions on-demand with -x or --extension flags
- Dedicated report generation module separated from main application logic
- Includes query metadata (reference, notes, frequency, priority, tags) in reports
- Supports TLP (Traffic Light Protocol) classification for information sharing control
- Supports query groups for organizing related queries and generating comprehensive reports
- Allows hierarchical organization with nested query groups
- Dark mode support with user preference memory
- Interactive image viewer for examining thumbnails in full-screen mode
- Tracks the last run timestamp for each query
- Supports custom lookback periods for searches
- Defangs IOCs (URLs and domains) in reports for safer sharing
- Platform-independent saving and loading of results for testing and development
- Customizable report username
- Branded footer with project links
- API keys stored in .env file for better security
- Automatic data type detection for Silent Push results (WHOIS vs webscan)
- Specialized report formatting for different data types
- Support for all Silent Push API endpoints with configurable endpoint parameter
- hIGMA Integration: Direct support for running hIGMA output files as input queries
Masquerade Monitor now supports direct integration with hIGMA output files. You can run masq-monitor using hIGMA-generated YAML files as input:
python masq_monitor.py --higma /path/to/higma/output.yaml --days 7
This feature automatically:
- Parses hIGMA output YAML files
- Converts hIGMA queries to masq-monitor format using the rules_title as the query name
- Uses the hIGMA metadata description field for query notes
- Imports references from hIGMA metadata with TLP level set to 'red' by default
- Executes the queries against URLScan.io
- Generates comprehensive HTML reports with screenshots
- Extracts and saves IOCs in multiple formats (CSV, JSON)
The integration preserves hIGMA metadata including pivot IDs, references, and threat actor information in the generated reports.
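The conversion steps listed above can be sketched as a small mapping function. This is only an illustration: the layout of a hIGMA entry (a `query` key and a `metadata` block with `description` and `references`) is assumed rather than taken from the hIGMA format, and `higma_entry_to_query` is a hypothetical helper name.

```python
def higma_entry_to_query(entry: dict) -> tuple:
    """Map one hIGMA-style entry to a (name, masq-monitor query) pair.

    The input keys used here ("query", "metadata") are assumptions made
    for illustration; only rules_title, description and references are
    named in the feature description.
    """
    metadata = entry.get("metadata", {})
    name = entry["rules_title"]  # rules_title becomes the query name
    query = {
        "platform": "urlscan",
        "query": entry["query"],
        # The metadata description becomes the query notes
        "notes": metadata.get("description", ""),
        # References are imported with TLP 'red' unless overridden later
        "references": [
            {"url": ref, "tlp_level": "red"}
            for ref in metadata.get("references", [])
        ],
    }
    return name, query


example = {
    "rules_title": "delivery-page-title-pivot",
    "query": "page.title:*download*",
    "metadata": {
        "description": "Pivot on the delivery page title",
        "references": ["https://example.com/report"],
    },
}
name, query = higma_entry_to_query(example)
print(name, query["notes"], query["references"][0]["tlp_level"])
```

Defaulting imported references to the most restrictive TLP level is the conservative choice: a report can be downgraded deliberately with --tlp, but it cannot be un-shared.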
To monitor for USAA masquerades:
- Method 1: Monitoring for scan tasks that have "USAA" in the page.title
- Method 2: Monitoring for scan tasks that have "usaa" in the domain
- Method 3: Monitoring for scan tasks that use the official USAA favicon hash (or any other official USAA asset hashes)
These monitoring techniques should be reviewed and updated periodically.
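As a rough sketch, the three methods above could be expressed as urlscan.io queries and grouped together. The query names usaa-title and usaa-favicon are made up for this example, and the favicon hash is a placeholder that you would replace with the real value.

```python
import json

# Placeholder only: replace with the hash of the official favicon you track.
FAVICON_HASH = "<official-favicon-hash>"

config = {
    "queries": {
        # Method 1: brand name in the page title
        "usaa-title": {"platform": "urlscan", "query": "page.title:*USAA*"},
        # Method 2: brand name in the domain
        "usaa-domain": {"platform": "urlscan", "query": "domain:*usaa*"},
        # Method 3: reuse of an official asset, matched by hash
        "usaa-favicon": {"platform": "urlscan", "query": f'hash:"{FAVICON_HASH}"'},
        # A query group that runs all three and produces one combined report
        "usaa-monitoring": {
            "type": "query_group",
            "queries": ["usaa-title", "usaa-domain", "usaa-favicon"],
            "description": "All USAA masquerade pivots",
        },
    }
}

# Print the fragment so it can be pasted into config.json
print(json.dumps(config, indent=2))
```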
- Clone this repository:
git clone https://github.com/yourusername/masq-monitor.git
cd masq-monitor
- Install dependencies:
pip install -r requirements.txt
- Create your configuration file:
cp config.example.json config.json
- Create your .env file for API keys:
cp .env.example .env
- Edit .env with your urlscan.io API key:
URLSCAN_API_KEY=your_urlscan_api_key_here
- Edit config.json with your desired monitoring queries.
API keys are stored in a .env file, following security best practices. This keeps sensitive credentials out of your code and configuration files. To set up your API key:
- Create a .env file in the root directory of the project (or copy from .env.example)
- Add your URLScan.io API key:
URLSCAN_API_KEY=your_urlscan_api_key_here
The .env file is included in .gitignore to prevent accidentally committing your API keys to version control.
With this setup, team members can share their configuration files (queries, reporting preferences, etc.) without exposing their API keys. Simply share the config.json file, and each team member can use their own .env file with their personal API key.
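To illustrate how a key might be read at runtime, here is a minimal standard-library sketch. It is not the project's loader (which may well use a package such as python-dotenv); `load_env_file` and `get_urlscan_api_key` are hypothetical helpers, and the precedence of real environment variables over the .env file is an assumption.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_urlscan_api_key(env_path: str = ".env") -> str:
    """Prefer a real environment variable, then fall back to the .env file."""
    key = os.environ.get("URLSCAN_API_KEY") or load_env_file(env_path).get("URLSCAN_API_KEY")
    if not key:
        raise RuntimeError("URLSCAN_API_KEY is not set; add it to your .env file")
    return key
```

Failing loudly when the key is missing is usually preferable to sending unauthenticated requests and getting confusing API errors later.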
python masq_monitor.py --list
python masq_monitor.py --query usaa-domain
By default, IOCs (domains, IPs, URLs, etc.) are automatically extracted from the results and saved to CSV files.
python masq_monitor.py --query-group usaa-monitoring
Query groups run multiple related queries and create a combined report with sections for each query. IOCs from all queries in the group are consolidated and saved.
Limiting the lookback period is especially useful for initial runs, to avoid processing too many results:
python masq_monitor.py --query usaa-domain -d 7
This limits the search to results from the last 7 days.
python masq_monitor.py --all
This runs all individual queries (not query groups).
python masq_monitor.py --all-groups
This runs all query groups defined in the configuration.
You can combine the --all and --all-groups flags:
python masq_monitor.py --all --all-groups
You can also limit all queries to a timeframe:
python masq_monitor.py --all --all-groups -d 30
Execute queries from a hIGMA output file:
python masq_monitor.py --higma /path/to/higma/output.yaml
With time limits and TLP level:
python masq_monitor.py --higma /path/to/higma/output.yaml --days 7 --tlp green
Example with the provided hIGMA file:
python masq_monitor.py --higma "D:\SoftwareDevelopment\GitHubRepoClones\hIGMA\plugins\urlscan\output\20250817-144704-lummastealer-sectoprat-delivery-page-title-pivot.yaml" --days 3
You can override the configured extensions and run specific extensions using the -x or --extension flag:
python masq_monitor.py --query usaa-domain --extension extract_mega_nz_url_and_password.py
Run multiple extensions:
python masq_monitor.py --query usaa-domain -x extract_mega_nz_url_and_password.py -x extract_gtm_from_urlscan_dom.py
Extensions specified via CLI will override any extensions configured in the config file for that run. This is useful for:
- Testing new extensions without modifying the config
- Running specific extensions for particular investigations
- Combining extensions from different queries
The extensions will save their output in the extensions/ subdirectory of the query's output folder.
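The override behaviour described above can be sketched with argparse. This is an assumed implementation, not the project's parser: `build_parser` and `resolve_extensions` are hypothetical names, and the de-duplication of global and query-level extensions is a design choice made for this example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="masq_monitor.py")
    parser.add_argument("--query")
    # action="append" lets -x / --extension be repeated on the command line
    parser.add_argument("-x", "--extension", action="append",
                        dest="extensions", default=None)
    return parser


def resolve_extensions(cli_extensions, global_extensions, query_extensions):
    """CLI extensions replace configured ones; otherwise merge config lists."""
    if cli_extensions:
        return list(cli_extensions)
    combined = []
    for name in [*global_extensions, *query_extensions]:
        if name not in combined:  # keep order, drop duplicates
            combined.append(name)
    return combined


args = build_parser().parse_args(
    ["--query", "usaa-domain", "-x", "a.py", "-x", "b.py"]
)
print(resolve_extensions(args.extensions, ["global.py"], ["per_query.py"]))
```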
python masq_monitor.py --config my_custom_config.json --all
You can override the default TLP level specified in the configuration file:
python masq_monitor.py --query usaa-domain --tlp red
This will generate the report with a TLP:RED classification regardless of the default setting in the config file.
If you don't want to save IOCs to CSV files, you can disable this feature:
python masq_monitor.py --query usaa-domain --no-iocs
The configuration file can be in either JSON or YAML format. Use a .json or .yaml/.yml file extension to indicate the format:
{
"output_directory": "output",
"default_days": 7,
"report_username": "Your Name",
"default_tlp_level": "clear",
"default_template_path": "templates/report_template.html",
"extensions": ["extract_gtm_from_urlscan_dom.py"],
"queries": {
// query configurations...
}
}
output_directory: output
default_days: 7
report_username: Your Name
default_tlp_level: clear
default_template_path: templates/report_template.html
extensions:
- extract_gtm_from_urlscan_dom.py
queries:
# query configurations...
When creating your configuration file, you can choose either format:
cp config.example.json config.json
# OR
cp config.example.yaml config.yaml
YAML provides a more readable structure for complex configurations, especially for nested query properties, and is recommended for easier maintenance as your monitoring queries grow.
- output_directory: Directory to store reports and screenshots.
- default_days: Default number of days to limit the search to if no last_run timestamp exists and the --days flag is not specified.
- report_username: Your name or username to be displayed in generated reports.
- default_template_path: The default template to use for all queries that don't have a specific template.
- extensions: An array of extension script filenames from the extensions directory to run globally for all queries.
- queries: A map of named queries to execute against search platforms.
  - platform: Search platform to use for this query. Currently supported: "urlscan", "silentpush". Defaults to "urlscan" if not specified.
  - query: The search query string formatted for the specified platform.
  - endpoint: (Silent Push only) API endpoint to use. Should start with a leading slash (e.g., "/explore/domain/search"). If not specified, defaults to "/explore/scandata/search/raw" for scandata queries.
  - last_run: Timestamp of when the query was last executed. Used to limit searches to only new results since the last run.
  - query_tlp_level: TLP (Traffic Light Protocol) classification for the query itself. Determines how sensitive the search pattern is. Values: "clear", "white", "green", "amber", "red".
  - default_tlp_level: Default TLP classification for report content. Used for report elements without their own explicit TLP level. Values: "clear", "white", "green", "amber", "red".
  - template_path: Optional per-query HTML template to use for the report. If not specified, falls back to the global default_template_path.
  - extensions: An array of extension script filenames to run specifically for this query, in addition to any global extensions.
  - reference: Optional link to documentation or source for the query.
  - notes: Additional contextual information about the query.
  - frequency: Suggested frequency for running this query (e.g., "daily", "weekly").
  - priority: Indicates importance of the query (e.g., "high", "medium", "low").
  - tags: List of keywords to categorize the query.
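The interplay between the --days flag, last_run and default_days can be sketched as a single function. Two things here are assumptions: that an explicit --days value takes priority over last_run, and that last_run is stored as an ISO 8601 string. `resolve_search_start` is a hypothetical name.

```python
from datetime import datetime, timedelta


def resolve_search_start(now, days_flag=None, last_run=None, default_days=7):
    """Return the earliest timestamp a search should cover.

    Assumed precedence: explicit --days, then last_run, then default_days.
    """
    if days_flag is not None:
        return now - timedelta(days=days_flag)
    if last_run:
        # last_run is assumed to be an ISO 8601 string
        return datetime.fromisoformat(last_run)
    return now - timedelta(days=default_days)


now = datetime(2025, 4, 10, 12, 0, 0)
print(resolve_search_start(now, days_flag=30))                      # explicit flag
print(resolve_search_start(now, last_run="2025-04-09T08:30:00"))    # since last run
print(resolve_search_start(now))                                    # first run
```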
Query groups allow you to organize related queries and generate combined reports. A query group is defined with the following options:
- type: Must be set to "query_group" to identify this entry as a query group rather than a regular query.
- queries: An array of query names (or other query groups) that belong to this group.
- description: Description of the query group's purpose.
- description_tlp_level: TLP classification for the description.
- default_tlp_level: Default TLP classification for the group report.
- titles: Array of titles with TLP classifications, similar to regular queries.
- notes, references, frequency, priority, tags: Same metadata fields as regular queries.
- last_run: Timestamp of when the group was last executed.
Query groups can be nested, allowing for a hierarchical organization of your monitoring activities. For example:
- A "banking-group" query group might contain:
- The "usaa-monitoring" query group (which itself contains individual queries)
- The "chase-domain" query
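Resolving a nested group into the individual queries it ultimately contains is a small recursive walk. The sketch below assumes groups and queries live in the same queries map (as the type field suggests); `expand_group` is a hypothetical helper, and the cycle check is an added safeguard rather than documented behaviour.

```python
def expand_group(name, queries, _path=None):
    """Return the leaf query names reachable from a query or query group."""
    path = _path if _path is not None else set()
    if name in path:
        raise ValueError(f"Circular query group reference: {name}")
    entry = queries[name]
    if entry.get("type") != "query_group":
        return [name]  # a regular query is its own leaf
    path.add(name)
    leaves = []
    for member in entry.get("queries", []):
        for leaf in expand_group(member, queries, path):
            if leaf not in leaves:  # keep order, avoid running a query twice
                leaves.append(leaf)
    path.discard(name)
    return leaves


queries = {
    "usaa-domain": {"query": "domain:*usaa*"},
    "usaa-title": {"query": "page.title:*USAA*"},
    "chase-domain": {"query": "domain:*chase*"},
    "usaa-monitoring": {"type": "query_group",
                        "queries": ["usaa-domain", "usaa-title"]},
    "banking-group": {"type": "query_group",
                      "queries": ["usaa-monitoring", "chase-domain"]},
}
print(expand_group("banking-group", queries))
```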
- Domain contains specific text:
domain:*bank*
- Page title contains specific text:
page.title:*login*
- Multiple conditions:
domain:*paypal* AND page.title:*secure*
- Specific hash (e.g., favicon):
hash:"fa6a5a3224d7da66d9e0bdec25f62cf0"
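For orientation, a query such as the ones above could be turned into a urlscan.io Search API request roughly as follows. This sketch only builds the request and does not send it; `build_search_request` is a hypothetical helper, and combining the lookback period with the query through a date range clause is an assumption about how the tool limits results.

```python
from urllib.parse import urlencode
from urllib.request import Request

SEARCH_URL = "https://urlscan.io/api/v1/search/"


def build_search_request(query, api_key, days=None, size=100):
    """Build (but do not send) a urlscan.io search request."""
    if days is not None:
        # Restrict results to the lookback window
        query = f"({query}) AND date:>now-{days}d"
    params = urlencode({"q": query, "size": size})
    return Request(f"{SEARCH_URL}?{params}", headers={"API-Key": api_key})


request = build_search_request("domain:*bank*", "your_urlscan_api_key_here", days=7)
print(request.full_url)
```

Passing the request to urllib.request.urlopen (or an equivalent HTTP client) would return the JSON result list.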
For Silent Push queries, you can use any of the available API endpoints by specifying the endpoint parameter in your query configuration:
{
"silentpush-domain-search": {
"platform": "silentpush",
"endpoint": "/explore/domain/search",
"query": "domain=example.com",
"description": "Search for domains matching example.com"
},
"silentpush-domain-info": {
"platform": "silentpush",
"endpoint": "/explore/domain/domaininfo/example.com",
"query": "",
"description": "Get detailed information about example.com"
},
"silentpush-scandata": {
"platform": "silentpush",
"query": "domain=*phish*",
"description": "Search for phishing domains in scandata (using default endpoint)"
}
}
If no endpoint is specified for a Silent Push query, the system will default to /explore/scandata/search/raw, which is used for general scandata searches.
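The endpoint fallback can be sketched in a few lines. `resolve_endpoint` is a hypothetical helper; adding a missing leading slash is a convenience chosen for this example, not documented behaviour.

```python
DEFAULT_SILENTPUSH_ENDPOINT = "/explore/scandata/search/raw"


def resolve_endpoint(query_config: dict) -> str:
    """Pick the configured endpoint, or the scandata default if none is set."""
    endpoint = query_config.get("endpoint") or DEFAULT_SILENTPUSH_ENDPOINT
    if not endpoint.startswith("/"):
        # Normalise a missing leading slash (assumption for this sketch)
        endpoint = "/" + endpoint
    return endpoint


print(resolve_endpoint({"platform": "silentpush", "query": "domain=*phish*"}))
print(resolve_endpoint({"platform": "silentpush",
                        "endpoint": "/explore/domain/search",
                        "query": "domain=example.com"}))
```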
Masquerade Monitor includes an extension system for post-processing query results with your own custom scripts. Extensions allow you to extract additional data, perform further analysis, or integrate with other tools.
Extensions are Python scripts that run after a query has completed. They can access all the query outputs, including IOCs and scan details, and can generate their own output files.
To configure extensions, add them to your configuration file:
{
"extensions": ["extract_gtm_from_urlscan_dom.py"], // Global extensions that run for all queries
"queries": {
"banking-domain-search": {
"query": "domain:*bank*",
"extensions": ["extract_analytics_ids.py"] // Query-specific extensions
}
}
}
Masquerade Monitor comes with the following extensions:
- extract_gtm_from_urlscan_dom.py: Extracts Google Tag Manager IDs from the URLScan DOM, which can be useful for tracking common infrastructure across phishing campaigns.
- extract_strings_from_primary_request.py: Extracts string patterns from the primary request responses in URLScan results. This extension identifies the primary HTTP request in a scan, retrieves its response data, and looks for configurable string patterns that might indicate sensitive or malicious content.
To create a custom extension:
- Create a Python script in the /extensions/ directory.
- Implement a main(run_dir) function that processes the results:
from pathlib import Path


def main(run_dir):
    """
    Main entry point for the extension.

    Args:
        run_dir: Path to the output directory for this query run
    """
    print(f"Processing results in: {run_dir}")
    # IOCs extracted by the query are stored in this directory
    iocs_dir = Path(run_dir) / "iocs"
    # Extension output goes in the "extensions" subdirectory
    output_dir = Path(run_dir) / "extensions"
    output_dir.mkdir(exist_ok=True)
    # Write results to a file
    with open(output_dir / "my_extension_results.csv", "w") as f:
        f.write("data,extracted\n")
        f.write("example,result\n")

Extension output is stored in the /extensions/ directory inside each query's output folder:
output/
query-name_YYYYMMDD_HHMMSS/
extensions/
extract_gtm_from_urlscan_dom.csv
my_extension_results.csv
Extensions run in parallel using threads to avoid blocking the main application, and have access to all files generated by the query.
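One way such threaded execution could work is sketched here with the standard library. It assumes only the documented contract that each extension exposes main(run_dir); `load_extension` and `run_extensions` are hypothetical names, and the actual scheduling inside Masquerade Monitor may differ.

```python
import importlib.util
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load_extension(script_path):
    """Import an extension script from its file path."""
    spec = importlib.util.spec_from_file_location(Path(script_path).stem, script_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def run_extensions(script_paths, run_dir, max_workers=4):
    """Run each extension's main(run_dir) in a worker thread."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(load_extension(path).main, run_dir): Path(path).name
            for path in script_paths
        }
        for future, name in futures.items():
            try:
                future.result()
                results[name] = "ok"
            except Exception as exc:
                # One failing extension should not stop the others
                results[name] = f"failed: {exc}"
    return results
```

Collecting a per-extension status instead of letting exceptions propagate keeps one broken script from hiding the output of the rest.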
The tool generates standalone HTML reports in the output directory with the following structure:
For individual queries:
output/
query-name_YYYYMMDD_HHMMSS/
report_query-name_YYYYMMDD_HHMMSS_TLP-level.html
images/
[scan-uuid1].png
[scan-uuid2].png
...
iocs/
query-name_YYYYMMDD_HHMMSS_all_iocs.csv
query-name_YYYYMMDD_HHMMSS_domains.csv
query-name_YYYYMMDD_HHMMSS_ips.csv
query-name_YYYYMMDD_HHMMSS_urls.csv
query-name_YYYYMMDD_HHMMSS_iocs.json
...
For query groups:
output/
group-name_YYYYMMDD_HHMMSS_group/
report_group-name_YYYYMMDD_HHMMSS_TLP-level.html
images/
[scan-uuid1].png
[scan-uuid2].png
...
iocs/
group-name_combined_YYYYMMDD_HHMMSS_all_iocs.csv
group-name_combined_YYYYMMDD_HHMMSS_domains.csv
group-name_combined_YYYYMMDD_HHMMSS_iocs.json
...
The HTML reports are self-contained files with all screenshots embedded as Base64-encoded images, allowing them to be shared or archived as single files without external dependencies.
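Embedding a screenshot as Base64 comes down to building a data URI. The following is a generic sketch of that technique, not the project's report module; `image_to_data_uri` and `thumbnail_tag` are hypothetical helpers, and PNG is assumed because the output tree lists .png screenshots.

```python
import base64
import html
from pathlib import Path


def image_to_data_uri(image_path) -> str:
    """Encode a PNG file as a data URI that can be used in an <img> tag."""
    encoded = base64.b64encode(Path(image_path).read_bytes()).decode("ascii")
    return f"data:image/png;base64,{encoded}"


def thumbnail_tag(image_path, alt_text: str) -> str:
    """Build an <img> element with the image embedded inline."""
    return (
        f'<img src="{image_to_data_uri(image_path)}" '
        f'alt="{html.escape(alt_text)}">'
    )
```

Base64 inflates image size by roughly a third, which is the price of a single-file report.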
Masquerade Monitor automatically extracts Indicators of Compromise (IOCs) from search results for both urlscan and Silent Push platforms. This feature is enabled by default and extracts the following types of indicators:
- Domains
- IP addresses
- URLs
- Page titles
- Server details
- Email addresses (Silent Push)
- Registrars (Silent Push)
- Nameservers (Silent Push)
- Organizations (Silent Push)
IOCs are saved to CSV files in an "iocs" directory within each query's output folder. The following files are generated:
- all_iocs.csv: Contains all IOCs in a single file with columns for IOC type, value, and scan ID
- Individual type files (e.g., domains.csv, ips.csv) that contain just that specific IOC type
- iocs.json: A complete JSON representation of all extracted IOCs
For query groups, a consolidated set of IOCs from all member queries is also generated.
To disable IOC extraction, use the --no-iocs flag when running a query.
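The file layout described in this section can be sketched as follows. The column names type, value and scan_id are assumptions (only "IOC type, value, and scan ID" is stated), and `write_ioc_files` is a hypothetical helper that writes the combined file plus one file per IOC type.

```python
import csv
from collections import defaultdict
from pathlib import Path

FIELDS = ["type", "value", "scan_id"]  # assumed column names


def write_ioc_files(iocs, iocs_dir, prefix):
    """Write <prefix>_all_iocs.csv plus one CSV per IOC type."""
    out_dir = Path(iocs_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group rows by IOC type, e.g. "domain" -> <prefix>_domains.csv
    by_type = defaultdict(list)
    for ioc in iocs:
        by_type[ioc["type"]].append(ioc)

    targets = {out_dir / f"{prefix}_all_iocs.csv": list(iocs)}
    for ioc_type, rows in by_type.items():
        targets[out_dir / f"{prefix}_{ioc_type}s.csv"] = rows

    for path, rows in targets.items():
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=FIELDS)
            writer.writeheader()
            writer.writerows(rows)
    return sorted(path.name for path in targets)
```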
For a detailed history of changes and improvements, please see the changelog.md file.
- ✓ Add date filtering to only show results since last check
- ✓ Prioritize using last_run timestamp before falling back to default_days
- ✓ Add query metadata options (reference, notes, frequency, priority, tags)
- ✓ Add query groups for organizing related queries and generating comprehensive reports
- ✓ Move API keys to .env file for better security
- ✓ Add support for multiple search platforms including "urlscan" and "silentpush"
- ✓ Implement Silent Push API integration with both WHOIS and webscan data handling
- ✓ Implement IOC extraction and saving for both urlscan and Silent Push
- ✓ Add ability to export results to CSV/JSON
- Implement email notifications for new findings
- Support for custom report templates
- Integrate additional data sources beyond urlscan.io
- Add machine learning capabilities to detect similar domains
- Implement a web dashboard for monitoring results
- Support for scheduled automatic checks
- Create an API for programmatic access to monitoring results
- Develop plugins for security tools integration
- Implement advanced analytics to detect sophisticated phishing techniques
- Add collaborative features for team-based monitoring
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License.