Commit 4c2b3d5

v1.1
- Improved report format
1 parent 88ec7c5 commit 4c2b3d5

7 files changed: +595 -93 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -96,7 +96,7 @@ python main.py -F --output-dir ./reports

 Here is an example of a generated report:

-[Example Report](./example/all_report.md)
+[Example Report](./example/)

 ## Contributing

example/all_report.md

Lines changed: 243 additions & 31 deletions
Large diffs are not rendered by default.

example/all_report_trends.md

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+# Security Trends Report
+
+Okay, here's a cohesive summary of the key trends and security notes extracted from the provided information:
+
+**Overall Trend: Widespread Vulnerabilities and Systemic Issues**
+
+The data reveals a landscape of widespread vulnerabilities affecting a variety of technologies and vendors, including major players like IBM, Siemens, and Xerox. This highlights a critical need for consistent security practices across the board. The vulnerabilities are not isolated incidents but point to underlying, potentially systemic, issues in software development practices.
+
+**Key Vulnerability Types and Patterns:**
+
+* **WordPress Plugin Vulnerabilities:** A significant number of vulnerabilities are concentrated within WordPress plugins. These commonly stem from inadequate input sanitization and output escaping, leading to:
+    * **Stored Cross-Site Scripting (XSS):** Malicious scripts injected into a website and executed by users.
+    * **Local File Inclusion (LFI):** Attackers gaining access to sensitive files on the server.
+    * **Cross-Site Request Forgery (CSRF):** Forcing authenticated users to perform actions they didn't intend.
+    * **SQL Injection:** Injecting malicious SQL code to manipulate or access data.
+
+* **Input Validation and Sanitization Failures:** Across the board, insufficient input validation and sanitization are primary causes of vulnerabilities. This leads to issues like:
+    * **Path Traversal:** Attackers accessing files and directories outside of intended paths.
+    * **Insecure Handling of Inputs:** Improper processing of user-supplied data, opening doors for various exploits.
+
+* **Privilege Escalation:** A recurring issue, where attackers exploit vulnerabilities to gain higher access levels than intended.
+
+* **Cross-Site Scripting (XSS):** A pervasive issue, often due to inadequate output escaping.
+
+* **Remote Code Execution (RCE):** A significant security risk, where attackers can execute arbitrary code on a target system.
+
+**Systemic Issues and Development Practices:**
+
+* **Recurring Vulnerability Patterns:** The presence of similar vulnerability types across different products and vendors strongly suggests systemic weaknesses in development methodologies, such as a lack of consistent secure coding practices.
+* **Lack of Regular Updates and Thorough Code Reviews:** The vulnerabilities found in established software highlight the critical need for regular security updates and thorough code reviews as part of the development process.
+* **Insufficient Validation or Permission Controls:** A common theme is the failure to adequately validate user inputs and enforce proper access control mechanisms.
+
+**Security Notes:**
+
+* **Importance of Regular Patching:** The prevalence of vulnerabilities underscores the necessity for organizations to promptly install security updates and patches.
+* **Need for Secure Development Practices:** Vendors and developers should adopt secure coding practices, emphasizing input validation, sanitization, and proper access control mechanisms.
+* **Code Reviews are Essential:** Thorough code reviews can help identify and mitigate potential vulnerabilities early in the development lifecycle.
+* **Security Training for Developers:** Investing in security training for developers is crucial to ensure they are aware of common vulnerabilities and know how to prevent them.
+* **Regular Vulnerability Scanning:** Continuously scanning systems for vulnerabilities is necessary for identifying and fixing weaknesses before they can be exploited.
+
+**In conclusion, the data paints a picture of a vulnerable software ecosystem. Addressing these widespread and systemic issues requires a multi-faceted approach that involves improving security practices, prioritizing updates, and actively monitoring for potential threats.**

example/all_vulnerabilities.json

Lines changed: 255 additions & 0 deletions
Large diffs are not rendered by default.

src/ai_prompts.py

Lines changed: 10 additions & 32 deletions
@@ -10,50 +10,28 @@
 """

 summarization_prompt = """Analyze the vulnerabilities in this batch and group them by their affected technology/product.
-For each vulnerability, provide a concise technical description.
+For each vulnerability:
+1) Provide a concise technical description (max 200 chars).
+2) Indicate the index of the item no JSON references needed beyond that.
+
+Also, produce a short summary of observed trends or relevant security notes in free text. (call it "trendSummary")

 Input data:
 THIS_JSON

-Required format:
-{
-    "technologies": [
-        {
-            "name": str, # Name of the technology/product
-            "items": [
-                {
-                    "index": int, # Index in the input batch
-                    "description": str # Technical description (max 200 chars)
-                }
-            ]
-        }
-    ],
-    "trends": [ # List of observed security trends
-        {
-            "trend": str, # Description of the trend
-            "impact": str # Potential security impact
-        }
-    ]
-}
-
-Example response:
+Expected response format (JSON):
 {
     "technologies": [
         {
-            "name": "Apache Server",
+            "name": str,
             "items": [
                 {
-                    "index": 0,
-                    "description": "Memory corruption vulnerability in mod_proxy allows remote attackers to execute arbitrary code via crafted HTTP requests"
+                    "index": int,
+                    "description": str
                 }
             ]
         }
     ],
-    "trends": [
-        {
-            "trend": "Increase in HTTP request smuggling vulnerabilities",
-            "impact": "Allows bypass of security controls and potential RCE"
-        }
-    ]
+    "trendSummary": str
 }
 """
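
Note on the new response format: the prompt now asks for a `technologies` list plus a free-text `trendSummary` instead of the old `trends` list. A minimal sketch of a shape check for a parsed response could look like the following. The name `is_valid_classification` and the sample payload are hypothetical and not part of this commit; the sketch is also stricter than the committed code, which tolerates a missing `trendSummary` by defaulting to an empty string.

```python
import json


def is_valid_classification(payload) -> bool:
    """Hypothetical check: does a parsed response match the new prompt format?"""
    if not isinstance(payload, dict):
        return False
    technologies = payload.get("technologies")
    if not isinstance(technologies, list):
        return False
    for tech in technologies:
        if not isinstance(tech, dict) or not isinstance(tech.get("name"), str):
            return False
        items = tech.get("items")
        if not isinstance(items, list):
            return False
        for item in items:
            if not isinstance(item, dict):
                return False
            index = item.get("index")
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(index, bool) or not isinstance(index, int):
                return False
            if not isinstance(item.get("description"), str):
                return False
    # Assumption: a missing or non-string trendSummary counts as invalid here.
    return isinstance(payload.get("trendSummary"), str)


# Illustrative payload only; not taken from the repository's example data.
sample = json.loads(
    '{"technologies": [{"name": "WordPress", "items": '
    '[{"index": 0, "description": "Stored XSS in a plugin"}]}], '
    '"trendSummary": "Mostly input validation issues."}'
)
print(is_valid_classification(sample))
```

A response still using the old `trends` key would fail this check, which makes a silent format mismatch easier to spot during the retry loop.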

src/scrapers/sources.py

Lines changed: 0 additions & 1 deletion
@@ -109,7 +109,6 @@ def get_full_disclosure_latest(start_date, end_date, use_ai=True, max_items=None
         else:
             current_date = datetime(current_date.year, current_date.month + 1, 1)

-    # print_final_progress(Fore.GREEN + f"Collected {len(vulns)}/{max_items if max_items else 'unlimited'} items from [SecLists] full disclosure" + Style.RESET_ALL)
     return vulns

 # Exploit-DB source
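
Note on the context lines: the hunk shows the scraper advancing `current_date` to the first day of the next month inside an `else` branch, while the matching `if` sits outside the hunk. Assuming that `if` handles the December rollover, the whole step can be sketched as below. `first_of_next_month` is a hypothetical helper name, not a function in this repository.

```python
from datetime import datetime


def first_of_next_month(current: datetime) -> datetime:
    """Hypothetical helper: first day of the month following `current`."""
    if current.month == 12:
        # month + 1 would be 13 and raise ValueError, so roll the year over.
        return datetime(current.year + 1, 1, 1)
    return datetime(current.year, current.month + 1, 1)


print(first_of_next_month(datetime(2024, 12, 15)))
```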

src/summarizer.py

Lines changed: 45 additions & 28 deletions
@@ -51,40 +51,40 @@ def format_vulnerability_entry(vuln: dict, tech_item: dict) -> str:
     else:
         return f"- [{vuln['title']}]({vuln['link']}) ({vuln['date']}) [{vuln['source']}]\n - {desc}"

-def generate_markdown_report(vulns: List[dict], all_classifications: List[dict], report_type: str) -> str:
+def format_date_str(date_str: str) -> str:
+    try:
+        parsed_date = parse(date_str)
+        return parsed_date.strftime('%d %b %Y')
+    except:
+        return date_str
+
+def generate_markdown_report(vulns: List[dict], all_classifications: List[dict]) -> str:
     report = f"""# Vulnerability Analysis Report
 Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}
 Total Vulnerabilities Analyzed: {len(vulns)}

 ## Vulnerabilities by Technology

 """
-    # Add vulnerabilities grouped by technology
-    all_trends = []
-    for classification in all_classifications:
-        # Add vulnerabilities
+    for idx, classification in enumerate(all_classifications):
         tech_sections = classification.get('technologies', [])
-        for tech in tech_sections:
+        for t_idx, tech in enumerate(tech_sections):
             tech_name = tech['name']
             report += f"### {tech_name}\n\n"
             for item in tech['items']:
                 vuln = vulns[item['index']]
-                entry = format_vulnerability_entry(vuln, item)
-                report += f"{entry}\n\n"
-
-        # Collect trends
-        all_trends.extend(classification.get('trends', []))
-
-    # Add trends section
-    if all_trends:
-        report += "\n## Security Trends Analysis\n\n"
-        for trend in all_trends:
-            report += f"### {trend['trend']}\n"
-            report += f"**Impact**: {trend['impact']}\n\n"
-
+                date_formatted = format_date_str(vuln['date'])
+                source = vuln['source']
+                report += f"{date_formatted} [{source}]\n"
+                report += f"- [{vuln['title']}]({vuln['link']})\n"
+                report += f" - {item['description']}\n\n"
+            report += "---\n"
     return report

-def summarize_vulnerabilities(input_file: str = "./output/all_vulnerabilities.json", output_file: str = "./output/vulnerability_report.md"):
+def summarize_vulnerabilities(
+    input_file: str = "./output/all_vulnerabilities.json",
+    output_file: str = "./output/vulnerability_report.md"
+):
     print(Fore.BLUE + f"[theWatcher] Loading vulnerabilities from {input_file}" + Style.RESET_ALL)
     if not api_key:
         print(Fore.YELLOW + "[theWatcher] No API key found. Skipping summarization." + Style.RESET_ALL)
@@ -96,10 +96,10 @@ def summarize_vulnerabilities(input_file: str = "./output/all_vulnerabilities.js
     batches = batch_vulnerabilities(all_vulns)
     total_batches = len(batches)
     all_classifications = []
+    trends_summaries = []
     requests_count = 0
     current_batch = 0

-    # Process all batches first
     for i, batch in enumerate(batches):
         print(Fore.BLUE + f"[theWatcher] Summarizing items in batch {i+1}/{total_batches}" + Style.RESET_ALL)
         if current_batch != i + 1:
@@ -133,9 +133,9 @@ def summarize_vulnerabilities(input_file: str = "./output/all_vulnerabilities.js
                     classification = json.loads(response.text)
                     if isinstance(classification, dict) and 'technologies' in classification:
                         all_classifications.append({
-                            'technologies': classification['technologies'],
-                            'trends': classification.get('trends', [])
+                            'technologies': classification['technologies']
                         })
+                        trends_summaries.append(classification.get('trendSummary', ''))
                         break
                     print(Fore.YELLOW + f"[theWatcher] Retrying batch {i+1}/{total_batches}..." + Style.RESET_ALL)
                 except Exception as e:
@@ -144,16 +144,33 @@ def summarize_vulnerabilities(input_file: str = "./output/all_vulnerabilities.js

         requests_count += 1

-    # Generate final report only once
-    report = generate_markdown_report(all_vulns, all_classifications,
-                                    'nist' if 'nist' in input_file else 'sources')
-
-    # Write complete report
+    report = generate_markdown_report(all_vulns, all_classifications)
     os.makedirs(os.path.dirname(output_file), exist_ok=True)
     with open(output_file, 'w', encoding='utf-8') as f:
         f.write(report)

+    trends_file = output_file.replace(".md", "_trends.md")
+    print(Fore.BLUE + "[theWatcher] Generating trends report..." + Style.RESET_ALL)
+    final_trends_prompt = (
+        "Sumarize the main trends and security notes from these partial summaries:\n\n"
+        + "\n\n".join(trends_summaries) +
+        "\n\nCreate a cohesive final explanation of key insights."
+    )
+    try:
+        response2 = model.generate_content(
+            final_trends_prompt,
+            generation_config=genai.GenerationConfig(response_mime_type="text/plain")
+        )
+        final_trends = response2.text if response2 else "No trend info."
+    except:
+        final_trends = "No trend info."
+
+    with open(trends_file, 'w', encoding='utf-8') as f:
+        f.write("# Security Trends Report\n\n")
+        f.write(final_trends)
+
     print(Fore.GREEN + f"[theWatcher] Report saved in {output_file}" + Style.RESET_ALL)
+    print(Fore.GREEN + f"[theWatcher] Trends saved in {trends_file}" + Style.RESET_ALL)

 def validate_summary_format(summary: dict) -> bool:
     """Validate if summary follows the required format"""
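
Note on the trends step: `output_file.replace(".md", "_trends.md")` rewrites every `.md` occurrence in the path, and leaves the path unchanged when the file has a different extension, in which case the trends file would overwrite the main report. The partial summaries are also joined as-is, including the empty strings appended for batches without a `trendSummary`. A minimal sketch of one way to tighten both points follows; `trends_path_for` and `build_trends_prompt` are hypothetical helper names, not part of this commit.

```python
from pathlib import Path
from typing import List


def trends_path_for(output_file: str) -> str:
    """Hypothetical helper: sibling path with '_trends' inserted before the suffix."""
    path = Path(output_file)
    # with_name swaps only the final path component, so directories are untouched.
    return str(path.with_name(f"{path.stem}_trends{path.suffix}"))


def build_trends_prompt(summaries: List[str]) -> str:
    """Hypothetical helper: join only the non-empty partial summaries."""
    kept = [s.strip() for s in summaries if s and s.strip()]
    return (
        "Summarize the main trends and security notes from these partial summaries:\n\n"
        + "\n\n".join(kept)
        + "\n\nCreate a cohesive final explanation of key insights."
    )


print(trends_path_for("./output/vulnerability_report.md"))
print(build_trends_prompt(["XSS is common.", "", "Patching lags."]))
```

Note that `pathlib` normalizes away a leading `./`, so the returned string may differ cosmetically from the input path while pointing at the same location.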
