This project explores global terrorism trends using big data analytics and visualization tools. Leveraging the Global Terrorism Database (GTD) — a comprehensive dataset of terrorist incidents from 1970 to 2017 — this analysis aims to uncover patterns, hotspots, and predictive insights to aid global counterterrorism strategies.
- Analyze over 180,000 terrorism records using PySpark for scalable data processing.
- Use Linear Regression to predict casualties based on attack and weapon types.
- Apply K-Means Clustering to detect geographic terrorism hotspots.
- Visualize insights using Tableau and Python libraries like Matplotlib, Seaborn, and Folium.
📁 Global_Terrorism_Analysis/
├── 📄 Big_Data_Analysis_and_Data_Visualization.ipynb # Jupyter Notebook with code and visualizations
├── 📄 Global_Terrorism_Analysis.docx # Full research paper/report
├── 📄 terrorism_map.html # Interactive map of attacks
├── 📄 choropleth_attacks_by_country.html # Choropleth map of attacks by country
├── 📄 choropleth_fatalities_by_country.html # Choropleth map of fatalities
├── 📄 heatmap_attacks.html # Heatmap of attacks
├── 📄 map_by_target_type.html # Map by target type
├── 📄 terrorism_dashboard.twbx # Tableau dashboard (if applicable)
Tool | Purpose |
---|---|
PySpark | Distributed data processing and ML modeling |
Python | Data wrangling and visualizations |
Tableau | Interactive and dynamic data dashboards |
Folium | Geospatial mapping of attacks |
Seaborn & Matplotlib | Statistical and trend plotting |
- Goal: Predict number of casualties using
attacktype1
andweaptype1
. - Features engineered: StringIndexing, VectorAssembler
- Metrics: R² Score and RMSE
- Goal: Group attacks based on latitude & longitude to identify terrorism hotspots.
- Clusters: Set to 5 for regional analysis.
- Normalization: StandardScaler used for geographic feature scaling.
- Temporal trends of terrorist activity over time.
- Attack type and region-based casualty summaries.
- Word clouds, choropleths, and interactive maps to identify and analyze high-risk areas.
A dynamic Tableau dashboard provides:
- Yearly trends in terrorism
- Interactive geographical maps of attacks
- Top terrorist groups
- Regional casualty breakdowns
- Attack methods and frequencies
📌 Figures available in the report and attached
.twbx
file (if included).
- Source: Global Terrorism Database on Kaggle
- Records: 180,000+ terrorist events from 1970–2017
- Features: 135 attributes including location, group, attack type, weapon, target, casualties, etc.
-
Install dependencies:
pip install pyspark pandas seaborn matplotlib folium wordcloud
-
Run the Jupyter Notebook:
jupyter notebook Big_Data_Analysis_and_Data_Visualization.ipynb
-
View
.html
maps in your browser. -
Open Tableau
.twbx
dashboard for dynamic exploration.
- Bombings and armed assaults are the most common attack types.
- Middle East, South Asia, and parts of Africa emerge as key hotspots.
- While regression models highlight useful predictors, additional variables like target type and socio-political factors could enhance accuracy.
- Incorporate temporal clustering (by time of day, month).
- Add socio-economic, political, or organizational affiliation features.
- Explore deeper networks within terrorist organizations.
By analyzing terrorism trends using data, this project aims to:
- Assist governments and security agencies in proactive counterterrorism.
- Provide humanitarian insights for resource allocation.
- Demonstrate the power of data science in addressing global security challenges.
This project is for educational and research purposes. Please cite appropriately if referenced.