Company: AxionRay
Role Context: Evaluating proficiency in data validation, cleaning, integration, and exploratory data analysis using Python.
Domains: Automotive quality, failure diagnostics, and component analytics.
AxionRay, a leading AI-driven engineering safety company, focuses on leveraging Large Language Models and Generative AI to improve data quality and operational insights for next-gen products like electric vehicles and airplanes.
This assignment consists of three main tasks, which were successfully completed and documented:
- Task 1: Data Validation & Cleaning
- Task 2: Data Integration & Preparation
- Task 3: Exploratory Data Analysis (Trend & Root Cause Analysis)
- Column-wise dataset profiling
- Handling missing & malformed data
- Tag extraction from free-text fields
- Exploratory visualizations
- Standardized column names
- Replaced nulls and dropped critical-missing rows
- Extracted tags from correction_verbatim via keyword matching
- Identified system-level causes for missing causal_part_nm
- Highlighted the 10 most frequently failing components
- Bar plots of component failures
- Tag generation for free-text diagnostics
- Missing data insights (plant, dealer-level trends)
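The column standardization and keyword-based tag extraction listed above could look roughly like the sketch below. The keyword-to-tag map, the helper names `standardize_columns` and `extract_tags`, and the sample rows are illustrative assumptions; only the `correction_verbatim` and `causal_part_nm` column names come from this write-up.

```python
import re
import pandas as pd

# Hypothetical keyword-to-tag map; the keyword list actually used is not given here.
KEYWORD_TAGS = {
    "replac": "replacement",
    "reprogram": "software_update",
    "leak": "leak",
}

def standardize_columns(df):
    """Return a copy with lowercase, trimmed, underscore-separated column names."""
    out = df.copy()
    out.columns = [re.sub(r"\s+", "_", str(c).strip().lower()) for c in out.columns]
    return out

def extract_tags(text):
    """Return the sorted tags whose keyword occurs in the free text."""
    if not isinstance(text, str):
        return []  # missing or non-text values get no tags
    lowered = text.lower()
    return sorted({tag for kw, tag in KEYWORD_TAGS.items() if kw in lowered})

# Made-up sample rows, for illustration only.
df = pd.DataFrame({
    " Correction Verbatim ": [
        "Replaced steering wheel module",
        "Reprogrammed ECU; found leak",
        None,
    ],
    "Causal Part NM": ["WHEEL ASM-STRG", None, "MODULE ASM"],
})

df = standardize_columns(df)
df["tags"] = df["correction_verbatim"].apply(extract_tags)
```

Substring matching is simple but can over-match (for example, "leak" also matches "leakage"), so a real keyword list would likely need review against the data.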
- Identify and justify a common primary key for dataset merging
- Perform thorough data cleaning: missing values, datatype corrections, standardization
- Merge datasets with appropriate join strategy
- Used the "Primary Key" column for a left join
- Cleaned whitespace and standardized column names
- Filled missing values:
  - Coverage → "Unknown"
  - Cause, Correction → "Not Mentioned"
 
- Converted numeric columns to float where needed
- Removed duplicates and checked null distributions
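A minimal sketch of the cleaning and left-join steps above, assuming two small made-up tables. The frame names `orders` and `repairs`, the "Actual Hours" column, and all sample values are hypothetical; the "Primary Key", Coverage, Cause, and Correction names and the fill values follow the list above.

```python
import pandas as pd

# Made-up sample frames, for illustration only.
orders = pd.DataFrame({
    "Primary Key ": ["A1", "A2", "A2", "A3"],  # note the trailing space
    "Coverage": ["Warranty", None, None, "Goodwill"],
})
repairs = pd.DataFrame({
    "Primary Key": ["A1", "A3"],
    "Cause": ["Worn seal", None],
    "Correction": ["Replaced seal", None],
    "Actual Hours": ["1.5", "2"],  # numeric values stored as text
})

# Trim whitespace from column names so the key matches on both sides.
orders.columns = orders.columns.str.strip()
repairs.columns = repairs.columns.str.strip()

# Drop exact duplicate rows before joining.
orders = orders.drop_duplicates()

# Left join keeps every order; validate= raises if the right-hand key is not unique.
merged = orders.merge(repairs, on="Primary Key", how="left", validate="many_to_one")

# Fill missing values with the placeholders described above.
merged["Coverage"] = merged["Coverage"].fillna("Unknown")
merged[["Cause", "Correction"]] = merged[["Cause", "Correction"]].fillna("Not Mentioned")

# Convert text to float; unparseable entries become NaN.
merged["Actual Hours"] = pd.to_numeric(merged["Actual Hours"], errors="coerce").astype(float)

# Null distribution per column after cleaning.
null_counts = merged.isna().sum()
```

A left join suits the case where the left table defines the population of interest and unmatched rows should be kept rather than silently dropped.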
- Transformed Order Date to datetime
- Grouped by month to analyze volume trends
- Visualized trends using:
  - Line Plot (monthly order count)
  - Heatmap (variable correlation)
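The date conversion, monthly grouping, and correlation steps above might be prepared as follows. This is a sketch on made-up rows; the "Total" and "Actual Hours" columns are assumed names, and only "Order Date" comes from the list above.

```python
import pandas as pd

# Made-up sample rows, for illustration only.
df = pd.DataFrame({
    "Order Date": ["2023-01-05", "2023-01-20", "2023-02-11", "not a date"],
    "Total": [120.0, 80.0, 200.0, 50.0],
    "Actual Hours": [1.0, 0.5, 2.0, 0.4],
})

# Parse dates; entries that cannot be parsed become NaT instead of raising.
df["Order Date"] = pd.to_datetime(df["Order Date"], errors="coerce")
valid = df.dropna(subset=["Order Date"])

# Count orders per calendar month.
monthly_orders = valid.groupby(valid["Order Date"].dt.to_period("M")).size()

# Pairwise correlation of the numeric columns, the input for a heatmap.
corr = valid[["Total", "Actual Hours"]].corr()
```

With matplotlib installed, `monthly_orders.plot(kind="line")` would draw the monthly trend, and `corr` can be handed to a heatmap function such as seaborn's `heatmap`.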
 
- Bar chart showing top 8 components contributing to revenue loss
- Boxplot comparing actual hours spent across failure components
- Identified components with highest cost/time burden
- Discovered correlations that inform resource planning
- Visualization-driven exploration enables root cause prioritization
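The component ranking behind the bar chart and boxplot above can be sketched as a group-and-aggregate step. The column names "Component", "Revenue Loss", and "Actual Hours" and all values are assumptions for illustration; `top_n` would be 8 to match the chart described above.

```python
import pandas as pd

# Made-up sample rows, for illustration only.
df = pd.DataFrame({
    "Component": ["Pump", "Pump", "Sensor", "Valve", "Sensor"],
    "Revenue Loss": [100.0, 50.0, 30.0, 80.0, 10.0],
    "Actual Hours": [2.0, 1.0, 0.5, 3.0, 0.5],
})

top_n = 2  # set to 8 for the chart described above

# Total revenue loss per component, largest first.
top_loss = df.groupby("Component")["Revenue Loss"].sum().nlargest(top_n)

# Per-component distribution of hours (count, mean, quartiles), the numbers a boxplot shows.
hours_summary = df.groupby("Component")["Actual Hours"].describe()
```

Ranking by summed loss favors components that fail often; ranking by mean loss per repair would instead surface components that are expensive per incident.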
The tasks demonstrate:
- Clear identification and resolution of data quality issues
- Integration of disparate datasets using relational keys
- Visualization-driven insights into component behavior and trends
- Strategic recommendations for data validation and cost optimization
These outputs can directly support AxionRay's mission to enhance quality engineering through AI and data-driven diagnostics.