An R package for automated machine learning explanation and comparative reporting. AutoXplainR provides an efficient, automated, and standardized method for generating comparative model explanations from AutoML outputs with clear reports suitable for diverse audiences.
Repository: https://github.com/Matt17BR/AutoXplainR
AutoXplainR generates comprehensive interactive dashboards that make machine learning models interpretable and actionable:
- Automated ML Pipeline: Seamless integration with H2O AutoML
- Custom Explanations: Permutation importance and partial dependence plots implemented from scratch
- Interactive Visualizations: Custom plotly-based plots for all explanation types
- Comparative Analysis: Side-by-side model comparison and explanation
- Natural Language Reports: LLM-generated summaries using Google Generative AI
- HTML Dashboards: Comprehensive interactive dashboards using flexdashboard
- Self-Reliant: No dependencies on existing XAI packages such as DALEX or iml
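To illustrate what "implemented from scratch" means here, the following is a minimal, self-contained sketch of the permutation-importance idea in base R. The function name `permutation_importance_sketch` and its arguments are illustrative placeholders, not the package's API:

```r
# Minimal sketch of permutation feature importance (illustration only;
# predict_fn, data, and target are placeholders, not AutoXplainR's API).
permutation_importance_sketch <- function(predict_fn, data, target, n_repeats = 5) {
  baseline <- mean((data[[target]] - predict_fn(data))^2)  # baseline MSE
  features <- setdiff(names(data), target)
  sapply(features, function(f) {
    losses <- replicate(n_repeats, {
      shuffled <- data
      shuffled[[f]] <- sample(shuffled[[f]])  # break the feature-target link
      mean((data[[target]] - predict_fn(shuffled))^2)
    })
    mean(losses) - baseline  # importance = average increase in loss
  })
}

# Example with a plain linear model on mtcars:
fit <- lm(mpg ~ wt + hp, data = mtcars)
imp <- permutation_importance_sketch(
  function(d) predict(fit, d),
  mtcars[, c("mpg", "wt", "hp")],
  "mpg"
)
```

Permuting a feature destroys its relationship with the target; the larger the resulting increase in loss, the more the model depends on that feature.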
Ensure you have the required dependencies installed:
```r
# Install devtools if not already installed
if (!require(devtools)) install.packages("devtools")

# Install H2O if not already installed
if (!require(h2o)) {
  install.packages("h2o")
}

# Install from GitHub
devtools::install_github("Matt17BR/AutoXplainR")

# Load the package
library(AutoXplainR)
```
Set your API key for natural language report generation:
```r
# Set environment variable
Sys.setenv(GEMINI_API_KEY = "your_api_key_here")

# Or pass the key directly to the function
generate_natural_language_report(result, api_key = "your_api_key_here")
```
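Before generating a report, it can help to verify the key is actually set. A minimal base-R check (`Sys.getenv` returns `""` when a variable is unset):

```r
# Warn early if the API key is missing rather than failing mid-report
api_key <- Sys.getenv("GEMINI_API_KEY")
if (!nzchar(api_key)) {
  message("GEMINI_API_KEY is not set; LLM report generation will fail")
}
```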
```r
# Basic usage
data(mtcars)
result <- autoxplain(mtcars, "mpg", max_models = 10, max_runtime_secs = 120)

# Generate comprehensive dashboard
generate_dashboard(result, "my_dashboard.html")
```
- `autoxplain()`: Main function for the automated ML and explanation pipeline
- `calculate_permutation_importance()`: Feature importance via permutation
- `calculate_partial_dependence()`: Partial dependence for a single feature
- `calculate_partial_dependence_multi()`: Partial dependence for multiple features
- `plot_permutation_importance()`: Interactive importance bar charts
- `plot_partial_dependence()`: Interactive PDP line plots
- `plot_partial_dependence_multi()`: Multi-feature PDP subplots
- `plot_model_comparison()`: Model performance scatter plots
- `generate_dashboard()`: Comprehensive HTML dashboard with flexdashboard
- `create_simple_dashboard()`: Lightweight HTML dashboard
- `generate_natural_language_report()`: LLM-powered report generation
```r
library(AutoXplainR)

# 1. Run AutoML
data(mtcars)
result <- autoxplain(mtcars, "mpg", max_models = 3, max_runtime_secs = 180)

# 2. Generate explanations for the best model
model <- result$models[[1]]
importance <- calculate_permutation_importance(model, mtcars, "mpg")
top_features <- head(importance$feature, 3)
pdp_data <- calculate_partial_dependence_multi(model, mtcars, top_features)

# 3. Or generate explanations for a specific model and chosen variables
importance <- calculate_permutation_importance(result$models[[1]], mtcars, "mpg")
pdp_data <- calculate_partial_dependence_multi(result$models[[1]], mtcars, c("wt", "hp"))

# 4. Generate comprehensive dashboard
generate_dashboard(
  result,
  output_file = "mtcars_analysis.html",
  top_features = 5,
  sample_instances = 3,
  include_llm_report = TRUE
)

# 5. Create individual plots
plot_model_comparison(result)
plot_permutation_importance(importance)
plot_partial_dependence_multi(pdp_data)
```
AutoXplainR follows a modular architecture:
- AutoML Engine: H2O AutoML integration for model training
- Explanation Engine: Custom implementations of:
  - Permutation feature importance
  - Partial dependence plots
- Visualization Engine: Interactive plotly-based charts
- Reporting Engine: Dashboard generation and LLM integration
- Testing Suite: Comprehensive test coverage
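The partial dependence component of the explanation engine can be sketched in a few lines of base R. This is an illustration of the technique, not the package's implementation; `partial_dependence_sketch` and its arguments are hypothetical names:

```r
# Minimal sketch of a one-dimensional partial dependence curve
# (illustration only; not AutoXplainR's implementation).
partial_dependence_sketch <- function(predict_fn, data, feature, grid_size = 20) {
  grid <- seq(min(data[[feature]]), max(data[[feature]]), length.out = grid_size)
  yhat <- sapply(grid, function(v) {
    tmp <- data
    tmp[[feature]] <- v     # fix the feature at the grid value for every row
    mean(predict_fn(tmp))   # average the model's predictions over the data
  })
  data.frame(feature_value = grid, partial_dependence = yhat)
}

# Example: partial dependence of predicted mpg on wt for a linear model
fit <- lm(mpg ~ wt + hp, data = mtcars)
pdp <- partial_dependence_sketch(function(d) predict(fit, d), mtcars, "wt")
```

Averaging predictions over the data while holding one feature fixed marginalizes out the other features, which is what a PDP line plot then visualizes.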
- h2o (>= 3.40.0)
- plotly (>= 4.10.0)
- data.table (>= 1.14.0)
- jsonlite (>= 1.8.0)
- httr (>= 1.4.0)
- stringr (>= 1.4.0)
- flexdashboard (>= 0.6.0)
- rmarkdown (>= 2.0.0)
- DT (>= 0.20.0)
- testthat (>= 3.0.0)
MIT License - see LICENSE file for details.