Model Explainability with LIME and SHAP
Why Explainability Matters
Imagine a bank rejects your loan application. You ask why, and they say: "The algorithm said no." Would you accept that? Of course not — you'd want to know which factors led to the decision. This is the core of model explainability.
The Doctor's Diagnosis Analogy
A good doctor doesn't just say "you have condition X." They explain:
- What symptoms led to the diagnosis (feature importance)
- How confident they are (prediction probability)
- What alternatives were considered (other classes)
- What would change the diagnosis (counterfactual reasoning)
An explainable AI model works the same way — it tells you not just what it predicted, but why.
Four Pillars of Explainability
| Pillar | Description | Stakeholder |
|---|---|---|
| Trust | Users and stakeholders need to trust the model's decisions | End users, management |
| Regulation | Laws require explanations for automated decisions (EU AI Act, GDPR Article 22) | Legal, compliance |
| Debugging | Understanding why a model fails helps fix it | Data scientists, ML engineers |
| Fairness | Detecting if a model discriminates based on protected attributes | Ethics teams, auditors |
The EU AI Act (adopted in 2024, with obligations phasing in from 2025) classifies AI systems by risk level. High-risk systems (healthcare, finance, hiring, law enforcement) MUST provide explanations for their decisions. Non-compliance can result in fines up to €35 million or 7% of global annual turnover.
Black-Box vs. White-Box Models
White-Box Models (Inherently Interpretable)
These models are simple enough to understand directly:
| Model | How to Interpret |
|---|---|
| Linear Regression | Coefficients show direct feature impact: price = 200 × sqft + 50000 × bedrooms - 10000 × age |
| Decision Tree | Follow the tree branches: if age > 30 AND income > 50k → approve |
| Logistic Regression | Odds ratios show how each feature changes the probability |
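To make the table concrete, here is a small sketch that generates synthetic data from the (invented) price formula above and reads the fitted coefficients directly, exactly the way a white-box model is interpreted:

```python
# Reading a white-box model directly: synthetic data built from the
# invented price formula above, then recovered from the coefficients.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
X = rng.uniform([500, 1, 0], [3000, 5, 50], size=(200, 3))  # sqft, bedrooms, age
y = 200 * X[:, 0] + 50_000 * X[:, 1] - 10_000 * X[:, 2] + rng.normal(0, 5_000, 200)

reg = LinearRegression().fit(X, y)
for name, coef in zip(["sqft", "bedrooms", "age"], reg.coef_):
    print(f"{name}: {coef:+,.0f} per unit")  # close to +200, +50,000, -10,000
```

Each coefficient answers "what happens to the prediction if this feature increases by one unit?", which is the interpretability black-box models lack.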
Black-Box Models (Opaque)
These models are too complex to inspect directly:
| Model | Why It's a Black Box |
|---|---|
| Random Forest | Hundreds of trees, each with different splits |
| Gradient Boosting (XGBoost, LightGBM) | Thousands of sequential trees, complex interactions |
| Neural Networks | Millions of parameters, non-linear transformations |
| Ensemble Models | Combination of multiple different models |
The Explainability-Accuracy Trade-off
Post-hoc explainability methods like LIME and SHAP let you explain black-box models without sacrificing accuracy. You get the best of both worlds: powerful models + interpretable explanations.
Local vs. Global Explanations
| Type | Question Answered | Scope | Example |
|---|---|---|---|
| Local | "Why did the model predict this specific result for this specific input?" | Single prediction | "This loan was rejected because the applicant's debt-to-income ratio is 0.85" |
| Global | "What features matter in general across all predictions?" | Entire dataset | "Income and credit score are the two most important features overall" |
LIME — Local Interpretable Model-agnostic Explanations
How LIME Works
LIME explains individual predictions by building a simple, interpretable model (like linear regression) around the prediction point.
The intuition: Even if the overall model is complex, the decision boundary near a specific point might be approximately linear.
LIME Algorithm Step by Step
- Perturb: generate many samples in the neighborhood of the instance to explain
- Predict: run the black-box model on each perturbed sample
- Weight: give perturbed samples closer to the original instance more weight
- Fit: train a simple, interpretable model (e.g. weighted linear regression) on this local dataset
- Explain: read the simple model's coefficients as feature contributions
Analogy: Explaining a Decision to a Child
Imagine explaining to a 5-year-old why you chose a restaurant:
- Think about what you considered (features): price, distance, rating, cuisine
- Slightly change one factor at a time (perturbations): "What if it were cheaper? What if it were farther?"
- See how your decision changes (predictions on perturbations)
- Summarize in simple terms (linear model): "I mainly chose it because it's close and cheap"
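These steps translate almost line for line into code. The sketch below is an illustration of the idea using numpy and scikit-learn, not the actual internals of the lime library:

```python
# Illustrative LIME-style explanation from scratch (not the real lime internals).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

X, y = make_classification(n_samples=500, n_features=5, random_state=42)
black_box = RandomForestClassifier(random_state=42).fit(X, y)
instance = X[0]

rng = np.random.default_rng(42)
# 1. Perturb: sample points around the instance
perturbed = instance + rng.normal(0.0, X.std(axis=0), size=(1000, 5))
# 2. Predict: query the black box on the perturbations
probs = black_box.predict_proba(perturbed)[:, 1]
# 3. Weight: nearby samples matter more (RBF kernel on distance)
dist = np.linalg.norm(perturbed - instance, axis=1)
weights = np.exp(-dist**2 / (2 * dist.std() ** 2))
# 4. Fit: a simple weighted linear model on the local dataset
local_model = Ridge(alpha=1.0).fit(perturbed, probs, sample_weight=weights)
# 5. Explain: coefficients approximate each feature's local effect
for i, coef in enumerate(local_model.coef_):
    print(f"feature_{i + 1}: {coef:+.4f}")
```

The real library adds discretization of features, a sparsity constraint on the linear model, and careful distance kernels, but the core loop is exactly this.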
LIME Implementation
```python
import lime
import lime.lime_tabular
import numpy as np
import joblib

model = joblib.load("models/model_v1.joblib")
X_train = np.load("data/X_train.npy")
feature_names = ["feature_1", "feature_2", "feature_3", "feature_4", "feature_5"]
class_names = ["Class 0", "Class 1"]

explainer = lime.lime_tabular.LimeTabularExplainer(
    training_data=X_train,
    feature_names=feature_names,
    class_names=class_names,
    mode="classification",
    random_state=42,
)

instance = X_train[0]
explanation = explainer.explain_instance(
    data_row=instance,
    predict_fn=model.predict_proba,
    num_features=5,
    num_samples=1000,
)

print("Prediction:", model.predict([instance])[0])
print("Prediction probabilities:", model.predict_proba([instance])[0])
print("\nFeature contributions:")
for feature, weight in explanation.as_list():
    direction = "↑ pushes toward Class 1" if weight > 0 else "↓ pushes toward Class 0"
    print(f"  {feature}: {weight:+.4f} ({direction})")
```
Output example:

```text
Prediction: 1
Prediction probabilities: [0.15 0.85]

Feature contributions:
  feature_2 > 3.2: +0.2845 (↑ pushes toward Class 1)
  feature_1 <= 5.5: +0.1923 (↑ pushes toward Class 1)
  feature_4 <= 0.5: +0.1456 (↑ pushes toward Class 1)
  feature_5 > 2.0: +0.1102 (↑ pushes toward Class 1)
  feature_3 <= 1.8: +0.0834 (↑ pushes toward Class 1)
```
Visualizing LIME Explanations
```python
fig = explanation.as_pyplot_figure()
fig.tight_layout()
fig.savefig("lime_explanation.png", dpi=150, bbox_inches="tight")

explanation.save_to_file("lime_explanation.html")  # interactive HTML report
```
LIME Strengths and Limitations
| Strengths | Limitations |
|---|---|
| ✅ Model-agnostic — works with any model | ⚠️ Explanations can vary between runs (sampling) |
| ✅ Intuitive output — easy to communicate | ⚠️ Assumes local linearity — may miss interactions |
| ✅ Works for tabular, text, and image data | ⚠️ Perturbation generation is somewhat arbitrary |
| ✅ Fast for individual predictions | ⚠️ No global explanation out of the box |
SHAP — SHapley Additive exPlanations
Game Theory Foundation
SHAP is based on Shapley values from cooperative game theory, introduced by Lloyd Shapley in 1953 (Shapley later received the 2012 Nobel Memorial Prize in Economic Sciences). The question it answers:
If a group of players (features) work together to achieve a payoff (prediction), how do we fairly distribute the credit among them?
The Pizza Bill Analogy
Imagine four friends order a pizza. The base pizza costs €15, and the final bill comes to €40:
- Alice ordered the expensive toppings (+€12)
- Bob ordered extra cheese (+€5)
- Charlie ordered the large size (+€8)
- Diana ordered nothing special (+€0)
The Shapley value fairly distributes the cost based on each person's marginal contribution: what they added, averaged over every possible order in which they could have joined.
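For small games the Shapley value can be computed exactly by enumerating all orderings. The sketch below assumes, for illustration, that any non-empty group pays the €15 base price plus its members' add-ons:

```python
# Exact Shapley values for the pizza example, by brute-force enumeration.
# Assumption for this illustration: any non-empty group pays the €15 base
# pizza plus its members' add-ons; the empty group pays nothing.
from itertools import permutations

addons = {"Alice": 12, "Bob": 5, "Charlie": 8, "Diana": 0}
BASE = 15

def cost(group):
    return (BASE + sum(addons[p] for p in group)) if group else 0

players = list(addons)
shapley = {p: 0.0 for p in players}
orders = list(permutations(players))
for order in orders:
    joined = []
    for p in order:
        # marginal contribution of p when joining after `joined`
        shapley[p] += cost(joined + [p]) - cost(joined)
        joined.append(p)
for p in shapley:
    shapley[p] /= len(orders)

print(shapley)
print("total:", sum(shapley.values()))
```

The result splits the €15 base equally (€3.75 each) and assigns each add-on to its owner, and the shares sum exactly to the €40 bill. This additivity is the same "local accuracy" property SHAP guarantees for model predictions.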
SHAP Properties (Mathematical Guarantees)
| Property | Description |
|---|---|
| Local accuracy | SHAP values sum up to the difference between the prediction and the base value |
| Missingness | Features that don't affect the prediction get SHAP value = 0 |
| Consistency | If a model changes so that a feature's marginal contribution grows (or stays equal), its SHAP value does not decrease |
| Fairness | Based on rigorous game theory — the only method with these guarantees |
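Local accuracy and missingness are easy to verify on a linear model, where (assuming independent features) the exact SHAP value of feature i has the closed form w_i · (x_i − mean_i). The model and data below are invented for the illustration:

```python
# Local accuracy and missingness illustrated via the closed-form
# SHAP values of a linear model (invented weights and data).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
w = np.array([2.0, -1.0, 0.0])  # note: feature 3 has zero weight
b = 4.0

def predict(X):
    return X @ w + b

x = X[0]
base_value = predict(X).mean()         # expected prediction over the dataset
shap_vals = w * (x - X.mean(axis=0))   # exact SHAP values for a linear model

# Local accuracy: base value + sum of SHAP values equals the prediction
print(predict(x), base_value + shap_vals.sum())
# Missingness: the zero-weight feature gets zero attribution
print(shap_vals[2])
```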
SHAP Implementation
```python
import shap
import joblib
import numpy as np
import matplotlib.pyplot as plt

model = joblib.load("models/model_v1.joblib")
X_train = np.load("data/X_train.npy")
X_test = np.load("data/X_test.npy")
feature_names = ["feature_1", "feature_2", "feature_3", "feature_4", "feature_5"]

# TreeExplainer is the fast, exact algorithm for tree ensembles
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Note: for binary classifiers, older shap versions return a list of two
# arrays (one per class); newer versions may return a single 3-D array.
# The snippets below use the list-style (per-class) layout.
print("SHAP values shape:", np.array(shap_values).shape)
print("Base value (expected value):", explainer.expected_value)
```
SHAP Visualizations
Force Plot (Single Prediction)
```python
instance_idx = 0

shap.initjs()
shap.force_plot(
    base_value=explainer.expected_value[1],    # class-1 base value
    shap_values=shap_values[1][instance_idx],  # class-1 SHAP values
    features=X_test[instance_idx],
    feature_names=feature_names,
    matplotlib=True,
)
plt.savefig("shap_force_plot.png", dpi=150, bbox_inches="tight")
```
The force plot shows how each feature pushes the prediction from the base value toward the final prediction:
- Red features push the prediction higher (toward Class 1)
- Blue features push the prediction lower (toward Class 0)
- Wider bars = stronger contribution
Summary Plot (Global Feature Importance)
```python
shap.summary_plot(
    shap_values[1],  # class-1 SHAP values
    features=X_test,
    feature_names=feature_names,
    show=False,
)
plt.tight_layout()
plt.savefig("shap_summary_plot.png", dpi=150, bbox_inches="tight")
```
The summary plot shows:
- Y-axis: Features ranked by importance (top = most important)
- X-axis: SHAP value (impact on prediction)
- Color: Feature value (red = high, blue = low)
Waterfall Plot (Detailed Single Prediction)
```python
shap_explanation = shap.Explanation(
    values=shap_values[1][instance_idx],
    base_values=explainer.expected_value[1],
    data=X_test[instance_idx],
    feature_names=feature_names,
)
shap.waterfall_plot(shap_explanation, show=False)
plt.tight_layout()
plt.savefig("shap_waterfall_plot.png", dpi=150, bbox_inches="tight")
```
Bar Plot (Mean Absolute SHAP Values)
```python
shap.summary_plot(
    shap_values[1],
    features=X_test,
    feature_names=feature_names,
    plot_type="bar",  # mean absolute SHAP value per feature
    show=False,
)
plt.tight_layout()
plt.savefig("shap_bar_plot.png", dpi=150, bbox_inches="tight")
```
Dependence Plot (Feature Interaction)
```python
shap.dependence_plot(
    ind="feature_1",                # feature on the x-axis
    shap_values=shap_values[1],
    features=X_test,
    feature_names=feature_names,
    interaction_index="feature_2",  # color by a potentially interacting feature
    show=False,
)
plt.tight_layout()
plt.savefig("shap_dependence_plot.png", dpi=150, bbox_inches="tight")
```
LIME vs. SHAP — Detailed Comparison
| Aspect | LIME | SHAP |
|---|---|---|
| Foundation | Local linear approximation | Shapley values (game theory) |
| Theoretical guarantees | ❌ No formal guarantees | ✅ Local accuracy, consistency, missingness |
| Scope | Local only | Local + Global |
| Speed | ⚡ Fast (single instance) | 🐢 Slower (all instances for global) |
| Stability | ⚠️ Can vary between runs | ✅ Deterministic for tree models |
| Model support | Any model (model-agnostic) | Any model (KernelSHAP), optimized for trees (TreeSHAP) |
| Output | Feature importance weights | Additive feature attributions |
| Visualization | Bar chart, HTML report | Force plot, summary, waterfall, dependence |
| Best for | Quick single-instance explanations | Rigorous analysis, regulatory compliance |
| Installation | pip install lime | pip install shap |
- Use LIME when you need a quick, approximate explanation for one prediction
- Use SHAP when you need rigorous, theoretically-grounded explanations — especially for regulatory compliance
- Use both in production: LIME for real-time explanations, SHAP for periodic model audits
Feature Importance Methods
Beyond LIME and SHAP, there are other ways to understand feature importance:
Built-in Feature Importance (Tree Models)
```python
import pandas as pd

importances = model.feature_importances_
feature_importance_df = pd.DataFrame({
    "feature": feature_names,
    "importance": importances,
}).sort_values("importance", ascending=False)
print(feature_importance_df.to_string(index=False))
```
Permutation Importance
```python
import numpy as np
from sklearn.inspection import permutation_importance

y_test = np.load("data/y_test.npy")  # labels for the held-out set

result = permutation_importance(
    model, X_test, y_test,
    n_repeats=10,
    random_state=42,
)
for i in result.importances_mean.argsort()[::-1]:
    print(f"{feature_names[i]}: {result.importances_mean[i]:.4f} "
          f"± {result.importances_std[i]:.4f}")
```
Comparison of Feature Importance Methods
| Method | Based On | Pros | Cons |
|---|---|---|---|
| Built-in (Gini/Gain) | Impurity reduction during training | Fast, no extra computation | Biased toward high-cardinality features |
| Permutation | Drop in accuracy when feature is shuffled | Model-agnostic, unbiased | Slow for large datasets, correlated features |
| SHAP | Game-theoretic fair attribution | Theoretically sound, handles interactions | Computationally expensive |
| LIME | Local linear approximation | Fast, intuitive | Unstable, local only |
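The shuffling idea behind permutation importance fits in a few lines. The following is an illustrative from-scratch sketch on synthetic data, not scikit-learn's implementation:

```python
# Permutation importance from scratch (illustrative sketch on synthetic data).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=5,
                           n_informative=3, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)
clf = RandomForestClassifier(random_state=42).fit(X_tr, y_tr)

rng = np.random.default_rng(42)
baseline = clf.score(X_te, y_te)
drops = []
for j in range(X_te.shape[1]):
    X_perm = X_te.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])      # break feature j's link to the target
    drops.append(baseline - clf.score(X_perm, y_te))  # accuracy lost without feature j
    print(f"feature_{j + 1}: importance = {drops[-1]:+.4f}")
```

In practice you would repeat each shuffle several times and average (as sklearn's `n_repeats` does), since a single permutation is noisy.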
Partial Dependence Plots (PDP)
PDPs show how a feature affects predictions on average, marginalizing (averaging) over the values of all other features:
```python
import matplotlib.pyplot as plt
from sklearn.inspection import PartialDependenceDisplay

fig, ax = plt.subplots(1, 3, figsize=(15, 4))
PartialDependenceDisplay.from_estimator(
    model, X_test,
    features=[0, 1, 2],
    feature_names=feature_names,
    ax=ax,
    kind="both",  # average PDP line plus per-sample ICE curves
)
fig.tight_layout()
fig.savefig("partial_dependence_plots.png", dpi=150, bbox_inches="tight")
```
Communicating Results to Stakeholders
Different audiences need different levels of detail:
For Executives
"Our model approves/rejects loan applications based primarily on income (40% influence), credit score (30%), and employment history (15%). The remaining features have minor impact."
For Regulators
"For application #12345, the model predicted rejection with 92% confidence. The top contributing factors were:
- Debt-to-income ratio of 0.85 (above threshold of 0.60): SHAP contribution = -0.28
- Credit score of 580 (below average of 720): SHAP contribution = -0.22
- Employment duration of 3 months (below median of 24 months): SHAP contribution = -0.15"
For Data Scientists
Share the full SHAP analysis:
- Summary plots for global feature importance
- Force plots for individual predictions
- Dependence plots for feature interactions
- Statistical tests for feature stability
Explainability Report Template
```markdown
# Model Explainability Report

## Model Overview
- **Model type**: Random Forest Classifier
- **Training data**: 10,000 samples, 5 features
- **Performance**: Accuracy 92%, F1 0.91

## Global Feature Importance (SHAP)
| Rank | Feature    | Mean \|SHAP\| | Direction                     |
|------|------------|---------------|-------------------------------|
| 1    | Income     | 0.245         | Higher → more likely approved |
| 2    | Credit     | 0.198         | Higher → more likely approved |
| 3    | Employment | 0.112         | Longer → more likely approved |
| 4    | Age        | 0.067         | Moderate impact               |
| 5    | Zip Code   | 0.023         | Minimal impact                |

## Individual Explanation (Sample #12345)
[Force plot image]
[Waterfall plot image]

## Fairness Analysis
- No significant bias detected for protected attributes
- Gender SHAP value: 0.002 (negligible)
```
Key Takeaways
- Explainability is not optional — it's required by regulation (EU AI Act) and essential for trust
- LIME explains single predictions by fitting a local interpretable model around the prediction point
- SHAP uses game theory (Shapley values) to fairly distribute credit among features — with mathematical guarantees
- Use local explanations to justify individual decisions; use global explanations to understand overall model behavior
- SHAP is preferred for regulatory compliance due to its theoretical guarantees
- Different stakeholders need different explanation formats: executives want summaries, regulators want details
- Combine explainability with fairness analysis to detect and mitigate bias
- Make explainability part of your model development lifecycle, not an afterthought