The Challenge
GrowthTech, a fast-growing B2B SaaS company with 5,000+ customers, was facing a critical problem: monthly churn was hovering around 5%, and their Customer Success team had no systematic way to identify which customers were at risk.
Their existing approach was reactive: by the time a customer expressed dissatisfaction, it was often too late to intervene. They needed a proactive, data-driven system to:
- Identify at-risk customers before they churned
- Understand why specific customers were at risk
- Give CSMs insights they could act on immediately
- Integrate seamlessly with their existing CRM (Salesforce)
Key Constraints:
- The system had to pass a security review for SOC 2 compliance
- Predictions had to be explainable (no black-box models)
- CSMs were non-technical and needed clear, actionable recommendations
- The system had to integrate with Salesforce and the support ticketing system (Zendesk)
Our Approach
We designed a three-phase solution:
Phase 1: Data Foundation (Weeks 1-2)
We started by consolidating data from multiple sources:
- Product usage events (Segment)
- Support tickets (Zendesk)
- Billing history (Stripe)
- CRM data (Salesforce)
We built a unified customer event store in PostgreSQL, partitioned for performance, and created Airflow feature pipelines that compute behavioral signals daily.
Key features engineered:
- Login frequency and recency
- Feature adoption scores
- Support ticket velocity and sentiment
- Payment failures and billing issues
- Contract renewal proximity
- User engagement trends (7-day, 30-day, 90-day windows)
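As a rough illustration of the windowed signals listed above, the sketch below computes login recency, login counts over the 7-, 30-, and 90-day windows, and a simple trend ratio for one account. The function name `login_features`, the trend definition, and the sample data are assumptions made for this example; the production pipeline ran in Airflow against PostgreSQL and may have defined these features differently.

```python
from datetime import date, timedelta

WINDOWS_DAYS = (7, 30, 90)  # the windows named in the feature list


def login_features(login_dates, as_of):
    """Compute recency and windowed login counts for one account.

    login_dates: iterable of datetime.date, one entry per login event.
    as_of: the scoring date; later events are ignored to avoid leakage.
    """
    past = sorted(d for d in login_dates if d <= as_of)
    features = {
        # Days since the most recent login; None if there were no logins.
        "days_since_last_login": (as_of - past[-1]).days if past else None,
    }
    for window in WINDOWS_DAYS:
        start = as_of - timedelta(days=window)
        # Count logins in the half-open interval (as_of - window, as_of].
        features[f"logins_{window}d"] = sum(1 for d in past if d > start)
    # Hypothetical trend signal: recent daily rate relative to the
    # 90-day daily rate. Values below 1.0 suggest declining engagement.
    rate_7 = features["logins_7d"] / 7
    rate_90 = features["logins_90d"] / 90
    features["login_trend_7d_vs_90d"] = rate_7 / rate_90 if rate_90 else None
    return features


if __name__ == "__main__":
    today = date(2024, 6, 30)
    # Made-up account: daily logins until 30 days ago, then only two logins.
    logins = [today - timedelta(days=n) for n in range(30, 90)]
    logins += [today - timedelta(days=10), today - timedelta(days=3)]
    print(login_features(logins, today))
```

Filtering on `as_of` matters when building training data from history: each snapshot should only see events that had happened by its scoring date.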
Phase 2: Model Development (Weeks 3-4)
We trained a gradient-boosted decision tree model (XGBoost) using 18 months of historical data, with a 60/20/20 train/validation/test split.
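The case study does not say whether the 60/20/20 split was random or time-ordered. For churn data, a time-ordered split is a common choice because it keeps future information out of training. The sketch below shows one way to do that; `chronological_split` and the `month` field are hypothetical names used only for illustration.

```python
def chronological_split(rows, key, fractions=(0.6, 0.2, 0.2)):
    """Split rows into train/validation/test sets by time order.

    rows: list of records (for example dicts), one per account snapshot.
    key: function returning a sortable timestamp for a row.
    fractions: share of rows for each split; must sum to 1.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    ordered = sorted(rows, key=key)
    n = len(ordered)
    train_end = int(n * fractions[0])
    valid_end = train_end + int(n * fractions[1])
    # Oldest rows train the model; the newest rows are held out for testing.
    return ordered[:train_end], ordered[train_end:valid_end], ordered[valid_end:]


if __name__ == "__main__":
    # Made-up snapshots, deliberately out of order.
    snapshots = [{"month": m} for m in reversed(range(100))]
    train, valid, test = chronological_split(snapshots, key=lambda r: r["month"])
    print(len(train), len(valid), len(test))
```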
The model predicted churn probability at the account level with 90% accuracy and 0.85 AUC-ROC. More importantly, we integrated SHAP (SHapley Additive exPlanations) to attribute every prediction to the features that drove it.
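To make the idea behind SHAP concrete, the sketch below computes exact Shapley values for a single prediction by brute force. It is not the SHAP library and not the deployed model: `toy_risk` is a made-up linear score, the feature names and baseline values are assumptions, and the brute-force loop is only practical for a handful of features. Libraries such as SHAP use much faster algorithms tailored to tree models.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one prediction.

    predict: function mapping a feature dict to a score.
    instance: feature values for the account being explained.
    baseline: reference values used when a feature is treated as absent.
    """
    names = list(instance)
    n = len(names)

    def value(subset):
        # Features in the subset take the instance's value, others the baseline's.
        mixed = {f: (instance[f] if f in subset else baseline[f]) for f in names}
        return predict(mixed)

    contributions = {}
    for feature in names:
        others = [f for f in names if f != feature]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                gain = value(set(subset) | {feature}) - value(set(subset))
                total += weight * gain
        contributions[feature] = total
    return contributions


def toy_risk(x):
    """Hypothetical risk score, for illustration only."""
    score = 0.05 + 0.02 * x["open_tickets"] - 0.01 * x["logins_30d"]
    return score + (0.30 if x["payment_failed"] else 0.0)


if __name__ == "__main__":
    account = {"open_tickets": 5, "payment_failed": 1, "logins_30d": 2}
    typical = {"open_tickets": 1, "payment_failed": 0, "logins_30d": 20}
    for name, phi in shapley_values(toy_risk, account, typical).items():
        print(f"{name}: {phi:+.2f}")
```

The contributions sum to the difference between the account's score and the baseline score, which is what lets a CSM read them as "this much of the risk comes from this signal".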
Phase 3: Production Deployment & Integration (Weeks 5-6)
We deployed the model as a FastAPI microservice with:
- Real-time scoring endpoint (<100ms p95 latency)
- Batch scoring job that runs daily via Airflow
- Integration with Salesforce to surface risk scores and explanations directly in account records
- Automated Slack alerts to CSMs when high-value accounts crossed risk thresholds
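The alerting rule in the last item can be sketched as follows. The threshold values, the `build_alert` helper, and the account fields are all assumptions made for this example, since the case study does not state the real cutoffs. The only external detail relied on is that Slack incoming webhooks accept a JSON body with a `text` field. The sketch builds the payload but does not send it.

```python
import json

# Hypothetical thresholds; the case study does not state the real values.
RISK_THRESHOLD = 0.7
HIGH_VALUE_ARR = 50_000


def crossed_threshold(previous_score, current_score, threshold=RISK_THRESHOLD):
    """True only when the score moves from below the threshold to at or above it."""
    return previous_score < threshold <= current_score


def build_alert(account, previous_score, current_score, top_factors):
    """Return a Slack webhook payload (JSON string), or None if no alert is due."""
    if account["arr"] < HIGH_VALUE_ARR:
        return None
    if not crossed_threshold(previous_score, current_score):
        return None
    factors = ", ".join(top_factors)
    text = (
        f"{account['name']} (ARR ${account['arr']:,}) is now at "
        f"{current_score:.0%} churn risk. Top factors: {factors}"
    )
    # Slack incoming webhooks accept a JSON body with a "text" field.
    return json.dumps({"text": text})


if __name__ == "__main__":
    acme = {"name": "Acme Corp", "arr": 120_000}  # made-up account
    print(build_alert(acme, 0.55, 0.82, ["feature usage decline", "payment failure"]))
```

Alerting on the crossing rather than on the level means an account that stays above the threshold does not trigger a new message after every daily batch run.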
We built a simple React dashboard for CS leadership to track:
- Overall churn risk trends
- Top at-risk accounts by ARR
- Most common churn risk factors
- Intervention effectiveness tracking
Architecture
┌─────────────────┐
│  Data Sources   │
│ (Segment, CRM,  │
│  Zendesk, etc.) │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  PostgreSQL DB  │
│  (Event Store)  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Airflow ETL   │
│  (Feature Eng)  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐       ┌──────────────┐
│  XGBoost Model  │◄──────┤    MLflow    │
│     + SHAP      │       │   Registry   │
└────────┬────────┘       └──────────────┘
         │
         ▼
┌─────────────────┐
│     FastAPI     │
│ Prediction API  │
└────────┬────────┘
         │
         ├──► Salesforce (CRM)
         ├──► Slack Alerts
         └──► CS Dashboard
Results
Within the first quarter after deployment:
- About 34% relative reduction in churn: Monthly churn dropped from 5.0% to 3.3%
- $2.1M in saved ARR: Proactive interventions retained high-value accounts that would have churned
- 90% CSM satisfaction: CS team reported high confidence in model predictions and appreciated actionable explanations
- <100ms prediction latency: Real-time scoring enabled use in live customer interactions
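For readers who want to check the churn figures, the short sketch below derives the relative reduction from the two monthly rates and shows what each rate would imply for 12-month retention. The annual figures assume churn stays constant and compounds monthly, which is a simplification and not a number reported by the project.

```python
def relative_reduction(before, after):
    """Fractional drop from `before` to `after`."""
    return (before - after) / before


def annual_retention(monthly_churn):
    """Share of accounts still active after 12 months of constant churn."""
    return (1 - monthly_churn) ** 12


if __name__ == "__main__":
    before, after = 0.050, 0.033  # monthly churn rates quoted in the results
    print(f"Relative reduction: {relative_reduction(before, after):.1%}")
    print(f"Implied annual retention before: {annual_retention(before):.1%}")
    print(f"Implied annual retention after:  {annual_retention(after):.1%}")
```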
Most Impactful Features:
- Decline in feature usage (30% of churn risk)
- Support ticket velocity spike (25%)
- Payment failures (20%)
- Decrease in login frequency (15%)
- Contract renewal proximity + negative NPS (10%)
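The case study does not say how these percentages were computed. One common way to get shares of this kind is to sum the absolute per-account attributions (for example, SHAP values) for each feature and normalize so the shares add up to 100%. The sketch below illustrates that approach with made-up numbers; `importance_shares` is a hypothetical helper.

```python
def importance_shares(attributions):
    """Turn per-account attributions into global shares that sum to 1.

    attributions: list of dicts mapping feature name -> signed contribution
    for one account. Absolute values are used so that contributions that
    raise risk and contributions that lower it both count as influence.
    """
    totals = {}
    for row in attributions:
        for feature, value in row.items():
            totals[feature] = totals.get(feature, 0.0) + abs(value)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {feature: 0.0 for feature in totals}
    return {feature: total / grand_total for feature, total in totals.items()}


if __name__ == "__main__":
    # Made-up attributions for three accounts.
    rows = [
        {"feature_usage": 0.12, "ticket_velocity": 0.05, "payment_failures": 0.00},
        {"feature_usage": -0.03, "ticket_velocity": 0.10, "payment_failures": 0.20},
        {"feature_usage": 0.15, "ticket_velocity": -0.02, "payment_failures": 0.00},
    ]
    ranked = sorted(importance_shares(rows).items(), key=lambda kv: -kv[1])
    for feature, share in ranked:
        print(f"{feature}: {share:.0%}")
```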
Key Learnings
- Explainability is non-negotiable: CSMs needed to understand why a customer was flagged. SHAP explanations were critical for adoption.
- Start with simple features: We initially over-engineered features. The most predictive signals were straightforward usage and engagement metrics.
- Integrate where users already work: Surfacing predictions directly in Salesforce drove adoption far more than a standalone dashboard.
- Model monitoring is essential: We built drift detection to alert when prediction patterns changed, catching a data pipeline bug within 48 hours.
- Balance precision and recall carefully: We optimized for high precision (few false positives) because CSM time was limited and false alarms eroded trust.
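The precision-first trade-off in the last point can be sketched as a threshold search on validation data: pick the lowest score cutoff that still meets a target precision, which keeps recall as high as possible under that constraint. The scores, labels, and target below are made up, and the helper names are hypothetical. The case study does not describe how the production threshold was actually chosen.

```python
def precision_recall_at(scores, labels, threshold):
    """Precision and recall when accounts scoring >= threshold are flagged."""
    flagged = [y for s, y in zip(scores, labels) if s >= threshold]
    true_positives = sum(flagged)
    positives = sum(labels)
    precision = true_positives / len(flagged) if flagged else 1.0
    recall = true_positives / positives if positives else 0.0
    return precision, recall


def lowest_threshold_for_precision(scores, labels, target_precision):
    """Return (threshold, precision, recall) for the lowest qualifying cutoff.

    Recall never increases as the threshold rises, so the lowest threshold
    that meets the precision target also has the best recall among those
    that qualify. Returns None if no cutoff reaches the target.
    """
    for threshold in sorted(set(scores)):
        precision, recall = precision_recall_at(scores, labels, threshold)
        if precision >= target_precision:
            return threshold, precision, recall
    return None


if __name__ == "__main__":
    # Made-up validation scores and churn labels (1 = churned).
    scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
    labels = [0, 0, 0, 1, 0, 1, 1, 1]
    print(lowest_threshold_for_precision(scores, labels, target_precision=0.9))
```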
Technology Stack
- ML Framework: Python, Scikit-learn, XGBoost, SHAP
- Data Infrastructure: PostgreSQL, Apache Airflow
- API: FastAPI, Docker, Kubernetes
- Monitoring: MLflow, Prometheus, Grafana
- Integrations: Salesforce API, Slack Webhooks
- Frontend: React, Recharts for dashboards
Conclusion
This project demonstrated that churn prediction doesn't have to be a black-box exercise. By combining strong predictive performance with clear explainability and tight integration into existing workflows, we built a system that CSMs trusted and used daily, saving $2.1M in ARR in its first quarter and improving customer relationships.