Customer Health Scoring with AI Signals (L4 Playbook)

Why this matters

The traditional "weighted" health score is a lie. Most CS teams manually assign percentages to metrics (30% for logins, 20% for NPS, 50% for "CSM Sentiment") based on gut feeling. This approach fails because the weights never change and each signal is scored in isolation. It can’t catch the account where usage is high but the sentiment in support tickets is toxic, or the account where login volume is steady but the executive sponsor hasn't opened the app in four months.

For a $50M ARR company, a 2-point improvement in Net Revenue Retention (NRR) doesn't just add $1M in recurring revenue; at a revenue multiple of 10x or more, it adds $10M+ to the enterprise valuation. The cost of doing nothing is "Silent Churn": accounts that look healthy on a dashboard but are already interviewing your competitors. Level 4 maturity means moving from retrospective reporting to a predictive system that signals risk 90 days before a renewal conversation even starts.

How it works

1. Define and segment churn events

A health score is useless if it’s trying to predict "bad vibes." You must predict specific financial outcomes. Start by opening your CRM (Salesforce or HubSpot) and pulling your historical account data into Snowflake or BigQuery.

You need to define three binary flags for every account over the last 24 months:

Logo Churn: The contract was terminated.
Downgrade: The account moved to a lower plan tier (e.g., from Enterprise to Pro).
Contraction: Seat count dropped by >10% or usage-based revenue plummeted.

The Math: If your Logo Churn rate is 5% but your Contraction rate is 15%, your current health score is likely missing the CFO-led budget cuts. Separating these allows your AI model to weigh "efficiency" signals differently than "utility" signals.
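
The three flags above can be sketched in plain Python before you commit to warehouse SQL. This is a minimal illustration, not a reference implementation: the AccountSnapshot fields, the plan names in PLAN_RANK, and the choice to evaluate Downgrade and Contraction only for accounts that did not terminate are all assumptions made for the example. Only the 10% seat-drop threshold comes from the definition above.

```python
from dataclasses import dataclass

# Plan tiers ordered lowest to highest (hypothetical names).
PLAN_RANK = {"Starter": 0, "Pro": 1, "Enterprise": 2}


@dataclass
class AccountSnapshot:
    """One account at the start and end of the lookback window (assumed shape)."""
    account_id: str
    terminated: bool   # contract ended during the window
    plan_start: str
    plan_end: str
    seats_start: int
    seats_end: int


def churn_flags(a: AccountSnapshot, seat_drop_threshold: float = 0.10) -> dict:
    """Return the three binary outcome flags for one account."""
    if a.terminated:
        # Assumption: a terminated account counts as logo churn only,
        # so its seat count falling to zero is not double-counted.
        return {"logo_churn": True, "downgrade": False, "contraction": False}

    downgrade = PLAN_RANK[a.plan_end] < PLAN_RANK[a.plan_start]

    # Contraction: seat count dropped by more than the threshold.
    if a.seats_start > 0:
        seat_drop = (a.seats_start - a.seats_end) / a.seats_start
    else:
        seat_drop = 0.0
    contraction = seat_drop > seat_drop_threshold

    return {"logo_churn": False, "downgrade": downgrade, "contraction": contraction}


example = churn_flags(AccountSnapshot("acct-1", False, "Pro", "Pro", 100, 85))
print(example)
```

Keeping the outcomes as separate flags means you can train one model per outcome later, which is what lets budget-driven contraction show up separately from outright cancellation.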

2. Build a multi-dimensional signal inventory

Stop measuring "Logins." It’s a vanity metric. Instead, connect your data stack to build a three-pillared signal inventory:

Product (Usage Decay): Use Mixpanel or Pendo to track the "Core Action" (e.g., reports exported, API calls made). Export the 30-day delta. A 20% drop in core action frequency is a Tier 1 risk signal.
Support (AI Sentiment): Don't just count tickets. Use an LLM via Gong or MonkeyLearn to analyze Zendesk/Intercom threads. Assign a "Sentiment Score" (0-1). An account with 10 tickets where the user says "This is frustrating" is healthier than an account with 0 tickets and 0 logins.
Relationship (Exec Pulse): Use your calendar/email integration to track "Partner Personas." If no one with a "VP" or "C-Suite" title has logged in or replied to an email in 90 days, the account is unanchored.
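
Two of these signals reduce to simple arithmetic, sketched below. The function names and input shapes are hypothetical; in practice the counts would come from your product analytics export and the touch dates from your calendar/email integration. The 20% drop and the 90-day window are the thresholds stated above.

```python
from datetime import date, timedelta


def core_action_delta(prev_30d: int, last_30d: int) -> float:
    """Relative change in core-action count between two 30-day windows."""
    if prev_30d == 0:
        return 0.0  # no baseline to compare against
    return (last_30d - prev_30d) / prev_30d


def is_usage_decay(prev_30d: int, last_30d: int, threshold: float = 0.20) -> bool:
    """Tier 1 risk: core-action frequency fell by at least the threshold."""
    return core_action_delta(prev_30d, last_30d) <= -threshold


def is_unanchored(exec_touch_dates: list, today: date, window_days: int = 90) -> bool:
    """True if no VP/C-level login or email reply falls inside the window.

    exec_touch_dates: dates of logins or replies from exec-titled contacts.
    """
    cutoff = today - timedelta(days=window_days)
    return not any(d >= cutoff for d in exec_touch_dates)


today = date(2024, 6, 1)
print(is_usage_decay(prev_30d=100, last_30d=80))   # 20% drop
print(is_unanchored([date(2024, 1, 15)], today))   # last exec touch in January
```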

3. Train and backtest the AI model

Before you roll this out to CSMs, you have to prove it works. Use a "Lookback" methodology with a tool like Akkio or a Python Random Forest model.

Take your data as it stood six months ago, hide everything that has happened since, and ask the model: "Based on these signals, who will churn in the next 180 days?" Compare the model's predictions to what actually happened. You are aiming for an AUC (Area Under the ROC Curve) of >0.75. If your model scores 0.50, it's a coin flip. If it's 0.99, you almost certainly have target leakage rather than a great model (usually because you're including "requested cancellation" as an input signal, which is cheating).
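
To make the AUC check concrete, here is a small stdlib-only sketch that scores a backtest. AUC is computed from its pairwise definition: the share of (churned, retained) account pairs in which the churned account received the higher predicted risk. The labels and scores are made-up illustration data, and the 0.99 cutoff in backtest_verdict is an assumed warning line taken from the rule of thumb above.

```python
def roc_auc(labels: list, scores: list) -> float:
    """AUC via pairwise comparison; ties count as half a win."""
    churned = [s for y, s in zip(labels, scores) if y == 1]
    retained = [s for y, s in zip(labels, scores) if y == 0]
    if not churned or not retained:
        raise ValueError("backtest needs both churned and retained accounts")
    wins = 0.0
    for c in churned:
        for r in retained:
            if c > r:
                wins += 1.0
            elif c == r:
                wins += 0.5
    return wins / (len(churned) * len(retained))


def backtest_verdict(auc: float) -> str:
    """Translate the AUC into the rule of thumb from the text."""
    if auc >= 0.99:
        return "suspicious: check inputs for leaked outcomes"
    if auc > 0.75:
        return "meets target"
    return "below target"


# 1 = churned within 180 days of the snapshot, 0 = retained (illustrative).
actual = [1, 0, 1, 0, 0, 1]
predicted_risk = [0.9, 0.2, 0.4, 0.5, 0.1, 0.8]

auc = roc_auc(actual, predicted_risk)
print(round(auc, 3), backtest_verdict(auc))
```

The pairwise loop is quadratic, which is fine for a few thousand accounts. If you are already in Python with scikit-learn, `sklearn.metrics.roc_auc_score` computes the same quantity.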

4. Deploy the score to the CSM workflow

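Data scientists love probabilities (e.g., "0.82 risk"); CSMs need actionable scores. Convert the churn probability into a 0-100 health score where higher is healthier (for example, score = 100 × (1 − probability), so 0.82 risk becomes 18), then band it: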

0-40 (Red): Immediate intervention required.
41-70 (Yellow): Monitor/Monthly check-in.
71-100 (Green): Expansion opportunity.
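
A minimal sketch of that mapping follows. It assumes the model outputs a churn probability and that the health score is simply the inverse of it scaled to 0-100; the linear inversion is an illustrative choice, and you may prefer a calibrated or percentile-based mapping. The band boundaries are the ones listed above.

```python
def health_score(churn_probability: float) -> int:
    """Invert a churn probability into a 0-100 health score (higher = healthier)."""
    if not 0.0 <= churn_probability <= 1.0:
        raise ValueError("churn_probability must be between 0 and 1")
    return round(100 * (1.0 - churn_probability))


def band(score: int) -> str:
    """Map a 0-100 health score to its color band."""
    if score <= 40:
        return "Red"
    if score <= 70:
        return "Yellow"
    return "Green"


score = health_score(0.82)
print(score, band(score))
```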

The "Why" Column: This is the most critical step. Use Claude or OpenAI's API to generate a 10-word summary of the score. Example: "Usage down 30%; Exec Sponsor departed; Support sentiment: Angry." Without the "Why," your CSMs will ignore the score.

5. Execute mandatory save-play sequences

A "Red" score must trigger an automated Task in Salesforce. This isn't a suggestion; it’s a playbook execution. The "Save Play" should include:

Executive Outreach: An automated (but personalized) email from your VP of CS to their VP-level contact.
Usage Audit: A technical review of why seat utilization has dropped.
Value Realization Call: A scheduled meeting specifically to re-confirm the ROI they saw during the sales cycle.
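
The trigger logic can be prototyped as plain data before you build it in your workflow tool. The sketch below is hypothetical throughout: the task dictionary fields are not a Salesforce schema, and the due-day offsets are placeholder values. It only shows the shape of "Red score in, three tasks out."

```python
from datetime import date, timedelta

RED_MAX_SCORE = 40  # upper bound of the Red band

# (step name, days until due); the offsets are assumed, tune to your team.
SAVE_PLAY_STEPS = [
    ("Executive Outreach", 2),
    ("Usage Audit", 5),
    ("Value Realization Call", 10),
]


def save_play_tasks(account_id: str, score: int, today: date) -> list:
    """Return one task per Save Play step for a Red account, else nothing."""
    if score > RED_MAX_SCORE:
        return []
    return [
        {
            "account_id": account_id,
            "subject": f"Save Play: {step}",
            "due_date": today + timedelta(days=offset),
        }
        for step, offset in SAVE_PLAY_STEPS
    ]


tasks = save_play_tasks("acct-1", score=18, today=date(2024, 6, 1))
for task in tasks:
    print(task["subject"], task["due_date"])
```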

Tools you need

Data Warehouse: Snowflake or BigQuery (to house the historical data).
Product Analytics: Mixpanel, Pendo, or Amplitude.
AI/ML Modeling: Akkio (no-code), Pecan, or Python (Scikit-learn).
Sentiment/Signals: Gong (for call/email sentiment), Zendesk (support data).
Workflow: Gainsight, Totango, or Salesforce Flow.

KPIs to track

Net Revenue Retention (NRR): Goal is 105%+ for mid-market, 115%+ for enterprise.
Save Rate: % of "Red" accounts that return to "Yellow" or "Green" within 60 days.
Time-to-Intervention: How many hours pass between a score dropping to Red and a CSM executing the first step of the Save Play? (Target: <48 hours).
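
These three KPIs are simple ratios and differences. The sketch below uses the common NRR definition (starting ARR plus expansion, minus contraction and churn, divided by starting ARR); the function names and all dollar figures are illustrative assumptions, not benchmarks.

```python
from datetime import datetime


def net_revenue_retention(starting_arr: float, expansion: float,
                          contraction: float, churned: float) -> float:
    """NRR for a cohort over a period, as a ratio (1.05 means 105%)."""
    return (starting_arr + expansion - contraction - churned) / starting_arr


def save_rate(red_accounts: int, recovered_within_60d: int) -> float:
    """Share of Red accounts back in Yellow or Green within 60 days."""
    if red_accounts == 0:
        return 0.0
    return recovered_within_60d / red_accounts


def hours_to_intervention(flagged_at: datetime, first_action_at: datetime) -> float:
    """Hours between the score turning Red and the first Save Play step."""
    return (first_action_at - flagged_at).total_seconds() / 3600


nrr = net_revenue_retention(50_000_000, 6_000_000, 1_500_000, 2_000_000)
rate = save_rate(red_accounts=20, recovered_within_60d=7)
lag = hours_to_intervention(datetime(2024, 6, 1, 9), datetime(2024, 6, 2, 15))
print(f"NRR {nrr:.0%}, save rate {rate:.0%}, intervention lag {lag:.0f}h")
```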

Common pitfalls

The "Zombie" Account: Don't waste your best CSMs on $2k ARR accounts that have been "Red" for a year. Filter your Save Plays by Customer Lifetime Value (CLV).
Ignoring the Human: AI predicts the risk; humans fix the relationship. If your CSMs treat the score as a replacement for talking to customers, your churn will actually increase.
Signal Noise: Using too many inputs. Start with 5 high-quality signals (Usage, Sentiment, Exec Presence, Billing Latency, and NPS).

When to graduate to the next level

Once your AI health score is flagging 80% of the accounts that go on to churn (that is, 80% recall in backtests) and your Save Rate is above 30%, you are ready for L5: Automated Expansion Logic. This is where the model doesn't just find the risks; it identifies the "Hidden Champions" and automatically triggers growth sequences and upsell offers based on "Super User" behavior.