Turning On Einstein Prediction Builder with Dirty Data
“AI trained on garbage data gives you garbage predictions with confidence scores.”
What Happened
The client was excited about Einstein Prediction Builder, so we turned it on to predict Opportunity close probability. The problem: their historical data was a mess. Won deals were never updated from the 'Prospecting' stage. Lost deals sat in 'Negotiation' forever. The model learned that 'Prospecting' was the best stage to be in, because that's where most won deals lived, and it gave a 90% close probability to brand-new opportunities. The sales VP started building revenue forecasts on these predictions.
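To see why stale stages mislead a model, here is a toy sketch in Python. It is not how Einstein Prediction Builder works internally; it is a plain frequency count over a made-up history, and every record and count in it is hypothetical. It only shows that when won deals are left in 'Prospecting', the recorded stage and the real outcome line up in exactly the wrong way.

```python
from collections import defaultdict

# Hypothetical toy history: (stage recorded in the CRM, did the deal really win?)
# Won deals were left in "Prospecting"; lost deals were left in "Negotiation".
history = (
    [("Prospecting", True)] * 9
    + [("Prospecting", False)] * 1
    + [("Negotiation", False)] * 6
    + [("Closed Won", True)] * 2
)

def win_rate_by_stage(rows):
    """Return {stage: fraction of records in that stage that were won}."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for stage, is_won in rows:
        totals[stage] += 1
        wins[stage] += int(is_won)
    return {stage: wins[stage] / totals[stage] for stage in totals}

rates = win_rate_by_stage(history)
for stage, rate in sorted(rates.items()):
    print(f"{stage}: {rate:.0%}")
```

Any learner that sees this history will associate 'Prospecting' with winning, because that is what the records say.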
The Wrong Way
Einstein Prediction Builder Setup:
  Object: Opportunity
  Predict: IsWon (Boolean)

Data Quality:
  → 40% of Won opps never moved past "Prospecting" stage
  → 30% of Lost opps still show "Negotiation" stage
  → CloseDate is the creation date on 60% of records (never updated)
  → Amount is $0 on 25% of Won opps

Result: "Prospecting" stage = 90% win probability
Reality: Garbage in, garbage out with a confidence score
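Numbers like the ones above come from an audit you can run before enabling any prediction. The following is a minimal sketch in Python, assuming the opportunities have been exported to a list of dicts. The keys `stage`, `outcome`, `amount`, `close_date`, and `created_date` are hypothetical export column names, and `outcome` is assumed to hold the real result ("won" or "lost") however your team reconstructs it.

```python
from datetime import date

def audit_opportunities(rows):
    """Count records that show each data quality problem."""
    counts = {
        "won_wrong_stage": 0,
        "lost_wrong_stage": 0,
        "close_date_equals_created": 0,
        "won_zero_amount": 0,
    }
    for row in rows:
        won = row["outcome"] == "won"
        lost = row["outcome"] == "lost"
        if won and row["stage"] != "Closed Won":
            counts["won_wrong_stage"] += 1
        if lost and row["stage"] != "Closed Lost":
            counts["lost_wrong_stage"] += 1
        # A close date equal to the creation date suggests it was never updated.
        if row["close_date"] == row["created_date"]:
            counts["close_date_equals_created"] += 1
        # Treat a missing or zero amount on a won deal as a problem.
        if won and not row["amount"]:
            counts["won_zero_amount"] += 1
    return counts

# Hypothetical sample records, for illustration only.
sample = [
    {"stage": "Prospecting", "outcome": "won", "amount": 0,
     "close_date": date(2023, 1, 5), "created_date": date(2023, 1, 5)},
    {"stage": "Negotiation", "outcome": "lost", "amount": 12000,
     "close_date": date(2023, 3, 1), "created_date": date(2023, 2, 1)},
    {"stage": "Closed Won", "outcome": "won", "amount": 8000,
     "close_date": date(2023, 4, 20), "created_date": date(2023, 3, 2)},
]

audit = audit_opportunities(sample)
print(audit)
```

If any of these counts is a large share of your closed records, the training data is not ready.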
The Right Way
// Step 1: Clean your data BEFORE enabling Einstein
// Run data quality report:
SELECT StageName, IsWon, IsClosed, COUNT(Id) ct
FROM Opportunity
WHERE CreatedDate = LAST_N_YEARS:2
GROUP BY StageName, IsWon, IsClosed
ORDER BY StageName

// Step 2: Fix historical data
// - Update Won opps to Closed Won stage
// - Update Lost opps to Closed Lost stage
// - Populate missing Amount values
// - Correct CloseDate to actual close dates

// Step 3: Add validation rules and automation to prevent future dirty data
// - Validation rule: require Amount before moving past Qualification
// - Automation (e.g. a record-triggered flow): auto-update CloseDate when
//   stage = Closed Won/Lost (a validation rule can block a save but cannot set a field)
// - Validation rule: require Close Reason on Closed Lost

// Step 4: Wait for 2 cycles of clean data (6-12 months)

// Step 5: THEN enable Einstein with clean training data

// Step 6: Validate predictions against actual outcomes monthly
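For Step 6, one simple way to validate predictions is a calibration check: group closed deals by the probability they were given and compare that to how often they actually won. The sketch below is in Python and assumes you can export pairs of (predicted probability, actual outcome). The function name, the bucket count, and the sample data are all hypothetical.

```python
def calibration_table(predictions, n_buckets=5):
    """Compare predicted win probability with the actual win rate per bucket.

    predictions: iterable of (probability between 0 and 1, actually_won bool).
    Returns one dict per non-empty bucket, ordered from low to high probability.
    """
    sums = {}
    for prob, won in predictions:
        # Clamp so that a probability of exactly 1.0 lands in the top bucket.
        idx = min(int(prob * n_buckets), n_buckets - 1)
        count, prob_sum, wins = sums.get(idx, (0, 0.0, 0))
        sums[idx] = (count + 1, prob_sum + prob, wins + int(won))

    table = []
    for idx in sorted(sums):
        count, prob_sum, wins = sums[idx]
        table.append({
            "bucket": (idx / n_buckets, (idx + 1) / n_buckets),
            "count": count,
            "mean_predicted": prob_sum / count,
            "actual_win_rate": wins / count,
        })
    return table

# Hypothetical month of closed deals: (predicted probability, actually won?)
closed_deals = [
    (0.92, False), (0.88, False), (0.91, True),
    (0.15, False), (0.10, False), (0.55, True),
]

table = calibration_table(closed_deals)
for row in table:
    print(row)
```

A well-calibrated model has `mean_predicted` close to `actual_win_rate` in every bucket. A top bucket that predicts around 90% but wins far less often is the failure described in this post, and a reason to keep the predictions out of the forecast.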
The Lesson
AI amplifies your data quality problems. If your data is dirty, Einstein will confidently make terrible predictions. Clean first, predict later.