Week 7 | Session 2: Demand Forecasting — Regression Tree Case Study (FMCG)

Course: Supply Chain Digitization — Module 3: Analytics in SCM



  • Setting: Large FMCG company with retailer network across India (South/East/West/North)
  • Problem: Demand planning head dissatisfied with forecast accuracy. The gap between actual and forecasted demand leads to excess inventory (forecast too high) or lost sales (forecast too low).
  • Trigger: Manager attended AI-ML training → understood that AI-ML is needed given how quickly demand patterns are changing
  • Goal: Develop a better demand forecasting model to reduce the gap between actual and predicted demand → lower inventory holding cost + fewer lost sales
  • Model chosen: Regression Tree (decision tree with continuous output) — predicts the ORDER QUANTITY for each retailer

Regression Tree vs. Classification Tree — Key Difference

| Aspect | Classification Tree | Regression Tree |
| --- | --- | --- |
| Target variable (Y) type | Categorical (e.g. Fail / Not Fail) | Continuous (e.g. Order quantity in units) |
| Prediction at leaf node | Majority class label (e.g. “Failed”) | Mean (ȳ) of all Y values in that node |
| Splitting criterion | Gini index or Entropy (impurity-based) | Mean Squared Error (MSE) / Variance reduction |
| Used in this course for | Predictive maintenance (Machine failure) | Retailer order quantity prediction |
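
The regression-tree column can be made concrete with a minimal sketch: a leaf predicts the mean of its Y values, and a candidate split is scored by how much it lowers the size-weighted MSE of the children relative to the parent. The order quantities below are hypothetical toy values, not the case data, and the function names are illustrative.

```python
def mean(values):
    return sum(values) / len(values)

def mse(values):
    """Mean squared error around the node mean (the node's variance)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def variance_reduction(parent, left, right):
    """Parent MSE minus the size-weighted MSE of the two child nodes."""
    n = len(parent)
    weighted_child_mse = (len(left) / n) * mse(left) + (len(right) / n) * mse(right)
    return mse(parent) - weighted_child_mse

# Hypothetical order quantities (units), NOT taken from the case data
small_stores = [900, 1000, 1100]
large_stores = [4000, 5000]
all_stores = small_stores + large_stores

leaf_prediction = mean(small_stores)  # what a leaf holding the small stores would predict
gain = variance_reduction(all_stores, small_stores, large_stores)
print(leaf_prediction, gain)
```

A tree-growing algorithm would evaluate this gain for many candidate thresholds and keep the split with the largest value.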

Data Setup — 1000 Retailers, 7 Features, 1 Target

  • Dataset: 1000 retailers (serial numbers 0 to 999). Each row = one retailer in one week.
  • Train-test split: 700 observations for training (70%) | 300 for testing (30%)
  • Target (Y): Order Quantity — continuous number of units ordered by that retailer in that week
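
The 70/30 split can be sketched with the standard library alone. This assumes the rows are assigned at random, which the case does not state; the seed value and function name are illustrative.

```python
import random

def train_test_split_indices(n_rows, train_fraction, seed):
    """Shuffle row indices and cut them into a training and a test list."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    indices = list(range(n_rows))      # serial numbers 0 .. n_rows - 1
    rng.shuffle(indices)
    n_train = round(n_rows * train_fraction)
    return indices[:n_train], indices[n_train:]

# 1000 retailers, 70% for training (random assignment is an assumption)
train_idx, test_idx = train_test_split_indices(1000, 0.70, seed=42)
print(len(train_idx), len(test_idx))
```

Keeping the test rows out of training is what lets the 300 held-out observations give an honest estimate of forecast error.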

7 independent variables (features) + 1 target variable

| # | Variable | Type / Unit | Why It Matters for Demand |
| --- | --- | --- | --- |
| X1 | Region | Categorical (South/East/West/North) | Geographic demand patterns vary significantly by region. |
| X2 | Balance Credit Amount | Continuous (₹ Lakhs) | Amount the retailer still owes the FMCG company. High outstanding balance → may affect their ordering behaviour. |
| X3 | Location | Categorical (Urban / Semi-urban / Rural) | Urban retailers typically serve higher footfall → higher demand expected. |
| X4 | Age of Retailer | Continuous (Years) | Older retailers have stronger customer relationships and a loyal customer base → higher footfall. |
| X5 | Size of Retail Store | Continuous (‘000 sq ft) | Larger store → more products displayed → higher customer footfall. |
| X6 | Promotional Offer | Binary (1 = offered, 0 = not offered) | Promotions directly boost demand. A promotional week → significantly higher orders placed. |
| X7 | Number of Holidays | Count (0, 1, 2, 3…) | More holidays → more shopping occasions → higher demand in that week. |
| Y | Order Quantity (Target) | Continuous (Units ordered) | How many units the retailer orders in that week. |

  • Model used: Regression tree | Training data: 700 obs | Tree depth: 2
  • Variables selected by model: Size of store | Promotional offer | Age of retailer → the model did not split on Region, Balance Credit, Location, or Number of Holidays, which were less useful for separating high-demand from low-demand retailers at this depth

All 7 nodes — conditions, predicted demand (mean), observations, support

| Node | Type | Conditions to Reach This Node | Split Variable & Threshold | Predicted Demand (ȳ) | Obs. | Support |
| --- | --- | --- | --- | --- | --- | --- |
| 0 (Root) | Internal | All training data | Size ≤ 30.5K sq ft | 2270 (baseline) | 700 | 100% |
| 1 | Internal | Size of the store ≤ 30.5K sq ft | Promotion (0 or 1) | 1902 | 612 | 87% |
| 2 | Internal | Size of the store > 30.5K sq ft | Age (threshold 17.5 yrs) | 4829 | 88 | 13% |
| 3 (Leaf) | Leaf | Size ≤ 30.5K sq ft, Promotion = 0 | STOP | 943 | 198 | 28% |
| 4 (Leaf) | Leaf | Size ≤ 30.5K sq ft, Promotion = 1 | STOP | 2360 | 414 | 59% |
| 5 (Leaf) | Leaf | Size > 30.5K sq ft, Age ≤ 17.5 yrs | STOP | 2887 | 56 | 8% |
| 6 (Leaf) | Leaf | Size > 30.5K sq ft, Age > 17.5 yrs | STOP | 8227 | 32 | 5% |
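
The node table can be sanity-checked with a few lines of arithmetic. Each parent's mean should equal the observation-weighted average of its children's means, and support should equal observations divided by 700. The sketch below uses only the numbers in the table; the dictionary layout and function name are illustrative.

```python
N_TRAIN = 700

# node id -> (mean order quantity, observation count), copied from the node table
nodes = {
    0: (2270, 700),
    1: (1902, 612),
    2: (4829, 88),
    3: (943, 198),
    4: (2360, 414),
    5: (2887, 56),
    6: (8227, 32),
}
children = {0: (1, 2), 1: (3, 4), 2: (5, 6)}  # parent -> (left child, right child)

def weighted_child_mean(parent_id):
    """Observation-weighted average of the two child means."""
    left, right = children[parent_id]
    (mean_l, n_l), (mean_r, n_r) = nodes[left], nodes[right]
    return (mean_l * n_l + mean_r * n_r) / (n_l + n_r)

for parent_id, (left, right) in children.items():
    parent_mean, parent_n = nodes[parent_id]
    # child counts must add up to the parent count
    assert nodes[left][1] + nodes[right][1] == parent_n
    print(parent_id, parent_mean, round(weighted_child_mean(parent_id), 1))

# support = share of the 700 training observations, rounded to whole percent
support = {node_id: round(100 * n / N_TRAIN) for node_id, (_, n) in nodes.items()}
print(support)
```

The recomputed parent means agree with the table to within rounding of the reported leaf means.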

| Stage | Information Used | Predicted Demand | Interpretation |
| --- | --- | --- | --- |
| Node 0 (No info) | None: random pick from 700 retailers | 2270 (ȳ for all) | Baseline: simple average of all 700 retailers’ orders. |
| After Split 1 (Node 1) | Store size ≤ 30.5K sq ft | 1902 (↓ from 2270) | Adding size info refines the prediction for small stores: we now know they order less than average. |
| After Split 1 (Node 2) | Store size > 30.5K sq ft | 4829 (↑ from 2270) | Large stores order much more. Prediction jumps to 4829. |
| After Split 2 (Node 4, Leaf) | Size ≤ 30.5K + Promotion = 1 | 2360 (+458 vs. Node 1 mean of 1902; +1417 vs. the no-promotion leaf at 943) | Same small store but running a promotion → demand rises above the small-store average. |
| After Split 2 (Node 6, Leaf) | Size > 30.5K + Age > 17.5 yrs | 8227 (old + large = highest) | Large AND old store = highest predicted demand. Loyal customer base + large display space = best combination. |

Core logic: More relevant information about a retailer → more refined group it falls into → mean of that group is closer to its actual demand.


4 Business Rules — Shop Floor Reference Card

| Rule | Condition 1 (Store Size) | Condition 2 (Promotion or Age) | Retailer Profile | Predicted Demand | Support |
| --- | --- | --- | --- | --- | --- |
| R1 (Node 3) | Size ≤ 30.5K sq ft | Promotion = 0 (No offer) | Small store, no promotion → low demand week | 943 units | 28% |
| R2 (Node 4) | Size ≤ 30.5K sq ft | Promotion = 1 (Offer given) | Small store BUT promotion running → demand boosted | 2360 units | 59% |
| R3 (Node 5) | Size > 30.5K sq ft | Age ≤ 17.5 yrs (Relatively new) | Large but newer store; has not yet built a strong customer base | 2887 units | 8% |
| R4 (Node 6) | Size > 30.5K sq ft | Age > 17.5 yrs (Old, established) | Large AND old store → loyal customers, high footfall → highest demand | 8227 units | 5% |
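
The four rules translate directly into nested conditions. The sketch below is one possible encoding: the thresholds and predicted values come from the rule table, while the function and parameter names are illustrative, and the example retailers in the calls are hypothetical.

```python
SIZE_THRESHOLD_K_SQFT = 30.5   # store size threshold, in '000 sq ft
AGE_THRESHOLD_YEARS = 17.5     # retailer age threshold, in years

def predict_order_quantity(store_size_k_sqft, promotion, age_years):
    """Return (rule id, predicted units) for one retailer in one week."""
    if store_size_k_sqft <= SIZE_THRESHOLD_K_SQFT:
        # small store: only the promotion flag matters
        if promotion == 0:
            return "R1", 943
        return "R2", 2360
    # large store: only the retailer's age matters
    if age_years <= AGE_THRESHOLD_YEARS:
        return "R3", 2887
    return "R4", 8227

# Hypothetical retailers, for illustration only
print(predict_order_quantity(20, 0, 30))   # small store, no promotion
print(predict_order_quantity(45, 1, 25))   # large, established store
```

Note that age is never consulted for small stores and promotion is never consulted for large ones, which mirrors the depth-2 tree.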

Given data for Retailer A: Region: West | Balance Credit: ₹10 lakh | Location: Urban | Age: 12 years | Store size: 8,000 sq ft (8K) | Promotional offer: Yes (1) | Number of holidays: 3

  1. Check store size: 8K sq ft ≤ 30.5K sq ft → go to Node 1 (left branch)
  2. Check promotion: Promotion = 1 → go to Node 4 (right branch of Node 1)
  3. Result: Node 4 → Predicted demand = 2360 units
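
The same walk-through can be reproduced by storing the tree as nested dictionaries and following the branches. This is a sketch: the dictionary layout and key names are illustrative, and treating the binary promotion flag as a split at 0.5 is an assumption about how such a split is usually encoded, not something stated in the case.

```python
# Fitted tree from the node table; "left" is taken when feature <= threshold
tree = {
    "node": 0, "feature": "size_k_sqft", "threshold": 30.5,
    "left": {
        "node": 1, "feature": "promotion", "threshold": 0.5,  # assumed encoding of 0 vs 1
        "left": {"node": 3, "prediction": 943},
        "right": {"node": 4, "prediction": 2360},
    },
    "right": {
        "node": 2, "feature": "age_years", "threshold": 17.5,
        "left": {"node": 5, "prediction": 2887},
        "right": {"node": 6, "prediction": 8227},
    },
}

def predict_with_path(node, retailer):
    """Walk from the root to a leaf; return the leaf mean and the node ids visited."""
    path = [node["node"]]
    while "prediction" not in node:
        go_left = retailer[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
        path.append(node["node"])
    return node["prediction"], path

# Retailer A from the case; features the tree never splits on are simply ignored
retailer_a = {
    "region": "West", "balance_credit_lakh": 10, "location": "Urban",
    "age_years": 12, "size_k_sqft": 8, "promotion": 1, "holidays": 3,
}
prediction, path = predict_with_path(tree, retailer_a)
print(prediction, path)  # 2360 [0, 1, 4]
```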

Support Interpretation — How Confident Is the Prediction?


Support = % of training observations that fall into that leaf node.

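  • Node 4 (59% support): Most reliable. Its mean is estimated from 414 of the 700 training observations (all small-store + promotion cases).
  • Node 6 (5% support): Lowest confidence. Only 32 of 700 training observations are large + old stores, so its mean of 8227 rests on a small sample.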
  • Can splitting further improve the prediction? Yes: splitting nodes further with additional features → lower mean squared error within each leaf → predictions get closer to each retailer's actual demand.
  • Trade-off: Deeper tree = better training accuracy BUT higher risk of overfitting. The same stopping criteria apply as for the classification tree (max depth limit, min observations per node, min variance reduction).
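
The three stopping criteria can be sketched as a single check that a tree-growing routine would call before splitting a node. The depth limit of 2 matches this case. The minimum-observation and minimum-variance-reduction defaults are hypothetical placeholders, and the function name is illustrative.

```python
def should_stop(depth, n_obs, best_variance_reduction,
                max_depth=2, min_obs=30, min_variance_reduction=0.0):
    """Return the reason to stop splitting this node, or None to keep splitting.

    max_depth=2 follows the case; min_obs and min_variance_reduction
    are hypothetical defaults chosen only for illustration.
    """
    if depth >= max_depth:
        return "max depth reached"
    if n_obs < min_obs:
        return "too few observations"
    if best_variance_reduction <= min_variance_reduction:
        return "variance reduction too small"
    return None

# Node 2 (depth 1, 88 observations) may still split; the gain value is hypothetical
print(should_stop(depth=1, n_obs=88, best_variance_reduction=5000.0))
# Node 6 (depth 2, 32 observations) stops because the depth limit is reached
print(should_stop(depth=2, n_obs=32, best_variance_reduction=5000.0))
```

Tightening any of the three limits yields a shallower tree with larger, more stable leaves; loosening them trades that stability for a closer fit to the training data.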

  • Case: FMCG company demand planning head → uses regression tree to predict retailer order quantities.
  • Regression tree vs. classification tree: Continuous target → leaf predicts mean (ȳ). Splitting criterion = variance reduction (not Gini).
  • Model selects 3 of 7 features: Store size | Promotional offer | Age of retailer
  • 4 leaf nodes → 4 business rules: 943 (small, no promo) | 2360 (small + promo, 59% support) | 2887 (large + young) | 8227 (large + old — highest demand)
  • Retailer A prediction: Size = 8K sq ft (≤ 30.5K) | Promotion = 1 → Node 4 → Predicted demand = 2360 units
  • Next session: Build regression tree in Python + demand forecast error metrics (MAE, RMSE, MAPE)
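
As a preview of the three error metrics named above, here is a minimal standard-library sketch. The actual order quantities are hypothetical. The predictions reuse the four leaf means from this case, paired with those made-up actuals purely for illustration.

```python
import math

def mae(actual, predicted):
    """Mean absolute error, in units ordered."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error; penalises large misses more than MAE."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error; assumes no actual value is zero."""
    return 100 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

actual = [1000, 2000, 2500, 8000]     # hypothetical actual orders (units)
predicted = [943, 2360, 2360, 8227]   # leaf means assigned to those retailers

print(mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

MAE and RMSE are expressed in units ordered, while MAPE is a percentage, which makes it easier to compare across retailers of very different sizes.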