Week 9 | Session 1: Intelligent Decision Tools — DC Location Problem & K-means Clustering (Intro)
Course: Supply Chain Digitization — Module 3: Analytics in SCM
Session Agenda
1. Case Study: Pharma Company Expansion
Background: A pharmaceutical company wants to expand into a new region and has collected location data (latitude and longitude) for 811 prospective customers.

The Core Problem
The company needs to decide where to open Distribution Centers (DCs) to serve these 811 customers. This breaks down into two decisions:
- Decision 1: Where to locate the DCs? (How many, and at what coordinates?)
- Decision 2: Which customers are served by which DC? (How should customers be segmented?)
2. The DC Count Tradeoff
As the number of DCs increases, each DC serves fewer customers, so service is faster and responsiveness improves. However, every DC carries its own fixed and variable costs, so total cost rises as more DCs are opened.
| No. of DCs | Customers per DC | Responsiveness | Total Cost |
|---|---|---|---|
| 1 | All 811 | Low ↓ | Low ↓ (cheapest) |
| 3–4 | ~200–270 each | Medium ↑↑ | Medium ↑↑ |
| 6 | ~135 each | High ↑↑↑ | High ↑↑↑ (expensive) |
The goal is to find the optimal K (the number of clusters, and hence the number of DCs) that balances responsiveness against cost.
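The customers-per-DC figures in the table come from dividing the 811 customers by the number of DCs. The sketch below reproduces that arithmetic, assuming customers split evenly across DCs (real clusters will be uneven). The function name `customers_per_dc` is illustrative and not part of the case material.

```python
N_CUSTOMERS = 811  # number of prospective customers in the case study


def customers_per_dc(n_customers: int, k: int) -> float:
    """Average customers served per DC, assuming an even split across k DCs."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return n_customers / k


# Average load per DC for each candidate number of DCs.
for k in range(1, 7):
    print(f"K={k}: about {customers_per_dc(N_CUSTOMERS, k):.0f} customers per DC")
```

This covers only the load side of the tradeoff. The cost side needs per-DC fixed and variable cost figures, which the case does not give here.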

3. K-means Clustering — Introduction
What is K-means Clustering? An unsupervised machine learning technique for grouping data points into K clusters.
- Input: dataset with coordinates (lat, long) + value of K.
- Output: (1) cluster assignment for each data point, (2) centroid of each cluster.
K-means Concept Applied to This Case
| Concept | Explanation |
|---|---|
| K (input) | Number of clusters = number of DCs to open. User defines K. |
| Output 1 — Segmentation | Each of 811 customers assigned to exactly 1 cluster. |
| Output 2 — Centroid | Center point of each cluster = proposed DC location (lat, long). |
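To make the two outputs concrete, here is a minimal sketch of the standard K-means iteration (Lloyd's algorithm) in plain Python. It is an illustration under stated assumptions, not the tool used in the course. The customer coordinates are synthetic, generated around three made-up hubs, because the real 811-customer dataset is not reproduced here. Latitude and longitude are treated as planar coordinates, and the helper names (`assign`, `update`, `kmeans`) are hypothetical.

```python
import math
import random


def assign(points, centroids):
    """Assignment step: index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


def update(points, labels, centroids):
    """Update step: move each centroid to the mean of its assigned points."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:
            new_centroids.append(old)  # empty cluster: keep the old centroid
            continue
        lat = sum(p[0] for p in members) / len(members)
        lon = sum(p[1] for p in members) / len(members)
        new_centroids.append((lat, lon))
    return new_centroids


def kmeans(points, k, n_iter=100, seed=0):
    """Return (labels, centroids) for k clusters of (lat, long) points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen customers
    for _ in range(n_iter):
        labels = assign(points, centroids)
        new_centroids = update(points, labels, centroids)
        if new_centroids == centroids:  # no centroid moved: converged
            break
        centroids = new_centroids
    # Final assignment so labels always match the returned centroids.
    return assign(points, centroids), centroids


# Synthetic demo data (NOT the case dataset): 20 customers around each made-up hub.
rng = random.Random(42)
hypothetical_hubs = [(12.9, 77.6), (13.1, 80.3), (17.4, 78.5)]
customers = [
    (rng.gauss(lat, 0.2), rng.gauss(lon, 0.2))
    for lat, lon in hypothetical_hubs
    for _ in range(20)
]

labels, dc_sites = kmeans(customers, k=3)  # K = number of DCs to open
for j, (lat, lon) in enumerate(dc_sites):
    print(f"DC {j}: lat={lat:.3f}, long={lon:.3f}, customers={labels.count(j)}")
```

Here `labels` is Output 1 (one cluster index per customer) and `dc_sites` is Output 2 (one proposed DC coordinate per cluster). K-means can settle on different clusters depending on the random starting centroids, so in practice it is usually run from several starting points.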
4. Why is Centroid the Best DC Location?
The centroid is the geometric center of a cluster: the mean latitude and mean longitude of all customers assigned to it. Of all possible points, the centroid minimizes the sum of squared straight-line distances to the cluster's members. Placing the DC at the centroid therefore keeps it as central as possible to its assigned customers, which keeps average travel distance low.
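A quick numerical check can make the centroid's role concrete. Strictly, the quantity the centroid minimizes is the sum of squared straight-line distances to the cluster's members. The sketch below computes the centroid of a small, made-up cluster and confirms that nudging the DC in any direction increases that sum. The coordinates and function names are illustrative assumptions.

```python
def centroid(points):
    """Mean latitude and mean longitude of a list of (lat, long) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def sum_sq_dist(site, points):
    """Sum of squared straight-line distances from a candidate site to all points."""
    return sum((p[0] - site[0]) ** 2 + (p[1] - site[1]) ** 2 for p in points)


# Made-up cluster of four customers (lat, long); not taken from the case data.
cluster = [(12.0, 77.0), (12.2, 77.4), (12.6, 77.1), (12.4, 77.5)]

c = centroid(cluster)
best = sum_sq_dist(c, cluster)

# Shift the candidate DC site by 0.1 degrees in each direction and compare.
for dlat, dlon in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]:
    shifted = (c[0] + dlat, c[1] + dlon)
    worse = sum_sq_dist(shifted, cluster)
    print(f"shift=({dlat:+.1f}, {dlon:+.1f}): {worse:.4f} vs centroid {best:.4f}")
```

Moving the site away from the centroid always raises the total, by the number of customers times the squared size of the shift. This comparison uses straight-line distance on raw coordinates, which is an approximation; road distance would need a separate data source.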
5. How K-means Answers Both Decisions
- Decision 1 (DC location): The centroid of each cluster gives the latitude and longitude of a proposed DC.
- Decision 2 (customer segmentation): K-means automatically assigns each of the 811 customers to exactly one cluster, based on spatial proximity to the cluster centroids.

Remaining Open Question
K-means solves both decisions for a given K, but what is the right K? Is K = 1, 3, 4, 5, or 6 optimal? The Elbow Method, covered in the next session, answers this question.
Session Summary
- Problem: 811 customer locations (lat/long); where to open DCs and how to assign customers to them.
- Tradeoff: More DCs → better responsiveness, higher cost.
- K-means Clustering: unsupervised ML technique — groups customers into K clusters.
- Output 1: Segmentation (customer assigned to 1 DC).
- Output 2: Centroid (proposed DC location).
- Why centroid? It minimizes the sum of squared distances from the DC to the customers in that cluster, which keeps average travel distance low.