
Week 9 | Session 1: Intelligent Decision Tools — DC Location Problem & K-means Clustering (Intro)

Course: Supply Chain Digitization — Module 3: Analytics in SCM



1. Case Study — Pharma Company Expansion


Background: A pharmaceutical company wants to expand into a new region and has collected location data (latitude & longitude) for 811 prospective customers.

Figure: Customer Locations

The company must decide where to open Distribution Centers (DCs) to serve these 811 customers. Two decisions follow:

  1. Decision 1: Where should the DCs be located? (How many? At what coordinates?)
  2. Decision 2: Which customers are served by which DC? (How should customers be segmented?)

As the number of DCs increases, each DC serves fewer customers, so service gets faster and responsiveness improves. However, every DC adds its own fixed and variable costs, so total cost rises with the number of DCs.

| No. of DCs | Customers per DC | Responsiveness | Total Cost |
|---|---|---|---|
| 1 | All 811 | Low ↓ | Low ↓ (cheapest) |
| 3–4 | ~200–270 each | ↑↑ | ↑↑ |
| 6 | ~135 each | High ↑↑↑ | High ↑↑↑ (expensive) |
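The customers-per-DC figures follow from dividing the 811 customers evenly across K DCs. A minimal sketch of that arithmetic is below; the function name `customers_per_dc` and the range K = 1 to 6 are illustrative assumptions, and an even split is an idealization (real clusters will be uneven). No cost figures are modeled because the case does not give any.

```python
# Average customers per DC for each candidate number of DCs (K).
# 811 comes from the case study; the helper name and the K range are illustrative.
N_CUSTOMERS = 811


def customers_per_dc(n_customers: int, k: int) -> float:
    """Average number of customers each DC serves when k DCs are opened."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return n_customers / k


# Rounded average load per DC for K = 1..6 (assumes an even split).
load_by_k = {k: round(customers_per_dc(N_CUSTOMERS, k)) for k in range(1, 7)}

for k, load in load_by_k.items():
    print(f"K={k}: ~{load} customers per DC")
```

The load per DC falls quickly for the first few DCs and more slowly afterwards, which is one reason the tradeoff against cost matters.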

The goal is to find the K (number of clusters, i.e. number of DCs) that best balances responsiveness against cost.

Figure: Tradeoff Graph


K-means Clustering for the DC Location Problem

What is K-means Clustering? An unsupervised machine learning technique for grouping data points into K clusters.

  • Input: dataset with coordinates (lat, long) + value of K.
  • Output: (1) cluster assignment for each data point, (2) centroid of each cluster.

| Concept | Explanation |
|---|---|
| K (input) | Number of clusters = number of DCs to open. The user defines K. |
| Output 1: Segmentation | Each of the 811 customers is assigned to exactly 1 cluster. |
| Output 2: Centroid | Center point of each cluster = proposed DC location (lat, long). |
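The input/output contract above can be illustrated with a small pure-Python version of the standard K-means iteration (Lloyd's algorithm). This is a sketch under stated assumptions, not the course's implementation: the customer coordinates are synthetic (the real 811 locations are not reproduced here), the three "hub" centers are made up, and latitude/longitude are treated as flat x/y coordinates. The helper names `nearest`, `kmeans`, and `hypothetical_hubs` are illustrative.

```python
import math
import random


def nearest(point, centroids):
    """Index of the centroid closest to `point` (straight-line distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm on 2-D points. Returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen data points
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append((
                    sum(lat for lat, _ in members) / len(members),
                    sum(lon for _, lon in members) / len(members),
                ))
            else:
                new_centroids.append(centroids[j])  # empty cluster: keep old centroid
        if new_centroids == centroids:
            break  # assignments stopped changing
        centroids = new_centroids
    # Final assignment so labels always match the returned centroids.
    labels = [nearest(p, centroids) for p in points]
    return labels, centroids


# Synthetic stand-in for the 811 customers: points scattered around
# three made-up hubs. These are NOT the case-study coordinates.
rng = random.Random(42)
hypothetical_hubs = [(10.0, 70.0), (20.0, 80.0), (30.0, 75.0)]
customers = []
for i in range(811):
    lat, lon = hypothetical_hubs[i % 3]
    customers.append((rng.gauss(lat, 0.5), rng.gauss(lon, 0.5)))

labels, dc_locations = kmeans(customers, k=3)             # K is the user's choice
cluster_sizes = [labels.count(j) for j in range(3)]

for j, (lat, lon) in enumerate(dc_locations):
    print(f"DC {j}: lat={lat:.2f}, long={lon:.2f}, customers={cluster_sizes[j]}")
```

`labels` is Output 1 (one cluster index per customer) and `dc_locations` is Output 2 (one centroid per cluster). K-means depends on its random starting points and can settle in a local optimum, so in practice it is usually run several times with different seeds.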

Centroid = arithmetic mean of all customer coordinates in that cluster (mean latitude, mean longitude). The centroid is the point that minimizes the total squared straight-line distance to the cluster's members. Placing the DC at the centroid → the DC sits centrally among its assigned customers and keeps travel distances short on average.
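To make the centroid concrete, the sketch below computes it for a small hypothetical cluster and checks that moving the DC away from it increases the total squared distance to the customers. The five points and the helper names `centroid` and `sum_sq_dist` are made up for illustration, and coordinates are treated as flat x/y values, which is only an approximation for latitude and longitude.

```python
def centroid(points):
    """Mean of the coordinates: (mean of first coordinate, mean of second)."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def sum_sq_dist(site, points):
    """Total squared straight-line distance from `site` to every point."""
    return sum((x - site[0]) ** 2 + (y - site[1]) ** 2 for x, y in points)


# Hypothetical cluster of five customers (illustrative values only).
cluster = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (4.0, 3.0), (2.0, 6.0)]

c = centroid(cluster)
best = sum_sq_dist(c, cluster)

# Shift the candidate DC site by 0.5 in each direction and compare.
shifts = [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.5), (0.0, -0.5)]
penalties = [
    sum_sq_dist((c[0] + dx, c[1] + dy), cluster) - best
    for dx, dy in shifts
]

print("centroid:", c)
print("extra squared distance for each shifted site:", penalties)
```

Every shifted site has a positive penalty, and the penalty equals (number of customers) × (squared shift length), which is why the mean is the minimizer of this squared-distance objective.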


  • Decision 1 (DC location): The centroid of each cluster gives the latitude and longitude of the proposed DC.
  • Decision 2 (customer segmentation): K-means automatically assigns each of the 811 customers to exactly 1 cluster based on spatial proximity.

Figure: K-means Output Examples

K-means solves BOTH decisions for a given K — but what is the right K? Is K = 1, 3, 4, 5, or 6 optimal? This is answered using the Elbow Method (covered in the next session).


  • Problem: 811 customer locations (lat/long) — where to open DCs, how to assign customers.
  • Tradeoff: More DCs → better responsiveness, higher cost.
  • K-means Clustering: unsupervised ML technique — groups customers into K clusters.
  • Output 1: Segmentation (each customer assigned to exactly 1 DC).
  • Output 2: Centroid (proposed DC location).
  • Why centroid? It minimizes the total squared distance from the DC to all customers in that cluster, so the DC sits centrally among them.