Datasets for Chapters and Capstone Project
(with the NorthStar Golden Thread Design)
This document provides structured descriptions of the datasets used throughout the book and the capstone project.
It is designed to help you:
Data are not just inputs—they are representations of how organizations operate, respond, and decide.
Throughout this book, most chapters use a shared dataset:
NorthStar Retail Group — Everyday Essentials™ Weekly Sales
This is not accidental. It is a deliberate instructional design choice known as the Golden Thread.
The Golden Thread means:
Without a Golden Thread, learning forecasting often feels like:
With the Golden Thread, you instead experience:
You are not relearning new data each chapter.
You are deepening your understanding of the same system.
The dataset evolves as your skills evolve:
In real organizations:
The Golden Thread mirrors this reality.
Because the dataset remains consistent:
The goal is not to learn many datasets.
The goal is to learn how one system can be understood, modeled, and governed over time.
| Dataset | Chapters | Role in Learning Progression |
| --- | --- | --- |
| essentials_sales_lite.csv | Chapters 1–5 | Seeing signal and structure (Golden Thread – simplified view) |
| essentials_sales_residuals.csv | Chapter 6 | Diagnosing behavior and trust (Golden Thread – diagnostic layer) |
| essentials_sales.csv | Chapters 7–8 | Decision-aware forecasting (Golden Thread – full system) |
| healthcare_capacity_weekly.xlsx | Capstone Project | New domain, full decision system under uncertainty |
essentials_sales_lite.csv: used in Chapters 1–5
Golden Thread — Foundational Layer
This dataset introduces time-based thinking using a simplified structure.
It supports:
| Variable Name | Type | Description | Example | Decision Meaning |
| --- | --- | --- | --- | --- |
| week | Date / string | Weekly time period | 2023-W01 | Time anchor |
| week_index | Integer | Sequential index | 1, 2, 3 | Ordering of observations |
| sales | Numeric | Weekly unit sales | 12,540 | Demand signal |
This is your first view of the system—simple, but foundational.
Before modeling, you must learn to see.
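As a first look at this structure, the sketch below reads rows shaped like the table above using only the Python standard library. It assumes that week labels such as 2023-W01 follow ISO week numbering and that sales may be stored with thousands separators; neither is confirmed by the dataset description. Apart from the 12,540 example, the sample values are hypothetical, and `parse_week` and `load_lite` are illustrative helper names.

```python
import csv
import io
from datetime import date

# Hypothetical sample rows mirroring the documented columns; not real data.
SAMPLE_CSV = """week,week_index,sales
2023-W01,1,"12,540"
2023-W02,2,"12,610"
2023-W03,3,"12,480"
"""


def parse_week(label: str) -> date:
    """Convert a label such as '2023-W01' to the Monday of that ISO week.

    Assumption: the labels use ISO 8601 week numbering.
    """
    year_text, week_text = label.split("-W")
    return date.fromisocalendar(int(year_text), int(week_text), 1)


def load_lite(text: str) -> list[dict]:
    """Read the rows, converting week to a date and sales to an integer."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append({
            "week": parse_week(record["week"]),
            "week_index": int(record["week_index"]),
            # Strip thousands separators before converting to a number.
            "sales": int(record["sales"].replace(",", "")),
        })
    return rows


rows = load_lite(SAMPLE_CSV)

# week_index should order the observations in time.
assert [r["week_index"] for r in rows] == sorted(r["week_index"] for r in rows)
print(rows[0])
```

Converting the week label to a real date early makes later plotting and sorting less error-prone than working with the raw string.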
essentials_sales_residuals.csv: used in Chapter 6
Golden Thread — Diagnostic Layer
This dataset introduces forecast diagnostics and validation.
It supports understanding:
| Variable Name | Type | Description | Example | Decision Meaning |
| --- | --- | --- | --- | --- |
| week | Date | Time index | 2023-W10 | Tracking residuals over time |
| actual_sales | Numeric | Observed sales | 12,300 | Ground truth |
| forecast_sales | Numeric | Model prediction | 12,100 | Expected value |
| residual | Numeric | Actual − Forecast | +200 | Forecast error |
| abs_error | Numeric | Absolute error | 200 | Error magnitude |
| squared_error | Numeric | Squared error | 40,000 | Penalized error |
| rolling_residual_mean | Numeric | Smoothed residual | 150 | Drift detection |
Residuals reveal:
This is where the system becomes self-aware.
Forecasting is not complete until you understand when it fails.
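The diagnostic columns can be derived from actual_sales and forecast_sales alone. The sketch below shows one way to do so. The three-week trailing window for rolling_residual_mean is an assumption, since the dataset description does not state the window length. The function name `diagnostics` is illustrative, and all sample values other than the 12,300 / 12,100 pair are hypothetical.

```python
def diagnostics(actual, forecast, window=3):
    """Compute residual-based columns for paired actual and forecast values.

    Assumption: rolling_residual_mean is a trailing mean over up to
    `window` residuals; the source does not specify the window.
    """
    rows = []
    residuals = []
    for a, f in zip(actual, forecast):
        residual = a - f  # Actual minus Forecast, as defined in the table
        residuals.append(residual)
        recent = residuals[-window:]  # shorter than `window` at the start
        rows.append({
            "residual": residual,
            "abs_error": abs(residual),
            "squared_error": residual ** 2,
            "rolling_residual_mean": sum(recent) / len(recent),
        })
    return rows


# First pair matches the table's example; the rest are hypothetical.
actual_sales = [12300, 12250, 12400]
forecast_sales = [12100, 12150, 12250]

for row in diagnostics(actual_sales, forecast_sales):
    print(row)
```

A rolling mean that stays above or below zero for several weeks suggests the forecast is drifting in one direction rather than missing at random.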
essentials_sales.csv: used in Chapters 7 and 8
Golden Thread — Decision Layer
This dataset introduces decision-aware forecasting by adding operational and contextual variables.
It supports:
| Variable Name | Type | Description | Example | Decision Link |
| --- | --- | --- | --- | --- |
| week | Date | Time index | 2023-W15 | Temporal anchor |
| sales | Numeric | Weekly sales | 13,200 | Target variable |
| inventory | Numeric | 1-sales/3500 | 0.2 | Supply constraints |
| holiday_flag | Binary | Holiday week | 1 | Seasonal demand |
| promotion_flag | Binary | Promotion active | 1 | Marketing actions |
This is the full system view, where forecasting meets decisions.
Forecasts become meaningful when they reflect how decisions shape outcomes.
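One simple way to see how the binary flags relate to demand is to compare average sales with the flag on and off. The sketch below does this for a handful of hypothetical rows. Only the 2023-W15 row echoes the table's example values, and `mean_sales_by` is an illustrative helper, not part of the book's code. A difference in means is descriptive only and does not by itself show that a promotion or holiday caused the change.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows using the documented column names; not real data.
sample_rows = [
    {"week": "2023-W15", "sales": 13200, "holiday_flag": 1, "promotion_flag": 1},
    {"week": "2023-W16", "sales": 12400, "holiday_flag": 0, "promotion_flag": 0},
    {"week": "2023-W17", "sales": 12900, "holiday_flag": 0, "promotion_flag": 1},
    {"week": "2023-W18", "sales": 12300, "holiday_flag": 0, "promotion_flag": 0},
]


def mean_sales_by(rows, flag):
    """Return average sales for each value (0 or 1) of a binary flag column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[flag]].append(row["sales"])
    return {value: mean(sales) for value, sales in groups.items()}


print(mean_sales_by(sample_rows, "promotion_flag"))
print(mean_sales_by(sample_rows, "holiday_flag"))
```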
healthcare_capacity_weekly.xlsx: used in the Capstone Project
Capstone Transition — New Domain
This dataset introduces a new domain where forecasting must guide high-stakes decisions.
It represents:
| Variable Name | Type | Description | Example | Decision Link |
| --- | --- | --- | --- | --- |
| week | Date | Weekly time period | 2022-W40 | Time anchor |
| patient_demand | Numeric | Weekly patient volume | 1,250 | Demand planning |
| bed_capacity | Numeric | Available beds | 1,100 | Capacity limits |
| staffing_level | Numeric | Available staff | 320 | Resource planning |
| utilization_rate | Numeric (proportion; 0.92 = 92%) | Capacity usage | 0.92 | System stress |
| emergency_flag | Binary | Surge condition | 1 | Crisis response |
| policy_change_flag | Binary | Policy shift | 1 | Structural change |
You must transfer your learning from the Golden Thread into a new, unfamiliar system.
Mastery is demonstrated when you can apply design thinking beyond the original dataset.
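As a starting point for the capstone, the sketch below flags weeks that may deserve attention before any forecasting is attempted. It assumes utilization_rate is stored as a proportion, as in the 0.92 example. The 0.90 threshold is a hypothetical choice for illustration, not a value given in the dataset description. The sample rows are invented except for the 2022-W40 values, and `stressed_weeks` is an illustrative name.

```python
# Hypothetical rows; only the 2022-W40 values echo the table's examples.
sample_rows = [
    {"week": "2022-W40", "utilization_rate": 0.92, "emergency_flag": 1},
    {"week": "2022-W41", "utilization_rate": 0.81, "emergency_flag": 0},
    {"week": "2022-W42", "utilization_rate": 0.95, "emergency_flag": 0},
]


def stressed_weeks(rows, threshold=0.90):
    """Return weeks at or above the utilization threshold or flagged as a surge.

    Assumption: `threshold` is an illustrative cutoff chosen for this sketch.
    """
    return [
        row["week"]
        for row in rows
        if row["utilization_rate"] >= threshold or row["emergency_flag"] == 1
    ]


print(stressed_weeks(sample_rows))
```

Changing the threshold changes which weeks count as stressed, which is itself a decision worth documenting in the capstone.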
| Stage | Dataset | Learning Focus |
| --- | --- | --- |
| Early | Lite dataset | Seeing patterns |
| Middle | Residual dataset | Understanding behavior |
| Late | Full dataset | Designing systems |
| Capstone | Healthcare dataset | Making decisions under uncertainty |
The Golden Thread ensures that learning progresses as:
see → model → diagnose → design → decide
Before modeling, always ask:
Across all datasets, one principle remains:
Forecasting is not about the data you have—it is about how you interpret data to support decisions.