Projects - Purdue Business

Past Projects

Interpretation of Forecasting Models

Opportunity: Create an approximate model for black box forecasting to understand how parameters like discount, seasonality etc. affect product sales over time
Solution: Gained insights into the inner workings of forecasting model by utilizing data modeling, statistical modeling, and predictive modeling techniques
Outcome: Improved interpretability, informed decision-making, and enhanced forecasting accuracy for optimized sales strategies

Click to learn more

Interpretation of Forecasting Models

Objectives and Methodology
1. To explore different algorithms that can be used for interpretation of the workings behind black box models
2. To study the contribution and impact of each input variable on the forecast prediction provided by the black box model
3. To create approximation models to get above outputs for single observation and observations over a period
Data and overview of experiments
1. Out of ~14k unique DFUs and 24 input features, top 10 DFUs and 4 input features wereused for the final model building process
2. Based on business requirements of the output, XGBoost interpretation using Linear Regression and LIME; and Prophet were selected as final methods for dashboard
3. Global surrogate models explain holistic variable contributions and impacts from the black box model for the entire dataset
4. Whereas local surrogate models explain the variable contribution for a single test observation
XG Boost – Global Explanations
1. Forecasts from the XGBoost model are used as target variable for the linear regression that gives a global interpretation of how XGBoost works under the hood
2. First level of interpretation can be done using significance to understand the drivers of the forecast; sign and magnitude of the coefficient can then serve to give a direction for marketing strategy
XG Boost – Local Explanations
1. Total Sales(LIME) = Intercept + Total Points of Distribution Total + Depth of Discount + Month + Day
2. Contribution by seasonality to product sales increases on Sept 21
Prophet – Local and Global Explanations
1. Prophet Model: F(t) + F(s) + holidays + extra regressors ( Total Points of Distribution + Depth of Discount)
2. Sales Forecast = Day of Month + Discount + Holiday + Month + TPD + Year
Conclusion and Future Scope
1. This study tries to create an approximate model around the original forecasting model in order to give users a sense of how different parameters such as discount, trend, seasonality are affecting the sales of their product at different points in time.
2. Checking the statistical significance of the variable and visualizing the contributions can help understand the business impact from the variable
3. Combining Prophet + LIME Models explains and supports feature contribution on sales and business critical values
4. Not all variables for all products will have similar impacts on sales and for few products there aren't much explanations with the given variables.
5. Since interpretation models are approximations of black box models, the exact contribution cannot be directly inferred. However, it is useful to indicate the direction and magnitude of the impact.

Forecasting and Methodology: Improving the Accuracy of Predicted Ocean Travel Time

Opportunity: Identify the factors affecting ocean travel time and improve current forecasting methodology using external factors like marine traffic and port congestion.
Solution: Gained insights into the inner workings of forecasting model by utilizing data modeling, statistical modeling, and predictive modeling techniques
Outcome: Achieved a validation WAPE of 11.13% and demonstrated that Anchor variables are not significant in the presence of the other three clusters.

Click to learn more

Forecasting and Methodology: Improving the Accuracy of Predicted Ocean Travel Time

An American exercise equipment and media company had received a record-breaking order volume alongside a reduction in staff due to the COVID-19.

The supply chain was broken, and delivery times changed from 30 to 60 days due to anchoring and port congestion. Our objective was to:

Identify the factors affecting ocean travel time i.e., from point A to B.
Improve the current forecasting methodology using external factors like marine traffic and port congestion.

Methodology

The client presented their forecasting model, wherein ocean lead time was factored using simple exponential smoothing with no trend, seasonality and external factors.

Validation WAPE (Weighted Absolute Percentage Error) of 11.35% for data from week of 1st February 2021 to week of 17th May 2021 had been achieved till our team was brought in.

Our approach to problem solving involved the following steps:

Data Collection: Collect the Anchorage Time, Port Time, No. of Vessels, and No. of Calls at the port of arrival for each week.

Data Exploration: Remove “Calls” cluster of variables using correlation analysis.

Feature Engineering: Create lagged derivatives for Ocean, Anchor, Port & Vessels clusters.

Data Partition: Split the Train & Test data using rolling forecasting origin techniques.

Forecasting: Build regression, SVM, Random Forest, Gradient Boosting models.

Outcome

We achieved a validation WAPE of 11.13% and were able to demonstrate that among the external factor clusters - Anchor, Port, Vessels and Lead Times; Anchor variables are not significant in the presence of the other three clusters.

Apart from the Historical Lead Times in the Ocean, the previous week’s Port Time & Number of vessels 2-weeks earlier are affecting the Lead Time Ocean for the current week. 

Tools used

Python & Tableau

Redesigning a Value Based Program

Opportunity: Redesign the Value Based Program with changes to the reimbursement model and components for improved payments
Solution: Developed a semantic layer for reports, creating report content, preparing test data, and building data extracts
Outcome: Successfully delivered 5 reporting dashboards integrated with the client's database. Enhanced reporting capabilities, enabling better insights and decision-making

Click to learn more

Redesigning a Value Based Program

In the collaboration project with Accenture and Krenicki Center, students assisted in developing semantic layer for reports, developed report content, prepared test data, and assisted in building data extracts. The final deliverables were 5 reporting dashboards successfully integrated with the client’s database. The students undergo three main phases of the project to achieve the final deliverables:

Developing wireframes with test data
Connecting reports to database
Querying data

Throughout the project, students learned technical skills such as data visualization with Tableau Desktop and SQL querying with QMF. Moreover, students also strengthened soft skills by building strong communication and presentation skills through weekly meetings with the Accenture team.

Reducing the Uncertainty During a SAP Transition

Opportunity: Create an interactive dashboard and reduce the uncertainty during SAP transition
Solution: Pre-processed complex datasets and leveraged forecasting engines in R to make the output compatible with visualization tools
Outcome: Interactive dashboard, improved forecasting accuracy, project extensions identification, enabling informed decision-making and facilitating future project handover

Click to learn more

Reducing the Uncertainty During a SAP Transition

Goals:

Create an interactive dashboard to help team view insights and tag points of interest.
Leverage predictive analytics and forecasting to decrease uncertainty in SAP transition.
Use our creativity and analytical mindset to see where other improvements could be made based on the suggestions of Accenture leadership. a. This was continuously evolving, but the overall goal – since no specific SOW was used for the majority of the project.

Processes:

Dashboard a. Learning Power BI and utilizing Data Visualization best practices to create a final product for the client.
1. Pre-processing large and unruly data sets to make understandings allowed by the visualization easier to attain.
2. Solidifying front and backend processes in order to make sure the dashboard would function across different aspects of the business.
Forecasting a. Immense pre-processing to be able to feed the data to well known forecasting engines in R (i.e. ETS).
1. Running the engines and working to increase accuracy with limited data and loose requirements for accuracy.
2. With the forecasted data, making the output file compatible with visualization tools and techniques.
Creation past the original SOW a. Present to the Accenture team the work we had done to create a jumping point for a brainstorming session.
1. Brainstorming session with Accenture team in Chicago office to direct the remainder of our time with the project.
2. Splitting up the brainstormed ideas into four subsequent projects i. Staffing Plan analysis
3. Meeting Cadence Trends
4. Peaks & Valley Analysis
5. Data Hygiene
6. Devising the analytical methods to attack the problems outlined above.
7. Working through those sub-projects and presenting at least once a month for Accenture team feedback
8. Collating all the above work for a final presentation with Accenture Leadership and the team that is taking over the work from here on out.