The MSBAIM Experiential Learning course (MGMT 690 Industry Practicum) gives our graduate students the opportunity to apply the knowledge from their studies in an applicable area via a thesis or project. This course is ideal for students who will pursue analytical careers (e.g. Data/OR Analyst, Data Scientist, Decision Scientist, Business Consultant, etc.). It is designed to polish and integrate the knowledge and skills you developed in your master’s-level coursework by successfully developing an analytics solution with an industry partner.
Thus, most of your time will be devoted to working with your project teammates to provide the answers and deliverables specified by the partner. Projects usually have 3-5 members per team and are assigned based on feedback and interest of each student. Every student will discuss their career placement goals and interests post-Purdue with Professor Lanham prior to beginning the program in June.
After student feedback has been received, projects will be scoped out with many industry partners to provide students the most customized experiential learning opportunity available. Often several students will have similar domain interests (e.g. retail, consulting, sports, etc.) and/or job placement goals. Those students will team up to tackle a project together. All detailed project proposals with companies and proposed student teams will be posted by winter break. Once students have reviewed their assigned project and read the various project proposals, they may change teams, with the constraint that no project has more than five or fewer than two team members. Officially, projects will begin once students return from winter break, but students can begin working on projects once they have signed the partner’s non-disclosure agreement and had their first meeting with the partner. All project requirements must be completed by the end of the third module (mid-spring semester).
If you have any questions about project opportunities, please contact Professor Lanham at lanhamm@purdue.edu.
Below is a list of some of the projects MSBAIM students worked on recently:
Risky Business: Predicting Cancellations in Imbalanced Multi-Classification Settings
We identify a rare event of a customer reneging on a signed agreement, which is akin to problems such as fraud detection, diagnosis of rare diseases, etc. where there is a high cost of misclassification. Our approach can be used in all cases where the class to be predicted is highly under-represented in the data (i.e. data is imbalanced) because it is rare by design; there is a clear benefit attached to this class’ accurate classification and even higher cost attached to its misclassification. Pre-emptive classification of churn, contract cancellations, identification of at-risk youths in a community, etc. are potential situations where our model development and evaluation approach can be used to better classify the rare but important events. We use Random Forest and Gradient Boosting classifiers to predict customers as members of a highly underrepresented class and handle imbalanced data using techniques such as SMOTE, class-weights, and a combination of both. Finally, we compare cost-based multi-class classification models by measuring the dollar value of potential lost revenue and costs that our client can save by using our model to identify at-risk projects and proactively engaging with such customers. While most research deals with binary classification problems when handling imbalanced datasets, our case is a multi-classification problem, which adds another layer of intricacy.
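As a rough illustration of two ideas in this abstract (class weights for imbalanced data and cost-based model comparison), here is a minimal sketch. It is not the team's code: the class labels, the dollar costs, and the function names are hypothetical, and the weighting formula is the common "balanced" heuristic, n / (k * count), rather than anything stated in the abstract.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n / (k * count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


def total_misclassification_cost(y_true, y_pred, cost):
    """Sum cost[(actual, predicted)] over all cases; unlisted pairs cost 0."""
    return sum(cost.get((a, p), 0.0) for a, p in zip(y_true, y_pred))


# Hypothetical three-class outcome with two rare classes.
y_true = ["keep"] * 8 + ["delay"] + ["cancel"]
weights = balanced_class_weights(y_true)

# Hypothetical dollar costs: missing a cancellation is the most expensive error.
cost = {
    ("cancel", "keep"): 5000.0,
    ("delay", "keep"): 1000.0,
    ("keep", "cancel"): 200.0,
    ("keep", "delay"): 200.0,
}

# A naive model that always predicts the majority class.
always_keep = ["keep"] * len(y_true)
cost_always_keep = total_misclassification_cost(y_true, always_keep, cost)
```

Comparing models by total dollar cost rather than plain accuracy is what makes the majority-class predictor look as bad as it is: it is 80% accurate here yet incurs every high-cost error.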
Optimal Clustering of Products for Regression-Type and Classification-Type Predictive Modeling for Assortment Planning
In collaboration with a national retailer, this study assessed how clustering sparse-demand products in various ways affects sales prediction accuracy, while trying to identify scenarios in which framing the problem as a regression problem or a classification problem would lead to the best demand decision support. This problem is motivated by the fact that modeling very sparse demand products is hard. Some retailers frame the prediction problem as a classification problem, where they obtain the propensity that a product will sell or not sell within a specified planning horizon, or they might model it in a regression setting that is plagued by many zeros in the response. In our study, we clustered products with the k-means, SOM, and HDBSCAN algorithms using lifecycle, failure rate, product usability, and market-type features. We found there was a consistent story behind the clusters generated, which were primarily distinguished by particular demand patterns. Next, we aggregated the clustering results into a single input feature, which led to improved prediction accuracy of the predictive models we examined. When forecasting sales, we investigated a variety of regression- and classification-type models and report a short list of the models that performed best in each case. Lastly, we identify certain scenarios we observed when modeling the problem as a classification problem versus a regression problem, so that our partner could be more strategic in how they use these forecasts for their assortment decisions.
Effect of Forecast Accuracy on Inventory Optimization Model
This study describes an optimization solution to minimize a retailer’s inventory system costs. In the past, all demand was forecasted yearly and information regarding each item’s demand distribution was not used. The retailer derived weekly and monthly demand forecasts by simply dividing the yearly forecast by fixed numbers. Therefore, the retailer purchased items in bulk from vendors to prepare for unexpected demand, which generated huge holding costs. If we model the demand distribution of each item, then a dynamic economic order quantity model becomes possible. We solved this problem by using different distributions for each item. Then, we built formulas to calculate costs and service levels. Next, we optimized our model to minimize cost while meeting several business requirements, such as a minimum service level, for each item type. Lastly, we show the impact that the quality of the demand forecast has on the business.
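To make the cost and service-level trade-off concrete, here is a minimal sketch of the textbook economic order quantity (EOQ) with a safety stock sized for a target service level. It is a simplification, not the team's model: it assumes normally distributed demand, and all function names and numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def eoq(annual_demand, order_cost, holding_cost):
    """Classic EOQ: the order size that minimizes ordering plus holding cost."""
    return sqrt(2 * annual_demand * order_cost / holding_cost)


def safety_stock(service_level, demand_std_per_period, lead_time_periods):
    """Safety stock for a cycle service level, assuming normal demand."""
    z = NormalDist().inv_cdf(service_level)
    return z * demand_std_per_period * sqrt(lead_time_periods)


def annual_cost(annual_demand, order_cost, holding_cost, order_qty, safety):
    """Yearly ordering cost plus holding cost on cycle and safety stock."""
    ordering = annual_demand / order_qty * order_cost
    holding = holding_cost * (order_qty / 2 + safety)
    return ordering + holding


# Hypothetical item: 1,200 units/year, $50 per order, $3 per unit-year to hold.
q = eoq(1200, 50, 3)
ss = safety_stock(0.95, demand_std_per_period=10, lead_time_periods=4)
cost_with_safety = annual_cost(1200, 50, 3, q, ss)
```

A noisier forecast means a larger demand standard deviation, which raises the safety stock and therefore the holding cost; that is one simple way to see how forecast quality feeds into inventory cost.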
Future LPG Shipments Forecasting Based on Empty LPG Vessels Data
This project assesses the feasibility of using information about empty liquefied petroleum gas (LPG) carrier vessels in transit at sea to predict how much LPG will be shipped in the future. As prices of multiple commodities fluctuate with the supply and demand of LPG, it is crucial to identify effective indicators of LPG commodity flow to foresee future market trends. To conduct this analysis, we acquired access to the shipping schedule data of empty and full LPG vessels and ran multiple types of regressions to understand the correlation between these two factors. After iterative analyses, we selected a ridge regression model from among linear, ridge, and support vector regression (SVR) models, achieving an R-squared of up to 0.8 on the test dataset.
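For readers unfamiliar with ridge regression and R-squared, here is a minimal single-feature sketch. It is illustrative only: the data are made up, the function names are hypothetical, and the abstract does not say which features or penalty strength the team used.

```python
def fit_ridge_1d(x, y, alpha):
    """Ridge regression with one feature and an unpenalized intercept.

    With centered data, the slope is Sxy / (Sxx + alpha); alpha = 0 gives
    ordinary least squares, and larger alpha shrinks the slope toward zero.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / (sxx + alpha)
    intercept = y_mean - slope * x_mean
    return slope, intercept


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_mean = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - y_mean) ** 2 for yt in y_true)
    return 1 - ss_res / ss_tot


# Made-up data: x could stand for empty-vessel counts, y for later shipments.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]

ols_slope, ols_intercept = fit_ridge_1d(x, y, alpha=0.0)
ridge_slope, ridge_intercept = fit_ridge_1d(x, y, alpha=10.0)
ridge_r2 = r_squared(y, [ridge_intercept + ridge_slope * xi for xi in x])
```

In practice R-squared should be computed on held-out test data, as the abstract describes, rather than on the data used for fitting.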
A Comparative Study of Machine Learning Frameworks for Demand Forecasting
In collaboration with a national consulting company, this study addresses two questions: (1) which machine learning approaches perform the best at predicting demand for grocery items? and (2) what performance could one expect to achieve using an open-source workflow versus proprietary in-house machine learning software? The motivation behind this research is that consulting companies regularly help their retail clients understand demand as accurately as possible, but also in a scalable and efficient manner. Efficient and accurate demand forecasting enables retailers to anticipate demand and plan better. In addition to delivering accurate results, data science teams must also continue to develop and improve their workflow so that experiments can be performed with greater ease and speed. We found that using open-source technologies such as scikit-learn, PostgreSQL, and R, a decent performing workflow could be developed to train and score forecasts for thousands of products and stores accurately at various aggregation levels (e.g. day/week/month) using deep-learning algorithms. However, the performance of our solution has yet to be compared to the commercial platform of the data science team we collaborated with; that comparison will be added soon. We have been able to learn how they achieved performance gains (in model accuracy and runtime), which made this collaboration a great learning experience.
Forecasting Intermittent Demand Patterns with Time Series and Machine Learning Approaches
The study aims to assess how well the predictive accuracy of various machine learning approaches, along with the widely used Croston’s method for time-series forecasting, generalizes. Using multiple multi-period time series, we examine whether there is a method that tends to capture intermittent demand better than others. The motivation for this study is that demand forecasting is an important component of business planning, but it is also a challenge, especially in the case of intermittent demand. If companies cannot capture demand accurately, they either risk losing sales and customers when items are stocked out, or are burdened with excessive inventories. Therefore, being able to identify even small improvements in prediction accuracy of intermittent demand can translate into significant savings (Syntetos, Babai, & Gardner, 2015). In collaboration with a supply chain consulting company, we investigated over 160 different intermittent time series to identify what works best.
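Croston’s method, mentioned above, can be sketched in a few lines. This is a minimal version of the basic method, not the study's implementation: it smooths nonzero demand sizes and the intervals between them separately and forecasts their ratio. The initialization choice, the smoothing constant, and the sample series are assumptions for illustration.

```python
def croston(demand, alpha=0.1):
    """Basic Croston forecast of demand per period for an intermittent series.

    Nonzero demand sizes and inter-demand intervals are each updated by
    simple exponential smoothing, but only in periods where demand occurs.
    Assumed initialization: first nonzero demand and the periods up to it.
    """
    size = None        # smoothed nonzero demand size
    interval = None    # smoothed number of periods between demands
    periods_since = 1  # periods elapsed since the last nonzero demand
    for d in demand:
        if d > 0:
            if size is None:
                size, interval = float(d), float(periods_since)
            else:
                size += alpha * (d - size)
                interval += alpha * (periods_since - interval)
            periods_since = 1
        else:
            periods_since += 1
    if size is None:
        return 0.0  # no demand observed at all
    return size / interval


# Hypothetical series: 6 units every third period.
forecast = croston([0, 0, 6, 0, 0, 6, 0, 0, 6], alpha=0.2)
```

Because the forecast only updates when demand occurs, it stays flat through runs of zeros, which is the behavior that distinguishes it from plain exponential smoothing on the raw series.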
A Solution to Forecast Demand Using LSTM Recurrent Neural Networks for Time Series Forecasting
This study focuses on predicting demand based on data collected across many periods. To help our client build a solution to forecast demand effectively, we developed a model using Long Short-Term Memory (LSTM) networks, a type of recurrent neural network, to estimate demand based on historical patterns. While there are many available models for dealing with a time series problem, the LSTM model is relatively new and highly sophisticated compared to its counterparts. By comparing this model, which works very well for sequential learning, to other existing models and techniques, we are now closer to solving at least one of the many complications apparent across industries. The study is all the more important for supply chain professionals, especially those in purchasing, as they can now rely on a highly accurate model instead of basing their forecasts purely on intuition and recent customer behavior.
An Optimization Approach for Assortment Planning
In this report, we deal with a quintessential question of supply chain management: “What to stock?” Inventory management is the core of the supply chain and accounts for a significant percentage of cost. There is a trade-off between inventory cost and customer satisfaction, as a high inventory level increases cost but also increases the customer service level, and vice versa. Hence, firms aim to strike a balance between the two so that they can maximize service level at minimum cost. This leads to the idea of assortment planning, i.e. which combinations and quantities of SKUs to stock so that customers can find the SKUs they want. We have built an optimization model that maximizes profit as our objective function for the different assortments of SKUs for various stores, with total space, total cost, and quantity of eligible SKUs in the assortment as decision variables. The model is built using Gurobi in R.
A Retrospective Investigation of Test & Learn Business Experiments & Lift Analysis
This study retrospectively investigates how various promotional activities (e.g. discount rates and bundling) affect a firm’s KPIs such as sales, traffic, and margins. The motivation for this study is that in the retail industry, a small change in price has significant business implications. The Fortune 500 retailer we collaborated with thrives on low price margins and had historically run many promotions; however, until this study, they had limited ability to estimate the impact of these promotions on the business. Our solution employs a traditional log-log model of demand versus price to obtain a baseline measure of price sensitivity, followed by an efficient dynamic time-series intermittent forecast to estimate the promotional lift. We believe our approach is both a novel and practical solution for retrospectively understanding the promotional effects of test-and-learn type experiments, and one that all retailers could implement to help improve their revenue management.
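The log-log demand model mentioned above can be illustrated with a small least-squares fit of ln(Q) = a + b * ln(P), where the slope b is the price elasticity. This is a sketch with synthetic data and hypothetical function names, not the retailer's data or the team's code.

```python
from math import exp, log


def fit_log_log(prices, quantities):
    """Ordinary least squares fit of ln(Q) = a + b * ln(P).

    Returns (a, b); b is the price elasticity of demand.
    """
    xs = [log(p) for p in prices]
    ys = [log(q) for q in quantities]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    elasticity = sxy / sxx
    intercept = y_mean - elasticity * x_mean
    return intercept, elasticity


def baseline_demand(price, intercept, elasticity):
    """Demand the fitted model expects at a given price."""
    return exp(intercept + elasticity * log(price))


# Synthetic data generated with a true elasticity of -1.5.
prices = [1.0, 2.0, 4.0, 8.0]
quantities = [100.0 * p ** -1.5 for p in prices]
intercept, elasticity = fit_log_log(prices, quantities)
```

One simple way to read a promotional lift from such a baseline is to subtract the model's expected demand at the promoted price from the observed demand; the abstract's forecast-based lift estimate is more involved than that.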
Carrier Choice Optimization with Tier Based Rebate for a National Retailer
In this study we developed a transportation delivery decision-support system in collaboration with a high-end national retailer. After understanding the constraints of the business problem and the primary transportation provider’s tiered rebate policy, we framed this problem as an analytics problem. Our analytical solution predicts the expected rebate rate at a week level by framing it as a time-series forecasting problem. Traditional ARIMA models were used to build features that served as inputs to machine learning models (e.g. random forest, artificial neural network, and deep learning models), which we found led to even better forecasts of costs. As agreed upon with the partner, we focused on obtaining a model that achieved the lowest cross-validated Root Mean Squared Error (RMSE). Lastly, we developed a tool that simulates various transportation scenarios to demonstrate how much of the delivery business could be allocated to other smaller carriers, while still ensuring the retailer would remain in a certain strategic rebate tier with their major vendor to minimize their overall yearly transportation costs.
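The scenario question at the end of this abstract (how much volume can move to smaller carriers without dropping out of a rebate tier) can be sketched as follows. Everything here is hypothetical: the tier thresholds, rebate rates, per-shipment prices, and function names are invented for illustration and do not come from the partner.

```python
from math import ceil


def rebate_rate(spend, tiers):
    """Rate of the highest tier whose minimum spend has been reached.

    `tiers` is a list of (min_spend, rate) pairs sorted by min_spend.
    """
    rate = 0.0
    for min_spend, tier_rate in tiers:
        if spend >= min_spend:
            rate = tier_rate
    return rate


def net_cost(total_shipments, small_shipments, major_price, small_price, tiers):
    """Yearly cost after the major carrier's rebate on its own spend."""
    major_spend = (total_shipments - small_shipments) * major_price
    small_spend = small_shipments * small_price
    return major_spend * (1 - rebate_rate(major_spend, tiers)) + small_spend


def max_small_shipments(total_shipments, major_price, tier_min_spend):
    """Most shipments that can go elsewhere while keeping the target tier."""
    needed_with_major = ceil(tier_min_spend / major_price)
    return max(0, total_shipments - needed_with_major)


# Hypothetical policy: 5% rebate from $500k of spend, 10% from $800k.
tiers = [(0, 0.0), (500_000, 0.05), (800_000, 0.10)]
movable = max_small_shipments(100_000, major_price=10, tier_min_spend=800_000)
cost_all_major = net_cost(100_000, 0, 10, 8.5, tiers)
cost_at_limit = net_cost(100_000, movable, 10, 8.5, tiers)
cost_past_limit = net_cost(100_000, movable + 10_000, 10, 8.5, tiers)
```

With these made-up numbers, shifting volume up to the tier boundary lowers total cost, while shifting further drops the retailer into a lower tier and raises it, which is the kind of trade-off such a simulation tool is meant to expose.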
An Analytical Approach for Understanding Promotion Effects on Demand and Improving Profits
The objective of this study is to design and develop a better revenue management system that leverages an understanding of price elasticity and promotional effects to predict demand for grocery items. This study is important because the use of sales promotions in grocery retailing has intensified over the last decade as competition between retailers has increased. Category managers constantly face the challenge of maximizing sales and profits for each category. Price elasticities of demand play a major role in the selection of products for promotions, and are a major lever retailers use to push not only the products on sale, but other products as well. We model price sensitivity and develop highly accurate predictive demand models based on the product, discount, and other promotional attributes, using machine learning approaches, and compare the performance of those models against time-series forecasts.
A Machine Learning Approach to Delivery Time Estimation for Industrial Equipment
Our research focuses on obtaining better predictions of lead time for made-to-order equipment for a large multinational corporation. In collaboration with this corporate partner, our team was tasked with creating a deployable solution that could provide reliable delivery predictions. The motivation for this work is that when customers place orders for pieces of equipment, they are given an expectation that their product will be delivered in a timely manner. Without a delivery estimation system currently in place, the company cannot provide customers an expected time window, which is an inconvenience for customers who have their own operational planning around this equipment. To predict this lead time, our team was provided access to tens of thousands of entries of equipment order data. We experimented with many models, considering the unique aspects of the features, and were able to obtain predictions of delivery time for each product line. Our predictive approach provides a solution to this business dilemma by providing highly accurate, cross-validated predictions of delivery time as well as a corresponding prediction interval. We believe our approach could be easily extended to other similar supply-chain problems.
Analytical Evidence to Justify Legislative Changes in State Government Policies
This study provides a predictive and prescriptive analytics solution to justify the proposal of legislative changes to current state government policies for credentials. The predictive component provides a model that predicts expected future demand for various customer credential types. The prescriptive component provides a workforce capacity planning decision model that identifies how to modify credential expiration dates. If this decision can be made via legislative change, it will inherently lead to a more balanced demand profile year-on-year, as well as facilitate a state government agency’s staffing decision. In collaboration with this governing body, our analyses will help provide the empirical evidence they require to propose state-wide changes to credential dates.