Select MSBAIM industry practicum students may compete in the Future Edelman Impact Award contest, a notable analytics competition that aims to inspire students to produce their best data work and to strive to become finalists or winners of a future INFORMS Edelman Award.
The Franz Edelman Award for Achievement in Advanced Analytics, Operations Research and Management Science is administered by the Institute for Operations Research and the Management Sciences, or INFORMS.
The school's competition, open only to teams of MSBAIM students, emulates the INFORMS competition.
"Our goal in creating this Future Edelman Impact Award is to give our students practice via their Industry Practicum course project in delivering empirically-supported work that can be understood by the layman and that has measurable impact," says Matthew Lanham, academic director of the MSBAIM program and a clinical assistant professor of management.
"We want the experience and the way of thinking, designing, doing and delivering analytics to be a catalyst for the students as they develop into leaders who will one day compete for the INFORMS award," he says.
INFORMS is a collection of academic and industry experts whose mission is to "advance and promote the science and technology of decision making to save lives, save money and solve problems." The Edelman Award is one of several INFORMS awards and is named to honor one of the fathers of operations research and management science, more commonly known as the field of analytics. Top entities in the world compete for the Edelman Award, and recent winners include the United Nations World Food Programme, Intel Corporation and UPS.
The proliferation of global supply chain disruptions and the rapid advancement of globalization in recent times have compelled corporations to reevaluate their stance on sustainability and the resilience of their supply chains. To address this challenge, corporations have developed models that simulate the risk of disruptions and evaluate the environmental impact in their multi-tier supply chains. In recent years, there has been a surge in the development of innovative methods for mapping suppliers, particularly those based in foreign countries, to ensure supply chain resiliency. One of the most promising methods is the application of natural language processing (NLP) techniques to retrieve essential information about suppliers from publicly accessible data sources. NLP enables corporations to make informed and data-driven decisions when selecting partners for supplier relationships. To enhance transparency across the supply chain, various methods from related fields, such as statistics, forensics and biology, have been adapted and utilized to meet the needs of both large and medium-sized corporations.
The banking industry requires efficient risk management to maintain the stability of the economy and mitigate potential losses. The Great Recession of 2008 exposed the inadequacy of existing financial regulations as the banking economy collapsed and large financial institutions declared bankruptcy. With increased scrutiny on the importance of risk management within banks, the U.S. government introduced financial regulations such as the Current Expected Credit Loss (CECL) framework and the Dodd-Frank stress tests to ensure banks maintain sufficient capital to withstand losses. These regulatory standards are expected to evolve continuously with a changing financial landscape. As a result, banks need to continuously monitor their models and workflows to comply with and adapt to these changes and minimize risk. Automation in risk reporting and standardization of processes can help banks navigate heightened regulatory and conduct controls. Applications of automation in the banking industry include streamlining existing decision-making processes, collating vast amounts of financial data, better modeling and more. This paper explores the potential applications of automation and modeling in the banking industry and the benefits they can bring to banks, especially in enhancing their risk management frameworks.
One of the most important stages of product development is viability estimation. The calculations it entails help assess the market demand, competition and profitability of a product, allowing companies to make informed decisions about product development, launch and promotional strategies. By analyzing key factors such as the size of the target market, regulatory requirements and projected costs, product viability calculations can help companies identify potential challenges and opportunities and develop strategies to mitigate risks and capitalize on potential benefits. This process becomes even more important in the U.S. healthcare industry, owing to multiple variable factors around reimbursement, demographics and customer preferences. Furthermore, if a product, particularly a healthcare product, is recalled, it not only creates legal obligations and concerns, but the entire product must be redesigned and the process of FDA approval rethought and restrategized.
Minimizing the cost of risk in the foreign exchange (FX) market is one of the most crucial problems faced by international organizations. The fluctuations in exchange rates can lead to unnecessary losses, thus negatively impacting the organization's bottom line. One of the most common hedging strategies in the FX market is to enter forward contracts. This strategy aims to enter into a financial agreement to lock in an exchange rate. Entering a forward contract provides an organization with a more stable financial environment, which comes with an associated cost to enter the contract. Using historical exchange rates and Monte Carlo simulations to generate a portfolio distribution of currencies on a balance sheet, we were able to use risk measures to estimate extreme loss scenarios. To optimally recommend hedge ratios on exposed currency pairs, an optimization algorithm was designed and implemented to minimize the cost of risk (cost of hedge + expected loss level using CVaR measure). With the money saved each month following an implementation of the hedging program, FX market investment decisions informed by a deep learning prediction model (DeepAR) and a Markowitz mean-variance portfolio construction model can be used to deploy the cash for returns.
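The trade-off at the core of this hedging program — hedge cost against tail risk — can be illustrated with a small sketch. This is a simplified, hypothetical stand-in for the project's algorithm: it simulates returns for a single currency (the 2% monthly volatility and linear hedge cost are assumptions for illustration) and grid-searches the hedge ratio minimizing hedge cost plus CVaR, rather than optimizing across a full balance sheet of currency pairs.

```python
import random

def cvar(losses, alpha=0.95):
    """Conditional Value at Risk: the mean loss in the worst (1 - alpha) tail."""
    worst_first = sorted(losses, reverse=True)
    tail = worst_first[:max(1, int(len(losses) * (1 - alpha)))]
    return sum(tail) / len(tail)

def optimal_hedge_ratio(exposure, hedge_cost_rate, n_sims=5_000, seed=42):
    """Grid-search the hedge ratio h in [0, 1] that minimizes
    (cost of hedge) + (CVaR of the unhedged loss) under simulated returns."""
    rng = random.Random(seed)
    # Hypothetical monthly FX returns: zero drift, 2% volatility.
    returns = [rng.gauss(0.0, 0.02) for _ in range(n_sims)]
    best_h, best_total = 0.0, float("inf")
    for step in range(101):
        h = step / 100
        hedge_cost = h * exposure * hedge_cost_rate
        # Only adverse moves create losses on the unhedged portion.
        losses = [max(0.0, -r) * exposure * (1 - h) for r in returns]
        total = hedge_cost + cvar(losses)
        if total < best_total:
            best_h, best_total = h, total
    return best_h, best_total
```

As the contract's cost rate rises relative to the tail risk it removes, the recommended hedge ratio falls — the same intuition driving the full optimization.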
Increasingly, insurance companies are exploring ways to predict how potential and current customers will act. We created a suite of models and clusters that support insurance companies as they determine customer segments and predict consumer needs. An accurate predictive system allows consumers to obtain adequately priced policies and allows insurers to maximize profits by customizing their offers to each individual customer. Using clustering algorithms and machine learning techniques such as random forests, we created a series of models to help answer insurance companies' most pressing questions.
Food waste and stockouts in the retail stage of a food supply chain are mostly caused by inaccurate sales forecasting, which results in improper product ordering and, in turn, on-shelf availability issues. For perishable products, shelf life is a crucial factor in choosing the forecasting time frame. A time-series demand forecast is influenced by various factors like weather, holiday season, promotions, customer visits and month, all of which affect the sales quantity of products. This engagement focuses on product sales at gas stations, where various forecasting methods, including ARIMA, auto-ARIMA, LSTM, Holt-Winters smoothing and SARIMAX, are gauged for accuracy. The automated forecasting workflow dynamically predicts the right inventory for each combination of product and store location based on historical sales. Models are evaluated on MAPE (mean absolute percent error), as it is not sensitive to measurement units. Day-level predictions help handle differing weekday and weekend traffic. The proposed methodology can be easily customized and scaled across all product hierarchies and gas stations; it is designed to ensemble the candidate models and proceed with the one with the best weighted MAPE value.
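The MAPE-based ensembling step might look like the sketch below, which assumes forecasts have already been produced by the candidate models and blends them with inverse-MAPE weights computed on a holdout window. The function names and weighting rule are illustrative, not the project's exact implementation.

```python
def mape(actuals, forecasts):
    """Mean Absolute Percent Error; skips zero-actual periods to avoid
    dividing by zero."""
    terms = [abs(a - f) / abs(a) for a, f in zip(actuals, forecasts) if a != 0]
    return 100 * sum(terms) / len(terms)

def weighted_ensemble(actuals, model_forecasts):
    """Blend candidate forecasts using inverse-MAPE weights computed on a
    holdout window, so more accurate models contribute more."""
    mapes = {name: mape(actuals, f) for name, f in model_forecasts.items()}
    inv = {name: 1 / (m + 1e-9) for name, m in mapes.items()}  # eps guards a perfect model
    total = sum(inv.values())
    weights = {name: w / total for name, w in inv.items()}
    blended = [
        sum(weights[name] * f[i] for name, f in model_forecasts.items())
        for i in range(len(actuals))
    ]
    return weights, blended
```

A model with half the holdout error receives roughly twice the weight, and the blended series always lies between the most optimistic and most pessimistic candidate.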
This project develops a solution using unstructured data methodologies to help improve product digital eligibility for a national grocer. The motivation for this problem is that only 33% of the grocer's active products are eligible to be purchased on its website due to inaccuracies in the product listings. A lack of digital eligibility also restricts the grocer from selling those products in its stores. Website product descriptions are currently written manually by vendors, and many products have descriptions that are either missing or a poor representation of the product. This lowers digital eligibility and, in turn, the number of products that can be listed on the website, leading to potential losses in sales and company performance. We develop an algorithm that first identifies poor product descriptions and then generates a new description based on the available product image(s). We provide empirical results for the various approaches we investigated, including the popular ChatGPT, and estimate how our solution ensures a quick turnaround time to make more products available on the digital platform with fewer errors. This engagement led to improved automation that will reduce the effort invested by vendors who manually write product descriptions.
The Taguchi Loss Function is an effective way to demonstrate that as a company's deliverable deviates further from its target, the resulting loss grows quadratically rather than appearing only once a specification limit is crossed. Smooth and orderly packaging, stacking and loading of goods onto delivery trucks is vital to a successful supply chain. Combining these two ideas, we proactively address a potential point of friction in supply chain operations. Our digital system generates a plan to streamline the packing, stacking and loading of palletized goods onto delivery trucks, grouped by order. It also decreases the dependency on subject matter experts within warehouse operations to decide the number of pallets needed to fulfill an order and if and when to stack pallets for loading. Given the dimensions and weights of orders, the type of material and the constraints of the entire procedure, our digital system encodes this knowledge to generate an optimized loading plan for each delivery truck.
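The Taguchi loss itself is a one-line formula, L(y) = k·(y − target)², sketched below with k calibrated so the loss equals a known cost at the specification limit. The numbers in the usage are hypothetical, purely to show the quadratic growth.

```python
def taguchi_loss(value, target, cost_at_limit, limit_deviation):
    """Taguchi quadratic loss L(y) = k * (y - target)**2, with k calibrated
    so the loss equals cost_at_limit when y sits at the specification limit
    (target +/- limit_deviation)."""
    k = cost_at_limit / limit_deviation ** 2
    return k * (value - target) ** 2
```

Doubling the deviation from target quadruples the loss, which is why even "in-spec" sloppiness in packing and stacking carries a real, growing cost.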
In today's global economy, organizations commonly procure goods and materials from multiple vendors across the globe via sea-going vessels. Although the carriers in charge of shipments provide their own estimated times of arrival (ETAs), these are often inaccurate and carry high variance. Moreover, conflicting ETAs from multiple carriers degrade organizations' decision-making, causing disruptions to subsequent activities in the chain. In partnership with an award-winning container shipment tracking aggregator, we designed and tested several approaches to effectively combine these noisy carrier predictions into an ensemble that generates better vessel ETA predictions. Among several experiments, we found that using machine learning predictions in conjunction with a custom linear programming solution to assign optimal weights led to more accurate and reliable ETA predictions. This work will not only improve the value the partner provides its clients but will also help those clients plan better, positively impacting P&L and the planet.
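The weight-assignment idea can be shown in miniature. The snippet below is a tiny grid-search stand-in for the linear-programming formulation: for two carriers it searches convex weights (w, 1 − w) that minimize mean absolute error of the blended ETA against actual arrival days. The carrier data are hypothetical.

```python
def optimal_carrier_weights(actual_days, carrier_predictions, step=0.01):
    """Search convex weights (w, 1 - w) over two carriers' ETA predictions,
    minimizing mean absolute error against actual arrival days."""
    preds_a, preds_b = carrier_predictions
    best_w, best_mae = 0.0, float("inf")
    steps = int(round(1 / step))
    for i in range(steps + 1):
        w = i * step
        # Blend the two carriers' predictions and score against actuals.
        mae = sum(
            abs(w * a + (1 - w) * b - actual)
            for a, b, actual in zip(preds_a, preds_b, actual_days)
        ) / len(actual_days)
        if mae < best_mae:
            best_w, best_mae = w, mae
    return best_w, best_mae
```

With more carriers the simplex of weights grows, which is where a proper linear program (as used in the project) replaces brute-force search.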
In collaboration with SIL International – a nonprofit organization – we developed an AI model that takes audio as input and outputs a class label indicating the language being spoken. We used a wav2vec model trained on six languages and built an operationalized pipeline on AWS for this set of languages. We demonstrated how we designed and built the pipeline using AWS architecture to scale our model and ensure easier deployments. The pipeline was set up to trigger and predict whenever new data is added. This language-identification solution was deployed on top of SIL International's existing translation models, which are used across the globe. Those interested in language modeling, deploying AI models using AWS services and using AI for good will find value in this engagement.
Readability assessment models are increasingly important in evaluating the complexity of written text and ensuring it is suitable for a specific audience. Automatic readability assessment models are particularly useful for educators, content creators and researchers who want to make their materials easily understandable and accessible. This paper presents a novel approach for language-agnostic automatic readability assessment using a combination of machine learning techniques and natural language processing tools. The proposed method was tested on a diverse dataset of texts in multiple languages and demonstrated strong performance in accurately assessing readability. This approach provides a valuable tool for working with multilingual text and can help bridge the gap in readability assessment for languages that lack dedicated tools.
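As a deliberately simplified illustration of the kind of language-agnostic signals such a model might consume, the sketch below computes two surface statistics — average sentence length and average word length — that need no language-specific resources beyond sentence punctuation and whitespace tokenization. These particular features are assumptions for illustration; the paper's actual feature set and models are not reproduced here.

```python
import re

def readability_features(text):
    """Language-agnostic surface features for a readability model:
    average sentence length (in words) and average word length (in
    characters). Assumes space-delimited words and ., !, ? sentence ends."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text)
    return {
        "avg_sentence_len": len(words) / len(sentences),
        "avg_word_len": sum(len(w) for w in words) / len(words),
    }
```

Features like these would then feed a classifier or regressor trained against graded texts in whatever languages are available.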
Major western corporations and quickly expanding companies in emerging markets such as India, Africa and Southeast Asia face the challenge of linguistic diversity. Language support for their systems in these regions is almost negligible. SIL International and its partners have collected high-quality audio data in over 3,600 languages. To make use of these data files, they need an automated pipeline on Amazon Web Services (AWS) that can turn input files stored in an AWS S3 bucket into natural-sounding audio output. In collaboration with SIL, we developed a scalable and reproducible pipeline for text-to-speech (TTS) models in order to increase language coverage and provide opportunities for marginalized language speakers. We believe others challenged to build scalable, resilient and cost-effective solutions in AWS will find our approach valuable.
Freight companies often face the issue of deadheading, which occurs when a truck returns empty to the point of origin after delivering a load. Deadhead miles are a significant cost for businesses, wasting resources such as fuel, time and labor costs. In addition, private fleets and dedicated trucks often face suboptimal freight or backhaul options, further increasing deadhead miles. In collaboration with a national food delivery firm, we research and develop a solution to mitigate these costs by starting a third-party brokerage to haul outside vendors' freight, leading to increased utilization of their private and dedicated fleets. Our analysis focused on comprehensive shipment, lane and carrier data to construct models that optimize return routes, reducing deadhead miles. Additional market research was performed to demonstrate the role of brokerage firms in connecting shippers and carriers and the costs associated with building or buying such a firm. This engagement accounted for real-world obstacles, such as driver rejections and liability concerns, to provide the client with a comprehensive understanding of the profitability and risks involved. This engagement led to improved transparency of the business and market, which allowed the firm to make a strategic decision to begin a brokerage.
Firms collect and analyze sensitive consumer data to gain insights about their business and develop cutting-edge strategies. With increasing regulations and the risk of sensitive data leakage, firms employ several stringent practices to ensure data privacy. However, these practices drive up operational costs and opportunity losses. Synthetic data generation allows firms to relax these practices at the cost of some predictive power. The problem lies in understanding what data generation methodologies exist, where they could be used and how such data would perform for various predictive modeling approaches. In collaboration with a national timeshare firm, we provide best practices that firms in all domains can follow to adopt synthetic data into their data science workflow, reducing the operational costs and legal barriers needed to maintain data privacy. Our empirical results demonstrate how generated synthetic data performs with the predictive models most likely to be used in business, against the privacy tradeoffs one is likely to encounter.
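One of the simplest generation methodologies in this space can be sketched as follows: fit an independent Gaussian to each numeric column and sample from it. This toy generator preserves column means and variances but deliberately ignores cross-column correlation — production workflows typically use richer methods such as copulas, variational autoencoders or GANs — and is offered only to make the idea concrete.

```python
import random
import statistics

def synthesize(rows, n_samples, seed=0):
    """Draw synthetic rows column by column from independent Gaussians
    fitted to each numeric column (ignores cross-column correlation)."""
    rng = random.Random(seed)
    cols = list(zip(*rows))
    params = [(statistics.mean(c), statistics.stdev(c)) for c in cols]
    return [
        tuple(rng.gauss(mu, sd) for mu, sd in params)
        for _ in range(n_samples)
    ]
```

The resulting rows can stand in for the real table in a modeling pipeline; how much predictive power survives depends on how strongly the model relies on the correlations this sampler discards.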
To efficiently target marketing campaigns to a receptive audience and improve order conversion and revenue per transaction, one must first be able to predict the actions of consumers. This is even more true in the fast-service food industry, where new products are launched regularly and individual brands live and die by effective advertising. Rather than focusing solely on demographics and a brand's most popular products, this solution predicts the date range and constituent items of a customer's next order. The date prediction buckets users into ranges of dates using k-means clustering. For product prediction, collaborative filtering estimates user preference for previously unordered items by examining the orders and items of similar users, with co-clustering grouping similar users and products together. Generating personalized baskets of predicted items per user provides substantial accuracy gains over simply recommending the brand's most popular items to all users. Personalized basket predictions are shown to correctly select at least one of three recommended items in a customer's next order for 75% of the test set. An A/B test will illustrate the impact of this personalized prediction system by tracking increases in conversion rate and average order price for those shown recommendations informed by this model vs. those shown generic national advertisements.
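The item-preference idea can be illustrated with a minimal co-occurrence recommender: score candidate items by how often they historically appear in orders alongside the items in a customer's current basket. This is a simplified stand-in for the collaborative filtering and co-clustering models described, with hypothetical item names.

```python
from collections import defaultdict

def recommend(order_history, basket, top_n=3):
    """Item-based collaborative filtering via co-occurrence counts:
    candidates score higher the more often they co-occur with basket items
    in historical orders."""
    co_counts = defaultdict(int)
    for order in order_history:
        for item in order:
            for other in order:
                if other != item:
                    co_counts[(item, other)] += 1
    scores = defaultdict(int)
    for item in basket:
        for (a, b), count in co_counts.items():
            if a == item and b not in basket:
                scores[b] += count
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked][:top_n]
```

A real system would normalize these counts and factor in user similarity, but the shape of the computation is the same.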
This team of six rising business analytics professionals received more than 3k views and 1k "Likes" to secure the top position and earn the $1,000 prize.
Thank you to everyone who participated and voted.
Assortment planning is one of the most important and challenging applications of analytics in retail. Retailers often use a two-stage approach: in the first stage, they run thousands of prediction experiments to identify what best captures expected demand; in the second stage, they decide which combination of products will lead to the best sales for a particular store, a classic knapsack-type problem. This work, in collaboration with a national retailer, focuses specifically on combinatorial assortment optimization and how the hierarchical nature of the decisions and analysis can lead to drastically different outcomes with respect to in-store profitability.
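The second-stage decision maps naturally onto the 0/1 knapsack: choose the set of products whose predicted profit is maximal subject to limited shelf space. A minimal dynamic-programming sketch (with hypothetical product data; the retailer's actual formulation is far richer) is:

```python
def best_assortment(products, shelf_capacity):
    """0/1 knapsack over candidate products: maximize total predicted
    profit subject to a shelf-space capacity, via dynamic programming.
    products: list of (name, space_needed, predicted_profit)."""
    best = {0: (0, [])}  # space used -> (profit, chosen product names)
    for name, space, profit in products:
        # Iterate existing states largest-first so each product is used once.
        for used in sorted(best, reverse=True):
            new_used = used + space
            if new_used <= shelf_capacity:
                cand = (best[used][0] + profit, best[used][1] + [name])
                if new_used not in best or cand[0] > best[new_used][0]:
                    best[new_used] = cand
    return max(best.values(), key=lambda v: v[0])
```

The hierarchical wrinkle the abstract describes arises when this choice is nested inside category- and store-level decisions, where locally optimal picks can conflict.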
Patents play a significant part in innovation and help individuals and companies safeguard and retain ownership of their ideas. However, patent infringement is common, and more than 2,500 patent infringement suits are filed each year. Currently, patent infringement detection is largely done manually, and companies spend approximately $600 to identify each case of infringement. Our work provides an approach to automate this process through machine learning. Our model first vectorizes patent text using a BERT model trained on patent text, and then calculates similarity scores between competing patent claims. The overall score is then calculated by taking a weighted average of the subsection similarities, where the weights were calculated by training a logistic regression model based on historical cases of infringement. Looking at subsection scores along with the overall score, we can identify potential infringement of two competing patent claims rather accurately.
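The scoring step — per-subsection cosine similarity rolled up with learned weights — can be sketched as follows. The embedding vectors and weights here are placeholders; in the project they come from a patent-trained BERT model and a logistic regression fit on historical infringement cases.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def infringement_score(subsections_a, subsections_b, weights):
    """Overall similarity between two patents: weighted average of the
    per-subsection cosine similarities of their claim embeddings."""
    sims = [cosine(u, v) for u, v in zip(subsections_a, subsections_b)]
    return sum(w * s for w, s in zip(weights, sims)) / sum(weights)
```

Inspecting the per-subsection similarities alongside the overall score is what lets a reviewer see which claims drive a potential match.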
Organizations that run a multitude of A/B tests can gain immense value from having an automated framework. In collaboration with a data science consulting company, we developed a parametrized accelerator to significantly reduce time to market for data scientists while performing A/B and A/A testing under varying circumstances using Python programming.
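At the heart of any such accelerator sits a reusable, parametrized significance test. A minimal example — a two-proportion z-test with a normal-approximation p-value, assuming conversion-style metrics rather than the consultancy's actual (unspecified) internals — looks like:

```python
import math

def ab_test(conversions_a, n_a, conversions_b, n_b):
    """Two-proportion z-test for an A/B experiment; returns the z statistic
    and a two-sided p-value using the normal approximation."""
    p_a, p_b = conversions_a / n_a, conversions_b / n_b
    pooled = (conversions_a + conversions_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF (via math.erf).
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value
```

Wrapping functions like this with experiment metadata, guardrail metrics and A/A sanity checks is what turns a statistical test into a reusable accelerator.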