We set ourselves the somewhat ambitious task of explaining feature engineering (also known as feature extraction) and its role in predictive analytics and demand forecasting. Which sounds really dull, so we're going to talk about the weather instead - and specifically, why there's more to temperature than the number on a thermometer.
Let's imagine that you work for a restaurant with an outdoor dining area (even if, at the time of writing, restaurants across Europe are closed because of coronavirus...).
If asked, you'd probably say that sales go up as the temperature increases - people are out and about more in good weather, and your outdoor seating adds a lot of extra covers.
But is it as simple as that?
The answer is: probably not. Below freezing (in the UK at least), people are unlikely to venture outside, so sales are low and flat. Between, say, 0C and 15C, sales increase linearly as the temperature goes up. Above 15C people will start sitting outside, which means your restaurant just got a lot bigger, and so sales rise more sharply with temperature. But Brits are a fussy bunch, and once it hits the high 20s it's all a bit much and they want to sit indoors again with the air conditioning.
Now each of those four scenarios is predictable and easy to model (in Excel, let alone with statistical algorithms). But in order to do so, you have to identify them in the first place. Otherwise, your model will be stuck looking at a single linear temperature variable and miss all the subtlety.
This, then, is feature engineering. The transformation just described deals with non-linear relationships between the predictor variables (in this case temperature) and what you're trying to forecast (sales). There are several ways to approach the problem; this example uses 'binning' - creating discrete 'bins' from a continuous variable. Others, the details of which are beyond the scope of this article (and which may or may not involve a full-blown statistics lesson), include overlapping bins, chi-squared binning and piecewise transformations.
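To make the binning idea concrete, here is a minimal sketch in pandas. The band labels and the 28C cut-off for "too hot" are assumptions standing in for the article's "high 20s"; the other edges (0C and 15C) come from the scenarios above.

```python
import pandas as pd

# Hypothetical daily temperatures in °C
temps = pd.Series([-3.0, 5.0, 12.0, 18.0, 24.0, 30.0])

# Bin edges follow the four regimes described above;
# 28°C for "too hot" is an assumption
bins = [-float("inf"), 0, 15, 28, float("inf")]
labels = ["freezing", "mild", "outdoor", "too_hot"]

# pd.cut turns the continuous variable into a categorical one,
# which a model can treat as four separate regimes
temp_band = pd.cut(temps, bins=bins, labels=labels)
print(temp_band.tolist())
```

In a real model you would one-hot encode `temp_band` (or fit a separate slope per band), so the flat-below-freezing and steep-above-15C behaviours can each be learned independently.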
A separate type of transformation is necessary to deal with interaction between different variables. The temperature effect described above may hold when looking at the average across all your restaurants, but when you look region by region you realise that customers in Scotland are much hardier and will start sitting outside once the temperature is above 12C. This is an example of interaction between temperature and geography. As you might imagine, this can very quickly lead to an explosion in the number of potential predictors and the number of dimensions the model needs to consider...
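An interaction feature like this can be built explicitly. The sketch below is illustrative: the 15C English threshold comes from the example above, the 12C Scottish one from this paragraph, and the data is made up.

```python
import pandas as pd

# Hypothetical data: one row per restaurant-day
df = pd.DataFrame({
    "region": ["England", "England", "Scotland", "Scotland"],
    "temp_c": [13.0, 16.0, 13.0, 16.0],
})

# Assumed region-specific thresholds for outdoor dining
outdoor_threshold = {"England": 15.0, "Scotland": 12.0}

# Interaction feature: is outdoor seating in play? The answer
# depends on temperature AND region together - neither variable
# alone can capture it
df["outdoor_open"] = df["temp_c"] >= df["region"].map(outdoor_threshold)
print(df["outdoor_open"].tolist())
```

The same 13C day yields `False` in England but `True` in Scotland, which is exactly the kind of signal a single linear temperature variable would average away.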
On the other side of the equation, we might actually want to transform the dependent variable (a fancy term for the sales number that we are trying to forecast). In some cases, you can get more accurate results by predicting the difference from last year's sales, rather than the total sales themselves. Which transformation helps, if any, is very domain-specific.
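The year-over-year transformation is a one-liner in pandas. This sketch uses fabricated monthly sales purely to show the mechanics:

```python
import pandas as pd

# Hypothetical monthly sales covering two years
idx = pd.period_range("2020-01", periods=24, freq="M")
sales = pd.Series(range(100, 124), index=idx, dtype=float)

# New dependent variable: change versus the same month last year.
# The first 12 months have no prior-year comparison, so they are NaN.
yoy_diff = sales - sales.shift(12)

# After forecasting yoy_diff, add last year's actuals back
# to recover a forecast in sales terms
recovered = yoy_diff + sales.shift(12)
```

The model then learns the (often more stable) year-on-year change, and the final forecast is reconstructed by adding last year's known figures back on.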
So that's a very brief introduction to feature engineering. It's one of the most important things for a data scientist to get right: do it well and your model has far more information to work with. And more information means more accuracy, which is why we're all here.
If you'd like to learn more, these articles might be of interest:
Thanks for reading.
Skarp uses machine learning-powered predictive analytics to generate accurate, automated demand forecasts - and an explanation of what is actually driving performance.
By removing uncertainty and quantifying the impact of factors affecting performance, Skarp can reduce costs and improve customer satisfaction.
We offer a fully-managed service, designed for organisations with limited in-house data science resources.
There is no setup fee or minimum contract term with Skarp, and we offer all new clients a proof of concept free of charge. We believe the accuracy of our forecasts will speak for itself.