Welcome to the most comprehensive course on Feature Engineering for Time Series Forecasting available online. In this course, you will learn how to create and extract features from time series data for use in forecasting.
Master the Art of Feature Engineering for Time Series Forecasting
In this course, you will learn multiple feature engineering methods to create features from time series data that are suitable for forecasting with off-the-shelf regression models like linear regression, tree-based models, and even neural networks.
Specifically, you will learn:
- how to create lag features;
- how to create window features;
- how to create features that capture seasonality and trends;
- how to decompose time series with multiple seasonalities;
- how to extract features from the date and time;
- how to impute missing data in time series;
- how to encode categorical variables in time series;
- how to identify and remove outliers in time series;
- how to avoid data leakage and look-ahead bias in creating forecasting features;
- how to transform features and more.
The Challenges of Feature Engineering in Time Series Forecasting
Forecasting is the process of making predictions about the future based on past data. In the most traditional scenario, we have a time series and want to predict its future values. There are some challenges in creating forecasting features:
- we need to transform time series data into tabular data with a well-designed set of features and a target variable;
- when creating forecasting features we need to be extra careful to avoid data leakage via look-ahead bias;
- time series data, as expected, changes over time; we need to take this into account when building forecasting features;
- predicting the target value at multiple timesteps in the future requires us to think carefully about how to extrapolate our features from the past into the future.
We can forecast future values of the time series using off-the-shelf regression models like linear regression, tree-based models, support vector machines, and more. However, these models require tabular data as input. For forecasting we don’t start with a table of features and a target variable, but instead a set of time series, perhaps just one. We need to transform the time series into tabular data with a target variable and a set of features that can be used by supervised learning models. Therefore, the main challenge is about creating a well-designed target variable and specially designed features that allow us to predict the future value of a time series.
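As a rough illustration of this transformation, the sketch below turns a single series into a table with a target column and two lag features using pandas. The series values and the helper name `make_lag_table` are made up for the example; they are not taken from the course materials.

```python
import pandas as pd

# Hypothetical daily series; the values are illustrative only.
y = pd.Series(
    [10.0, 12.0, 13.0, 15.0, 14.0, 16.0],
    index=pd.date_range("2024-01-01", periods=6, freq="D"),
    name="y",
)


def make_lag_table(series, lags):
    """Build a table with the target and one column per requested lag."""
    table = pd.DataFrame({"target": series})
    for lag in lags:
        # shift(lag) moves past values forward, so each row only sees
        # values observed `lag` steps earlier.
        table[f"lag_{lag}"] = series.shift(lag)
    # The first rows have no history for the largest lag, so drop them.
    return table.dropna()


table = make_lag_table(y, lags=[1, 2])
print(table)
```

Each row of `table` can then be passed to a regression model, with `target` as the value to predict and the lag columns as inputs.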
Creating the target variable and features for time series forecasting comes with its own pitfalls. A major concern is a form of data leakage known as look-ahead bias: accidentally using information that will only be known in the future, not at prediction time, to make a prediction. This can give you the illusion of a great forecasting model that then fails to perform in practice. It is very easy to introduce look-ahead bias during feature engineering, and we show how you can avoid it.
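One common way look-ahead bias creeps in is through window features. The minimal sketch below, on a made-up series, contrasts a rolling mean that includes the value being predicted with one that is shifted first so it only uses earlier values. It assumes a one-step-ahead forecast, where the value at time t is the target.

```python
import pandas as pd

# Toy series; the values are illustrative only.
y = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], name="y")

# Leaky: the window ending at time t includes y[t], the value we want to predict.
leaky_mean = y.rolling(window=2).mean()

# Safer: shift by one step first, so the window only covers values before time t.
safe_mean = y.shift(1).rolling(window=2).mean()

print(pd.DataFrame({"y": y, "leaky_mean": leaky_mean, "safe_mean": safe_mean}))
```

At position 2 the leaky feature averages 2.0 and 3.0, so it already contains the target, while the shifted version averages 1.0 and 2.0.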
Time series data change over time: future data may or may not have the same distribution and patterns as past data. This differs from the assumptions usually made about traditional tabular data. This change in distribution and patterns over time is called non-stationarity. In time series data, the mere presence of trend and seasonality can cause non-stationarity. Creating features that capture these dynamics is thus a challenge in time series forecasting.
We very often want to forecast multiple timesteps into the future. There are multiple ways to do this, such as 1) recursively applying a model that is built to forecast one step ahead, and 2) building a model that directly forecasts the target at a later time period in the future. A challenge is that the feature engineering required for these two methods is different. We discuss these differences in the course.
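To make the difference concrete, here is a small sketch of both strategies on a toy series. It uses a straight-line fit from `numpy.polyfit` as a stand-in for a regression model and a single lag as the only feature; both choices are simplifications for illustration, not the course's method.

```python
import numpy as np
import pandas as pd

y = pd.Series(np.arange(1.0, 11.0), name="y")  # toy series 1, 2, ..., 10
horizon = 3

# Recursive strategy: fit a one-step-ahead model, y[t+1] ~ a * y[t] + b.
one_step = pd.DataFrame({"x": y, "target": y.shift(-1)}).dropna()
a, b = np.polyfit(one_step["x"], one_step["target"], deg=1)

recursive_forecast = []
last_value = y.iloc[-1]
for _ in range(horizon):
    # Each prediction becomes the lag feature for the next step.
    last_value = a * last_value + b
    recursive_forecast.append(last_value)

# Direct strategy: fit a separate model whose target is `horizon` steps ahead.
direct = pd.DataFrame({"x": y, "target": y.shift(-horizon)}).dropna()
a_h, b_h = np.polyfit(direct["x"], direct["target"], deg=1)
direct_forecast = a_h * y.iloc[-1] + b_h

print(recursive_forecast, direct_forecast)
```

In the recursive case the lag feature for later steps is built from predictions, whereas in the direct case every feature comes from observed data and only the target is moved further into the future.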
How can we create a set of features that allow us to predict future values of a time series based on its past values? And how can we add additional information to create a richer dataset for our forecasts? In this course, you will learn all of that, and more.
A Comprehensive Feature Engineering Course for Time Series Forecasting
Creating useful features for forecasting has typically required carefully studying your time series to find predictive patterns, such as trend and seasonality, and integrating this with domain knowledge. Lately, there’s been a growing trend to try to automate the creation of features from time series.
In this course, you will learn how to create features from time series that allow you to train off-the-shelf machine learning models to predict future values of the time series. You will first learn to analyse time series and identify properties that you can use to create predictive features. For example, you will learn how to automatically identify and extract trend and seasonality using various algorithms, as well as how to transform your time series to make it easier to decompose and forecast. We show how you can use tools such as cross-correlation, autocorrelation, and partial autocorrelation plots to create suitable lag features. You will discover tips, tricks, and hacks to create features which model trends, change points, seasonality, calendar effects, outliers and more! Based on data analysis and domain knowledge, you will be able to carefully craft your features.
Then, you will learn how to automate the process of feature engineering to create large numbers of features for time series forecasting, and subsequently select the most predictive ones. Here, we will use open source libraries that allow us to create multiple features automatically or semi-automatically, and then select the most valuable ones. We will cover the Python library Feature-engine, and later on tsfresh and featuretools.
We'll take you step-by-step through engaging video tutorials and teach you everything you need to know to create meaningful features for time series forecasting. Throughout this comprehensive course, we will go through practically every possible methodology for engineering features for time series forecasting. We discuss their logic, Python implementation, advantages and drawbacks, and the things to keep in mind when using these methods.
Specifically, you will learn to:
- identify and isolate the components of a time series, including multi-seasonal time series, using state of the art methods;
- create features that capture trends, change points, and seasonality;
- identify and create suitable lag and window features from the target time series and covariate predictors;
- create features from the date and timestamp itself;
- encode categorical variables for forecasting;
- create features to capture holidays and other special events;
- impute missing data in time series with backward and forward fill and interpolation methods;
- identify, remove, or capture the importance of outliers in forecasting;
- automate feature creation with open source Python libraries.
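As a small taste of two items from this list, the sketch below derives calendar features from a timestamp index and fills gaps in a series with pandas. The data and column names are invented for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series with two missing values; 2024-01-05 is a Friday.
idx = pd.date_range("2024-01-05", periods=5, freq="D")
df = pd.DataFrame({"y": [3.0, np.nan, 5.0, np.nan, 9.0]}, index=idx)

# Features from the date itself.
df["month"] = df.index.month
df["day_of_week"] = df.index.dayofweek  # Monday=0, Sunday=6
df["is_weekend"] = (df.index.dayofweek >= 5).astype(int)

# Forward fill carries the last observed value forward (uses past data only).
df["y_ffill"] = df["y"].ffill()

# Linear interpolation also uses the next observed value, so it draws on
# future data; whether that is acceptable depends on the forecasting setup.
df["y_interp"] = df["y"].interpolate(method="linear")

print(df)
```

Forward fill is the safer default when the filled series feeds lag features, because it never looks past the current row.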
By the end of the course, you will be able to decide which techniques are best suited for your dataset and forecasting challenge. You will be able to apply all the techniques in Python and discover how to improve your forecasts.
Advance Your Data Science Career
You’ve taken your first steps into data science. You know about the most commonly used forecasting models. You've probably tried some traditional algorithms like ARIMA or exponential smoothing to do your forecasts. At this stage, you’re probably starting to find out that these models make a lot of assumptions about the data that simply do not hold. You thought about trying neural networks, but they provide very complex models for an otherwise simple problem.
You may be wondering whether this is it, or if there are more appropriate, versatile, and simple solutions. You may also wonder whether your code is efficient and performant or if there is a better way to program it. You search online, but you can’t find consolidated resources on feature engineering for forecasting. Maybe just blogs? So you may start thinking: how are things really done in the industry?
In this course, you will find answers to those questions. Throughout the course, you will learn multiple ways to create features for forecasting with traditional regression models and how to implement them elegantly using Python.
You will leverage the power of Python’s open-source ecosystem, including the libraries Pandas, Scipy, Statsmodels, Scikit-learn, and special packages for feature engineering like Feature-engine and Category encoders. Finally, we will show you how you can begin to automate this process with libraries like tsfresh and featuretools.
By the end of the course, you'll be able to combine all of your feature engineering steps into a single streamlined pipeline, allowing you to bring your predictive models into production with maximum efficiency.
Why take this course
There is no single place to go to learn about feature engineering for forecasting. Even after hours of searching on the web, it is hard to find consolidated methods and best practices.
That is why we created this course. This course collates many techniques used worldwide for feature engineering from well-respected forecasting books, data competitions such as Kaggle and KDD, scientific articles, and the instructors’ experience as data scientists. This course is therefore a reference where you can learn about new methods and also revisit them along with their implementation in code, so you can always create the features that you need.
This course is taught by lead data scientists with experience in the use of machine learning in finance, insurance, health, and e-commerce. Sole is also a book author and the lead developer of a Python open source library for feature engineering. Kishan is an experienced forecaster with a PhD in Physics in applied large scale time-series analysis and modelling of cardiac arrhythmias.
This comprehensive feature engineering course contains over 100 lectures spread across approximately 10 hours of video, and ALL topics include hands-on Python code examples that you can use for reference, practice, and reuse in your own projects.
And there is more:
- The course is constantly updated to include new feature engineering methods.
- Notebooks are regularly refreshed to ensure all methods are carried out with the latest releases of the Python libraries, so your code will never break.
- The course combines videos, presentations, and Jupyter notebooks to explain the methods and show their implementation in Python.
- The curriculum was developed over a period of two years with continuous research in the field of forecasting to bring you the latest technologies, tools, and trends.
Want to know more? Read on...
The course comes with a 30-day money-back guarantee, so you can sign up today with no risk.
So what are you waiting for? Enrol today and join the world's most comprehensive course on feature engineering for time series forecasting.