Dr. Ernest P. Chan & Dr. Roger Hunter – Data & Feature Engineering for Trading
$55.00
Description
AUTHORS
Dr. Ernest P. Chan
Dr. Roger Hunter
LEVEL
Intermediate
How many times have you created a strategy that performed well in backtesting but failed to make money in live markets? This course is essential for building robust machine learning strategies that can be executed on trading platforms. It teaches data cleaning for financial datasets using real-world examples.
SKILLS COVERED
Data Engineering
- Financial data cleaning
- Exploratory data analysis
- Data type nuances
- Survivorship & Look-ahead Bias
Feature Engineering
- Triple barrier method
- Dollar and volume bars
- Stationarity
- Fractional differentiation
Python
- Itertools
- Numpy
- Pandas
- Matplotlib
- Pickle
COURSE FEATURES
- Lifetime Access to the course
- Downloadable code
- Video based course
- Sample Strategy for Live Trading
LEARNING TRACK
Machine Learning Strategy Development and Live Trading
PREREQUISITES
You should be familiar with basic machine learning principles such as train and test datasets. Beyond that, there are no formal prerequisites: anyone familiar with financial markets data can enroll in the course.
AFTER THIS COURSE YOU’LL BE ABLE TO
Preprocess price data to resolve outliers, duplicate values, multiple stock classes, survivorship bias, and look-ahead bias issues.
Work with sentiment data to identify structural breaks and aggregate categorical features.
Examine fundamental data and resolve multiple data merging issues.
Create features and target variables for machine learning models.
Explain the various challenges associated with financial data.
SYLLABUS
Introduction to the Course
In this introductory section, you will learn why data engineering and feature engineering matter, whether in your personal trading or in an institutional setting. Preprocessing a financial dataset is essential to make it suitable for analysis. Extracting features from the datasets to feed into machine learning algorithms, and setting the target variable for a particular ML problem, can increase the predictive power of your algorithm.
- Introduction by Dr. Ernest Chan (4m 35s)
- Course Overview (4m 22s)
- Quantra Features and Guidance (2m 25s)
Challenges in Financial Data Engineering
Trading strategies often look great in backtesting but fail to live up to expectations in live trading. Incorrect financial data can produce inaccurate inferences, and data whose flaws go unidentified is of little use. Learn the six most common challenges in financial datasets.
- Challenges in Financial Data Engineering (3m 8s)
- Survivorship Bias (2m)
- Alternative Data (2m)
Exploratory Data Analysis in Finance
Exploring the data builds familiarity with it. After exploring the data, you will be able to describe what it contains and what its characteristics are. Exploration also helps you identify irregularities and anomalies and discover patterns and relationships in the data.
- Closer Look At the Data (2m 4s)
- Importance of EDA (2m)
- Python Pickle (2m)
- Adjusted Close Price (2m)
- How to Use Jupyter Notebook? (1m 54s)
- Examining the OHLCV Data (10m)
- Read a Pickle File (5m)
- Find Null Values (5m)
- Generate Descriptive Statistics (5m)
- Irregularities (2m 56s)
- Stock Classes (2m)
- Minimum Value of Adjusted Close (2m)
- Dataframe Profiling (2m)
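The exploratory steps listed in this section (reading a pickle file, finding null values, generating descriptive statistics, checking the minimum adjusted close) can be sketched in a few lines. This is only an illustration under stated assumptions: the rows, the column names, and the file name `prices.pkl` are hypothetical, and plain Python is used so the sketch runs without extra packages, although pandas would be the more usual tool for this work.

```python
import pickle
import statistics
import tempfile
from pathlib import Path

# Hypothetical OHLCV-style rows; None marks a missing value.
rows = [
    {"date": "2021-01-04", "symbol": "AAA", "adj_close": 10.0, "volume": 1200},
    {"date": "2021-01-05", "symbol": "AAA", "adj_close": None, "volume": 900},
    {"date": "2021-01-06", "symbol": "AAA", "adj_close": 10.4, "volume": 0},
    {"date": "2021-01-07", "symbol": "AAA", "adj_close": -1.0, "volume": 1500},
]

# Write the rows to a pickle file and read them back.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "prices.pkl"  # hypothetical file name
    with open(path, "wb") as fh:
        pickle.dump(rows, fh)
    with open(path, "rb") as fh:
        loaded = pickle.load(fh)

# Count missing values per column.
null_counts = {col: sum(r[col] is None for r in loaded) for col in loaded[0]}

# Descriptive statistics on the non-missing adjusted closes.
closes = [r["adj_close"] for r in loaded if r["adj_close"] is not None]
summary = {
    "count": len(closes),
    "min": min(closes),
    "max": max(closes),
    "mean": statistics.mean(closes),
}

# A negative minimum adjusted close is an irregularity worth investigating.
has_negative_price = summary["min"] < 0
```

Checking the minimum is a cheap way to surface irregularities: a price below zero cannot be a valid quote for a stock, so it points at a data problem rather than a market event.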
Survivorship Bias for Stock Data
We often backtest on the stock universe that survived until today and ignore the stocks that no longer exist. This causes survivorship bias in the backtest. In this section, you will learn the concept of survivorship bias, why it is important to use survivorship-bias-free data in backtesting, and how to deal with it. You will also learn to identify delisted stocks in the stock universe.
- Survivorship Bias (2m 55s)
- Stock Disappearance (2m)
- Dealing With Survivorship Bias (2m)
- Buy-Low Price Strategy (2m)
- Effects of Survivorship Bias (2m)
- Delisted Stocks (10m)
- Maximum Date for Each Symbol (5m)
- List of Delisted Stocks (5m)
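One way to picture the "maximum date for each symbol" step is the sketch below. It assumes, purely for illustration, that a symbol counts as delisted when its last available date falls before the last date in the whole dataset; the symbols and dates are hypothetical, and a gap in the data feed would produce the same signal, so flagged symbols still need a manual check.

```python
from datetime import date

# Hypothetical (symbol, trading date) records.
records = [
    ("AAA", date(2021, 1, 4)),
    ("AAA", date(2021, 1, 5)),
    ("AAA", date(2021, 1, 6)),
    ("BBB", date(2021, 1, 4)),
    ("BBB", date(2021, 1, 5)),
    ("CCC", date(2021, 1, 5)),
    ("CCC", date(2021, 1, 6)),
]


def last_date_per_symbol(rows):
    """Return the latest date on which each symbol appears."""
    last = {}
    for symbol, day in rows:
        if symbol not in last or day > last[symbol]:
            last[symbol] = day
    return last


def delisted_symbols(rows):
    """Symbols whose data stops before the dataset's final date (assumed rule)."""
    last = last_date_per_symbol(rows)
    final_day = max(last.values())
    return sorted(s for s, d in last.items() if d < final_day)


delisted = delisted_symbols(records)
```

Keeping such symbols in the backtest universe, rather than dropping them, is what makes the data survivorship-bias free.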
Redundant Stocks Data
Learn to check for data redundancy. It is highly unlikely that two stocks or financial instruments will have the same prices across many dates. It can happen on a few dates by coincidence, but if it occurs across many dates, and consecutively, then something might be wrong with the data.
- Dealing With Redundant Stocks (2m 43s)
- Effects of Redundant Data (2m)
- Steps to Find Redundant Data (2m)
- Handling Duplicate Stock Data (10m)
- Create Stock Pairs (5m)
- Compare the Stock Prices (5m)
- Calculate Number of Duplicates (5m)
- Reasons for Redundancy (2m)
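The pair-and-compare steps above can be sketched with `itertools.combinations`. The symbols, prices, and the `min_matches` threshold are hypothetical, and this simplified version only counts matching dates; it does not check whether the matches are consecutive.

```python
from itertools import combinations

# Hypothetical closing prices keyed by symbol, then by date.
closes = {
    "AAA": {"2021-01-04": 10.0, "2021-01-05": 10.2, "2021-01-06": 10.1},
    "AAB": {"2021-01-04": 10.0, "2021-01-05": 10.2, "2021-01-06": 10.1},
    "CCC": {"2021-01-04": 10.0, "2021-01-05": 55.0, "2021-01-06": 54.0},
}


def duplicate_counts(prices):
    """For every pair of symbols, count the shared dates with identical prices."""
    counts = {}
    for a, b in combinations(sorted(prices), 2):
        shared_dates = prices[a].keys() & prices[b].keys()
        counts[(a, b)] = sum(prices[a][d] == prices[b][d] for d in shared_dates)
    return counts


def suspicious_pairs(prices, min_matches):
    """Pairs whose prices coincide on at least min_matches dates."""
    return [pair for pair, n in duplicate_counts(prices).items() if n >= min_matches]


counts = duplicate_counts(closes)
flagged = suspicious_pairs(closes, min_matches=3)
```

A single coincidental match, as between AAA and CCC here, stays below the threshold, while a pair that matches on every date is flagged for inspection.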
Multiple Stock Classes: One or All?
A listed company can issue multiple classes of stock, and these classes can carry different voting rights. Learn whether you should keep the data for all the stock classes or just one and, if only one, which class to keep and which to remove.
- Dealing With Multiple Stock Classes (2m 1s)
- One Stock Class (2m)
- Stock Class (2m)
- Retain All Classes (2m)
- Multiple Stock Classes (10m)
- Identify Stock Classes (5m)
- Unique Symbols (5m)
Outliers: How to Identify and Deal With Them?
This section covers outliers. An outlier is a data point that differs significantly from other data points. It can stem from a data quality issue or reflect a real event. Learn how to identify and deal with outliers.
- Dealing With Outliers
- Outliers (2m)
- Dealing With Outliers (2m)
- Inflated Profits (2m)
- Dealing With Outliers (10m)
- Number of Trading Days With Zero Volume (5m)
- Sort Dataframe by Returns (5m)
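Two of the checks named above, counting zero-volume days and sorting by returns, can be sketched as follows. The price series is hypothetical and contains a deliberately misplaced decimal point, and plain Python lists stand in for the dataframe the section titles refer to.

```python
# Hypothetical daily rows: (date, close, volume).
rows = [
    ("2021-01-04", 100.0, 500),
    ("2021-01-05", 101.0, 0),
    ("2021-01-06", 1.01, 0),    # suspicious: looks like a misplaced decimal point
    ("2021-01-07", 102.0, 700),
    ("2021-01-08", 103.0, 650),
]

# Number of trading days with zero volume.
zero_volume_days = sum(1 for _, _, volume in rows if volume == 0)

# Daily simple returns, each tagged with the date it is realised on.
returns = [
    (rows[i][0], rows[i][1] / rows[i - 1][1] - 1)
    for i in range(1, len(rows))
]

# Sort by absolute return, largest first, so extreme moves surface at the top.
largest_moves = sorted(returns, key=lambda item: abs(item[1]), reverse=True)
```

The bad print produces two extreme returns back to back (a collapse followed by a rebound of nearly 100 times), which is a typical signature of a data error rather than a real move, and a backtest that trades on it would show inflated profits.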
News Data: Numerical Features
This section covers how news data can be sourced, within the notebook, via webhose.io.
- Overview of the News Data (1m 20s)
- Numerical Features (2m 34s)
- Relevance (2m)
- Novelty (2m)
- Combine Numerical Features (2m 15s)
- Combine Numerical Features (2m)
- Calculate Feature Score (2m)
- Aggregate News Items Daily (2m)
- Numerical Features (10m)
- Calculate Feature Score (5m)
- Calculate Trading Date for Each Headline (5m)
- Calculate Daily Feature Score (5m)
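The last three steps in this list (feature score, trading date per headline, daily score) might look like the sketch below. Every specific here is an assumption made for illustration, not the course's definition: the score is taken as sentiment × relevance × novelty, the market is assumed to close at 16:00, headlines at or after the close are assigned to the next weekday, holidays are ignored, and the daily score is a plain mean.

```python
from collections import defaultdict
from datetime import datetime, time, timedelta
from statistics import mean

MARKET_CLOSE = time(16, 0)  # assumed closing time

# Hypothetical headlines: (timestamp, sentiment, relevance, novelty).
headlines = [
    (datetime(2021, 1, 4, 10, 0), 0.5, 1.0, 1.0),
    (datetime(2021, 1, 4, 15, 0), -0.5, 0.5, 1.0),
    (datetime(2021, 1, 8, 17, 0), 0.8, 0.5, 0.5),  # Friday, after the close
]


def feature_score(sentiment, relevance, novelty):
    """Assumed combination: sentiment weighted by relevance and novelty."""
    return sentiment * relevance * novelty


def trading_date(timestamp):
    """First weekday on which a headline could be traded (holidays ignored)."""
    day = timestamp.date()
    if timestamp.time() >= MARKET_CLOSE:
        day += timedelta(days=1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day += timedelta(days=1)
    return day


def daily_scores(items):
    """Average the feature scores of all headlines mapped to each trading date."""
    buckets = defaultdict(list)
    for timestamp, sentiment, relevance, novelty in items:
        score = feature_score(sentiment, relevance, novelty)
        buckets[trading_date(timestamp)].append(score)
    return {day: mean(scores) for day, scores in sorted(buckets.items())}


daily = daily_scores(headlines)
friday_evening = trading_date(datetime(2021, 1, 8, 17, 0))
```

Mapping each headline to the first session in which it could be acted on keeps the daily feature free of look-ahead bias: news published after Friday's close only becomes tradable on Monday.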
News Data: Categorical Features
- Categorical Features (9s)
- Categorical Features (2m)
- One-Hot Encoding (2m)
- Aggregating Categorical Attributes (2m 16s)
- Aggregate Categorical Features (2m)
- Issues With Mean Aggregation (2m)
- Limitations of One-Hot Encoding (2m)
- Aggregating Categorical Features (10m)
- One-Hot Encoding (5m)
- Aggregate Using Mean (5m)
- Recap (1m 44s)
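One-hot encoding followed by mean aggregation can be sketched as below. The category names and dates are hypothetical, and the plain-Python helpers stand in for whatever dataframe operations are used in practice.

```python
# Hypothetical news categories observed on each trading date.
news_by_day = {
    "2021-01-04": ["earnings", "merger", "earnings"],
    "2021-01-05": ["lawsuit"],
}

# Fixed, sorted category list so every day yields a vector of the same length.
categories = sorted({c for items in news_by_day.values() for c in items})


def one_hot(value, all_categories):
    """Encode a single category as a 0/1 vector."""
    return [1 if value == c else 0 for c in all_categories]


def mean_aggregate(values, all_categories):
    """Average the one-hot vectors of a day's news items, column by column."""
    encoded = [one_hot(v, all_categories) for v in values]
    return [sum(column) / len(encoded) for column in zip(*encoded)]


daily_shares = {
    day: mean_aggregate(items, categories) for day, items in news_by_day.items()
}
```

The mean turns each day into the share of headlines per category, which also illustrates one issue with this aggregation: a day with three headlines and a day with one both produce vectors that sum to 1, so the number of news items is lost.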
Structural Breaks in Financial Data
Sometimes there is an unexpected and prolonged change in the structure of time-series data. This is called a structural break. Learn to identify structural breaks in the sentiment data and list possible ways to deal with them.
- Structural Breaks (2m 48s)
- Structural Breaks in Time Series Data (2m)
- Dealing With Structural Breaks (2m)
- Effects of Structural Breaks (2m)
Fundamental Data: Merge Them Correctly
This section covers merging fundamental data from two popular data sources, Sharadar and Wall Street Horizon (WSH). Although these sources are not free, the notebook shows what the data looks like and how to parse it.
- Precap of Fundamental Data (58s)
- Sources of Fundamental Data (3m 18s)
- Sharadar Data (2m)
- Announcement and Filing Date (2m)
- Actual Vs Expected Earnings Date (2m)
- Sharadar Data (10m)
- Dimension Fields (2m)
- Why Dimension Fields? (2m)
- Wall Street Horizon Data (10m)
- Which Format? (2m)
- Examining the Data (2m 18s)
- Challenges in the Datasets (2m)
- Identify the Issues (2m)
- Multiple EPS Values (2m)
- Challenges in Merging Dataset (1m 5s)
- Common Tickers (2m)
- Investigate the Issues (2m)
Look-ahead Bias: Deceptive Returns
Get introduced to the scenarios in which data from the future is used in backtesting. This leads to deceptive returns while testing. Learn ways to get around this ubiquitous bias.
- Futures Prerequisite (10m)
- Futures Contract (2m)
- Margin Requirements (2m)
- Settlement Price (2m)
- Roll Return (2m)
- Calculate the Roll Returns (2m)
Look-ahead Bias in Futures
- What is Look Ahead Bias? (2m)
- Good Results (2m)
- Futures’ Mean (2m)
- Remove Bias (2m)
- Calendar Spread Strategy Prerequisite (10m)
- Calendar Spread (2m)
- Disadvantages of CS Strategy (2m)
- Look-ahead Bias in CS Strategy
- Problem With Two Instruments (2m)
- Solving the Problem (2m)
- Illiquid Futures (2m)
- Bid-Ask Time Quote (2m)
- Liquid Futures (2m)
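A common way look-ahead bias creeps into a backtest is computing a mean over the whole sample and then using it at every point in time. The sketch below contrasts that with an expanding mean, which uses only the prices available up to each bar. It is a generic illustration with hypothetical prices, not the specific futures example from the course.

```python
# Hypothetical futures settlement prices in time order.
prices = [10.0, 12.0, 14.0, 16.0]


def deviation_from_full_mean(series):
    """Biased: the mean uses every price, including ones from the future."""
    full_mean = sum(series) / len(series)
    return [p - full_mean for p in series]


def deviation_from_expanding_mean(series):
    """Unbiased: at each step the mean uses only prices seen so far."""
    deviations = []
    running_total = 0.0
    for count, price in enumerate(series, start=1):
        running_total += price
        deviations.append(price - running_total / count)
    return deviations


biased = deviation_from_full_mean(prices)
unbiased = deviation_from_expanding_mean(prices)
```

On the first bar the biased signal already says the price is 3 below its mean, which could only be known after seeing the later prices. The two versions agree only on the final bar, where the full history is genuinely available.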
Types of Bars: Features Extraction
Market transaction data can be sampled in a variety of ways, for example by time, number of transactions, volume, or value traded. Some ways might be more useful than others. Get introduced to the criteria that can be used to sample transaction data, and learn how the resulting bars differ in their statistical properties.
- Tick and Time Bars
- True for a Bar (2m)
- Difference Between Time and Tick Bar (2m)
- Limitation of Time Bar (2m)
- Creating Time Bars (10m)
- Resample Price Data (5m)
- Calculate Open Price of Time Bar (5m)
- Calculate Total Volume of Bars (5m)
- Creating Tick Bars (10m)
- Aggregate Price Data (5m)
- Aggregate Volume Data (5m)
- Volume Bars (2m)
- Limitations of Volume Bars (2m)
- Volume Bar (2m)
- High Value of Volume Bar (2m)
- Limitation of Tick Bar (2m)
- Creating Volume Bars (10m)
- Create New Group ID (5m)
- Dollar Bars (1m 48s)
- What Are Dollar Bars? (2m)
- Identical Bars (2m)
- Advantages of Dollar Bars (2m)
- Creating Dollar Bars (10m)
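Tick, volume, and dollar bars differ only in what is accumulated before a bar is closed, so one helper can sketch all three. The ticks and thresholds below are hypothetical, and dropping the trailing partial bar is a simplifying choice made for this illustration. Time bars, which sample by the clock instead, are not shown.

```python
# Hypothetical ticks in time order: (price, size).
ticks = [(10.0, 50), (10.1, 60), (10.2, 30), (10.0, 80), (9.9, 40), (10.3, 70)]


def summarise(chunk):
    """Collapse a group of ticks into one OHLCV bar."""
    prices = [price for price, _ in chunk]
    return {
        "open": prices[0],
        "high": max(prices),
        "low": min(prices),
        "close": prices[-1],
        "volume": sum(size for _, size in chunk),
    }


def make_bars(tick_data, threshold, measure):
    """Close a bar whenever the accumulated measure reaches the threshold."""
    bars, current, total = [], [], 0.0
    for price, size in tick_data:
        current.append((price, size))
        total += measure(price, size)
        if total >= threshold:
            bars.append(summarise(current))
            current, total = [], 0.0
    return bars  # any unfinished trailing bar is dropped


tick_bars = make_bars(ticks, 2, lambda price, size: 1)                 # every 2 trades
volume_bars = make_bars(ticks, 100, lambda price, size: size)          # every 100 units
dollar_bars = make_bars(ticks, 1500, lambda price, size: price * size) # every 1500 in value
```

The accumulate-and-reset loop plays the role of the "group ID" in a dataframe version: every tick that arrives before the threshold is hit belongs to the same bar.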
Information Bars: Market Order Imbalances
In this section, you will get introduced to some advanced ways of sampling transaction data based on market order imbalances. You will also learn about market imbalances, run bars, and their implementation.
- Information Bars (2m 22s)
- Measure of Information (2m)
- Imbalance Bar (2m)
- Difference Between Run and Tick Bars (2m)
- Imbalance Bars (10m)
- Calculate Rolling Imbalance (5m)
- Additional Reading (10m)
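A simplified tick imbalance bar can be sketched as follows. Each trade is signed with the tick rule (+1 after an uptick, -1 after a downtick, previous sign carried over when the price is unchanged), and a bar is closed when the cumulative signed count reaches a threshold. The fixed threshold and the initial sign of +1 are assumptions made to keep the sketch short; fuller treatments estimate the threshold dynamically from past bars.

```python
# Hypothetical trade prices in time order.
prices = [10.0, 10.1, 10.2, 10.2, 10.1, 10.2, 10.3, 10.4, 10.5]


def tick_signs(series):
    """Sign each price change with the tick rule, carrying the last sign on ties."""
    signs = []
    last_sign = 1  # assumed sign before the first price change
    for previous, current in zip(series, series[1:]):
        if current > previous:
            last_sign = 1
        elif current < previous:
            last_sign = -1
        signs.append(last_sign)
    return signs


def imbalance_bar_ends(series, threshold):
    """Indices (into series) at which the cumulative imbalance hits the threshold."""
    ends = []
    imbalance = 0
    for index, sign in enumerate(tick_signs(series), start=1):
        imbalance += sign
        if abs(imbalance) >= threshold:
            ends.append(index)
            imbalance = 0  # start accumulating for the next bar
    return ends


signs = tick_signs(prices)
bar_ends = imbalance_bar_ends(prices, threshold=3)
```

The first bar closes after only three trades because they all push in one direction, while the second takes five trades because a downtick offsets part of the buying. Bars therefore form faster when order flow is one-sided.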
Data Labelling for Better Outcomes
Supervised machine learning algorithms need both an input and a label to learn the nuances of real data. In financial time series, the input is generally a window of price data, whereas the ground truth, or labels, must be generated explicitly based on the position that needs to be taken. Learn methods such as the fixed-time horizon and triple barrier methods that can be used to label your data.
- Fixed-Time Horizon (3m 28s)
- ML Paradigm Labelling (2m)
- Labelling Fixed Threshold (2m)
- The Fixed-Time Horizon Method (10m)
- Calculate Future Returns (5m)
- Labelling the Target Class (2m)
- Calculate Daily Returns (5m)
- Calculate Rolling Standard Deviation (5m)
- Triple Barrier Method
- Fixed-Horizon V/s Triple-Barrier (2m)
- Calculating Horizontal Bars (2m)
- Horizontal Bars and Volatility (2m)
- Finding the Target Class (2m)
- Vertical Bar (2m)
- The Triple Barrier Method (10m)
- Calculate Daily Returns (5m)
- Call Triple Barrier Method (5m)
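The two labelling methods named in this section can be compared on the same price path with the sketch below. The prices, thresholds, and function names are hypothetical. The horizontal barriers are passed in as fixed percentages for brevity, although they would typically be scaled by a rolling volatility estimate, and labelling a vertical-barrier exit as 0 is one convention among several.

```python
# Hypothetical daily closing prices.
prices = [100.0, 101.0, 99.0, 97.0, 103.0, 106.0]


def fixed_horizon_label(series, start, horizon, threshold):
    """Label by the return exactly `horizon` bars ahead; None if data runs out."""
    if start + horizon >= len(series):
        return None
    future_return = series[start + horizon] / series[start] - 1
    if future_return > threshold:
        return 1
    if future_return < -threshold:
        return -1
    return 0


def triple_barrier_label(series, start, upper_pct, lower_pct, max_holding):
    """Label by whichever barrier is touched first after entry at `start`."""
    entry = series[start]
    upper = entry * (1 + upper_pct)  # profit-taking barrier
    lower = entry * (1 - lower_pct)  # stop-loss barrier
    last_bar = min(start + max_holding, len(series) - 1)  # vertical barrier
    for i in range(start + 1, last_bar + 1):
        if series[i] >= upper:
            return 1
        if series[i] <= lower:
            return -1
    return 0  # neither horizontal barrier touched before the vertical one


fixed_label = fixed_horizon_label(prices, start=0, horizon=5, threshold=0.02)
barrier_label = triple_barrier_label(
    prices, start=0, upper_pct=0.02, lower_pct=0.02, max_holding=5
)
```

On this path the fixed-horizon method labels the entry a winner because the price ends 6% higher, while the triple barrier method labels it a loser because the 2% stop is hit on the third day. The second label reflects what a position with a stop-loss would actually have experienced.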
Why Stationary Features?
The right input to a machine learning model can make all the difference. Learn about the need for stationary features. Understand the tradeoff between preserving price-level information and achieving stationarity. Learn about fractional differentiation to create effective features.
- Dealing With Features Selection (3m 10s)
- Price Series (2m)
- Series Stationarity (2m)
- Adjusted Close Price (2m)
- Fractional Differentiation (10m)
- Calculating Binomial Distribution Weights (2m)
- Calculate the ADF Statistics (5m)
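Fractional differentiation applies binomial-series weights to lagged values, with the recursion w_0 = 1 and w_k = -w_{k-1} (d - k + 1) / k. The sketch below computes those weights and applies them over a fixed window. The function names and the fixed-window truncation are choices made for this illustration, and the ADF test used to check stationarity is not shown because it needs a statistics package.

```python
def frac_diff_weights(d, size):
    """First `size` binomial weights for differencing of order d."""
    weights = [1.0]
    for k in range(1, size):
        weights.append(-weights[-1] * (d - k + 1) / k)
    return weights


def frac_diff(series, d, window):
    """Fractionally difference a series using a fixed window of past values."""
    weights = frac_diff_weights(d, window)
    return [
        sum(weights[k] * series[t - k] for k in range(window))
        for t in range(window - 1, len(series))
    ]


half_weights = frac_diff_weights(0.5, 4)   # slowly decaying memory
unit_weights = frac_diff_weights(1.0, 3)   # ordinary first difference
first_diff = frac_diff([1.0, 3.0, 6.0, 10.0], d=1.0, window=2)
```

With d = 1 the weights collapse to (1, -1, 0, ...), which is ordinary differencing and discards the price level entirely. With a fractional d such as 0.5 the weights decay gradually, so the transformed series keeps some memory of past levels while moving toward stationarity.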
Python Installation
Learn to install the Python environment on your local machine.
- Python Installation Overview (2m 18s)
- Flow Diagram (10m)
- Install Anaconda on Windows (10m)
- Install Anaconda on Mac (10m)
- Know your Current Environment (2m)
- Troubleshooting Anaconda Installation Problems (10m)
- Creating a Python Environment (10m)
- Changing Environments (2m)
- Quantra Environment (2m)
- Troubleshooting Tips For Setting Up Environment (10m)
- How to Run Files in Downloadable Section? (10m)
- Troubleshooting For Running Files in Downloadable Section (10m)
Summary
This section summarizes the course and provides the downloadable files, which include the data modules as well as the strategy notebooks.
- Summary (2m 50s)
- Downloadable Code (2m)
ABOUT AUTHOR
Dr. Ernest P. Chan
Dr. Ernest Chan is the Managing Member of QTS Capital Management, LLC, a commodity pool operator and trading advisor. QTS manages a hedge fund as well as individual accounts. He worked in IBM's human language technologies group, where he developed a natural language processing system that was ranked 7th globally in the defense advanced research project competition. He also worked with Morgan Stanley’s artificial intelligence and data mining group, where he developed trading strategies.
Dr. Roger Hunter
Dr. Roger Hunter is the Chief Technology Officer of QTS. He is responsible for designing a high-performance automated execution system that achieved negative slippage. Roger is a serial entrepreneur, having founded profitable hedge funds and software firms. He was formerly a professor of mathematics at New Mexico State University and obtained his Ph.D. in Mathematics from the Australian National University.
WHY QUANTRA?
- Gain more in less time
- Get taught by practitioners
- Learn at your own pace
- Get data & strategy models to practice on your own
Commonly Asked Questions:
- Business Model Innovation: Acknowledge the reality of a legitimate enterprise! Our approach involves the coordination of a collective purchase, in which the costs are shared among the participants. We utilize this cash to acquire renowned courses from sale pages and make them accessible to individuals with restricted financial resources. Our clients appreciate the affordability and accessibility we provide, despite the authors’ concerns.
- Data & Feature Engineering for Trading Course
- There are no scheduled coaching calls or sessions with the author.
- Access to the author’s private Facebook group or web portal is not permitted.
- No access to the author’s private membership forum.
- There is no direct email support available from the author or their team.