Financial Time Series Model in Python— Part 1 | by Chloe Data Blog | Mar, 2022 |



Chloe Data Blog · Mar 30 · 9 min read

This is the second semester of my Master's course, and I have just finished a project for my Financial Management class: "Diversification benefits of cryptocurrency in portfolio management". The topic is very interesting. I learnt a lot of techniques in financial data modeling and would like to share them with you in a series of 5 articles.

For the full Python implementation of this article, please have a look at my GitHub.

In Part 1, I will demonstrate the basic techniques in working with Financial data, how to get data, process and generate time series charts.

For me, the data importing and processing part of any data analytics work is very (if not the most) important. In line with the "garbage in, garbage out" maxim, we should strive to have data of the highest possible quality and preprocess it correctly for later use with statistical and machine learning algorithms. Our analyses depend heavily on the input data, and no sophisticated model can compensate for poor-quality input.

In this article, I will introduce how to import stock price data from 4 different sources: Yahoo Finance (the most popular), Quandl, Intrinio and pandas-datareader. Then, I will show you how to compute daily simple (normal) returns and log returns, how to change the data frequency, and how to visualize stock price data in matplotlib.

Bonus: how to visualize outliers in a chart. Let's go!

1. Download Financial Data from Python libraries

First we need to import the necessary libraries.

1.1 Getting data from Yahoo Finance

There are many sources to download stock price and macro data, and one of the most popular sources of free financial data is Yahoo Finance. It contains not only historical and current stock prices in different frequencies (daily, weekly, monthly), but also calculated metrics, such as the beta (a measure of the volatility of an individual asset in comparison to the volatility of the entire market) and many more.

In this section, we focus on retrieving historical stock prices.

The result of the request is a DataFrame (2,767 rows) containing daily Open, High, Low, and Close ( OHLC ) prices, as well as the adjusted close price and volume.
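The original code for this request is not reproduced here, so the following is only a minimal sketch of what such a download could look like, assuming the yfinance package. The ticker "SE" (assumed here to be Sea Limited's symbol) and the dates are illustrative, and the `downloader` argument is a hypothetical hook added so the function can be tried without network access. The exact column layout of the result may differ between yfinance versions.

```python
import pandas as pd


def download_prices(ticker, start, end, downloader=None):
    """Return a DataFrame of daily prices for `ticker` between `start` and `end`.

    `downloader` is a hypothetical hook (not part of yfinance): any callable
    taking the same keyword arguments can stand in for `yf.download`.
    """
    if downloader is None:
        # Imported lazily so this module loads even without yfinance installed.
        import yfinance as yf
        downloader = yf.download

    df = downloader(ticker, start=start, end=end, progress=False)
    if df.empty:
        raise ValueError(f"No data returned for {ticker!r}")
    return df


# Example call (needs network access and `pip install yfinance`):
# df_yahoo = download_prices("SE", start="2018-01-01", end="2022-03-30")
# print(df_yahoo.shape)
```

Keeping the download behind a small function makes it easy to swap the data source later without touching the processing code.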

Tips:

We can set auto_adjust=True to download only the adjusted prices.

We can additionally download dividends and stock splits by setting actions='inline'.

1.2 Getting data from Quandl

Quandl is a provider of alternative data products for investment professionals, and offers an easy way to download data, also via a Python library. The drawback of this database is that as of April 2018 it is no longer supported (meaning there is no recent data).

Before downloading the data, we need to create an account at Quandl (https://www.quandl.com) and then we can find our personal API key in our profile (https://www.quandl.com/account/profile).

We can search for data of interest using the search functionality (https://www.quandl.com/search).

The result of the request is a DataFrame (2,767 rows) containing the daily OHLC (Open,High,Low,Close) prices, the adjusted prices, dividends, and potential stock splits.
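As a rough sketch of the Quandl request (the original code is not shown here), the snippet below assumes the quandl Python package with its `quandl.ApiConfig.api_key` setting and `quandl.get` function. The dataset code "WIKI/AAPL", the dates and the `client` hook are illustrative assumptions, not the article's own values.

```python
def download_quandl(dataset, start_date, end_date, api_key, client=None):
    """Fetch `dataset` from Quandl between `start_date` and `end_date`.

    `client` is a hypothetical hook for testing; by default the real
    `quandl` module is imported and used.
    """
    if client is None:
        # Imported lazily so this module loads even without quandl installed.
        import quandl
        client = quandl

    # Authenticate with the personal API key from the Quandl profile page.
    client.ApiConfig.api_key = api_key
    return client.get(dataset=dataset, start_date=start_date, end_date=end_date)


# Example call (needs network access, `pip install quandl` and a valid key):
# df_quandl = download_quandl("WIKI/AAPL", "2000-01-01", "2010-12-31",
#                             api_key="YOUR_API_KEY")
```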

1.3 Getting data from Intrinio: Another source of financial data is Intrinio, which offers access to its free (with limits) database.

Before downloading the data, we need to register at https://intrinio.com to obtain the API key. Please see https://github.com/intrinio/python-sdk for the full list of available indicators to download.

The resulting DataFrame (2,771 rows) contains the OHLC prices and volume, as well as their adjusted counterparts.

1.4 Getting data from pandas-datareader (my preference)

pandas-datareader is one of the most popular and convenient libraries for importing stock price data in Python. It can be used to access public financial data from the Internet and import it into Python as a DataFrame.

Some popular data sources available through pandas_datareader are: Yahoo Finance, Google Finance, Morningstar, IEX, Robinhood, Enigma, Quandl, FRED, World Bank, OECD, and many more.

In the section below, I will take you through a tutorial on pandas-datareader to collect stock price data from Yahoo Finance.

All of the data sources mentioned above provide data in a different format, so collecting data from each source follows a different method.

The result is a dataframe of SEA’s stock price with HLOC, volume and Adj Close data.
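A minimal sketch of the pandas-datareader call could look as follows. It assumes the `DataReader` function from `pandas_datareader.data`, the ticker "SE" and the dates are illustrative, and the `reader` hook is hypothetical. Whether the "yahoo" source currently works depends on the installed pandas-datareader version and on Yahoo's side, so treat this as a starting point only.

```python
def download_with_datareader(ticker, start, end, source="yahoo", reader=None):
    """Fetch prices for `ticker` from `source` via pandas-datareader.

    `reader` is a hypothetical hook for testing; by default the real
    `pandas_datareader.data.DataReader` is used.
    """
    if reader is None:
        # Imported lazily so this module loads without pandas-datareader.
        from pandas_datareader import data as pdr
        reader = pdr.DataReader

    return reader(ticker, data_source=source, start=start, end=end)


# Example call (needs network access and `pip install pandas-datareader`):
# df_se = download_with_datareader("SE", "2021-08-01", "2022-03-30")
# print(df_se.head())
```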

2. Process data

Converting prices to returns and adding CPI columns

Asset prices are usually non-stationary, meaning that their statistics, such as the mean and variance (mathematical moments), change over time. This could also mean observing some trends or seasonality in the price series. By transforming the prices into returns, we attempt to make the time series stationary, which is a desired property in statistical modeling.

There are two types of returns:

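1. Simple returns: They aggregate over assets; the simple return of a portfolio is the weighted sum of the returns of the individual assets in the portfolio. Simple returns are defined as:

R_t = (P_t - P_{t-1}) / P_{t-1} = P_t / P_{t-1} - 1

where P_t is the price of the asset at time t.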

2. Log returns: They aggregate over time. An example makes this easier to understand: the log return for a given month is the sum of the log returns of the days within that month. Log returns are defined as:

r_t = log(P_t / P_{t-1}) = log(P_t) - log(P_{t-1})

The difference between simple and log returns for daily/intraday data will be very small. The general rule, however, is that log returns are smaller in value than simple returns, because log(1 + x) <= x for any x > -1.

Let's see how to calculate both types of returns using SEA's stock prices. We could also add an additional CPI column (from Yahoo Finance) to compare the stock's return with the economy's inflation. We can then merge the inflation data with SEA's stock returns and account for inflation by using the following formula:

R^r_t = (1 + R_t) / (1 + π_t) - 1

where R_t is the simple return at time t and π_t is the inflation rate.

The result is a table with both the daily inflation_rate and real_rtn of SEA's stock, where real_rtn is the simple return adjusted for inflation.
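The calculations described above can be sketched with pandas and NumPy as follows. The prices and the constant inflation rate are made-up numbers for illustration, and the helper names `add_returns` and `add_real_return` are hypothetical. The column names follow the ones used in this article.

```python
import numpy as np
import pandas as pd


def add_returns(prices: pd.Series) -> pd.DataFrame:
    """Build a DataFrame with the price, simple returns and log returns."""
    df = pd.DataFrame({"adj_close": prices})
    # Simple return: P_t / P_{t-1} - 1
    df["simple_rtn"] = df["adj_close"].pct_change()
    # Log return: log(P_t / P_{t-1})
    df["log_rtn"] = np.log(df["adj_close"] / df["adj_close"].shift(1))
    return df


def add_real_return(df: pd.DataFrame, inflation_rate: pd.Series) -> pd.DataFrame:
    """Add the inflation rate and the inflation-adjusted simple return."""
    out = df.copy()
    # Align the inflation series with the dates of the return series.
    out["inflation_rate"] = inflation_rate.reindex(out.index)
    out["real_rtn"] = (1 + out["simple_rtn"]) / (1 + out["inflation_rate"]) - 1
    return out


# Synthetic prices and a made-up constant inflation rate, for illustration only.
idx = pd.date_range("2022-01-03", periods=4, freq="B")
prices = pd.Series([100.0, 110.0, 99.0, 99.0], index=idx)
inflation = pd.Series(0.001, index=idx)

returns = add_real_return(add_returns(prices), inflation)
print(returns.round(4))
```

The first row of each return column is NaN because there is no previous price to compare with.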

Changing data frequency

In our analysis, we sometimes want to analyse data at frequencies other than daily. The general rule of thumb for changing frequency can be broken down into the following:

Multiply/divide the log returns by the number of time periods.

Multiply/divide the volatility by the square root of the number of time periods.

But what is volatility?

Volatility is a statistical measure of the dispersion of returns for a given security or market index.

In most cases, the higher the volatility, the riskier the security. Volatility is often measured as either the standard deviation or variance between returns from that same security or market index. (Source: Investopedia)

Simply speaking, volatility is the standard deviation of returns.
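The rule of thumb for changing frequency can be illustrated with two small helper functions. The function names are hypothetical, and the figure of 252 trading days per year is a common convention assumed here rather than something taken from this article.

```python
import math

# Assumption: a commonly used number of trading days in a year.
TRADING_DAYS_PER_YEAR = 252


def scale_log_return(log_rtn: float, n_periods: float) -> float:
    """Scale an average per-period log return to a horizon of n_periods."""
    return log_rtn * n_periods


def scale_volatility(vol: float, n_periods: float) -> float:
    """Scale a per-period volatility to a horizon of n_periods."""
    return vol * math.sqrt(n_periods)


# Example: a daily volatility of 2% expressed as an annual figure.
annual_vol = scale_volatility(0.02, TRADING_DAYS_PER_YEAR)
print(f"Annualized volatility: {annual_vol:.2%}")
```

Going the other way (for example from annual to daily) uses the same functions with `n_periods` set to `1 / 252`.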

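In this article, the formula used for realized volatility is as follows:

RV = \sqrt{\sum_{t=1}^{T} r_t^2}

where r_t are the log returns within the period and T is the number of observations in that period.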

Step by step:

Download the data and calculate the log returns.

Calculate the realized volatility over the months.

Annualize the values by multiplying by √12, as we are converting from monthly values.

The upper chart shows the normal returns. The lower chart shows the monthly realized volatility.

We can see that the spikes in the realized volatility coincide with some extreme returns (which might be outliers). From this chart, we can also see that on a monthly basis, SEA's stock brings good returns.
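The monthly realized volatility steps can be sketched as below. This assumes realized volatility is taken as the square root of the sum of squared log returns within each month, which is a common definition. The synthetic log returns are random numbers used only to make the example runnable.

```python
import numpy as np
import pandas as pd


def realized_volatility(log_returns: pd.Series) -> float:
    """Square root of the sum of squared log returns in the period."""
    return np.sqrt(np.sum(log_returns ** 2))


def monthly_realized_volatility(log_rtn: pd.Series, annualize: bool = True) -> pd.Series:
    """Compute realized volatility per calendar month from daily log returns."""
    # Group the daily observations by calendar month.
    months = log_rtn.index.to_period("M")
    rv = log_rtn.groupby(months).apply(realized_volatility)
    if annualize:
        # Monthly to annual: multiply by the square root of 12.
        rv = rv * np.sqrt(12)
    return rv.rename("rv")


# Synthetic daily log returns, for illustration only.
rng = np.random.default_rng(42)
dates = pd.date_range("2021-08-02", "2022-03-30", freq="B")
demo_log_rtn = pd.Series(rng.normal(0.0, 0.03, len(dates)), index=dates)

print(monthly_realized_volatility(demo_log_rtn).round(3))
```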

3. Visualizing time series data

After learning how to download and preprocess financial data, it is time to learn how to plot it in a visually appealing way. We will use the default plot method of a pandas DataFrame with three columns: adj_close, simple_rtn, and log_rtn, and plot SEA's stock prices together with the simple and log returns.

The resulting plot contains three axes. Each one of them presents a different series: raw prices, simple returns, and log returns. Inspecting the plot in such a setting enables us to see the periods of heightened volatility and what was happening at the same time with the price of SEA's stock. Additionally, we see how similar simple and log returns are.
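A three-panel plot of this kind can be sketched with matplotlib and the pandas plot method as follows. The function name `plot_price_and_returns` is hypothetical, and the data is a synthetic random walk that merely stands in for SEA's prices.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_price_and_returns(df: pd.DataFrame, title: str = "Price and returns"):
    """Plot adj_close, simple_rtn and log_rtn on three stacked axes."""
    fig, axes = plt.subplots(3, 1, figsize=(12, 9), sharex=True)

    df["adj_close"].plot(ax=axes[0])
    axes[0].set(title=title, ylabel="Price")

    df["simple_rtn"].plot(ax=axes[1])
    axes[1].set(ylabel="Simple return")

    df["log_rtn"].plot(ax=axes[2])
    axes[2].set(xlabel="Date", ylabel="Log return")

    fig.tight_layout()
    return fig


# Synthetic random-walk prices, for illustration only.
rng = np.random.default_rng(0)
dates = pd.date_range("2021-08-02", periods=120, freq="B")
demo = pd.DataFrame(
    {"adj_close": 100 * np.exp(np.cumsum(rng.normal(0, 0.02, len(dates))))},
    index=dates,
)
demo["simple_rtn"] = demo["adj_close"].pct_change()
demo["log_rtn"] = np.log(demo["adj_close"] / demo["adj_close"].shift(1))

fig = plot_price_and_returns(demo, title="Synthetic price series")
# plt.show()  # uncomment when running interactively
```

Sharing the x-axis keeps the three panels aligned in time, which makes it easier to relate a burst in returns to the corresponding move in price.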

There are many more libraries to create plots in Python, including: seaborn, plotly, plotly_express, altair, and plotnine.

A specific use case might require using some of these libraries, as they offer more freedom when creating the visualization.

Create candlestick charts

A candlestick chart is a type of financial graph, used to describe a given security's price movements. A single candlestick (typically corresponding to one day, but a higher frequency is possible) combines the open, high, low, and close prices (OHLC).

In a bullish candlestick (where the close price in a given time period is higher than the open price), the body spans from the open price at the bottom to the close price at the top, and the wicks extend down to the low and up to the high. For a bearish one, we should swap the positions of the open and close prices.

In comparison to the plots introduced in the previous section, candlestick charts convey much more information than a simple line plot of the adjusted close price. That is why they are often used in real trading platforms, and traders use them for identifying patterns and making trading decisions.

In this part, we download SEA's (adjusted) stock prices from August 2021 to 2022, using Yahoo Finance as the data source.

In the plot, we can see that the exponential moving average (EMA) adapts to the changes in prices much faster than the simple moving average (SMA).
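The charting code itself is not reproduced here, but the two moving averages can be computed with plain pandas. The sketch below assumes a 20-period window by default and uses a hypothetical helper name. The toy series contains a single jump in price to show how each average reacts.

```python
import pandas as pd


def add_moving_averages(close: pd.Series, window: int = 20) -> pd.DataFrame:
    """Return the close price with its simple and exponential moving averages."""
    out = pd.DataFrame({"close": close})
    # SMA: unweighted mean of the last `window` observations.
    out["sma"] = out["close"].rolling(window=window).mean()
    # EMA: exponentially decaying weights, so recent prices count more.
    out["ema"] = out["close"].ewm(span=window, adjust=False).mean()
    return out


# Toy series: the price jumps from 100 to 200 halfway through.
toy_close = pd.Series([100.0] * 10 + [200.0] * 10)
toy = add_moving_averages(toy_close, window=5)
print(toy.iloc[8:14])
```

Right after the jump, the EMA sits closer to the new price level than the SMA, which is the faster adaptation described above.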

SEA's stock price candlestick chart

Bonus: Identifying outliers

While working with any kind of data, we often encounter observations that are significantly different from the majority, which are outliers. They can be a result of a wrong price, something major happening on the financial markets, an error in the data processing pipeline, and so on. Many machine learning algorithms and statistical approaches can be influenced by outliers, leading to incorrect/biased results. That is why we should handle the outliers before creating any models. In this part, I will introduce how to detect outliers using the 3σ approach.

Three-sigma limits (3-sigma limits) is a statistical calculation that refers to data within three standard deviations from a mean.

Three-sigma limits are used to set the upper and lower control limits in statistical quality control charts.

On a bell curve, data that lie above the average and beyond the three-sigma line represent less than 1% of all data points.

Source: Investopedia

We will create the 3-sigma limits in our DataFrame. Any data point below the lower limit or above the upper limit will be identified as an outlier.
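One way to sketch this 3σ rule is with a rolling mean and standard deviation in pandas. The 21-observation window (roughly one trading month) and the helper name `flag_outliers` are assumptions made for this example, and the return series is synthetic with one injected spike.

```python
import pandas as pd


def flag_outliers(returns: pd.Series, window: int = 21, n_sigmas: float = 3) -> pd.DataFrame:
    """Flag returns that fall outside mean +/- n_sigmas * std on a rolling window."""
    out = pd.DataFrame({"rtn": returns})
    rolling = out["rtn"].rolling(window=window)
    out["mean"] = rolling.mean()
    out["std"] = rolling.std()
    out["upper"] = out["mean"] + n_sigmas * out["std"]
    out["lower"] = out["mean"] - n_sigmas * out["std"]
    # Rows without a full window have NaN limits and are never flagged.
    out["outlier"] = (out["rtn"] > out["upper"]) | (out["rtn"] < out["lower"])
    return out


# Synthetic returns: small alternating moves with one large spike.
values = [0.001 if i % 2 == 0 else -0.001 for i in range(60)]
values[30] = 0.2
flagged = flag_outliers(pd.Series(values))
print(flagged[flagged["outlier"]])
```

Because the rolling window includes the current observation, a large return inflates the mean and standard deviation for the following `window` observations, which can hide later extreme values.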

In the plot, we can observe outliers marked with a red dot. One thing to notice is that when there are two large returns with the same value, one in 2020 and one in 2002, the algorithm identifies the first one as an outlier and the second one as a regular observation. The same is observed in 2022, where many strong volatilities are identified as normal. This might be due to the fact that the first outlier enters the rolling window and affects the moving average/standard deviation.

Summary

I hope you will find this article helpful in getting started with financial time series analysis. In the next part, I will demonstrate how to decompose time series and forecast stock prices using ARIMA class models.

Happy Sharing!
