Wednesday, June 21, 2017

Importing and Managing Financial Data

I'm excited to announce my DataCamp course on importing and managing financial data in R! I'm also honored that it is included in DataCamp's Quantitative Analyst with R Career Track!

You can explore the first chapter for free, so be sure to check it out!

Course Description

Financial and economic time series data come in various shapes, sizes, and periodicities. Getting the data into R can be stressful and time-consuming, especially when you need to merge data from several different sources into one data set. This course covers importing data from local files as well as from internet sources.

Course Outline

Chapter 1: Introduction and downloading data
A wealth of financial and economic data are available online. Learn how getSymbols() and Quandl() make it easy to access data from a variety of sources.

Chapter 2: Extracting and transforming data
You've learned how to import data from online sources; now it's time to see how to extract columns from the imported data. After you've learned how to extract columns from a single object, you will explore how to import, transform, and extract data from multiple instruments.

Chapter 3: Managing data from multiple sources
Learn how to simplify and streamline your workflow by taking advantage of the ability to customize default arguments to getSymbols(). You will see how to customize defaults by data source, and then how to customize defaults by symbol. You will also learn how to handle problematic instrument symbols.
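To give a flavor of that kind of customization, here is a minimal sketch using quantmod's setDefaults() and setSymbolLookup(). It is not taken from the course materials: the date range and the BRK.A / BRK-A mapping are illustrative assumptions, and the getSymbols() call is commented out because it needs an internet connection.

```r
library(quantmod)

# Default by data source: restrict Yahoo downloads to an illustrative date range
setDefaults(getSymbols.yahoo, from = "2016-01-01", to = "2016-12-31")

# Default by symbol: "BRK-A" is not a syntactically valid R object name, so
# map a valid name (BRK.A) to the ticker and the source it should come from
setSymbolLookup(BRK.A = list(name = "BRK-A", src = "yahoo"))

# Inspect what was stored for this symbol
lookup <- getSymbolLookup("BRK.A")
print(lookup)

# getSymbols("BRK.A")  # would request BRK-A and assign the result to BRK.A
```

Both settings are stored in R's global options, so they apply to every later getSymbols() call in the session.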

Chapter 4: Aligning data with different periodicities
You've learned how to import, extract, and transform data from multiple data sources. You often have to manipulate data from different sources in order to combine them into a single data set. First, you will learn how to convert sparse, irregular data into a regular series. Then you will review how to aggregate dense data to a lower frequency. Finally, you will learn how to handle issues with intra-day data.
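The first two steps can be sketched with xts on made-up data. This is only an illustration of the general idea, not the course's own code; the values, dates, and the choice of a monthly mean are all assumptions.

```r
library(xts)

# Made-up sparse, irregular observations
irregular <- xts(c(1, 2, 4),
                 order.by = as.Date(c("2017-01-30", "2017-02-01", "2017-02-03")))

# Regular daily index covering the same date range
regular_index <- seq(start(irregular), end(irregular), by = "day")

# Merging with a zero-width xts object adds the missing dates as NA rows
merged <- merge(irregular, xts(order.by = regular_index))

# Carry the last observation forward to fill the gaps
filled <- na.locf(merged)

# Aggregate the now-regular daily series to a lower (monthly) frequency
monthly <- apply.monthly(filled, mean)
print(monthly)
```

Whether to fill with the last observation, and whether to aggregate with a mean or the last value, depends on what the series represents.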

Chapter 5: Importing text data, and adjusting for corporate actions
You've learned the core workflow of importing and manipulating financial data. Now you will see how to import data from text files of various formats. Then you will learn how to check data for weirdness and handle missing values. Finally, you will learn how to adjust stock prices for splits and dividends.

Wednesday, June 7, 2017

quantmod 0.4-9 on CRAN

A new release of quantmod is now on CRAN! The only change was to address changes to Yahoo! Finance and their effects on getSymbols.yahoo().  GitHub issue #157 contains some details about the fix implementation.

Unfortunately, the URL wasn't the only thing that changed.  The actual data available for download changed as well.

The most noticeable difference is that the adjusted close column is no longer dividend-adjusted (i.e. it's only split-adjusted).  Also, only the close price is unadjusted; the open, high, and low are split-adjusted.

There also appear to be issues with the adjusted prices in some instruments.  For example, users reported issues with split data for XLF and SPXL in GitHub issue #160.  For XLF, the data show both a split and a dividend on 2016-09-16, even on the Yahoo! Finance historical price page for XLF. As far as I can tell, there was only a special dividend.  The problem with SPXL is that the adjusted close price isn't adjusted for the 4-for-1 split on 2017-05-01, which is also reflected on the Yahoo! Finance historical prices page for SPXL.

Another change is that the downloaded data may contain rows where all the values are "null".  These appear on the website as "0".  This is a major issue for some instruments.  Take XLU for example; 188 of the 624 days of data are missing between 2014-12-04 and 2017-05-26 (ouch!).  You can see this is even true on the Yahoo! Finance historical price page for XLU.
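If you read a downloaded file yourself, one way to spot and drop those rows is sketched below in base R. The CSV text is made up to mimic the shape of a Yahoo! Finance download, and dropping the rows (rather than filling them) is just one possible choice.

```r
# Made-up rows in the shape of a Yahoo! Finance CSV download
raw_csv <- "Date,Open,High,Low,Close,Adj Close,Volume
2017-05-22,51.00,51.50,50.80,51.20,51.20,1000
2017-05-23,null,null,null,null,null,null
2017-05-24,51.30,51.90,51.10,51.70,51.70,1200"

# Treat the literal string "null" as a missing value while parsing
prices <- read.csv(text = raw_csv, na.strings = "null")
prices$Date <- as.Date(prices$Date)

# Flag rows where every price/volume column is missing
value_cols <- setdiff(names(prices), "Date")
all_missing <- rowSums(is.na(prices[value_cols])) == length(value_cols)

cat(sum(all_missing), "of", nrow(prices), "rows are missing\n")

# Keep only the rows that contain data
clean <- prices[!all_missing, ]
```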

If these changes have made you look for a new data provider, see my post: Yahoo! Finance Alternatives.

Yahoo Finance Alternatives

I assume that you're reading this because you are one of many people who were affected by the changes to Yahoo Finance data in May 2017.  Not only did the URL change, but the actual data changed as well!

The most noticeable difference is that the adjusted close column is now only split-adjusted, whereas it used to be split- and dividend-adjusted.  Another oddity is that only the close price is unadjusted (strangely, the open, high, and low are split-adjusted).

All these issues can be dealt with using tools that are currently available.  For example, you can unadjust the open, high, and low prices using the ratio of close to adjusted close prices.  And you can adjust for both splits and dividends using quantmod::adjustOHLC().
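Here is a minimal base R sketch of the un-adjustment step, using made-up prices around a hypothetical 2-for-1 split. The column names and values are assumptions for illustration only.

```r
# Made-up prices around a 2-for-1 split that takes effect on the second day.
# Open/High/Low/Adjusted are split-adjusted; Close is the raw traded price.
prices <- data.frame(
  Date     = as.Date(c("2017-01-03", "2017-01-04")),
  Open     = c(50, 51),
  High     = c(52, 53),
  Low      = c(49, 50),
  Close    = c(102, 52),
  Adjusted = c(51, 52)
)

# Ratio of unadjusted close to split-adjusted close (2 before the split, 1 after)
split_ratio <- prices$Close / prices$Adjusted

# Scale the split-adjusted columns back to the prices that actually traded
ohl <- c("Open", "High", "Low")
unadjusted <- prices
unadjusted[ohl] <- prices[ohl] * split_ratio

print(unadjusted)
```

With open, high, low, and close all on the same unadjusted basis, the whole series can then be adjusted consistently.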

Unfortunately, there also appear to be issues with data quality.  Some instruments have rows where all the prices and volume are zeros (e.g. XLU).  The adjusted close in some instruments is incorrect because of missing split events, or double-counting splits and special dividends.

So, what are your alternatives?  If you're just tinkering, you can try other free data sources like Google Finance or Quandl.  Note that Google Finance data is already split-adjusted, so you might need to adjust for dividends, or un-adjust for splits, depending on your needs.  Quandl has a wiki of end-of-day stock prices curated by the community.  You only need a free account to access the data.

If you're using the data to make actual investment decisions, you should really be using a professional data provider.  At the very least, you get someone to yell at when the data have errors. :)  First, you should check if your broker provides the historical data you need (e.g. Interactive Brokers provides historical and real-time data to account-holders).

If your broker doesn't provide historical data, here are a few providers you may want to consider:

- Provide limited historical data for free
- For a one-time fee:
  - $20-$50 for 10 years of daily data
  - $40-$100 for 20 years of daily data

- Massive historical equity database
- $600 annually for 30 years of daily data
- Ability to adjust for splits and dividends

- Mainly a real-time data provider, but also has historical data
- Pricing starts at $78/month

Leave a comment if you know of another end-of-day data provider that I didn't list!

*FULL DISCLOSURE: I receive a referral fee for annual subscriptions to CSI products if you use the FOSS coupon code.