Downloading S&P 500 Data to R

Data are the cornerstone of your analysis and quantitative trading algorithms. There are many ways to get them into R, depending on which instruments you invest in. Today I am going to download stock prices from finance.yahoo for the companies included in the S&P 500 index.

First of all, we have a list of S&P 500 companies saved in this csv file. I have removed one or two tickers.

 

library(quantmod)
library(tseries)
library(timeDate)

# One ticker per row, no header row
symbols <- read.csv("/home/robo/workspace/R-Test/sp500.csv", header = FALSE, stringsAsFactors = FALSE)
nrStocks <- length(symbols[, 1])
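
Ticker files from different sources can carry stray whitespace, embedded quote characters, or duplicates, and any of these would end up in the download URL. Here is a minimal sketch of tidying the list first, assuming a one-column file like the one above. `clean_tickers` is a hypothetical helper, not part of any package, and the raw input is made up:

```r
# Hypothetical helper: normalise a vector of raw ticker strings.
clean_tickers <- function(raw) {
  tickers <- gsub("[\"']", "", raw)    # drop embedded quote characters
  tickers <- toupper(trimws(tickers))  # trim whitespace, use upper case
  tickers <- tickers[nzchar(tickers)]  # drop empty entries
  unique(tickers)                      # drop duplicates, keep first-seen order
}

# Made-up raw input; with the real file this would be symbols[, 1]
raw <- c(" MMM", "\"ABT\"", "mmm", "", "AAPL ")
tickers <- clean_tickers(raw)
print(tickers)
```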

 

Then we have to specify the start date and make a loop to download all the data for every company in the list.

 

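# The start date is not given in the original listing; this value is only an example
dateStart <- "1999-12-31"

z <- zoo()

for (i in 1:nrStocks) {
	cat("Downloading ", i, " out of ", nrStocks, "\n")
	# Adjusted close prices for one ticker, returned as a zoo series
	x <- get.hist.quote(instrument = symbols[i, ], start = dateStart, quote = "AdjClose", retclass = "zoo", quiet = TRUE)
	z <- merge(z, x)
}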
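
Growing `z` with `merge()` inside the loop rebuilds the merged object on every pass and starts from an empty `zoo()`. One alternative, sketched here under assumptions, is to collect each series in a named list and merge once at the end. `merge_price_list` is a hypothetical helper, and the two series are made-up numbers standing in for downloaded prices:

```r
library(zoo)

# Hypothetical helper: merge a named list of zoo series into one wide object.
# The list names become the column names of the result.
merge_price_list <- function(series_list) {
  stopifnot(length(series_list) > 0)
  do.call(merge, series_list)  # outer join on the date index
}

# Made-up data standing in for two downloaded price series
dates <- as.Date("2013-01-02") + 0:2
prices <- list(
  AAA = zoo(c(10, 11, 12), dates),
  BBB = zoo(c(20, 21), dates[2:3])  # starts one day later
)

z <- merge_price_list(prices)
print(z)
```

Dates missing from a series show up as `NA` in its column, so the shorter history of `BBB` does not truncate `AAA`.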

 

I am loading the quantmod, tseries and timeDate libraries for easier data manipulation and because I am going to use them in future posts.

I have encountered a strange problem several times when the provided URL couldn't be read. Have you had the same problem?
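
One way to live with unreadable URLs is to catch the error, note the ticker, and move on. This is a rough sketch that assumes nothing about the data source: `fetch_fun` is a hypothetical stand-in for whatever download call is used, and `fake_fetch` only simulates a failure for a made-up ticker:

```r
# Hypothetical helper: call fetch_fun on each ticker, keep the successes,
# and record the tickers whose download raised an error.
download_all <- function(tickers, fetch_fun) {
  results <- list()
  failed <- character(0)
  for (ticker in tickers) {
    x <- tryCatch(fetch_fun(ticker), error = function(e) NULL)
    if (is.null(x)) {
      failed <- c(failed, ticker)  # remember it and carry on
    } else {
      results[[ticker]] <- x
    }
  }
  list(data = results, failed = failed)
}

# Simulated downloader: fails for one made-up ticker.
fake_fetch <- function(ticker) {
  if (ticker == "BAD") stop("cannot open URL")
  c(1, 2, 3)
}

out <- download_all(c("AAA", "BAD", "BBB"), fake_fetch)
print(names(out$data))
print(out$failed)
```

With real data, `fetch_fun` could be a small wrapper such as `function(s) get.hist.quote(instrument = s, start = dateStart, quote = "AdjClose", retclass = "zoo", quiet = TRUE)`, with `dateStart` defined beforehand.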


  1. Sonja’s avatar

    There's an Excel spreadsheet with an updated list of tickers (correct as of December 2013) here


  2. amir’s avatar

    Dear
    The code works; the only problem was that your file is too old. Some securities have been added since 2011 and some have been removed. Here is an updated file, which may be useful: https://www.dropbox.com/s/vc195oczrsxc4d8/sp500.csv

    Best Regards
    AMIR

  3. Patrick’s avatar

    It looks like you haven't checked this for a while. Anyhow, thanks for posting. Below is similar code that works also on Windows:

    http://pastebin.com/P79vYZFc

    Currently several of the tickers on the list fail, I guess because they have expired (mergers, bankruptcies, etc.). The next step should be to generate this sp500.csv list on the fly.

    1. amir’s avatar

      Hi Patrick, thanks for the link you posted. I am using Windows; I copied and pasted the code, ran it, and didn't get an error. My question is: where is the historical data I downloaded located now? There is nothing in my current directory. I am sorry to disturb you, but I am an absolute beginner.

    2. amir’s avatar

      Dear Patrick or Quantrader,
      I ran the code from the link above and got this error:
      Error in { :
      task 1 failed - "attempt to set 'colnames' on an object with less than two dimensions"

      Please help me, I am stuck in the middle of my bachelor thesis.
      thanks ;)

    3. PatrickT’s avatar

      http://stackoverflow.com/questions/5246843/how-to-get-a-complete-list-of-ticker-symbols-from-yahoo-finance

      Here are some clever ways to obtain Yahoo ticker symbols; they may be useful to whoever has read this page.

    4. PatrickT’s avatar

      Thanks for posting this. I'm learning R, very useful to have.

      Note to self: you want to define the starting date with something like:

      dateStart <- "1999-12-31"

      And yes, I too get these error messages:

      ...
      Downloading 16 out of 499
      download error, retrying ...
      download error, retrying ...
      download error, retrying ...
      download error, retrying ...
      Error in get.hist.quote(instrument = symbols[i, ], start = dateStart, :
      cannot open URL 'http://chart.yahoo.com/table.csv?s=AYE&a=11&b=31&c=1999&d=2&e=05&f=2013&g=d&q=q&y=0&z=AYE&x=.csv'
      In addition: There were 35 warnings (use warnings() to see them)

      So I wrapped the calls with try() and it went okay, skipping the urls that weren't working:

      for (i in 1:nrStocks) {
        cat("Downloading ", i, " out of ", nrStocks, "\n")
        x <- try(get.hist.quote(instrument = symbols[i, ], start = dateStart, quote = "AdjClose", retclass = "zoo", quiet = TRUE), silent = TRUE)
        # Skip failed downloads so an error object is never merged into z
        if (inherits(x, "try-error")) next
        z <- merge(z, x)
      }


      Downloading 16 out of 499
      download error, retrying ...
      download error, retrying ...
      download error, retrying ...
      download error, retrying ...
      Downloading 17 out of 499

      Question to self and/or others: how did you get the list of tickers in the csv? Is there an online up-to-date list?

      Another question to self: merging the xs into the z, isn't that creating a huge object?


    5. will’s avatar

      hey, I tried to implement this and I get:

      Error in get.hist.quote(instrument = symbols[i, ], start = Sys.Date() - :
      cannot open URL 'http://chart.yahoo.com/table.csv?s="MMM"&a=2&b=24&c=2012&d=3&e=22&f=2012&g=d&q=q&y=0&z="MMM"&x=.csv'

      I think the site that get.hist.quote uses is gone; do you have an update?

      1. QuantTrader’s avatar

        I will be posting an update to the downloading script today, but I don't think it will help you. This function comes from an external package and I haven't modified it. Are you sure you are using the latest version of this package?

        1. will’s avatar

          I've loaded quantmod, tseries, and timeDate. Is there a package I'm missing?

        2. will’s avatar

          It seems that the tseries package has the get.hist.quote function. I have updated to the newest .10-28 version, with similar results.

        3. will fan’s avatar

          It seems that particular website is gone now; might you have an update?

        4. Chris’s avatar

          Hi

          I'm trying to get your great R code running on my computer. Unfortunately I am already stuck at downloading the quotes. I get the error "Index vectors are of different classes: numeric date" when it tries to merge. I don't know what that means.

          I used the exact code above with the start date "2008-01-01".

          Can you help?

          thanx

          Chris

          1. QuantTrader’s avatar

            Hi Chris,

            thank you for your comment! I tried the code on both Windows and Linux and had no problems. One thing you can try is to download the data for the first company before the for-loop and store it in the variable z. Then run the for-loop from 2 (not 1) and merge the rest into it. Let me know if it helped.

          2. QuantTrader’s avatar

            Well, it does not adjust for splits automatically, but you can download the AdjClose column from Yahoo. I updated my post. Thank you.

          3. QuantTrader’s avatar

            Hi Noah,

            thank you for your comment. Well, the documentation says that the adjusted parameter is TRUE by default, but my experience is different. I have identified a couple of stocks whose prices saw a BIG jump. I will try to add adjusted = TRUE, see what happens and update the article. Thank you again for the remark.

          4. Noah’s avatar

            Does get.hist.quote auto adjust for splits?


            1. QuantTrader’s avatar

              In the get.hist.quote function I specified which column I want to download: quote = "AdjClose". Thus, we have prices adjusted for all corporate actions.
