Friday, March 29, 2013

R/Finance 2013 Registration Open

The registration for R/Finance 2013 -- which will take place May 17 and 18 in Chicago -- is NOW OPEN!

Building on the success of the previous conferences in 2009, 2010, 2011 and 2012, we expect more than 250 attendees from around the world. R users from industry, academia, and government will be joining 30+ presenters covering all areas of finance with R.

We are very excited about the four keynotes by Sanjiv Das, Attilio Meucci, Ryan Sheftel, and Ruey Tsay. The main agenda (currently) includes seventeen full presentations and fifteen shorter "lightning talks". We are also excited to offer five optional pre-conference seminars on Friday morning.

To celebrate the fifth year of the conference in style, the dinner will be held at The Terrace at Trump Hotel. Overlooking the Chicago River and skyline, it is a perfect venue to continue conversations while dining and drinking.

More details of the agenda are available at:
http://www.RinFinance.com/agenda/

Registration information is available at:
http://www.RinFinance.com/register/

and can also be directly accessed by going to:
http://www.regonline.com/RFinance2013

We would like to thank our 2013 Sponsors for the continued support enabling us to host such an exciting conference:

International Center for Futures and Derivatives at UIC

Revolution Analytics
MS-Computational Finance at University of Washington

Google
lemnica
OpenGamma
OneMarketData
RStudio

On behalf of the committee and sponsors, we look forward to seeing you in Chicago!

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson, Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich

Monday, March 18, 2013

TTR_0.22-0 on CRAN


An updated version of TTR is now on CRAN.  The biggest changes to be aware of are that all moving averages attempt to set colnames, CCI returns an object with colnames, and the initial gap for SAR is no longer hard-coded at 0.01.  There are also some much-needed bug fixes - most notably to Yang Zhang volatility, MACD, SAR, EMA/EVWMA, and adjRatios.

There are some exciting new features, including a rolling single-factor model function (rollSFM, based on a prototype from James Toll), a runPercentRank function from Charlie Friedemann, and a faster aroon function.  In addition, stoch and WPR now return 0.5 instead of NaN when there's insufficient price movement.
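To make the stoch/WPR change concrete, here is a minimal sketch of a fast %K calculation with the 0.5 fallback. It is written in Python purely as a language-neutral illustration; it is not TTR's code, and the function name fast_k and its arguments are hypothetical. It assumes the oscillator is expressed on a 0-1 scale and that "insufficient price movement" means the highest high equals the lowest low over the window.

```python
def fast_k(highs, lows, closes, n):
    """Fast %K on a 0-1 scale; None until n periods are available.

    Hypothetical helper for illustration only, not TTR's implementation.
    """
    out = []
    for i in range(len(closes)):
        if i + 1 < n:
            out.append(None)  # not enough history yet
            continue
        highest = max(highs[i - n + 1:i + 1])
        lowest = min(lows[i - n + 1:i + 1])
        price_range = highest - lowest
        if price_range == 0:
            # A flat window would give 0/0 (NaN); use the midpoint instead.
            out.append(0.5)
        else:
            out.append((closes[i] - lowest) / price_range)
    return out


# A security whose price never moves, and one that trends upward.
flat = fast_k([10.0] * 4, [10.0] * 4, [10.0] * 4, n=3)
moving = fast_k([10, 11, 12, 13], [9, 10, 11, 12], [9.5, 10.5, 12, 12.5], n=3)
```

With the flat series, every window has a zero range, so each computed value is the 0.5 midpoint rather than NaN.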

Here are all of the updates (from the CHANGES file):

#-#-#-#-#-#-#-#-#-#    Changes in TTR version 0.22-0    #-#-#-#-#-#-#-#-#-#

SIGNIFICANT USER-VISIBLE CHANGES
  • CCI now returns an object with colnames ("cci").
  • All moving average functions now attempt to set colnames.
  • Added clarification on the displaced nature of DPO.
  • SAR now sets the initial gap based on the standard deviation of the high-low range instead of hard-coding it at 0.01.
NEW FEATURES
  • Added rollSFM function that calculates alpha, beta, and R-squared for a single-factor model, thanks to James Toll for the prototype.
  • Added runPercentRank function, thanks to Charlie Friedemann.
  • Moved slowest portion of aroon to C.
  • DonchianChannel gains an 'include.lag=FALSE' argument, which includes the current period's data in the calculation. Setting it to TRUE replicates the original calculation. Thanks to Garrett See and John Bollinger.
  • The Stochastic Oscillator and Williams' %R now return 0.5 (instead of NaN) when a security's price doesn't change over a sufficient period.
  • All moving average functions gain '...'.
  • Users can now change alpha in Yang Zhang volatility calculation.
BUG FIXES
  • Fixed MACD when maType is a list. Now mavg.slow=maType[[2]] and mavg.fast=maType[[1]], as users expected based on the order of the nFast and nSlow arguments. Thanks to Phani Nukala and Jonathan Roy.
  • Fixed bug in lags function, thanks to Michael Weylandt.
  • Corrected error in Yang Zhang volatility calculation, thanks to several people for identifying this error.
  • Correction to SAR extreme point calculations, thanks to Vamsi Galigutta.
  • adjRatios now ensures all inputs are univariate, thanks to Garrett See.
  • EMA and EVWMA now ensure n < number of non-NA values, thanks to Roger Bos.
  • Fix to BBands docs, thanks to Evelyn Mitchell.
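
As a rough illustration of what a rolling single-factor model such as rollSFM computes, the sketch below fits an ordinary least squares regression of asset returns on a single factor's returns over each trailing window and reports alpha, beta, and R-squared. It is a Python sketch under stated assumptions, not TTR's implementation: the names single_factor_fit and rolling_sfm are hypothetical, and it assumes both series have non-zero variance within every window.

```python
def single_factor_fit(ra, rb):
    """OLS fit of ra = alpha + beta * rb; returns (alpha, beta, r_squared).

    Assumes rb and ra both vary within the window (no zero variance).
    """
    n = len(ra)
    mean_a = sum(ra) / n
    mean_b = sum(rb) / n
    sxy = sum((a - mean_a) * (b - mean_b) for a, b in zip(ra, rb))
    sxx = sum((b - mean_b) ** 2 for b in rb)
    syy = sum((a - mean_a) ** 2 for a in ra)
    beta = sxy / sxx
    alpha = mean_a - beta * mean_b
    r_squared = (sxy * sxy) / (sxx * syy)
    return alpha, beta, r_squared


def rolling_sfm(ra, rb, n):
    """Apply single_factor_fit over each trailing window of n observations."""
    out = [None] * (n - 1)  # no estimate until a full window exists
    for i in range(n - 1, len(ra)):
        out.append(single_factor_fit(ra[i - n + 1:i + 1], rb[i - n + 1:i + 1]))
    return out


# Made-up factor returns, and an asset built as 0.001 + 2 * factor.
rb = [0.01, -0.02, 0.03, 0.00, 0.02]
ra = [0.001 + 2 * b for b in rb]
results = rolling_sfm(ra, rb, n=3)
```

Because the example asset is an exact linear function of the factor, every window recovers an alpha near 0.001, a beta near 2, and an R-squared near 1.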


Wednesday, September 12, 2012

Computational Finance with R on Coursera

If you haven't signed up for the Introduction to Computational Finance and Financial Econometrics course taught by Eric Zivot on Coursera, it's not too late.  The second week just started and the first assignments aren't due until September 18th.

Join me in getting a good refresher on basic statistics, simulation and bootstrapping, and linear algebra, and in learning more about portfolio optimization, efficient portfolios, and risk budgeting.

Wednesday, August 15, 2012

A New plot.xts


The Google Summer of Code (2012) project to extend xts has produced a very promising new plot.xts function.  Michael Weylandt, the project's student, wrote to R-SIG-Finance to request impressions, feedback, and bug reports.  The function is housed in the xtsExtra package of the xts project on R-Forge.

Please try xtsExtra::plot.xts and let us know what you think.  A sample of the eye-candy produced by the code in Michael's email is below.  Granted, this isn't a one-liner, but it's certainly impressive!  Great work Michael!



Tuesday, June 5, 2012

Book Review: Parallel R

You have a problem: R is single-threaded, but your code would be faster if it could simultaneously run on more than one core.  You have access to a cluster and/or your computer has multiple cores.  Parallel R, by Q. Ethan McCallum and Stephen Weston, can help you put this extra computing power to use.

The book describes 6 approaches to distributed computing.  Thoughts on each approach follow:

1) snow
The chapter starts by showing you how to create a socket cluster on a single machine (later sections discuss MPI clusters, and socket clusters of several machines).  Then a section describes how to initialize workers, with a later section giving a slightly advanced discussion on how functions are serialized to workers.

There's a great demonstration (including graphs) of why/when you should use clusterApplyLB instead of clusterApply.  There's also a fantastic discussion on potential I/O issues (probably one of the most surprising/confusing issues to people new to distributed computing) and how parApply handles them.  Then the authors provide a very useful parApplyLB function.
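
The intuition behind load balancing can be sketched with a small simulation. This is not code from the book or from snow; it is a simplified Python model in which the hypothetical static_makespan hands out one task per worker and waits for the whole round to finish, while the hypothetical balanced_makespan gives the next task to whichever worker frees up first. Task durations are made up.

```python
import heapq


def static_makespan(durations, workers):
    """Round-based dispatch: one task per worker, then wait for the round."""
    total = 0
    for start in range(0, len(durations), workers):
        # Each round lasts as long as its slowest task.
        total += max(durations[start:start + workers])
    return total


def balanced_makespan(durations, workers):
    """Dynamic dispatch: the next task goes to the first worker to free up."""
    free_at = [0] * workers  # time at which each worker becomes idle
    heapq.heapify(free_at)
    for d in durations:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + d)
    return max(free_at)


# One slow task followed by many quick ones, on two workers.
tasks = [8, 1, 1, 1, 1, 1, 1, 1]
static_time = static_makespan(tasks, workers=2)
balanced_time = balanced_makespan(tasks, workers=2)
```

In this toy case the round-based schedule takes 11 time units and the balanced one takes 8, because the second worker keeps picking up quick tasks while the first is busy with the slow one. When all tasks take about the same time, the two schedules perform similarly.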

There are a few (but very important!) paragraphs on random number generation using the rsprng and rlecuyer packages.

2) multicore
The chapter starts by noting that the multicore package only works on a single computer running a POSIX compliant operating system (i.e. most anything except Windows).

The next section describes the mclapply function, and also explains how mclapply creates a cluster each time it's called, why this isn't a speed issue, and how it is actually beneficial.  The next few sections describe some of the optional mclapply arguments, and how you can achieve load balancing with mclapply.  A good discussion of the pvec, parallel, and collect functions follows.

There are some great tips on how to use the rsprng and rlecuyer packages for random number generation, even though they aren't directly supported by the multicore package.  The chapter concludes with a short, but effective, description of multicore's low-level API.

3) parallel (comes with R >= 2.14.0)
The chapter starts by noting that the parallel package is a combination of the snow and multicore packages.  This chapter is relatively short, since those two packages were covered in detail over the prior two chapters.  Most of the content discusses the implementation differences between parallel and snow/multicore.

4) R+Hadoop
There's a full chapter primer on Hadoop and MapReduce, for those who aren't familiar with the software and concept.  The chapter ends with an introduction to Amazon's EC2 and EMR services, which significantly lower the barrier to using Hadoop.

The chapter on R+Hadoop is very little R and mostly Hadoop.  This is because Hadoop requires more setup than the other approaches.  You will need to do some work on the command line and with environment variables.

There are three examples: one using Hadoop streaming and two using the Java API (which require writing/modifying some Java code).  The authors take care to describe each block of code in all the examples, so it's accessible to those who haven't written Java.

5) RHIPE
Using three examples, this chapter provides a thorough treatment of how to use RHIPE to abstract away a lot of the boilerplate code required for Hadoop.  Everything is done in R.  As with the Hadoop chapter, the authors describe each block of code.

RHIPE does require a little setup: it must be installed on your workstation and all cluster nodes.  In the examples, the authors describe how RHIPE allows you to transfer R objects between Map and Reduce phases, and they mention the RHIPE functions you can use to manipulate HDFS data.

6) segue
This is a very short chapter because the segue package has very narrow scope: using Amazon's EMR service in two lines of code!

Final thoughts:
I would recommend this book to someone who is looking to move beyond the most basic distributed computing solutions.  The authors are careful to point you in the right direction and warn you of potential pitfalls of each approach.

All but the most basic setups (e.g. a socket cluster on a single machine) will require some familiarity with the command line, environment variables, and networking.  This isn't the fault of the authors or any of the approaches... parallel computing just isn't that easy.

I really expected to see something on using foreach, especially since Stephen Weston has done work on those packages.  It is mentioned briefly at the end of the book, so maybe it will appear in later editions.

Saturday, April 14, 2012

Long-Overdue Blogroll Update

I don't think I've updated my blogroll for at least a year... shame on me.

This update is mostly additions.  I only removed Max Dama's blog, and that was only because it no longer exists.  I left Skill Analytics because it contains excellent information, even though Damian hasn't posted in a long time.

The additions are:
Portfolio Probe
The Physics of Finance
Condor Options
Milktrader
Algorithm Zoo (by Milktrader)
SEF-Blog: Signal Extraction and Forecasting

Systematic Investor

Friday, March 23, 2012

R in Google Summer of Code 2012

This post is a slightly revised (and "blogified") version of the message Brian Peterson has sent to various R mailing lists.

Once again, R has been accepted as a mentoring organization for the Google Summer of Code (2012).  We invite students interested in this program to learn more about it.  A good starting point is the R GSoC wiki.

Students participating in the program receive US$5,000 for successful completion of a GSoC project, a great resume item, and an opportunity to work with R package authors.

There are four finance-related projects currently on the project ideas list:
and one that is not specifically finance related, but extends xts, which is the most prevalent time series class for finance in R:
  • Extend xts
    Improve data and model visualization.  Extend xts objects to contain mixed types (like data.frames).  Add interoperability to existing analytical functions (e.g. ARIMA, Holt Winters, VAR).  (FYI: there is already a highly qualified student associated with this project idea).
The list of finance project ideas above is also by no means exhaustive of the proposed R projects.  There are additional non-finance R project ideas listed on the R GSoC wiki.  Interested students or mentors are encouraged to discuss other project ideas on the gsoc-r Google group.

Those interested in either student or mentor participation should join our Google group, as this is the main means of communication.  When you apply for group membership, please introduce yourself with one sentence, so we know you're not a spammer.

Interested students should start working on applications now.  The student application process opens on 26 March, and successful students in prior years have often posted draft applications to melange for comment as soon after the opening of the application process as possible.

Note that GSoC is about coding.  It is not intended to fund research; but many activities with R require code to advance our work, so the program can be very helpful to improving R.

For information, the admins this year are Toby Dylan Hocking and John Nash, with Brian Peterson and Virgilio Gomez as backups.
