Tuesday, May 18, 2010

LSPM Joint Probability Tables

I've received several requests for methods to create joint probability tables for use in LSPM's portfolio optimization functions.  Rather than continue to email this example to individuals who ask, I post it here in hopes they find it via a Google search. ;-)

I'm certain there are more robust ways to estimate this table, but the code below is a start...

'x' is a matrix of market system returns
'n' is the number of bins to create for each system
'FUN' is the function to use to calculate the value for each bin
'...' are args to be passed to 'FUN'


jointProbTable <- function(x, n=3, FUN=median, ...) {

  # Load LSPM
  if(!require(LSPM,quietly=TRUE)) stop(warnings())

  # Function to bin data
  quantize <- function(x, n, FUN=median, ...) {
    if(is.character(FUN)) FUN <- get(FUN)
    bins <- cut(x, n, labels=FALSE)
    res <- sapply(1:NROW(x), function(i) FUN(x[bins==bins[i]], ...))
  }

  # Allow for different values of 'n' for each system in 'x'
  if(NROW(n)==1) {
    n <- rep(n,NCOL(x))
  } else
  if(NROW(n)!=NCOL(x)) stop("invalid 'n'")

  # Bin data in 'x'
  qd <- sapply(1:NCOL(x), function(i) quantize(x[,i],n=n[i],FUN=FUN,...))

  # Aggregate probabilities
  probs <- rep(1/NROW(x),NROW(x))
  res <- aggregate(probs, by=lapply(1:NCOL(qd), function(i) qd[,i]), sum)

  # Clean up output, return lsp object
  colnames(res) <- colnames(x)
  res <- lsp(res[,1:NCOL(x)],res[,NCOL(res)])
  return(res)
}

# Example
N <- 30
x <- rnorm(N)/100; y <- rnorm(N)/100; z <- rnorm(N)/100
zz <- cbind(x,y,z)
(jpt <- jointProbTable(zz,n=c(4,3,3)))

14 comments:

Ralph Vince said...

Josh, If I may comment:
The joint probabilities table ("jpt") ultimately should encompass and be a proxy for all potential risk for the next period.

In other words, frequently, in attempting to articulate "risk," we see the argument that it is more than mere variance. That there is liquidity risk, there is counter-party risk, this risk and that risk, and there is a notion that risk posses many facets.

The jpt should attempt to encompass all that, so that we have one matrix that presents to us the many (and many unforseen) aspects of risk.

To that end, my advice is to start with the empirical distribution of prices, of outcomes, of what has happened. Then, amend that table to incorporate any other facets of risk.

One might argue that his is therefore only an "estimate," and that the outcomes of the model are therefore only estimates. And this is entirely true, and, as we can assume, the outputs are only as good as the estimates.

But things must be estimated. There is no getting around that in this business. The jpt allows us to incorporate these estimates into a composite picture of our total risk.

Joshua Ulrich said...

Ralph,

That's a very good point. The above code only produces an estimate of the empirical distribution of outcomes, based on actual historical data.

You can amend the maximum losses (maxLoss) of a lsp object via something like:
jpt$maxLoss[1:3] <- rep(-0.6,3)

I should probably write methods to append events / probabilities to the jpt. Something like:
# This won't work right now
jpt <- c(jpt,rep(-0.8,3))

G$ said...
This comment has been removed by the author.
Joshua Ulrich said...

G$,

This isn't an appropriate place to discuss issues with your R installation.

Anonymous said...

Hey Josh, this looks great (or at least I'm sure that it will once I work out how to use it).

I am a big fan of using csv files for data input/output, and perhaps it might be worth giving some consideration to dropping the JPT out as a csv for editing purposes and then it can be imported directly into LSPM. That would conveniently tie the 2 tasks (JPT / LSPM) together.

My current task is to figure out how to import the raw data into this code from a csv.

Ross Bond

Damian said...

Joshua - looked for a contact email address - couldn't find one. Would love to get your opinion on:;

http://www.monkeyanalytics.com

It'd be great to get a setup with them that was targeted to FOSS traders.

Thanks!

Joshua Ulrich said...

Damian,

Sorry about the contact email address. You can find it in my R packages...

Monkey Analytics looks interesting. It seems like they're setting up an EC2 cluster for you so you don't have to do as much of the administrative stuff. They're not currently accepting users, so I can't investigate much. It may be very useful for people who want to run LSPM but don't have / want their own cluster.

Kostas said...

Hi Joshua,

Following Ralph's comment, regarding the creation of the jpt do you have in mind a way to "blend" jpts maybe created from two different periods? Say you want to incorporate with a certain probability two jpts, one of a more recent period and one of a period of crisis or something similar.

many thanks
Kostas Metaxas

Joshua Ulrich said...

Hi Kostas,

There currently isn't a function to join JPTs, but you can do it by hand:

library(LSPM)
data(port)
port_rows <- port_cols <- port

# combine JPTs by rows
port_rows$events <- rbind(port_rows$events, port$events)
port_rows$probs <- rbind(port_rows$probs, port$probs)

# combine JPTs by columns
port_cols$events <- cbind(port_cols$events, port$events)
port_cols$maxLoss <- c(port_cols$maxLoss, port$maxLoss)
port_cols$f <- c(port_cols$f, port$f)

TradingPro said...

Hi Josh,

Relating to jointProbTable are the following warning messages anything to be concerned about? Maybe there is some way to prevent the warnings?

> jpt <- jointProbTable(rtn,n=c(20,20,20))
Warning messages:
1: In is.na(cols) : is.na() applied to non-(list or vector) of type 'NULL'
2: In is.na(cols) : is.na() applied to non-(list or vector) of type 'NULL'

Joshua Ulrich said...

TradingPro,

You haven't provided enough information for me to know if the warnings are anything to be worried about or if there's any way to prevent them. Please email me if you would like help looking into this.

TradingPro said...

Hi Josh (this is Grant),
Thank you for the reply and offer to help.
It is okay for now then, I was presuming that this was a generic warning that everybody was receiving because I receive them on differing platforms and data.
All appears to be fine but I will let you know if any issues surface.

Chris said...

Josh

I am now trying to create a jointProbability table of the pnl table you helped me to create in the "margin constraints with LSPM" thread. The data in "pnl" is normalized dollar values with 439 rows and 41 columns. I run the following command:


(jpt <- jointProbTable(pnl,n=c(40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40)))

and get:


Error in if (any(maxLoss >= 0)) { : missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In is.na(cols) : is.na() applied to non-(list or vector) of type 'NULL'
2: In is.na(cols) : is.na() applied to non-(list or vector) of type 'NULL'

Your help will be highly appreciated.

Regards

Joshua Ulrich said...

Hi Chris,

I'm happy to help, but (as I mentioned to Grant in an earlier comment on this post) I would prefer to be emailed directly with questions that do not pertain to the post.

I don't consider, "I got an error when I tried to use the function in your post, but I don't know why..." or "how do I do X" to pertain to the post, so I would prefer questions like that be emailed to me directly, not posted as comments.

I would consider, "line XYZ of function 'foo' causes an error when the input is like 'abc'" to pertain to the post, so I welcome those types of specific comments.