Friday, January 24, 2014

Brownian Bridge or Not with Heston Quadratic Exponential QMC

At first I did not make use of the Brownian Bridge technique in Heston QMC, because the variance process is not simulated like a Brownian motion under Andersen's Quadratic Exponential algorithm.

It is, however, perfectly possible to use the Brownian Bridge on the asset process. Does it make a difference? In my small test, it does not seem to. An additional question is whether it is better to use the first N dimensions for the asset and the next N for the variance, or vice versa, or interleaved. Interleaving would seem the most natural (this is what I used without the Brownian Bridge, but for simplicity I applied the Brownian Bridge to the first N dimensions).
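
For reference, here is a minimal sketch of the kind of Brownian bridge construction involved (my own illustration, not the exact code I used). It assumes the number of time steps is a power of two, that the quasi-random numbers for one path have already been mapped to standard normals, and that the first, most important dimensions are given to the asset factor, the variance dimensions then consuming the remaining normals sequentially.

// Minimal Brownian bridge path construction for the asset factor.
// Assumes numSteps is a power of two and z contains numSteps independent
// standard normal draws (e.g. from inverted Sobol numbers), ordered so that
// z[0] drives the terminal value, the most important dimension for QMC.
public final class BrownianBridge {

    public static double[] buildPath(double[] z, double dt) {
        int n = z.length;
        double[] w = new double[n + 1]; // w[0] = 0 is the starting point
        w[n] = Math.sqrt(n * dt) * z[0]; // terminal point uses the first (best) dimension
        int zIndex = 1;
        for (int m = 1; m < n; m *= 2) { // bisection levels
            int step = n / (2 * m);
            for (int k = 0; k < m; k++) {
                int left = 2 * k * step;
                int mid = left + step;
                int right = left + 2 * step;
                double tl = left * dt, tm = mid * dt, tr = right * dt;
                double mean = ((tr - tm) * w[left] + (tm - tl) * w[right]) / (tr - tl);
                double stdDev = Math.sqrt((tm - tl) * (tr - tm) / (tr - tl));
                w[mid] = mean + stdDev * z[zIndex++];
            }
        }
        return w; // Brownian path W(0), W(dt), ..., W(n*dt)
    }
}

The asset log-process is then driven by the increments w[i+1] - w[i], while the variance path from the QE scheme keeps consuming its normals sequentially.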


By contrast, the Schobel-Zhu QE scheme can make full use of the Brownian Bridge technique, in the asset process as well as in the variance process. van Haastrecht gives a summary of the volatility process under the QE scheme.
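
In Schobel-Zhu the volatility itself follows an Ornstein-Uhlenbeck process, d(sigma) = kappa * (theta - sigma) dt + psi dW, whose transition is Gaussian and can therefore be sampled exactly. The sketch below shows only that standard exact step, in my own notation; it is not van Haastrecht's exact formulation.

// Exact transition of the Schobel-Zhu volatility (Ornstein-Uhlenbeck) process:
// d(sigma) = kappa * (theta - sigma) dt + psi dW.
// sigma(t + dt) | sigma(t) is Gaussian, so one normal draw per step suffices,
// which is what makes the scheme as cheap as an Euler step.
public final class SchobelZhuVolStep {

    public static double nextVol(double sigma, double kappa, double theta,
                                 double psi, double dt, double z) {
        double e = Math.exp(-kappa * dt);
        double mean = theta + (sigma - theta) * e;
        double variance = psi * psi * (1.0 - e * e) / (2.0 * kappa);
        return mean + Math.sqrt(variance) * z; // z is a standard normal draw
    }
}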

Another nice property of Schobel-Zhu is that the QE simulation is as fast as Euler, and therefore 2.5x faster than the Heston QE.

I calibrated the model to the same surface, and the QMC price of an ATM call option seems to have similar accuracy to the Heston QMC price. But we can see that the Brownian Bridge does increase accuracy in this case. I was surprised that the accuracy was not much better than Heston's, but maybe that is because I have not yet implemented the martingale correction, while I did so in the Heston case.




Tuesday, January 21, 2014

Adjoint Algorithmic Differentiation for Black-Scholes

Adjoint algorithmic differentiation is particularly interesting in finance, as we often encounter a function that takes many inputs (the market data) and returns one output (the price), and we would also like to compute sensitivities (Greeks) with respect to each input.

As I am just getting started with it, to get a better grasp, I first tried to apply the idea to the analytic knock-out barrier option formula, by hand, only to find out I was making way too many errors by hand to verify anything. So I tried the simpler vanilla Black-Scholes formula. I also made various errors, but managed to fix all of them relatively easily.

I decided to compare how much time it took to compute price, delta, vega, theta, rho, rho2 between single sided finite difference and the adjoint approach. Here are the results for 1 million options:

FD time=2.13s
Adjoint time=0.63s

It works well, but doing it by hand is crazy and far too error prone. It might be simpler for Monte Carlo payoffs, however.
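
To make the idea concrete, here is a minimal hand-written sketch of the adjoint approach on the Black-Scholes call price (an illustration, not my actual implementation; it omits dividends, so there is no rho2, and the cumulative normal uses a simple Abramowitz-Stegun approximation just to keep the example self-contained).

// Hand-written adjoint (reverse-mode) derivatives of the Black-Scholes call price.
// Forward sweep computes the price; reverse sweep propagates one adjoint per
// intermediate variable and returns delta, vega, rho and dPrice/dT in one pass
// (theta is -dPrice/dT).
public final class BlackScholesAdjoint {

    // returns {price, delta, vega, rho, dPrice/dT}
    public static double[] priceAndGreeks(double s, double k, double vol, double r, double t) {
        // forward sweep
        double sqrtT = Math.sqrt(t);
        double sig = vol * sqrtT;
        double u = Math.log(s / k) + (r + 0.5 * vol * vol) * t;
        double d1 = u / sig;
        double d2 = d1 - sig;
        double df = Math.exp(-r * t);
        double nd1 = cnd(d1), nd2 = cnd(d2);
        double price = s * nd1 - k * df * nd2;

        // reverse sweep, seeded with dPrice/dPrice = 1
        double sBar = nd1;                       // from price = s*nd1 - k*df*nd2
        double nd1Bar = s;
        double dfBar = -k * nd2;
        double nd2Bar = -k * df;
        double rBar = dfBar * (-t * df);         // from df = exp(-r*t)
        double tBar = dfBar * (-r * df);
        double d1Bar = nd1Bar * pdf(d1);         // from nd1 = N(d1)
        double d2Bar = nd2Bar * pdf(d2);
        d1Bar += d2Bar;                          // from d2 = d1 - sig
        double sigBar = -d2Bar;
        double uBar = d1Bar / sig;               // from d1 = u / sig
        sigBar += -d1Bar * d1 / sig;
        sBar += uBar / s;                        // from u = log(s/k) + (r + vol^2/2)*t
        rBar += uBar * t;
        double volBar = uBar * vol * t;
        tBar += uBar * (r + 0.5 * vol * vol);
        volBar += sigBar * sqrtT;                // from sig = vol * sqrtT
        double sqrtTBar = sigBar * vol;
        tBar += sqrtTBar * 0.5 / sqrtT;          // from sqrtT = sqrt(t)

        return new double[]{price, sBar, volBar, rBar, tBar};
    }

    private static double pdf(double x) {
        return Math.exp(-0.5 * x * x) / Math.sqrt(2 * Math.PI);
    }

    // Abramowitz-Stegun style approximation of the standard normal CDF
    private static double cnd(double x) {
        double ax = Math.abs(x);
        double t = 1.0 / (1.0 + 0.2316419 * ax);
        double poly = t * (0.319381530 + t * (-0.356563782
                + t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))));
        double n = 1.0 - pdf(ax) * poly;
        return x >= 0 ? n : 1.0 - n;
    }
}

All the sensitivities come out of one forward plus one reverse sweep, a constant multiple of a single price evaluation, whereas bumping requires one extra evaluation per input.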

There are not many Java tools that can do reverse automatic differentiation. I found a thesis on it, with an interesting bytecode-oriented approach (one difficulty is that you need to reverse loops and while statements).


Sunday, January 12, 2014

Placing the Strike on the Grid and Payoff Smoothing in Finite Difference Methods for Vanilla Options

Pooley et al., in "Convergence Remedies for Non-smooth Payoffs in Option Pricing", suggest that placing the strike on the grid for a vanilla option is good enough.


At the same time, Tavella and Randall show in their book that, numerically, placing the strike in the middle of two nodes leads to a more accurate result. My own numerical experiments confirm Tavella and Randall's suggestion.

In reality, what Pooley et al. really mean is that quadratic convergence is maintained if the strike is on the grid for vanilla payoffs, contrary to the case of discontinuous payoffs (like digital options), where the convergence drops to order 1. So it is fine to place the strike on the grid for a vanilla payoff, but it is not optimal: it is still better to place it in the middle of two nodes. Here are the absolute errors in a put option price:

on grid, no smoothing         0.04473021824995271
on grid, Simpson smoothing    0.003942854282069419
on grid, projection smoothing 0.044730218065351934
middle, no smoothing          0.004040359609906119


As expected (and mentioned in Pooley et al.), the projection does not do anything here. When the grid size is doubled, the convergence ratio of all methods is the same (order 2), but placing the strike in the middle still increases accuracy significantly.
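
For reference, the Simpson smoothing used above replaces the payoff at each node by its average over the surrounding half-cells, the average being approximated by Simpson's rule. Here is a minimal sketch, assuming a uniform grid; the grid and payoff in the main method are made up for illustration.

// Simpson (cell-averaging) smoothing of the initial condition on a uniform grid:
// the payoff at node x_i is replaced by its average over [x_i - h/2, x_i + h/2],
// the average being approximated with Simpson's rule.
import java.util.function.DoubleUnaryOperator;

public final class SimpsonSmoothing {

    public static double[] smooth(double[] x, DoubleUnaryOperator payoff) {
        double h = x[1] - x[0]; // uniform spacing assumed
        double[] v = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            double left = payoff.applyAsDouble(x[i] - 0.5 * h);
            double mid = payoff.applyAsDouble(x[i]);
            double right = payoff.applyAsDouble(x[i] + 0.5 * h);
            // Simpson's rule over a cell of width h, divided by h => cell average
            v[i] = (left + 4.0 * mid + right) / 6.0;
        }
        return v;
    }

    public static void main(String[] args) {
        double strike = 100.0;
        double[] s = new double[101];
        for (int i = 0; i < s.length; i++) s[i] = 2.0 * i; // 0..200, strike on the grid
        double[] v = smooth(s, x -> Math.max(strike - x, 0.0)); // put payoff
        System.out.println(v[50]); // node at the strike becomes h/12 instead of 0
    }
}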

Here are the same results, but for a digital put option:
on grid, no smoothing         0.03781319921461046
on grid, Simpson smoothing    8.289052335705427E-4
on grid, projection smoothing 1.9698293587372406E-4
middle, no smoothing          3.5122153011418744E-4


Here only the last three methods show order-2 convergence, and projection is indeed the most accurate method, but placing the strike in the middle is really not that much worse, and much simpler.

A disadvantage of Simpson smoothing (or smoothing by averaging) is that it breaks put-call parity (see the paper "Exact Forward and Put Call Parity with TR-BDF2").

I think the emphasis in their paper on "no smoothing is required" for vanilla payoffs can be misleading. I hope I have clarified it in this post.


Wednesday, January 08, 2014

Coordinate Transform of the Andreasen Huge SABR PDE & Spline Interpolation

Recently, I noticed how close the two PDE-based approaches from Andreasen-Huge and Hagan for an arbitrage-free SABR are. Hagan gives a local volatility very close to the one Andreasen-Huge use in their forward PDE in call prices. A multistep Andreasen-Huge (instead of their one-step PDE method) gives back prices and densities nearly equal to those of Hagan's density-based approach.

Hagan proposed, in an unpublished paper, a coordinate transformation for two reasons: the ideal range of strikes for the PDE can be very large, and concentrating the points where they matter should improve stability and accuracy. The transform itself can be found in the Andersen-Piterbarg book "Interest Rate Modeling", and is similar to the famous log transform, but for a general local volatility function (phi in the book's notation).


There are two ways to transform Andreasen Huge PDE:
  • through a non-uniform grid: the strike grid is obtained by applying the inverse transform to a uniform grid in the transformed variable (paying attention to still put the strike in the middle of two points). This is detailed in the Andersen-Piterbarg book, and a sketch is given after this list.

  • through a variable transform in the PDE: this gives a slightly different PDE to solve. One then still needs to convert a given strike to the new PDE variable. This kind of transform is detailed in the Tavella-Randall book "Pricing Financial Instruments: The Finite Difference Method", for example.

Both are more or less equivalent. I would expect the latter to be slightly more precise, but I tried the former, as it is simpler to test if you already have non-uniform parabolic PDE solvers.
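
As an illustration of the first approach, here is a minimal sketch (under my own simplifying assumptions, not the Andreasen-Huge or Andersen-Piterbarg code): it tabulates z(K), the integral from Kmin to K of dx/phi(x), by trapezoidal integration, lays a uniform grid in z, and maps it back to strikes by linear interpolation. It ignores the extra refinement of re-centering the grid so that the strike falls in the middle of two points.

// Non-uniform strike grid for the forward PDE, built from a uniform grid in the
// transformed variable z(K) = integral from kMin to K of dx / phi(x), where phi is
// the local volatility function (Andersen-Piterbarg style transform).
// Assumptions of this sketch: phi is strictly positive on [kMin, kMax] and the
// mapping z -> K is recovered by piecewise-linear interpolation on a fine grid.
import java.util.function.DoubleUnaryOperator;

public final class TransformedStrikeGrid {

    public static double[] build(double kMin, double kMax, int size,
                                 DoubleUnaryOperator phi) {
        int fine = 20 * size; // fine grid used to tabulate z(K) by trapezoidal integration
        double h = (kMax - kMin) / fine;
        double[] k = new double[fine + 1];
        double[] z = new double[fine + 1];
        k[0] = kMin;
        z[0] = 0.0;
        for (int i = 1; i <= fine; i++) {
            k[i] = kMin + i * h;
            z[i] = z[i - 1] + 0.5 * h * (1.0 / phi.applyAsDouble(k[i - 1])
                                       + 1.0 / phi.applyAsDouble(k[i]));
        }
        // uniform grid in z, mapped back to strikes by linear interpolation of K(z)
        double[] strikes = new double[size + 1];
        int j = 0;
        for (int i = 0; i <= size; i++) {
            double zi = z[fine] * i / size;
            while (j < fine - 1 && z[j + 1] < zi) j++;
            double w = (zi - z[j]) / (z[j + 1] - z[j]);
            strikes[i] = k[j] + w * (k[j + 1] - k[j]);
        }
        return strikes;
    }
}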

It works very well, but I found an interesting issue when computing the density (the second derivative of the call price): if one relies on a Hermite kind of spline (Bessel/parabolic or harmonic), the density wiggles around. A C2 cubic spline solves this problem. Initially I thought those wiggles could be produced because the interpolation did not respect monotonicity, and I tried a Hyman monotonic cubic spline out of curiosity; it did not change anything (in an earlier version of this post I had a bug in my Hyman filter), as it preserves monotonicity but not convexity. The wiggles are only an effect of the approximation of the derivative values.
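
As an illustration of the C2 route, here is a minimal sketch (one possible way to do it, not the code behind the figures): for a natural cubic spline interpolating the call prices, the second derivatives solved for at the knots are directly the density values d2C/dK2.

// Density from call prices via a C2 (natural) cubic spline: the spline's second
// derivatives at the knots, obtained from a tridiagonal system, are directly the
// values of d2C/dK2, i.e. the implied probability density at the knots.
public final class SplineDensity {

    public static double[] densityAtKnots(double[] k, double[] call) {
        int n = k.length;
        double[] m = new double[n];            // second derivatives, m[0] = m[n-1] = 0 (natural)
        double[] diag = new double[n], rhs = new double[n], upper = new double[n];
        diag[0] = 1.0; diag[n - 1] = 1.0;      // natural boundary conditions
        for (int i = 1; i < n - 1; i++) {
            double hl = k[i] - k[i - 1], hr = k[i + 1] - k[i];
            diag[i] = 2.0 * (hl + hr);
            upper[i] = hr;
            rhs[i] = 6.0 * ((call[i + 1] - call[i]) / hr - (call[i] - call[i - 1]) / hl);
        }
        // Thomas algorithm: forward elimination, then back substitution
        for (int i = 1; i < n - 1; i++) {
            double sub = k[i] - k[i - 1];      // sub-diagonal entry of row i
            double factor = sub / diag[i - 1];
            diag[i] -= factor * upper[i - 1];
            rhs[i] -= factor * rhs[i - 1];
        }
        for (int i = n - 2; i >= 1; i--) {
            m[i] = (rhs[i] - upper[i] * m[i + 1]) / diag[i];
        }
        return m;
    }
}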

Initially, I did not notice this on the uniform discretization, mostly because I used a large number of strikes in the PDE (here I use only 50 strikes), but also because the effect is somewhat less pronounced in that case.

I also discovered a bug in my non-uniform implementation of the Hagan density PDE: I forgot to take into account an additional dF/dz factor when the density is integrated. As a result, the density was garbage when computed by a numerical difference.
HaganDensity denotes the transformed PDE on density approach. Notice the nonsensical spikes.

"Bad" call prices around the forward with the Hagan density PDE. Notice the jumps.

No jumps anymore after the dF/dz fix.
Update March 2014 - I now have a paper with Matlab code, "Finite Difference Techniques for Arbitrage Free SABR".

Monday, January 06, 2014

Random Hardware Issues

Today, after wondering why my desktop computer had become so unstable (frequent crashes under Fedora), I found out that the micro USB port of my cell phone has some kind of short circuit. My phone has been behaving strangely in 2014: it lasted nearly a week on battery (I lost it for half of the week), and it seems to shut down for no particular reason once in a while.

On the positive side, I also discovered, after owning my monitor for around 5 years, that it has SD card slots on the side, as well as USB ports. I always used the USB ports of my desktop and never really looked at the side of my monitor...

I also managed to seriously boost the speed of my home network with a cheap TP-Link wifi router. The one included in the ISP box-modem only supported 802.11g and had really crap coverage, so crap that it was seriously limiting the internet traffic. In the end it was just a matter of disabling wifi and DHCP on the box, adding the new router in the DMZ, and adding a static WAN IP for the box in the router configuration. I did not realize how much of a difference this could make, even on simple websites. I was also surprised that, for some strange reason, routers are cheaper than access points these days.
