# CoDeFi, a new algorithm for risk measurement

In this preprint, we present a new algorithm (CoDeFi) to tackle the Curse of Dimensionality in Finance and to deal with a broad class of partial differential equations (PDEs), including Kolmogorov equations such as, for instance, the Black-Scholes equation. As its main feature, our method solves Kolmogorov equations in large dimensions and provides a very general framework for risk measurement. In financial applications, the number of dimensions corresponds to the number of underlyings or risk sources.

The Curse of Dimensionality (CoD, a term coined in a 1957 paper by R.E. Bellman) refers to the fact that computational time increases exponentially with the number of risk sources; in the present preprint we propose a solution to this long-standing open problem. Our approach combines techniques from the theory of partial differential equations (PDEs): Monte-Carlo trajectories, a classical numerical method that is not cursed (not affected by the CoD), are used as grids moving in time. These grids allow us to solve Kolmogorov equations with unstructured-mesh PDE techniques together with optimal transport theory, thereby overcoming the curse. We also use zero-error calibration techniques, for a perfect fit of replication instruments or complex financial models. All these techniques are bundled together in a framework that we call CoDeFi; it can be seen as a general risk-measurement framework based on the algorithm presented in [4]-[6].
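
To fix ideas on the moving-grid ingredient, here is a toy sketch (my own illustration, not the CoDeFi implementation; the driftless lognormal dynamics, volatility, and time step are assumptions made for the example) of Monte-Carlo trajectories used as time-dependent unstructured grids in ${\mathbb{R}}^D$:

```python
import numpy as np

def monte_carlo_grid(n_points, n_steps, dim, sigma=0.2, dt=0.1, seed=0):
    """Generate Monte-Carlo trajectories of a driftless lognormal
    (martingale) process; at each time step the N points in R^D
    form an unstructured grid on which a PDE can be discretized."""
    rng = np.random.default_rng(seed)
    grids = [np.ones((n_points, dim))]  # all underlyings start at 1
    for _ in range(n_steps):
        z = rng.standard_normal((n_points, dim))
        # exact lognormal step, a martingale in expectation
        grids.append(grids[-1] * np.exp(sigma * np.sqrt(dt) * z - 0.5 * sigma**2 * dt))
    return grids  # list of (n_points, dim) arrays, one per time step

grids = monte_carlo_grid(n_points=1000, n_steps=10, dim=64)
```

The cost of generating such a grid grows linearly in the number of points and in the dimension $D$, rather than exponentially, which is the starting point for escaping the CoD.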

This preprint reviews this technology and proposes a first benchmark for the multi-dimensional case, up to 64 dimensions (or risk sources). Indeed, to our knowledge, few other technologies could benchmark the CoDeFi framework: optimal quantization [1] or wavelet analysis [9] might be used for up to, say, 10 dimensions. Above this limit, American Monte-Carlo methods [7] might provide lower bounds, as these methods are known to compute sub-optimal exercise strategies. We emphasize that the same computational framework is used to provide simulations for nonlinear hyperbolic problems [3].

Our perception is that this technology can already confidently compute risk measurements. There are numerous potential applications linked to the curse of dimensionality. We identified some of them in the insurance industry, but mainly in the financial sector. Beyond pricing and hedging, which allow issuing new financial products better adapted to clients' needs, or market arbitraging, we address risk measurements: indeed, the tests performed in this paper show that CoDeFi can compute accurate Basel regulatory measures based on VaR (Value at Risk) or CVA (Credit Value Adjustment) on a single laptop, whereas farms of thousands of computers are used on a daily basis today with approximation methods. Finally, such a framework could be helpful for systemic risk measurement, which can be accurately modeled through high-dimensional Kolmogorov equations like (4). This could notably be useful for developing tools dedicated to the supervision of economic crises.

[1] V. Bally, G. Pagès, J. Printems, First order schemes in the numerical quantization method, INRIA Research Report N° 4424.

[3] P.G. LeFloch, J.M. Mercier, Revisiting the method of characteristics via a convex hull algorithm, J. Comput. Phys. 298 (2015), 95-112.

[4] P.G. LeFloch, J.M. Mercier, A framework to solve high-dimensional Kolmogorov equations. In preparation.

[6] J.M. Mercier, A high-dimensional pricing framework for financial instruments valuation, April 2014, available at http://ssrn.com/abstract=2432019.

[7] F.A. Longstaff, E.S. Schwartz, Valuing American options by simulation: a simple least-squares approach, The Review of Financial Studies 14(1) (2001), 113-147.

[9] A.M. Matache, P.A. Nitsche, C. Schwab, Wavelet Galerkin pricing of American options on Lévy driven assets, Research Report N° 2003/06 (2003).

# The shortest path from Anarchy to Alcoholism

goes through Jesus and ants, according to Wikipedia (14/07/2012). This path is precisely the following one:

For instance, on the Wikipedia page Ture Nerman you should find a link to the Wikipedia page Alcoholism. The path quoted below is the shortest one when distance is measured by the total length of the linked Wikipedia page names: "AnarchismJesusAntTermiteTermitomycesR.HeimRoger HeimUppsala UniversityTure NermanAlcoholism" has a distance of 91 characters, and it is the shortest Wikipedia path from Anarchy to Alcoholism according to this distance.
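
As a minimal sketch of this name-length distance (the graph below is a toy placeholder, not the real Wikipedia dump), the path can be computed with Dijkstra's algorithm, weighting each step by the length of the page title it lands on:

```python
import heapq

def shortest_name_length_path(graph, start, goal):
    """Dijkstra's algorithm where the cost of entering a page
    is the number of characters in its title."""
    # priority queue of (accumulated name length, page, path so far)
    queue = [(len(start), start, [start])]
    best = {start: len(start)}
    while queue:
        cost, page, path = heapq.heappop(queue)
        if page == goal:
            return cost, path
        for nxt in graph.get(page, ()):
            new_cost = cost + len(nxt)
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(queue, (new_cost, nxt, path + [nxt]))
    return None  # goal unreachable

# Toy graph standing in for the Wikipedia link structure.
toy = {
    "Anarchism": ["Jesus", "History"],
    "Jesus": ["Ant"],
    "History": ["Alcoholism"],
    "Ant": ["Alcoholism"],
}
print(shortest_name_length_path(toy, "Anarchism", "Alcoholism"))
```

On the real 10-million-node graph, the same algorithm applies; only the storage and traversal of the edges become the hard part.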

You could be skeptical reading such a result: why are some people losing their time and money computing such a path? Or you could be enthusiastic: this might open up some work on knowledge cartography. For instance, pertinent metrics could be found (why not link density, with a more pertinent distance function) to surface potentially interesting things.

I want here to illustrate a technical point: this shortest path is computed over a BIG graph. The nodes of this graph are Wikipedia pages (about 10 million pages), and the edges are links to other pages (about 100 million links). The data are publicly available as a Wikipedia dump (40 GB) in Extensible Markup Language (XML). However, XML is not a graph structure but a tree data structure. In particular, there is a priori no way to use XML technologies to compute this Wikipedia shortest path.

In fact, I did use XML technologies to compute this shortest path. Indeed, the main point of this post is to show that these technologies can scale up quite efficiently to handle big graphs. Applications of such an XML-based graph database are numerous, especially for data scientists. Since XML is very rich from a semantic point of view, using XML-based technology for data extraction, exploitation, and transformation on graphs is more than pertinent: it is really very efficient.

Technically, this work relies mainly on an open-source Extensible Markup Language (XML) database called BaseX. The common language for querying XML is XPath, used extensively through the functional programming language XQuery. As noted above, since XML is not a graph structure, you cannot use XQuery/XPath to query graphs. For this Wikipedia proof of concept, I extended XPath to directed graphs, so that I can now use XQuery together with this extension to query graphs. I gave this technology the name XQUERY++ (but I am open to a better name!).

Indeed, I have already noticed (see this post) that this extension of XML technology is really powerful for data exploitation. I am also already using these technologies for finance applications, because they allow fast data extraction and transformation, as well as easy and powerful reporting libraries.

You might ask why I invested in Yet Another Graph Technology, as this market is already slowly maturing. Indeed, I had no choice: I have tried to use graph databases, but they present two major inconveniences:

– Firstly, I do not know a graph query language powerful enough for my purposes. By contrast, XQuery is a very powerful data programming language that is W3C-standardized.

– Secondly, XML is at the heart of big companies' information systems (IS), as the de-facto tool used in IS urbanization. My clients' data are usually XML- or SQL-based (and SQL can quite easily be exported to XML).

Note that XML databases and XQuery have a reputation for poor performance. I confirm this point for heavy computational algorithmic tasks. Indeed, the "hard" algorithmic parts of this Wikipedia work were written by binding the BaseX database with C++ (binding XML databases with C++, Java, C#, Python, etc. is quite easy now).

I hope that this small proof of concept convinced the reader that this technology can now be considered a serious potential competitor to graph database solution vendors. For those interested in this technology, I have included an XQUERY++ presentation!

# Explicit solutions to multi-dimensional conservation laws

In this post, I am releasing a revised article concerning the convex hull algorithm, including numerical analysis in higher dimensions; see this link. Extending this work to higher dimensions was a challenging algorithmic task.

This revised version applies the convex hull algorithm (CHA) to a two-dimensional problem: computing the solution $u=u(t,x_1,x_2)$ (for $(x_1,x_2) \in {\mathbb{R}}^2$) of the following Burgers-type initial value problem:

$\partial_t u + \nabla \cdot f(u) = 0,\quad f(u) \in {\mathbb{R}}^2,\quad u(0,x_1,x_2)=\max\Big(\frac{1 - |(x_1,x_2)|_\infty}{10},\,0\Big)$

where, for any exponent $1 \le p \le \infty$, we denote the norm $|X|_p = \Big( \sum_{d=1}^{D} |x_d|^p \Big)^{1/p}$, with the usual maximum convention for $p=\infty$. The first picture below illustrates the behavior of this equation at different times for a simple flux $f(u)= (\frac{u^2}{2},0)$ on a structured mesh. The next picture presents a more complex flux $f(u)= (\frac{u^2}{2},\frac{u^3}{30})$ on an unstructured mesh.
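
As an independent sanity check (a rough sketch of my own, using a standard first-order Godunov scheme rather than the exact CHA of the article), the simple flux $f(u)=(\frac{u^2}{2},0)$ makes each grid row in $x_2$ evolve as an independent one-dimensional Burgers problem:

```python
import numpy as np

def upwind_burgers_2d(u0, dx, t_end, cfl=0.9):
    """First-order Godunov scheme for u_t + (u^2/2)_{x_1} = 0 on a
    2-D periodic grid: the flux acts along axis 0 only, so each row
    in x_2 evolves as an independent 1-D Burgers problem."""
    u = u0.copy()
    t = 0.0
    while t < t_end:
        dt = min(cfl * dx / max(np.abs(u).max(), 1e-12), t_end - t)
        ul, ur = u, np.roll(u, -1, axis=0)  # left/right cell states
        fl, fr = 0.5 * ul**2, 0.5 * ur**2
        # Godunov flux at the interface between cells i and i+1 (axis 0)
        flux = np.where(ul <= ur,
                        np.where((ul <= 0) & (ur >= 0), 0.0, np.minimum(fl, fr)),
                        np.maximum(fl, fr))
        u = u - dt / dx * (flux - np.roll(flux, 1, axis=0))
        t += dt
    return u

# Initial condition from the post: max((1 - |x|_inf)/10, 0)
x = np.linspace(-2, 2, 200, endpoint=False)
X1, X2 = np.meshgrid(x, x, indexing="ij")
u0 = np.maximum((1 - np.maximum(np.abs(X1), np.abs(X2))) / 10, 0)
u = upwind_burgers_2d(u0, x[1] - x[0], t_end=5.0)
```

Unlike the CHA, this scheme is approximate and diffusive near shocks, but it gives a cheap reference solution for the structured-mesh case.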

I am quite impressed by these numerical results: the shock locations (corresponding to times 5 and 7 on the figures) lie on a one-dimensional manifold, as expected (see the article for details). Some work remains, though, to stabilize this algorithm on unstructured meshes.

2D solution of a non-linear conservation law (CHA-algorithm)

2-D solution using an unstructured mesh

# The Convex Hull transformation

In this post, I am releasing a preprint, which can be found on arXiv and has been submitted to the Journal of Computational Physics. It is related to joint work with Philippe LeFloch, work that I had begun to discuss in these posts: here and here.

I am really excited about this work. We introduce a transformation of maps, named the convex hull transformation. This transformation is new to my knowledge; it might belong to transport theory, and can be compared to Yann Brenier's polar factorization. We use this new transformation to compute EXACT solutions to numerous nonlinear hyperbolic equations arising in mathematical physics. This is probably my best mathematical work so far. The abstract of the paper is included below.

Abstract

We revisit the method of characteristics for shock wave solutions to nonlinear hyperbolic problems and we describe a novel numerical algorithm, the convex hull algorithm (CHA), in order to compute both entropy dissipative solutions (satisfying all relevant entropy inequalities) and entropy conservative (or multivalued) solutions to nonlinear hyperbolic conservation laws. Our method also applies to Hamilton-Jacobi equations and other problems endowed with a method of characteristics. From the multivalued solutions determined by the method of characteristics, our algorithm "extracts" the entropy dissipative solutions, even after the formation of shocks. It applies to both convex and non-convex fluxes/Hamiltonians. We demonstrate the relevance of the proposed approach with a variety of numerical tests, including a problem from fluid dynamics.

# Fast technologies for evaluating strategies or risk measures over big portfolios with many risk sources

Some months ago I posted a working paper at SSRN (it can be found at the SSRN site), related to a post called "a breach in the curse of dimensionality". I recently implemented this method with success, feeding it with real big-portfolio data through a CRIMERE database concept.

Let me recall that, today, nobody knows how to evaluate complex risk measures, such as CVA or systemic-risk ones, for big bank or insurance portfolios, because there are too many risk sources.

The results seem to be in accordance with my expectations: fast algorithms to evaluate complex risk measures, such as CVA or systemic-risk ones, for big bank portfolios now seem possible. Moreover, most probably, a single PC is enough to compute them: don't waste your money and your intelligence buying complex distributed systems to compute these measures. It also means that this technique should be able to evaluate complex strategies over big portfolios, such as those coming from banks or insurance companies. Please do not hesitate to contact me if you have any needs in this direction.

# Explicit solver for nonlinear hyperbolic equations

In this post, I am publishing a small executable, which has the particularity of being able to compute the exact solution of a class of nonlinear equations, the scalar conservation laws in one dimension, whose prototype is the Burgers equation. This executable illustrates some of the work we have underway with Philippe LeFloch, work that I had begun to discuss in this post.

This small executable is distributed for educational purposes, without warranty of any kind, to allow teachers to illustrate the behavior of the solutions of these equations, such as shock-interaction phenomena or asymptotic behavior for large times. A very primitive interface allows the user to define his or her own tests. This is a first demonstrator: if there is any interest, do not hesitate to contact me with change requests, bugs, or suggestions.

This program was designed to compute the solution $u(t,x), t\ge 0, x \in {\mathbb{R}}$ of the Cauchy problem for scalar conservation laws in one dimension:

$\partial_t u+\partial_x f(u)=0,\quad u(0,\cdot) = u_0(\cdot),$

where $u_0$ is the initial condition and $f(u)$ the nonlinearity. I coded two types of nonlinearities: a convex one, $f(u) = u^2/2$, and a non-convex one, $f(u) = u^3/3 - u$, but it would be very easy to integrate any other nonlinearity. In fact, the computation engine used for this small program can handle problems much more complex than one-dimensional scalar conservation laws: the approach works equally well in any dimension, for systems of nonlinear equations such as those coming from Hamilton-Jacobi equations or other nonlinear hyperbolic problems, and for all types of nonlinearity (convex and non-convex). Do not hesitate to contact me if you have any need for numerical computations of this type of equation. Finally, I recall that scalar conservation laws admit infinitely many weak solutions. This program computes the vanishing-viscosity solution, considered as the "physical" solution of interest, but it could also produce the other ones.
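
For readers who want to experiment without downloading the executable, here is a minimal sketch (my own illustrative code, not the distributed solver; unlike the exact method of this post, this standard first-order Godunov scheme is approximate and diffusive near shocks) for the convex nonlinearity $f(u)=u^2/2$ with a Gaussian initial condition:

```python
import numpy as np

def godunov_burgers(u0, dx, t_end, cfl=0.9):
    """First-order Godunov scheme for u_t + (u^2/2)_x = 0
    on a periodic 1-D grid; converges to the vanishing-viscosity
    solution as the mesh is refined."""
    u = u0.copy()
    t = 0.0
    while t < t_end:
        dt = min(cfl * dx / max(np.abs(u).max(), 1e-12), t_end - t)
        ul, ur = u, np.roll(u, -1)  # states left/right of each interface
        fl, fr = 0.5 * ul**2, 0.5 * ur**2
        # Godunov flux for the convex flux f(u) = u^2/2
        flux = np.where(ul <= ur,
                        np.where((ul <= 0) & (ur >= 0), 0.0, np.minimum(fl, fr)),
                        np.maximum(fl, fr))
        u = u - dt / dx * (flux - np.roll(flux, 1))
        t += dt
    return u

x = np.linspace(-5, 5, 400, endpoint=False)
u0 = np.exp(-x**2)          # Gaussian initial condition, as in the program
u = godunov_burgers(u0, x[1] - x[0], t_end=2.0)
```

By time $t=2$ the Gaussian has steepened into the familiar N-wave profile, which the program's "++" button lets you observe step by step.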

I designed these numerical techniques, so I am of course judge and party in this work. However, as a numerical analyst, it seems clear to me that they are much more efficient than the existing ones for computing nonlinear hyperbolic equations.

This executable is distributed as a rar archive. Once extracted, a folder contains two files:

– Exact_Solver.exe is an executable running under Windows.

– Initial_Condition.xml is an input file allowing the user to define his or her own initial conditions.

1. Exact_Solver.exe: Here is the main window, once the exe is run:

Its basic functionality is the following:

1) The box labelled "Inc. Time" defines a time increment, with default value 0.1 sec. You can input any increment between 0 and 99.99.

2) The button "++" advances the solution by the increment set in the "Inc. Time" box. As an illustration, here is the solution of the Burgers equation with Gaussian initial conditions, computed at time 99.99.

3) The combo box on the right allows selecting any of the initial conditions defined in "Initial_Conditions.xml".

• Initial_Conditions.xml

This is an XML file. Here are its default values:

One sees the two nonlinearities available, "u2" or "u3_u", and, within each, the three types of initial conditions available:

1)  <Gaussian> indicates that the initial condition is a Gaussian.

2)  <Riemann> indicates that the initial condition is of Riemann type, with $u_0(x) = u_l$ for $x \le S$ and $u_0(x) = u_r$ for $x > S$.

3)  <RiemannInteraction> indicates that the initial condition is Riemann interaction data: $u_0(x) = u_l$ for $x \le S_l$, $u_0(x) = u_m$ for $S_l \le x \le S_r$, and $u_0(x) = u_r$ otherwise.


# A High-Dimensional Pricing Framework for Financial Instruments Valuation

I posted a working paper at SSRN, which can be found either at the SSRN site or directly here on this blog, related to my previous post "a breach in the curse of dimensionality". This is a working paper, and I am currently implementing the method, so the paper will likely change while I experiment. Do not hesitate to contact me for further details or feedback.

# A breach in the Curse of Dimensionality

I am quite excited to release a work devoted to the curse of dimensionality in the context of financial mathematics. I am currently trying to validate this work with peers, and will publish it as soon as it is validated.

This work proposes a framework for the valuation of complex OTC products, including American exercise, written on a large number of underlyings. The motivation is to provide numerical algorithms able to quickly value big investment-bank portfolios with sophisticated pricing models, such as those coming from risk measures; we think particularly of CVA (counterparty risk valuation) ones, which lead to high-dimensional optimal stopping problems.

Indeed, it has already been noticed that current numerical pricing methods face an open mathematical problem, the curse of dimensionality (see for instance the authors Crepey, Labordere, Reghai), particularly in the context of CVA computations. We also note that the econometric problem, consisting in finding a risk-neutral dynamic of the underlyings, faces the very same open problem. Indeed, the CVA approach, like some other existing risk measures, although virtuous, cannot be credible as long as it remains "cursed".

The framework introduced has two purposes: firstly, to provide a fast calibration algorithm determining a risk-neutral econometry, using riskless market prices and able to capture correlation structures between underlyings; secondly, to design fast backwardation algorithms upon this econometry. Both algorithms break the curse-of-dimensionality wall. This is achieved by thoroughly exploiting a particular setting, given by martingale processes, a classical assumption in mathematical finance. From a mathematical point of view, these results are obtained by blending together stochastic analysis, partial differential equations, and optimal transport theory, and they generalize this paper to the multi-dimensional case.

My perception is that these tools should be the right candidates to provide a standardized framework for computing risk measures of large portfolios or investment strategies, as well as a pedestal for tackling credible systemic risk measures. Moreover, this approach seems to be very performant from an algorithmic point of view. The resulting algorithms should drastically reduce the computational burden of existing measures, such as CVA or risk-neutral front-office valuations.

The curse of dimensionality was opened as a mathematical problem roughly sixty years ago. This problem is of crucial importance in mathematical finance, where the dimensionality is the number of underlyings of a given portfolio, usually counting thousands of instruments such as shares, rates, commodities, default probabilities, and so on. Among the authors who tackled this problem in a financial context, let us cite for instance C. Schwab's PDE approach, or the stochastic approach of V. Bally, G. Pagès, and J. Printems. These authors provide methods working in "low" dimensions: it is usually considered that the algorithmic burden of these methods is too heavy beyond, say, ten underlyings.

For partial differential equations (PDEs), the curse of dimensionality is quite easy to illustrate: most numerical methods rely on Cartesian meshes consisting of $N^D$ points, $N$ being the number of points needed to discretize one dimension and $D$ the number of dimensions. Crude methods such as alternating direction implicit (ADI) schemes use $N^{D+1}$ operations to compute a solution converging, at rate $N^{-2}$ in the best case, toward the solution of the continuous equation. Hence, the overall computational burden to reach an accuracy of order $\epsilon$, for a small $\epsilon>0$, is $\epsilon^{-\frac{D+1}{2}}$ in the best cases: a clear algorithmic complexity limitation for higher-dimensional problems, i.e. when $D$ becomes large, as it is in finance.
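
To make the growth concrete, a few lines of back-of-the-envelope arithmetic (illustrative numbers only) show how the $N^D$ grid size and the $N^{D+1}$ operation count explode with the dimension $D$, using the accuracy relation $\epsilon \sim N^{-2}$ above:

```python
# Cost of a Cartesian-mesh method: N^D grid points, ~N^(D+1) operations.
N = 100  # points per dimension (illustrative)

for D in (1, 2, 3, 10):
    grid_points = N ** D
    grid_ops = N ** (D + 1)
    print(f"D={D:2d}: grid points = {grid_points:.1e}, operations = {grid_ops:.1e}")

# Equivalently, in terms of the target accuracy eps ~ N^(-2):
eps = 1e-4                               # so N = eps^(-1/2) = 100
D = 10
cartesian_cost = eps ** (-(D + 1) / 2)   # eps^(-(D+1)/2) operations
print(f"Cartesian cost at eps={eps}, D={D}: {cartesian_cost:.1e}")
```

Already at $D=10$ the mesh has $10^{20}$ points, which is why mesh-based methods stall around ten underlyings.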

We designed algorithms based on $N$ points lying in ${\mathbb{R}}^D$. They show a rate of convergence of order $N^{-1/2}$ in the worst case, independently of the dimension $D$. For convex solutions, this rate can be as good as $N^{-2}$ without American exercise, and $N^{-1}$ with it, again independently of the dimension. Reaching this order of convergence requires $N^3$ operations. Hence, the overall computational burden to reach an accuracy of order $\epsilon$ is $\epsilon^{-3/2}$ in the best case, $\epsilon^{-3}$ with American exercise, and $\epsilon^{-6}$ in the worst case. A low number of discretization points should thus suffice to be accurate enough to reach the confidence limit of state-of-the-art financial modeling.

# Data Heterogeneity with XQUERY++ and BaseX …

… is the title of a talk that I should give tomorrow (14/02/14) at XML Prague, a conference on markup languages and data on the web. My slides can be found here: XMLPrague0214.