Browse our papers

Can ChatGPT Compute Trustworthy Sentiment Scores from Bloomberg Market Wraps?

Access to the SSRN paper

Authors: Baptiste Lefort, Eric Benhamou, David Saltiel, Beatrice Guez, Jean-Jacques Ohana and Damien Challet

Abstract: We used a dataset of daily Bloomberg Financial Market Summaries from 2010 to 2023, reposted on large financial media, to determine how global news headlines may affect stock market movements using ChatGPT and a two-stage prompt approach. We document a statistically significant positive correlation between the sentiment score and future equity market returns over the short to medium term, which reverts to a negative correlation over longer horizons. Validation of this correlation pattern across multiple equity markets indicates its robustness across equity regions and its resilience to non-linearity, as evidenced by a comparison of Pearson and Spearman correlations. Finally, we provide an estimate of the optimal horizon that strikes a balance between reactivity to new information and correlation. Read more

Submitted: January 9, 2024
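
The abstract above compares Pearson and Spearman correlations to check resilience to non-linearity. The following is a minimal sketch of that kind of comparison, not the authors' code: the toy data, the variable names and the no-ties rank transform are all assumptions made for illustration.

```python
import numpy as np

def pearson(x, y):
    """Linear (Pearson) correlation between two 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Rank (Spearman) correlation: Pearson correlation of the ranks.

    Assumption: no tied values, so a double argsort yields valid ranks.
    """
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return pearson(rank_x, rank_y)

# Hypothetical toy data: a monotone but non-linear link between
# a sentiment score and a future return.
sentiment = np.linspace(-1.0, 1.0, 21)
future_return = sentiment ** 3

print("Pearson :", pearson(sentiment, future_return))
print("Spearman:", spearman(sentiment, future_return))
```

When the two measures tell the same story on real data, the relationship is unlikely to be an artifact of a few extreme observations or of a purely non-linear effect.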

Deep Decoding of Strategies

Access to the SSRN paper

Authors: Eric Benhamou, David Saltiel, Beatrice Guez and Jean-Jacques Ohana

Abstract: To the best of our knowledge, the application of machine learning and in particular graphical models in the field of quantitative risk management is still a relatively recent and new phenomenon. This paper presents a new and effective methodology for decoding strategies. Given an investment universe, we calculate dynamic weights for a sparse portfolio whose aim is to replicate the strategy with the most stable allocation rules. Naturally, this can be formulated as a reinforcement learning problem whose reward is a weighted sum of tracking error and turnover. We show on stylized examples that we can accurately decode strategies or funds with meaningful factors and allocations. Read more

Submitted: June 6, 2022

Adaptive Supervised Learning for Volatility Targeting Models

Collaboration with  LOMBARD ODIER  team

Access to the SSRN paper

Authors: Eric Benhamou, David Saltiel, Serge Tabachnik, Corentin Bourdeix, François Chareyron and Beatrice Guez

Abstract: In the context of risk-based portfolio construction and pro-active risk management, finding robust predictors of future realised volatility is paramount to achieving optimal performance. Volatility has been documented in the economics literature to exhibit pronounced persistence, with clusters of high or low volatility regimes, and to mean-revert to a normal level, underpinning Nobel prize-winning work on Generalized Autoregressive Conditional Heteroskedasticity (GARCH) models. From a Reinforcement Learning (RL) point of view, this process can be interpreted as a model-based RL approach where the goal of the models is twofold: first, to represent the volatility dynamics and forecast its term structure and second, to compute a resulting allocation to match a given target volatility: hence the name "volatility targeting method for risk-based portfolios". However, the resulting volatility model-based RL approaches are hard to distinguish, as each model results in similar performance without a clear dominant one. We therefore present an innovative approach with an additional supervised learning step to predict the best model(s), based on the historical performance ordering of RL models. Our contribution shows that adding a supervised learning overlay to decide which model(s) to use provides an improvement over a naive benchmark consisting of averaging all RL models. A salient ingredient in this supervised learning task is to adaptively select features based on their significance, thanks to minimum importance filtering. This work extends our previous work on combining model-free and model-based RL. It mixes different types of learning procedures, namely model-based RL and supervised learning, opening new doors to combine different machine learning approaches. Read more

Submitted: September 15, 2021
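
The abstract above describes computing an allocation that matches a given target volatility from a volatility forecast. The abstract does not state the allocation rule, so the sketch below uses the common textbook rule (weight equals target volatility divided by forecast volatility) as an assumption; the function name, the 10% target and the leverage cap are hypothetical.

```python
import numpy as np

def vol_target_weight(forecast_vol, target_vol=0.10, max_leverage=2.0):
    """Exposure that scales a risky asset to a target volatility.

    Assumptions (not from the paper): weight = target / forecast,
    capped at a hypothetical maximum leverage.
    """
    forecast_vol = np.asarray(forecast_vol, dtype=float)
    raw_weight = target_vol / forecast_vol
    return np.minimum(raw_weight, max_leverage)

# Low forecast volatility leads to more exposure, high volatility to less.
print(vol_target_weight([0.05, 0.10, 0.20]))
```

Whatever model produces the forecast (GARCH or otherwise), the allocation step only needs the forecast itself, which is why several competing volatility models can feed the same rule.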

Explainable AI (XAI) Models Applied to Planning in Financial Markets

Access to the SSRN paper

Authors: Eric Benhamou, Jean Jacques Ohana, David Saltiel, Beatrice Guez and Steve Ohana

Abstract: Regime changes planning in financial markets is well known to be hard to explain and interpret. Can an asset manager explain clearly the intuition of his regime changes prediction on equity market? To answer this question, we consider a gradient boosting decision trees (GBDT) approach to plan regime changes on S&P 500 from a set of 150 technical, fundamental and macroeconomic features. We report an improved accuracy of GBDT over other machine learning (ML) methods on the S&P 500 futures prices. We show that retaining fewer and carefully selected features provides improvements across all ML approaches. Shapley values have recently been introduced from game theory to the field of ML. This approach allows a robust identification of the most important variables planning stock market crises, and of a local explanation of the crisis probability at each date, through a consistent features attribution. We apply this methodology to analyse in detail the March 2020 financial meltdown, for which the model offered a timely out of sample prediction. This analysis unveils in particular the contrarian predictive role of the tech equity sector before and after the crash. Read more

Submitted: June 8, 2021

Adaptive learning for financial markets mixing model-based and model-free RL for volatility targeting

Access to the SSRN paper

Authors: Eric Benhamou, David Saltiel, Serge Tabachnik, Sui Kai Wong and François Chareyron

Abstract: Model-Free Reinforcement Learning has achieved meaningful results in stable environments but, to this day, it remains problematic in regime changing environments like financial markets. In contrast, model-based RL is able to capture some fundamental and dynamical concepts of the environment but suffers from cognitive bias. In this work, we propose to combine the best of the two techniques by selecting various model-based approaches thanks to Model-Free Deep Reinforcement Learning. Using not only past performance and volatility, we include additional contextual information such as macro and risk appetite signals to account for implicit regime changes. We also adapt traditional RL methods to real-life situations by considering only past data for the training sets. Hence, we cannot use future information in our training data set as implied by K-fold cross validation. Building on traditional statistical methods, we use the traditional "walk-forward analysis", which is defined by successive training and testing based on expanding periods, to assert the robustness of the resulting agent. Finally, we present the concept of statistical difference's significance based on a two-tailed T-test, to highlight the ways in which our models differ from more traditional ones. Our experimental results show that our approach outperforms traditional financial baseline portfolio models such as the Markowitz model in almost all evaluation metrics commonly used in financial mathematics, namely net performance, Sharpe and Sortino ratios, maximum drawdown, maximum drawdown over volatility. Read more

Submitted: 22 April, 2021
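
The abstract above defines walk-forward analysis as successive training and testing on expanding periods, so that training never sees future data. A minimal sketch of such a split generator follows; the function name and its parameters are hypothetical, and the authors' actual period lengths are not given in the abstract.

```python
def walk_forward_splits(n_samples, initial_train, test_size):
    """Return (train_indices, test_indices) pairs with an expanding
    training window: each test block starts right after its training
    data, so no future observation leaks into training.
    """
    splits = []
    train_end = initial_train
    while train_end + test_size <= n_samples:
        train_idx = list(range(0, train_end))
        test_idx = list(range(train_end, train_end + test_size))
        splits.append((train_idx, test_idx))
        train_end += test_size  # expand the training window
    return splits

# Example: 10 observations, first 4 used for the initial training set.
for train_idx, test_idx in walk_forward_splits(10, 4, 2):
    print("train:", train_idx, "test:", test_idx)
```

Unlike K-fold cross validation, every test index here is strictly later than every training index in the same split.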

From Forecast to Decisions in Graphical Models: A Natural Gradient Optimization Approach

Access to the SSRN paper

Authors: Eric Benhamou, David Saltiel, Beatrice Guez, Jamal Atif and Rida Laraki

Abstract: Graphical models and in particular Hidden Markov Models or their continuous space equivalent, the so-called Kalman filter model, are a powerful tool to make some inference that can be used in decision making contexts. The estimation of their parameters is usually based on the Expectation Maximization approach as this is a natural statistical way to train them. When used for decision making, it may be more relevant to find parameters that are relevant to our decisions rather than just try to fit the model from a statistical point of view. Hence, we can reformulate the determination of graphical model as an inference problem where the true concern is the quality of the decisions from the forecast given by the model. We show that the resulting optimization problem can be reformulated as an information geometric optimization problem and introduce a natural gradient descent strategy that incorporates additional meta parameters. Read more

Submitted: 26 March, 2021

Combining Model-Based and Model-Free RL for Financial Markets

Collaboration with  LOMBARD ODIER  team

Access to the video and to the SSRN paper

Authors: Eric Benhamou, David Saltiel, Serge Tabachnik, Sui Kai Wong and François Chareyron

Abstract: Model-free Reinforcement Learning has achieved great results in stable environments but has not been able so far to generalize well in regime changing environments like financial markets. In contrast, model-based RL is able to capture some fundamental and dynamical concepts of the environment but suffers from cognitive bias. In this work, we propose to combine the best of the two approaches by selecting various model-based approaches thanks to model-free Deep Reinforcement Learning. Using not only past performance and volatility, we include additional contextual information, like macro and risk appetite signals, to account for implicit regime changes. We also adapt traditional RL methods to take into account that in real life training always takes place in the past. Hence we cannot use future information in our training data set as implied by K-fold cross validation. Building on traditional statistical methods, we introduce "walk-forward analysis", which is defined by successive training and testing based on expanding periods, to assert the robustness of the resulting agent. Last but not least, we present the concept of statistical difference significance based on a two-tailed T-test, to highlight the ways in which our models differ from more traditional ones. Our experimental results show that our approach outperforms traditional financial baseline portfolio models like Markowitz in almost all evaluation metrics commonly used in financial mathematics, namely net performance, Sharpe ratio, Sortino ratio, maximum drawdown, maximum drawdown over volatility. Read more

Submitted: March 25, 2021

Explainable AI Models of Stock Crashes: A Machine-Learning Explanation of the Covid March 2020 Equity Meltdown

Collaboration with  HOMA CAPITAL  team

Access to the SSRN paper

Authors: Jean Jacques Ohana, Steve Ohana, Eric Benhamou, David Saltiel and Beatrice Guez

Abstract: We consider a gradient boosting decision trees (GBDT) approach to predict large S&P 500 price drops from a set of 150 technical, fundamental and macroeconomic features. We report an improved accuracy of GBDT over other machine learning (ML) methods on the S&P 500 futures prices. We show that retaining fewer and carefully selected features provides improvements across all ML approaches. Shapley values have recently been introduced from game theory to the field of ML. They allow for a robust identification of the most important variables predicting stock market crises, and of a local explanation of the crisis probability at each date, through a consistent features attribution. We apply this methodology to analyze in detail the March 2020 financial meltdown, for which the model offered a timely out of sample prediction. This analysis unveils in particular the contrarian predictive role of the tech equity sector before and after the crash. Read more

Submitted: March 21, 2021

Knowledge discovery with Deep RL for selecting financial hedges

Collaboration with  SOCIETE GENERALE  team

Access to the video and to the paper

Authors: Eric Benhamou, David Saltiel, Sandrine Ungari, Abhishek Mukhopadhyay, Jamal Atif and Rida Laraki 

Abstract: Can an asset manager gain knowledge from different data sources to select the right hedging strategy for his portfolio? We use Deep Reinforcement Learning (Deep RL or DRL) to extract information from not only past performances of the hedging strategies but also additional contextual information like risk aversion, correlation data, credit information and estimated earnings per share. Our contributions are threefold: (i) the use of contextual information also referred to as augmented state in DRL, (ii) the impact of a one period lag between observations and actions that is more realistic for an asset management environment, (iii) the implementation of a new repetitive train test method called walk forward analysis, similar in spirit to cross validation for time series. Although our experiment is on trading bots, it can easily be translated to other bot environments that operate in a sequential environment with regime changes and noisy data. Our experiment for an augmented asset manager interested in finding the best portfolio for hedging strategies achieves superior returns and lower risk. Read more

Submitted: 9 February, 2021

Time your hedge with Deep Reinforcement Learning

Collaboration with  SOCIETE GENERALE  team

Access to the SSRN paper

Authors: Eric Benhamou, David Saltiel, Sandrine Ungari and Abhishek Mukhopadhyay

Abstract: Can an asset manager plan the optimal timing for her/his hedging strategies given market conditions? The standard approach based on Markowitz or other more or less sophisticated financial rules aims to find the best portfolio allocation thanks to forecasted expected returns and risk but fails to fully relate market conditions to hedging strategies decision. In contrast, Deep Reinforcement Learning (DRL) can tackle this challenge by creating a dynamic dependency between market information and hedging strategies allocation decisions. In this paper, we present a realistic and augmented DRL framework that: (i) uses additional contextual information to decide an action, (ii) has a one period lag between observations and actions to account for one day lag turnover of common asset managers to rebalance their hedge, (iii) is fully tested in terms of stability and robustness thanks to a repetitive train test method called anchored walk forward training, similar in spirit to k fold cross validation for time series and (iv) allows managing leverage of our hedging strategy. Our experiment for an augmented asset manager interested in sizing and timing his hedges shows that our approach achieves superior returns and lower risk. Read more

Submitted: 9 November, 2020

Detecting and adapting to crisis pattern with context based Deep Reinforcement Learning

Collaboration with  HOMA CAPITAL  team

Access to the  SSRN paper

Authors: Eric Benhamou, David Saltiel, Jean Jacques Ohana  and Jamal Atif

Abstract: Deep reinforcement learning (DRL) has reached superhuman levels in complex tasks like game solving (Go, StarCraft II), and autonomous driving. However, it remains an open question whether DRL can reach human level in applications to financial problems and in particular in detecting crisis patterns and consequently dis-investing. In this paper, we present an innovative DRL framework consisting of two subnetworks fed respectively with portfolio strategies' past performances and standard deviations as well as additional contextual features. The second subnetwork plays an important role as it captures dependencies with common financial indicator features like risk aversion, economic surprise index and correlations between assets, which allows taking into account context based information. We compare different network architectures, either using layers of convolutions to reduce the network's complexity or LSTM blocks to capture time dependency, and examine whether previous allocations are important in the modeling. We also use adversarial training to make the final model more robust. Results on the test set show this approach substantially outperforms traditional portfolio optimization methods like Markowitz and is able to detect and anticipate crises like the current Covid one. Read more

Submitted: 9 November, 2020

Bridging the gap between Markowitz planning and deep reinforcement learning

Collaboration with  SOCIETE GENERALE  team

Access to the SSRN paper

Authors: Eric Benhamou, David Saltiel, Sandrine Ungari and Abhishek Mukhopadhyay

Abstract: While researchers in the asset management industry have mostly focused on techniques based on financial and risk planning techniques like Markowitz efficient frontier, minimum variance, maximum diversification or equal risk parity, in parallel, another community in machine learning has started working on reinforcement learning and more particularly deep reinforcement learning to solve other decision making problems for challenging tasks like autonomous driving, robot learning, and on a more conceptual side games solving like Go. This paper aims to bridge the gap between these two approaches by showing that Deep Reinforcement Learning (DRL) techniques can shed new light on portfolio allocation thanks to a more general optimization setting that casts portfolio allocation as an optimal control problem that is not just a one-step optimization, but rather a continuous control optimization with a delayed reward. The advantages are numerous: (i) DRL maps market conditions directly to actions by design and hence should adapt to a changing environment, (ii) DRL does not rely on any traditional financial risk assumptions, such as risk being represented by variance, (iii) DRL can incorporate additional data and be a multi-input method as opposed to more traditional optimization methods. We present some encouraging experimental results using convolution networks. Read more

Submitted: 30 September, 2020

Trade Selection with Supervised Learning and Optimal Coordinate Ascent (OCA)

Access to the SSRN paper

Authors: David Saltiel, Eric Benhamou, Rida Laraki  and Jamal Atif

Abstract: Can we dynamically extract some information and strong relationship between some financial features in order to select some financial trades over time? Despite the advent of representation learning and end-to-end approaches, mainly through deep learning, feature selection remains a key point in many machine learning scenarios. This paper introduces a new theoretically motivated method for feature selection. The approach, which fits within the family of embedded methods, casts the feature selection conundrum as a coordinate ascent optimization with variables dependencies materialized by block variables. Thanks to a limited number of iterations, it proves efficient for gradient boosting methods, implemented with XGBoost. In case of convex and smooth functions, we are able to prove that the convergence rate is polynomial in terms of the dimension of the full features set. We provide comparisons with state of the art methods, Recursive Feature Elimination and Binary Coordinate Ascent, and show that this method is competitive when selecting some financial trades. Read more

Submitted: September, 2020

AAMDRL: Augmented Asset Management with Deep Reinforcement Learning

Access to the SSRN paper

Authors: Eric Benhamou, David Saltiel, Sandrine Ungari, Abhishek Mukhopadhyay and Jamal Atif

Abstract:  Can an agent learn efficiently in a noisy and self adapting environment with sequential, non-stationary and non-homogeneous observations? Through trading bots, we illustrate how Deep Reinforcement Learning (DRL) can tackle this challenge. Our contributions are threefold: (i) the use of contextual information also referred to as augmented state in DRL, (ii) the impact of a one period lag between observations and actions that is more realistic for an asset management environment, (iii) the implementation of a new repetitive train test method called walk forward analysis, similar in spirit to cross validation for time series. Although our experiment is on trading bots, it can easily be translated to other bot environments that operate in sequential environment with regime changes and noisy data. Our experiment for an augmented asset manager interested in finding the best portfolio for hedging strategies shows that AAMDRL achieves superior returns and lower risk. Read more

Submitted: 29 September, 2020

Deep Reinforcement Learning for Portfolio Selection

Collaboration with  HOMA CAPITAL  team

Access to the SSRN paper

Authors: Eric Benhamou, David Saltiel, Jean Jacques Ohana, Jamal Atif  and Rida Laraki

Abstract: Deep reinforcement learning (DRL) has reached an unprecedented level on complex tasks like game solving (Go, StarCraft II), and autonomous driving. However, applications to real financial assets are still largely unexplored and it remains an open question whether DRL can reach superhuman level. In this demo, we showcase state-of-the-art DRL methods for selecting portfolios according to financial environment, with a final network concatenating three individual networks using layers of convolutions to reduce the network's complexity. The multiple inputs of our network enable capturing dependencies from common financial indicator features like risk aversion, Citigroup surprise index, portfolio specific features and previous portfolio allocations. Results on the test set show this approach can outperform traditional portfolio optimization methods, with results available at our demo website. Read more

Submitted: September, 2020

Three remarkable properties of the Normal distribution

Access to the SSRN paper

Authors: Eric Benhamou, Beatrice Guez and Nicolas Paris

Abstract: In this paper, we present three remarkable properties of the normal distribution: first, that if the sum of two independent random variables is normally distributed, then each random variable follows a normal distribution (which is referred to as the Levy Cramer theorem); second, a variation of the Levy Cramer theorem that states that if two independent symmetric random variables with finite variance have their sum and their difference independent, then each random variable follows a standard normal distribution; and third, that the normal distribution is characterized by the fact that it is the only distribution for which the sample mean and variance are independent (which is a central property for deriving the Student distribution and is referred to as the Geary theorem). The novelty of this paper is to provide new, quicker or self-contained proofs of these theorems. Read more

Submitted: 11 July, 2020

Omega and Sharpe ratio

Access to the SSRN paper

Authors: Eric Benhamou, Beatrice Guez and Nicolas Paris

Abstract: Omega ratio, defined as the probability-weighted ratio of gains over losses at a given level of expected return, has been advocated as a better performance indicator compared to Sharpe and Sortino ratios as it depends on the full return distribution and hence encapsulates all information about risk and return. We compute Omega ratio for the normal distribution and show that under some distribution symmetry assumptions, the Omega ratio is oversold as it does not provide any additional information compared to Sharpe ratio. Indeed, for returns that have elliptic distributions, we prove that the optimal portfolio according to Omega ratio is the same as the optimal portfolio according to Sharpe ratio. As elliptic distributions are a weak form of symmetric distributions that generalize Gaussian distributions and encompass many fat tail distributions, this tremendously reduces the potential interest of the Omega ratio. Read more

Submitted: 15 October, 2019
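
The abstract above defines the Omega ratio as a probability-weighted ratio of gains over losses at a given return level. A minimal empirical sketch under the usual definition, the mean gain above a threshold divided by the mean loss below it, is shown here; the function name, the threshold default and the sample returns are hypothetical, and this is not the paper's closed-form computation for the normal distribution.

```python
import numpy as np

def omega_ratio(returns, threshold=0.0):
    """Empirical Omega ratio at a given threshold:
    E[max(R - L, 0)] / E[max(L - R, 0)], estimated by sample means.
    """
    r = np.asarray(returns, dtype=float)
    mean_gain = np.maximum(r - threshold, 0.0).mean()
    mean_loss = np.maximum(threshold - r, 0.0).mean()
    return mean_gain / mean_loss

# Hypothetical sample of periodic returns.
sample = [0.02, -0.01, 0.03, -0.02]
print(omega_ratio(sample))
```

A value above 1 means that, relative to the threshold, average gains outweigh average losses in the sample.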

Variance Reduction in Actor Critic Methods (ACM)

Access to the SSRN paper

Authors: Eric Benhamou

Abstract: After presenting Actor Critic Methods (ACM), we show that ACM are control variate estimators. Using the projection theorem, we prove that the Q and Advantage Actor Critic (A2C) methods are optimal in the sense of the L2 norm for the control variate estimators spanned by functions conditioned by the current state and action. This straightforward application of the Pythagorean theorem provides a theoretical justification of the strong performance of QAC and AAC, most often referred to as A2C methods, in deep policy gradient methods. This enables us to derive a new formulation for Advantage Actor Critic methods that has lower variance and improves the traditional A2C method. Read more

Submitted: 23 July, 2019

Testing Sharpe ratio: luck or skill?

Access to the SSRN paper

Authors: Eric Benhamou, David Saltiel, Beatrice Guez and Nicolas Paris

Abstract: Sharpe ratio (sometimes also referred to as information ratio) is widely used in asset management to compare and benchmark funds and asset managers. It computes the ratio of the (excess) net return over the strategy standard deviation. However, the elements to compute the Sharpe ratio, namely, the expected returns and the volatilities, are unknown numbers and need to be estimated statistically. This means that the Sharpe ratio used by funds is likely to be error prone because of statistical estimation errors. In this paper, we provide various tests to measure the quality of the Sharpe ratios. By quality, we are aiming at measuring whether a manager was indeed lucky or skillful. The test assesses this through the statistical significance of the Sharpe ratio. We not only look at the traditional Sharpe ratio but also compute a modified Sharpe insensitive to used capital. We provide various statistical tests that can be used to precisely quantify whether the Sharpe is statistically significant. We illustrate in particular the number of trades for a given Sharpe level that provides statistical significance as well as the impact of auto-correlation by providing reference tables that give the minimum required Sharpe ratio for a given time period and correlation. We also provide for a Sharpe ratio of 0.5, 1.0, 1.5 and 2.0 the skill percentage given the auto-correlation level. Read more

Submitted: 21 May, 2019

Connecting Sharpe ratio and Student t-statistic, and beyond

Access to the SSRN paper

Authors: Eric Benhamou

Abstract: Sharpe ratio is widely used in asset management to compare and benchmark funds and asset managers. It computes the ratio of the excess return over the strategy standard deviation. However, the elements to compute the Sharpe ratio, namely, the expected returns and the volatilities, are unknown numbers and need to be estimated statistically. This means that the Sharpe ratio used by funds is error prone because of statistical estimation error. Lo (2002) and Mertens (2002) derive explicit expressions for the statistical distribution of the Sharpe ratio using standard asymptotic theory under several sets of assumptions (independent normally distributed, and identically distributed, returns). In this paper, we provide the exact distribution of the Sharpe ratio for independent normally distributed returns. In this case, the Sharpe ratio statistic is, up to a rescaling factor, a non-central Student distribution whose characteristics have been widely studied by statisticians. The asymptotic behavior of our distribution provides the result of Lo (2002). We also illustrate the fact that the empirical Sharpe ratio is asymptotically optimal in the sense that it achieves the Cramer Rao bound. We then study the empirical SR under AR(1) assumptions and investigate the effect of the compounding period on the Sharpe (computing the annual Sharpe with monthly data for instance). We finally provide a general formula in the case of heteroscedasticity and autocorrelation. Read more

Submitted: 14 May, 2019
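
The abstract above links the empirical Sharpe ratio to a Student statistic up to a rescaling factor. In the simplest setting, assuming i.i.d. returns and a per-period (non-annualized) Sharpe ratio computed with the sample standard deviation, the one-sample t-statistic for a zero mean equals the square root of the sample size times the Sharpe ratio. The sketch below checks that identity numerically; the function names and the sample data are hypothetical.

```python
import numpy as np

def sharpe_ratio(excess_returns):
    """Per-period empirical Sharpe ratio (sample std, ddof=1)."""
    r = np.asarray(excess_returns, dtype=float)
    return r.mean() / r.std(ddof=1)

def t_statistic(excess_returns):
    """One-sample t-statistic for the null hypothesis of zero mean."""
    r = np.asarray(excess_returns, dtype=float)
    standard_error = r.std(ddof=1) / np.sqrt(r.size)
    return r.mean() / standard_error

# Hypothetical excess returns.
sample = np.array([0.01, 0.02, -0.01, 0.03, 0.00, 0.015])
print(t_statistic(sample), np.sqrt(sample.size) * sharpe_ratio(sample))
```

Under the null of zero mean and i.i.d. normal returns, this statistic follows a Student distribution with n - 1 degrees of freedom, which is what makes significance tests on the Sharpe ratio possible.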

NGO-GM: Natural Gradient Optimization for Graphical Models

Access to the SSRN paper

Authors: Eric Benhamou, Jamal Atif, Rida Laraki and David Saltiel

Abstract: This paper deals with estimating model parameters in graphical models. We reformulate it as an information geometric optimization problem and introduce a natural gradient descent strategy that incorporates additional meta parameters. We show that our approach is a strong alternative to the celebrated EM approach for learning in graphical models. Actually, our natural gradient based strategy leads to learning optimal parameters for the final objective function without artificially trying to fit a distribution that may not correspond to the real one. We support our theoretical findings with the question of trend detection in financial markets and show that the learned model performs better than traditional practitioner methods and is less prone to overfitting.  Read more

Submitted: 14 May, 2019

Similarities between policy gradient methods in reinforcement and supervised learning

Access to the SSRN paper

Authors: Eric Benhamou and David Saltiel

Abstract: Reinforcement learning (RL) is about sequential decision making and is traditionally opposed to supervised learning (SL) and unsupervised learning (USL). In RL, given the current state, the agent makes a decision that may influence the next state as opposed to SL where the next state remains the same, regardless of decisions taken. Although this difference is fundamental, SL and RL are not so different. In particular, we emphasize in this paper that gradient policy methods can be cast as a SL problem where true labels are replaced with discounted rewards. We provide a simple experiment where we interchange label and pseudo rewards to show that SL techniques can be directly translated into RL methods. Read more

Submitted: 2 May, 2019

BCMA-ES II: revisiting Bayesian CMA-ES

Access to the SSRN paper

Authors: Eric Benhamou, David Saltiel, Beatrice Guez and Nicolas Paris

Abstract: This paper revisits the Bayesian CMA-ES and provides updates for normal Wishart. It emphasizes the difference between a normal Wishart and a normal inverse Wishart prior. After some computation, we prove that the only difference lies, surprisingly, in the expected covariance. We prove that the expected covariance should be lower in the normal Wishart prior model because of the convexity of the inverse. We present a mixture model that generalizes both the normal Wishart and the normal inverse Wishart model. We finally present various numerical experiments to compare both methods as well as the generalized method. Read more

Submitted: 9 April, 2019

BCMA-ES: A Bayesian approach to CMA-ES

Access to the SSRN paper

Authors: Eric Benhamou, David Saltiel, Sebastien Verel and Fabien Teytaud

Abstract: This paper introduces a novel theoretically sound approach for the celebrated CMA-ES algorithm. Assuming the parameters of the multivariate normal distribution for the minimum follow a conjugate prior distribution, we derive their optimal update at each iteration step. Not only does this Bayesian framework provide a justification for the update of the CMA-ES algorithm, but it also gives two new versions of CMA-ES assuming either normal-Wishart or normal-Inverse Wishart priors, depending on whether we parametrize the likelihood by its covariance or precision matrix. We support our theoretical findings by numerical experiments that show fast convergence of these modified versions of CMA-ES. Read more

Submitted: 2 April, 2019

A discrete version of CMA-ES

Access to the SSRN paper

Authors: Eric Benhamou, Jamal Atif and Rida Laraki

Abstract: Modern machine learning uses more and more advanced optimization techniques to find optimal hyperparameters. Whenever the objective function is non-convex, non-continuous and has potentially multiple local minima, standard gradient descent optimization methods fail. A last resort, and a very different method, is to assume that the optimum (or optima, as it is not necessarily unique) is distributed according to a distribution, and to adapt this distribution iteratively according to tested points. These strategies, named Evolution Strategies (ES), originated in the early 1960s and have culminated in CMA-ES (Covariance Matrix Adaptation Evolution Strategy). CMA-ES relies on a multivariate normal distribution and is considered state of the art for general optimization problems. However, it is far from being optimal for discrete variables. In this paper, we extend the method to multivariate binomial correlated distributions. We show that such a distribution shares similar features with the multivariate normal: independence and zero correlation are equivalent, and correlation is efficiently modeled by interaction between different variables. We discuss this distribution in the framework of the exponential family. We prove that the model can not only estimate pairwise interactions between two variables but is also capable of modeling higher-order interactions. This allows creating a version of CMA-ES that can efficiently accommodate discrete variables. We provide the corresponding algorithm and conclude. Read more
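
The idea of iteratively adapting a sampling distribution to the tested points can be sketched on binary variables as follows. This is a deliberately simplified stand-in (a cross-entropy-style update with independent Bernoulli variables), not the paper's algorithm: it ignores the correlations that the multivariate binomial model is designed to capture, and all names and default values are assumptions.

```python
import numpy as np

def bernoulli_es(fitness, dim, pop_size=50, elite_frac=0.2, lr=0.3, iters=50, seed=0):
    """Maximize fitness over {0,1}^dim by adapting independent Bernoulli parameters."""
    rng = np.random.default_rng(seed)
    p = np.full(dim, 0.5)                        # initial sampling distribution
    n_elite = max(1, int(pop_size * elite_frac))
    for _ in range(iters):
        samples = (rng.random((pop_size, dim)) < p).astype(int)
        scores = np.array([fitness(s) for s in samples])
        elite = samples[np.argsort(scores)[-n_elite:]]   # best candidates of this generation
        p = (1 - lr) * p + lr * elite.mean(axis=0)       # move the distribution toward the elite
        p = np.clip(p, 0.05, 0.95)                       # keep some exploration
    return p

# Hypothetical test problem: maximize the number of ones in a 10-bit string.
p = bernoulli_es(lambda bits: bits.sum(), dim=10)
print(p)
```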

Submitted: 11 February, 2019

Operator norm upper bound for sub-Gaussian tailed random matrices

Access to the SSRN paper

Authors: Eric Benhamou, Jamal Atif and Rida Laraki

Abstract: This paper investigates an upper bound of the operator norm for sub-Gaussian tailed random matrices. A lot of attention has been paid to uniformly bounded sub-Gaussian tailed random matrices with independent coefficients. However, little has been done for sub-Gaussian tailed random matrices whose coefficient variances are not equal or whose coefficients are not independent. This is precisely the subject of this paper. After proving that random matrices with uniform sub-Gaussian tailed independent coefficients satisfy the Tracy-Widom bound, that is, their matrix operator norm remains bounded by O(√n) with overwhelming probability, we prove that a less stringent condition is that the matrix rows are independent and uniformly sub-Gaussian. In particular, this does not require all matrix coefficients to be independent, but only the rows, which is a weaker condition. Read more
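
A quick numerical check of the O(√n) scaling stated in the abstract above, for two simple ensembles with independent sub-Gaussian entries (standard Gaussian and Rademacher). This is only an empirical sketch under those assumptions; it does not cover the dependent-coefficient case treated in the paper, and the helper names are hypothetical.

```python
import numpy as np

def operator_norm_ratio(n, sampler, seed=0):
    """Largest singular value of an n x n random matrix, divided by sqrt(n)."""
    rng = np.random.default_rng(seed)
    A = sampler(rng, (n, n))
    return np.linalg.norm(A, ord=2) / np.sqrt(n)   # ord=2 gives the spectral norm

def gaussian(rng, shape):
    return rng.standard_normal(shape)

def rademacher(rng, shape):
    return rng.choice([-1.0, 1.0], size=shape)

# If the norm grows like sqrt(n), the ratio should stay bounded as n increases.
for n in (50, 100, 200):
    print(n, operator_norm_ratio(n, gaussian), operator_norm_ratio(n, rademacher))
```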

Submitted: 19 January, 2019

Kalman filter demystified: from intuition to probabilistic graphical model to real case in financial markets

Access to the SSRN paper

Authors: Eric Benhamou

Abstract: In this paper, we revisit the Kalman filter theory. After giving the intuition on a simplified financial markets example, we revisit the maths underlying it. We then show that the Kalman filter can be presented in a very different fashion using graphical models. This enables us to establish the connection between the Kalman filter and Hidden Markov Models (HMM). We then look at their application in financial markets and provide various intuitions in terms of their applicability to complex systems such as financial markets. Although this paper has been written as a self-contained work connecting the Kalman filter to Hidden Markov Models, and hence revisits well-known and established results, it contains new results and brings additional contributions to the field. First, leveraging the link between the Kalman filter and HMM, it gives new inference algorithms for extended Kalman filters. Second, it presents an alternative to the traditional estimation of parameters with the EM algorithm, using CMA-ES optimization instead. Third, it examines the application of the Kalman filter and its Hidden Markov Model version to financial markets, providing various dynamics assumptions and tests. We conclude by connecting the Kalman filter approach to trend-following technical analysis systems and showing its superior performance for trend detection. Read more
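
To make the predict/update intuition concrete, here is a minimal one-dimensional Kalman filter for a local-level model (a random-walk state observed with noise). It is a standard textbook sketch, not the paper's financial model; the function name, parameter names and the toy price series are assumptions.

```python
import numpy as np

def kalman_local_level(observations, process_var, obs_var, x0=0.0, p0=1.0):
    """Filter a scalar random-walk state x_t observed as z_t = x_t + noise."""
    x, p = x0, p0                      # state estimate and its variance
    estimates = []
    for z in observations:
        # Predict: the state follows a random walk, so only the uncertainty grows.
        p = p + process_var
        # Update: blend prediction and observation using the Kalman gain.
        k = p / (p + obs_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return np.array(estimates)

# Hypothetical usage: smooth a noisy series around a slowly drifting level.
rng = np.random.default_rng(0)
level = 100.0 + np.cumsum(rng.normal(0.0, 0.1, size=250))
prices = level + rng.normal(0.0, 1.0, size=250)
filtered = kalman_local_level(prices, process_var=0.01, obs_var=1.0, x0=prices[0])
print(filtered[-5:])
```

A small ratio of process variance to observation variance gives a slow, smooth estimate; a large ratio makes the filter track each observation closely.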

Submitted: 13 December, 2018

Feature selection with optimal coordinate ascent (OCA)

Access to the SSRN paper

Authors: Eric Benhamou and David Saltiel

Abstract: In machine learning, Feature Selection (FS) is a major part of an efficient algorithm. It fuels the algorithm and is the starting block for our prediction. In this paper, we present a new method, called Optimal Coordinate Ascent (OCA), that allows us to select features among block and individual features. OCA relies on coordinate ascent to find an optimal solution for the gradient boosting method's score (number of correctly classified samples). OCA takes into account the notion of dependencies between variables forming blocks in our optimization. The coordinate ascent optimization addresses the NP-hard original problem, where the number of combinations rapidly explodes, making a grid search infeasible. It considerably reduces the number of iterations, changing this NP-hard problem into a polynomial search. OCA brings substantial differences and improvements compared to the previous coordinate ascent feature selection method: we group variables into blocks and individual variables instead of a binary selection. Our initial guess is based on the k-best group variables, making our initial point more robust. We also introduce new stopping criteria, making our optimization faster. We compare these two methods on our data set and find that our method outperforms the initial one. We also compare our method to the Recursive Feature Elimination (RFE) method and find that OCA leads to the minimum feature set with the highest score. This is a nice byproduct of our method, as it empirically provides the most compact data set with optimal performance. Read more
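
The core idea of coordinate ascent over a feature mask can be sketched as below. This is a simplified illustration, not the OCA method itself: it flips individual features only (no blocks, no k-best initialization, no custom stopping criteria), and the score callable and toy scoring rule are hypothetical stand-ins for a cross-validated gradient boosting score.

```python
def coordinate_ascent_selection(score, n_features, init_mask=None, max_sweeps=10):
    """Greedy coordinate ascent over a boolean feature mask, maximizing score(mask)."""
    mask = list(init_mask) if init_mask is not None else [True] * n_features
    best = score(mask)
    for _ in range(max_sweeps):
        improved = False
        for j in range(n_features):
            mask[j] = not mask[j]            # flip one coordinate
            candidate = score(mask)
            if candidate > best:
                best = candidate             # keep the flip
                improved = True
            else:
                mask[j] = not mask[j]        # revert the flip
        if not improved:                     # stop when a full sweep brings no gain
            break
    return mask, best

# Hypothetical score: features 0 and 2 are useful, every selected feature costs 0.1.
USEFUL = (0, 2)

def toy_score(mask):
    return sum(1 for i in USEFUL if mask[i]) - 0.1 * sum(mask)

mask, best = coordinate_ascent_selection(toy_score, n_features=4)
print(mask, best)
```

Each sweep costs one score evaluation per feature, compared with the 2^n evaluations of an exhaustive search over all masks.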

Submitted: 3 December, 2018

Gram Charlier and Edgeworth expansion for sample variance

Access to the SSRN paper

Authors: Eric Benhamou

Abstract: In this paper, we derive a valid Edgeworth expansion for the Bessel-corrected empirical variance when data are generated by a strongly mixing process whose distribution can be arbitrary. The constraint of a strongly mixing process makes the problem difficult. Indeed, even for a strongly mixing normal process, the distribution is unknown. Here, we make no assumption other than a sufficiently fast decrease of the underlying distribution to make the Edgeworth expansion convergent. These results obviously apply to strongly mixing normal processes and provide an alternative to the work of Moschopoulos (1985) and Mathai (1982). Read more

Submitted: 18 September, 2018

A few properties of sample variance

Access to the SSRN paper

Authors: Eric Benhamou

Abstract: A basic result is that the sample variance for i.i.d. observations is an unbiased estimator of the variance of the underlying distribution (see for instance Casella and Berger (2002)). But what happens if the observations are neither independent nor identically distributed? What can we say? Can we in particular compute explicitly the first two moments of the sample mean and hence generalize the formulae provided in Tukey (1957a) and Tukey (1957b) for the first two moments of the sample variance? We also know that the sample mean and variance are independent if they are computed on an i.i.d. normal distribution. This is one of the underlying assumptions used to derive the Student distribution (Student, alias W. S. Gosset (1908)). But does this result hold for any other underlying distribution? Can we still have independent sample mean and variance if the distribution is not normal? This paper precisely answers these questions and extends previous work of Cho, Cho, and Eltinge (2004). We are able to derive a general formula for the first two moments and variance of the sample variance under no specific assumption. We also provide a faster proof of a seminal result of Lukacs (1942) by using the log characteristic function of the unbiased sample variance estimator. Read more
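
For reference, the classical i.i.d. baseline that the abstract above sets out to generalize can be written as follows. These are the standard textbook results for the i.i.d. case, not the paper's general formulae.

```latex
% i.i.d. observations X_1, ..., X_n with variance \sigma^2 and fourth central moment \mu_4
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2, \qquad
\mathbb{E}\left[s^2\right] = \sigma^2, \qquad
\operatorname{Var}\left(s^2\right) = \frac{1}{n}\left(\mu_4 - \frac{n-3}{n-1}\,\sigma^4\right)
% Normal case: \mu_4 = 3\sigma^4, so \operatorname{Var}(s^2) = 2\sigma^4 / (n-1).
```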

Submitted: 11 September, 2018

T-statistic for Autoregressive process

Access to the SSRN paper

Authors: Eric Benhamou

Abstract: In this paper, we discuss the distribution of the t-statistic under the assumption of a normal autoregressive distribution for the underlying discrete time process. This result generalizes the classical result of the traditional t-distribution, where the underlying discrete time process follows an uncorrelated normal distribution. However, for AR(1), the underlying process is correlated. All traditional results break down and the resulting t-statistic follows a new distribution that converges asymptotically to a normal. We give an explicit formula for this new distribution, obtained as the ratio of two dependent distributions (a normal and the distribution of the norm of another independent normal distribution). We also provide a modified statistic that follows a non-central t-distribution. Its derivation comes from finding an orthogonal basis for the initial circulant Toeplitz covariance matrix. Our findings are consistent with the asymptotic distribution of the t-statistic derived for the asymptotic case of a large number of observations or zero correlation. The exact derivation of this distribution has applications in multiple fields and in particular provides a way to derive the exact distribution of the Sharpe ratio under normal AR(1) assumptions. Read more
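
The breakdown of the classical t-distribution under autocorrelation can be seen in a short simulation. The sketch below is an empirical illustration only and does not implement the paper's exact distribution; the function name and simulation sizes are assumptions. For a zero-mean stationary AR(1) process, the naive t-statistic has an asymptotic variance of (1 + ρ)/(1 − ρ) rather than 1.

```python
import numpy as np

def simulate_t_stats(rho, n_obs=200, n_paths=2000, seed=0):
    """Naive t-statistics sqrt(n) * mean / std on zero-mean stationary AR(1) paths."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal((n_paths, n_obs))
    x = np.empty_like(eps)
    # Draw the first point from the stationary distribution of the process.
    x[:, 0] = eps[:, 0] / np.sqrt(1.0 - rho**2)
    for t in range(1, n_obs):
        x[:, t] = rho * x[:, t - 1] + eps[:, t]
    mean = x.mean(axis=1)
    std = x.std(axis=1, ddof=1)          # Bessel-corrected standard deviation
    return np.sqrt(n_obs) * mean / std

# With rho = 0 the dispersion is close to 1; positive rho inflates it.
for rho in (0.0, 0.5):
    print(rho, simulate_t_stats(rho).std())
```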

Submitted: 11 September, 2018

Seven proofs of the Pearson Chi-squared independence test and its graphical interpretation

Access to the SSRN paper

Authors: Eric Benhamou and Valentin Melot

Abstract: This paper revisits the Pearson Chi-squared independence test. After presenting the underlying theory with modern notation and showing new ways of deriving the proof, we describe an innovative and intuitive graphical presentation of this test. This enables not only interpreting the test visually but also measuring how close or far we are from accepting or rejecting the null hypothesis of independence. Read more
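
As a reminder of what the test computes, the following minimal sketch evaluates the Pearson chi-squared statistic and its degrees of freedom for a contingency table. It covers only the standard statistic, not the paper's proofs or graphical presentation; the function name and the example tables are assumptions.

```python
import numpy as np

def chi2_independence(table):
    """Pearson chi-squared statistic and degrees of freedom for a contingency table."""
    observed = np.asarray(table, dtype=float)
    row = observed.sum(axis=1, keepdims=True)
    col = observed.sum(axis=0, keepdims=True)
    expected = row @ col / observed.sum()      # counts expected under independence
    stat = ((observed - expected) ** 2 / expected).sum()
    dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
    return stat, dof

# Rows proportional to each other: perfectly consistent with independence.
print(chi2_independence([[10, 20], [20, 40]]))
# Strong association between the two variables: large statistic.
print(chi2_independence([[10, 0], [0, 10]]))
```

The statistic is then compared with a chi-squared distribution with the returned number of degrees of freedom.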

Submitted: 3 September, 2018

Incremental Sharpe and other performance ratios

Access to the SSRN paper

Authors: Eric Benhamou and Beatrice Guez

Abstract: We present a new methodology for computing the incremental contribution to portfolio performance ratios such as the Sharpe, Treynor, Calmar or Sterling ratios. Using Euler's homogeneous function theorem, we are able to decompose these performance ratios as a linear combination of individual modified performance ratios. This allows understanding the drivers of these performance ratios as well as deriving a condition for a new asset to provide incremental performance for the portfolio. We provide various numerical examples of this performance ratio decomposition. Read more
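
The role of Euler's homogeneous function theorem can be illustrated on the Sharpe ratio. The sketch below uses the standard decomposition of portfolio volatility (homogeneous of degree 1 in the weights) and a simple rewriting of the portfolio Sharpe ratio as a weighted sum of stand-alone Sharpe ratios. The paper's "modified" ratios may be defined differently; the function name, the inputs (expected excess returns mu and covariance cov) and the numbers are assumptions.

```python
import numpy as np

def sharpe_decomposition(weights, mu, cov):
    """Split portfolio volatility and Sharpe ratio into per-asset contributions."""
    w = np.asarray(weights, dtype=float)
    mu = np.asarray(mu, dtype=float)       # expected excess returns (assumed given)
    cov = np.asarray(cov, dtype=float)
    port_vol = np.sqrt(w @ cov @ w)
    marginal_vol = cov @ w / port_vol      # partial derivative of volatility w.r.t. each weight
    risk_contrib = w * marginal_vol        # Euler: these sum to port_vol
    asset_vol = np.sqrt(np.diag(cov))
    asset_sharpe = mu / asset_vol          # stand-alone Sharpe ratios
    sharpe_contrib = (w * asset_vol / port_vol) * asset_sharpe   # these sum to the portfolio Sharpe
    return port_vol, risk_contrib, sharpe_contrib

# Hypothetical two-asset portfolio with uncorrelated assets.
w = [0.5, 0.5]
mu = [0.04, 0.02]
cov = [[0.04, 0.0], [0.0, 0.01]]
port_vol, risk_contrib, sharpe_contrib = sharpe_decomposition(w, mu, cov)
print(port_vol, risk_contrib.sum())
print(sharpe_contrib, sharpe_contrib.sum())
```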

Submitted: 26 August, 2018
