Abstract: During the last 50 years, a considerable effort has been devoted to connecting the expected utility approach to a utility function directly expressed in terms of moments of the probability distribution of an uncertain prospect. We follow the alternative route of providing, for the first time, the theoretical, autonomous foundation of an ordinal utility function of moments, representing rational choices under uncertainty, free of any independence axiom and compatible with many paradoxes of choice and behavior, well documented in the literature.
INTRODUCTION
Tobin (1958), in setting, 50 years ago, the microeconomic foundations of the Keynesian liquidity preference theory in the light of the Markowitz (1952, 1959) portfolio approach, showed an important equivalence between Von Neumann and Morgenstern (1944) expected utility (VNM) and a preference function of mean and standard deviation.
His extension of this result to all two-parameter distributions was subsequently proved incorrect by Samuelson (1967), Borch (1969) and Feldstein (1969) so that a mean-variance analysis has been justified for a long time only under the restrictive assumption of quadratic VNM utility or Gaussian distribution.
Two research areas have therefore attracted considerable interest: one focused on suitable distributional assumptions and one devoted to the most appealing form of the utility function.
Regarding the former, the admissible distributions have been identified as the elliptical class (Chamberlain, 1983; Owen and Rabinowitch, 1983), which is closed under linear transformations of random variables. Elliptical distributions are also known as scale-location parameter distributions (as in Tobin's conjecture) or linear distributions (Sinn, 1983; Meyer, 1987; Levy, 1989). They include the symmetric stable distributions analyzed by Fama (1971).
In addition, a large amount of work has been done to justify mean-variance (EV) analysis just as a second-order approximation of the expected utility model, avoiding the absurd assumption of quadratic utility (Hicks, 1962; Pratt, 1964; Arrow, 1965) and its property of increasing risk aversion and decreasing asset demand as wealth increases, making risky assets inferior goods.
Early studies concerning EV as an approximation of expected utility were carried out by Samuelson (1967, 1970), Tsiang (1972, 1974) and Rubinstein (1973), also including higher-order moments in a generalized approach.
Levy and Markowitz (1979), Kroll et al. (1984), Reid and Tew (1986) and Markowitz (1987) assess the effectiveness of the EV approximation, confirming Markowitz's intuition that mean-variance is, in practice, as efficient as expected utility in selecting optimal portfolios.
Even in more theoretical works, such as Hakansson (1972), Baron (1977) and Bigelow (1993), the aim was to investigate the consistency between mean-variance (or moment-) utility and the VNM axioms or utility function, providing restrictions to a simultaneous validity of both approaches.
However, any attempt to make expected utility and moment utility equivalent does not seem a truly compelling task.
Many authors (Borch, 1969, 1973; Levy, 1989) are, in fact, well aware of the different set-up of the two approaches, particularly when the arguments of moment utility are not confined to mean and variance but include all relevant moments of the probability distributions.
Moreover, during the last century, a number of works, from Knight (1921) to Keynes (1921), from Hicks (1935, 1962) to Marschak (1938), from Lange (1944) to Simpson (1950), expressed intriguing suggestions on variance, dispersion and higher-order moments as relevant parameters directly influencing the agents' decisions under uncertainty.
In this study, we develop the foundations of an ordinal utility of moments as a rational and autonomous criterion of choice under uncertainty, showing that it dissolves the best known behavioral paradoxes that still embarrass expected utility theory. This ordinal approach is strongly reminiscent of standard microeconomic theory and could be used to reset and generalize both asset demand theory and asset pricing models.
FOUNDATIONS
It is well known that the theory of choice under uncertainty assumes that preferences are defined over the set of probability distribution functions (Savage, 1954; DeGroot, 1970).
Confining ourselves, for ease of exposition, to the case of univariate distributions, we assume that the essential information concerning any distribution F is contained in the m-dimensional vector of moments M ≡ (μ, μ(2), μ(3),..., μ(m)), where μ is the mean and μ(s) is the s-order central moment in original units, so that (μ(s))^s is the usual central moment of order s≥2.
Note that, instead of central moments, noncentral moments could, equivalently, be used. Moreover, scale, location and dispersion parameters can be considered in the case of distributions (e.g., stable) for which moments do not exist.
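To fix ideas, the modified central moments "in original units" can be computed directly for a discrete distribution. The following is a minimal sketch (the helper name `modified_moments` is ours, not part of the text), using the signed s-th root so that μ(s) raised to the power s recovers the usual central moment:

```python
# Sketch: modified central moments mu(s) "in original units" for a discrete
# distribution, so that mu(s)**s equals the usual central moment of order s.
# The signed s-th root keeps odd-order moments (e.g., skewness) meaningful.
def modified_moments(outcomes, probs, m):
    mean = sum(x * p for x, p in zip(outcomes, probs))
    M = [mean]  # first entry: the mean itself
    for s in range(2, m + 1):
        central = sum(p * (x - mean) ** s for x, p in zip(outcomes, probs))
        root = abs(central) ** (1.0 / s)  # |central|**(1/s)
        M.append(root if central >= 0 else -root)
    return M

# A fair bet paying +1 or -1: mean 0, mu(2) = 1 (the standard deviation)
print(modified_moments([1, -1], [0.5, 0.5], 2))  # -> [0.0, 1.0]
```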
The existence of an ordinal utility of moments is obtained under the assumption of a preference order satisfying the axioms of (1) asymmetry, (2) transitivity and (3) continuity (Appendix).
Assumption of Preference Order
Let Q⊆Rm be a rectangular subset of Rm (the Cartesian product of m real intervals), whose elements are the m-dimensional vectors of moments, M∈Q. Let ≻ be a preference order, i.e., a binary relation defined by a subset of Q×Q.
We write Ma≻Mb instead of (Ma, Mb)∈≻.
Clearly, for any pair, either Ma≻Mb or not(Ma≻Mb).
We write Ma~Mb (equivalence) if and only if not(Ma≻Mb) and not(Mb≻Ma).
Theorem 1: Of Complete Preferences
The preference order is complete.
Theorem 2: Of Negatively Transitive Preferences
The preference order is negatively transitive.
Theorem 3: Of Equivalence Classes
The equivalence ~ is reflexive, symmetric and transitive.
Theorem 4: Of Ordinal Utility on Moments
Under Axioms I, II, III there is a real function H: Q→R which represents
the preferences ≻, i.e., such that, for every Ma, Mb∈Q:
(1) Ma≻Mb ⇔ H(Ma) > H(Mb)
The function H is unique up to any order-preserving (strictly increasing) transformation Ø:
(2) H* = Ø(H)
The function H is called an ordinal utility because it just represents the given preference order in terms of otherwise arbitrary real numbers (utils).
Theorem 5: Of Continuous Utility
Under Axioms I, II, III and the usual relative topology for Q (intersection
of Q with the set of all open rectangles in Rm, including arbitrary
unions and finite intersections; Debreu (1959), Rader
(1963)), the utility function H is continuous if and only if, for every
Ma∈Q, the sets {Mb∈Q: Ma≻Mb}
and {Mb∈Q: Mb≻Ma} belong to the
topology.
In the following, we assume that H is continuous with bounded first order partial derivatives. The usual concepts available in the theory of choice under uncertainty can be extended to our approach.
Let F, G be two probability distributions with relevant moment vectors:
MF ≡ (μF, μF(2), μF(3),...., μF(m) ) and MG≡(μG, μG(2), μG(3),...., μG(m)), respectively. Let H(M) be a differentiable utility function.
Definition of Non Satiation
The utility H is non satiated if, for every δ>0:
(3) H(μ+δ, μ(2),..., μ(m)) > H(μ, μ(2),..., μ(m))
In differential terms: ∂H/∂μ > 0.
Definition of Risk Aversion
A utility H is risk averse if, for every non-degenerate F:
(4) H(μF, μF(2),..., μF(m)) < H(μF, 0,..., 0)
Note that for m = 2, risk aversion means that H is decreasing in the standard deviation: ∂H/∂μ(2) < 0.
Definition of Certainty Equivalent
The certainty equivalent of F is defined as the amount CF such
that:
(5) H(CF, 0,..., 0) = H(MF)
Definition of Risk
Given two distributions F, G with equal mean, μF = μG,
we say that F is less risky than G if H(MF)> H(MG)
for every risk averse utility H.
Definition of Stochastic Dominance
Given two distribution functions, F, G defined over the same support, we
have mth-order stochastic dominance of F over G, F ≻m G, m≥1,
if F≠G and:
Fm(x) ≤ Gm(x) for every x
where:
F1(x) = F(x), Fk+1(x) = ∫_{−∞}^{x} Fk(t) dt, k≥1
Clearly, if F ≻m G then F ≻m+1 G.
We recall a well known result linking stochastic dominance and VNM expected utility functions.
Theorem 6 on Stochastic Dominance and Expected Utility
We have:
• F ≻1 G ⇔ EF(U(x))≥EG(U(x)) for every U with U′≥0
• F ≻2 G ⇔ EF(U(x))≥EG(U(x)) for every U with U′≥0, U″≤0
• F ≻3 G ⇔ EF(U(x))≥EG(U(x)) for every U with U′≥0, U″≤0, U‴≥0
Therefore, first order stochastic dominance means increasing VNM utilities; second order stochastic dominance means increasing and concave VNM utilities.
This is no longer true for the moment ordinal utility.
Theorem 7 on Stochastic Dominance and Moment Utility
If F ≻m G then MF ≡ (μF, μF(2),..., μF(m)) ≠ (μG, μG(2),..., μG(m)) ≡ MG and (−1)^(k−1) μF(k) > (−1)^(k−1) μG(k) for the smallest k for which μF(k)≠μG(k).
As special cases we have:
• If F ≻1 G then μF>μG
• If F ≻2 G then (μF>μG) or (μF = μG and μF(2)<μG(2))
• If F ≻3 G then (μF>μG) or (μF = μG and μF(2)<μG(2)) or (μF = μG and μF(2) = μG(2) and μF(3)>μG(3))
In the first case, in particular, it does not necessarily follow (even if it is plausible) that H(MF)>H(MG) whenever higher order moments are relevant. In particular, Allais (1953, 1979) assumes that F ≻1 G implies H(MF)>H(MG) (axiom of absolute preference). In the expected utility approach this is equivalent to assuming U′>0 (nonsatiation).
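The last remark can be illustrated numerically with an arbitrarily chosen risk-averse H (our example, not in the original): first-order dominance forces a higher mean, yet the dominating distribution may still receive a lower moment utility.

```python
# F pays 100 with prob 0.1 and 0 otherwise; G pays 0 surely.
# F first-order dominates G, hence mu_F > mu_G, yet a risk-averse
# H(mu, sigma) = mu - sigma can still rank G above F.
import math

def mean_sd(outcomes, probs):
    mu = sum(x * p for x, p in zip(outcomes, probs))
    var = sum(p * (x - mu) ** 2 for x, p in zip(outcomes, probs))
    return mu, math.sqrt(var)

mu_F, sd_F = mean_sd([100, 0], [0.1, 0.9])  # mu = 10, sigma = 30
mu_G, sd_G = mean_sd([0], [1.0])            # mu = 0,  sigma = 0

H = lambda mu, sd: mu - sd  # an arbitrary risk-averse moment utility
print(mu_F > mu_G)                    # True: dominance raises the mean
print(H(mu_F, sd_F) < H(mu_G, sd_G))  # True: but H ranks G above F
```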
INDIFFERENCE PRICING AND THE (DIS)SOLUTION OF PARADOXES
The Von Neumann-Morgenstern (1944) theory of choice under
uncertainty has long been the standard approach to model the maximizing behavior
of agents in financial markets. The basic result is the existence of a utility
function U(.) (the VNM utility) describing the optimal decisions of an investor
as those maximizing the expected utility of his or her future wealth, E(U(W)).
In fact, the independence axiom assumes that randomization is irrelevant: it says that if A≻B then pA+(1−p)C ≻ pB+(1−p)C (Machina, 1987; Fishburn, 1982, 1988 and the seminal work in Von Neumann and Morgenstern, 1944, especially chapter 1 and the Appendix, where, considering their penetrating results, they ask themselves, p. 28, "Have we not shown too much?").
The approach followed above replaces the VNM expected utility with the more fundamental and less demanding ordinal utility of the moments of future wealth, H(M).
Note also that this approach is different from (and much simpler than) other generalizations (non-expected utility approaches) suggested in the literature after Allais's (1953) contribution to the empirical critique of the VNM expected utility theory.
In fact, in most cases, including Allais (1979), Kahneman and Tversky (1979), Chew (1983), Fishburn (1983), Machina (1982) and many others, the utility function is a function of moments of a Bernoullian utility U, or a nonlinear function of both Bernoullian utilities and subjective probabilities.
In our case, we obtain a more general result based on more intuitive elements:
(6) H = H(μ, μ(2),..., μ(m))
Moreover, it is well known that the expected utility can be approximated by a particular function of m central moments:
(7) E(U(x)) ≈ U(μ) + Σ_{j=2}^{m} [U(j)(μ)/j!] (μ(j))^j
where U(j) is the j-th derivative of U, but this is only a special (polynomial) case of H and is not always able to account for the observed phenomena.
The utility of moments is perfectly compatible with the empirical observations in all well known behavioral paradoxes, described in terms of games and lotteries.
In order to show this compatibility we use the so called utility-indifference pricing (Henderson and Hobson, 2009), which can be applied in all cases of personal valuation of non-traded assets and incomplete markets.
For the sake of simplicity, let us consider the two-moment ordinal utility.
Let W be the current wealth of the decision maker and let G̃ be the random payoff of the game, so that future wealth is:
(8) W̃ = (W − PG)(1 + r) + G̃
where W−PG is wealth left after payment for the game, invested at the riskless rate r = 1/P0−1 (often set to zero) and the personal indifference price PG is defined as the price at which the agent is indifferent to paying the price and entering the game or paying nothing and avoiding the game:
(9) H(M(W̃)) = H(M(W(1+r)))
In the l.h.s., using a Taylor series approximation for small risks (Pratt, 1964), we have:
(10) H(M(W̃)) ≈ H(M(W(1+r))) + Hμ [MG − PG(1+r)] + Hσ ΣG
where MG and ΣG are the mean and standard deviation of G̃ and Hμ, Hσ are the partial derivatives of H, so that, imposing Eq. 9 and simplifying, we obtain:
(11) PG = P0 [MG + (Hσ/Hμ) ΣG] = P0 MG + Pσ ΣG
where the term in brackets, multiplied by P0, defines the (negative) personal price of the vol, Pσ ≡ P0(Hσ/Hμ), and it is independent of monotonic transformations of H. Therefore, the indifference price, PG, of the game is obtained as moment quantities, MG, ΣG, times subjective moment prices, P0, Pσ.
Moreover, Eq. 11 can be easily generalized to higher moments (skewness ΓG and kurtosis ΨG):
(12) PG = P0 MG + Pσ ΣG + Pζ ΓG + Pκ ΨG
where Pζ and Pκ are the personal prices of skewness and kurtosis.
and this version will be used in the following to show that the behavior of a moment-utility maximizer is perfectly compatible with all the proposed paradoxes, from St. Petersburg (1713) to Friedman and Savage (1948), Allais (1953, 1979) and Kahneman and Tversky (1979).
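Under the reading of the indifference price as moment quantities times subjective moment prices, a minimal pricing sketch can be written as follows (the function name and default prices are our assumptions, for illustration only):

```python
# Sketch of the moment indifference price
#   P(G) = P0*M + Ps*Sigma + Pz*Skew + Pk*Kurt
# following the "moment quantities times subjective moment prices" rule.
# The moment prices are the agent's personal prices, not market prices.
def moment_price(mu, sigma, skew=0.0, kurt=0.0,
                 P0=1.0, Ps=-0.2, Pz=0.0, Pk=0.0):
    return P0 * mu + Ps * sigma + Pz * skew + Pk * kurt

# A game paying its mean with no dispersion is priced at its (undiscounted,
# r = 0) mean:
print(moment_price(10.0, 0.0))  # -> 10.0
```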
Table 1: The St. Petersburg game

flip n:        1    2    3    ...  n        ...
probability:   1/2  1/4  1/8  ...  1/2^n    ...
prize:         1    2    4    ...  2^(n-1)  ...
expected gain: 1/2  1/2  1/2  ...  1/2      ...
The St. Petersburg Paradox
The name of the paradox is due to the solution, proposed by Daniel Bernoulli
in 1738 to a question posed by his cousin Nicolas Bernoulli, in a letter dated
September 9th, 1713. Note that at that time, the method to value an uncertain
prospect was established by Christiaan Huygens in 1657 as the mathematical expectation
of the gain (Hacking, 1975). Peter tosses a coin and continues
to do so until it should land heads when it comes to the ground. He agrees to
give Paul one ducat if he gets heads on the very first throw, two ducats if
he gets it on the second, four if on the third, eight if on the fourth and so
on, so that with each additional throw, the number of ducats he must pay is
doubled.
Paul is the player: if he obtains heads at the first flip he wins 1, at the second flip he wins 2,..., at the n-th flip he wins 2^(n−1) and so on. The question is to determine a fair price, P(G), to enter the game.
Clearly, the price must be at least 1 (the minimum gain), but the expected gain is infinite (Table 1).
However, as Nicolas Bernoulli observed in stating the paradox, nobody would pay an arbitrarily large amount to play the game: it has, he said, to be admitted that any fairly reasonable man would sell his chance, with great pleasure, for twenty ducats.
The famous Bernoulli solution, in log terms, provided a path-breaking device, introducing the concept of utility (moral expectation) and reducing expectation to a finite value:
E(U) = Σ_{n≥1} (1/2^n) log(2^(n−1)) = log 2
Alternatively, the price, P(G), of the game can be obtained, in our approach, by considering the sequential game as a one-shot game (a lottery) with an infinity of tickets, identified by natural numbers (1, 2,..., n,...) with decreasing probability of extraction (1/2, 1/4,..., 1/2^n,...) and increasing rewards (1, 2,..., 2^(n−1),...). This means that the original game is a portfolio of sub-games Gn, n≥1, or Arrow-Debreu securities (one for each row in the table above), the n-th of which implies a prize of 2^(n−1) if we get heads at the n-th flip and zero otherwise.
Clearly, each ticket could be sold separately at its price P(Gn) and the price of the lottery is the sum of the prices of all tickets.
We show that a two-moment utility approach is sufficient to solve the paradox.
Table 2: The first two moments of the St. Petersburg game

ticket n:            1    2       ...  n                 ...
mean:                1/2  1/2     ...  1/2               ...
standard deviation:  0.5  0.5·√3  ...  0.5(2^n − 1)^0.5  ...
For each ticket n, the expected value is always 1/2 and the standard deviation is 0.5(2^n − 1)^0.5.
Considering the first two moments (Table 2) the price of the n-th ticket is, from Eq. 12:
P(Gn) = max[0, P0·(1/2) + Pσ·0.5(2^n − 1)^0.5]
where the limited liability provision has been applied and the price of the game is simply the sum of the (non negative) prices of all tickets:
P(G) = Σ_{n≥1} P(Gn)
For example, if P0 = 1 and Pσ = −0.134 then P(Gn) = 0 for n>5 and P(G) = 1.507. Coin tosses beyond the fifth have no economic value. Note that in this game-situation, Pσ is not a proper market price but just the gambler's personal price of volatility. Analogously for higher-order moments; a refined price P(G) could also be obtained using higher moments: the skewness of the n-th ticket is [0.25(2^n − 1)(2^(n−1) − 1)]^(1/3); the kurtosis is [(2^n − 1)((2^n − 1)^3 + 1)/2^(n+4)]^(1/4).
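The figure P(G) = 1.507 can be reproduced numerically; the sketch below assumes the ticket-pricing rule just described, with the limited liability floor at zero:

```python
import math

# St. Petersburg tickets: ticket n pays 2**(n-1) with probability 2**(-n).
# Mean is always 1/2; standard deviation is 0.5*sqrt(2**n - 1).
# Limited liability: negative ticket prices are floored at zero.
P0, Ps = 1.0, -0.134

def ticket_price(n):
    mean, sd = 0.5, 0.5 * math.sqrt(2 ** n - 1)
    return max(0.0, P0 * mean + Ps * sd)

total = sum(ticket_price(n) for n in range(1, 100))
print(round(total, 3))         # -> 1.507
print(ticket_price(6) == 0.0)  # True: flips beyond the fifth are worthless
```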
The Friedman and Savage (1948) Paradox
In their classical article, Friedman and Savage (1948)
observed the difficulty of combining the belief in diminishing marginal utility
and the observation that the same individual buys insurance as well as lottery
tickets. Clearly, the first choice is evidence of risk aversion but the second
one can be rationalized only by a risk loving behavior, being well known that
lotteries are largely unfair games, accepted only by individuals having a strong
preference for risk.
The clever solution they proposed, in the framework of the Von Neumann-Morgenstern theory, just published a few years before, was based on a complex hypothesis concerning the shape of the utility function, made of three segments, concave-convex-concave, where current income is in the initial convex segment (ibid., paragraph IV).
In terms of moment utility the solution is much simpler and it clarifies that skewness is the key element behind the observed behavior.
A typical lottery represents a large chance of losing a small amount (the price of the lottery ticket) plus a small chance of winning a large amount (a prize).
Vice versa, the game against which you buy insurance, paying the premium π, contains a small chance of a much larger loss and a large chance of no loss. Let L and J be, respectively, the two random variables, so that L (lottery) is preferred to 0, just as −π is preferred to J: L ≻ 0 and −π ≻ J.
Table 3: The first three moments of the Friedman-Savage lottery (L) and insurance (J)
Let us calculate the first three moments of L and J (Table 3).
Mean and standard deviation are the same but skewness is reversed in sign, with L having a large, positive skewness and J a large, negative one. Assuming the following prices of the three moments: P0 = 1, Pσ = −0.34, Pζ = 0.1, we obtain the prices P(L) = 5.71>0 and P(J) = −40.70<0, so that the ticket is bought with pleasure (and considered cheap) and up to 40.7 money units could be willingly paid to avoid the insurable risk.
The Allais Paradox
The Allais (1953) paradox was the first factual
evidence against expected utility. In fact, asking people to choose between
games A and B, where A gives 1 million with certainty and B gives 1 million with 89% probability and 0 or 5 million with, respectively, 1% and 10% probabilities, people in large majority prefer A to B: A≻B.
Then, asking them to choose between A′ and B′ where, in the standard formulation, A′ gives 1 million with 11% probability (0 otherwise) and B′ gives 5 million with 10% probability (0 otherwise), the same people very often prefer B′ to A′: B′≻A′.
The paradox stems from the fact that, from A≻B, the expected utility approach deduces A′≻B′, which is at variance with the experimental evidence (Allais reports 53% of cases of violation of the logical implication).
In fact, A≻B means:
U(1) > 0.89 U(1) + 0.01 U(0) + 0.10 U(5)
but collecting U(1) and adding 0.89 U(0) to both members of the inequality you obtain, algebraically:
0.11 U(1) + 0.89 U(0) > 0.10 U(5) + 0.90 U(0)
i.e., A′≻B′, against the empirical evidence.
Table 4: The first four moments of the Allais games
According to Allais, either people in experimental situations do not use the rational thinking used in real world decision-making, or people do not follow the expected utility paradigm.
In fact, using our approach, the rationality of the actual choices may be easily recognized.
Considering each lottery as an asset, the first four moments are in Table 4.
Assuming the following prices of the four moments: P0 = 1, Pσ = −0.34, Pζ = 0.01, Pκ = −0.001, we obtain the prices of the lotteries: P(A) = 1 > P(B) = 0.994 and P(A′) = 0.007 < P(B′) = 0.008, in accordance with the Allais experiments.
This means that, using the Marschak triangle as in Machina (1987), the indifference curves in our approach are nonlinear in the probabilities and may display a fanning out effect from the sure event A, as implied by actual behavior.
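Assuming the standard Allais payoffs (in millions) and the moment prices above, the four lottery prices can be reproduced as follows; the `price` helper is our sketch of the moment-pricing rule with signed-root moments:

```python
# Check of the Allais prices with the standard payoffs (in millions):
# A: 1 surely; B: 1 (p=.89), 0 (p=.01), 5 (p=.10);
# A': 1 (p=.11), 0 (p=.89); B': 5 (p=.10), 0 (p=.90).
# Moment prices as in the text: P0=1, Ps=-0.34, Pz=0.01, Pk=-0.001.
def price(outcomes, probs, P0=1.0, Ps=-0.34, Pz=0.01, Pk=-0.001):
    mu = sum(x * p for x, p in zip(outcomes, probs))
    cm = lambda s: sum(p * (x - mu) ** s for x, p in zip(outcomes, probs))
    root = lambda v, s: (abs(v) ** (1.0 / s)) * (1 if v >= 0 else -1)
    return (P0 * mu + Ps * root(cm(2), 2)
            + Pz * root(cm(3), 3) + Pk * root(cm(4), 4))

A  = price([1.0], [1.0])
B  = price([1.0, 0.0, 5.0], [0.89, 0.01, 0.10])
A1 = price([1.0, 0.0], [0.11, 0.89])
B1 = price([5.0, 0.0], [0.10, 0.90])
print(round(A, 3), round(B, 3))    # -> 1.0 0.994
print(round(A1, 3), round(B1, 3))  # -> 0.007 0.008
```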
The Kahneman and Tversky (1979) Paradox
In a famous experiment, a systematic violation of the independence axiom
was documented: 80% of 95 respondents preferred A to B, where A gives 3000 with certainty and B gives 4000 with 80% probability (and 0 otherwise);
65% preferred B′ to A′, where A′ gives 3000 with 25% probability and B′ gives 4000 with 20% probability (and 0 otherwise);
and more than 50% of respondents violated the independence axiom, given that, if Q pays 0 for sure, then:
B″ = 0.75 Q + 0.25 B
and B″ is considered equal to B′ in terms of outcomes and probabilities.
Note that treating lotteries as assets implies that linear combinations such as 0.75Q+0.25A are meaningful and P(0.75Q+0.25A) = 0.25 P(A) ≠ P(A′).
The point is that, in terms of valuation, B″ and B′ are not the same asset and B″ is equivalent to a lottery paying the price P(B) with probability 0.25 and 0 otherwise.
Table 5: The first four moments of the Kahneman-Tversky games
Table 6: The first four moments of the Tversky-Kahneman games
Using the first four moments in Table 5 and assuming the following prices of the four moments: P0 = 1, Pσ = −0.2, Pζ = 0.1, Pκ = −0.001, we obtain the prices of the lotteries: P(A) = 3000 > P(B) = 2694.70 and P(A′) = 624.87 < P(B′) = 661.01, in accordance with the experimental results. Note also that P(B″) = 561.28 < P(A′) < P(B′).
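Assuming the payoffs of Kahneman and Tversky's (1979) problems 3 and 4, the prices above can be checked numerically; the `price` helper is our sketch of the moment-pricing rule, and the two-stage game B″ is valued through the price of its second stage:

```python
# Check of the Kahneman-Tversky (1979) prices with the assumed payoffs:
# A: 3000 surely; B: 4000 (p=.80); A': 3000 (p=.25); B': 4000 (p=.20).
# Moment prices as in the text: P0=1, Ps=-0.2, Pz=0.1, Pk=-0.001.
def price(outcomes, probs, P0=1.0, Ps=-0.2, Pz=0.1, Pk=-0.001):
    mu = sum(x * p for x, p in zip(outcomes, probs))
    cm = lambda s: sum(p * (x - mu) ** s for x, p in zip(outcomes, probs))
    root = lambda v, s: (abs(v) ** (1.0 / s)) * (1 if v >= 0 else -1)
    return (P0 * mu + Ps * root(cm(2), 2)
            + Pz * root(cm(3), 3) + Pk * root(cm(4), 4))

A  = price([3000.0], [1.0])
B  = price([4000.0, 0.0], [0.80, 0.20])
A1 = price([3000.0, 0.0], [0.25, 0.75])
B1 = price([4000.0, 0.0], [0.20, 0.80])
B2 = price([B, 0.0], [0.25, 0.75])  # two-stage game valued through P(B)
print(round(B, 2), round(A1, 2), round(B1, 2), round(B2, 2))
```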
The Tversky and Kahneman (1981) Paradox
Most subjects, confronted with the following alternatives, A versus B and A′ versus B′, prefer B to A but also A′ to B′, where A gives 30 with 25% probability and B gives 45 with 20% probability, while A′ and B′ are the two-stage versions in which the second stage, reached with 25% probability, gives 30 with certainty (A′) or 45 with 80% probability (B′).
The paradox (reversal or isolation effect) stems from the fact that not only A = A′ but also B = B′ in terms of ultimate outcomes and probabilities.
However, from the point of view of our theory of valuation and choice, the two-stage frame in B′ is not irrelevant: in B, not 0 means 45; in B′, not 0 means a new game B″ (the second-stage game), which can be sold for a certain price.
Using the first four moments in Table 6 and assuming the following prices of the four moments: P0 = 1, Pσ = −0.2, Pζ = 0.05, Pκ = −0.1, we obtain the prices of the lotteries: P(A) = 3.98 < P(B) = 4.01 and P(A′) = 3.98 > P(B′) = 3.84, being 30 > P(B″) = 28.95. This result is in accordance with the Tversky and Kahneman (1981) experiment, showing that, in effect, no paradox is implied in the observed behavior.
In particular, note, once again, that in terms of valuation B′ is equivalent to a lottery paying P(B″) = 28.95 with probability 0.25 and 0 otherwise and the risk-neutral probabilities for B″, for which the personal price of B″ is the expected value:
P(B″) = q(45)·45 + q(0)·0
are given by:
q(45) = 28.95/45 ≈ 0.643, q(0) ≈ 0.357
This observation also holds for the Markowitz (1959) formulation of Allais's experiment.
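Assuming the payoffs stated above for the Tversky-Kahneman (1981) games, the prices can be reproduced as follows; the `price` helper is again our sketch of the moment-pricing rule, with the two-stage game valued through the price of its second stage:

```python
# Check of the Tversky-Kahneman (1981) prices with the assumed payoffs:
# A: 30 (p=.25); B: 45 (p=.20); the second stage (reached with p=.25)
# gives 30 surely in A' and 45 (p=.80) in B'.
# Moment prices as in the text: P0=1, Ps=-0.2, Pz=0.05, Pk=-0.1.
def price(outcomes, probs, P0=1.0, Ps=-0.2, Pz=0.05, Pk=-0.1):
    mu = sum(x * p for x, p in zip(outcomes, probs))
    cm = lambda s: sum(p * (x - mu) ** s for x, p in zip(outcomes, probs))
    root = lambda v, s: (abs(v) ** (1.0 / s)) * (1 if v >= 0 else -1)
    return (P0 * mu + Ps * root(cm(2), 2)
            + Pz * root(cm(3), 3) + Pk * root(cm(4), 4))

A  = price([30.0, 0.0], [0.25, 0.75])
B  = price([45.0, 0.0], [0.20, 0.80])
B2 = price([45.0, 0.0], [0.80, 0.20])  # second-stage game, worth < 30
A1 = A                                 # 0.25 chance of 30 surely equals A
B1 = price([B2, 0.0], [0.25, 0.75])    # second stage replaced by its price
print(round(A, 2), round(B, 2), round(B1, 2), round(B2, 2))
```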
CONCLUSIONS
After Tobin (1958), considerable effort has been devoted to connecting the expected utility approach to a utility function directly expressed in terms of moments. In contrast with this approach, we have provided the theoretical foundation of an ordinal utility function of moments which is free of any independence axiom and compatible with all the behavioral paradoxes documented in recent and less recent works on decisions under uncertainty. This moment-utility can be used as the starting point for a new formulation of asset demand models and asset pricing, having more general properties and greater flexibility than existing expected utility results. Future research could be addressed in this promising direction.
APPENDIX: BASIC AXIOMS AND PROOFS OF THE THEOREMS
Let (Ω, A, P) be a probability space and let X be a random variable with mean μ.
Definition of s-order modified central moment:
(A.1) μ(s) ≡ sign(E[(X − μ)^s]) · |E[(X − μ)^s]|^(1/s), s ≥ 2
Note that (μ(s))^s is the usual central moment of order s≥2.
Let Q⊆Rm be a rectangular subset of Rm (the Cartesian product of m real intervals), whose elements are the m-dimensional vectors of moments, M∈Q.
Assumption of Preference Order
Let ≻ be a preference order, i.e., a binary relation defined by a subset of Q×Q.
We write Ma≻Mb instead of (Ma, Mb)∈≻.
Clearly, for any pair, either Ma≻Mb or not(Ma≻Mb).
I. Axiom of Asymmetric Preferences
We assume that ≻ is asymmetric, i.e., that:
(A.2) Ma≻Mb ⇒ not(Mb≻Ma)
The relation ≻ is therefore irreflexive and, moreover, if Mb≻Ma is also false, neither of the two vectors is preferred to the other.
In the latter case we say that Ma and Mb are equivalent and we write Ma ~ Mb.
Definition of Equivalence
Ma ~ Mb if and only if not(Ma≻Mb) and not(Mb≻Ma).
Theorem 1 of Complete Preferences
Given Ma, Mb ∈Q then one and only one case holds:
Ma≻Mb or Mb≻Ma or Ma~Mb.
Proof: It is easy to show that any two of these cases holding together yield a contradiction.
II. Axiom of Transitive Preferences
We assume that ≻ is transitive:
(A.3) Ma≻Mb and Mb≻Mc ⇒ Ma≻Mc
Definition of Weak Order
The preference order ≻ is a weak order if it is asymmetric and transitive.
Definition of Negatively Transitive Preferences
≻ is negatively transitive if not(Ma≻Mb) and not(Mb≻Mc) imply not(Ma≻Mc).
Lemma 1
≻ is negatively transitive if and only if, for every Ma, Mb, Mc∈Q, Ma≻Mb implies Ma≻Mc or Mc≻Mb.
Proof: Under negative transitivity, if Ma≻Mb but not(Ma≻Mc) and not(Mc≻Mb), then not(Ma≻Mb), a contradiction. Vice versa, if Ma≻Mb implies Ma≻Mc or Mc≻Mb, then, from not(Ma≻Mc) and not(Mc≻Mb), it follows not(Ma≻Mb), i.e., negative transitivity.
Theorem 2 of Negatively Transitive Preferences
Asymmetric and transitive preferences are equivalent to asymmetric and negatively
transitive.
Proof: Under transitivity, if Ma≻Mb and Mb≻Mc then Ma≻Mc; therefore, by asymmetry and Theorem 1, from not(Ma≻Mb) and not(Mb≻Mc) it follows not(Ma≻Mc), i.e., ≻ is negatively transitive. Vice versa, under negative transitivity, if Ma≻Mb and Mb≻Mc then, from Lemma 1, (Ma≻Mc or Mc≻Mb) and (Mb≻Ma or Ma≻Mc). But, by asymmetry, not(Mb≻Ma) and not(Mc≻Mb), so that in both cases Ma≻Mc: ≻ is transitive.
Theorem 3 of Equivalence Classes
The equivalence ~ is reflexive, symmetric and transitive and we have:
(A.4) Ma ~ Ma′, Mb ~ Mb′ and Ma≻Mb imply Ma′≻Mb′
Moreover, ≻ on Q|~ (the set of equivalence classes of Q under ~) is a strict order, i.e., it is a weak order and for every pair of distinct equivalence classes MA, MB ∈Q|~ one and only one case holds: MA≻MB or MB≻MA (weak connectedness).
Proof: The equivalence is clearly reflexive and symmetric. Suppose it is not transitive: Ma~Mb and Mb~Mc but Ma~Mc is false. Then, by definition, either Ma≻Mc or Mc≻Ma. From Lemma 1, in the first case, Ma≻Mb or Mb≻Mc; in the second case, Mc≻Mb or Mb≻Ma, in contradiction with the hypothesis.
If Ma≻Mb and Ma~Mc then by Theorem 1 and Lemma 1 we have Mc≻Mb. If Ma≻Mb and Mb~Mc then by Theorem 1 and Lemma 1 we have Ma≻Mc.
Finally, ≻ on Q|~ is a weak order, being asymmetric and negatively transitive: for asymmetry, if MA≻MB and MB≻MA then there exist Ma, Ma′ in MA and Mb, Mb′ in MB such that Ma~Ma′, Mb~Mb′, Ma≻Mb and Mb′≻Ma′. From (A.4) Ma′≻Mb′ and Mb′≻Ma′, which is a contradiction; for negative transitivity, if MA≻MB then Ma≻Mb for any Ma in MA and Mb in MB and if Mc is any vector in MC then we have, from Lemma 1, Mc≻Mb or Ma≻Mc. Therefore, MC≻MB or MA≻MC.
For weak connectedness, given that MA and MB are disjoint, from Theorem 1, either Ma≻Mb or Mb≻Ma for every Ma in MA and Mb in MB. Therefore, either MA≻MB or MB≻MA.
III. Axiom of Continuity
There is a countable subset D⊆Q|~ that is ≻-dense in Q|~, i.e., for every MA, MC ∈Q|~\D with MA≻MC there is MB∈D such that:
(A.5) MA≻MB and MB≻MC
Note that the subset of rational numbers is >-dense and <-dense in the set of real numbers.
Theorem 4 of Ordinal Utility on Moments
Proof: The proof follows the steps as in Fishburn (1970).
Theorem 5 of Continuous Utility
Proof: See Debreu (1964).
Theorem 6 on Stochastic Dominance and Expected Utility
Proof: (i) using integration by parts:
EF(U(x)) − EG(U(x)) = ∫ U(x) d[F(x) − G(x)] = ∫ U′(x)[G(x) − F(x)] dx
Therefore, if F(x)≤G(x) then EF(U(x))≥EG(U(x)). Vice versa, if EF(U(x))≥EG(U(x)) for every increasing U, let I be an interval in which F(x)>G(x) and let χI be the indicator function of I. Define:
U(x) ≡ ∫_{−∞}^{x} χI(t) dt
so that U′(x) ≡ χI(x) ≥ 0 and, inserted into the above equation, gives a contradiction. For (ii) and (iii) see Fishburn and Vickson (1978) and Whitmore (1970).
Theorem 7 on Stochastic Dominance and Moment Utility
Proof: Apply Fishburn (1980, theorem 1) and the relation between central and non central moments, ν(j) ≡ E(X^j):
E[(X − μ)^k] = Σ_{j=0}^{k} (k choose j) (−μ)^(k−j) ν(j)