(A pdf version may be clearer)
This page puts an upper bound on the difference between the mode and mean of a random variable with a weakly unimodal probability density function or with a weakly unimodal probability distribution, and proves this bound. It also discusses when the bound is achieved or approached, what constraints there are on the distribution on one side of the mode, how this relates to Gauss's inequality, a speculative equivalent result for the median, and the minimising properties of a uniform distribution.
To prove that for a
weakly unimodal random variable X:
|mode(X)-E(X)|<=sqrt(3).sd(X)
or equivalently (mode(X)-E(X))^{2}<=3.Var(X)
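As a quick numerical sanity check (not part of the proof), the bound can be verified against a few standard weakly unimodal distributions whose mode, mean and standard deviation are known in closed form:

```python
import math

# closed-form (mode, mean, sd) for a few weakly unimodal distributions;
# the uniform mode is taken at the right end, as discussed in Note B
cases = [
    ("exponential(1)", 0.0, 1.0, 1.0),
    ("uniform(0,1)", 1.0, 0.5, 1 / math.sqrt(12)),
    ("triangular(0,1) peaked at 1", 1.0, 2 / 3, math.sqrt(1 / 18)),
]

for name, mode, mean, sd in cases:
    gap = abs(mode - mean)
    # the claimed bound: |mode(X) - E(X)| <= sqrt(3).sd(X)
    assert gap <= math.sqrt(3) * sd + 1e-12, name
```

The uniform case meets the bound with equality, matching the equality case discussed in Note B.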
Step 0 (Return to top)
Step 1 (Return to top)
Step 2 (Return to top)
Step 3 (Return to top)
Note A: Random variables combining continuous and discrete elements (Return to top)
Steps 0 to 3 dealt with continuous random variables while Step 4 dealt with discrete random variables. The remaining question is that of combinations of the two
We can exclude random variables with two or more points of positive probability together with continuous elements since they will fail a suitable definition of being unimodal similar to that in Step 4
That leaves random variables that combine a single point of positive probability together with continuous elements. For this to be unimodal, the point of positive probability needs to be the mode of the random variable, and the continuous elements need to be weakly monotonically increasing up to that point and weakly monotonically decreasing down from that point. In such a case the calculations in Steps 0 to 3 still apply, and so the result applies.
Note B: Achieving equality |mode(X)-E(X)|=sqrt(3).sd(X) (Return to top)
The equality |mode(X)-E(X)|=sqrt(3).sd(X) is achieved for any uniform distribution if the mode is taken as being at one end
If this is not seen as being sufficiently unimodal, then equality is not achieved for a continuous random variable except in the trivial case where the distribution is a single atom, since equality requires a=0 in Step 3, and requires f(x)=u(x), possibly except at x=0 and x=2.m/(1-a), in Step 2
But consider a continuous random variable X_{d} with
probability density function p_{d}(x)
where for some d, i and j with j>0 and 0<d<=1/j
p_{d}(x)=0 if x<=i, p_{d}(x)=(1+2.d.x-2.d.i-d.j)/j if
i<x<=i+j, p_{d}(x)=0 if i+j<x,
then mode(X_{d})=i+j
E(X_{d})=i+j/2+d.j^{2}/6
Var(X_{d})=j^{2}/12-d^{2}.j^{4}/36
(mode(X_{d})-E(X_{d}))^{2}/Var(X_{d})=(3-d.j)^{2}/(3-d^{2}.j^{2})
=3-2.d.j+d^{2}.j^{2}.(4-2.d.j)/(3-d^{2}.j^{2})
So as d tends to 0 from above, X_{d}
approaches a uniform distribution
and (mode(X_{d})-E(X_{d}))^{2}/Var(X_{d})
tends to 3 from below
By comparison,
for a triangular distribution, d=1/j and (mode(X_{d})-E(X_{d}))^{2}/Var(X_{d})=2,
while for an exponential distribution (mode(X)-E(X))^{2}/Var(X)=1
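The algebra above can be checked with exact rational arithmetic. The sketch below (the helper name `ratio` is mine, not the author's) evaluates (mode(X_{d})-E(X_{d}))^{2}/Var(X_{d}) directly from the moments given above:

```python
from fractions import Fraction

def ratio(d, j):
    """(mode(X_d) - E(X_d))^2 / Var(X_d), using the moments derived above."""
    # mode is at i + j; the offset i cancels out of the difference
    mean_gap = j - (Fraction(j, 2) + d * j**2 / 6)
    var = Fraction(j**2, 12) - d**2 * j**4 / 36
    return mean_gap**2 / var

# d -> 0+ approaches the uniform distribution: the ratio tends to 3 from below
assert ratio(Fraction(1, 1000), 1) < 3
# d = 1/j is the triangular case: the ratio is exactly 2
assert ratio(Fraction(1, 3), 3) == 2
```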
For a discrete random variable equality is not achieved except in the trivial case where the distribution is a single atom, since equality in the inequality mode(X)-E(X)>=h/2 in Step 4 is only achieved if mode(Y)-E(Y)=0
But consider a discrete random variable Y_{n}
with probability distribution P(Y_{n}=y)
where for some h, i and integer n with h>0 and n>0
P(Y_{n}=i+j.h)=1/(n+1) if j is an integer with 0<=j<=n
and P(Y_{n}=i+j.h)=0 otherwise
then choose mode(Y_{n})=i+n.h
E(Y_{n})=i+n.h/2
Var(Y_{n})=n.(n+2).h^{2}/12
(mode(Y_{n})-E(Y_{n}))^{2}/Var(Y_{n})=3.n/(n+2)
=3-6/(n+2)
So as n tends to infinity
(mode(Y_{n})-E(Y_{n}))^{2}/Var(Y_{n})
tends to 3 from below
and in a sense the cumulative probability function of Y_{n}
approaches that of a uniform distribution, particularly if h is
made proportional to 1/n
Y_{n} is only weakly unimodal, but it would be easy to construct a similar sequence of strictly unimodal random variables with the same property
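The discrete calculation can be confirmed the same way. This sketch (the helper `discrete_ratio` is mine) computes the moments of Y_{n} directly, taking h=1 and i=0 since both cancel out of the ratio:

```python
from fractions import Fraction

def discrete_ratio(n):
    """(mode(Y_n) - E(Y_n))^2 / Var(Y_n) for the uniform distribution on 0, 1, ..., n."""
    pts = list(range(n + 1))
    mean = Fraction(sum(pts), n + 1)
    var = Fraction(sum(p * p for p in pts), n + 1) - mean**2
    mode = n  # every point ties for the mode; choose the top end as in the text
    return (mode - mean) ** 2 / var

# matches 3.n/(n+2) = 3 - 6/(n+2), tending to 3 from below
assert discrete_ratio(1) == 1
assert all(discrete_ratio(n) == Fraction(3 * n, n + 2) for n in range(1, 30))
```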
For a random variable which combines discrete and continuous elements equality is not achieved, since it requires a=0 in Step 3
Note C: A one-tailed inequality for P(X>=mode(X)) (Return to top)
Since in Step 3 above
Var(X)>=(mode(X)-E(X))^{2}.(1+3.P(X>=mode(X)))/(3.(1-P(X>=mode(X))))
we get the result:
If mode(X)>=E(X)
then P(X>=mode(X))<=(3.Var(X)-(mode(X)-E(X))^{2})/(3.(Var(X)+(mode(X)-E(X))^{2}))
i.e. P(X>=mode(X))<=1-4.(mode(X)-E(X))^{2}/(3.(Var(X)+(mode(X)-E(X))^{2}))
From this we have
P(X<=mode(X))>=4.(mode(X)-E(X))^{2}/(3.(Var(X)+(mode(X)-E(X))^{2}))
and by considering Y=-X we get the result:
If mode(Y)<=E(Y)
then P(Y>=mode(Y))>=4.(E(Y)-mode(Y))^{2}/(3.(Var(Y)+(E(Y)-mode(Y))^{2}))
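As an illustration (my own check, not part of the original argument), the first inequality is tight for a uniform distribution with the mode taken at one end, where P(X>=mode(X))=0:

```python
# uniform on [0, 1] with the mode taken at the right end
var = 1 / 12
gap_sq = (1.0 - 0.5) ** 2            # (mode(X) - E(X))^2 = 1/4 = 3 * var
bound = (3 * var - gap_sq) / (3 * (var + gap_sq))
assert abs(bound) < 1e-12            # the bound collapses to 0 = P(X >= mode(X))

# triangular case (Note B, d = 1/j): gap_sq = 2 * var, so the bound is 1/9
var_t = 1 / 18
gap_sq_t = 2 * var_t
bound_t = (3 * var_t - gap_sq_t) / (3 * (var_t + gap_sq_t))
assert abs(bound_t - 1 / 9) < 1e-12  # actual P(X >= mode(X)) is 0 <= 1/9
```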
Note D: Weakening and simplifying Gauss's inequality (Return to top)
The terms in these results have slight similarities with
Gauss's inequality of 1821 for a unimodal distribution:
P(|X-mode(X)|>=g.sqrt(Var(X)+(mode(X)-E(X))^{2}))<=4/(9.g^{2})
This could become P(|X-mode(X)|>=2.g.sd(X))<=16/(9.(2.g)^{2})
i.e. P(|X-mode(X)|>=k.sd(X))<=16/(9.k^{2})
or P(|X-mode(X)|>=t)<=Var(X).16/(9.t^{2})
which is weaker but simpler than Gauss's inequality
and is not that far away in form from Chebyshev's inequality:
P(|X-E(X)|>k.sd(X))<=1/k^{2}
or P(|X-E(X)|>t)<=Var(X)/t^{2}
though in a unimodal case, Chebyshev's inequality could be
considerably tightened
In the extreme case of the uniform distribution where the mode
is taken as being at one end,
P(|X-mode(X)|>=k.sd(X))=16/(9.k^{2}) for only one
value of k:
4/sqrt(3), i.e. about 2.309401.., when P(|X-mode(X)|>=k.sd(X))=1/3
For all other unimodal distributions or values of k, equality is
not achieved
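That single point of equality can be checked directly (a sketch of mine, using the same uniform-with-mode-at-the-end setup):

```python
import math

# uniform on [0, 1] with the mode taken at 0, so sd = 1/sqrt(12)
sd = 1 / math.sqrt(12)
k = 4 / math.sqrt(3)

tail = max(0.0, 1 - k * sd)   # P(|X - mode(X)| >= k*sd) = P(X >= k*sd)
bound = 16 / (9 * k**2)

# both sides equal 1/3 at k = 4/sqrt(3), about 2.3094
assert abs(tail - 1 / 3) < 1e-9
assert abs(bound - 1 / 3) < 1e-9
```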
Note E: Speculation on median(X)E(X) for unimodal distributions (Return to top)
The result above is |mode(X)-E(X)|<=sqrt(3).sd(X)
or equivalently (mode(X)-E(X))^{2}<=3.Var(X)
For a general random variable the equivalent result for the
median is well known as
|median(X)-E(X)|<=sd(X)
or equivalently (median(X)-E(X))^{2}<=Var(X)
I suspect that for a continuous random variable with a unimodal
distribution the equivalent result for the median is
|median(X)-E(X)|<=sqrt(3/5).sd(X)
where sqrt(3/5)=0.77459..
or equivalently 5.(median(X)-E(X))^{2}<=3.Var(X)
derived from speculation on a one-tailed version of
Chebyshev's inequality for unimodal distributions, with equality
approached for distributions with about half their probability
very close to a single point and the remainder virtually
uniformly distributed on one side of that point
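The speculated limiting shape can at least be evaluated. Putting probability 1/2 in an atom at 0 (standing in for "probability very close to a single point", and making it the median) and spreading the rest uniformly over (0, 1] gives exactly the conjectured ratio sqrt(3/5). This check is mine, not the author's:

```python
import math

# half the probability at 0, the other half uniform on (0, 1]
mean = 0.5 * 0 + 0.5 * 0.5            # = 1/4
second_moment = 0.5 * 0 + 0.5 * (1 / 3)
var = second_moment - mean**2          # = 1/6 - 1/16 = 5/48
median = 0.0                           # the atom carries probability 1/2

ratio = abs(median - mean) / math.sqrt(var)
assert abs(ratio - math.sqrt(3 / 5)) < 1e-9
```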
This last result is tighter than the case for discrete random
variables:
for example, if P(Y=0)=1/2-d and P(Y=1)=1/2+d for some small d>0,
then this meets the definition of "weakly unimodal" in
Step 4,
but median(Y)-E(Y)>=(1-2.d).sd(Y), which can be very close to
sd(Y)
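The two-point example can be made concrete (the helper name is mine):

```python
import math

def median_gap_over_sd(d):
    """P(Y=0) = 1/2 - d, P(Y=1) = 1/2 + d: then median(Y) = 1, E(Y) = 1/2 + d."""
    mean = 0.5 + d
    var = mean * (1 - mean)              # Bernoulli-type variance
    return (1 - mean) / math.sqrt(var)   # (median(Y) - E(Y)) / sd(Y)

# the ratio approaches 1 as d -> 0+, well above sqrt(3/5), about 0.7746
assert median_gap_over_sd(1e-6) > 0.999
assert median_gap_over_sd(0.1) > math.sqrt(3 / 5)
```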
Note F: Among positive monotonically decreasing continuous random variables, a uniform distribution minimises key descriptive statistics (Return to top)
Step 2 above showed that for a positive random variable with a known mean, an unknown finite variance and a monotonically decreasing probability density function, the second moment E(X^{2}) is bounded below by that of a uniform distribution with a minimum of zero and the same mean
Similar methods can be used to show directly that this uniform distribution provides a lower bound for both the variance and the maximum
Similar methods can also show that for a positive random variable with an unknown finite mean, an unknown finite variance and a monotonically decreasing probability density function bounded above by a known value (at or near zero), the first moment E(X), the second moment E(X^{2}), the variance and the maximum are all bounded below by those of a uniform distribution with a minimum of zero and the same bound on the maximum value of the probability density function
This ability of either (as in Step 2) a uniform distribution, or (as in Step 3 and Note A) a combination of a point of positive probability and a uniform distribution to be the limiting case can lead to speculation about possible further results
For example, it might be the case that for a continuous random
variable X with a unimodal probability density function and with
values over a finite range:
maximum(X)-minimum(X)>=3.sd(X)
with equality approached for distributions with about a third of
their probability very close to a single point and the remainder
virtually uniformly distributed on one side of that point
maximum(X)-minimum(X)>=4.|median(X)-E(X)|
with equality approached for distributions with about half their
probability very close to a single point and the remainder
virtually uniformly distributed on one side of that point
maximum(X)-minimum(X)>=2.|mode(X)-E(X)|
with equality approached for distributions virtually uniformly
distributed and with the mode at one end
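These limiting shapes can be evaluated numerically (my check, with an atom again standing in for "probability very close to a single point"):

```python
import math

def stats(p):
    """Atom of probability p at 0 plus probability 1-p uniform on (0, 1]; range = 1."""
    mean = (1 - p) / 2
    var = (1 - p) / 3 - mean**2
    return mean, math.sqrt(var)

# p = 1/3: range = 3 * sd, matching the first speculation
mean, sd = stats(1 / 3)
assert abs(1 - 3 * sd) < 1e-12

# p = 1/2: the atom is the median, and range = 4 * |median - mean|
mean, sd = stats(1 / 2)
assert abs(1 - 4 * abs(0 - mean)) < 1e-12

# p = 0 (plain uniform, mode taken at the end): range = 2 * |mode - mean|
mean, sd = stats(0)
assert abs(1 - 2 * abs(1 - mean)) < 1e-12
```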
The results for discrete random variables with unimodal
probability distributions (as defined in Step 4)
can be very different, perhaps
maximum(Y)-minimum(Y)>=2.sd(Y)
and at first glance it seems likely that this result may in fact
apply to all bounded random variables, unimodal or not, whether
they are discrete, continuous or mixed
If so, then we have the slightly broader result
maximum(Y)-minimum(Y)>=2.sd(Y)>=2.|median(Y)-E(Y)|
with the possibility of achieving or at least approaching
equality
Return to top or go to some Statistics Jokes or look at a one-tailed version of Chebyshev's inequality and further discussion or the mean, median, mode and standard deviation relationship or see Henry Bottomley's home page
Copyright December 1999 Henry Bottomley. All rights reserved.