# Fractional Moments of the Geometric Distribution

In one of my papers, I needed an upper bound on the fractional moments of the geometric distribution. The integer moments have a nice-looking upper bound involving factorials. In addition, Mathematica showed that the same upper bound holds for fractional moments once we use Gamma function instead of factorials. The problem was, I could not prove it. Worse, I could not find a proof after much looking on the Internet.

This post contains a proof that for any real $\lambda > 1$, the $\lambda$th moment of a geometric random variable with success probability $\displaystyle p \geq 1/2$ is at most $\displaystyle \Gamma(\lambda + 1)/p^\lambda$.

## Geometric distribution and its moments

Definition 1 (Geometric distributions ${\mathcal{G}_\alpha^+}$ and ${\mathcal{G}_\alpha}$) Let ${\alpha \in (0, 1/2)}$. Define ${\mathcal{G}_\alpha^+}$ as the distribution on positive integers ${k}$ satisfying ${\Pr_{X \sim \mathcal{G}_\alpha^+}[X = k] = \alpha^{k-1}(1 - \alpha)}$. Define ${\mathcal{G}_\alpha}$ as the distribution on non-negative integers ${k}$ satisfying ${\Pr_{Y \sim \mathcal{G}_\alpha}[Y = k] = \alpha^k(1 - \alpha)}$.

Let ${X \sim \mathcal{G}_\alpha^+}$ and ${Y \sim \mathcal{G}_\alpha}$. Semantically, given an ordered list of independent and identical Bernoulli trials, ${X}$ counts the number of trials to get the first success while ${Y}$ counts the number of failures before the first success.

For any real ${\lambda > 0}$, we can express the moments of ${X}$ in terms of polylogarithm functions, as follows:

$\displaystyle \mathop\mathbb{E}[X^\lambda] = \sum_{x=1}^\infty{ x^\lambda (1-\alpha) \alpha^{x-1} } = \frac{1-\alpha}{\alpha} \sum_{x=1}^\infty{ \frac{\alpha^{x}}{x^{-\lambda} } } = \frac{1-\alpha}{\alpha}\mathrm{Li}_{-\lambda}(\alpha) \ \ \ \ \ (1)$

where ${\mathrm{Li}_{\lambda}}$ is the polylogarithm function of order ${\lambda}$. It is defined as

$\displaystyle \mathrm{Li}_{\lambda}(\alpha) \triangleq \sum_{k=1}^\infty{ \frac{\alpha^k}{k^\lambda}} \qquad \text{for}\quad \alpha, \lambda \in \mathbb{R} \,. \ \ \ \ \ (2)$

Polylogarithm functions of non-positive integer orders can be found using the recursive rule

$\displaystyle \mathrm{Li}_1(\alpha) = -\log(1-\alpha) \, , \qquad \text{and} \qquad \dfrac{\partial}{\partial \alpha}{\mathrm{Li}_{\lambda}(\alpha)} = \frac{\mathrm{Li}_{\lambda-1}(\alpha)}{\alpha} \, ,\quad \lambda \in 0, 1, 2, 3, \ldots \, .$

This allows us to compute

$\displaystyle \mathrm{Li}_{-1}(\alpha) = \frac{\alpha}{(1-\alpha)^2}\, , \qquad \text{and} \qquad \mathrm{Li}_{-2}(\alpha) = \frac{\alpha (1+\alpha)}{(1-\alpha)^3} \, . \ \ \ \ \ (3)$

Recall from~(1) that for any real ${\lambda > 0}$, ${\mathop\mathbb{E}[X^\lambda] = \frac{1-\alpha}{\alpha}\mathrm{Li}_{-\lambda}(\alpha)}$. It follows that

$\displaystyle \mathop\mathbb{E} [X] = \frac{1}{1-\alpha} \quad\text{and}\quad \mathop\mathbb{E}[X^2] = \frac{1 + \alpha}{(1-\alpha)^2} \, . \ \ \ \ \ (4)$

It is also easy to check that

$\displaystyle \mathop\mathbb{E} Y^\lambda = \mathop\mathbb{E}[(X-1)^\lambda] = \alpha\,\mathop\mathbb{E}[X^\lambda] \,. \ \ \ \ \ (5)$

## Main Proposition

Proposition For any real ${\lambda \geq 1, \alpha \in (0, 1/2)}$, and ${X \sim \mathcal{G}_\alpha^+}$,

$\displaystyle \mathop\mathbb{E}[X^\lambda] \leq \frac{\Gamma(1+\lambda)}{(1-\alpha)^\lambda} \, , \ \ \ \ \ (6)$

and an equality is achieved with ${\lambda = 1}$ and ${\alpha = 1/2}$.

Proof.

Case: ${\lambda}$ is an integer. (This special case has a short proof which we record here for completeness. Note, however, that this case is subsumed by the following cases.)

For any real ${a > 0}$ and ${\alpha \in (0, 1)}$ such that ${\alpha e^a < 1}$, the moment generating function of ${X}$ is

$\displaystyle \begin{array}{rcl} M_X(a) &=& \mathop\mathbb{E}[e^{a X}] = \sum_{\lambda=0}^\infty{\frac{a^\lambda \mathop\mathbb{E}[X^\lambda]}{\lambda!}} \nonumber \\ &=& \frac{1-\alpha}{\alpha} \sum_{k \geq 1} \alpha^k \sum_{\lambda \geq 0} \frac{a^\lambda k^\lambda }{\lambda!} = \frac{1-\alpha}{\alpha} \sum_{k \geq 1} (\alpha e^a)^k \\ &=& \frac{1-\alpha}{\alpha} (1 - \frac{1}{1-\alpha e^a}) \,, \quad\text{or} \end{array}$

$\displaystyle M_X(a) = \frac{(1-\alpha) e^a}{1-\alpha e^a}\,. \ \ \ \ \ (7)$

We can differentiate the moment generating function ${M_X(a)}$ from (7) to the order ${\lambda}$ and utilize the fact ${\alpha < 1/2}$ to get

$\displaystyle \begin{array}{rcl} \frac{d^\lambda}{d a^\lambda}M_X(a) &=& \frac{(1-\alpha)e^a}{(1-\alpha e^a)^{\lambda+1}} \sum_{k=0}^{\lambda-1}{\genfrac<>{0pt}{}{t}{k} \alpha^k} \\ &\leq& \frac{(1-\alpha)e^a}{(1-\alpha e^a)^{\lambda+1}} \cdot \frac{\lambda!}{2} \cdot \frac{1}{1-\alpha} \\ &\leq& \frac{e^a \lambda!}{(1-\alpha e^a)^{\lambda}} \end{array}$

where ${\genfrac<>{0pt}{}{\lambda}{k}}$ are the Eulerian numbers satisfying ${\genfrac<>{0pt}{}{\lambda}{k} \leq \lambda!/2}$ for any ${0 \leq k \leq \lambda-1}$. It follows that

$\displaystyle \mathop\mathbb{E}[X^\lambda] =\frac{d^\lambda}{d a^\lambda}M_X(a)\bigg\vert_{a=0} = \frac{1}{(1-\alpha )^{\lambda}} \sum_{k=0}^{\lambda-1}{\genfrac<>{0pt}{}{t}{k}\alpha^k} \,, \ \ \ \ \ (8)$

and

$\displaystyle \mathop\mathbb{E}[X^\lambda] \leq \frac{\lambda!}{(1-\alpha)^{\lambda}} \,. \ \ \ \ \ (9)$

Case: ${\lambda \geq 2}$ is a real. The proof for a real ${\lambda}$ builds on the analytic properties of ${\Gamma(1+\lambda)}$ and ${\mathrm{Li}_{-\lambda}(\alpha)}$. Let us focus on the functions

$\displaystyle f_\alpha(\lambda) \triangleq \frac{(1-\alpha)^{\lambda+1} }{\alpha} \mathrm{Li}_{-\lambda}(\alpha)\,, \qquad\text{and} \ \ \ \ \ (10)$

$\displaystyle f(\lambda) \triangleq f_{1/2}(\lambda) = 2^{-\lambda} \mathrm{Li}_{-\lambda}(1/2) \,. \ \ \ \ \ (11)$

We claim that ${f_\alpha(\lambda) \leq f(\lambda)}$ for a fixed ${\lambda}$. Specifically, observe that ${f_\alpha(\lambda)}$ is a product of ${1/\alpha, (1-\alpha)^{1+\lambda}}$, and ${\mathrm{Li}_{-\lambda}(\alpha)}$; since each of these functions is convex in ${\alpha}$, so is ${f_\alpha(\lambda)}$. Moreover, since ${f_0(\lambda) < f_{1/2}(\lambda)}$, we conclude that ${f_{\alpha}(\lambda) \leq f(\lambda)}$. Thus the claim is equivalent to showing ${f(\lambda) \leq \Gamma(1+\lambda)}$ for the specified regime of ${\lambda}$. Suppose ${\lambda \in [a, a+1]}$ for an integer ${a \geq 2}$. Let ${g(\lambda)}$ be the first-order approximation of ${\Gamma(1+\lambda)}$ around ${\lambda = a}$. Since the derivatives ${\Gamma^{(n)}(1+\lambda)}$ are positive for every ${n \geq 0}$, ${g(1+\lambda)}$ is a lower bound for ${\Gamma(1+\lambda)}$. Recall that ${\Gamma^\prime(1+\lambda) = \Gamma(1+\lambda) \Psi(1+\lambda)}$ where ${\Psi}$ is the Digamma function defined as ${\Psi(s) \triangleq \Gamma^\prime(s)/\Gamma(s)}$. Since ${\Psi(1+\lambda) > \log(1/2+\lambda)}$ (cf. (3) in \cite{diamond2016bounds}), it follows that

$\displaystyle \Gamma(1+\lambda) > g(\lambda; a) \triangleq \Gamma(1+a) + (\lambda - a) \Gamma(1+a) \log(1/2 + a) \, .$

On the other hand, the straight-line segment ${\phi(\lambda)}$ determined by the points ${(a, f(a))}$ and ${(a+1, f(a+1))}$ serves as an upper bound on the convex function ${f(\lambda)}$ in the interval ${[a, a+1]}$. This line is

$\displaystyle \phi(\lambda; a) \triangleq f(a) + (\lambda - a) ( f(a+1) - f(a) ) \, .$

Observe that the line ${\phi(\lambda; a)}$ stays below the line ${g(\lambda; a)}$ for ${\lambda \in [a, a+1], a = 2, 3, 4, \cdots}$ since we can use (1) and (9) to show that

$\displaystyle \phi(a;a) = f(a) = 2^{-a} \mathrm{Li}_{-a}(1/2) \leq 2^{-a} \cdot \frac{1/2}{1-1/2} \frac{a!}{(1-1/2)^a} = a! = \Gamma(1+a) = g(a; a) \, .$

This proves the claim for every real ${\lambda \geq 2}$.

Case: ${\lambda \in [1, 2]}$ is a real. Here, as before, we seek to show that ${f(\lambda) \leq \Gamma(1+\lambda)}$ for ${\lambda \in [1,2]}$. The straight-line bounds from the previous case does not work in this case because ${g(\lambda; 1) < f(\lambda; 1)}$ for ${\lambda \in [1, 2]}$. Our argument, instead, relies on the following expression of the negative-order Polylogarithms in term of the Gamma function (cf. (13.1) in \cite{wood1992polylog}):

$\displaystyle \mathrm{Li}_{-\lambda}(\alpha) = \Gamma(1+\lambda) \sum_{k = -\infty}^\infty{ (- \log(\alpha) + 2 \pi k i)^{1+\lambda}} \, , \qquad \lambda > 0 \, . \ \ \ \ \ (12)$

Using ${\alpha = 1/2}$ in (12) and further simplification, we can write (11) as

$\displaystyle f(\lambda) = 2^{-\lambda}\Gamma(1+\lambda) \left( T_0(\lambda) + T(\lambda) \right) \, , \ \ \ \ \ (13)$

where

$\displaystyle \begin{array}{rcl} T(\lambda) &=& \sum_{k \geq 1}{T_k(\lambda)}\, , \\ T_k(\lambda) &\triangleq& z_k^{-(1+\lambda)} + (z_k^*)^{-(1+\lambda)}\, , \qquad \text{and} \\ z_k &\triangleq& \log(2) + 2\pi k i\, , \qquad z_k \in \mathbb{C} \\ T_0(\lambda) &=& \frac{1}{(\log(2)^{1+\lambda}} \, . \end{array}$

Our plan is to show that ${T(\lambda) < 0}$ for ${\lambda \in [1,2]}$ since it would imply that

$\displaystyle \begin{array}{rcl} f(\lambda) &\leq& 2^{-\lambda} T_0(\lambda) \Gamma(1+\lambda) = \frac{2^{-\lambda} \Gamma(1+\lambda)}{ (\log(2))^{1+\lambda} } \\ &=& \frac{\Gamma(1+\lambda)}{(\log(4))^\lambda \log(2)} \leq \frac{\Gamma(1+\lambda)}{\log(8)} \\ &<& \Gamma(1+\lambda) \end{array}$

using ${\lambda \geq 1}$.

It remains to show that ${T(\lambda) < 0}$ for ${\lambda \in [1,2]}$. First, observe that we can explicitly compute ${T(\lambda)}$ for ${\lambda = 1,2}$ using (13). In particular, we can use (3) to compute ${\mathrm{Li}_{-1}(1/2) = (1/2) / (1/4) = 2}$, and ${\mathrm{Li}_{-2}(1/2) = (1/2)(3/2)/(1/8) = 6}$, and further using (13), we get

$\displaystyle \begin{array}{rcl} T(1) &=& \frac{f(1)}{2^{-1}\Gamma(1+1)} - T_0(1) = \frac{\mathrm{Li}_{-1}(1/2)}{1} - \frac{1}{(\log(2))^2} \\ &=& 2 - \frac{1}{(\log(2))^2} \approx -0.08 \, , \quad\text{and}\\ T(2) &=& \frac{f(2)}{2^{-2}\Gamma(1+2)} - T_0(2) = \frac{\mathrm{Li}_{-2}(1/2)}{2} - \frac{1}{(\log(2))^3} \\ &=& 3 - \frac{1}{(\log(2))^3} \approx -3.0027 \, , \end{array}$

and, in particular, ${T(1) < T(2) < 0}$. Thus to show that ${T(\lambda) < 0}$ for ${\lambda \in [1,2]}$, it suffices to show that ${\partial T_k(\lambda)/\partial \lambda}$ is strictly positive for every ${k \geq 1}$. The rest of the proof is devoted to this task.

Let ${r_k \triangleq \vert z_k \vert = \sqrt{\ln(2)^2 + (2 \pi k)^2}}$ and ${\theta_k \triangleq \arg{z_k} = \arctan(2 \pi k/\log(2) )}$. Then ${z_k = r_k e^{i \theta_k}, z_k^* = r_k e^{-i \theta_k}, 1/z_k = e^{-i \theta_k}/r_k}$, and ${1/z_k^* = e^{i \theta_k}/r_k}$. Thus

$\displaystyle \begin{array}{rcl} \frac{\partial}{\partial \lambda}T_k(\lambda) &=& \frac{\partial}{\partial \lambda}\left[ \left(\frac{1}{z_k} \right)^{1+\lambda} + \left(\frac{1}{z_k^*}\right)^{1+\lambda} \right] \nonumber \\ &=& \left(\frac{1}{z_k} \right)^{1+\lambda}\ln \frac{1}{z_k} + \left(\frac{1}{z_k^*}\right)^{1+\lambda} \ln \frac{1}{z_k^*} \nonumber \\ &=&\left( \frac{1}{r_k}\right)^{1+\lambda}\left( e^{-i \theta_k(1+\lambda)} \ln \frac{e^{-i \theta_k}}{r_k} + e^{i \theta_k(1+\lambda)} \ln \frac{e^{i \theta_k}}{r_k} \right)\,. \end{array}$

Thus

$\displaystyle \frac{\partial}{\partial \lambda}T_k(\lambda) = -2 \left( \frac{1}{r_k}\right)^{1+\lambda}\left( \ln(r_k) \cos(\theta_k(1+\lambda)) + \theta_k \sin (\theta_k(1+\lambda)) \right) \,. \ \ \ \ \ (14)$

Since ${r_k \geq r_1 = \log(2) > 0}$ and ${\theta_k \in [0, 2\pi)}$, the signs of the trigonometric terms decide the sign of the right-hand side above. We claim that

$\displaystyle \ln(r_k) \cos(\theta_k(1+\lambda)) + \theta_k \sin (\theta_k(1+\lambda)) < 0 \ \ \ \ \ (15)$

which, in turn, implies that the right-hand side of (14) is strictly positive, as desired. Recall that

$\displaystyle \arctan(x) = \pi/2 - \arctan(1/x)$

if ${x > 1}$, and

$\displaystyle \arctan(x) = x - \frac{x^3}{3} + \frac{x^5}{5} - \cdots < x$

if ${\vert x \vert < 1}$. Define ${\beta_k \triangleq 2 \pi k/\log(2)}$. Since ${\beta_k > 0}$ for all ${k \geq 1}$, we have

$\displaystyle \theta_k = \arctan(\beta_k) = \frac{\pi}{2} - \arctan(1/\beta_k) \in \left[ \pi/2 - 1/\beta_k, \pi/2 \right] \, ,$

and consequently, since ${\lambda \in [1,2]}$,

$\displaystyle \theta_k(1+\lambda) \in \left[ \pi - 2/\beta_k, 3\pi/2 \right] \, .$

If we have ${ \theta_k(1+\lambda) \in (\pi, 3 \pi/2)}$ then both ${\sin(\theta_k(1+\lambda)) }$ and ${\cos(\theta_k(1+\lambda))}$ are strictly negative, satisfying (15). Otherwise, if ${\theta_k(1+\lambda) \in \{\pi, 3\pi/2\}}$, exactly one of these terms is strictly negative while the other one is zero; (15) is satisfied in this case as well.

The only remaining case is ${\theta_k(1+\lambda) \in [\pi - 2/\beta_k, \pi)}$ where ${2/\beta_k = \ln(2)/\pi k < \pi/2}$; in this case, the cosine term is negative but the sine term is positive. Let ${a(k, \lambda) \triangleq \ln(r_k) \cos(\theta_k(1+\lambda))}$ and ${b(k, \lambda) \triangleq \theta_k \sin(\theta_k(1+\lambda))}$. If we can show that ${\vert a(k, \lambda) \vert > \vert b(k, \lambda) \vert}$, (15) will be satisfied. With this in mind, observe that ${k \geq 1}$ implies ${\log(r_k) = (1/2) \log(r_k^2) = (1/2) \log( (\log(2)^2 + 4 \pi^2 k^2) > 3/2}$, and that ${\theta_k \leq \pi/2}$. Using this, we calculate

$\displaystyle \begin{array}{rcl} | a(k, \lambda) | - | b(k, \lambda)| &>& (3/2) |\cos(\pi - 2/\beta_k)| - (\pi/2) |\sin(\pi - 2/\beta_k)| = (3/2) \cos(2\beta_k) - (\pi/2)\sin(2\beta_k) \\ &>& (3/2) \cos(\ln(2)/\pi) - (\pi/2)\sin(\ln(2)/\pi) \\ &>& 1 \end{array}$

by a direct evaluation, and~(15) is satisfied, as desired. $\Box$