Lemma 2.1
For a nonnegative random variable $Y$
$$ \evw{Y}=\int_0^{\infty}P\set{Y>y}dy $$
Proof
$$ \int_0^{\infty}P\set{Y>y}dy=\int_0^{\infty}\int_{y}^{\infty}f_Y(x)dxdy $$
The limits of integration are
$$ y_0=0\dq y_1=\infty\dq x_0=y\dq x_1=\infty $$
The outside variable is $y$. It must have constant limits of integration, here $0$ to $\infty$. The inside variable $x$ goes horizontally from the line $y=x$ to $\infty$. So the region to be integrated over is the wedge below the line $y=x$, above the $x$-axis, unbounded to the right, and with a vertex at the origin.
To change the order of integration, we first get the limits of integration for the new outside variable $x$. These limits must be constants. The triangular region we described has a minimum $x$ value of $0$ and a maximum $x$ of $\infty$. The inside variable $y$ goes from $0$ to the line $y=x$. That is
$$ x_0=0\dq x_1=\infty\dq y_0=0\dq y_1=x $$
Hence
$$ \int_0^{\infty}P\set{Y>y}dy=\int_0^{\infty}\int_{y}^{\infty}f_Y(x)dxdy $$
$$ =\int_{0}^{\infty}\int_{0}^{x}f_Y(x)dydx \tag{change order} $$
$$ =\int_{0}^{\infty}\Bop\int_{0}^{x}dy\Bcp f_Y(x)dx $$
$$ =\int_{0}^{\infty}xf_Y(x)dx $$
$$ =\evw{Y} $$
$\wes$
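As a quick numerical sanity check of the lemma (not part of the proof), here is a sketch for $Y$ exponential with rate $1$, where both sides should be close to $1$. The grid cutoff at $20$ is an arbitrary choice; the tail beyond it is negligible.

import numpy as np

# Lemma 2.1 check for Y ~ Exponential(1), where E[Y] = 1.
rng = np.random.default_rng(0)
samples = rng.exponential(scale=1.0, size=1_000_000)
lhs = samples.mean()                        # Monte Carlo estimate of E[Y]

# Riemann-sum approximation of the integral of P{Y > y} = e^{-y} over [0, 20]
dy = 0.01
y = np.arange(0, 20, dy)
rhs = np.exp(-y).sum() * dy

print(lhs, rhs)                             # both should be close to 1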
Variation of Proposition 2.1
If $X$ is a continuous random variable with probability density function $f(x)$, then, for any nonnegative, strictly increasing function $g:\wR\mapsto\wR$, we have
$$ \evw{g(X)}=\int_{-\infty}^{\infty}g(x)f(x)dx $$
Proof
Throughout this proof, it is helpful to imagine that $g(x)=e^{x}$ so that $g^{-1}(y)=\ln(y)$. This doesn’t limit the generality of the proof but it is a helpful tool.
First we’d like to show that $g^{-1}(y)$ is unbounded for $y\geq0$. That is, for arbitrarily large $x_0$, we want to show that there exists $y_0$ such that $g^{-1}(y_0)\geq x_0$. This is equivalent to showing that there exists $y_0$ such that $y_0\geq g(x_0)$. But we assumed that $g(x)$ is defined for all $x\in\wR$. So we set $y_0=g(x_0)$ and we’re done.
From Lemma 2.1, we have
$$ \E{g(X)}=\int_{0}^{\infty}P\set{g(X)>y}dy $$
$$ =\int_0^{\infty}\int_{x:g(x)>y}f(x)dxdy $$
$$ =\int_0^{\infty}\int_{x:x>g^{-1}(y)}f(x)dxdy $$
$$ =\int_0^{\infty}\int_{g^{-1}(y)}^{\infty}f(x)dxdy \tag{1} $$
The outside variable is $y$. It must have constant limits of integration, here $0$ to $\infty$. The inside variable $x$ goes from $g^{-1}(y)$ to $\infty$. So the region to be integrated over lies below the curve $y=g(x)$, above the $x$-axis, and is unbounded to the right.
To change the order of integration, we first get the limits of integration for the new outside variable $x$. These limits must be constants. The region we described has a minimum $x$ value of $g^{-1}(0)$ (since $y\geq0$ and $g^{-1}(y)$ is strictly increasing) and a maximum $x$ of $G=\sup_{y\geq0}g^{-1}(y)$. $G$ could be $\infty$ or it could be finite, but it is constant. The inside variable $y$ goes from $0$ to the curve $y=g(x)$.
$$ \int_0^{\infty}\int_{g^{-1}(y)}^{\infty}f(x)dxdy=\int_{g^{-1}(0)}^{G}\int_{0}^{g(x)}f(x)dydx $$
$$ =\int_{g^{-1}(0)}^{G}\Bop\int_{0}^{g(x)}dy\Bcp f(x)dx $$
$$ =\int_{g^{-1}(0)}^{G}g(x)f(x)dx $$
Now if $g^{-1}(0)=-\infty$ and $G=\infty$, then we’re done.
Suppose instead that $g^{-1}(0)=x_0>-\infty$. Then it must be that $g(x)=0$ for all $x\leq x_0$. To see this, note that $g^{-1}(0)=x_0\iff 0=g(x_0)$. But $g$ is increasing and nonnegative. Hence $0\leq g(x)\leq g(x_0)=0$ for all $x\leq x_0$. Hence $g(x)=0$ for all $x\leq x_0=g^{-1}(0)$, so the integrand $g(x)f(x)$ vanishes on $(-\infty,x_0]$. Hence
$$ \int_{g^{-1}(0)}^{G}g(x)f(x)dx=\int_{-\infty}^{G}g(x)f(x)dx $$
Similarly, suppose that $\sup_{y\geq0}g^{-1}(y)=G<\infty$. In fact this case cannot occur: at the beginning of the proof we showed that $g^{-1}(y)$ is unbounded for $y\geq0$, so $G=\sup_{y\geq0}g^{-1}(y)=\infty$. This is consistent with (1), where the inside variable $x$ runs from $g^{-1}(y)$ all the way to $\infty$.
$\wes$
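As a numerical check of the proposition (outside the proof), here is a sketch that takes $g(x)=e^{x}$, as in the running example, and $X$ standard normal, for which the exact value of $\evw{g(X)}$ is $e^{1/2}$. The choice of a standard normal $X$ is just for illustration.

import numpy as np
from scipy import integrate, stats

# Compare a Monte Carlo estimate of E[g(X)] with the integral of g(x) f(x) dx
# for g(x) = e^x and X ~ N(0, 1); both should be close to e^{1/2}.
rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)
mc = np.exp(x).mean()

quad_val, _ = integrate.quad(lambda t: np.exp(t) * stats.norm.pdf(t), -np.inf, np.inf)

print(mc, quad_val, np.exp(0.5))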
Example 2c
$U$ is uniformly distributed on $(0,1)$ hence
$$ f_U(u)=\frac1{1-0}=1 $$
$$ \evw{L_p(U)}=\int_0^1L_p(u)f_U(u)du=\int_0^1L_p(u)du $$
$$ =\int_0^p(1-u)du+\int_p^1udu \tag{2c.1} $$
Let’s compute the first integral:
$$ x=1-u\dq u=1-x\dq du=-dx\dq x_0=1-u_0=1\dq x_1=1-u_1=1-p $$
$$ \int_0^p(1-u)du=-\int_1^{1-p}xdx=-\frac{x^2}2\bigbar_1^{1-p}=\frac12-\frac{(1-p)^2}2 $$
Hence
$$ \evw{L_p(U)}=\frac12-\frac{(1-p)^2}2+\frac12-\frac{p^2}2 $$
$$ =\frac12+\frac{-(1-p)^2+1-p^2}2 $$
$$ =\frac12+\frac{-(1-2p+p^2)+1-p^2}2 $$
$$ =\frac12+\frac{-1+2p-p^2+1-p^2}2 $$
$$ =\frac12+\frac{2p-2p^2}2 $$
$$ =\frac12+p(1-p) $$
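Here is a short Monte Carlo check of this result, reading off from (2c.1) that $L_p(u)=1-u$ for $u<p$ and $L_p(u)=u$ for $u>p$; the value $p=0.3$ is an arbitrary choice for illustration.

import numpy as np

# Monte Carlo check of E[L_p(U)] = 1/2 + p(1-p) with U uniform on (0, 1),
# taking L_p(u) = 1-u for u < p and L_p(u) = u for u > p as in (2c.1).
rng = np.random.default_rng(0)
p = 0.3
u = rng.uniform(size=1_000_000)
L = np.where(u < p, 1 - u, u)

print(L.mean(), 0.5 + p * (1 - p))          # both should be close to 0.71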
Example 4i
In [2664]: nycnsbn=lambda n=11,gp=.52,sp=.5: sum([brv(n,i,gp) for i in range(int(sp*n)+1,n+1)])
In [2665]: nycns=lambda n=11,gp=.52,sp=.5: (nycnsbn(n,gp,sp),phi(.04*np.sqrt(n)))
In [2666]: nycns(1692)
Out[2666]: (0.947592560414427, 0.95005190494014158)
In [2667]: nycns(1693)
Out[2667]: (0.950194799856719, 0.95010198225329223)
Graphs of Exponential Density and Distribution
When trying to plot these graphs, I encountered a python error that I posted to stackoverflow.
The first code snippet is the efficient approach. The second snippet is inefficient but still works.
In [1]: from numpy import exp
In [2]: from matplotlib import pyplot as plt
In [3]: import numpy as np
In [4]: def graph(funct, x_range):
...: x=np.array(x_range)
...: y=funct(x)
...: plt.plot(x,y,'r--')
...: plt.show()
...:
In [5]: pdf_exp=lambda x,lam=1:lam*exp(-lam*x)*(x>=0)
In [6]: cdf_exp=lambda x,lam=1:(1-exp(-lam*x))*(x>=0)
In [7]: graph(lambda x: pdf_exp(x), np.linspace(-10,10,10000))
In [8]: graph(lambda x: cdf_exp(x), np.linspace(-10,10,10000))
Inefficient Implementation:
In [1]: from numpy import exp
In [2]: from matplotlib import pyplot as plt
In [3]: def graph(funct, x_range):
...: y_range=[]
...: for x in x_range:
...: y_range.append(funct(x))
...: plt.plot(x_range,y_range,'r--')
...: plt.show()
...:
In [4]: import numpy as np
In [5]: pdf_exp=lambda x,lam=1:lam*exp(-lam*x) if x>=0 else 0
In [6]: cdf_exp=lambda x,lam=1:1-exp(-lam*x) if x>=0 else 0
In [7]: graph(lambda x: pdf_exp(x), np.linspace(-10,10,10000))
In [8]: graph(lambda x: cdf_exp(x), np.linspace(-10,10,10000))
Poisson and Exponential, Time Estimates
Let $N_t$ denote the number of arrivals during time period $t$.
Let $X_t$ denote the time it takes for one additional arrival to arrive given that someone just arrived at time $t$.
From these definitions, the following events are equivalent:
$$ X_t>x\iff N_t=N_{t+x} \tag{5.41.1} $$
To show $\implies$: if $X_t>x$, then no one arrived from time $t$ to time $t+x$. Hence $N_t=N_{t+x}$.
To show $\impliedby$: if $N_t=N_{t+x}$, then no one arrived from time $t$ to time $t+x$. Hence $X_t>x$.
Then
$$ \pr{X_t\leq x}=1-\pr{X_t>x}=1-\pr{N_{t+x}-N_t=0} $$
$$ =1-\pr{N_x=0}=1-\frac{(\lambda x)^0}{0!}\e{-\lambda x}=1-\e{-\lambda x}=F_{X_t}(x) $$
where $F_{X_t}$ is the distribution function of the exponential random variable.
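Here is a simulation sketch of this relationship. It generates a Poisson process on $[0,T]$ by placing a Poisson number of points uniformly, then measures the waiting time from a fixed time $t$ until the next arrival; the empirical distribution of that waiting time should match $1-e^{-\lambda x}$. The values of $\lambda$, $T$, and $t$ are arbitrary choices for illustration.

import numpy as np

rng = np.random.default_rng(0)
lam, T, t = 2.0, 50.0, 10.0

waits = []
for _ in range(20_000):
    n = rng.poisson(lam * T)
    arrivals = np.sort(rng.uniform(0, T, size=n))
    later = arrivals[arrivals > t]
    if later.size:                          # ignore the rare run with no arrival after t
        waits.append(later[0] - t)

waits = np.array(waits)
for x in (0.25, 0.5, 1.0):
    # empirical P{X_t <= x} versus 1 - e^{-lam*x}
    print(x, (waits <= x).mean(), 1 - np.exp(-lam * x))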
Example 5c
Let $Y$ denote the amount of time that the other person (either Ms. Jones or Mr. Brown) spends being served. Let $X$ denote the amount of time that Mr. Smith spends being served. Let $t$ denote the amount of time that Mr. Smith spends waiting to be served.
Then we wish to compare $\cp{Y>s+t}{Y>t}$ with $P(X>s)$.
$$ \cp{Y>s+t}{Y>t}=P(Y>s)=e^{-\lambda s}=P(X>s) $$
That is, given that the other person is still in service when Mr. Smith begins, the other person's remaining service time has the same distribution as Mr. Smith's service time. Hence, by symmetry, each has probability $\frac12$ of finishing first.
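A quick simulation sketch of this conclusion, with $\lambda$ and the waiting time $t$ chosen arbitrarily: conditional on the other person still being in service when Mr. Smith starts, Mr. Smith finishes first about half the time.

import numpy as np

# X = Smith's service time, Y = the other person's total service time, both Exp(lam).
# Conditional on Y > t, Smith finishes first when X < Y - t.
rng = np.random.default_rng(0)
lam, t, n = 1.0, 0.7, 1_000_000

X = rng.exponential(1 / lam, n)
Y = rng.exponential(1 / lam, n)
still_in_service = Y > t

print((X[still_in_service] < (Y - t)[still_in_service]).mean())   # close to 0.5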
The Only Memoryless Distribution Is the Exponential Distribution
Let $T$ be a continuous random variable on $[0,\infty)$. Then $T$ is memoryless if, for all $s,t\geq0$, we have:
$$ \cp{T>s+t}{T>t}=P(T>s) \tag{1} $$
By the definition of conditional probability, this is equivalent to
$$ \frac{P\set{T>s+t,T>t}}{P\set{T>t}}=P(T>s) $$
But $\set{T>s+t}$ is a subset of $\set{T>t}$, which implies that $P\set{T>s+t,T>t}=P\set{T>s+t}$. Hence (1) is equivalent to
$$ \frac{P\set{T>s+t}}{P\set{T>t}}=P(T>s) $$
or
$$ P\set{T>s+t}=P(T>s)P\set{T>t} \tag{2} $$
Proof
First recall the PDF and CDF for the exponential distribution. If $X$ is exponentially distributed, then for some $\lambda>0$, the density $f$ of $X$ is
$$ f(x)=\cases{\lambda e^{-\lambda x}&x\geq0\\0&x<0} $$
Then, for $a\geq0$, the cumulative distribution function $F$ of $X$ is
$$ F(a)=P\set{X\leq a}=\int_{0}^{a}\lambda e^{-\lambda x}dx=-e^{-\lambda x}\eval{0}{a}=e^{-\lambda0}-e^{-\lambda a}=1-e^{-\lambda a} $$
Hence $P\set{X>a}=1-\bop1-e^{-\lambda a}\bcp=e^{-\lambda a}$.
Suppose $T$ is memoryless and let $g(x)=P\set{T>x}$. To show that $T$ must be exponentially distributed, it is sufficient to show that $g(x)=e^{-\lambda x}$ for some $\lambda>0$ and for all $x\geq0$.
To this end, first note that the memoryless property of $T$ gives us
$$ \pr{T>2}=\pr{T>1+1}=\pr{T>1}\pr{T>1}=\bop\pr{T>1}\bcp^2 $$
Similarly
$$ \pr{T>3}=\pr{T>1+2}=\pr{T>1}\pr{T>2}=\pr{T>1}\bop\pr{T>1}\bcp^2=\bop\pr{T>1}\bcp^3 $$
For any positive integer $n$, an inductive argument yields
$$ \pr{T>n}=\bop\pr{T>1}\bcp^n $$
Let $\lambda=-\ln\bop\pr{T>1}\bcp$. Multiplying by $-1$ and taking the exponential of both sides, we see that $\pr{T>1}=e^{-\lambda}$ and
$$ g(n)=\pr{T>n}=\bop\pr{T>1}\bcp^n=\bop e^{-\lambda}\bcp^n=e^{-\lambda n} $$
So we have proven the desired result for the positive integers. To prove this for the positive rational numbers, note that
$$ e^{-\lambda}=\pr{T>1}=\prB{T>\frac12+\frac12}=\prB{T>\frac12}\prB{T>\frac12}=\Bop\prB{T>\frac12}\Bcp^2 $$
Hence
$$ e^{-\frac{\lambda}2}=\prB{T>\frac12} $$
Similarly, for any positive integer $m$, we have
$$ e^{-\lambda m}=\pr{T>m}=\prB{T>\frac{m}2+\frac{m}2}=\prB{T>\frac{m}2}\prB{T>\frac{m}2}=\Bop\prB{T>\frac{m}2}\Bcp^2 $$
Hence
$$ e^{-\lambda\frac{m}2}=\prB{T>\frac{m}2} $$
Similarly
$$ e^{-\lambda m}=\pr{T>m}=\prB{T>\frac{m}3+\frac{m}3+\frac{m}3}=\Bop\prB{T>\frac{m}3}\Bcp^3 $$
Hence
$$ e^{-\lambda\frac{m}3}=\prB{T>\frac{m}3} $$
An inductive argument shows that for any positive integer $n$, we have
$$ e^{-\lambda m}=\pr{T>m}=\prB{T>\sum_1^n\frac{m}n}=\Bop\prB{T>\frac{m}n}\Bcp^n $$
Hence
$$ e^{-\lambda\frac{m}n}=\prB{T>\frac{m}n}=g\Bop\frac{m}n\Bcp $$
So the desired result is proven for positive rational numbers, and it remains to prove it for positive irrational numbers. Let $t_n$, $n\geq1$, be a decreasing sequence of rational numbers converging to the irrational number $t>0$. Then
$$ g(t)=\pr{T>t}=\lim_{n\goesto\infty}\pr{T>t_n}=\lim_{n\goesto\infty}e^{-\lambda t_n}=e^{-\lambda t} $$
The second equality uses the right-continuous property of the CDF.
$\wes$
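As an empirical illustration of (1) for the exponential distribution, the following sketch compares the conditional tail $P\set{T>s+t\mid T>t}$ with $P\set{T>s}$ on simulated data; the values of $s$ and $t$ are arbitrary.

import numpy as np

rng = np.random.default_rng(0)
T = rng.exponential(scale=1.0, size=2_000_000)
s, t = 0.8, 1.5

cond = (T > s + t).sum() / (T > t).sum()   # empirical P{T > s+t | T > t}
print(cond, (T > s).mean(), np.exp(-s))    # all three should be close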
Example 5d
Let $X$ denote the number of miles before the battery dies.
$$ 10K=\evw{X}=\frac1{\lambda_b}\iff\lambda_b=\frac1{10K} $$
Then $\frac{X}{1K}$ denotes the number of 1K miles before the battery dies:
$$ 10=\frac{10K}{1K}=\evwB{\frac{X}{1K}}=\frac{1}{1K\lambda_b}=\frac{1}{1K\frac1{10K}}=\frac1{\frac1{10}}=\frac1{\lambda_a}\iff\lambda_a=\frac1{10} $$
We can compute $P(X>5K)$ this way
$$ \bop P(X>5K)\bcp^{2}=P(X>5K+5K) $$
$$ =P(X>10K)=1-F_b(10K)=1-\bop1-e^{-\lambda_b10K}\bcp=e^{-\lambda_b10K}=e^{-1} $$
So that
$$ P(X>5K)=\bop e^{-1}\bcp^{\frac12}=e^{-1\wts\frac12}=e^{-\frac12} $$
Or we can compute it this way
$$ P(X>5K)=1-F_b(5K)=1-\bop1-e^{-\lambda_b5K}\bcp=e^{-\lambda_b5K}=e^{-\frac1{10K}5K}=e^{-\frac12} $$
Or we can compute it this way
$$ P(X>5)=1-F_a(5)=1-\bop1-e^{-\lambda_a5}\bcp=e^{-\lambda_a5}=e^{-\frac1{10}5}=e^{-\frac12} $$
The Density Closely Approximates the Probability in a Small Neighborhood
Suppose $X$ is exponentially distributed with density $f$, distribution $F$, and $\lambda>0$.
The equations at the top of p.213 assume that, for a fixed $t>0$ and any small $dt>0$, we have
$$ f(t)dt\approx\pr{X\in(t,t+dt)} $$
This is easy to picture in the $x$-$y$ plane and is equivalent to
$$ f(t)dt-\int_{t}^{t+dt}f(x)dx\approx0 $$
Here is a helpful discussion.
Rigorously we want to show that
$$ \lim_{dt\downarrow0}\Bop f(t)dt-\int_{t}^{t+dt}f(x)dx\Bcp=0 $$
This is a simple result from the definition of the Riemann integral. But I haven’t done an $\epsilon$-$\delta$ proof in a long time. So let $\epsilon>0$. We must find $\delta>0$ such that $dt<\delta$ implies
$$ \normB{f(t)dt-\int_{t}^{t+dt}f(x)dx}<\epsilon $$
We have
$$ \normB{f(t)dt-\int_{t}^{t+dt}f(x)dx}=\normB{\lambda\e{-\lambda t}dt-\Bop-\e{-\lambda x}\eval{t}{t+dt}\Bcp} $$
$$ =\normB{\lambda\e{-\lambda t}dt+\e{-\lambda x}\eval{t}{t+dt}} $$
$$ =\normB{\lambda\e{-\lambda t}dt+\e{-\lambda t-\lambda dt}-\e{-\lambda t}} $$
$$ =\normB{\lambda\e{-\lambda t}dt+\e{-\lambda t}\e{-\lambda dt}-\e{-\lambda t}} $$
$$ =\normB{\e{-\lambda t}\bop\lambda dt+\e{-\lambda dt}-1\bcp} $$
$$ =\e{-\lambda t}\normb{\lambda dt+\e{-\lambda dt}-1} $$
$$ \leq\e{-\lambda t}\bop\norm{\lambda dt}+\norm{\e{-\lambda dt}-1}\bcp \tag{Triangle Inequality} $$
$$ =\e{-\lambda t}\bop\lambda dt+1-\e{-\lambda dt}\bcp \tag{$\lambda dt>0\implies\e{-\lambda dt}<1$} $$
$$ =\e{-\lambda t}\lambda dt+\e{-\lambda t}\bop1-\e{-\lambda dt}\bcp $$
Let’s look at the first term. We want to find $\delta_1>0$ such that $dt<\delta_1$ implies
$$ \e{-\lambda t}\lambda dt<\frac\epsilon2 $$
or
$$ dt<\frac\epsilon{2\e{-\lambda t}\lambda} $$
Set $\delta_1=\frac\epsilon{2\e{-\lambda t}\lambda}$ and we’re halfway there. Now let’s look at the second term. We want to find $\delta_2>0$ such that $dt<\delta_2$ implies
$$ \e{-\lambda t}\bop1-\e{-\lambda dt}\bcp<\frac\epsilon2 $$
$\iff$
$$ 1-\e{-\lambda dt}<\frac\epsilon{2\e{-\lambda t}} $$
$\iff$
$$ -\e{-\lambda dt}<\frac\epsilon{2\e{-\lambda t}}-1 $$
$\iff$
$$ \e{-\lambda dt}>1-\frac\epsilon{2\e{-\lambda t}} $$
$\iff$
$$ -\lambda dt>\ln\Bop1-\frac\epsilon{2\e{-\lambda t}}\Bcp $$
$\iff$
$$ dt<\frac{\ln\Bop1-\frac\epsilon{2\e{-\lambda t}}\Bcp}{-\lambda} $$
We note that $\frac\epsilon{2\e{-\lambda t}}>0$. If $\frac\epsilon{2\e{-\lambda t}}\geq1$, the desired inequality holds for every $dt>0$ (since $1-\e{-\lambda dt}<1$), so any $\delta_2>0$ works. Otherwise $0<1-\frac\epsilon{2\e{-\lambda t}}<1$, so the numerator is negative and the right-hand side is positive. Set $\delta_2$ equal to the right-hand side and set $\delta=\min(\delta_1,\delta_2)$. Then $dt<\delta$ implies
$$ \normB{f(t)dt-\int_{t}^{t+dt}f(x)dx}\leq\e{-\lambda t}\lambda dt+\e{-\lambda t}\bop1-\e{-\lambda dt}\bcp<\frac\epsilon2+\frac\epsilon2=\epsilon $$
$\wes$
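Here is a small numerical sketch of the approximation itself: for $X$ exponential, it compares $f(t)\,dt$ with $P\set{X\in(t,t+dt)}=F(t+dt)-F(t)$ for shrinking $dt$ (the values of $\lambda$ and $t$ are arbitrary).

import numpy as np

lam, t = 1.0, 2.0
f = lambda x: lam * np.exp(-lam * x)        # exponential density
F = lambda x: 1 - np.exp(-lam * x)          # exponential CDF

for dt in (0.1, 0.01, 0.001):
    approx = f(t) * dt
    exact = F(t + dt) - F(t)
    print(dt, approx, exact, abs(approx - exact))   # the gap shrinks much faster than dt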
5.5.1 Hazard Rate Function for the Exponential Random Variable
$$ \cp{X\in(t,t+dt)}{X>t}=\frac{\pr{X\in(t,t+dt)\cap X>t}}{\pr{X>t}} $$
But $\set{X\in(t,t+dt)}\subset\set{X>t}$: Suppose $x_0\in\set{X\in(t,t+dt)}$. Then $x_0>t$ so that $x_0\in\set{X>t}$. Hence
$$ \set{X\in(t,t+dt)}\cap\set{X>t}=\set{X\in(t,t+dt)} $$
And
$$ \pr{X\in(t,t+dt)\cap X>t}=\pr{X\in(t,t+dt)} $$
And
$$ \cp{X\in(t,t+dt)}{X>t}=\frac{\pr{X\in(t,t+dt)\cap X>t}}{\pr{X>t}} $$
$$ =\frac{\pr{X\in(t,t+dt)}}{\pr{X>t}} $$
$$ =\frac{1-\pr{X\notin(t,t+dt)}}{\pr{X>t}} $$
$$ =\frac{1-\pr{X<t\cup X>t+dt}}{\pr{X>t}} $$
$$ =\frac{1-\bop\pr{X<t}+\pr{X>t+dt}\bcp}{\pr{X>t}} $$
$$ =\frac{1-\pr{X<t}-\pr{X>t+dt}}{\pr{X>t}} $$
$$ =\frac{\pr{X>t}-\pr{X>t+dt}}{\pr{X>t}} $$
$$ =1-\frac{\pr{X>t+dt}}{\pr{X>t}} $$
$$ =1-\frac{\pr{X>t}\pr{X>dt}}{\pr{X>t}} \tag{Memoryless} $$
$$ =1-\pr{X>dt} $$
$$ =\pr{X<dt} $$
In words, the memoryless property implies that the distribution of remaining life for a $t$-year-old item is the same as that for a new item. But we also have
$$ f(t)dt\approx\pr{X\in(t,t+dt)} $$
Hence
$$ \pr{X<dt}=\cp{X\in(t,t+dt)}{X>t}=\frac{\pr{X\in(t,t+dt)}}{\pr{X>t}}\approx\frac{f(t)}{\overline{F}(t)}dt=\lambda(t)dt $$
Since this holds for all $t>0$, $\lambda(t)$ must be constant. Indeed
$$ \lambda(t)=\frac{f(t)}{\overline{F}(t)}=\frac{\lambda\e{-\lambda t}}{\e{-\lambda t}}=\lambda $$
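A tiny sketch confirming that the exponential hazard rate is constant (the value of $\lambda$ is arbitrary):

import numpy as np

lam = 1.5
f = lambda t: lam * np.exp(-lam * t)        # density
Fbar = lambda t: np.exp(-lam * t)           # survival function 1 - F(t)

for t in (0.1, 1.0, 5.0):
    print(t, f(t) / Fbar(t))                # always equal to lam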
5.5.1 Hazard Rate Function Integration
$$ \int_0^t\frac{F'(x)}{1-F(x)}dx $$
Let $u=1-F(x)$, $du=-F'(x)dx$ to get
$$ \int\frac{F'(x)}{1-F(x)}dx=\int\frac{-du}{u}=-\ln(u)+k=-\ln\bop1-F(x)\bcp+k $$
$$ \int_0^t\frac{F'(x)}{1-F(x)}dx=-\ln\bop1-F(x)\bcp\eval{0}{t}=\ln\bop1-F(x)\bcp\eval{t}{0} $$
$$ =\ln\bop1-F(0)\bcp-\ln\bop1-F(t)\bcp $$
$$ =\ln\bop1-0\bcp-\ln\bop1-F(t)\bcp \tag{since $X>0$} $$
$$ =-\ln\bop1-F(t)\bcp $$
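As a numerical check of this identity with a non-constant hazard rate, the sketch below uses $F(x)=1-e^{-x^2}$ (a choice not from the text), for which $F'(x)/(1-F(x))=2x$ and both sides equal $t^2$.

import numpy as np
from scipy import integrate

F = lambda x: 1 - np.exp(-x ** 2)
hazard = lambda x: 2 * x                    # F'(x) / (1 - F(x))

for t in (0.5, 1.0, 2.0):
    lhs, _ = integrate.quad(hazard, 0, t)
    print(t, lhs, -np.log(1 - F(t)))        # both equal t**2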
Example 6a Expectation and Variance of Gamma Random Variable
$$ \evw{X}=\frac1{\GammaF{\alpha}}\int_0^\infty x\lambda\e{-\lambda x}(\lambda x)^{\alpha-1}dx $$
$$ =\frac1{\GammaF{\alpha}}\int_0^\infty\frac{\lambda x}\lambda\lambda\e{-\lambda x}(\lambda x)^{\alpha-1}dx $$
$$ =\frac1{\lambda\GammaF{\alpha}}\int_0^\infty\lambda\e{-\lambda x}(\lambda x)^{\alpha}dx \tag{5.6a.1} $$
We want to show that
$$ \int_0^\infty\lambda\e{-\lambda x}(\lambda x)^{\alpha}dx=\GammaF{\alpha+1} $$
The substitution $y=\lambda x$ gives us
$$ x=\frac{y}\lambda\dq dx=\frac{dy}{\lambda}\dq y_0=\lambda x_0=0\dq y_1=\lambda x_1=\infty $$
$$ \int_0^\infty\lambda\e{-\lambda x}(\lambda x)^{\alpha}dx=\int_0^\infty\lambda\e{-y}y^\alpha\frac{dy}\lambda=\int_0^\infty\e{-y}y^{\alpha}dy=\GammaF{\alpha+1} $$
Hence 5.6a.1 becomes
$$ \evw{X}=\frac{\GammaF{\alpha+1}}{\lambda\GammaF{\alpha}}=\frac\alpha\lambda $$
where the last equation follows from equation 6.1 in the book.
Next we want to find $\varw{X}$:
$$ \evwb{X^2}=\frac1{\GammaF{\alpha}}\int_0^\infty x^2\lambda\e{-\lambda x}(\lambda x)^{\alpha-1}dx $$
The substitution $y=\lambda x$ gives us
$$ x=\frac{y}\lambda\dq dx=\frac{dy}{\lambda}\dq y_0=\lambda x_0=0\dq y_1=\lambda x_1=\infty $$
$$ \evwb{X^2}=\frac1{\GammaF{\alpha}}\int_0^\infty\Prn{\frac{y}{\lambda}}^2\lambda\e{-y}y^{\alpha-1}\frac{dy}\lambda $$
$$ =\frac1{\GammaF{\alpha}}\int_0^\infty\Prn{\frac{y}{\lambda}}^2\e{-y}y^{\alpha-1}dy $$
$$ =\frac1{\lambda^2\GammaF{\alpha}}\int_0^\infty y^2\e{-y}y^{\alpha-1}dy $$
$$ =\frac1{\lambda^2\GammaF{\alpha}}\int_0^\infty\e{-y}y^{\alpha+1}dy $$
$$ =\frac{\GammaF{\alpha+2}}{\lambda^2\GammaF{\alpha}}=\frac{(\alpha+1)\GammaF{\alpha+1}}{\lambda^2\GammaF{\alpha}}=\frac{(\alpha+1)\alpha\GammaF{\alpha}}{\lambda^2\GammaF{\alpha}} $$
$$ =\frac{\alpha(\alpha+1)}{\lambda^2} $$
And
$$ \varw{X}=\frac{\alpha(\alpha+1)}{\lambda^2}-\Prn{\frac\alpha\lambda}^2=\frac{\alpha(\alpha+1)-\alpha^2}{\lambda^2}=\frac{\alpha^2+\alpha-\alpha^2}{\lambda^2}=\frac\alpha{\lambda^2} $$
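A Monte Carlo check of both formulas (the values of $\alpha$ and $\lambda$ are arbitrary; note that numpy parameterizes the gamma distribution by shape $\alpha$ and scale $1/\lambda$):

import numpy as np

rng = np.random.default_rng(0)
alpha, lam = 3.0, 2.0
x = rng.gamma(shape=alpha, scale=1 / lam, size=1_000_000)

print(x.mean(), alpha / lam)                # E[X] = alpha / lam
print(x.var(), alpha / lam ** 2)            # Var(X) = alpha / lam^2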
Example 7b
$$ \cdfa{y}{Y}=\cdfa{\sqrt{y}}{X}-\cdfa{-\sqrt{y}}{X} $$
Then the density is
$$ \pdfa{y}{Y}=\wderiv{\cdfu{Y}}{y}=\wderiv{\cdfu{X}}{\sqrt{y}}\wderiv{\sqrt{y}}{y}-\wderiv{\cdfu{X}}{(-\sqrt{y})}\wderiv{(-\sqrt{y})}{y} $$
$$ =\pdfa{\sqrt{y}}{X}\frac12y^{-\frac12}-\pdfa{-\sqrt{y}}{X}\prn{-\frac12y^{-\frac12}} $$
$$ =\pdfa{\sqrt{y}}{X}\frac1{2\sqrt{y}}+\pdfa{-\sqrt{y}}{X}\frac1{2\sqrt{y}} $$
Succinctly, we have
$$ \pdfa{y}{Y}=\frac1{2\sqrt{y}}\prn{\pdfa{\sqrt{y}}{X}+\pdfa{-\sqrt{y}}{X}} \tag{5.7b.1} $$
Now let’s try applying Theorem 7.1 to this problem and check that it agrees with 5.7b.1. Further suppose that $X$ is nonnegative. Define $g(x)\equiv x^2$. Since $X$ is nonnegative, $g(\wt)$ is strictly increasing on the range of $X$ and hence is invertible there:
$$ \inv{g}(y)=\sqrt{y}=y^{\frac12} $$
And the derivative of the inverse is
$$ \wdervb{y}{\inv{g}(y)}=\frac12y^{-\frac12}=\frac1{2\sqrt{y}}>0 $$
And
$$ \normB{\wdervb{y}{\inv{g}(y)}}=\wdervb{y}{\inv{g}(y)} $$
Also note that for every $y\geq0$, there exists $x\geq0$ such that $y=g(x)$. Namely, set $x=\sqrt{y}$. Then $g(x)=g\prn{\sqrt{y}}=\prn{\sqrt{y}}^2=y$. Hence for $y\geq0$ we have
$$ \pdfa{y}{Y}=\pdfab{\inv{g}(y)}{X}\normB{\wdervb{y}{\inv{g}(y)}}=\pdfa{\sqrt{y}}{X}\frac1{2\sqrt{y}} $$
Also note that since $X$ is nonnegative, we have $\pdfa{-\sqrt{y}}{X}=0$. Hence
$$ \pdfa{y}{Y}=\frac1{2\sqrt{y}}\pdfa{\sqrt{y}}{X}=\frac1{2\sqrt{y}}\prn{\pdfa{\sqrt{y}}{X}+0}=\frac1{2\sqrt{y}}\prn{\pdfa{\sqrt{y}}{X}+\pdfa{-\sqrt{y}}{X}} $$
This agrees with 5.7b.1 above.
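Here is a simulation sketch of (5.7b.1) for a case where $X$ is not nonnegative, taking $X$ standard normal (an arbitrary choice): it compares the formula with an empirical estimate of the density of $Y=X^2$.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.standard_normal(2_000_000)
y_samples = x ** 2

for y in (0.5, 1.0, 2.0):
    dy = 0.01
    formula = (stats.norm.pdf(np.sqrt(y)) + stats.norm.pdf(-np.sqrt(y))) / (2 * np.sqrt(y))
    empirical = ((y_samples > y) & (y_samples < y + dy)).mean() / dy
    print(y, formula, empirical)            # formula and empirical estimate should agree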
Theorem 7.1
I have some intuition on the statement “When $y\neq g(x)$ for any $x$, then $\cdfa{y}{Y}$ is either $0$ or $1$, and in either case $\pdfa{y}{Y}=0$”. But I’d like to see an example.
We’ll use the exponential distribution:
$$ \pdf{x}=\cases{\lambda\e{-\lambda x}&x\geq0\\0&x<0} $$
And $Y=X^2$ is increasing on the range of $X$, which is $[0,\infty)$. Let’s look at $Y=-7$. There’s no $X$ value such that $-7=X^2$. We know from example 7b above that
$$ \pdfa{y}{Y}=\cases{\frac1{2\sqrt{y}}\pdf{\sqrt{y}}&y\geq0\\0&y<0} $$
Or
$$ \pdfa{y}{Y}=\cases{\frac{\lambda\e{-\lambda\sqrt{y}}}{2\sqrt{y}}&y\geq0\\0&y<0} $$
Hence $\pdfa{-7}{Y}=0$ is verified for this example.