Probability Ross Chapter 4 Notes

01 Apr 2018

Example 1c

Let $X$ denote the number of flips made when a coin with heads probability $p$ is flipped until either a head occurs or a total of $n$ flips is made.

$$ \pr{\underbrace{T,T,T,...,T}_{n-1}T}=(1-p)^{n} $$

$$ \pr{\underbrace{T,T,T,...,T}_{n-1}H}=(1-p)^{n-1}p $$

$$ \pr{X=n}=\pr{\underbrace{T,T,T,...,T}_{n-1}T}+\pr{\underbrace{T,T,T,...,T}_{n-1}H} $$

$$ =(1-p)^{n}+(1-p)^{n-1}p=(1-p)^{n-1}\bop(1-p)+p\bcp=(1-p)^{n-1} $$
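
A quick sanity check in Python (my own sketch; for $k<n$ the pmf is $\pr{X=k}=(1-p)^{k-1}p$, the first head arriving on flip $k$):

# pmf of X: first head on flip k (for k < n), or all of the first n-1 flips tails (k = n).
def x_pmf(k,n=10,p=0.3):
    if k < n:
        return (1-p)**(k-1)*p    # first head on flip k
    return (1-p)**(n-1)          # the first n-1 flips were all tails

print(sum(x_pmf(k) for k in range(1,11)))   # prints 1.0, up to floating-point error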

Example 1e

$$ 1-\frac1N=\frac{N}N-\frac1N=\frac{N-1}{N} $$

$$ 1-\bop\frac1N+\frac1N\bcp=\frac{N}N-\frac2N=\frac{N-2}{N} $$

Let $A=\set{T=n}$ and let $B=\set{T>n}$. We want to show that $AB=\varnothing$:

Let $a\in A$. Then $a$ is an outcome with exactly $n$ coupon collections. That is, $a$ is a sequence $a_1,a_2,a_3,\dots,a_n$ where $a_i$ denotes the receipt of a single coupon. Since $a$ consists of exactly $n$ collections, it is not an outcome with more than $n$ collections, so $a\notin B$. Hence no outcome lies in both $A$ and $B$, and $AB=\varnothing$.


  1. $\bset{X=1}=\bset{(aaB),(aBa),(Baa)}$
  2. $\bset{X=2}=\bset{(aBB),(BaB),(BBa)}$

Code up the binomial cumulative distribution function:

from sympy import binomial as binom
import numpy as np

def binom_cdf(n=10,i=5,p=0.5):
  # start with the pmf at i: C(n,i) * p^i * (1-p)^(n-i)
  bpmf=binom(n,i)*np.power(p,i)*np.power(1-p,n-i)
  bcdf=bpmf
  # step the pmf downward using pmf(j-1) = pmf(j) * ((1-p)/p) * (j/(n-j+1))
  for j in reversed(range(1,i+1)):
    bpmf=bpmf*((1-p)/p)*(j/(n-j+1))
    bcdf+=bpmf
  return bcdf
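
Assuming scipy is installed, we can sanity-check the recursion against scipy.stats.binom.cdf:

from scipy.stats import binom as sp_binom

print(float(binom_cdf(n=10,i=5,p=0.5)))   # recursion above
print(sp_binom.cdf(5,10,0.5))             # scipy's reference value, ~0.6230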

Example 6d

  1. $dd,dd,dd,rr$
  2. $dd,dd,rr,dd$
  3. $dd,rr,dd,dd$
  4. $rr,dd,dd,dd$

Poisson Random Variable Approximates Binomial RV

Suppose that $X$ is a binomial random variable with parameters $(n, p)$, and let $\lambda = np$. Then

$$ P\set{X=k}=\binom{n}{k}p^k(1-p)^{n-k}=\frac{n!}{(n-k)!k!}\Bop\frac{\lambda}{n}\Bcp^k\Bop1-\frac{\lambda}{n}\Bcp^{n-k} $$

$$ =\underbrace{\Bop\frac{n}{n}\Bcp\Bop\frac{n-1}{n}\Bcp\Bop\frac{n-2}{n}\Bcp\dots\Bop\frac{n-k+1}{n}\Bcp}_\phi\underbrace{\Bop\frac{\lambda^k}{k!}\Bcp}_\psi\underbrace{\Bop1-\frac{\lambda}{n}\Bcp^{n}}_\beta\underbrace{\Bop1-\frac{\lambda}{n}\Bcp^{-k}}_\gamma $$

We note that

  • $\lim_{n\goesto\infty}\phi=1$
  • $\lim_{n\goesto\infty}\psi=\psi$
  • $\lim_{n\goesto\infty}\beta=e^{-\lambda}$
  • $\lim_{n\goesto\infty}\gamma=1$

so that

$$ \lim_{n\goesto\infty}P\set{X=k}=\lim_{n\goesto\infty}\phi\psi\beta\gamma $$

$$ =\lim_{n\goesto\infty}\phi\lim_{n\goesto\infty}\psi\lim_{n\goesto\infty}\beta\lim_{n\goesto\infty}\gamma $$

$$ =e^{-\lambda}\psi=e^{-\lambda}\frac{\lambda^k}{k!} $$

$\wes$
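
A quick numerical illustration of the approximation (my own sketch; the parameter values are arbitrary, with $\lambda=np$):

# Binomial pmf vs its Poisson approximation for large n and small p.
from math import comb, exp, factorial

n, p, k = 1000, 0.003, 5
lam = n*p
print(comb(n,k)*p**k*(1-p)**(n-k))     # exact binomial probability
print(exp(-lam)*lam**k/factorial(k))   # Poisson approximation with lambda = n*p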

Exponential Approximation

There are a couple of ways to approximate the exponential function in Python. First we need a factorial function:

from numpy import exp,prod,sum,power as pw

def wf(w):
  # factorial of w, computed as the product 1*2*...*w
  if w==0: return 1
  wst=[i for i in range(1,w+1)]
  return prod(wst)

And then the Taylor-series approximation itself:

def wexp(x=-1,n=10):
  # partial Taylor series: sum of x^i / i! for i = 0, ..., n-1
  wst=[pw(x,i)/wf(i) for i in range(0,n)]
  return sum(wst)

And here is another way:

def wexp2(x=-1,n=100):
  # limit definition: (1 + x/n)^n
  return pw(1+x/n,n)

But the first approach converges much faster: even with a default of n=100 in the second approach and a default of n=10 in the first, the first approximation is still much closer:

In [16]: wexp(2)
Out[16]: 7.3887125220458545

In [17]: exp(2)
Out[17]: 7.3890560989306504

In [18]: wexp2(2)
Out[18]: 7.2446461182523478

Match Hat

We can estimate by simulation the probability of at least one hat match:

from random import shuffle as shf

def match_hat(n=10,trials=100):
    at_least_one_cnt=0
    for k in range(0,trials):
        hats=[i for i in range(1,n+1)]   # hat i belongs to person i
        shf(hats)                        # hand the hats back in random order
        # count the positions where person i gets back hat i
        alo=sum([1 if i==j else 0 for i,j in enumerate(hats, 1)])
        at_least_one_cnt+=1 if alo > 0 else 0
    return at_least_one_cnt/trials       # fraction of trials with at least one match

Example 5m from chapter 2 (p41) tells us that the probability of no matching hats is $e^{-1}\approx0.36788$ for large n. This means the probability of at least one matching hat is $1-e^{-1}\approx0.63212$. And we can see that this matches well with the results from this function:

In [173]: match_hat(n=100000)
Out[173]: 0.6

In [174]: match_hat(n=100000)
Out[174]: 0.67

In [175]: match_hat(n=100000)
Out[175]: 0.71

In [176]: match_hat(n=100000)
Out[176]: 0.63

In [177]: match_hat(n=100000)
Out[177]: 0.67

In [160]: match_hat()
Out[160]: 0.64

In [161]: match_hat()
Out[161]: 0.66

In [162]: match_hat()
Out[162]: 0.55

In [163]: match_hat()
Out[163]: 0.54

In [164]: match_hat()
Out[164]: 0.67

In [165]: match_hat()
Out[165]: 0.54

In [166]: match_hat()
Out[166]: 0.6

In [167]: match_hat()
Out[167]: 0.73

In [168]: match_hat()
Out[168]: 0.64

In [169]: match_hat()
Out[169]: 0.65

$$ 1-\frac{365\times364}{365^2}=\frac{365^2-365\times364}{365^2}=\frac{365(365-364)}{365^2}=\frac{365}{365^2}=\frac1{365} $$

$$ 1-\frac{365\times364\times363}{365^3}=\frac{365^3-365\times364\times363}{365^3}=\frac{365(365^2-364\times363)}{365^3} $$

$$ =\frac{365^2-364\times363}{365^2}=\frac{365(365-\frac{364}{365}\times363)}{365^2}=\frac{365-\frac{364}{365}\times363}{365} $$
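
Checking both simplifications numerically (my sketch):

print(1 - 365*364/365**2, 1/365)                              # both ~0.0027397
print(1 - 365*364*363/365**3, (365 - (364/365)*363)/365)      # both ~0.0082042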

  1. ab
  2. ba
  3. ac
  4. ca
  5. ad
  6. da
  7. bc
  8. cb
  9. bd
  10. db
  11. cd
  12. dc

3 Share Same B-day

Let’s count. For three people, J, T, and Q, the number of possible birthday combinations is $365^3$. There are $365\wt364\wt363$ permutations in which all three birthdays differ, so that no one shares a birthday. There are $365$ permutations in which all three share the same birthday. Lastly, we want to count the permutations in which exactly two of the three share the same birthday. Suppose J and T share a birthday. There are $365$ ways this can occur, and in each case there are $364$ ways for Q to have a different birthday. Hence there are $365\wt364$ permutations in which J and T share a birthday and Q has a different one. Similarly there are $365\wt364$ permutations in which J and Q share a birthday and T has a different one, and likewise for T and Q. In total, there are $\binom{3}{2}\wt365\wt364$ ways for exactly two of them to share a birthday.

So

$$ \prst{3 different birthdays}=\frac{365\wt364\wt363}{365^3}=\frac{364\wt363}{365^2} $$

And

$$ \prst{all 3 share same birthday}=\frac{365}{365^3}=\frac{1}{365^2} $$

And

$$ \prst{exactly 2 share same birthday}=\frac{\binom32\wt365\wt364}{365^3}=\frac{\binom32\wt364}{365^2} $$

The probabilities of these three mutually exclusive and exhaustive events should sum to 1:

$$ \frac{364\wt363}{365^2}+\frac{1}{365^2}+\frac{\binom32\wt364}{365^2} $$

In [56]: 364*363/pw(365,2)+1/pw(365,2)+binom(3,2)*364/pw(365,2)
Out[56]: 1.00000000000000

Holy crap! I got an email response from Professor Ross!!!

I sent him this:

I am reading your book A First Course in Probability, 8th Edition. Thank you for writing this book. It is really enjoyable and helpful!

I hope you don’t mind if I ask a question: In Example 7d, Length of the longest run, pages 148-149, you defined N to be the number of the events Ei that occur and then you computed the mean of the Poisson distribution by setting the expected value equal to the sum of the probabilities for each of the Ei.

I am confused by this: shouldn’t the expected value be the sum of the values times the probabilities, rather than the sum of just the probabilities?

And Professor Ross responded with:

$$ E[N] = \sum_i i P(N=i) $$

but $P(N = i)$ is not $p_i$. If $X_i = 1$ if event $E_i$ occurs, then $N = \sum_i X_i$ and

$$ E[N] = \sum_i E[X_i] = \sum_i p_i $$

where $p_i=E[X_i]$. Since $X_i$ is an indicator variable and the expected value of an indicator is the probability that $E_i$ occurs, we get

$$ E[N] = \sum_i E[X_i] = \sum_i P(E_i) $$

L_n

$p=\frac12$:

$$ -(n-k)p^k(1-p)-p^k=-(n-k)\Bop\frac12\Bcp^k\frac12-\Bop\frac12\Bcp^k $$

$$ =\frac{-(n-k)}{2^{k+1}}-\frac{1}{2^k}=-\frac{n-k}{2^{k+1}}-\frac{2}{2^{k+1}} $$

$$ =-\frac{n-k+2}{2^{k+1}} $$

Inclusion-Exclusion Identity

$$ P\Bop\bigcup_{i=1}^{5}E_i\Bcp=\sum_{i=1}^{5}P(E_i)-\sum_{i_1<i_2}P(E_{i_1}E_{i_2})+\sum_{i_1<i_2<i_3}P(E_{i_1}E_{i_2}E_{i_3}) $$

$$ -\sum_{i_1<i_2<i_3<i_4}P(E_{i_1}E_{i_2}E_{i_3}E_{i_4})+P(E_{i_1}E_{i_2}E_{i_3}E_{i_4}E_{i_5}) $$

$$ =P(E_1)+P(E_2)+P(E_3)+P(E_4)+P(E_5) $$

$$ -\Bosb P(E_1E_2)+P(E_1E_3)+P(E_1E_4)+P(E_1E_5)+P(E_2E_3) $$

$$ +P(E_2E_4)+P(E_2E_5)+P(E_3E_4)+P(E_3E_5)+P(E_4E_5)\Bcsb $$

$$ +\Bosb P(E_1E_2E_3)+P(E_1E_2E_4)+P(E_1E_2E_5)+P(E_1E_3E_4)+P(E_1E_3E_5) $$

$$ +P(E_1E_4E_5)+P(E_2E_3E_4)+P(E_2E_3E_5)+P(E_2E_4E_5)+P(E_3E_4E_5)\Bcsb $$

$$ -\Bosb P(E_1E_2E_3E_4)+P(E_1E_2E_3E_5)+P(E_1E_2E_4E_5)+P(E_1E_3E_4E_5)+P(E_2E_3E_4E_5)\Bcsb $$

$$ +P(E_{1}E_{2}E_{3}E_{4}E_{5}) $$

$$ =\sum_{r=1}^{5}(-1)^{r+1}\sum_{i_1<\dots<i_r}P(E_{i_1}\dots E_{i_r}) $$
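
Here is a small brute-force check of the identity (my own sketch; the sample space and the five events are made up for illustration):

# Verify inclusion-exclusion for 5 events on a finite, equally likely sample space.
from itertools import combinations
import random

random.seed(0)
omega = range(30)                                          # 30 equally likely outcomes
E = [set(random.sample(omega, 8)) for _ in range(5)]       # five arbitrary events

lhs = len(set().union(*E)) / len(omega)                    # P(E1 u ... u E5) directly

rhs = 0.0
for r in range(1, 6):
    for idx in combinations(range(5), r):
        inter = set.intersection(*(E[i] for i in idx))     # E_{i1}...E_{ir}
        rhs += (-1)**(r+1) * len(inter) / len(omega)

print(lhs, rhs)   # the two values agree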

Permutation Count

  • $n=20$
  • $r=3$
  • $k=4$
  • number of $b$’s is $n-r(k+1)=n-rk-r=5$
  • number of $a$’s is $r=3$

The number of permutations of $a$’s and $b$’s (not counting the permutations within the $a$’s and $b$’s of course) is

$$ \frac{(5+3)!}{5!3!}=\frac{8!}{5!3!}=\binom83=\binom{n-rk}r $$

Example 8a

Claim:

$$ \sum_{n=k}^{\infty}\Bop\frac{N}{M+N}\Bcp^{n-1}=\frac{\Bop\frac{N}{M+N}\Bcp^{k-1}}{1-\frac{N}{M+N}} $$

The infinite sum is just the difference between the geometric series and the finite series:

$$ \sum_{n=k}^{\infty}\Bop\frac{N}{M+N}\Bcp^{n-1}=\sum_{n=1}^{\infty}\Bop\frac{N}{M+N}\Bcp^{n-1}-\sum_{n=1}^{k-1}\Bop\frac{N}{M+N}\Bcp^{n-1} $$

Let $j=n-1\iff n=j+1$ in both sums:

$$ =\sum_{j+1=1}^{\infty}\Bop\frac{N}{M+N}\Bcp^{j}-\sum_{j+1=1}^{k-1}\Bop\frac{N}{M+N}\Bcp^{j} $$

$$ =\sum_{j=0}^{\infty}\Bop\frac{N}{M+N}\Bcp^{j}-\sum_{j=0}^{k-2}\Bop\frac{N}{M+N}\Bcp^{j} $$

$$ =\frac1{1-\frac{N}{M+N}}-\frac{1-\Bop\frac{N}{M+N}\Bcp^{k-1}}{1-\frac{N}{M+N}} $$

$$ =\frac{1-\Bocb1-\Bop\frac{N}{M+N}\Bcp^{k-1}\Bccb}{1-\frac{N}{M+N}} $$

$$ =\frac{1-1+\Bop\frac{N}{M+N}\Bcp^{k-1}}{1-\frac{N}{M+N}} $$

$$ =\frac{\Bop\frac{N}{M+N}\Bcp^{k-1}}{1-\frac{N}{M+N}} $$
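
A quick numeric check of the claim (my sketch; $N$, $M$, and $k$ are arbitrary, and the infinite sum is truncated):

N, M, k = 3, 7, 4
x = N/(M + N)
lhs = sum(x**(n-1) for n in range(k, 2000))   # truncated tail of the series
rhs = x**(k-1)/(1 - x)
print(lhs, rhs)                               # agree to machine precision since x < 1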

Example 8a Part 2

The text reads “Of course, part (b) could have been obtained directly, since the probability that at least $i$ trials are necessary to obtain a success is equal to the probability that the first $i−1$ trials are all failures. That is, for a geometric random variable,”

$$ \pr{X\geq i}=(1-p)^{i-1} $$

Since $\pr{X=i}=(1-p)^{i-1}p$, then $\pr{X\geq i}=\sum_{j=i-1}^\infty(1-p)^jp$. If the previous equation is valid, then this implies that

$$ (1-p)^{i-1}=\sum_{j=i-1}^\infty(1-p)^jp \tag{ex.8a.2.1} $$

Let $A$ denote the event that at least $i$ trials are necessary to obtain a success. Let $B$ denote the event that the first $i−1$ trials are all failures. It is not immediately clear to me that $A=B$. Nor is equation ex.8a.2.1. Let’s prove both.

Suppose $A$ occurs. Since at least $i$ trials are necessary to obtain a success, the first $i-1$ trials must all be failures, so $B$ occurs. In the other direction, suppose $B$ occurs. Since the first $i-1$ trials are all failures, none of them is a success, so at least $i$ trials are necessary to obtain a success. This implies the occurrence of $A$.

To prove ex.8a.2.1, we have

$$ \sum_{j=i-1}^\infty(1-p)^jp=(1-p)^{i-1}p\sum_{j=i-1}^\infty\frac{(1-p)^j}{(1-p)^{i-1}}=(1-p)^{i-1}p\sum_{j=i-1}^\infty(1-p)^{j-(i-1)} $$

$$ =(1-p)^{i-1}p\sum_{k=0}^\infty(1-p)^k=(1-p)^{i-1}p\frac1{1-(1-p)}=(1-p)^{i-1}p\frac1p=(1-p)^{i-1} $$
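
And a numeric check of ex.8a.2.1 (my sketch; the infinite tail is truncated):

p, i = 0.25, 6
tail = sum((1-p)**j*p for j in range(i-1, 2000))   # right-hand side of ex.8a.2.1, truncated
print(tail, (1-p)**(i-1))                          # both ~0.2373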

Example 8d Problem of the Points

From p.86, we get that the probability of $n$ successes before $m$ failures is

$$ P_{n,m}=\sum_{k=n}^{m+n-1}\binom{m+n-1}{k}p^k(1-p)^{m+n-1-k} $$

In the parlance of our times, the probability of $r$ successes before $m$ failures is

$$ P_{r,m}=\sum_{n=r}^{m+r-1}\binom{m+r-1}{n}p^n(1-p)^{m+r-1-n} $$

The solution from p.158 is

$$ P_{r,m}=\sum_{n=r}^{m+r-1}\binom{n-1}{r-1}p^r(1-p)^{n-r} $$
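
The two expressions should agree; a quick check (my sketch, with arbitrary $r$, $m$, $p$):

from math import comb

def prm_binomial(r, m, p):
    # p.86 form: at least r successes in the first m+r-1 trials
    return sum(comb(m+r-1, n)*p**n*(1-p)**(m+r-1-n) for n in range(r, m+r))

def prm_negbinom(r, m, p):
    # p.158 form: the r-th success arrives on trial n, for n = r, ..., m+r-1
    return sum(comb(n-1, r-1)*p**r*(1-p)**(n-r) for n in range(r, m+r))

print(prm_binomial(4, 3, 0.6), prm_negbinom(4, 3, 0.6))   # same value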

Example 8h Hypergeometric Animal Population Estimation

$$ \frac{P_i(N)}{P_i(N-1)} $$

$$ =\frac{\binom{m}{i}\binom{N-m}{n-i}}{\binom{N}{n}}\frac{\binom{N-1}{n}}{\binom{m}{i}\binom{N-m-1}{n-i}} $$

$$ =\frac{\binom{N-m}{n-i}}{\binom{N}{n}}\frac{\binom{N-1}{n}}{\binom{N-m-1}{n-i}} $$

$$ =\frac{(N-1)!}{(N-1-n)!n!}\frac{(N-n)!n!}{N!}\frac{(N-m)!}{(N-m-(n-i))!(n-i)!}\frac{(N-m-1-(n-i))!(n-i)!}{(N-m-1)!} $$

$$ =\frac{N-n}N\frac{N-m}{N-m-(n-i)} $$

So $P_i(N)\geq P_i(N-1)$ if and only if $\frac{P_i(N)}{P_i(N-1)}=\frac{N-n}N\frac{N-m}{N-m-(n-i)}\geq1$, which holds exactly when

$$ (N-m)(N-n)\geq N(N-m-n+i) $$

$$ N^2-nN-mN+mn\geq N^2-mN-nN+iN $$

$$ mn\geq iN $$

$$ iN\leq mn $$

$$ N\leq\frac{mn}i $$

Otherwise $P_i$ is decreasing in $N$.

Succinctly

$$ P_i(N)\text{ is }\cases{\text{increasing when }N\leq\frac{mn}i\\\text{decreasing otherwise}} $$

And $P_i$ reaches its largest value at the largest integer $N$ not exceeding $\frac{mn}i$.
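
We can confirm this numerically by scanning $P_i(N)$ over $N$ (my sketch; the values of $m$, $n$, $i$ are arbitrary, chosen so that $\frac{mn}i$ is not an integer):

from math import comb

def P_i(N, m=50, n=40, i=3):
    return comb(m, i)*comb(N-m, n-i)/comb(N, n)

best_N = max(range(90, 2000), key=P_i)   # scan N (need N >= m + n - i)
print(best_N, (50*40)//3)                # both are 666 = floor(m*n/i)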

Hypergeometric Approximates Binomial p162

$$ P\set{X=i}=\frac{\binom{m}{i}\binom{N-m}{n-i}}{\binom{N}{n}} $$

$$ =\frac{m!}{(m-i)!i!}\frac{(N-m)!}{(N-m-(n-i))!(n-i)!}\frac{(N-n)!n!}{N!} $$

$$ =\binom{n}{i}\frac{m!}{(m-i)!}\frac{(N-m)!}{(N-m-(n-i))!}\frac{(N-n)!}{N!} \tag{162.1} $$

So let’s look at each term in (162.1):

$$ \frac{m!}{(m-i)!}=m(m-1)\dots(m-i+1)=\prod_{j=m-i+1}^{m}j \tag{162.2} $$

This product has $m-(m-i+1)+1=m-m+i-1+1=i$ terms. The next term in (162.1):

$$ \frac{(N-m)!}{(N-m-(n-i))!}=(N-m)(N-m-1)\dots(N-m-(n-i)+1) \tag{162.3} $$

This product has $(N-m)-(N-m-(n-i)+1)+1=N-m-N+m+n-i-1+1=n-i$ terms. And the last term in (162.1):

$$ \frac{(N-n)!}{N!}=\frac1{N}\frac1{N-1}\dots\frac1{N-i+1}\frac1{N-i}\frac1{N-i-1}\dots\frac1{N-i-(n-i)+1} \tag{162.4} $$

This product has $n=i+(n-i)$ terms, and we can pair $i$ of them with the $i$ terms of (162.2) and the remaining $n-i$ of them with the $n-i$ terms of (162.3), so that (162.1) becomes

$$ P\set{X=i}=\binom{n}{i}\frac{m}{N}\frac{m-1}{N-1}\dots\frac{m-i+1}{N-i+1}\frac{N-m}{N-i}\frac{N-m-1}{N-i-1}\dots\frac{N-m-(n-i)+1}{N-i-(n-i)+1} \tag{162.5} $$

Now $\frac{m}{N}=p$. But $\frac{m-1}{N-1}\approx p$ when $m$ and $N$ are large relative to $n$ and $i$. And $\frac{m-i+1}{N-i+1}\approx p$ when $m$ and $N$ are large. So, after $\binom{n}{i}$, the next $i$ terms of (162.5) are approximately $p^i$.

Also $\frac{N-m}{N}=1-p$ so that $\frac{N-m}{N-i}\approx1-p$ when $m$ and $N$ are large. And $\frac{N-m-1}{N-i-1}\approx1-p$ when $m$ and $N$ are large. And $\frac{N-m-(n-i)+1}{N-i-(n-i)+1}\approx1-p$ when $m$ and $N$ are large. So the last $n-i$ terms of (162.5) are approximately $(1-p)^{n-i}$.

Hence

$$ P\set{X=i}\approx\binom{n}{i}p^i(1-p)^{n-i} $$
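
A numerical illustration (my sketch; $N$ and $m$ are large relative to $n$ and $i$, with $p=m/N$):

from math import comb

N, m, n, i = 10000, 3000, 10, 4
p = m/N
print(comb(m, i)*comb(N-m, n-i)/comb(N, n))   # hypergeometric pmf, ~0.20
print(comb(n, i)*p**i*(1-p)**(n-i))           # binomial approximation, ~0.20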

Example 8j, p.162

$$ i\binom{m}{i}=i\frac{m!}{(m-i)!i!}=\frac{m!}{(m-i)!(i-1)!}=m\frac{(m-1)!}{(m-i)!(i-1)!} $$

$$ =m\frac{(m-1)!}{(m-i-1+1)!(i-1)!}=m\frac{(m-1)!}{((m-1)-(i-1))!(i-1)!} $$

$$ =m\binom{m-1}{i-1} $$
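
A one-line check of the identity (my sketch):

from math import comb
print(all(i*comb(m, i) == m*comb(m-1, i-1)
          for m in range(1, 20) for i in range(1, m+1)))   # True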

Example 10a

$$ \prB{X>\frac12}=1-\prB{X\leq\frac12}=1-F\prn{\frac12}=1-\frac{\frac12}2=1-\frac14=\frac34 $$

$$ \pr{2<X\leq4}=F(4)-F(2)=1-\frac{11}{12}=\frac1{12} $$