From a9bc29ef6212dc98b54bd52ab8efaad28d8f8f2f Mon Sep 17 00:00:00 2001
From: John Stachurski
Date: Thu, 1 Aug 2024 09:17:52 +1000
Subject: [PATCH 01/10] Misc edits to prob lecture

---
 lectures/prob_dist.md | 161 ++++++++++++++++++++++++++----------------
 1 file changed, 100 insertions(+), 61 deletions(-)

diff --git a/lectures/prob_dist.md b/lectures/prob_dist.md
index 09174da3..668c1787 100644
--- a/lectures/prob_dist.md
+++ b/lectures/prob_dist.md
@@ -162,33 +162,33 @@ Check that your answers agree with `u.mean()` and `u.var()`.

Another useful distribution is the Bernoulli distribution on $S = \{0,1\}$, which has PMF:

$$
-p(x_i)=
-\begin{cases}
-p & \text{if $x_i = 1$}\\
-1-p & \text{if $x_i = 0$}
-\end{cases}
+p(i) = \theta^i (1 - \theta)^{1-i}
$$

-Here $x_i \in S$ is the outcome of the random variable.
+Here $\theta \in [0,1]$ is a parameter.
+
+We can think of this distribution as modeling probabilities for a random trial with success probability $\theta$.
+
+* $p(1) = \theta$ means that the trial succeeds (takes value 1) with probability $\theta$
+* $p(0) = 1 - \theta$ means that the trial fails (takes value 0) with
+  probability $1-\theta$
+
+The formula for the mean is $\theta$, and the formula for the variance is $\theta(1-\theta)$.

We can import the Bernoulli distribution on $S = \{0,1\}$ from SciPy like so:

```{code-cell} ipython3
-p = 0.4
+θ = 0.4
u = scipy.stats.bernoulli(p)
```
-
-Here's the mean and variance:
+Here's the mean and variance at $\theta=0.4$:

```{code-cell} ipython3
u.mean(), u.var()
```

-The formula for the mean is $p$, and the formula for the variance is $p(1-p)$.
-
-
-Now let's evaluate the PMF:
+Now let's evaluate the PMF:

```{code-cell} ipython3
u.pmf(0)
```

```{code-cell} ipython3
u.pmf(1)
```

@@ -204,13 +204,15 @@ $$
p(i) = \binom{n}{i} \theta^i (1-\theta)^{n-i}
$$

-Here $\theta \in [0,1]$ is a parameter.
+Again, $\theta \in [0,1]$ is a parameter.

The interpretation of $p(i)$ is: the probability of $i$ successes in $n$ independent trials with success probability $\theta$.

For example, if $\theta=0.5$, then $p(i)$ is the probability of $i$ heads in $n$ flips of a fair coin.

-The mean and variance are:
+The formula for the mean is $n \theta$ and the formula for the variance is $n \theta (1-\theta)$.
+
+Let's investigate an example:

```{code-cell} ipython3
n = 10
@@ -218,11 +220,18 @@ n = 10
u = scipy.stats.binom(n, θ)
```

+According to our formulas, the mean and variance are:
+
+```{code-cell} ipython3
+n * θ, n * θ * (1 - θ)
+```
+
+Let's see if SciPy gives us the same results:
+
```{code-cell} ipython3
u.mean(), u.var()
```

-The formula for the mean is $n \theta$ and the formula for the variance is $n \theta (1-\theta)$.

Here's the PMF:

```{code-cell} ipython3
@@ -285,24 +294,64 @@ We can see that the output graph is the same as the one above.
```{solution-end}
```

+#### Geometric distribution
+
+The geometric distribution has infinite support $S = \{1, 2, \ldots\}$ and its PMF is given by
+
+$$
+    p(i) = (1 - \theta)^{i-1} \theta
+$$
+
+where $\theta \in [0,1]$ is a parameter.
+
+To understand the distribution, think of repeated independent random trials, each with success probability $\theta$.
+
+The interpretation of $p(i)$ is: the probability that the first success occurs on trial $i$, after $i-1$ failures.
+
+It can be shown that the mean of the distribution is $1/\theta$ and the variance is $(1-\theta)/\theta^2$.
+
+Here's an example, starting with a quick simulation check of these formulas.
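The simulation below is only a sketch: the seed, the sample size, and the choice $\theta = 0.1$ are all arbitrary. NumPy's `geometric` sampler follows the same convention as our PMF, so each draw is the number of the trial on which the first success occurs.

```{code-cell} ipython3
rng = np.random.default_rng(seed=1234)    # arbitrary seed
θ = 0.1
draws = rng.geometric(θ, size=1_000_000)  # one draw = trial number of first success
draws.mean(), draws.var()                 # compare with 1/θ = 10 and (1-θ)/θ**2 = 90
```

Now let's confirm these numbers with SciPy: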
+
+```{code-cell} ipython3
+θ = 0.1
+u = scipy.stats.geom(θ)
+u.mean(), u.var()
+```
+
+Here's part of the PMF:
+
+```{code-cell} ipython3
+fig, ax = plt.subplots()
+n = 20
+S = np.arange(1, n+1)  # the support starts at 1
+ax.plot(S, u.pmf(S), linestyle='', marker='o', alpha=0.8, ms=4)
+ax.vlines(S, 0, u.pmf(S), lw=0.2)
+ax.set_xticks(S)
+ax.set_xlabel('S')
+ax.set_ylabel('PMF')
+plt.show()
+```
+
+
#### Poisson distribution

-Poisson distribution on $S = \{0, 1, \ldots\}$ with parameter $\lambda > 0$ has PMF
+The Poisson distribution on $S = \{0, 1, \ldots\}$ with parameter $\lambda > 0$ has PMF

$$
p(i) = \frac{\lambda^i}{i!} e^{-\lambda}
$$

-The interpretation of $p(i)$ is: the probability of $i$ events in a fixed time interval, where the events occur at a constant rate $\lambda$ and independently of each other.
+The interpretation of $p(i)$ is: the probability of $i$ events in a fixed time interval, where the events occur independently at a constant rate $\lambda$.
+
+It can be shown that the mean is $\lambda$ and the variance is also $\lambda$.
+
+Here's an example.

-The mean and variance are:
```{code-cell} ipython3
λ = 2
u = scipy.stats.poisson(λ)
u.mean(), u.var()
```
-
-The expectation of the Poisson distribution is $\lambda$ and the variance is also $\lambda$.

Here's the PMF:

```{code-cell} ipython3
@@ -325,7 +374,7 @@ plt.show()

### Continuous distributions

-Continuous distributions are represented by a **probability density function**, which is a function $p$ over $\mathbb R$ (the set of all real numbers) such that $p(x) \geq 0$ for all $x$ and
+A continuous distribution is represented by a **probability density function**, which is a function $p$ over $\mathbb R$ (the set of all real numbers) such that $p(x) \geq 0$ for all $x$ and

$$ \int_{-\infty}^\infty p(x) dx = 1 $$

@@ -362,11 +411,11 @@ $$
\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
$$

-This distribution has two parameters, $\mu$ and $\sigma$.
+This distribution has two parameters, $\mu \in \mathbb R$ and $\sigma \in (0, \infty)$.

-It can be shown that, for this distribution, the mean is $\mu$ and the variance is $\sigma^2$.
+Using calculus, it can be shown that, for this distribution, the mean is $\mu$ and the variance is $\sigma^2$.

-We can obtain the moments, PDF and CDF of the normal density as follows:
+We can obtain the moments, PDF and CDF of the normal density via SciPy as follows:

```{code-cell} ipython3
μ, σ = 0.0, 1.0
@@ -427,9 +476,10 @@ This distribution has two parameters, $\mu$ and $\sigma$.

It can be shown that, for this distribution, the mean is $\exp\left(\mu + \sigma^2/2\right)$ and the variance is $\left[\exp\left(\sigma^2\right) - 1\right] \exp\left(2\mu + \sigma^2\right)$.

-It has a nice interpretation: if $X$ is lognormally distributed, then $\log X$ is normally distributed.
+It can be proved that

-It is often used to model variables that are "multiplicative" in nature, such as income or asset prices.
+* if $X$ is lognormally distributed, then $\log X$ is normally distributed, and
+* if $X$ is normally distributed, then $\exp X$ is lognormally distributed.

We can obtain the moments, PDF, and CDF of the lognormal density as follows:

```{code-cell} ipython3
@@ -477,15 +527,16 @@ plt.show()

#### Exponential distribution

-The **exponential distribution** is a distribution on $\left(0, \infty\right)$ with density
+The **exponential distribution** is a distribution supported on $\left(0, \infty\right)$ with density

$$
p(x) = \lambda \exp \left( - \lambda x \right)
+ \qquad (x > 0)
$$

-This distribution has one parameter, $\lambda$.
+This distribution has one parameter $\lambda$.
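One practical detail worth flagging: `scipy.stats.expon` is parameterized by `scale`, which equals $1/\lambda$, rather than by $\lambda$ itself. Here is a minimal sketch (the choice $\lambda = 0.5$ is arbitrary):

```{code-cell} ipython3
λ = 0.5
u = scipy.stats.expon(scale=1/λ)  # SciPy's scale parameter is 1/λ
u.mean(), u.var()                 # mean 1/λ = 2 and variance 1/λ**2 = 4
```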
-It is related to the Poisson distribution as it describes the distribution of the length of the time interval between two consecutive events in a Poisson process. +The exponential distribution can be thought of as the continuous analog of the geometric distribution. It can be shown that, for this distribution, the mean is $1/\lambda$ and the variance is $1/\lambda^2$. @@ -706,7 +757,7 @@ $$ For the income distribution given above, we can calculate these numbers via ```{code-cell} ipython3 -x = np.asarray(df['income']) +x = np.asarray(df['income']) # Pull out income as a NumPy array ``` ```{code-cell} ipython3 @@ -720,6 +771,7 @@ x.mean(), x.var() Check that the formulas given above produce the same numbers. ``` + ### Visualization Let's look at different ways that we can visualize one or more observed distributions. @@ -730,12 +782,9 @@ We will cover - kernel density estimates and - violin plots -+++ {"user_expressions": []} #### Histograms -+++ {"user_expressions": []} - We can histogram the income distribution we just constructed as follows ```{code-cell} ipython3 @@ -747,11 +796,10 @@ ax.set_ylabel('density') plt.show() ``` -+++ {"user_expressions": []} Let's look at a distribution from real data. -In particular, we will look at the monthly return on Amazon shares between 2000/1/1 and 2023/1/1. +In particular, we will look at the monthly return on Amazon shares between 2000/1/1 and 2024/1/1. The monthly return is calculated as the percent change in the share price over each month. @@ -759,13 +807,12 @@ So we will have one observation for each month. ```{code-cell} ipython3 :tags: [hide-output] -df = yf.download('AMZN', '2000-1-1', '2023-1-1', interval='1mo' ) +df = yf.download('AMZN', '2000-1-1', '2024-1-1', interval='1mo' ) prices = df['Adj Close'] data = prices.pct_change()[1:] * 100 data.head() ``` -+++ {"user_expressions": []} The first observation is the monthly return (percent change) over January 2000, which was @@ -773,8 +820,6 @@ The first observation is the monthly return (percent change) over January 2000, data[0] ``` -+++ {"user_expressions": []} - Let's turn the return observations into an array and histogram it. ```{code-cell} ipython3 @@ -789,13 +834,15 @@ ax.set_ylabel('density') plt.show() ``` -+++ {"user_expressions": []} #### Kernel density estimates -Kernel density estimate (KDE) is a non-parametric way to estimate and visualize the PDF of a distribution. +Kernel density estimates (KDE) provide a simple way to estimate and visualize the density of a distribution. + +If you are not familiar with KDEs, you can think of them as a smoothed +histogram. -KDE will generate a smooth curve that approximates the PDF. +Let's have a look at a KDE formed from the Amazon return data. ```{code-cell} ipython3 fig, ax = plt.subplots() @@ -825,9 +872,8 @@ A suitable bandwidth is not too smooth (underfitting) or too wiggly (overfitting #### Violin plots -+++ {"user_expressions": []} -Yet another way to display an observed distribution is via a violin plot. +Another way to display an observed distribution is via a violin plot. ```{code-cell} ipython3 fig, ax = plt.subplots() @@ -837,43 +883,40 @@ ax.set_xlabel('KDE') plt.show() ``` -+++ {"user_expressions": []} - Violin plots are particularly useful when we want to compare different distributions. -For example, let's compare the monthly returns on Amazon shares with the monthly return on Apple shares. +For example, let's compare the monthly returns on Amazon shares with the monthly return on Costco shares. 
```{code-cell} ipython3 :tags: [hide-output] -df = yf.download('AAPL', '2000-1-1', '2023-1-1', interval='1mo' ) +df = yf.download('COST', '2000-1-1', '2024-1-1', interval='1mo' ) prices = df['Adj Close'] data = prices.pct_change()[1:] * 100 -x_apple = np.asarray(data) +x_costco = np.asarray(data) ``` ```{code-cell} ipython3 fig, ax = plt.subplots() -ax.violinplot([x_amazon, x_apple]) +ax.violinplot([x_amazon, x_costco]) ax.set_ylabel('monthly return (percent change)') ax.set_xlabel('KDE') plt.show() ``` -+++ {"user_expressions": []} ### Connection to probability distributions -+++ {"user_expressions": []} - Let's discuss the connection between observed distributions and probability distributions. Sometimes it's helpful to imagine that an observed distribution is generated by a particular probability distribution. For example, we might look at the returns from Amazon above and imagine that they were generated by a normal distribution. -Even though this is not true, it might be a helpful way to think about the data. +(Even though this is not true, it *might* be a helpful way to think about the data.) -Here we match a normal distribution to the Amazon monthly returns by setting the sample mean to the mean of the normal distribution and the sample variance equal to the variance. +Here we match a normal distribution to the Amazon monthly returns by setting the +sample mean to the mean of the normal distribution and the sample variance equal +to the variance. Then we plot the density and the histogram. @@ -894,14 +937,11 @@ ax.set_ylabel('density') plt.show() ``` -+++ {"user_expressions": []} -The match between the histogram and the density is not very bad but also not very good. +The match between the histogram and the density is not bad but also not very good. One reason is that the normal distribution is not really a good fit for this observed data --- we will discuss this point again when we talk about {ref}`heavy tailed distributions`. -+++ {"user_expressions": []} - Of course, if the data really *is* generated by the normal distribution, then the fit will be better. Let's see this in action @@ -923,7 +963,6 @@ ax.set_ylabel('density') plt.show() ``` -+++ {"user_expressions": []} Note that if you keep increasing $N$, which is the number of observations, the fit will get better and better. From d4c9583acf6a1ed6015483024e5396388440f894 Mon Sep 17 00:00:00 2001 From: mmcky Date: Thu, 1 Aug 2024 09:58:44 +1000 Subject: [PATCH 02/10] fix variable name and minor formatting update --- lectures/prob_dist.md | 64 ++++++++++++++++++++++--------------------- 1 file changed, 33 insertions(+), 31 deletions(-) diff --git a/lectures/prob_dist.md b/lectures/prob_dist.md index 668c1787..04c33076 100644 --- a/lectures/prob_dist.md +++ b/lectures/prob_dist.md @@ -46,18 +46,22 @@ Let's start with discrete distributions. A discrete distribution is defined by a set of numbers $S = \{x_1, \ldots, x_n\}$ and a **probability mass function** (PMF) on $S$, which is a function $p$ from $S$ to $[0,1]$ with the property -$$ \sum_{i=1}^n p(x_i) = 1 $$ +$$ +\sum_{i=1}^n p(x_i) = 1 +$$ We say that a random variable $X$ **has distribution** $p$ if $X$ takes value $x_i$ with probability $p(x_i)$. 
That is,

-$$ \mathbb P\{X = x_i\} = p(x_i) \quad \text{for } i= 1, \ldots, n $$
+$$
+\mathbb P\{X = x_i\} = p(x_i) \quad \text{for } i= 1, \ldots, n
+$$

The **mean** or **expected value** of a random variable $X$ with distribution $p$ is

$$
-    \mathbb{E}[X] = \sum_{i=1}^n x_i p(x_i)
+\mathbb{E}[X] = \sum_{i=1}^n x_i p(x_i)
$$

Expectation is also called the *first moment* of the distribution.

We also refer to this number as the mean of the distribution represented by $p$.

The **variance** of $X$ is defined as

$$
-    \mathbb{V}[X] = \sum_{i=1}^n (x_i - \mathbb{E}[X])^2 p(x_i)
+\mathbb{V}[X] = \sum_{i=1}^n (x_i - \mathbb{E}[X])^2 p(x_i)
$$

Variance is also called the *second central moment* of the distribution.

The **cumulative distribution function** (CDF) of $X$ is defined by

$$
-    F(x) = \mathbb{P}\{X \leq x\}
-         = \sum_{i=1}^n \mathbb 1\{x_i \leq x\} p(x_i)
+F(x) = \mathbb{P}\{X \leq x\}
+     = \sum_{i=1}^n \mathbb 1\{x_i \leq x\} p(x_i)
$$

Here $\mathbb 1\{ \textrm{statement} \} = 1$ if "statement" is true and zero otherwise.

@@ -115,7 +119,6 @@ u.pmf(1)
u.pmf(2)
```

-
Here's a plot of the probability mass function:

```{code-cell} ipython3
@@ -129,7 +132,6 @@ ax.set_xlabel('S')
ax.set_ylabel('PMF')
plt.show()
```

-
Here's a plot of the CDF:

```{code-cell} ipython3
@@ -143,10 +145,8 @@ ax.set_xlabel('S')
ax.set_ylabel('CDF')
plt.show()
```

-
The CDF jumps up by $p(x_i)$ at $x_i$.

-
```{exercise}
:label: prob_ex1

We can import the Bernoulli distribution on $S = \{0,1\}$ from SciPy like so:

```{code-cell} ipython3
θ = 0.4
-u = scipy.stats.bernoulli(p)
+u = scipy.stats.bernoulli(θ)
```

Here's the mean and variance at $\theta=0.4$:

@@ -201,7 +201,7 @@ Another useful (and more interesting) distribution is the **binomial distribution** on $S=\{0, \ldots, n\}$, which has PMF:

$$
-    p(i) = \binom{n}{i} \theta^i (1-\theta)^{n-i}
+p(i) = \binom{n}{i} \theta^i (1-\theta)^{n-i}
$$

Again, $\theta \in [0,1]$ is a parameter.

@@ -299,7 +299,7 @@ We can see that the output graph is the same as the one above.

The geometric distribution has infinite support $S = \{1, 2, \ldots\}$ and its PMF is given by

$$
-    p(i) = (1 - \theta)^{i-1} \theta
+p(i) = (1 - \theta)^{i-1} \theta
$$

where $\theta \in [0,1]$ is a parameter.

@@ -338,7 +338,7 @@ The Poisson distribution on $S = \{0, 1, \ldots\}$ with parameter $\lambda > 0$ has PMF

$$
-    p(i) = \frac{\lambda^i}{i!} e^{-\lambda}
+p(i) = \frac{\lambda^i}{i!} e^{-\lambda}
$$

@@ -376,12 +376,14 @@ A continuous distribution is represented by a **probability density function**, which is a function $p$ over $\mathbb R$ (the set of all real numbers) such that $p(x) \geq 0$ for all $x$ and

-$$ \int_{-\infty}^\infty p(x) dx = 1 $$
+$$
+\int_{-\infty}^\infty p(x) dx = 1
+$$

We say that a random variable $X$ has distribution $p$ if

-$$ \mathbb P\{a < X < b\} = \int_a^b p(x) dx $$
+$$
+\mathbb P\{a < X < b\} = \int_a^b p(x) dx
+$$

for all $a \leq b$.
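As a quick numerical illustration of this property (a sketch, with an arbitrary choice of density and interval), we can integrate the standard normal PDF and check that the total mass is one, while a finite interval receives probability between zero and one:

```{code-cell} ipython3
from scipy.integrate import quad
u = scipy.stats.norm()  # standard normal, an arbitrary example
quad(u.pdf, -np.inf, np.inf)[0], quad(u.pdf, -1, 2)[0]  # total mass and P{-1 < X < 2}
```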
The definitions of the mean and variance of a random variable $X$ with distribution $p$ are the same as in the discrete case, after replacing the sum with an integral.

For example, the mean of $X$ is

$$
-    \mathbb{E}[X] = \int_{-\infty}^\infty x p(x) dx
+\mathbb{E}[X] = \int_{-\infty}^\infty x p(x) dx
$$

The **cumulative distribution function** (CDF) of $X$ is defined by

$$
-    F(x) = \mathbb P\{X \leq x\}
-         = \int_{-\infty}^x p(x) dx
+F(x) = \mathbb P\{X \leq x\}
+     = \int_{-\infty}^x p(x) dx
$$

#### Normal distribution

Perhaps the most famous distribution is the **normal distribution**, which has density

$$
-    p(x) = \frac{1}{\sqrt{2\pi}\sigma}
-              \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
+p(x) = \frac{1}{\sqrt{2\pi}\sigma}
+   \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
$$

The **lognormal distribution** is a distribution on $\left(0, \infty\right)$ with density

$$
-    p(x) = \frac{1}{\sigma x \sqrt{2\pi}}
-        \exp \left(- \frac{\left(\log x - \mu\right)^2}{2 \sigma^2} \right)
+p(x) = \frac{1}{\sigma x \sqrt{2\pi}}
+   \exp \left(- \frac{\left(\log x - \mu\right)^2}{2 \sigma^2} \right)
$$

This distribution has two parameters, $\mu$ and $\sigma$.

#### Exponential distribution

The **exponential distribution** is a distribution supported on $\left(0, \infty\right)$ with density

$$
-    p(x) = \lambda \exp \left( - \lambda x \right)
-    \qquad (x > 0)
+p(x) = \lambda \exp \left( - \lambda x \right)
+\qquad (x > 0)
$$

This distribution has one parameter $\lambda$.

#### Beta distribution

The **beta distribution** is a distribution on $(0, 1)$ with density

$$
-    p(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha) \Gamma(\beta)}
-        x^{\alpha - 1} (1 - x)^{\beta - 1}
+p(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha) \Gamma(\beta)}
+   x^{\alpha - 1} (1 - x)^{\beta - 1}
$$

where $\Gamma$ is the [gamma function](https://en.wikipedia.org/wiki/Gamma_function).

#### Gamma distribution

The **gamma distribution** is a distribution on $\left(0, \infty\right)$ with density

$$
-    p(x) = \frac{\beta^\alpha}{\Gamma(\alpha)}
-        x^{\alpha - 1} \exp(-\beta x)
+p(x) = \frac{\beta^\alpha}{\Gamma(\alpha)}
+   x^{\alpha - 1} \exp(-\beta x)
$$

This distribution has two parameters, $\alpha > 0$ and $\beta > 0$.

The **sample mean** of this distribution is defined as

$$
-    \bar x = \frac{1}{n} \sum_{i=1}^n x_i
+\bar x = \frac{1}{n} \sum_{i=1}^n x_i
$$

The **sample variance** is defined as

$$
-    \frac{1}{n} \sum_{i=1}^n (x_i - \bar x)^2
+\frac{1}{n} \sum_{i=1}^n (x_i - \bar x)^2
$$

For the income distribution given above, we can calculate these numbers via

From 39bdc06d94b2b2bd17abbd059e7dec8f1564b92c Mon Sep 17 00:00:00 2001
From: mmcky
Date: Thu, 1 Aug 2024 10:37:17 +1000
Subject: [PATCH 03/10] add explanation for infinite support

---
 lectures/prob_dist.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lectures/prob_dist.md b/lectures/prob_dist.md
index 04c33076..5bfa6200 100644
--- a/lectures/prob_dist.md
+++ b/lectures/prob_dist.md
@@ -304,6 +304,8 @@ $$

where $\theta \in [0,1]$ is a parameter.

+(A discrete distribution has infinite support if the set of points to which it assigns positive probability is infinite.)
+
To understand the distribution, think of repeated independent random trials, each with success probability $\theta$.

The interpretation of $p(i)$ is: the probability that the first success occurs on trial $i$, after $i-1$ failures.
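To make the infinite-support point concrete, here is a small numerical sketch (the truncation point 200 is arbitrary): the geometric distribution puts positive probability on every point of its support, yet the truncated sums approach one.

```{code-cell} ipython3
θ = 0.1
i = np.arange(1, 200)     # a finite slice of the infinite support
p = (1 - θ)**(i - 1) * θ  # geometric PMF at each point
p.min() > 0, p.sum()      # every term positive; partial sum is close to 1
```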
From e57e9eacda071f0448f223d5d6b903230372a91d Mon Sep 17 00:00:00 2001 From: mmcky Date: Thu, 1 Aug 2024 12:57:30 +1000 Subject: [PATCH 04/10] ENH: update code to simplify and resolve warnings --- lectures/prob_dist.md | 39 ++++++++------------------------------- 1 file changed, 8 insertions(+), 31 deletions(-) diff --git a/lectures/prob_dist.md b/lectures/prob_dist.md index 5bfa6200..9a517710 100644 --- a/lectures/prob_dist.md +++ b/lectures/prob_dist.md @@ -4,14 +4,13 @@ jupytext: extension: .md format_name: myst format_version: 0.13 - jupytext_version: 1.14.5 + jupytext_version: 1.16.1 kernelspec: display_name: Python 3 (ipykernel) language: python name: python3 --- - # Distributions and Probabilities ```{index} single: Distributions and Probabilities @@ -23,6 +22,7 @@ In this lecture we give a quick introduction to data and probability distributio ```{code-cell} ipython3 :tags: [hide-output] + !pip install --upgrade yfinance ``` @@ -35,7 +35,6 @@ import scipy.stats import seaborn as sns ``` - ## Common distributions In this section we recall the definitions of some well-known distributions and explore how to manipulate them with SciPy. @@ -99,7 +98,6 @@ n = 10 u = scipy.stats.randint(1, n+1) ``` - Here's the mean and variance: ```{code-cell} ipython3 @@ -195,7 +193,6 @@ u.pmf(0) u.pmf(1) ``` - #### Binomial distribution Another useful (and more interesting) distribution is the **binomial distribution** on $S=\{0, \ldots, n\}$, which has PMF: @@ -232,7 +229,6 @@ Let's see if SciPy gives us the same results: u.mean(), u.var() ``` - Here's the PMF: ```{code-cell} ipython3 @@ -250,7 +246,6 @@ ax.set_ylabel('PMF') plt.show() ``` - Here's the CDF: ```{code-cell} ipython3 @@ -264,7 +259,6 @@ ax.set_ylabel('CDF') plt.show() ``` - ```{exercise} :label: prob_ex3 @@ -334,7 +328,6 @@ ax.set_ylabel('PMF') plt.show() ``` - #### Poisson distribution The Poisson distribution on $S = \{0, 1, \ldots\}$ with parameter $\lambda > 0$ has PMF @@ -372,7 +365,6 @@ ax.set_ylabel('PMF') plt.show() ``` - ### Continuous distributions @@ -449,7 +441,6 @@ plt.legend() plt.show() ``` - Here's a plot of the CDF: ```{code-cell} ipython3 @@ -466,7 +457,6 @@ plt.legend() plt.show() ``` - #### Lognormal distribution The **lognormal distribution** is a distribution on $\left(0, \infty\right)$ with density @@ -646,7 +636,6 @@ plt.legend() plt.show() ``` - #### Gamma distribution The **gamma distribution** is a distribution on $\left(0, \infty\right)$ with density @@ -730,7 +719,6 @@ df = pd.DataFrame(data, columns=['name', 'income']) df ``` - In this situation, we might refer to the set of their incomes as the "income distribution." The terminology is confusing because this set is not a probability distribution @@ -768,7 +756,6 @@ x = np.asarray(df['income']) # Pull out income as a NumPy array x.mean(), x.var() ``` - ```{exercise} :label: prob_ex4 @@ -800,7 +787,6 @@ ax.set_ylabel('density') plt.show() ``` - Let's look at a distribution from real data. In particular, we will look at the monthly return on Amazon shares between 2000/1/1 and 2024/1/1. @@ -811,25 +797,21 @@ So we will have one observation for each month. 
```{code-cell} ipython3 :tags: [hide-output] + df = yf.download('AMZN', '2000-1-1', '2024-1-1', interval='1mo' ) prices = df['Adj Close'] -data = prices.pct_change()[1:] * 100 -data.head() +x_amazon = prices.pct_change()[1:] * 100 +x_amazon.head() ``` - The first observation is the monthly return (percent change) over January 2000, which was ```{code-cell} ipython3 -data[0] +x_amazon.iloc[0] ``` Let's turn the return observations into an array and histogram it. -```{code-cell} ipython3 -x_amazon = np.asarray(data) -``` - ```{code-cell} ipython3 fig, ax = plt.subplots() ax.hist(x_amazon, bins=20) @@ -838,7 +820,6 @@ ax.set_ylabel('density') plt.show() ``` - #### Kernel density estimates Kernel density estimates (KDE) provide a simple way to estimate and visualize the density of a distribution. @@ -893,10 +874,10 @@ For example, let's compare the monthly returns on Amazon shares with the monthly ```{code-cell} ipython3 :tags: [hide-output] + df = yf.download('COST', '2000-1-1', '2024-1-1', interval='1mo' ) prices = df['Adj Close'] -data = prices.pct_change()[1:] * 100 -x_costco = np.asarray(data) +x_costco = prices.pct_change()[1:] * 100 ``` ```{code-cell} ipython3 @@ -907,7 +888,6 @@ ax.set_xlabel('KDE') plt.show() ``` - ### Connection to probability distributions Let's discuss the connection between observed distributions and probability distributions. @@ -941,7 +921,6 @@ ax.set_ylabel('density') plt.show() ``` - The match between the histogram and the density is not bad but also not very good. One reason is that the normal distribution is not really a good fit for this observed data --- we will discuss this point again when we talk about {ref}`heavy tailed distributions`. @@ -967,8 +946,6 @@ ax.set_ylabel('density') plt.show() ``` - Note that if you keep increasing $N$, which is the number of observations, the fit will get better and better. This convergence is a version of the "law of large numbers", which we will discuss {ref}`later`. 
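As a rough sketch of that convergence (the two sample sizes below are arbitrary), we can overlay the true density on histograms built from a small and a large sample:

```{code-cell} ipython3
μ, σ = 0, 1
u = scipy.stats.norm(μ, σ)
x_grid = np.linspace(-4, 4, 200)
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, N in zip(axes, (100, 100_000)):
    ax.hist(u.rvs(N), density=True, bins=40)  # N observations
    ax.plot(x_grid, u.pdf(x_grid))            # the true density
    ax.set_title(f'$N = {N}$')
plt.show()
```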
- From 8853de126432449caab5cf6cbf0eefa903e84a3b Mon Sep 17 00:00:00 2001 From: mmcky Date: Thu, 1 Aug 2024 13:03:06 +1000 Subject: [PATCH 05/10] remove all asarray --- lectures/prob_dist.md | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/lectures/prob_dist.md b/lectures/prob_dist.md index 9a517710..0047ae6a 100644 --- a/lectures/prob_dist.md +++ b/lectures/prob_dist.md @@ -749,10 +749,7 @@ $$ For the income distribution given above, we can calculate these numbers via ```{code-cell} ipython3 -x = np.asarray(df['income']) # Pull out income as a NumPy array -``` - -```{code-cell} ipython3 +x = df['income'] x.mean(), x.var() ``` From a21ad42d3e97ce4940dee82e8dabfb01d5d594f9 Mon Sep 17 00:00:00 2001 From: mmcky Date: Thu, 1 Aug 2024 13:07:45 +1000 Subject: [PATCH 06/10] address missed merge conflict issues --- lectures/prob_dist.md | 8 -------- 1 file changed, 8 deletions(-) diff --git a/lectures/prob_dist.md b/lectures/prob_dist.md index 3f7ee04e..3153123d 100644 --- a/lectures/prob_dist.md +++ b/lectures/prob_dist.md @@ -229,10 +229,6 @@ Let's see if SciPy gives us the same results: u.mean(), u.var() ``` -<<<<<<< HEAD -======= - ->>>>>>> main Here's the PMF: ```{code-cell} ipython3 @@ -332,10 +328,6 @@ ax.set_ylabel('PMF') plt.show() ``` -<<<<<<< HEAD -======= - ->>>>>>> main #### Poisson distribution The Poisson distribution on $S = \{0, 1, \ldots\}$ with parameter $\lambda > 0$ has PMF From 447c6a26d0ba1908659b6d456655f71735c37396 Mon Sep 17 00:00:00 2001 From: mmcky Date: Thu, 1 Aug 2024 13:20:29 +1000 Subject: [PATCH 07/10] remove extra x=df['income'] --- lectures/prob_dist.md | 1 - 1 file changed, 1 deletion(-) diff --git a/lectures/prob_dist.md b/lectures/prob_dist.md index 3153123d..2b619724 100644 --- a/lectures/prob_dist.md +++ b/lectures/prob_dist.md @@ -776,7 +776,6 @@ We will cover We can histogram the income distribution we just constructed as follows ```{code-cell} ipython3 -x = df['income'] fig, ax = plt.subplots() ax.hist(x, bins=5, density=True, histtype='bar') ax.set_xlabel('income') From ee0996a753b85b1dfeb733d3c59292d8f26d02a1 Mon Sep 17 00:00:00 2001 From: mmcky Date: Thu, 1 Aug 2024 13:33:33 +1000 Subject: [PATCH 08/10] FIX: set pd option to see if FutureWarning is resolved for inf and na --- lectures/prob_dist.md | 1 + 1 file changed, 1 insertion(+) diff --git a/lectures/prob_dist.md b/lectures/prob_dist.md index 2b619724..2548451b 100644 --- a/lectures/prob_dist.md +++ b/lectures/prob_dist.md @@ -29,6 +29,7 @@ In this lecture we give a quick introduction to data and probability distributio ```{code-cell} ipython3 import matplotlib.pyplot as plt import pandas as pd +pd.option_context('mode.use_inf_as_na', False) import numpy as np import yfinance as yf import scipy.stats From 2d3f66b130b808a820549a4dabf1cb39d3c8c487 Mon Sep 17 00:00:00 2001 From: mmcky Date: Thu, 1 Aug 2024 13:46:00 +1000 Subject: [PATCH 09/10] revert test by setting pd option --- lectures/prob_dist.md | 1 - 1 file changed, 1 deletion(-) diff --git a/lectures/prob_dist.md b/lectures/prob_dist.md index 2548451b..2b619724 100644 --- a/lectures/prob_dist.md +++ b/lectures/prob_dist.md @@ -29,7 +29,6 @@ In this lecture we give a quick introduction to data and probability distributio ```{code-cell} ipython3 import matplotlib.pyplot as plt import pandas as pd -pd.option_context('mode.use_inf_as_na', False) import numpy as np import yfinance as yf import scipy.stats From eb5c1818e227a861ce1ed98418be30173f4bb208 Mon Sep 17 00:00:00 2001 From: mmcky Date: Thu, 1 Aug 2024 13:49:02 +1000 
Subject: [PATCH 10/10] upgrade anaconda==2024.06 --- environment.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/environment.yml b/environment.yml index 52036b80..d834b6a3 100644 --- a/environment.yml +++ b/environment.yml @@ -4,7 +4,7 @@ channels: - conda-forge dependencies: - python=3.11 - - anaconda=2024.02 + - anaconda=2024.06 - pip - pip: - jupyter-book==0.15.1