
Commit 8c07d95

Merge pull request #4 from yuta-nakahara/develop-web_pages

Construct Web pages

2 parents 2abf906 + ff95bcf commit 8c07d95

34 files changed: +887 −278 lines

README.md

Lines changed: 7 additions & 6 deletions

@@ -50,12 +50,12 @@ You can visualize the characteristics of the created model by the following meth
 gen_model.visualize_model()
 ```

->p:0.7
->x0:[1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0]
->x1:[1 0 1 0 1 0 0 0 0 1 1 1 1 1 1 1 1 0 1 1]
->x2:[1 0 1 0 1 1 1 1 1 1 1 0 0 0 0 1 1 0 1 1]
->x3:[0 1 0 1 1 1 1 0 0 0 1 0 0 1 1 1 0 0 1 0]
->x4:[1 1 1 0 1 1 0 1 1 1 1 1 1 1 1 1 0 1 1 0]
+>theta:0.7
+>x0:[1 1 1 1 1 0 1 0 0 1 1 1 1 0 1 1 0 1 1 1]
+>x1:[1 0 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1]
+>x2:[0 0 1 1 0 1 0 1 1 1 1 1 1 0 1 0 1 1 1 1]
+>x3:[1 0 1 1 1 1 1 0 0 0 1 0 0 1 0 1 1 0 1 0]
+>x4:[1 1 0 1 0 1 1 1 0 1 1 1 0 0 1 1 1 1 1 1]
 >![bernoulli_example1](./doc/images/README_ex_img1.png)

 After confirming that the frequency of occurrence of 1 is around `theta=0.7`, we generate a sample and store it in the variable `x`.
@@ -108,6 +108,7 @@ Different settings of a loss function yield different optimal estimates.
 The following packages are currently available. In this library, a probabilistic data generative model, prior distribution, posterior distribution (or approximate posterior distribution), and predictive distribution (or approximate predictive distribution) are collectively called a model.

 * Bernoulli model
+* Categorical model
 * Poisson model
 * Normal model
 * Multivariate normal model
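The `visualize_model()` output above just shows that samples from `Bern(x|theta)` with `theta=0.7` contain ones at roughly that frequency. The same check can be sketched with plain numpy; this is an illustrative sketch, not the BayesML API, and the seed and print format are assumptions:

```python
import numpy as np

# Draw five length-20 Bernoulli(theta=0.7) samples and print them in the
# ">x0:[...]" style of the README output above. Plain numpy, not bayesml.
theta = 0.7
rng = np.random.default_rng(0)
samples = rng.binomial(1, theta, size=(5, 20))

print(f">theta:{theta}")
for i, row in enumerate(samples):
    print(f">x{i}:{row}")

# For a large sample, the empirical frequency of 1 approaches theta.
freq = rng.binomial(1, theta, size=100_000).mean()
```

With 100,000 draws the empirical frequency lies within about a percentage point of `theta`, which is the property the README asks you to confirm before generating `x`.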

README_jp.md

Lines changed: 6 additions & 5 deletions

@@ -50,11 +50,11 @@ gen_model.visualize_model()
 ```

 >theta:0.7
->x0:[1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0]
->x1:[1 0 1 0 1 0 0 0 0 1 1 1 1 1 1 1 1 0 1 1]
->x2:[1 0 1 0 1 1 1 1 1 1 1 0 0 0 0 1 1 0 1 1]
->x3:[0 1 0 1 1 1 1 0 0 0 1 0 0 1 1 1 0 0 1 0]
->x4:[1 1 1 0 1 1 0 1 1 1 1 1 1 1 1 1 0 1 1 0]
+>x0:[1 1 1 1 1 0 1 0 0 1 1 1 1 0 1 1 0 1 1 1]
+>x1:[1 0 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1]
+>x2:[0 0 1 1 0 1 0 1 1 1 1 1 1 0 1 0 1 1 1 1]
+>x3:[1 0 1 1 1 1 1 0 0 0 1 0 0 1 0 1 1 0 1 0]
+>x4:[1 1 0 1 0 1 1 1 0 1 1 1 0 0 1 1 1 1 1 1]
 >![bernoulli_example1](./doc/images/README_ex_img1.png)

 After confirming that the frequency of occurrence of 1 is around `theta=0.7`, we generate a sample and store it in the variable `x`.
@@ -105,6 +105,7 @@ print(learn_model.estimate_params(loss='0-1'))
 The following packages are currently available. In this library, a probabilistic data generative model, prior distribution, posterior distribution (or approximate posterior distribution), and predictive distribution (or approximate predictive distribution) are collectively called a model.

 * Bernoulli model
+* Categorical model
 * Poisson model
 * Normal model
 * Multivariate normal model

bayesml/autoregressive/__init__.py

Lines changed: 38 additions & 16 deletions

@@ -1,5 +1,6 @@
 # Document Author
 # Yuta Nakahara <yuta.nakahara@aoni.waseda.jp>
+# Koki Kazama <kokikazama@aoni.waseda.jp>
 r"""
 The linear autoregressive model with the normal-gamma prior distribution.

@@ -12,49 +13,66 @@
 * :math:`\boldsymbol{\theta} \in \mathbb{R}^{d+1}`: a regression coefficient parameter
 * :math:`\tau \in \mathbb{R}_{>0}`: a precision parameter of noise

-.. math::
-    \mathcal{N}(x_n|\boldsymbol{\theta}^\top \boldsymbol{x}'_{n-1}, \tau^{-1})
-    = \sqrt{\frac{\tau}{2 \pi}} \exp \left\{ -\frac{\tau}{2} (x_n - \boldsymbol{\theta}^\top \boldsymbol{x}'_{n-1})^2 \right\}.
+.. math::
+    p(x_n | \boldsymbol{x}'_{n-1}, \boldsymbol{\theta}, \tau) &= \mathcal{N}(x_n|\boldsymbol{\theta}^\top \boldsymbol{x}'_{n-1}, \tau^{-1}) \\
+    &= \sqrt{\frac{\tau}{2 \pi}} \exp \left\{ -\frac{\tau}{2} (x_n - \boldsymbol{\theta}^\top \boldsymbol{x}'_{n-1})^2 \right\}.
+
+.. math::
+    &\mathbb{E}[ x_n | \boldsymbol{x}'_{n-1}, \boldsymbol{\theta}, \tau ] = \boldsymbol{\theta}^{\top} \boldsymbol{x}'_{n-1}, \\
+    &\mathbb{V}[ x_n | \boldsymbol{x}'_{n-1}, \boldsymbol{\theta}, \tau ] = \tau^{-1}.

 The prior distribution is as follows:

 * :math:`\boldsymbol{\mu}_0 \in \mathbb{R}^{d+1}`: a hyperparameter for :math:`\boldsymbol{\theta}`
 * :math:`\boldsymbol{\Lambda}_0 \in \mathbb{R}^{(d+1) \times (d+1)}`: a hyperparameter for :math:`\boldsymbol{\theta}` (a positive definite matrix)
 * :math:`| \boldsymbol{\Lambda}_0 | \in \mathbb{R}`: the determinant of :math:`\boldsymbol{\Lambda}_0`
-* :math:`a_0 \in \mathbb{R}_{>0}`: a hyperparameter for :math:`\tau`
-* :math:`b_0 \in \mathbb{R}_{>0}`: a hyperparameter for :math:`\tau`
+* :math:`\alpha_0 \in \mathbb{R}_{>0}`: a hyperparameter for :math:`\tau`
+* :math:`\beta_0 \in \mathbb{R}_{>0}`: a hyperparameter for :math:`\tau`
 * :math:`\Gamma(\cdot): \mathbb{R}_{>0} \to \mathbb{R}`: the Gamma function

 .. math::
-    &\mathcal{N}(\boldsymbol{\theta}|\boldsymbol{\mu}_0, (\tau \boldsymbol{\Lambda}_0)^{-1}) \text{Gam}(\tau|a_0,b_0)\\
+    p(\boldsymbol{\theta}, \tau) &= \mathcal{N}(\boldsymbol{\theta}|\boldsymbol{\mu}_0, (\tau \boldsymbol{\Lambda}_0)^{-1}) \mathrm{Gam}(\tau|\alpha_0,\beta_0)\\
     &= \frac{|\tau \boldsymbol{\Lambda}_0|^{1/2}}{(2 \pi)^{(d+1)/2}}
     \exp \left\{ -\frac{\tau}{2} (\boldsymbol{\theta} - \boldsymbol{\mu}_0)^\top
     \boldsymbol{\Lambda}_0 (\boldsymbol{\theta} - \boldsymbol{\mu}_0) \right\}
-    \frac{b_0^{a_0}}{\Gamma (a_0)} \tau^{a_0 - 1} \exp \{ -b_0 \tau \} .
+    \frac{\beta_0^{\alpha_0}}{\Gamma (\alpha_0)} \tau^{\alpha_0 - 1} \exp \{ -\beta_0 \tau \} .
+
+.. math::
+    \mathbb{E}[\boldsymbol{\theta}] &= \boldsymbol{\mu}_0 & \left( \alpha_0 > \frac{1}{2} \right), \\
+    \mathrm{Cov}[\boldsymbol{\theta}] &= \frac{\beta_0}{\alpha_0 - 1} \boldsymbol{\Lambda}_0^{-1} & (\alpha_0 > 1), \\
+    \mathbb{E}[\tau] &= \frac{\alpha_0}{\beta_0}, \\
+    \mathbb{V}[\tau] &= \frac{\alpha_0}{\beta_0^2}.

 The posterior distribution is as follows:

 * :math:`x^n := [x_1, x_2, \dots , x_n]^\top \in \mathbb{R}^n`: given data
 * :math:`\boldsymbol{X}_n = [\boldsymbol{x}'_1, \boldsymbol{x}'_2, \dots , \boldsymbol{x}'_n]^\top \in \mathbb{R}^{n \times (d+1)}`
 * :math:`\boldsymbol{\mu}_n \in \mathbb{R}^{d+1}`: a hyperparameter for :math:`\boldsymbol{\theta}`
 * :math:`\boldsymbol{\Lambda}_n \in \mathbb{R}^{(d+1) \times (d+1)}`: a hyperparameter for :math:`\boldsymbol{\theta}` (a positive definite matrix)
-* :math:`a_n \in \mathbb{R}_{>0}`: a hyperparameter for :math:`\tau`
-* :math:`b_n \in \mathbb{R}_{>0}`: a hyperparameter for :math:`\tau`
+* :math:`\alpha_n \in \mathbb{R}_{>0}`: a hyperparameter for :math:`\tau`
+* :math:`\beta_n \in \mathbb{R}_{>0}`: a hyperparameter for :math:`\tau`

 .. math::
-    &\mathcal{N}(\boldsymbol{\theta}|\boldsymbol{\mu}_n, (\tau \boldsymbol{\Lambda}_n)^{-1}) \text{Gam}(\tau|a_n,b_n)\\
+    p(\boldsymbol{\theta}, \tau | x^n) &= \mathcal{N}(\boldsymbol{\theta}|\boldsymbol{\mu}_n, (\tau \boldsymbol{\Lambda}_n)^{-1}) \mathrm{Gam}(\tau|\alpha_n,\beta_n)\\
     &= \frac{|\tau \boldsymbol{\Lambda}_n|^{1/2}}{(2 \pi)^{(d+1)/2}}
     \exp \left\{ -\frac{\tau}{2} (\boldsymbol{\theta} - \boldsymbol{\mu}_n)^\top
     \boldsymbol{\Lambda}_n (\boldsymbol{\theta} - \boldsymbol{\mu}_n) \right\}
-    \frac{b_n^{a_n}}{\Gamma (a_n)} \tau^{a_n - 1} \exp \{ -b_n \tau \} .
+    \frac{\beta_n^{\alpha_n}}{\Gamma (\alpha_n)} \tau^{\alpha_n - 1} \exp \{ -\beta_n \tau \} .
+
+.. math::
+    \mathbb{E}[\boldsymbol{\theta} | x^n] &= \boldsymbol{\mu}_n & \left( \alpha_n > \frac{1}{2} \right), \\
+    \mathrm{Cov}[\boldsymbol{\theta} | x^n] &= \frac{\beta_n}{\alpha_n - 1} \boldsymbol{\Lambda}_n^{-1} & (\alpha_n > 1), \\
+    \mathbb{E}[\tau | x^n] &= \frac{\alpha_n}{\beta_n}, \\
+    \mathbb{V}[\tau | x^n] &= \frac{\alpha_n}{\beta_n^2},

 where the updating rules of the hyperparameters are

 .. math::
     \boldsymbol{\Lambda}_n &= \boldsymbol{\Lambda}_0 + \boldsymbol{X}_n^\top \boldsymbol{X}_n,\\
     \boldsymbol{\mu}_n &= \boldsymbol{\Lambda}_n^{-1} (\boldsymbol{\Lambda}_0 \boldsymbol{\mu}_0 + \boldsymbol{X}_n^\top x^n),\\
-    a_n &= a_0 + \frac{n}{2},\\
-    b_n &= b_0 + \frac{1}{2} \left( -\boldsymbol{\mu}_n^\top \boldsymbol{\Lambda}_n \boldsymbol{\mu}_n
+    \alpha_n &= \alpha_0 + \frac{n}{2},\\
+    \beta_n &= \beta_0 + \frac{1}{2} \left( -\boldsymbol{\mu}_n^\top \boldsymbol{\Lambda}_n \boldsymbol{\mu}_n
     + (x^n)^\top x^n + \boldsymbol{\mu}_0^\top \boldsymbol{\Lambda}_0 \boldsymbol{\mu}_0 \right).

 The predictive distribution is as follows:
@@ -65,17 +83,21 @@
 * :math:`\nu_\mathrm{p} \in \mathbb{R}_{>0}`: a parameter

 .. math::
-    \text{St}(x_{n+1}|m_\mathrm{p}, \lambda_\mathrm{p}, \nu_\mathrm{p})
+    \mathrm{St}(x_{n+1}|m_\mathrm{p}, \lambda_\mathrm{p}, \nu_\mathrm{p})
     = \frac{\Gamma (\nu_\mathrm{p}/2 + 1/2)}{\Gamma (\nu_\mathrm{p}/2)}
     \left( \frac{\lambda_\mathrm{p}}{\pi \nu_\mathrm{p}} \right)^{1/2}
     \left[ 1 + \frac{\lambda_\mathrm{p}(x_{n+1}-m_\mathrm{p})^2}{\nu_\mathrm{p}} \right]^{-\nu_\mathrm{p}/2 - 1/2}.
+
+.. math::
+    \mathbb{E}[x_{n+1} | x^n] &= m_\mathrm{p} & (\nu_\mathrm{p} > 1), \\
+    \mathbb{V}[x_{n+1} | x^n] &= \frac{1}{\lambda_\mathrm{p}} \frac{\nu_\mathrm{p}}{\nu_\mathrm{p}-2} & (\nu_\mathrm{p} > 2),

 where the parameters are obtained from the hyperparameters of the posterior distribution as follows.

 .. math::
     m_\mathrm{p} &= \boldsymbol{\mu}_n^\top \boldsymbol{x}'_n,\\
-    \lambda_\mathrm{p} &= \frac{a_n}{b_n} (1 + (\boldsymbol{x}'_n)^\top \boldsymbol{\Lambda}_n^{-1} \boldsymbol{x}'_n)^{-1},\\
-    \nu_\mathrm{p} &= 2 a_n.
+    \lambda_\mathrm{p} &= \frac{\alpha_n}{\beta_n} (1 + (\boldsymbol{x}'_n)^\top \boldsymbol{\Lambda}_n^{-1} \boldsymbol{x}'_n)^{-1},\\
+    \nu_\mathrm{p} &= 2 \alpha_n.
 """

 from ._autoregressive import GenModel
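The updating rules and predictive parameters in this docstring translate directly into a few lines of numpy. The sketch below implements those formulas as written; it is not BayesML's internal implementation, and the function names and example data are illustrative assumptions:

```python
import numpy as np

def posterior_update(x, X, mu0, Lambda0, alpha0, beta0):
    """Normal-gamma conjugate update for the AR model.

    x : (n,) observations x^n;  X : (n, d+1) matrix whose rows are x'_i.
    Returns mu_n, Lambda_n, alpha_n, beta_n per the docstring's rules.
    """
    n = x.shape[0]
    Lambda_n = Lambda0 + X.T @ X
    # mu_n = Lambda_n^{-1} (Lambda_0 mu_0 + X_n^T x^n), via a linear solve
    mu_n = np.linalg.solve(Lambda_n, Lambda0 @ mu0 + X.T @ x)
    alpha_n = alpha0 + n / 2
    beta_n = beta0 + 0.5 * (-mu_n @ Lambda_n @ mu_n + x @ x + mu0 @ Lambda0 @ mu0)
    return mu_n, Lambda_n, alpha_n, beta_n

def predictive_params(x_new, mu_n, Lambda_n, alpha_n, beta_n):
    """Student-t predictive parameters (m_p, lambda_p, nu_p) for regressor x_new."""
    m_p = mu_n @ x_new
    lam_p = (alpha_n / beta_n) / (1.0 + x_new @ np.linalg.solve(Lambda_n, x_new))
    nu_p = 2.0 * alpha_n
    return m_p, lam_p, nu_p

# Toy usage: d = 1, so theta is 2-dimensional (coefficient plus intercept).
mu0, Lambda0 = np.zeros(2), np.eye(2)
X = np.array([[1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 1.0]])
x = np.array([1.0, 2.0, 3.0, 4.0])
mu_n, Lambda_n, alpha_n, beta_n = posterior_update(x, X, mu0, Lambda0, 2.0, 1.0)
m_p, lam_p, nu_p = predictive_params(np.array([5.0, 1.0]), mu_n, Lambda_n, alpha_n, beta_n)
```

Using `np.linalg.solve` instead of explicitly forming `Lambda_n^{-1}` is the standard numerically stable choice for both the `mu_n` update and the quadratic form in `lambda_p`.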

bayesml/bernoulli/__init__.py

Lines changed: 30 additions & 13 deletions

@@ -1,54 +1,71 @@
 # Document Author
 # Yuta Nakahara <yuta.nakahara@aoni.waseda.jp>
+# Koki Kazama <kokikazama@aoni.waseda.jp>
 r"""
 The Bernoulli distribution with the beta prior distribution.

 The stochastic data generative model is as follows:

 * :math:`x \in \{ 0, 1\}`: a data point
-* :math:`p \in [0, 1]`: a parameter
+* :math:`\theta \in [0, 1]`: a parameter

-.. math:: \text{Bern}(x|p) = p^x (1-p)^{1-x}.
+.. math::
+    p(x | \theta) = \mathrm{Bern}(x|\theta) = \theta^x (1-\theta)^{1-x}.
+
+.. math::
+    \mathbb{E}[x | \theta] &= \theta, \\
+    \mathbb{V}[x | \theta] &= \theta (1 - \theta).

 The prior distribution is as follows:

 * :math:`\alpha_0 \in \mathbb{R}_{>0}`: a hyperparameter
 * :math:`\beta_0 \in \mathbb{R}_{>0}`: a hyperparameter
 * :math:`B(\cdot,\cdot): \mathbb{R}_{>0} \times \mathbb{R}_{>0} \to \mathbb{R}_{>0}`: the Beta function

-.. math:: \text{Beta}(p|\alpha_0,\beta_0) = \frac{1}{B(\alpha_0, \beta_0)} p^{\alpha_0} (1-p)^{\beta_0}.
+.. math::
+    p(\theta) = \mathrm{Beta}(\theta|\alpha_0,\beta_0) = \frac{1}{B(\alpha_0, \beta_0)} \theta^{\alpha_0 - 1} (1-\theta)^{\beta_0 - 1}.
+
+.. math::
+    \mathbb{E}[\theta] &= \frac{\alpha_0}{\alpha_0 + \beta_0}, \\
+    \mathbb{V}[\theta] &= \frac{\alpha_0 \beta_0}{(\alpha_0 + \beta_0)^2 (\alpha_0 + \beta_0 + 1)}.

 The posterior distribution is as follows:

 * :math:`x^n = (x_1, x_2, \dots , x_n) \in \{ 0, 1\}^n`: given data
 * :math:`\alpha_n \in \mathbb{R}_{>0}`: a hyperparameter
 * :math:`\beta_n \in \mathbb{R}_{>0}`: a hyperparameter

-.. math:: \text{Beta}(p|\alpha_n,\beta_n) = \frac{1}{B(\alpha_n, \beta_n)} p^{\alpha_n} (1-p)^{\beta_n},
+.. math::
+    p(\theta | x^n) = \mathrm{Beta}(\theta|\alpha_n,\beta_n) = \frac{1}{B(\alpha_n, \beta_n)} \theta^{\alpha_n - 1} (1-\theta)^{\beta_n - 1},
+
+.. math::
+    \mathbb{E}[\theta | x^n] &= \frac{\alpha_n}{\alpha_n + \beta_n}, \\
+    \mathbb{V}[\theta | x^n] &= \frac{\alpha_n \beta_n}{(\alpha_n + \beta_n)^2 (\alpha_n + \beta_n + 1)},

 where the updating rule of the hyperparameters is

 .. math::
     \alpha_n = \alpha_0 + \sum_{i=1}^n I \{ x_i = 1 \},\\
-    \beta_n = \beta_0 + \sum_{i=1}^n I \{ x_i = 0 \}.
+    \beta_n = \beta_0 + \sum_{i=1}^n I \{ x_i = 0 \}.

 The predictive distribution is as follows:

-* :math:`x \in \{ 0, 1\}`: a new data point
+* :math:`x_{n+1} \in \{ 0, 1\}`: a new data point
 * :math:`\alpha_\mathrm{p} \in \mathbb{R}_{>0}`: a parameter
 * :math:`\beta_\mathrm{p} \in \mathbb{R}_{>0}`: a parameter
+* :math:`\theta_\mathrm{p} \in [0,1]`: a parameter
+
+.. math::
+    p(x_{n+1} | x^n) = \mathrm{Bern}(x_{n+1}|\theta_\mathrm{p}) = \theta_\mathrm{p}^{x_{n+1}} (1-\theta_\mathrm{p})^{1-x_{n+1}},

 .. math::
-    p(x|\alpha_\mathrm{p}, \beta_\mathrm{p}) = \begin{cases}
-    \frac{\alpha_\mathrm{p}}{\alpha_\mathrm{p} + \beta_\mathrm{p}} & x = 1,\\
-    \frac{\beta_\mathrm{p}}{\alpha_\mathrm{p} + \beta_\mathrm{p}} & x = 0,
-    \end{cases}
+    \mathbb{E}[x_{n+1} | x^n] &= \theta_\mathrm{p}, \\
+    \mathbb{V}[x_{n+1} | x^n] &= \theta_\mathrm{p} (1 - \theta_\mathrm{p}),

-where the parameters are abtained from the hyperparameters of the posterior distribution as follows.
+where the parameters are obtained from the hyperparameters of the posterior distribution as follows.

 .. math::
-    \alpha_\mathrm{p} &= \alpha_n,\\
-    \beta_\mathrm{p} &= \beta_n
+    \theta_\mathrm{p} = \frac{\alpha_n}{\alpha_n + \beta_n}.
 """

 from ._bernoulli import GenModel
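The posterior updating rule and the predictive parameter in this docstring amount to counting ones and zeros. A minimal numpy sketch of those formulas (illustrative function names, not BayesML's internal code):

```python
import numpy as np

def posterior_hyperparams(x, alpha0, beta0):
    """Beta-Bernoulli conjugate update.

    alpha_n = alpha_0 + #{x_i = 1},  beta_n = beta_0 + #{x_i = 0},
    exactly the updating rule in the docstring above.
    """
    x = np.asarray(x)
    alpha_n = alpha0 + np.sum(x == 1)
    beta_n = beta0 + np.sum(x == 0)
    return alpha_n, beta_n

def predictive_theta(alpha_n, beta_n):
    """Parameter theta_p of the Bernoulli predictive distribution."""
    return alpha_n / (alpha_n + beta_n)

# Toy usage with a Jeffreys-style prior alpha_0 = beta_0 = 0.5.
a_n, b_n = posterior_hyperparams([1, 1, 0, 1], 0.5, 0.5)
theta_p = predictive_theta(a_n, b_n)
```

Note that `theta_p` equals the posterior mean of `theta`, which is why the commit can replace the two-case predictive pmf with a single `Bern(x_{n+1}|theta_p)` expression.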
