+ "markdown": "# Fundamentals of Bayesian Modeling in Julia\n\n\n\n\n## Brief Intro to Julia and Turing\n\nGoal is to teach just enough so that the reader understands the code. \nWe won't be discussing things like plotting (as it highly depends on the package used).\n\n### Installing Julia and Packages\n\nTODO.\n\n\n### Julia Basics\n\n::: {.callout-important}\n\n### Notable Differences with Python and R\n\nThese are the most common sources of confusion and errors for newcomers to Julia:\n\n- **1-indexing**: Similarly to R, Julia uses 1-based indexing, which means that the first element of a vector is `x[1]` (not `x[0]` as in Python).\n- **Positional; Keyword arguments**: Julia functions makes a clear distinction between positional and keyword arguments, and both are often separated by `;`. Positional arguments are typically passed without a name, while keyword arguments must be named (e.g., `scatter(0, 0; color=:red)`). Some functions might look like `somefunction(; arg1=val1, arg2=val2)`.\n- **Symbols**: Some arguments are prefixed with `:` (e.g., `:red` in `scatter(0, 0; color=:red)`). These *symbols* are like character strings that are not manipulable (there are more efficient).\n- **Explicit vectorization**: Julia does not vectorize operations by default. You need to use a dot `.` in front of functions and operators to have it apply element by element. For example, `sin.([0, 1, 2])` will apply the `sin()` function to each element of its vector.\n- **In-place operations**: Julia has a strong emphasis on performance, and in-place operations are often used to avoid unnecessary memory allocations. When functions modify their input \"in-place\" (without returns), a band `!` is used. For example, assuming `x = [0]` (1-element vector containing 0), `push!(x, 2)` will modify `x` in place (it is equivalent to `x = push(x, 2)`).\n- **Macros**: Some functions start with `@`. These are called macros and are used to manipulate the code before it is run. 
For example, `@time` will measure the time it takes to run the code that follows.\n- **Unicode**: Julia is a modern language that supports Unicode characters, which are widely used for mathematical notation. You can get the *mu* `μ` character by typing `\mu` and pressing `TAB`.\n:::\n\n\n### Generate Data from Normal Distribution\n\n::: {#1f15e3d4 .cell execution_count=1}\n``` {.julia .cell-code code-fold=\"false\"}\nusing Turing, Distributions, Random\nusing Makie\n\n# Random sample from a Normal(μ=100, σ=15)\niq = rand(Normal(100, 15), 500)\n```\n:::\n\n\n::: {#e3dbae1f .cell execution_count=2}\n``` {.julia .cell-code}\nfig = Figure()\nax = Axis(fig[1, 1], title=\"Distribution\")\ndensity!(ax, iq)\nfig\n```\n\n::: {.cell-output .cell-output-display execution_count=3}\n{}\n:::\n:::\n\n\n### Recover Distribution Parameters with Turing\n\n::: {#e37369d6 .cell execution_count=3}\n``` {.julia .cell-code code-fold=\"false\"}\n@model function model_gaussian(x)\n    # Priors\n    μ ~ Uniform(0, 200)\n    σ ~ Uniform(0, 30)\n\n    # Check against each datapoint\n    for i in 1:length(x)\n        x[i] ~ Normal(μ, σ)\n    end\nend\n\nfit_gaussian = model_gaussian(iq)\nchain_gaussian = sample(fit_gaussian, NUTS(), 400)\n```\n:::\n\n\nInspecting the chain variable will show various posterior statistics (including the mean, standard deviation, and diagnostic indices).\n\n::: {#31ba55af .cell execution_count=4}\n``` {.julia .cell-code code-fold=\"false\"}\nchain_gaussian\n```\n\n::: {.cell-output .cell-output-display execution_count=5}\n\n::: {.ansi-escaped-output}\n```{=html}\n<pre>Chains MCMC chain (400×14×1 Array{Float64, 3}):\nIterations = 201:1:600\nNumber of chains = 1\nSamples per chain = 400\nWall duration = 8.8 seconds\nCompute duration = 8.8 seconds\nparameters = μ, σ\ninternals = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size\nSummary Statistics\n <span class=\"ansi-bold\"> parameters </span> <span class=\"ansi-bold\"> mean </span> <span class=\"ansi-bold\"> std </span> <span class=\"ansi-bold\"> mcse </span> <span class=\"ansi-bold\"> ess_bulk </span> <span class=\"ansi-bold\"> ess_tail </span> <span class=\"ansi-bold\"> rhat </span> <span class=\"ansi-bold\"> e</span> ⋯\n <span class=\"ansi-bright-black-fg\"> Symbol </span> <span class=\"ansi-bright-black-fg\"> Float64 </span> <span class=\"ansi-bright-black-fg\"> Float64 </span> <span class=\"ansi-bright-black-fg\"> Float64 </span> <span 
class=\"ansi-bright-black-fg\"> Float64 </span> <span class=\"ansi-bright-black-fg\"> Float64 </span> <span class=\"ansi-bright-black-fg\"> Float64 </span> <span class=\"ansi-bright-black-fg\"> </span> ⋯\n μ 99.2403 0.6727 0.0333 414.3604 324.9996 0.9993 ⋯\n σ 14.4973 0.4440 0.0187 561.5709 284.5407 0.9976 ⋯\n<span class=\"ansi-cyan-fg\"> 1 column omitted</span>\nQuantiles\n <span class=\"ansi-bold\"> parameters </span> <span class=\"ansi-bold\"> 2.5% </span> <span class=\"ansi-bold\"> 25.0% </span> <span class=\"ansi-bold\"> 50.0% </span> <span class=\"ansi-bold\"> 75.0% </span> <span class=\"ansi-bold\"> 97.5% </span>\n <span class=\"ansi-bright-black-fg\"> Symbol </span> <span class=\"ansi-bright-black-fg\"> Float64 </span> <span class=\"ansi-bright-black-fg\"> Float64 </span> <span class=\"ansi-bright-black-fg\"> Float64 </span> <span class=\"ansi-bright-black-fg\"> Float64 </span> <span class=\"ansi-bright-black-fg\"> Float64 </span>\n μ 97.9096 98.7663 99.2552 99.7769 100.4228\n σ 13.6853 14.1811 14.5066 14.7917 15.3761\n</pre>\n```\n:::\n\n:::\n:::\n\n\nFor the purpose of this book, we will mostly focus on the 95% Credible Interval (CI), and we will assume that a parameter is ***\"significant\"*** if its CI does not include 0.\n\n::: {#45fb1631 .cell execution_count=5}\n``` {.julia .cell-code code-fold=\"false\"}\n# Summary (95% CI)\nhpd(chain_gaussian)\n```\n\n::: {.cell-output .cell-output-display execution_count=6}\n\n::: {.ansi-escaped-output}\n```{=html}\n<pre>HPD\n <span class=\"ansi-bold\"> parameters </span> <span class=\"ansi-bold\"> lower </span> <span class=\"ansi-bold\"> upper </span>\n <span class=\"ansi-bright-black-fg\"> Symbol </span> <span class=\"ansi-bright-black-fg\"> Float64 </span> <span class=\"ansi-bright-black-fg\"> Float64 </span>\n μ 97.8594 100.3178\n σ 13.5687 15.2885\n</pre>\n```\n:::\n\n:::\n:::\n\n\n## Linear Models\n\nUnderstand what the parameters mean (intercept, slopes, sigma).\n\n## Boostrapping\n\nIntroduce concepts 
related to pseudo-posterior distribution description.\n\n## Hierarchical Models\n\nSimpson's paradox, random effects, and how to leverage them to model interindividual differences.\n\n## Bayesian Estimation\n\nIntroduce Bayesian estimation and priors over parameters.\n\n## Bayesian Mixed Linear Regression\n\nPut everything together.\n\n",
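The idioms listed in the "Julia Basics" callout can be made concrete with a short, self-contained sketch. All names and values below are illustrative choices of ours, not code from the chapter:

```julia
# Illustrative demo of the idioms from the "Julia Basics" callout.

x = [10, 20, 30]
@assert x[1] == 10            # 1-based indexing: the first element is x[1]

f(a; scale=1) = a * scale     # positional argument `a`, keyword argument `scale`
@assert f(2; scale=3) == 6    # keyword arguments must be named at the call site

color = :red                  # a Symbol: like an immutable, efficient string
@assert color isa Symbol

y = sin.([0.0, 1.0, 2.0])     # explicit vectorization: the dot broadcasts sin()
@assert y[1] == 0.0

v = [0]
push!(v, 2)                   # the bang (!) signals in-place modification of v
@assert v == [0, 2]

μ = 100.0                     # Unicode identifier: type \mu then press TAB
@time sum(rand(10))           # @time is a macro: it rewrites code before running it
```

Running the block silently passes every `@assert`, which is a convenient way to check each idiom behaves as described.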
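The "Linear Models" section above is still an outline. As a placeholder illustration of what its parameters mean (intercept, slope, sigma), here is a minimal frequentist sketch on simulated data, fit with ordinary least squares via the backslash operator; the variable names and true values are our own assumptions, not the chapter's eventual Turing code:

```julia
using Random, Statistics

Random.seed!(123)

# Simulate data from a known linear model: y = intercept + slope * x + noise
n = 1_000
true_intercept, true_slope, true_sigma = 2.0, 0.5, 1.0
x = randn(n)
y = true_intercept .+ true_slope .* x .+ true_sigma .* randn(n)

# Ordinary least squares via the backslash operator
X = [ones(n) x]              # design matrix: a column of 1s (intercept) and x (slope)
β = X \ y                    # β[1] estimates the intercept, β[2] the slope
sigma_hat = std(y .- X * β)  # the residual standard deviation estimates sigma
```

With `n = 1_000` observations, `β[1]`, `β[2]`, and `sigma_hat` land close to the true values (2.0, 0.5, 1.0); the Bayesian version of this model would place priors on these same three parameters.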