* Solved Meta main #524
* Restore the 'probability-interface' line as per feedback in PR review #530 (comment)
* Fixes #524: Add Quarto Meta Variables
- Added `<meta doc-base-url>` to replace `..` or `../..` usage across tutorials.
- As per feedback from @penelopeysm, added an anchor to the specific part of the docs for the tutorial.
- Included the `probability-interface` tutorial in the context.
- Ensured no unnecessary whitespace changes to keep the pull request clean and focused.
* Fixes #524: Added Necessary Quarto Meta Variables
- Implemented all required Quarto meta variables, as suggested by @yebai.
- These changes include the addition of all necessary meta variables identified up to this point.
- Future adjustments can be made as needed to accommodate any further requirements.
* Fix docs base link
* Remove trailing slashes, add prob interface variable
* Re-add site-url variable
* Use doc-base-url throughout
---------
Co-authored-by: Penelope Yong <[email protected]>
-The `sample()` call above assumes that you have at least `nchains` threads available in your Julia instance. If you do not, the multiple chains
+The `sample()` call above assumes that you have at least `nchains` threads available in your Julia instance. If you do not, the multiple chains
will run sequentially, and you may notice a warning. For more information, see [the Turing documentation on sampling multiple chains.](https://turinglang.org/dev/docs/using-turing/guide/#sampling-multiple-chains)
:::
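For context, here is a self-contained sketch of the threaded sampling call this callout refers to (toy model and arbitrary draw counts, not taken from the tutorial):

```julia
using Turing

# A throwaway model so the call below is runnable on its own.
@model function demo()
    x ~ Normal(0, 1)
end

nchains = 4
# MCMCThreads() distributes the chains across Julia threads; start Julia with
# e.g. `julia --threads 4` so at least `nchains` threads are available,
# otherwise the chains fall back to running sequentially with a warning.
chains = sample(demo(), NUTS(), MCMCThreads(), 1_000, nchains)
```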
@@ -161,7 +161,7 @@ It can happen that the modes of $\mu_1$ and $\mu_2$ switch between chains.
For more information see the [Stan documentation](https://mc-stan.org/users/documentation/case-studies/identifying_mixture_models.html). This is because it's possible for either model parameter $\mu_k$ to be assigned to either of the corresponding true means, and this assignment need not be consistent between chains.

That is, the posterior is fundamentally multimodal, and different chains can end up in different modes, complicating inference.
-One solution here is to enforce an ordering on our $\mu$ vector, requiring $\mu_k > \mu_{k-1}$ for all $k$.
+One solution here is to enforce an ordering on our $\mu$ vector, requiring $\mu_k > \mu_{k-1}$ for all $k$.
`Bijectors.jl` [provides](https://turinglang.org/Bijectors.jl/dev/transforms/#Bijectors.OrderedBijector) an easy transformation (`ordered()`) for this purpose:
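Before the tutorial's own code block below, a rough standalone sketch of what such an ordered prior looks like (the dimension `K` and the `MvNormal` base distribution are assumptions, not the tutorial's exact choices):

```julia
using Bijectors, Distributions, LinearAlgebra

K = 2
# `ordered` wraps a multivariate prior so that its draws come out sorted in increasing order.
ordered_prior = ordered(MvNormal(zeros(K), I))
rand(ordered_prior)   # always an increasing vector, e.g. [-0.8, 0.4]

# Inside a Turing model, the prior on the component means would then read:
#     μ ~ ordered(MvNormal(zeros(K), I))
```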

```{julia}
@@ -255,7 +255,7 @@ scatter(

## Marginalizing Out The Assignments
-We can write out the marginal posterior of (continuous) $w, \mu$ by summing out the influence of our (discrete) assignments $z_i$ from
+We can write out the marginal posterior of (continuous) $w, \mu$ by summing out the influence of our (discrete) assignments $z_i$ from

-When possible, use of `Turing.@addlogprob!` should be avoided, as it exists outside the
+When possible, use of `Turing.@addlogprob!` should be avoided, as it exists outside the
usual structure of a Turing model. In most cases, a custom distribution should be used instead.

Here, the next section demonstrates the preferred method --- using the `MixtureModel` distribution we have seen already to
-perform the marginalization automatically.
+perform the marginalization automatically.
:::
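As a rough sketch of the "custom distribution" alternative the callout recommends (the type name, the two-component setup, and the unit component standard deviations are all assumptions, not the tutorial's code):

```julia
using Distributions, Random
using LogExpFunctions: logsumexp

# A small univariate distribution whose logpdf marginalises the discrete
# assignment z out of a Gaussian mixture via log-sum-exp.
struct GMMLikelihood{T<:Real} <: ContinuousUnivariateDistribution
    w::Vector{T}   # mixture weights
    μ::Vector{T}   # component means
end

Distributions.logpdf(d::GMMLikelihood, y::Real) =
    logsumexp(log.(d.w) .+ logpdf.(Normal.(d.μ, 1), y))

# Sampling draws an assignment first, then a value from that component.
Distributions.rand(rng::Random.AbstractRNG, d::GMMLikelihood) =
    rand(rng, Normal(d.μ[rand(rng, Categorical(d.w))], 1))

# In a model this is then used as an ordinary observation distribution,
# e.g. x[i] ~ GMMLikelihood(w, μ), with no call to Turing.@addlogprob! needed.
```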
@@ -312,8 +312,8 @@ perform the marginalization automatically.

We can use Turing's `~` syntax with anything that `Distributions.jl` provides `logpdf` and `rand` methods for. It turns out that the
`MixtureModel` distribution it provides has, as its `logpdf` method, `logpdf(MixtureModel([Component_Distributions], weight_vector), Y)`, where `Y` can be either a single observation or a vector of observations.
-In fact, `Distributions.jl` provides [many convenient constructors](https://juliastats.org/Distributions.jl/stable/mixture/) for mixture models, allowing further simplification in common special cases.
-
+In fact, `Distributions.jl` provides [many convenient constructors](https://juliastats.org/Distributions.jl/stable/mixture/) for mixture models, allowing further simplification in common special cases.
+
For example, when the mixture's component distributions are of the same type, one can write `~ MixtureModel(Normal, [(μ1, σ1), (μ2, σ2)], w)`; when the weight vector is known to allocate probability equally, it can be omitted.
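A quick illustration of these constructors outside a model (parameter values here are made up):

```julia
using Distributions

w = [0.3, 0.7]
m1 = MixtureModel([Normal(-3.5, 1.0), Normal(0.5, 1.0)], w)  # explicit components and weights
m2 = MixtureModel(Normal, [(-3.5, 1.0), (0.5, 1.0)], w)       # same-type shortcut
m3 = MixtureModel(Normal, [(-3.5, 1.0), (0.5, 1.0)])          # weights omitted: equal weights

logpdf(m2, 0.0)                # log-density of a single observation
loglikelihood(m2, [0.0, 1.0])  # joint log-density of a vector of observations
```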

The `logpdf` implementation for a `MixtureModel` distribution is exactly the marginalization defined above, and so our model becomes simply:
@@ -330,7 +330,7 @@ end
model = gmm_marginalized(x);
```

-As we've summed out the discrete components, we can perform inference using `NUTS()` alone.
+As we've summed out the discrete components, we can perform inference using `NUTS()` alone.

```{julia}
#| output: false
@@ -352,21 +352,21 @@ let
end
```

-`NUTS()` significantly outperforms our compositional Gibbs sampler, in large part because our model is now Rao-Blackwellized thanks to
+`NUTS()` significantly outperforms our compositional Gibbs sampler, in large part because our model is now Rao-Blackwellized thanks to
the marginalization of our assignment parameter.

```{julia}
plot(chains[["μ[1]", "μ[2]"]], legend=true)
```

## Inferred Assignments - Marginalized Model
-As we've summed over possible assignments, the associated parameter is no longer available in our chain.
+As we've summed over possible assignments, the associated parameter is no longer available in our chain.
This is not a problem, however, as given any fixed sample $(\mu, w)$, the assignment probability — $p(z_i \mid y_i)$ — can be recovered using Bayes rule:

-This quantity can be computed for every $p(z = z_i \mid y_i)$, resulting in a probability vector, which is then used to sample
+This quantity can be computed for every $p(z = z_i \mid y_i)$, resulting in a probability vector, which is then used to sample
posterior predictive assignments from a categorical distribution.
For details on the mathematics here, see [the Stan documentation on latent discrete parameters](https://mc-stan.org/docs/stan-users-guide/latent-discrete.html).
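For reference, the Bayes-rule step described above takes the standard mixture-model form (writing $f(\cdot \mid \mu_k)$ for the $k$-th component density; the notation is assumed here rather than lifted from the tutorial):

$$
p(z_i = k \mid y_i, w, \mu) = \frac{w_k \, f(y_i \mid \mu_k)}{\sum_{j=1}^{K} w_j \, f(y_i \mid \mu_j)}
$$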
tutorials/04-hidden-markov-model/index.qmd (+2, -2)
@@ -14,7 +14,7 @@ This tutorial illustrates training Bayesian [Hidden Markov Models](https://en.wi
In this tutorial, we assume there are $k$ discrete hidden states; the observations are continuous and normally distributed, centered around the hidden states. This assumption reduces the number of parameters to be estimated in the emission matrix.

-Let's load the libraries we'll need. We also set a random seed (for reproducibility) and the automatic differentiation backend to forward mode (more [here](https://turinglang.org/dev/docs/using-turing/autodiff) on why this is useful).
+Let's load the libraries we'll need. We also set a random seed (for reproducibility) and the automatic differentiation backend to forward mode (more [here]({{< meta doc-base-url >}}/{{< meta using-turing-autodiff >}}) on why this is useful).

```{julia}
# Load libraries.
@@ -125,7 +125,7 @@ We will use a combination of two samplers ([HMC](https://turinglang.org/dev/docs

In this case, we use HMC for `m` and `T`, representing the emission and transition matrices respectively. We use the Particle Gibbs sampler for `s`, the state sequence. You may wonder why we are not assigning `s` to the HMC sampler, and why we need compositional Gibbs sampling at all.

-The parameter `s` is not a continuous variable. It is a vector of **integers**, and thus Hamiltonian methods like HMC and [NUTS](https://turinglang.org/dev/docs/library/#Turing.Inference.NUTS) won't work correctly. Gibbs allows us to apply the right tools to the best effect. If you are a particularly advanced user interested in higher performance, you may benefit from setting up your Gibbs sampler to use [different automatic differentiation](https://turinglang.org/dev/docs/using-turing/autodiff#compositional-sampling-with-differing-ad-modes) backends for each parameter space.
+The parameter `s` is not a continuous variable. It is a vector of **integers**, and thus Hamiltonian methods like HMC and [NUTS](https://turinglang.org/dev/docs/library/#Turing.Inference.NUTS) won't work correctly. Gibbs allows us to apply the right tools to the best effect. If you are a particularly advanced user interested in higher performance, you may benefit from setting up your Gibbs sampler to use [different automatic differentiation]({{< meta doc-base-url >}}/{{< meta using-turing-autodiff >}}#compositional-sampling-with-differing-ad-modes) backends for each parameter space.
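A rough sketch of the compositional sampler described above (step size, leapfrog steps, particle count, and the model name are placeholders; recent Turing releases instead pair each sampler with its variables, e.g. `:s => PG(20)`, so check the current docs):

```julia
using Turing

# HMC updates the continuous emission and transition parameters `m` and `T`;
# Particle Gibbs updates the discrete state sequence `s`.
sampler = Gibbs(HMC(0.01, 10, :m, :T), PG(20, :s))

# chain = sample(hmm_model, sampler, 1_000)   # `hmm_model` stands in for the tutorial's model
```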
tutorials/05-linear-regression/index.qmd (+1, -1)
@@ -164,7 +164,7 @@ end

## Comparing to OLS

-A satisfactory test of our model is to evaluate how well it predicts. Importantly, we want to compare our model to existing tools like OLS. The code below uses the [GLM.jl]() package to generate a traditional OLS multiple regression model on the same data as our probabilistic model.
+A satisfactory test of our model is to evaluate how well it predicts. Importantly, we want to compare our model to existing tools like OLS. The code below uses the [GLM.jl](https://juliastats.org/GLM.jl/stable/) package to generate a traditional OLS multiple regression model on the same data as our probabilistic model.
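A small, self-contained illustration of the kind of OLS fit the text refers to (toy data; the tutorial uses its own dataset and column names):

```julia
using DataFrames, GLM

df = DataFrame(y = randn(100), x1 = randn(100), x2 = randn(100))
ols = lm(@formula(y ~ x1 + x2), df)  # ordinary least squares via GLM.jl
coef(ols)                            # point estimates to compare with posterior means
```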
tutorials/06-infinite-mixture-model/index.qmd (+2, -2)
@@ -81,7 +81,7 @@ x &\sim \mathrm{Normal}(\mu_z, \Sigma)
\end{align}
$$

-which resembles the model in the [Gaussian mixture model tutorial](https://turinglang.org/stable/tutorials/01-gaussian-mixture-model/) with a slightly different notation.
+which resembles the model in the [Gaussian mixture model tutorial]({{< meta doc-base-url >}}/{{< meta gaussian-mixture-model >}}) with a slightly different notation.
tutorials/08-multinomial-logistic-regression/index.qmd (+2, -2)
@@ -144,8 +144,8 @@ chain

::: {.callout-warning collapse="true"}
## Sampling With Multiple Threads
-The `sample()` call above assumes that you have at least `nchains` threads available in your Julia instance. If you do not, the multiple chains
-will run sequentially, and you may notice a warning. For more information, see [the Turing documentation on sampling multiple chains.](https://turinglang.org/dev/docs/using-turing/guide/#sampling-multiple-chains)
+The `sample()` call above assumes that you have at least `nchains` threads available in your Julia instance. If you do not, the multiple chains
+will run sequentially, and you may notice a warning. For more information, see [the Turing documentation on sampling multiple chains.]({{< meta doc-base-url >}}/{{< meta using-turing >}}#sampling-multiple-chains)
:::

Since we ran multiple chains, we may as well do a spot check to make sure each chain converges around similar points.
tutorials/09-variational-inference/index.qmd (+7, -7)
@@ -13,7 +13,7 @@ Pkg.instantiate();
In this post we'll have a look at what's known as **variational inference (VI)**, a family of _approximate_ Bayesian inference methods, and how to use it in Turing.jl as an alternative to other approaches such as MCMC. In particular, we will focus on one of the more standard VI methods called **Automatic Differentiation Variational Inference (ADVI)**.

Here we will focus on how to use VI in Turing and not much on the theory underlying VI.
-If you are interested in understanding the mathematics you can check out [our write-up](../../tutorials/docs-07-for-developers-variational-inference/) or any other resource online (there are a lot of great ones).
+If you are interested in understanding the mathematics you can check out [our write-up]({{< meta doc-base-url >}}/{{< meta using-turing-variational-inference >}}) or any other resource online (there are a lot of great ones).

Using VI in Turing.jl is very straightforward.
If `model` denotes a definition of a `Turing.Model`, performing VI is as simple as
@@ -26,7 +26,7 @@ q = vi(m, vi_alg) # perform VI on `m` using the VI method `vi_alg`, which retur

Thus it's no more work than standard MCMC sampling in Turing.
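As a rough illustration of that workflow (a toy model; the `ADVI(10, 1_000)` arguments are assumed to mean 10 Monte Carlo samples per gradient step and 1,000 optimisation steps, and are worth checking against the current API docs):

```julia
using Turing

@model function gdemo(x)
    μ ~ Normal(0, 1)
    x .~ Normal(μ, 1)
end

m = gdemo(randn(50))

advi = ADVI(10, 1_000)  # the VI algorithm
q = vi(m, advi)         # a variational approximation to the posterior
rand(q, 5)              # draw a few samples from the approximation
```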

-To get a bit more into what we can do with `vi`, we'll first have a look at a simple example and then we'll reproduce the [tutorial on Bayesian linear regression](../../tutorials/05-linear-regression/) using VI instead of MCMC. Finally we'll look at some of the different parameters of `vi` and how, for example, you can use your own custom variational family.
+To get a bit more into what we can do with `vi`, we'll first have a look at a simple example and then we'll reproduce the [tutorial on Bayesian linear regression]({{< meta doc-base-url >}}/{{< meta linear-regression >}}) using VI instead of MCMC. Finally we'll look at some of the different parameters of `vi` and how, for example, you can use your own custom variational family.

-This is simply a duplication of the tutorial on [Bayesian linear regression](../../tutorials/05-linear-regression/) (much of the code is directly lifted), but now with the addition of an approximate posterior obtained using `ADVI`.
+This is simply a duplication of the tutorial on [Bayesian linear regression]({{< meta doc-base-url >}}/{{< meta linear-regression >}}) (much of the code is directly lifted), but now with the addition of an approximate posterior obtained using `ADVI`.

As we'll see, there is really no additional work required to apply variational inference to a more complex `Model`.
So it seems like the "full" ADVI approach, i.e. no mean-field assumption, obtains the same modes as the mean-field approach but with greater uncertainty for some of the `coefficients`. This

```{julia}
-# Unfortunately, it seems like this has quite a high variance which is likely to be due to numerical instability,
-# so we consider a larger number of samples. If we get a couple of outliers due to numerical issues,
+# Unfortunately, it seems like this has quite a high variance which is likely to be due to numerical instability,
+# so we consider a larger number of samples. If we get a couple of outliers due to numerical issues,