tutorials/probabilistic-pca/index.qmd (15 additions, 10 deletions)
@@ -90,7 +90,7 @@ First, we load the dependencies used.
 
 ```{julia}
 using Turing
-using ReverseDiff
+using Mooncake
 using LinearAlgebra, FillArrays
 
 # Packages for visualization
@@ -108,7 +108,7 @@ You can install them via `using Pkg; Pkg.add("package_name")`.
 ::: {.callout-caution}
 ## Package usages:
 We use `DataFrames` for instantiating matrices, `LinearAlgebra` and `FillArrays` to perform matrix operations;
-`Turing` for model specification and MCMC sampling, `ReverseDiff` for setting the automatic differentiation backend when sampling.
+`Turing` for model specification and MCMC sampling, `Mooncake` for automatic differentiation when sampling.
 `StatsPlots` for visualising the resutls. `, Measures` for setting plot margin units.
 As all examples involve sampling, for reproducibility we set a fixed seed using the `Random` standard library.
 :::
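For context on the change above: with Mooncake, the AD backend is selected by passing an `adtype` to the sampler. The snippet below is only a hedged sketch of that pattern, not the tutorial's exact call; `model` is a stand-in for any instantiated Turing model, and `AutoMooncake` comes from ADTypes.jl (re-exported by recent Turing versions).

```julia
using Turing
import Mooncake  # load the AD backend so Turing can use it

# Sketch only: `model` is a placeholder for an instantiated Turing model.
# The `adtype` keyword tells NUTS to differentiate the log density with Mooncake.
chain = sample(model, NUTS(; adtype=AutoMooncake(; config=nothing)), 500)
```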
@@ -194,8 +194,9 @@ Specifically:
 
 Here we aim to perform MCMC sampling to infer the projection matrix $\mathbf{W}_{D \times k}$, the latent variable matrix $\mathbf{Z}_{k \times N}$, and the offsets $\boldsymbol{\mu}_{N \times 1}$.
 
-We run the inference using the NUTS sampler, of which the chain length is set to be 500, target accept ratio 0.65 and initial stepsize 0.1. By default, the NUTS sampler samples 1 chain.
-You are free to try [different samplers](https://turinglang.org/stable/docs/library/#samplers).
+We run the inference using the NUTS sampler.
+By default, `sample` samples a single chain (in this case with 500 samples).
+You can also use [different samplers]({{< meta usage-sampler-visualisation >}}) if you wish.
 
 ```{julia}
 #| output: false
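Since the new text notes that `sample` draws a single chain by default, a hedged sketch of drawing several chains is given below. `ppca` is the model instantiated in the next hunk; the thread-based parallelism and the chain count are assumptions for illustration, not part of the tutorial.

```julia
using Turing

# Sketch only: draw 4 chains of 500 samples each using thread-based parallelism.
# `ppca` is the probabilistic model instantiated further down in this diff.
chain_ppca_multi = sample(ppca, NUTS(), MCMCThreads(), 500, 4)
```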
@@ -205,17 +206,21 @@ setprogress!(false)
 ```{julia}
 k = 2 # k is the dimension of the projected space, i.e. the number of principal components/axes of choice
 ppca = pPCA(mat_exp', k) # instantiate the probabilistic model
-The samples are saved in the Chains struct `chain_ppca`, whose shape can be checked:
+The samples are saved in `chain_ppca`, which is an `MCMCChains.Chains` object.
+We can check its shape:
 
 ```{julia}
 size(chain_ppca) # (no. of iterations, no. of vars, no. of chains) = (500, 159, 1)
 ```
 
-The Chains struct `chain_ppca` also contains the sampling info such as r-hat, ess, mean estimates, etc.
-You can print it to check these quantities.
+Sampling statistics such as R-hat, ESS, mean estimates, and so on can also be obtained from this:
+
+```{julia}
+describe(chain_ppca)
+```
 
 #### Step 5: posterior predictive checks
 
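Beyond the `describe(chain_ppca)` call added above, a few other MCMCChains accessors can be useful. This is a hedged sketch; the parameter name `:W` assumes the projection matrix is called `W` in the model, as in the surrounding prose.

```julia
using MCMCChains: group, summarystats
using Statistics: mean

# Sketch only: other ways to inspect the chain produced above.
summarystats(chain_ppca)   # mean, std, MCSE, ESS and R-hat per parameter
mean(chain_ppca)           # posterior means only
group(chain_ppca, :W)      # restrict the chain to the entries of the projection matrix W
```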
@@ -280,7 +285,7 @@ Another way to put it: 2 dimensions is enough to capture the main structure of t
 A direct question arises from above practice is: how many principal components do we want to keep, in order to sufficiently represent the latent structure in the data?
 This is a very central question for all latent factor models, i.e. how many dimensions are needed to represent that data in the latent space.
 In the case of PCA, there exist a lot of heuristics to make that choice.
-For example, We can tune the number of principal components using empirical methods such as cross-validation based some criteria such as MSE between the posterior predicted (e.g. mean predictions) data matrix and the original data matrix or the percentage of variation explained [^3].
+For example, We can tune the number of principal components using empirical methods such as cross-validation based on some criteria such as MSE between the posterior predicted (e.g. mean predictions) data matrix and the original data matrix or the percentage of variation explained [^3].
 
 For p-PCA, this can be done in an elegant and principled way, using a technique called *Automatic Relevance Determination* (ARD).
 ARD can help pick the correct number of principal directions by regularizing the solution space using a parameterized, data-dependent prior distribution that effectively prunes away redundant or superfluous features [^4].
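To make the ARD idea concrete, here is a minimal sketch of such a prior written in Turing. It is not the tutorial's `pPCA_ARD` definition: the hyperparameters, the omission of the mean offset, and the use of the full data dimension as the number of candidate components are all assumptions made for illustration.

```julia
using Turing
using LinearAlgebra, FillArrays

# Sketch of an ARD-style prior for p-PCA (illustrative, not the tutorial's model).
# Each candidate component d has its own precision α[d]; when α[d] is pushed to a
# large value, the corresponding column of W shrinks towards zero and is pruned.
@model function pPCA_ARD_sketch(X::AbstractMatrix{<:Real})
    D, N = size(X)                              # features in rows, observations in columns
    α ~ filldist(Gamma(1.0, 1.0), D)            # one precision per candidate component
    σ ~ Gamma(1.0, 1.0)                         # observation noise scale
    W ~ arraydist([MvNormal(Zeros(D), I / α[d]) for d in 1:D])  # column d has variance 1/α[d]
    Z ~ filldist(MvNormal(Zeros(D), I), N)      # latent coordinates
    return X ~ arraydist([MvNormal(m, σ^2 * I) for m in eachcol(W * Z)])  # mean offset omitted
end
```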
@@ -315,7 +320,7 @@ We instantiate the model and ask Turing to sample from it using NUTS sampler. Th
 
 ```{julia}
 ppca_ARD = pPCA_ARD(mat_exp') # instantiate the probabilistic model
usage/mode-estimation/index.qmd (4 additions, 4 deletions)
@@ -71,14 +71,14 @@ The above are just two examples, Optimization.jl supports [many more](https://do
 We can also help the optimisation by giving it a starting point we know is close to the final solution, or by specifying an automatic differentiation method
-When providing values to arguments like `initial_params` the parameters are typically specified in the order in which they appear in the code of the model, so in this case first `s²` then `m`. More precisely it's the order returned by `Turing.Inference.getparams(model, Turing.VarInfo(model))`.
+When providing values to arguments like `initial_params` the parameters are typically specified in the order in which they appear in the code of the model, so in this case first `s²` then `m`. More precisely it's the order returned by `Turing.Inference.getparams(model, DynamicPPL.VarInfo(model))`.
 
 We can also do constrained optimisation, by providing either intervals within which the parameters must stay, or costraint functions that they need to respect. For instance, here's how one can find the MLE with the constraint that the variance must be less than 0.01 and the mean must be between -1 and 1.:
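Tying back to the `initial_params` note above, the sketch below shows how one might check the parameter ordering and supply a starting point. It is hedged: `model` stands for the model defined earlier on that page (not shown in this diff), and the numeric values are placeholders.

```julia
using Turing
import DynamicPPL

# Sketch only: `model` is a placeholder for the model discussed on the page,
# whose parameters are s² and m.
Turing.Inference.getparams(model, DynamicPPL.VarInfo(model))  # shows the order: s², then m

# Supply a starting point in that same order (placeholder values):
mle_estimate = maximum_likelihood(model; initial_params=[1.0, 0.0])
```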