Multivariate Structural Statespace Components #529

jessegrabowski · 2025-06-25T14:23:55Z

This PR lifts the requirement that models built with the structural sub-module of PyMC be univariate. It's a chonky PR, so I split it into commits. Most of the file changes come from the first commit, which is just a reorganization of files. It is safe to ignore that one.

Here are the steps I followed:

1. The structural module was getting pretty unwieldy, so I broke it into a bunch of sub-files. This makes the code easier to find and extend. This is handled in the "Reorganize structural model module" commit.
2. We need tools that can merge different components with potentially different (or overlapping) observed time series. This is handled by the "Allow combination of component with different numbers of observed states" commit. I am confident this code can be improved.
3. Each component needs new logic to handle the case where there are multiple observed series. Users can optionally pass a list of names to each component as observed_state_names. Every time you add two components together, all the relevant matrices are padded and expanded, and the total set of observed states is the union of the components' observed states.
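To make the "pad, expand, and take the union" step concrete, here is a minimal NumPy sketch. It is not the code in this PR: `combine_observed` is a hypothetical helper written only for illustration, and it handles just the design matrix (the mapping from hidden states to observed series). Two components with overlapping observed series are merged so that a shared series gets contributions from both.

```python
import numpy as np


def combine_observed(names_a, Z_a, names_b, Z_b):
    """Hypothetical helper: merge two design matrices over the union of observed names.

    Z_a has shape (len(names_a), n_states_a); Z_b likewise for component b.
    """
    # Union of observed names, keeping first-seen order
    all_names = list(dict.fromkeys([*names_a, *names_b]))

    n_states = Z_a.shape[1] + Z_b.shape[1]
    Z = np.zeros((len(all_names), n_states))

    # Each component keeps its own block of state columns; its rows are
    # scattered to the positions of its observed names in the union.
    col_offset = 0
    for names, Z_part in [(names_a, Z_a), (names_b, Z_b)]:
        rows = [all_names.index(name) for name in names]
        Z[rows, col_offset : col_offset + Z_part.shape[1]] = Z_part
        col_offset += Z_part.shape[1]

    return all_names, Z


# Component a observes data_1 and data_2; component b observes data_2 and data_3
all_names, Z = combine_observed(
    ["data_1", "data_2"], np.eye(2), ["data_2", "data_3"], np.eye(2)
)
```

Here `data_2` ends up with a nonzero entry in both components' state columns, while `data_1` and `data_3` only load on the component that declared them.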

For now, we assume all states in a component follow the same parameterization. To work around this, it's now also valid to add the same component twice with different states. For example, AutoRegressive(order=1, observed_state_names=['data_1']) + Autoregressive(order=5, observed_state_names=['data_2']) would be a valid model with 2 observed states, each with its own autoregressive dynamics.

When you pass a batch of observed_state_names, e.g. LevelTrend(order=2, observed_state_names=['data_1', 'data_2']), the parameters will all be given a batch dimension, but will otherwise be the same as the base case.
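As a rough picture of what a batch of observed states means for the matrices, the sketch below builds a local linear trend for two series in plain NumPy. This is an assumption about the general shape of the result, not the component's actual implementation; the variable names (`initial_trend`, `T_single`, `Z_single`) are hypothetical.

```python
import numpy as np

observed_state_names = ["data_1", "data_2"]
k = len(observed_state_names)

# Single-series local linear trend: level_t = level_{t-1} + trend_{t-1}, trend_t = trend_{t-1}
T_single = np.array([[1.0, 1.0], [0.0, 1.0]])
# A single series observes only its level
Z_single = np.array([[1.0, 0.0]])

# One independent copy of the dynamics per observed series
T = np.kron(np.eye(k), T_single)  # shape (2k, 2k), block diagonal
Z = np.kron(np.eye(k), Z_single)  # shape (k, 2k)

# Hypothetical batched parameter: one (level, trend) pair per observed series
initial_trend = np.zeros((k, 2))
x0 = initial_trend.ravel()  # flattened into the stacked state vector
```

The parameter keeps the same trailing shape as in the univariate case and simply gains a leading dimension of length `k`.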

More docs are coming, but I tried to obsessively document what's in there so far.

The logic for extending the components is pretty straightforward -- mostly copying + block_diag or concat, but there are some corner cases that need attention.
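The block_diag idea can be illustrated with the AR(1) + AR(5) example from above. The sketch below uses plain NumPy and two hypothetical helpers (`ar_transition`, `block_diag`) purely for illustration; the thread asks for pytensor ops in the real implementation, so treat this as a picture of the resulting matrices rather than the PR's code.

```python
import numpy as np


def ar_transition(coefs):
    """Companion-form transition matrix for an AR(p) process (illustrative helper)."""
    p = len(coefs)
    T = np.zeros((p, p))
    T[0, :] = coefs  # first row holds the AR coefficients
    T[1:, :-1] = np.eye(p - 1)  # sub-diagonal shifts the lags down
    return T


def block_diag(*blocks):
    """Stack 2D arrays along the diagonal, zeros elsewhere (illustrative helper)."""
    out = np.zeros((sum(b.shape[0] for b in blocks), sum(b.shape[1] for b in blocks)))
    r = c = 0
    for b in blocks:
        out[r : r + b.shape[0], c : c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out


# AR(1) dynamics for data_1, AR(5) dynamics for data_2 (coefficients are made up)
T_1 = ar_transition([0.9])
T_5 = ar_transition([0.5, 0.2, 0.1, 0.05, 0.05])
T = block_diag(T_1, T_5)  # shape (6, 6): no interaction between the two series

# Each series observes the first state of its own block
Z = block_diag(np.array([[1.0]]), np.array([[1.0, 0.0, 0.0, 0.0, 0.0]]))  # shape (2, 6)
```

The zero off-diagonal blocks in `T` are what keep the two series' dynamics separate.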

This PR should be seen as a companion to #450. Instead of vectorizing across the computation of a model, we're concatenating models. There will be cases where this is superior -- for example when you want to explicitly model latent interactions between components. But in other cases, this approach will be worse. I am interested in having both.


AlexAndorra · 2025-06-26T20:42:30Z

AutoRegressive(order=1, observed_state_names=['data_1']) + Autoregressive(order=5, observed_state_names=['data_2'])) would be a valid model with 2 observed states, but each has it's own autoregressive dynamics.

This is cool! I will review ASAP.

Note that #450 is currently blocked by what I think is a pytensor bug

pymc_extras/statespace/models/utilities.py

AlexAndorra

This is 🔥 @jessegrabowski 🤯
I just left a suggestion for what I think was a typo in the docstring. I'll merge once this is resolved, and then test all of this for our PyData tutorial -- probably this weekend.

Just a quick question: IIUC, now users can also have batched RegressionComponents, correct?


Components still missing this feature:

Cycle (currently worked on by @AlexAndorra)
Seasonal
Regression (currently worked on by @Dekermanjian)

We also need to:

Make sure there are tests checking that a combined LevelTrend + AR + error model for two observed variables with no interaction matches two separate models, one per variable, given the same parameters.
Make sure that pytensor ops are used everywhere for building the SS matrices (no numpy/scipy)
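The equivalence test described in the list above could look roughly like the following. This is a self-contained NumPy sketch under simplifying assumptions: it uses a hand-written Kalman filter (`kalman_loglike`, a hypothetical helper) and two single-state models instead of the library's components, and it compares log-likelihoods. With block-diagonal matrices and no interaction, the joint log-likelihood should equal the sum of the two separate ones.

```python
import numpy as np


def kalman_loglike(y, T, Z, Q, H, x0, P0):
    """Gaussian log-likelihood for x_t = T x_{t-1} + eta_t, y_t = Z x_t + eps_t."""
    x, P = x0.copy(), P0.copy()
    loglike = 0.0
    for y_t in y:
        # Predict step
        x = T @ x
        P = T @ P @ T.T + Q
        # Innovation and its covariance
        v = y_t - Z @ x
        F = Z @ P @ Z.T + H
        _, logdet = np.linalg.slogdet(F)
        loglike += -0.5 * (len(y_t) * np.log(2 * np.pi) + logdet + v @ np.linalg.solve(F, v))
        # Update step
        K = P @ Z.T @ np.linalg.inv(F)
        x = x + K @ v
        P = P - K @ Z @ P
    return loglike


def block_diag2(a, b):
    """Block-diagonal stack of two 2D arrays (illustrative helper)."""
    out = np.zeros((a.shape[0] + b.shape[0], a.shape[1] + b.shape[1]))
    out[: a.shape[0], : a.shape[1]] = a
    out[a.shape[0] :, a.shape[1] :] = b
    return out


# Model 1: local level for series 0. Model 2: AR(1) for series 1. Values are made up.
m1 = dict(T=np.array([[1.0]]), Z=np.array([[1.0]]), Q=np.array([[0.1]]),
          H=np.array([[0.5]]), x0=np.zeros(1), P0=np.eye(1))
m2 = dict(T=np.array([[0.7]]), Z=np.array([[1.0]]), Q=np.array([[0.3]]),
          H=np.array([[0.2]]), x0=np.zeros(1), P0=np.eye(1))

# Combined model: every matrix is block diagonal, so the series do not interact
joint = {key: block_diag2(m1[key], m2[key]) for key in ["T", "Z", "Q", "H", "P0"]}
joint["x0"] = np.concatenate([m1["x0"], m2["x0"]])

rng = np.random.default_rng(0)
y = rng.normal(size=(50, 2))

ll_joint = kalman_loglike(y, **joint)
ll_separate = kalman_loglike(y[:, [0]], **m1) + kalman_loglike(y[:, [1]], **m2)

np.testing.assert_allclose(ll_joint, ll_separate)
```

A real test would presumably build the models through the structural components and compare against the library's own filter outputs, but the assertion has the same shape.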

AlexAndorra · 2025-07-02T22:12:51Z

I think the Cycle component is ready for a first review from you @jessegrabowski 🍾

2. Adjusted the regression component to allow multivariate regression component specification
3. Added a notebook for quick evaluation of the adjustments and additions made

2. Replaced scipy block diag with pytensor block diag
3. Added forecast to test model in multivariate ssm notebook

Added multivariate regression-component

review-notebook-app · 2025-07-05T14:51:46Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

jessegrabowski

@AlexAndorra I left comments for you

Since it's my own PR, I can't request changes. In future, it's better if you fork the PR branch and open a new PR into this PR. Then we can do the usual review workflow on your PR and merge it into this one when we're ready.