feat: implement Fast-SNP, speed up loopless FVA #1444

isaf27 · 2025-06-02T21:44:25Z

fix [Feature] Nullspace calculation similar to COBRA (MATLAB) #1393
description of feature/fix
tests added/passed
add an entry to the next release

TL;DR

This PR implements the Fast-SNP algorithm for adding loopless constraints. It consists of two key optimizations:

Consider only the reactions that can be part of a cycle (loop) when building loopless constraints.
Build a sparse $N_{\text{int}}$ matrix using the Fast-SNP algorithm (https://doi.org/10.1093/bioinformatics/btw555), which takes the reaction flux sign (positive/negative) into account.

As a result, loopless FVA runs 10-100x (or more) faster.

This PR is somewhat similar to #841, but differs in implementation.

Code modifications

One of the key optimizations is that the add_loopless and flux_variability_analysis functions use only reactions that can be part of a cycle (we call them cyclic). This greatly improves performance because the number of cyclic reactions is usually much smaller than the number of internal reactions.

Function `find_cyclic_reactions`

We add a new function, find_cyclic_reactions, which finds all reactions in the model that can be part of a cycle.
For each such reaction we also identify whether it can carry positive and/or negative flux in a cycle. This is tested by restricting boundary reaction fluxes to zero and optimizing the flux through each of the internal reactions.

Two methods are available (selected with the method parameter):

basic: a straightforward procedure that iterates over all of the reactions;
optimized: an optimized version inspired by the random-weights idea of Fast-SNP, which discovers many cyclic reactions in one go.

The second method solves fewer LPs and is usually at least 2x faster. It is a custom procedure, so we don't have a citation for the algorithm.
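The basic method described above can be sketched as one pair of LPs per internal reaction. This is only an illustration under stated assumptions, not the code in this PR: it uses `scipy.optimize.linprog` instead of optlang, the function name `find_cyclic_basic` and its arguments are hypothetical, and fluxes are given unit bounds so that only stoichiometry and reaction directions matter. Boundary reactions are left out of the matrix, which has the same effect as fixing their fluxes to zero.

```python
import numpy as np
from scipy.optimize import linprog


def find_cyclic_basic(s_int, reversible, tol=1e-9):
    """Return {column index: (can_be_positive, can_be_negative)} for a closed network.

    Hypothetical helper for illustration; s_int holds internal reactions only.
    """
    n_rxns = s_int.shape[1]
    # Unit bounds: only directions matter here, not the model's real flux bounds.
    bounds = [(-1.0 if rev else 0.0, 1.0) for rev in reversible]
    b_eq = np.zeros(s_int.shape[0])  # steady state without any exchange flux
    result = {}
    for i in range(n_rxns):
        c = np.zeros(n_rxns)
        c[i] = 1.0
        v_min = linprog(c, A_eq=s_int, b_eq=b_eq, bounds=bounds).fun
        v_max = -linprog(-c, A_eq=s_int, b_eq=b_eq, bounds=bounds).fun
        result[i] = (bool(v_max > tol), bool(v_min < -tol))
    return result


# Toy network: R1: A <-> B (reversible), R2: B -> A, R3: B -> C.
# R1 and R2 form a cycle; R3 is a dead end once boundary fluxes are removed.
S_INT = np.array([
    [-1.0,  1.0,  0.0],  # A
    [ 1.0, -1.0, -1.0],  # B
    [ 0.0,  0.0,  1.0],  # C
])
REVERSIBLE = [True, False, False]

cyclic = find_cyclic_basic(S_INT, REVERSIBLE)
print(cyclic)
```

In this toy network R1 can only take part in the cycle in the forward direction, because R2 is irreversible, which is the kind of direction information the description refers to.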

Function `add_loopless`

We introduced a new method parameter. If method="fastSNP", we:

Find all cyclic reactions using find_cyclic_reactions.
Calculate $N_{\text{int}}$ for the cyclic reactions using the Fast-SNP nullspace basis construction algorithm, taking the flux signs implied by the model's flux bounds into account. As a result, the constraints are built from a much smaller and sparser $N_{\text{int}}$ matrix, which greatly reduces the number of binary variables in the corresponding MILP problem and improves the speed of FVA.
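To make the role of $N_{\text{int}}$ concrete, the sketch below compares an SVD-based nullspace basis with a hand-written sparse, sign-consistent basis on a toy network. It does not implement Fast-SNP; the matrix, the helper `svd_nullspace`, and the basis `N_SPARSE` are made up for illustration. It only checks the two properties the text relies on: every basis column satisfies $S_{\text{int}} v = 0$, and both bases span the same space.

```python
import numpy as np

# Toy internal network with two disjoint two-reaction cycles:
# R1: A -> B, R2: B -> A, R3: C -> D, R4: D -> C
S_INT = np.array([
    [-1.0,  1.0,  0.0,  0.0],  # A
    [ 1.0, -1.0,  0.0,  0.0],  # B
    [ 0.0,  0.0, -1.0,  1.0],  # C
    [ 0.0,  0.0,  1.0, -1.0],  # D
])


def svd_nullspace(matrix, tol=1e-10):
    """Orthonormal nullspace basis from the SVD (columns are basis vectors)."""
    _, sing, vt = np.linalg.svd(matrix)
    rank = int(np.sum(sing > tol))
    return vt[rank:].T


# Hand-written sparse basis: one column per cycle, with non-negative entries
# because every reaction in this toy network is irreversible (forward only).
N_SPARSE = np.array([
    [1.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 1.0],
])

n_svd = svd_nullspace(S_INT)
is_nullspace = np.allclose(S_INT @ N_SPARSE, 0.0)
same_span = np.linalg.matrix_rank(np.hstack([n_svd, N_SPARSE])) == n_svd.shape[1]
print(is_nullspace, same_span, np.count_nonzero(N_SPARSE))
```

An SVD basis is only guaranteed to be orthonormal, so its columns may mix the two cycles; a basis built cycle by cycle keeps each column supported on a single loop.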

If method="original", the current SVD-based algorithm is run on all of the internal reactions.
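For reference, the loopless constraints that consume $N_{\text{int}}$ are usually written in the ll-COBRA form (Schellenberger et al., 2011). The notation below is an assumption for illustration and is not taken from this PR: $v_i$ is the flux of internal reaction $i$, $a_i$ a binary indicator, $G_i$ a continuous auxiliary variable, and $M$ a sufficiently large bound.

$$
\begin{aligned}
-M\,(1 - a_i) \le v_i &\le M\,a_i \\
-M\,a_i + (1 - a_i) \le G_i &\le -a_i + M\,(1 - a_i) \\
N_{\text{int}}^{\top} G &= 0 \\
a_i &\in \{0, 1\}
\end{aligned}
$$

Each reaction included in the constraints contributes one binary variable $a_i$, so restricting the construction to cyclic reactions shrinks the MILP directly.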

Function `flux_variability_analysis`

We optimized this function for the case loopless=True.

We always find all cyclic reactions and their possible directions using the find_cyclic_reactions function. Reactions that are not cyclic are optimized using the basic FVA mode: their extreme values are guaranteed to be achievable without using cycle fluxes. This greatly improves the performance of the function even when using the CycleFreeFlux method.
We added a new add_loopless mode to this function, which adds loopless constraints (re-using the computed cyclic reactions and the add_loopless function) to solve loopless FVA for the cyclic reactions. In this mode the exact FVA bounds are calculated (unlike with the CycleFreeFlux method).
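The dispatch described above amounts to partitioning the requested reactions before any optimization is run. A minimal sketch, assuming the cyclic reactions are available as a collection of IDs (the helper name `split_by_cyclicity` and the IDs are hypothetical, and the actual return type of find_cyclic_reactions may differ):

```python
def split_by_cyclicity(reaction_ids, cyclic_ids):
    """Partition reaction IDs into (acyclic, cyclic), preserving input order."""
    cyclic_set = set(cyclic_ids)
    acyclic = [rid for rid in reaction_ids if rid not in cyclic_set]
    cyclic = [rid for rid in reaction_ids if rid in cyclic_set]
    return acyclic, cyclic


# Hypothetical IDs: R2 and R4 are assumed to have been reported as cyclic.
acyclic, cyclic = split_by_cyclicity(["R1", "R2", "R3", "R4"], {"R2", "R4"})
print(acyclic)  # would be handled by plain FVA
print(cyclic)   # would need the loopless treatment
```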

Performance

FVA quick example

As a quick demonstration of the performance improvement, we assembled the following script, which solves FVA for a single reaction with the original and the new Fast-SNP loopless constraints.

import time

from cobra.io import load_model
from cobra.flux_analysis.loopless import add_loopless


for model_name, reaction in [
    ('iIT341', 'ACKr'),
    ('iAF692', 'MDHy'),
    ('iNJ661', 'NADTRHD_copy1'),
]:
    print(f'Working with model {model_name}...')

    for method in ['original', 'fastSNP']:
        model = load_model(model_name)

        print(f'Finding loopless FVA bounds for reaction {reaction} using {method} method...')

        start_time = time.time()
        add_loopless(model, method=method)
        print(f'Loopless constraints added in {time.time() - start_time:.2f} seconds.')
        
        bounds = {}
        start_time = time.time()
        for direction in ['min', 'max']:
            rxn = model.reactions.get_by_id(reaction)
            model.solver.objective.set_linear_coefficients(
                {rxn.forward_variable: 1, rxn.reverse_variable: -1}
            )
            model.solver.objective.direction = direction
            
            bounds[direction] = model.slim_optimize()
        
        print(f'FVA bounds calculated in {time.time() - start_time:.2f} seconds.')
        print(f'Loopless FVA bounds for {reaction} ({method}): {bounds}')

Here are example run times for the tested model/reaction pairs (the CPLEX solver was used for the MILPs):

Model iIT341, reaction ACKr: $0.86 \to 0.09$, $9.6\times$ speed-up.
Model iAF692, reaction MDHy: $1.35 \to 0.11$, $12.3\times$ speed-up.
Model iNJ661, reaction NADTRHD_copy1: $120.61 \to 0.93$, $129\times$ speed-up.
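The speed-up factors follow directly from the before/after timings. The snippet below recomputes them from the rounded values listed above, so the last digit can differ slightly from factors computed on unrounded measurements.

```python
# (before, after) run times as listed above for each model/reaction pair.
timings = {
    ("iIT341", "ACKr"): (0.86, 0.09),
    ("iAF692", "MDHy"): (1.35, 0.11),
    ("iNJ661", "NADTRHD_copy1"): (120.61, 0.93),
}

speedups = {key: before / after for key, (before, after) in timings.items()}

for (model_name, reaction), factor in speedups.items():
    print(f"{model_name} {reaction}: {factor:.1f}x")
```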

The fastest complete loopless FVA

The most efficient way to calculate loopless FVA for all of the reactions is to call flux_variability_analysis directly, without an explicit add_loopless call:

from cobra.flux_analysis import flux_variability_analysis
from cobra.io import load_model
model = load_model("iIT341")
flux_variability_analysis(model, loopless="fastSNP")

Discussion points

There are several points regarding the API we wanted to highlight for discussion:

As mentioned above, flux_variability_analysis(model, loopless=True, params_add_loopless={}) is the most efficient way to calculate loopless FVA. flux_variability_analysis decides which method to use (add_loopless-based or CycleFreeFlux) based on whether params_add_loopless is present. This form preserves backward compatibility, but is a bit clunky. A better option could be to deprecate the loopless parameter and replace it with a new string-based method parameter that controls the behavior, with a backward-compatibility shim that sets method="CycleFreeFlux" if loopless=True is passed.
In the add_loopless function, the original $N_{\text{int}}$ construction algorithm could also benefit from being run only on the cyclic reactions. We could add another option, use_cyclic=True, to enable that, but we decided not to implement it. Our thinking is that method="original" is provided to reproduce the previous behavior, but ultimately shouldn't be used: the Fast-SNP algorithm should always perform better.
Conversely, flux_variability_analysis(loopless=True) can still be run when the full loopless FVA is not feasible. We therefore decided to run CycleFreeFlux only for the cyclic reactions, which significantly improves performance. This changes the behavior, but should produce the same results as before.

We would be happy to hear your feedback and are ready to implement any additional changes to API/docs/tests/etc that are needed for this PR to be merged.

cdiener · 2025-06-03T07:20:19Z

Thanks so much. This looks great! Sorry for the issues with the CI. I will try to fix them this week, but you might have to rebase on the devel branch after that.

In general I think it will be a great contribution. I will try to review as soon as possible.

isaf27 · 2025-07-03T21:33:08Z

@cdiener Hello. I've merged devel into this PR (I saw that you adjusted the supported Python versions). Please approve the CI workflow so it can run. Should CI pass now?

cdiener · 2025-07-21T08:23:57Z

Sorry for the delay, currently on paternity leave. Looks like a great addition. Hopefully I can get to it in the next weeks.

I am actually not that opposed to changing the loopless argument to a string. It seems the most explicit and it would be an easy fix in existing code.

isaf27 · 2025-09-08T21:22:09Z

@cdiener Hello. I've added a small fix to the fastSNP algorithm implementation to make it stable. The tests should pass now; please approve the CI workflow to check.

cdiener

Hi, thanks, really great addition.

Are the FastSNP variables and constraints incompatible with the original LP? If not, it would probably be much more efficient to just modify the existing LP instance within a context, which will also make sure all changes are reverted afterwards.

src/cobra/flux_analysis/find_cyclic_reactions.py

cdiener · 2025-10-08T20:54:33Z

src/cobra/flux_analysis/find_cyclic_reactions.py

+    model = solver.Model()
+
+    q_list = []
+    for i in range(s_int.shape[1]):


wouldn't it be faster to add those to the original model and use a context? For instance the code below re-adds all the constraints from S that are already in the original model

For this function it is important to work only with internal reactions; however, the original model contains boundary reactions and additional constraints. Filtering them out is harder, so the easier way is to create a new problem.

We checked that creating a new problem before solving is not a bottleneck and takes negligible time compared with the total running time of the function. Also, this function is typically not called in a loop, so we won't create more than one additional problem.

cdiener · 2025-10-08T20:55:20Z

src/cobra/flux_analysis/find_cyclic_reactions.py

+    model.add(q_list)
+
+    for idx, row in enumerate(s_int):
+        nnz_list = np.flatnonzero(np.abs(row) > zero_cutoff)


this will ignore additional constraints from model.constraints that the user may have defined.

For this function, it is important that we find cyclic reactions based only on the stoichiometric matrix and reaction directions; otherwise, the implemented method does not work. Since the function is only used for filtering reactions, this is acceptable. I've added a comment to the docstring about what exactly this function does.

P.S.
To satisfy all constraints (including reaction bounds, not just directions), the algorithm would have to be much slower. We know for certain that many models contain cyclic reactions that cannot carry any non-zero flux once all bounds are considered; however, proving that is a harder problem.

src/cobra/flux_analysis/variability.py

cdiener · 2025-10-08T21:06:14Z

src/cobra/flux_analysis/fast_snp.py

+    modulo_list = []
+    v_list = []
+
+    for i in range(n):


this might miss user-defined custom variables

nullspace_fast_snp is a helper function: it takes only a matrix and directions as input and calculates the nullspace. There are no user-defined constraints for this subtask.

In fact, both nullspace_fast_snp and find_cyclic_reactions are helper subtasks that are solved independently before the flux variability optimization, which itself is compatible with all additional custom constraints and variables.

cdiener · 2025-10-08T21:06:40Z

src/cobra/flux_analysis/fast_snp.py

+                {x: 1.0, v: 1.0}
+            )
+
+    for idx, row in enumerate(S):


this might miss user-defined constraints

The reply is the same as in the thread about custom user-defined variables.

cdiener · 2025-10-08T21:09:13Z

src/cobra/flux_analysis/find_cyclic_reactions.py

+        q_list.append(q)
+    model.add(q_list)
+
+    for idx, row in enumerate(s_int):


Should be more efficient to iterate over model.constraints directly and clone or modify

As I wrote in another reply, it is easier to create a new helper optimization problem than to filter all unnecessary variables and constraints out of a copy of the original problem. Also, in my experience, copying the original problem and creating a new one take approximately the same time.

isaf27 · 2025-10-16T19:46:22Z

Hello, thank you very much for your review.

I've added an update:

in flux_variability_analysis, the add_loopless_params parameter is removed; loopless now accepts a str (bool is deprecated but still accepted for backward compatibility)
in find_cyclic_reactions, a Note is added describing what the function does
short docstrings are added for helper functions
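A backward-compatible argument of this kind is commonly handled by a small normalisation step at the top of the function. The sketch below is an assumption about how such a shim could look, not the code merged in this PR: the helper name `resolve_loopless` is hypothetical, and the set of accepted strings is limited to the two method names mentioned in this thread.

```python
import warnings
from typing import Optional, Union

# Hypothetical list of accepted names; only these two appear in this thread.
_LOOPLESS_METHODS = ("fastSNP", "CycleFreeFlux")


def resolve_loopless(loopless: Union[bool, str, None]) -> Optional[str]:
    """Map the legacy bool form of `loopless` onto a method name (or None)."""
    if loopless is None or loopless is False:
        return None
    if loopless is True:
        # Legacy behavior: loopless=True meant the CycleFreeFlux method.
        warnings.warn(
            "Passing a bool for `loopless` is deprecated; pass a method name.",
            DeprecationWarning,
            stacklevel=2,
        )
        return "CycleFreeFlux"
    if loopless in _LOOPLESS_METHODS:
        return loopless
    raise ValueError(f"Unknown loopless method: {loopless!r}")


print(resolve_loopless("fastSNP"))
```

Checking `is True` and `is False` rather than truthiness keeps a non-empty string from being mistaken for the legacy boolean.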

For the suggestions I did not accept, I've added a response.

Looking forward to the test run and the next review!

cdiener

Hi, thanks for the explanation. It makes sense to me now. It would still be possible to identify only the equalities from the set of all constraints quite easily, but using the S matrix is not that much worse because you are already using the fast constraint setters from optlang.

Just one small move for a file and it is still missing the release notes. Maybe also rebase on devel if possible.

src/cobra/flux_analysis/fast_snp.py

isaf27 · 2025-10-28T23:45:07Z

I moved the file fast_snp as you requested and added standalone tests for it.

Also I added release notes.

cdiener

Thanks, looks good now. 😄

cdiener · 2025-10-29T10:24:30Z

Note to self: this is an API breaking change because the type of the loopless arg has been changed.

isaf27 added 7 commits April 24, 2025 00:20

feat: implement FastSNP, speed up loopless FVA

3f09f8c

fix: remove unnecessary calculations and speed up a bit

0a18ed8

fix: impl

d805047

fix: small changes before tests

27ee14d

fix: add_loopless test

058a2df

fix: add variability test

7d309d9

fix: add test_find_cyclic_reactions.py

8681840

Merge branch 'opencobra:devel' into feat-implement-fastsnp

a623ad2

fix: tests involving fastSNP

26c730d

cdiener requested changes Oct 8, 2025

fix: after review

a652c4a

isaf27 requested a review from cdiener October 16, 2025 19:46

Merge branch 'opencobra:devel' into feat-implement-fastsnp

7f6523c

cdiener requested changes Oct 23, 2025

isaf27 added 2 commits October 28, 2025 23:05

fix: move fast_snp

d7ef948

fix: add release notes

ba591a1

isaf27 requested a review from cdiener October 28, 2025 23:45

cdiener approved these changes Oct 29, 2025

cdiener merged commit fc1b2ce into opencobra:devel Oct 29, 2025
7 checks passed