Implementation of the NTK in Colibri #422
base: main
Conversation
LucaMantani left a comment
A little comment
Hi @achiefa, thanks for starting this. Just from a quick look, may I suggest not having this write stuff directly, but rather adding things to the GradientDescentResult? That way the writing is delegated to dedicated functions and you don't need to modify much here. Similarly for the MonteCarloFit class.
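For illustration, a minimal sketch of what this could look like; the field names and dataclass layout here are assumptions, not Colibri's actual API:

```python
from dataclasses import dataclass, field
from typing import Optional

import numpy as np


@dataclass
class GradientDescentResult:
    """A sketch only; the real Colibri class will differ."""

    final_params: np.ndarray
    loss_history: list = field(default_factory=list)
    # Hypothetical addition discussed here: one parameter snapshot per
    # recorded epoch, stacked into an array of shape (n_records, n_params).
    parameter_history: Optional[np.ndarray] = None
```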
Hi @LucaMantani, thanks for your comment. Indeed, we considered this option, which I agree follows a more solid design principle. However, I was worried that storing the parameters for all recorded epochs could cause memory issues during training. If instead we use a buffer that is saved to disk and freed at the end of each epoch, we avoid any potential memory issue. Maybe this is not a problem at all, and we can simply store all parameters in a big array and then add it to GradientDescentResult.
Just to quantify the problem: for a neural network with 763 parameters (float64), a single array is about 0.01 MB. This is then multiplied by the number of epochs for which we want to save the parameters. For instance, with 100 recorded epochs this adds up to ~1 MB for one replica. We can probably afford this in favour of a better code design. What do you think?
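The back-of-the-envelope arithmetic checks out directly (a sketch using only the numbers quoted above):

```python
n_params = 763          # parameters in the network
bytes_per_param = 8     # float64
n_epochs = 100          # recorded epochs

per_epoch_mb = n_params * bytes_per_param / 1e6
total_mb = per_epoch_mb * n_epochs
print(f"{per_epoch_mb:.4f} MB per epoch, {total_mb:.2f} MB per replica")
# -> 0.0061 MB per epoch, 0.61 MB per replica (~1 MB, as stated)
```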
I think 1 MB is nothing: we already load several GB into memory for the data and FK tables. Even if one had a model with 1000 parameters, saving it 1000 times would be 8 MB. So I would say memory is far from being an issue.
I agree. Let's put it in GradientDescentResult then.
This PR implements the Neural Tangent Kernel (NTK) in Colibri. The idea is to compute the NTK for any PDF model that is trained using the Monte Carlo replica method and gradient-based optimisers.
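For reference, the empirical NTK of a model f(x; θ) is the Gram matrix of parameter gradients, Θ(x_i, x_j) = ∇_θ f(x_i; θ) · ∇_θ f(x_j; θ). A minimal JAX sketch of this computation follows; the `apply_fn` signature and names are placeholders for illustration, not Colibri's actual interface:

```python
import jax
import jax.numpy as jnp


def empirical_ntk(apply_fn, params, x):
    """Empirical NTK: Theta[i, j] = <df(x_i)/dtheta, df(x_j)/dtheta>.

    Assumes apply_fn(params, x) returns model outputs of shape
    (n_points,); this is a placeholder signature.
    """
    # Jacobian of the outputs with respect to the parameters, returned
    # as a pytree whose leaves carry a leading axis of size n_points.
    jac = jax.jacobian(apply_fn)(params, x)
    # Flatten each per-point gradient into one row vector.
    jac_flat = jnp.concatenate(
        [j.reshape(x.shape[0], -1) for j in jax.tree_util.tree_leaves(jac)],
        axis=1,
    )
    # The inner product over the parameter axis gives the kernel.
    return jac_flat @ jac_flat.T
```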
To compute the NTKs, the model parameters are stored on disk during training with a user-specified recording frequency. This creates a new directory called `parameters` in each replica folder, which contains the set of parameters for each recorded epoch. The user can then compute the NTK for each replica at any recorded epoch using the action `compute_ntk`.
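For orientation, a hypothetical example of reading one recorded snapshot back from a replica folder; the file naming inside `parameters` is an assumption, as the description does not specify it:

```python
from pathlib import Path

import numpy as np

# Hypothetical fit-folder layout and file naming scheme.
replica_dir = Path("fit_folder/replica_1/parameters")
epoch = 500

params = np.load(replica_dir / f"epoch_{epoch}.npy")
# ntk = empirical_ntk(model.apply, params, x_grid)  # see the sketch above
```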