Skip to content

Conversation

@mcbarton
Copy link
Collaborator

@mcbarton mcbarton commented Nov 19, 2025

Description

Please include a summary of changes, motivation and context for this PR.

This PR is the first in a series of PRs to add automated testing to check changes don't break xeus-cpps existing jupyter lite notebooks, in a ci friendly way. The workflow for checking nothing is broken

First run python script with driver option set to browser you want to run the notebook in (the ci will cover all options, firefox, chome and safari). This should be run while the jupyter server command is running, with --settings-overrides=overrides.json as the first option to provide the download button.

This will download the notebook once it has finished running (the reason for adding overrides.json to add Download option to notebook menu bar). This config to add this was taken from https://jupyterlite.readthedocs.io/en/latest/howto/configure/settings.html . Right clicking the file, finding the download option and clicking didn't work consistently, and never after the notebook had finished running in the case of Safari.

Final step is compare the fully executed notebook in repo (which will be used as referenced) against the fully run notebook downloaded as part of the python script using nbdiff which comes from jupyters nbdime https://github.com/jupyter/nbdime .

If the notebooks are the same nbdiff outputs nothing and the ci carries on. If they are different, nbdiff returns a non zero value and the ci stops with the outputted diff.

This idea is effectively option 2 suggested here #259 (comment), but I am using Selenium instead of Playwright. I have checked the above steps locally, but am setting this PR to draft until I have had a chance to update the documentation and ci to do these automated notebook tests.

This PR will add the

Fixes # (issue)

Type of change

Please tick all options which are relevant.

  • Bug fix
  • New feature
  • Added/removed dependencies
  • Required documentation updates

@codecov-commenter
Copy link

codecov-commenter commented Nov 19, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.94%. Comparing base (778e639) to head (617d487).

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #415   +/-   ##
=======================================
  Coverage   81.94%   81.94%           
=======================================
  Files          21       21           
  Lines         853      853           
  Branches       87       87           
=======================================
  Hits          699      699           
  Misses        154      154           
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@mcbarton mcbarton force-pushed the Emscripten-notebook-tests branch from 438015e to 2d68e5c Compare November 20, 2025 21:04
@mcbarton
Copy link
Collaborator Author

See here https://github.com/compiler-research/xeus-cpp/actions/runs/19553412423/job/55990268776?pr=415#step:7:605 for an example of what the output looks like when the notebook it gets once finished running being different to the reference notebook (in this case the difference comes from the fact the notebook didn't have enough time to finish running on the Github runner)

Here is the output for when the workflow output cannot be seen anymore

--- notebooks/smallpt.ipynb  2025-11-20 22:31:08.104731
+++ /home/runner/Downloads/smallpt.ipynb  2025-11-20 22:37:33.795913
## replaced (type changed from int to NoneType) /cells/5/execution_count:
-  5
+  None
## modified /cells/5/outputs/0/text:
@@ -3,6 +3,3 @@
 [DEBUG] Dimensions: 320x240, Samples: 16
 [DEBUG] SDL surface created successfully
 [DEBUG] The image should be on your screen soon
-[DEBUG] Mapped pixel buffer to surface
-[DEBUG] Surface saved to render.bmp
-[DEBUG] Surface created successfully
## deleted /cells/5/outputs/1-2:
-  output:
-    output_type: display_data
-    data:
-      image/bmp: Qk16sAQA...<snip base64, md5=d8c4849298c46ca8...>
-  output:
-    output_type: stream
-    name: stdout
-    text:
-      [DEBUG] Surface freed
-      [DEBUG] Exiting main

Comment on lines 29 to 264
options = ChromeOptions()
options.add_argument("--headless")
options.add_argument("--no-sandbox")
driver = webdriver.Chrome(options=options)

elif args.driver == "firefox":
options = FirefoxOptions()
options.add_argument("--headless")
driver = webdriver.Firefox(options=options)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self, these headless options should be enabled by command line argument, in case the user locally wants to see the notebook running in a gui. Only needed due to ci needing them run in a headless browser

@mcbarton mcbarton force-pushed the Emscripten-notebook-tests branch from 5ef8ec6 to 519adf9 Compare November 20, 2025 23:06
)
)
actions.move_to_element(run_all_menu).click().perform()
time.sleep(200)
Copy link
Collaborator Author

@mcbarton mcbarton Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These time.sleeps to run all the cells will depend on the power of the cpu running the tests. Adjusted for now to be long enough that the notebook has enough time to run on the Github runners. Locally these times can be a lot shorter.

Long term this should be switched out without something which determines if the kernel has gone idle, and aborts if he has taken too long (assumed notebook has stalled at this point). This I feel can be done as part of subsequent PR though.

@mcbarton mcbarton force-pushed the Emscripten-notebook-tests branch from 0e28cd7 to 3eaff7e Compare November 21, 2025 06:44
@anutosh491
Copy link
Collaborator

anutosh491 commented Nov 21, 2025

Hi,

I would recommend starting out with testing the demo notebook instead of the smallpt notebook.

Testing the smallpt notebook doesn't really add value. Our smallpt notebook gives an impression that SDL is usable in JupyterLite but it's just a gimmick to be fair 😅 .

if you check the notebook, we use SDL only to render to a static bitmap instead of using it for any "dynamic" opengl canvas related stuff. As xeus-cpp is run in the worker thread and we need the main thread for any events or dynamic stuff .... we're just masking SDL to convert a raw stream to a bitmap (we don't need sdl for this)

So testing the smallpt notebook != SDL works through xeus-cpp-lite. No events, no dynamics, no canvases

For events (mouse/keyboard etc), dynamic canvases ... we probably can put the jupyter widgets (ipywidgets, xwidgets) to use for now.

Check example notebooks here for the above.
https://derthorsten.github.io/pyb2d3/lite/lab/index.html

@mcbarton
Copy link
Collaborator Author

Hi,

I would recommend starting out with testing the demo notebook instead of the smallpt notebook.

Testing the smallpt notebook doesn't really add value. Our smallpt notebook gives an impression that SDL is usable in JupyterLite but it's just a gimmick to be fair 😅 .

if you check the notebook, we use SDL only to render to a static bitmap instead of using it for any "dynamic" opengl canvas related stuff. As xeus-cpp is run in the worker thread and we need the main thread for any events or dynamic stuff .... we're just masking SDL to convert a raw stream to a bitmap (we don't need sdl for this)

So testing the smallpt notebook != SDL works through xeus-cpp-lite. No events, no dynamics, no canvases

For events (mouse/keyboard etc), dynamic canvases ... we probably can put the jupyter widgets (ipywidgets, xwidgets) to use for now.

Check example notebooks here for the above. https://derthorsten.github.io/pyb2d3/lite/lab/index.html

Hi @anutosh491

For I am just trying to show a technique for running any notebook in Jupyter lite using Selenium. I am not too fussed at the moment that it isn't actually making use of SDL. I picked smallpt just because its the easiest case to start off with since it requires no user input besides executing all the cells. The ci now shows it works for Firefox and Chrome. It works locally for Safari too. I am just debugging what is going on with the download part for this browser, since it doesn't appear to be downloading the notebook for this browser in the ci. Once I have this working completely in the ci, my next goal is to work on the normal demo notebook, and add to the script so it can deal with the case when a notebook requests user input (std::cin case). The next stage I was hoping we could make a series of reference notebooks to run through the ci which check we don't break things between PRs (these don't necessarily need to be ones we deploy, and can be in a separate compiler research repo). Basically the script will be modified for increasingly difficult to execute notebooks until we have something that can deal with most notebooks we give it.

@mcbarton
Copy link
Collaborator Author

mcbarton commented Nov 21, 2025

@anutosh491 I have now got the most basic notebook (smallpt.ipynb) to try and run completely in the ci (basic in the sense it requires no human interaction beyond running all the cell), even if it doesn't show much. I will now move onto getting the demo notebook to run in the ci, so we can deal with requests for user input. In the meantime you might want to think of ideas for notebooks that it might be good to be able to run through the ci.

@mcbarton mcbarton force-pushed the Emscripten-notebook-tests branch from 1847aa0 to a92a91c Compare November 21, 2025 14:31
@mcbarton mcbarton changed the title Provide mechanism to run smallpt.ipynb in automated way to compare with reference notebook (jupyter lite xeus-cpp automated tests) Run notebooks in Jupyter Lite as part of ci (Emscripten xeus-cpp automated testing) Nov 21, 2025
@mcbarton mcbarton force-pushed the Emscripten-notebook-tests branch from a92a91c to 0710b5b Compare November 21, 2025 14:44
@mcbarton
Copy link
Collaborator Author

Safari is not working again. No idea what happened. It was working, and then I squashed commits, and it wasn't working again. Back to debugging ...

@mcbarton mcbarton force-pushed the Emscripten-notebook-tests branch from 9dd459f to 9ade43d Compare November 22, 2025 16:46
@@ -0,0 +1,33 @@
import pyautogui
Copy link
Collaborator Author

@mcbarton mcbarton Nov 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I couldn't download the notebook after running the notebook in Safari is because of the following pop up
image
You cannot click allow, since it is running in Safari driver. I tried both locally and in the ci.

The solution is this python script, which makes is so Safari doesn't ask if you want to allow Downloads from any site (outside of Safari driver), e.g. the end result is this page at the end of the script
image

@mcbarton mcbarton force-pushed the Emscripten-notebook-tests branch 7 times, most recently from 7ad52d2 to bfc10ab Compare November 26, 2025 19:14
@mcbarton
Copy link
Collaborator Author

mcbarton commented Nov 26, 2025

Removed smallpt benchmark table now the notebook has been dropped.

@mcbarton mcbarton force-pushed the Emscripten-notebook-tests branch 4 times, most recently from f453f30 to d1c749c Compare November 28, 2025 11:56
@mcbarton
Copy link
Collaborator Author

For future reference here is a link showing that the error catching in the ci works https://github.com/compiler-research/xeus-cpp/actions/runs/19768420282/job/56646797411?pr=415#step:7:1682 (I tried to make it so all browsers run cells in the same way, but Safari needs a small change compared to other browsers for getting to the next cell after pressing enter after standard input has been entered)

@mcbarton mcbarton force-pushed the Emscripten-notebook-tests branch from 4170f27 to 131c41f Compare November 28, 2025 16:19
@mcbarton mcbarton force-pushed the Emscripten-notebook-tests branch 7 times, most recently from 4f622ae to 51d6856 Compare December 2, 2025 10:05
Comment on lines +117 to +174
def download_notebook(driver):
"""This function is used to download the notebook currently open."""
print("Downloading notebook by clicking download button")
search_script = """
function deepQuerySelector(root, selector) {
const walker = document.createTreeWalker(
root,
NodeFilter.SHOW_ELEMENT,
{
acceptNode: node => NodeFilter.FILTER_ACCEPT
},
false
);

while (walker.nextNode()) {
let node = walker.currentNode;

// Check if this node matches
if (node.matches && node.matches(selector)) {
return node;
}

// If this element has a shadow root, search inside it
if (node.shadowRoot) {
const found = deepQuerySelector(node.shadowRoot, selector);
if (found) return found;
}
}
return null;
}

return deepQuerySelector(document, "jp-button[data-command='docmanager:download']");
"""

download_button = driver.execute_script(search_script)

time.sleep(1)
driver.execute_script(
"""
const el = arguments[0];

// Force element to be visible and focused
el.scrollIntoView({block: 'center', inline: 'center'});

// Dispatch real mouse events since Safari WebDriver ignores .click() on Web Components
['pointerdown', 'mousedown', 'mouseup', 'click'].forEach(type => {
el.dispatchEvent(new MouseEvent(type, {
bubbles: true,
cancelable: true,
composed: true, // IMPORTANT for shadow DOM
view: window
}));
});
""",
download_button,
)

time.sleep(1)
Copy link
Collaborator Author

@mcbarton mcbarton Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is almost certainly not to best way to find the download button and click it, but this is the only reliable solution I could find.

@mcbarton mcbarton marked this pull request as ready for review December 2, 2025 11:43
@mcbarton mcbarton force-pushed the Emscripten-notebook-tests branch 2 times, most recently from b507bff to a4ef6d7 Compare December 2, 2025 12:47
@mcbarton
Copy link
Collaborator Author

mcbarton commented Dec 2, 2025

Once we have a example c notebook merged in, I can expand this work in a future PR to add testing of the C kernel. Also as the url for Jupyter Lab is http://127.0.0.1:8888 (same as Jupyter Lite but a different port), I can expand this work in the future so we can have Jupyter Lab testing as part of the ci (and native kernel testing even without the CppInterOp C/C++ interface)

@mcbarton mcbarton force-pushed the Emscripten-notebook-tests branch from a4ef6d7 to fbcc269 Compare December 5, 2025 08:46
@mcbarton mcbarton requested a review from JohanMabille December 5, 2025 09:35
@anutosh491
Copy link
Collaborator

Hi @mcbarton ,

I am guessing the PR is ready on your end ? I have written to a couple JupyterLite maintainers who shall hopefully help us with the reviews here soon !

@mcbarton mcbarton force-pushed the Emscripten-notebook-tests branch from c808c2b to b88c5d1 Compare December 5, 2025 09:55
@mcbarton
Copy link
Collaborator Author

mcbarton commented Dec 5, 2025

Hi @mcbarton ,

I am guessing the PR is ready on your end ? I have written to a couple JupyterLite maintainers who shall hopefully help us with the reviews here soon !

Hi @anutosh491 yes this PR is ready for review. Happy to answer any of the JupyterLite maintainers questions when they have time to review.

@mcbarton mcbarton force-pushed the Emscripten-notebook-tests branch from 63ce79b to d7ee308 Compare December 5, 2025 10:02
@anutosh491
Copy link
Collaborator

anutosh491 commented Dec 5, 2025

@jtpio, who initiated the Jupyterlite project in 2021, says this

Anutosh : Not sure if other places like try jupyter (https://jupyter.org/try) etc test out the demo notebooks being hosted there in the CI .... so probably if this method is reasonable we can take inspiration and add something like this everywhere ?

Jeremy Tuloup: Right, testing notebooks with playwright can be a bit challenging. What we do on CI is that we build a new jupyterlite site in the workflow and then test against it.
Jeremy Tuloup: oh they use Selenium in that PR
Jeremy Tuloup: in our case, we mostly use Playwright / Galata (Galata works on top of playwright)
Jeremy Tuloup: maybe following what was done in jupyter/try-jupyter#73 can be a good start?

My remarks

  1. The PR hosted on the try-jupyter website (the website that hosts demo notebooks for all working Jupyterlite kernels) is fresh. It's just 14 days old. We should possibly build on top of it
  2. I see this issue : Build a CI tool to test jupyterlite notebooks scientific-python/summit-2025-nov#13 where Jeremy discusses his approach and gets 3 upvotes from the Jupyterlite community members.

PS : I possibly won't have the time to deep dive into the approach here or the approach used by Jeremy right away (maybe in the coming weeks) but I would go by anything Jeremy is saying for now as he's driving most of the work in the Jupyterlite community. Feel free to discuss more with him.

@mcbarton
Copy link
Collaborator Author

mcbarton commented Dec 5, 2025

@jtpio, who initiated the Jupyterlite project in 2021, says this

Anutosh : Not sure if other places like try jupyter (https://jupyter.org/try) etc test out the demo notebooks being hosted there in the CI .... so probably if this method is reasonable we can take inspiration and add something like this everywhere ?

Jeremy Tuloup: Right, testing notebooks with playwright can be a bit challenging. What we do on CI is that we build a new jupyterlite site in the workflow and then test against it. Jeremy Tuloup: oh they use Selenium in that PR Jeremy Tuloup: in our case, we mostly use Playwright / Galata (Galata works on top of playwright) Jeremy Tuloup: maybe following what was done in jupyter/try-jupyter#73 can be a good start?

My remarks

1. The PR hosted on the try-jupyter website (the website that hosts demo notebooks for all working Jupyterlite kernels) is fresh. It's just 14 days old. We should possibly build on top of it

2. I see this issue : [Build a CI tool to test jupyterlite notebooks scientific-python/summit-2025-nov#13](https://github.com/scientific-python/summit-2025-nov/issues/13) where Jeremy discusses his [approach](https://github.com/scientific-python/summit-2025-nov/issues/13#issuecomment-3586018751) and gets 3 upvotes from the Jupyterlite community members.

PS : I possibly won't have the time to deep dive into the approach here or the approach used by Jeremy right away (maybe in the coming weeks) but I would go by anything Jeremy is saying for now as he's driving most of the work in the Jupyterlite community. Feel free to discuss more with him.

@anutosh491 @jtpio Taking a quick look at the try-jupyter PR the differences that I see are that (this is only from a 5 minute glance, so may have missed stuff)

  1. The notebooks are only run in Chomium, and not Chrome, Firefox and Safari like this PR. We know from experience that xeus-cpp has worked in the past for example Chrome before, but not Safari.

A ci difference for browsers. I found what you have here works for Firefox and Chrome, but doesn't work for Safari, and why I have a weird if statement in my python script to deal with that case.

page.keyboard.type("test_input")
page.keyboard.press("Enter")
  1. It checks that the notebooks can be executed, but not that we get the expected result. My method relies comparing the executed notebook to a fully executed reference notebook in the repo.
  2. It only tests the notebook for the kernel that it has, but not all those that should work with it. For example in my method I run the demo notebook for all 3 C++ kernels.
  3. The try-jupyter method runs all notebooks in Path(file).parent.parent / "content" / "notebooks" regardless of whether you would want them too or not (at least as far as I can see). It our case that would run xeus-cpp.ipynb and xeus-cpp-lite-demo.ipynb, even though we only want xeus-cpp-lite-demo.ipynb to run.

Things I found useful from the try-jupyter PR

def wait_for_jupyterlite_ready(page: Page, timeout: int = 30000) -> None:
    page.wait_for_selector("#jp-main-dock-panel", timeout=timeout)
    wait_for_kernel_success(page, timeout)


def wait_for_notebook_ready(page: Page, timeout: int = 30000) -> None:
    page.wait_for_selector(".jp-Cell", timeout=timeout)
    wait_for_kernel_success(page, timeout)

I'm going to all take a look at how you determine you have reached the end of the notebook, since I have been relying on finding a blank cell.

It would useful to know exactly what this is checking

def wait_for_kernel_success(page: Page, timeout: int = 30000) -> None:
    page.wait_for_selector(".jp-KernelStatus-success", timeout=timeout, state="attached")

One thing I personally like about my method, but couldn't see in the try-jupyter example, is seeing the notebook being executed in the ci as its executing, so if it gets stucks or goes wrong I can see that almost immediately.

I'm sure there are lots of other differences between the methods, but I will explore more next week.

@mcbarton mcbarton force-pushed the Emscripten-notebook-tests branch from c1d4aad to 617d487 Compare December 5, 2025 14:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants