
Conversation

@stephenchengCloud (Collaborator)

Sync feature branch with master. No code changed.

$ git show 27199ebb6e9408dc500b02b467671441b345801b
commit 27199ebb6e9408dc500b02b467671441b345801b (HEAD -> private/stephenche/vnc_sync_260107, mygit/private/stephenche/vnc_sync_260107, bb/private/stephenche/vnc_sync_260107)
Merge: 8c88947f3 19f2398fd
Author: Stephen Cheng <[email protected]>
Date:   Wed Jan 7 09:56:17 2026 +0800

    Merge branch 'master' into private/stephenche/vnc_sync_260107

    Signed-off-by: Stephen Cheng <[email protected]>

diff --cc ocaml/idl/schematest.ml
index 7dc03c97b,e0658e78a..a90bf8687
--- a/ocaml/idl/schematest.ml
+++ b/ocaml/idl/schematest.ml
@@@ -3,7 -3,7 +3,7 @@@ let hash x = Digest.string x |> Digest.
  (* BEWARE: if this changes, check that schema has been bumped accordingly in
     ocaml/idl/datamodel_common.ml, usually schema_minor_vsn *)

- let last_known_schema_hash = "9e085767a7a70fb84747776c4d6cc663"
 -let last_known_schema_hash = "d8cb04ccddfd91ca3f0f9074dcf7c219"
++let last_known_schema_hash = "a01358e3ff5f42d5aee162e995d2ec05"

  let current_schema_hash : string =
    let open Datamodel_types in

BengangY and others added 30 commits September 23, 2025 16:08
Two commands are used to set max_cstate: xenpm to set it at runtime,
and xen-cmdline to set it in the grub conf file so that it takes
effect after reboot.

Signed-off-by: Changlei Li <[email protected]>
A string is used to represent max_cstate and max_sub_cstate:
"" -> unlimited
"N" -> max cstate CN
"N,M" -> max cstate CN with max sub-state M
This follows the xen-cmdline max_cstate convention, see
https://xenbits.xen.org/docs/unstable/misc/xen-command-line.html#max_cstate-x86

Signed-off-by: Changlei Li <[email protected]>
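As an aside, a minimal OCaml sketch of parsing this string convention; the type and function names here are hypothetical, not the PR's code:

```
(* Hypothetical parser for the max_cstate string convention above. *)
type cstate = Unlimited | Max of int | Max_with_sub of int * int

let parse_max_cstate = function
  | "" -> Some Unlimited
  | s -> (
    match String.split_on_char ',' s with
    | [n] -> Option.map (fun n -> Max n) (int_of_string_opt n)
    | [n; m] -> (
      match (int_of_string_opt n, int_of_string_opt m) with
      | Some n, Some m -> Some (Max_with_sub (n, m))
      | _ -> None )
    | _ -> None )
```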
C-states are power management states for CPUs where higher numbered
states represent deeper sleep modes with lower power consumption but
higher wake-up latency. The max_cstate parameter controls the deepest
C-state that CPUs are allowed to enter.

Common C-state values:
- C0: CPU is active (not a sleep state)
- C1: CPU is halted but can wake up almost instantly
- C2: CPU caches are flushed, slightly longer wake-up time
- C3+: Deeper sleep states with progressively longer wake-up times

To set max_cstate on the dom0 host, two commands are used: `xenpm` to
set it at runtime, and `xen-cmdline` to set it in the grub conf file so
that it takes effect after reboot.
xenpm examples:
```
   # xenpm set-max-cstate 0 0
   max C-state set to C0
   max C-substate set to 0 succeeded
   # xenpm set-max-cstate 0
   max C-state set to C0
   max C-substate set to unlimited succeeded
   # xenpm set-max-cstate unlimited
   max C-state set to unlimited
   # xenpm set-max-cstate -1
   Missing, excess, or invalid argument(s)
```
xen-command-line examples:
```
/opt/xensource/libexec/xen-cmdline --get-xen max_cstate
     "" -> unlimited
     "max_cstate=N" -> max cstate N
     "max_cstate=N,M" -> max cstate N, max c-sub-state M *)
/opt/xensource/libexec/xen-cmdline --set-xen max_cstate=1
/opt/xensource/libexec/xen-cmdline --set-xen max_cstate=1,0
/opt/xensource/libexec/xen-cmdline --delete-xen max_cstate
```

[xen-command-line.max_cstate](https://xenbits.xen.org/docs/unstable/misc/xen-command-line.html#max_cstate-x86).

This PR adds a new field `host.max_cstate` to manage the host's
max_cstate. `host.set_max_cstate` uses the two commands mentioned above
to configure it. During dbsync at xapi start, the field is synced from
`xen-cmdline --get-xen max_cstate`.
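A hedged sketch of how such a setter could drive the two commands; the `xenpm` lookup via PATH and the error handling are assumptions, not the PR's actual plumbing:

```
(* Hypothetical sketch: apply a max_cstate value at runtime (xenpm) and
   persistently (xen-cmdline), per the string convention above. *)
let xenpm = "xenpm"
let xen_cmdline = "/opt/xensource/libexec/xen-cmdline"

let run cmd args =
  let line = Filename.quote_command cmd args in
  if Sys.command line <> 0 then failwith ("command failed: " ^ line)

let set_max_cstate = function
  | "" ->
      (* unlimited: reset at runtime and drop the grub entry *)
      run xenpm ["set-max-cstate"; "unlimited"] ;
      run xen_cmdline ["--delete-xen"; "max_cstate"]
  | value ->
      (* value is "N" or "N,M": xenpm takes them as separate arguments *)
      run xenpm ("set-max-cstate" :: String.split_on_char ',' value) ;
      run xen_cmdline ["--set-xen"; "max_cstate=" ^ value]
```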
- write ntp servers to chrony.conf
- interaction with dhclient
  - handle /run/chrony-dhcp/$interface.sources
  - handle chrony.sh
- restart/enable/disable chronyd

Signed-off-by: Changlei Li <[email protected]>
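A minimal sketch of the per-interface sources file handling described above; the `iburst` option and the exact file contents are assumptions:

```
(* Hypothetical: write the /run/chrony-dhcp/$interface.sources file
   that chrony.conf includes, one "server" line per NTP server. *)
let write_sources_file ~interface ~servers =
  let path = Printf.sprintf "/run/chrony-dhcp/%s.sources" interface in
  let oc = open_out path in
  List.iter (fun s -> Printf.fprintf oc "server %s iburst\n" s) servers ;
  close_out oc
```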
At XAPI start, check the actual NTP config to determine the NTP mode,
whether NTP is enabled, and the NTP custom servers, and store them in
the xapi DB.

Signed-off-by: Changlei Li <[email protected]>
New fields: `host.ntp_mode`, `host.ntp_custom_servers`.
New APIs: `host.set_ntp_mode`, `host.set_ntp_custom_servers`,
`host.get_ntp_mode`, `host.get_ntp_custom_servers`,
`host.get_ntp_servers_status`.

**ntp_mode_dhcp**: In this mode, ntp uses the DHCP-assigned ntp servers
as sources. In Dom0, dhclient triggers `chrony.sh` to update the ntp
servers when a network event happens. It writes the ntp servers to
`/run/chrony-dhcp/$interface.sources`, and the dir `/run/chrony-dhcp` is
included in `chrony.conf`. The dhclient also stores the dhcp lease in
`/var/lib/xcp/dhclient-$interface.leases`, see
https://github.com/xapi-project/xen-api/blob/v25.31.0/ocaml/networkd/lib/network_utils.ml#L925.
When switching the ntp mode to dhcp, XAPI checks the lease file, finds
the ntp servers, and fills the chrony-dhcp file. The exec permission of
`chrony.sh` is added. When switching the ntp mode from dhcp to another
mode, XAPI removes the chrony-dhcp files and the exec permission of
`chrony.sh`. The operation is the same as in xsconsole, see
https://github.com/xapi-project/xsconsole/blob/v11.1.1/XSConsoleData.py#L593.
As part of this feature, xsconsole will later be changed to use XenAPI
to manage ntp, to avoid conflicts.

**ntp_mode_custom**: In this mode, ntp uses `host.ntp_custom_servers` as
sources. This is implemented by changing `chrony.conf` and restarting
chronyd. `host.ntp_custom_servers` is set by the user.

**ntp_mode_default**: In this mode, ntp uses the default-ntp-servers
from the XAPI config file. For example, the legacy default ntp servers
are `[0-3].centos.pool.ntp.org`, and the current default ntp servers
are `[0-3].xenserver.pool.ntp.org`. After an update or upgrade, the
legacy default ntp servers are recognized and changed to the current
default ntp servers; the mode stays ntp_mode_default.

Signed-off-by: Changlei Li <[email protected]>
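To make the three modes concrete, a small sketch of how the chrony.conf server lines could be derived per mode; the helper's shape and the `iburst` option are assumptions, not the PR's code:

```
(* Hypothetical rendering of chrony.conf server lines per NTP mode.
   Restarting chronyd and the chrony-dhcp file handling are elided. *)
type ntp_mode = Ntp_mode_dhcp | Ntp_mode_custom | Ntp_mode_default

let chrony_server_lines ~default_servers ~custom_servers = function
  | Ntp_mode_dhcp ->
      (* servers come from /run/chrony-dhcp/$interface.sources,
         which chrony.conf includes, so no explicit lines here *)
      []
  | Ntp_mode_custom ->
      List.map (Printf.sprintf "server %s iburst") custom_servers
  | Ntp_mode_default ->
      List.map (Printf.sprintf "server %s iburst") default_servers
```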
Add a new config option named legacy-default-ntp-servers. It is defined
in xapi.conf.d/xenserver.conf (the same as default-ntp-servers).
Hugo supports three styles of front matter for adding metadata; see
https://gohugo.io/content-management/front-matter/. Follow the other
pages in content/design/ and use the YAML style.

Signed-off-by: Changlei Li <[email protected]>
In the case of cross-pool SXM, the running VM is created on the
destination host in an initial Halted state. The state is then
corrected by refresh_vm. Running force_state_reset before it
incorrectly clears some VM config. Move force_state_reset to after
refresh_vm in pool_migrate_complete to solve this.

Signed-off-by: Changlei Li <[email protected]>
…api-project#6731)

There is a regression test failure between xapi v25.30.0 and v25.33.0.
The job is cross-pool SXM with vGPU. The source host's VM.migrate_send
failed with the exception `Storage_error ([S(Does_not_exist);[S(mirror)`,
which is raised by `MIRROR.stat`. The `MIRROR.stat` is triggered by the
destination host in `pool_migrate_complete`. The error is
`Server_error(HANDLE_INVALID, [ PGPU; OpaqueRef:NULL ])`. I found that
xapi-project#6648 inserts
`force_state_reset_keep_current_operations` in `pool_migrate_complete`,
which sets the VGPU's resident_on to NULL. After reverting this commit,
the job passes.
Although I doubt it, please check whether XenServer's proprietary code
uses these functions.

Signed-off-by: Pau Ruiz Safont <[email protected]>
This makes changing the chop tests easier

Signed-off-by: Pau Ruiz Safont <[email protected]>
This function is equivalent to calling take and drop with the same
parameters. It is also very similar to the chop function, with the
difference that an out-of-bounds parameter now returns empty lists
instead of raising an exception.

Signed-off-by: Pau Ruiz Safont <[email protected]>
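A sketch of such a split with the semantics described (clamping instead of raising); this is illustrative, not the stdext implementation:

```
(* split n xs = (take n xs, drop n xs), with out-of-bounds n clamped. *)
let rec split n = function
  | xs when n <= 0 -> ([], xs)
  | [] -> ([], [])
  | x :: xs ->
      let before, after = split (n - 1) xs in
      (x :: before, after)

(* split 2 [1; 2; 3] = ([1; 2], [3])
   split 5 [1; 2]    = ([1; 2], [])   (no exception) *)
```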
Now users can use out-of-bounds limits without fear. I've used the
opportunity to consolidate code that sorts and splits lists in
db_gc_util.

Signed-off-by: Pau Ruiz Safont <[email protected]>
This can be replaced with a List.drop. For now this is available in
xapi's stdext, but it will become part of the standard library.

Signed-off-by: Pau Ruiz Safont <[email protected]>
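For reference, drop has roughly this shape; a sketch under the clamping semantics described above, not the stdext code (recent OCaml standard libraries gained List.drop, matching the note above):

```
(* Sketch of drop: remove the first n elements; if n exceeds the
   length, the result is simply []. *)
let rec drop n = function
  | _ :: xs when n > 0 -> drop (n - 1) xs
  | xs -> xs

(* drop 2 [1; 2; 3] = [3];  drop 5 [1] = [] *)
```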
It's surprising this bug went unnoticed

Signed-off-by: Pau Ruiz Safont <[email protected]>
changlei-li and others added 20 commits December 31, 2025 13:51
There is a race condition on the VM cache between pool_migrate_complete
and VM events.
In the cross-pool migration case, the design is to create the VM with
power_state Halted in the XAPI db. In pool_migrate_complete, add_caches
creates an empty xenops_cache for the VM, then refresh_vm compares the
cached power_state None with the real state Running to write the
right power_state to the XAPI db.
In the failing case, the observed sequence is:
-> VM event 1 update_vm
-> pool_migrate_complete add_caches (cache power_state None)
-> pool_migrate_complete refresh_vm
-> VM event 1 updates cache (cache power_state Running)
-> VM event 2 update_vm (Running <-> Running, XAPI DB not updated)
When pool_migrate_complete runs add_caches, the cache update from the
earlier VM event 1 breaks the design intention.

This commit adds a wait in pool_migrate_complete to ensure all
in-flight events complete before add_caches. Then there is no
race condition.

Signed-off-by: Changlei Li <[email protected]>
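A hypothetical sketch of the resulting ordering; the operations are taken as parameters to keep it self-contained, and the names stand in for xapi's real internals:

```
(* Illustrative only: the fix's ordering in pool_migrate_complete. *)
let pool_migrate_complete ~wait_for_in_flight_events ~add_caches
    ~refresh_vm vm =
  wait_for_in_flight_events vm ; (* drain pending VM events first *)
  add_caches vm ;                (* cache starts with power_state None *)
  refresh_vm vm                  (* None <> Running, so the DB is updated *)
```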
This function is useful for applying a function that may fail to a
list; it returns the first error, or all the successful results.

Signed-off-by: Pau Ruiz Safont <[email protected]>
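A sketch of a try_map with the behaviour described (first error wins, otherwise all results in order); the real signature in xapi's stdext may differ:

```
(* try_map f xs: Ok of all results, or the first Error from f. *)
let try_map (f : 'a -> ('b, 'e) result) (xs : 'a list) :
    ('b list, 'e) result =
  let rec loop acc = function
    | [] -> Ok (List.rev acc)
    | x :: rest -> (
      match f x with Ok y -> loop (y :: acc) rest | Error e -> Error e )
  in
  loop [] xs
```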
Now Db_not_initialized is raised instead of Not_found, which was raised
by List.hd. This makes the exception much more recognizable and easier
to locate.

Signed-off-by: Pau Ruiz Safont <[email protected]>
These can be easily replaced with a match against the list, or by
changing parameters to separate the head from the rest of the list.

Signed-off-by: Pau Ruiz Safont <[email protected]>
Previously an exception was raised for last; head is a new function.

Also adds a latest_release to datamodel_types that can't fail after the
module has been loaded. (The list is populated, so it won't fail when
the module is loaded either.)

Signed-off-by: Pau Ruiz Safont <[email protected]>
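A sketch of the idea behind latest_release: compute it eagerly at module load time so it cannot raise afterwards. The release names below are placeholders, not the actual datamodel_types contents:

```
(* Evaluated once when the module is loaded; since release_order is a
   non-empty literal, latest_release cannot raise afterwards. *)
let release_order = ["release_a"; "release_b"; "release_c"]

let latest_release = List.hd (List.rev release_order)
```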
This forces users to deal with None in the new code.

One of them is replaced with the new function List.try_map, which
returns the first error. While split_on_char never returns an empty
list, it's a good test for using try_map.

Signed-off-by: Pau Ruiz Safont <[email protected]>
The return type is the same, but it's easier to see what it does.

Signed-off-by: Pau Ruiz Safont <[email protected]>
This is done by using options and avoiding exceptions as much as
possible, or by using custom exceptions that help find the cause.
Currently, vhd-tool provides several "hybrid" modes where it exports into vhd
from raw, using the information from the VHD bitmaps to determine which blocks
and sectors contain data (to avoid reading zero blocks).

Other tools are also handling VHD-backed VDIs (we are exporting them as part of
XVA export, and now they can also be exported to QCOW), and currently they have
to read the whole raw disk.

Instead, provide a read_headers command that reports the allocated
clusters for other tools to use, allowing them to speed up handling of
sparse VDIs. It uses a new blocks_json function in Vhd_format.

Signed-off-by: Andrii Sultanov <[email protected]>
The body has less indentation this way

Signed-off-by: Andrii Sultanov <[email protected]>
This allows using it in stream_vdi and qcow_tool_wrapper without introducing a
dependency cycle.

Signed-off-by: Andrii Sultanov <[email protected]>
Qcow_tool_wrapper and Vhd_tool_wrapper expect a particular driver to be backing
the VDI and fall back to handling the VDI as raw otherwise - they will be using
backing_file_of_device_with_driver.

Stream_vdi, however, will need to branch on the type of the driver, and it will
use backing_info_of_device (which also returns the type of the driver)

Signed-off-by: Andrii Sultanov <[email protected]>
Split common code used by {Vhd,Qcow}_tool_wrapper into a new vhd_qcow_parsing
module.

Since Vhd_tool_wrapper.run_vhd_tool is hardcoded to read the progress
percentage printed by vhd-tool, we have to use the more generic
Vhd_qcow_parsing.run_qcow_tool to run vhd-tool.

Since VHD and QCOW follow the same JSON format, use the same
parse_header function.

Signed-off-by: Andrii Sultanov <[email protected]>
Reads the bitmaps for VHD- and QCOW-backed VDIs, determines which clusters are
allocated and only reads and writes these to the resulting xva.

This avoids the need for the "timeout workaround", which is needed when no data
has been sent for an extended period of time (so stream_vdi writes a "packet"
that doesn't carry any data, just a checksum of an empty body. In the case of a
compressed export, however, the compressor binary buffers output and this
timeout workaround does not work).

This also greatly speeds up export of VMs with sparse VDIs.

Signed-off-by: Andrii Sultanov <[email protected]>
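In spirit, the fast path reduces to copying only the allocated extents; a simplified sketch where the types and names are illustrative, not stream_vdi's actual interface:

```
(* Copy only allocated extents instead of scanning the whole raw disk.
   The extent list would come from the read_headers JSON; read/write
   are passed in to keep the sketch self-contained. *)
type extent = {offset: int64; length: int}

let copy_allocated ~(read : int64 -> int -> bytes)
    ~(write : int64 -> bytes -> unit) (extents : extent list) =
  List.iter
    (fun {offset; length} -> write offset (read offset length))
    extents
```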
Some of the users of the function did not handle exceptions correctly - make
the "not found" case explicit with an option.

Signed-off-by: Andrii Sultanov <[email protected]>
…xport (xapi-project#6786)

Following xapi-project#6769, add a
`read_headers` command to `vhd-tool` (following `qcow-tool`'s JSON
format). This allows `stream_vdi` to determine which clusters are
allocated in QCOW- and VHD-backed VDIs, only reading and writing
allocated blocks (previously it read the whole raw disk, verifying if
blocks only contain zeros).

If there are any issues during header parsing, it falls back to the slow
path (we don't handle errors during XVA export well, embedding 500
errors inside 200 responses).

This greatly speeds up XVA export for VMs with sparse VDIs:

5gb empty VDI: 19s -> 3s
5gb empty VDI + 2mb filled VDI: 22s -> 6s
5gb empty VDI + half-empty VDI (~4 gigs out of 10): 89s -> 49s

Note: If the block size of the VDI is larger than the size of the XVA
blocks (this is currently the case for VHD, it has blocks of 2mb, while
stream_vdi splits xvas into files of 1mb), stream_vdi can overestimate
the size of allocated data (say, if only the first half of the VHD block
has data), but in testing with real VDIs this impact was within a margin
of error.
@changlei-li (Contributor) commented Jan 7, 2026

Remember to bump schema_minor_vsn as the comment says

BEWARE: if this changes, check that schema has been bumped accordingly in ocaml/idl/datamodel_common.ml, usually schema_minor_vsn

Signed-off-by: Stephen Cheng <[email protected]>
@stephenchengCloud (Collaborator, Author)

Remember to bump schema_minor_vsn as the comment says

BEWARE: if this changes, check that schema has been bumped accordingly in ocaml/idl/datamodel_common.ml, usually schema_minor_vsn

Updated vsn and also the datamodel lifecycle

@stephenchengCloud merged commit 64fda7a into xapi-project:feature/limit-vnc-console-sessions on Jan 7, 2026
16 checks passed