Add release notes for 0.8 #1277
Changes from 9 commits
dd48dc9
68273c8
16384b0
c0a9a2b
c6b4930
d314d87
99e31de
f7e28e6
3815628
a48aede
274d5b2
adb5510
1ac5aed
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,10 +1,121 @@ | ||
| # Release Notes | ||
|
|
||
| This document contains release notes for the Infra Controller (NICo) project. | ||
| This document contains release notes for the NVIDIA Infra Controller (NICo) project. | ||
|
|
||
| ## Infra Controller 0.2.0 | ||
| ## Infra Controller v0.8 | ||
|
|
||
| This release of Infra Controller (NICo) is open-source software (OSS). | ||
| ### Highlights | ||
|
|
||
| - **Documentation refresh + unified REST API docs**: Updated the docs look and feel at [https://docs.nvidia.com/infra-controller/documentation/introduction](https://docs.nvidia.com/infra-controller/documentation/introduction), and consolidated REST API information into the same documentation set. | ||
| - **Simplified deployment**: Added NICo deployment [prerequisite tool](https://github.com/NVIDIA/ncx-infra-controller-core/tree/main/helm-prereqs) `helm-prereqs` to install required dependencies and enable easy NICo deployment. | ||
| - **Rack Level Administration (RLA)**: Significantly expanded rack/tray operations via REST APIs (validation, power, firmware, bring-up, task query), gated by a site-config flag. | ||
|
|
||
| ### Compatibility Matrix | ||
|
|
||
| The following components are supported for this release: | ||
|
|
||
| | Component | Version | | ||
| |----------------------|---------| | ||
| | Cloud API | v1.4.2 | | ||
|
Coco-Ben marked this conversation as resolved.
Outdated
|
||
| | Carbide | v0.8.0 | | ||
|
Coco-Ben marked this conversation as resolved.
Outdated
|
||
| | Elektra (site agent) | v0.8.0 | | ||
|
Should we specify the Site Agent as well? @Coco-Ben
Site Agent version aligns with REST, so this should be: |
||
|
|
||
| The following dependencies have been validated for this release: | ||
|
|
||
| | Component | Version | | ||
| |------------------------|-----------------| | ||
| | DPU NIC Firmware (BF3) | 32.47.2682 | | ||
| | HBN | 3.2.2-doca3.2.2 | | ||
|
|
||
| ### Improvements | ||
|
|
||
| #### Deployment and Operations | ||
|
|
||
| - `helm-prereqs` deployment tool (Core): | ||
| - Helm/Helmfile-driven installation of NICo prerequisites (MetalLB, Zalando PostgreSQL Operator, cert-manager, HashiCorp Vault, and external-secrets) along with the main NICo components (NICo Core and NICo REST). | ||
| - Includes orchestration and automation scripts such as `helmfile.yaml`, `setup.sh`, `preflight.sh`, and `clean.sh`. | ||
| - This tool significantly reduces installation time compared to manual installation. | ||
| - Location: [https://github.com/NVIDIA/ncx-infra-controller-core/tree/main/helm-prereqs](https://github.com/NVIDIA/ncx-infra-controller-core/tree/main/helm-prereqs) | ||
|
|
||
| #### Rack Level Administration (RLA) | ||
|
|
||
| - RLA REST API: | ||
| - Rack endpoints (RLA-backed): | ||
| - List racks / get rack by ID | ||
| - Query rack tasks | ||
| - Validate racks / validate rack by ID | ||
| - Power control (single + batch) | ||
| - Firmware update (single + batch) | ||
| - Bring-up (single + batch) | ||
| - GB200 NVLink switches are now supported for lifecycle management. | ||
| - GB200 power shelves are now supported for lifecycle management. | ||
| - GB200 racks are now supported for lifecycle management: | ||
| - At rack-level: rack bring up, power control, and firmware update | ||
| - At tray-level: compute, NVSwitch, and powershelf tray operations | ||
|
|
||
| #### VPC and Routing | ||
|
|
||
| - BGP session password support has been added for peering sessions initiated by managed host DPUs. | ||
| - Instance creation/update now supports explicit IP selection within a VPC prefix. | ||
|
|
||
| #### BMC and Site Explorer | ||
|
|
||
| - The BMC now supports static IP address assignment. | ||
|
|
||
| #### Health and Observability | ||
|
|
||
| - Health alerts can now have a specified severity level, which is "Critical" by default. If the alert classification is greater than or equal to the severity level, it will appear as an alert. Otherwise, it will be ignored. Refer to the `crates/health/example/config.example.toml` file for more details. | ||
| - The REST API now supports NVUE health checks. | ||
| - NICo now supports NMX-T metric collection for switches. | ||
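The severity rule described above (an alert is raised only when its classification is at or above the configured level, which defaults to "Critical") can be sketched as follows. This is a minimal illustration, not the NICo implementation: the `Severity` levels other than "Critical", their ordering, and the function names are assumptions made for the example. The actual options are documented in `crates/health/example/config.example.toml`.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical levels and ordering; only "Critical" is named in the release notes.
    INFO = 0
    WARNING = 1
    CRITICAL = 2


def should_alert(classification: Severity,
                 threshold: Severity = Severity.CRITICAL) -> bool:
    """Return True when the classification meets or exceeds the configured level."""
    return classification >= threshold


def filter_alerts(events, threshold: Severity = Severity.CRITICAL):
    """Keep only (name, classification) pairs that qualify as alerts."""
    return [event for event in events if should_alert(event[1], threshold)]


events = [("fan-speed", Severity.WARNING), ("psu-failure", Severity.CRITICAL)]
default_alerts = filter_alerts(events)                    # default level: Critical
relaxed_alerts = filter_alerts(events, Severity.WARNING)  # lower level admits more alerts
```

With the default level, only the critical event is kept; lowering the level to the hypothetical `WARNING` admits both events.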
|
|
||
| #### Identity and Security | ||
|
|
||
| - Credentials APIs have been added, so operators can manage BMC/UEFI credentials via API. | ||
| - The SuperNIC lockdown key management workflow has been implemented. | ||
|
Is this captured in our docs? |
||
| - Vault connections now enforce TLS verification. | ||
|
|
||
| #### Debug UI/CLI | ||
|
|
||
| - A new IPAM section has been added to the admin UI covering DHCP, DNS, and networks. | ||
| - An expected rack component details panel has been added to the admin UI. | ||
|
|
||
| #### Platform and Infrastructure | ||
|
|
||
| - `libredfish` has been updated from v0.39.2 to v0.43.10. | ||
| - The x86 QCOW imager has been updated to Ubuntu 24.04. | ||
|
|
||
| ### Bug Fixes | ||
|
|
||
| #### VPC and Routing | ||
|
|
||
| - VPC peering VNI and prefix lists are now sorted deterministically in network config responses. | ||
|
|
||
| #### API Robustness and Validation | ||
|
|
||
| - Fixed Expected Machine OpenAPI issues around BMC default user fields. | ||
| - Standardized error handling and improved error attribution. | ||
| - Improved validation for RLA flows and addressed inventory/component-manager synchronization issues. | ||
| - Enhanced single and batch instance APIs for performance and clarity. | ||
| - Fixed typos/validation in `nvLinkLogicalPartitionId`. | ||
|
|
||
| #### Networking | ||
|
|
||
| - Reserved IP addresses are now strictly rejected during interface update workflows. | ||
| - Power status is now filterable in instance status queries. | ||
| - Unknown query parameters are now rejected with a 400 response to prevent typos from being silently ignored. | ||
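The unknown-parameter check noted above can be sketched as follows. This is an illustration of the general technique only: the parameter names in `ALLOWED_PARAMS`, the function name, and the error body shape are all hypothetical and are not taken from the NICo REST API.

```python
# Hypothetical allow-list; the real API defines its own parameters per endpoint.
ALLOWED_PARAMS = frozenset({"siteId", "powerStatus", "pageSize"})


def validate_query_params(params, allowed=ALLOWED_PARAMS):
    """Return (status_code, body); 400 if any parameter is not in the allow-list."""
    unknown = sorted(set(params) - set(allowed))
    if unknown:
        # Reject instead of ignoring, so a misspelled filter cannot silently
        # return unfiltered results.
        return 400, {"error": "unknown query parameters: " + ", ".join(unknown)}
    return 200, {}


ok_status, _ = validate_query_params({"powerStatus": "on"})
bad_status, bad_body = validate_query_params({"powerStatsu": "on"})  # typo in name
```

Rejecting the misspelled `powerStatsu` surfaces the typo to the caller, whereas silently dropping it would return results without the intended filter.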
|
|
||
| #### Security and Cleanup | ||
|
|
||
| - TLS certificates are now required by default for IPAM and related services. | ||
| - Removed deprecated IPAM server code and cleaned up legacy DB relationships. | ||
|
|
||
| #### Debug UI/CLI | ||
|
|
||
| - A hardcoded credential bug has been fixed in the debug UI. | ||
|
|
||
| ## Infra Controller v0.2 | ||
|
|
||
| This release of NICo is open-source software (OSS). | ||
|
|
||
| ### Improvements | ||
|
|
||
|
|
||
site-config flag - is that necessary? Can we remove it?
Also, task query is not released yet.
cc: @zhaozhongn - confirm pls