Skip to content

Network Traversal, Diagnostics#344

Open
rochaferraz wants to merge 36 commits intoopenthread:mainfrom
rochaferraz:traverser-5
Open

Network Traversal, Diagnostics#344
rochaferraz wants to merge 36 commits intoopenthread:mainfrom
rochaferraz:traverser-5

Conversation

@rochaferraz
Copy link
Contributor

@rochaferraz rochaferraz commented Feb 18, 2026

This PR introduces enhancements to the OT Commissioner, focusing on network discovery, improved diagnostics, and new utility scripts.

Core Enhancements:

  • Network Traversal & Topology:
    • Added a new traversenetwork command to the CLI to actively discover and map the Thread network topology.
    • Implemented a Traverser class responsible for querying the leader, routers, and children to gather network diagnostics.
    • Supports generating a detailed JSON report (--json <filename>) containing the collected network data and topology.
  • Fragmented Diagnostic Responses:
    • Reworked the handling of DIAG_GET.ans messages to support fragmented responses using the Query ID and Answer TLVs. This allows for the reassembly of large diagnostic reports split across multiple CoAP messages.
    • Enhanced NetDiagData with a Merge() function to combine data from multiple fragments.
    • Added ClearNetDiagTlvs() to manage diagnostic state.
    • Included comprehensive unit tests (CommissionerImplFragmentTest) to verify the reassembly logic for various scenarios.
  • Enhanced NetDiagData:
    • The NetDiagData structure now supports a wider range of TLVs, providing more detailed network information.
    • Improved JSON serialization (json.cpp) to include these new fields in the output.
  • CoAP & TLV Parser Improvements:
    • Added a safety check in CoAP SetToken to prevent potential issues with empty tokens.
    • Refined TLV parsing logic to better handle diagnostic data.

Other Changes:

  • Updates to CMake build files to include new source files.

@gemini-code-assist
Copy link

Summary of Changes

Hello @rochaferraz, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the OpenThread Commissioner's capabilities by introducing advanced network diagnostic and provisioning tools. It enables comprehensive network topology discovery, improves the reliability of large diagnostic data transfers through fragmentation support, and adds a flexible proxy mode for joiner interactions. Furthermore, new Python scripts provide convenient utilities for QR code-based provisioning and network visualization, collectively offering a more powerful and user-friendly experience for managing Thread networks.

Highlights

  • Network Traversal Command: Introduced a new traversenetwork CLI command and a Traverser class to actively discover and map the Thread network topology, capable of generating detailed JSON reports of collected network data.
  • Fragmented Diagnostic Responses: Reworked the handling of DIAG_GET.ans messages to support fragmented responses using Query ID and Answer TLVs, enabling reassembly of large diagnostic reports across multiple CoAP messages. This includes enhancing NetDiagData with a Merge() function and adding ClearNetDiagTlvs().
  • Joiner Proxy Mode: Added a 'proxy mode' to the Commissioner configuration, allowing messages from joiners without a matching PSKd to be forwarded via OnJoinerMessage() and enabling the Commissioner to send UDP payloads back to joiners using SendToJoiner().
  • New Utility Scripts: Added qrcreds.py for converting Thread Operational Dataset TLV hex strings to a custom QR code format and generating PNG images, and node_diagram.py for visualizing Thread network topology from JSON output.
  • Enhanced Network Diagnostic Data: Expanded the NetDiagData structure to support a wider range of TLVs, providing more detailed network information, and improved JSON serialization to include these new fields.
  • CoAP and TLV Parsing Improvements: Implemented a safety check in CoAP SetToken to prevent issues with empty tokens and refined TLV parsing logic for diagnostic data, including more robust validation checks for TLV lengths.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • include/commissioner/commissioner.hpp
    • Added mProxyMode to Config struct to enable joiner proxying.
    • Introduced OnJoinerMessage virtual method to CommissionerHandler for forwarding joiner UDP payloads.
    • Added SendToJoiner virtual method to Commissioner interface for delivering UDP payloads to joiners.
  • include/commissioner/network_diag_data.hpp
    • Defined new Answer and QueryId structs for fragmented diagnostic responses.
    • Extended NetDiagData with mAnswer and mQueryId members.
    • Implemented a Merge function in NetDiagData to combine data from multiple diagnostic fragments.
    • Added kAnswerBit and kQueryIdBit constants to NetDiagData for tracking presence of new TLVs.
  • script/node_diagram
    • Added a new Python script to visualize Thread mesh network topology from JSON data, saving it as a PNG image.
  • script/qrcreds.py
    • Added a new Python script to convert between Thread Operational Dataset TLV hex strings and a custom QR code format.
    • Implemented functionality to generate PNG QR code images for device provisioning.
  • src/app/cli/CMakeLists.txt
    • Included traverser.cpp and traverser.hpp in the commissioner-cli executable build.
    • Added traverser.cpp and network_diag_data.cpp to test executables.
  • src/app/cli/interpreter.cpp
    • Included <set> header for new data structures.
    • Included app/cli/traverser.hpp.
    • Added traversenetwork command to the interpreter's evaluator map and usage map.
    • Expanded sDiagFlagMap with new diagnostic flags like mode, route64, leaderdata, addrs, childtable, and eui64.
    • Introduced Execute method to allow running multiple commands from a string.
    • Refactored ProcessBr to correctly manage event_base memory.
    • Updated ProcessNetworkDiagJob to use DiagAnsDataMap and resultStream as local variables.
    • Added ProcessTraverseNetwork and ProcessTraverseNetworkJob methods to handle network traversal commands.
  • src/app/cli/interpreter.hpp
    • Declared Traverser as a friend class.
    • Added Execute method for running command strings.
    • Declared ProcessTraverseNetwork and ProcessTraverseNetworkJob methods.
  • src/app/cli/main.cpp
    • Updated usage message to include new command-line options.
    • Added command-line options -e for executing commands and -p for specifying the CoAP port.
    • Modified main loop to execute commands from -e argument if provided, otherwise run the interactive interpreter.
  • src/app/cli/traverser.cpp
    • Added a new source file implementing the Traverser class for network topology discovery.
    • Defined kDiagFlagNames mapping diagnostic flags to their string representations.
    • Implemented ProcessTraverseNetwork and ProcessTraverseNetworkJob to query Thread network devices (leader, routers, children) in chunks.
    • Included logic to report missing TLVs and poll for diagnostic responses.
    • Supported JSON output of the collected network data to a specified file.
  • src/app/cli/traverser.hpp
    • Added a new header file declaring the Traverser class and its static methods for network traversal.
  • src/app/commissioner_app.cpp
    • Included <mutex> for thread-safe access to mDiagAnsDataMap.
    • Modified OnDiagGetAnswerMessage to merge incoming diagnostic data into mDiagAnsDataMap using a mutex.
    • Implemented ClearNetDiagTlvs to clear the diagnostic answer map in a thread-safe manner.
  • src/app/commissioner_app.hpp
    • Included <mutex> for thread safety.
    • Declared ClearNetDiagTlvs method.
    • Added mMutex member for protecting mDiagAnsDataMap.
  • src/app/commissioner_app_dummy.cpp
    • Provided a dummy implementation for ClearNetDiagTlvs.
  • src/app/commissioner_app_mock.hpp
    • Added a mock method for ClearNetDiagTlvs.
  • src/app/json.cpp
    • Added to_json serialization functions for Answer and QueryId structs.
    • Added to_json serialization functions for LeaderData, RouteDataEntry, Route64, ModeData, ChildTableEntry, ChildIpv6AddrInfo, HasRouteEntry, BorderRouterEntry, SixLowPanContext, PrefixEntry, and NetworkData.
    • Updated to_json for NetDiagData to include newly supported fields like Eui64, Addrs, Mode, Route64, LeaderData, ChildTable, ChildIpv6AddrsInfoList, NetworkData, Answer, and QueryId.
  • src/library/CMakeLists.txt
    • Added network_diag_data.cpp to the commissioner library sources.
    • Included googletest/include for public headers.
    • Added gmock and googletest/repo/googlemock/include for test builds.
  • src/library/coap.cpp
    • Added a safety check in SetToken to prevent undefined behavior when aToken is empty.
  • src/library/commissioner_impl.cpp
    • Added Query ID TLV to DIAG_GET.qry messages.
    • Modified HandleDiagGetAnswer to process fragmented diagnostic responses using Query ID and Answer TLVs, accumulating data in mPendingDiagQueries.
    • Implemented SendToJoiner to send UDP payloads to a specific joiner session.
    • Improved error logging for TLV decoding failures.
    • Corrected ChildId decoding in DecodeChildTable from << 9 to << 8.
    • Relaxed MacCounters length check to >= instead of ==.
    • Improved Connectivity length check to >= 7.
    • Added checks for empty buffers in DecodeChildIpv6AddressList, DecodeChild, DecodeRouterNeighbor, and DecodeMleCounters.
    • Added length checks for BatteryLevel, SupplyVoltage, MaxChildTimeout, and Version TLVs.
  • src/library/commissioner_impl.hpp
    • Included gtest/gtest_prod.h for friend tests.
    • Declared SendToJoiner virtual method.
    • Added mPendingDiagQueries map and mNextQueryId for handling fragmented diagnostic responses.
    • Added FRIEND_TEST declarations for new fragmentation and proxy mode tests.
  • src/library/commissioner_impl_test.cpp
    • Added NetDiagDataTest.Merge_AppendsVectors to verify correct merging of vector fields.
    • Added NetDiagDataTest.Decode_AnswerAndQueryId to test decoding of new TLVs.
    • Introduced CommissionerImplFragmentTest fixture for testing fragmented diagnostic responses.
    • Added tests for FragmentedDiagResponse, FragmentedDiagResponse_SinglePacketWithQueryId_Implicit, FragmentedDiagResponse_SinglePacketWithQueryId_Explicit, FragmentedDiagResponse_Legacy, and FragmentedDiagResponse_MultiPacket_ThreeFragments.
  • src/library/commissioner_safe.cpp
    • Implemented SendToJoiner for the CommissionerSafe wrapper, using asynchronous execution.
  • src/library/commissioner_safe.hpp
    • Included necessary headers for event2, gtest_prod, stdint, functional, memory, queue, string, thread, vector, commissioner/defines.hpp, commissioner/error.hpp, and commissioner/network_data.hpp.
    • Changed CommissionerSafe constructor to explicit.
    • Declared SendToJoiner override.
    • Added FRIEND_TEST declarations for CommissionerSafeTestProxyMode tests.
  • src/library/commissioner_safe_test.cpp
    • Included gmock/gmock.h, commissioner/commissioner.hpp, commissioner/defines.hpp, commissioner/error.hpp, common/address.hpp, and library/joiner_session.hpp.
    • Added TEST(CommissionerSafeTestProxyMode, ShouldBeAbleToReceiveJoinerMessage) to verify joiner message reception in proxy mode.
    • Added TEST(CommissionerSafeTestProxyMode, ShouldBeAbleToSendToJoinerIfJoinerSessionExists) to test sending messages to existing joiner sessions.
    • Added TEST(CommissionerSafeTestProxyMode, ShouldFailToSendToJoinerForNotFoundIfJoinerSessionDoesNotExist) to test error handling for non-existent joiner sessions.
  • src/library/joiner_session.cpp
    • Modified JoinerSession constructor to conditionally initialize DTLS and CoAP based on IsProxyMode().
    • Updated Start() method to handle proxy mode, setting a commissioning timeout instead of DTLS handshake.
    • Modified RecvJoinerDtlsRecords to accept aJoinerUdpPort and store it with the received data.
    • Updated SendRlyTx to accept aJoinerUdpPort and log proxy mode specific messages.
    • Modified RelaySocket::Send to accept aPort and pass it to SendRlyTx.
    • Modified RelaySocket::Receive to retrieve aUdpPort along with data, and added port mismatch checks.
    • Changed mRecvBuf to mRecvBufs (a queue of std::pair<ByteArray, uint16_t>) to store packets with their source ports.
  • src/library/joiner_session.hpp
    • Included stddef.h, stdint.h, memory, queue, utility, commissioner/defines.hpp, common/address.hpp, and library/socket.hpp.
    • Updated GetPeerAddr() and GetPeerPort() to use mRelaySocket.
    • Renamed Connect() to Start().
    • Modified Disabled() to check if mDtlsSession is valid before accessing its state.
    • Updated RecvJoinerDtlsRecords signature to include aJoinerUdpPort.
    • Added SendTo method for sending data to a joiner.
    • Added IsProxyMode() method to check if the session is in proxy mode.
    • Updated SendRlyTx signature to include aJoinerUdpPort.
    • Modified RelaySocket to include Send(const uint8_t *aBuf, size_t aLen, uint16_t aPort) and Receive(uint8_t *aBuf, size_t aMaxLen, uint16_t &aUdpPort).
    • Replaced ByteArray mRecvBuf with std::queue<std::pair<ByteArray, uint16_t>> mRecvBufs.
    • Changed coap::Coap mCoap to std::shared_ptr<coap::Coap> mCoap.
  • src/library/network_diag_data.cpp
    • Added a new source file defining constexpr initializations for NetDiagData bit flags.
  • src/library/tlv.cpp
    • Updated Tlv::IsValid() for Network Diagnostic TLVs to include more specific length checks based on Thread specifications.
    • Added validation for new kNetworkDiagAnswer and kNetworkDiagQueryID TLVs.
    • Improved logging for dropped invalid/unknown TLVs from LOG_WARN to LOG_DEBUG.
  • src/library/tlv.hpp
    • Added kNetworkDiagAnswer and kNetworkDiagQueryID to the tlv::Type enum.
    • Added new Network Data TLVs: kNetworkDataCommissioningData, kNetworkDataService, and kNetworkDataServer.
Activity
  • The pull request was created by rochaferraz to introduce significant new features and improvements to the OT Commissioner.
  • No human comments or reviews have been recorded yet for this pull request.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@rochaferraz
Copy link
Contributor Author

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This is a substantial pull request that introduces significant enhancements to the OT Commissioner, including network traversal and topology discovery, improved diagnostics with support for fragmented responses, and new utility scripts for QR code generation and network visualization. The changes are well-structured and the new features are thoughtfully implemented. My review focused on correctness and maintainability, and I've identified a couple of areas for improvement: one related to TLV validation logic and another concerning packet handling in the new proxy mode. Overall, this is a very solid contribution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces significant enhancements, including network traversal capabilities, improved diagnostics with support for fragmented responses, and new utility scripts for network visualization and QR code generation. The changes are well-structured and the new features are valuable. I have one suggestion to improve the consistency of the JSON output from the traversenetwork command, which would make it easier for consumer scripts to parse.

@rochaferraz rochaferraz force-pushed the traverser-5 branch 3 times, most recently from 9291836 to 1f99d5e Compare February 19, 2026 03:21
- Added safety check in CoAP SetToken to prevent undefined behavior with empty vectors.
- Enhanced TLV parsing utilities to support fragmented diagnostic data.
- Extended NetDiagData to support additional TLVs (e.g., Child Table, Address List).
- Implemented serialization helpers and constants in network_diag_data.cpp.
- Updated CMakeLists.txt to include the new source file.
- Implemented logic to handle fragmented diagnostic responses using Query ID and Answer TLVs.
- Added accumulation of multi-packet responses into a single NetDiagData structure.
- Enhanced debug logging for diagnostic transactions.
…tion

- Added NetDiagGetQuery method to CommissionerApp interface.
- Implemented dummy and mock implementations for testing.
- Added json.cpp helper for serializing NetDiagData to JSON format.
- Introduced 'traversenetwork' command in CLI.
- Implemented Traverser class to discover leader, routers, and children.
- Added support for generating detailed JSON network topology reports.
- Updated Interpreter to handle new command syntax and logic.
- Added CommissionerImplFragmentTest to verify handling of fragmented diagnostic responses.
- Covered scenarios for legacy, single-packet, and multi-packet (reassembly) responses.
- Added script/node_diagram to visualize the network topology from the JSON output.
This script converts between Thread Operational Datasets (as TLV hex strings)
and a custom QR code text string format used for provisioning Thread devices.
It can also generate PNG image files of the QR codes.

The custom "THREAD:" format is modeled after the standard Wi-Fi QR code
provisioning format. The Wi-Fi format itself is derived from the MeCard
specification, originally developed by NTT DoCoMo in Japan for sharing
contact information via QR codes on cellular phones.

Format Structure:
  THREAD:S:<Network Name>;P:<Network Key>;I:<PAN ID>;E:<Ext PAN ID>;C:<Channel>;;

Removed unused network_evaluator.cpp and network_evaluator.hpp as they were
not referenced or compiled anywhere.
The leader was observed sending 0-byte kNetworkDiagChildIpv6Address
(type 30) TLVs, which would previously fail validation and be dropped.
Update the validation logic to gracefully accept these empty TLVs.
In proxy mode, if a received UDP packet is larger than the buffer provided to
Receive(), the packet was not fully consumed and the remainder was left in
the queue. This could lead to processing a partial packet as a full packet on
the next read, which is incorrect for datagram-based communication.

Drop the oversized packet entirely by removing it from the queue.
The JSON field MacAddr holds the RLOC16, which is inconsistent with RouterNeighbor and Child objects that use Rloc16 for the same data type. This changes the key from MacAddr to Rloc16 in the C++ JSON generator and updates the Python node diagram script to use the new key.
…failure

Forward joiner messages to OnJoinerMessage callback when in Proxy Mode, bypassing DTLS. Also added missing FRIEND_TEST.
Reverted content to 299e537 to remove unnecessary refactoring and restore original Proxy Mode logic. Re-applied fix for dropping oversized packets in Proxy Mode by flushing the receive queue.
@codecov
Copy link

codecov bot commented Feb 20, 2026

Codecov Report

❌ Patch coverage is 22.69601% with 562 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.92%. Comparing base (6a5cddb) to head (19f2bf2).
⚠️ Report is 36 commits behind head on main.

Files with missing lines Patch % Lines
src/library/network_traverser.cpp 1.94% 202 Missing ⚠️
src/app/cli/traverser.cpp 0.00% 151 Missing ⚠️
src/app/json.cpp 0.00% 74 Missing ⚠️
src/library/commissioner_impl.cpp 61.20% 45 Missing ⚠️
src/app/cli/console.cpp 0.00% 29 Missing ⚠️
src/app/cli/interpreter.cpp 60.86% 18 Missing ⚠️
src/app/commissioner_app.cpp 0.00% 14 Missing ⚠️
src/app/cli/main.cpp 40.00% 9 Missing ⚠️
src/library/commissioner_safe.cpp 0.00% 7 Missing ⚠️
src/library/tlv.cpp 77.77% 6 Missing ⚠️
... and 4 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #344      +/-   ##
==========================================
- Coverage   72.75%   68.92%   -3.83%     
==========================================
  Files          72       76       +4     
  Lines        7444     9011    +1567     
==========================================
+ Hits         5416     6211     +795     
- Misses       2028     2800     +772     
Files with missing lines Coverage Δ
include/commissioner/commissioner.hpp 21.42% <ø> (-3.58%) ⬇️
include/commissioner/network_diag_data.hpp 100.00% <100.00%> (ø)
src/app/cli/job.cpp 100.00% <ø> (ø)
src/app/cli/job.hpp 100.00% <ø> (ø)
src/app/cli/job_manager.cpp 80.47% <ø> (+1.00%) ⬆️
src/app/cli/job_manager.hpp 100.00% <ø> (ø)
src/app/commissioner_app.hpp 100.00% <ø> (ø)
src/library/coap.cpp 63.11% <100.00%> (+2.65%) ⬆️
src/library/commissioner_impl.hpp 90.00% <ø> (-6.43%) ⬇️
src/library/commissioner_safe.hpp 100.00% <ø> (ø)
... and 15 more

... and 15 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

- Added src/library/network_traverser_test.cpp with tests for:
  - Happy path traversal (Leader, Routers, Children).
  - Data merging from multiple chunks.
  - Timeout and retry logic.
  - Cancellation (Stop).
  - Chunk skipping upon failure.
- Fixed missing headers in console.cpp and coap.cpp.
- Minor updates to NetworkTraverser implementation to support testing.
@rochaferraz rochaferraz changed the title Network Traversal, Diagnostics, and Utility Scripts Network Traversal, Diagnostics Feb 20, 2026
@rochaferraz
Copy link
Contributor Author

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request significantly enhances the OT Commissioner with a new network traversal feature, support for fragmented diagnostic responses, and improved CLI capabilities. The implementation of the Traverser class and the reassembly logic for DIAG_GET.ans messages are well-structured. However, there are critical concerns regarding memory management in the diagnostic reassembly logic, where stale pending queries could lead to a memory leak. Additionally, the derivation of child RLOC16s during traversal relies on potentially ambiguous data and should be made more robust by using the authoritative Router ID from the Leader Data. Several improvements to TLV validation and error handling have also been identified to ensure better adherence to the Thread specification and overall system stability.

@rochaferraz rochaferraz marked this pull request as ready for review February 20, 2026 23:29
- Added traverser_test.cpp to test global timeout functionality
- Refactored ProcessTraverseNetworkJob to accept a configurable timeout
- Updated CMakeLists.txt to include new test file
- Included minor formatting updates
@rochaferraz
Copy link
Contributor Author

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces significant enhancements to the OT Commissioner, primarily focusing on network discovery and diagnostics. The new traversenetwork command and its underlying Traverser implementation provide a powerful tool for mapping network topology. The rework of DIAG_GET.ans message handling to support fragmented responses is a crucial improvement for dealing with large diagnostic reports.

The code is well-structured, and the addition of comprehensive unit tests for the new fragmentation logic is commendable. I've also noted several important bug fixes, including a memory leak fix in ProcessBr, prevention of undefined behavior when handling empty tokens in CoAP messages, and a fix for an infinite loop in the joiner session logic. These improvements significantly enhance the robustness and reliability of the commissioner.

I have a few minor suggestions for improvement. Overall, this is an excellent contribution.

Added explicit error checking and logging to:
- internal::DecodeNetworkData
- internal::DecodePrefixEntry
- internal::DecodeChildTable

This improves diagnosability of malformed or unexpected TLV data during network traversing.
The Traverse_Flow_LeaderOnly, Traverse_Flow_RoutersAndChildren, and
Diff_Chunks_Merged tests were simulating network diagnostic responses
where some chunks (specifically Network Data chunk which includes Leader Data)
were empty but marked as present.

This caused the NetworkTraverser to parse an empty LeaderData structure,
resulting in a Router ID of 0 (default). Since the Leader RLOC16 (Chunk 0)
was 0xFC00 (checks against Router ID 63), this mismatch caused the traverser
to incorrectly identify the Leader as a regular Router and queue it for
querying, leading to unexpected behavior strings in the test flow.

This commit updates the test simulations to populate mRouterId correctly
whenever kLeaderDataBit is requested, ensuring the traverser correctly
identifies the Leader and proceeds with the test flow.
@rochaferraz rochaferraz requested review from bukepo and wgtdkp February 21, 2026 04:10
/**
* @brief Handler for network traversal events.
*/
struct TraverseHandler
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we define this a class with virtual methods? Just like the existing CommisionerHandler class

* @retval kInvalidState Commissioner is not active.
* @retval kAlready Traversal is already in progress.
*/
virtual Error TraverseNetwork(TraverseHandler aHandler) = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you API receives a handler, then it would be better that all errors are handled in the handler so that the caller doesn't need to write error handling in two places.

Suggested change
virtual Error TraverseNetwork(TraverseHandler aHandler) = 0;
virtual void TraverseNetwork(TraverseHandler aHandler) = 0;

Comment on lines +208 to +211
struct QueryId
{
uint16_t mQueryId = 0;
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to define a struct

Comment on lines +49 to 54
#include <sys/socket.h>
#include <sys/time.h>

#include <fcntl.h>
#ifdef __linux__
#include <sys/socket.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#include <sys/socket.h>
#include <sys/time.h>
#include <fcntl.h>
#ifdef __linux__
#include <sys/socket.h>
#include <fcntl.h>
#ifdef __linux__
#include <sys/socket.h>
#include <sys/time.h>

Comment on lines +202 to +228
{"config", &Interpreter::ProcessConfig},
{"start", &Interpreter::ProcessStart},
{"stop", &Interpreter::ProcessStop},
{"active", &Interpreter::ProcessActive},
{"token", &Interpreter::ProcessToken},
{"br", &Interpreter::ProcessBr},
{"domain", &Interpreter::ProcessDomain},
{"network", &Interpreter::ProcessNetwork},
{"sessionid", &Interpreter::ProcessSessionId},
{"borderagent", &Interpreter::ProcessBorderAgent},
{"joiner", &Interpreter::ProcessJoiner},
{"commdataset", &Interpreter::ProcessCommDataset},
{"opdataset", &Interpreter::ProcessOpDataset},
{"bbrdataset", &Interpreter::ProcessBbrDataset},
{"reenroll", &Interpreter::ProcessReenroll},
{"domainreset", &Interpreter::ProcessDomainReset},
{"migrate", &Interpreter::ProcessMigrate},
{"mlr", &Interpreter::ProcessMlr},
{"announce", &Interpreter::ProcessAnnounce},
{"panid", &Interpreter::ProcessPanId},
{"energy", &Interpreter::ProcessEnergy},
{"exit", &Interpreter::ProcessExit},
{"quit", &Interpreter::ProcessExit},
{"help", &Interpreter::ProcessHelp},
{"state", &Interpreter::ProcessState},
{"netdiag", &Interpreter::ProcessNetworkDiag},
{"traversenetwork", &Interpreter::ProcessTraverseNetwork},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are those changes generated by the format tool? Otherwise, prefer keep the format unchanged and only add your new command

Comment on lines +1259 to +1263
if (len > 0)
{
// Accessing &aToken[0] when size is 0 is undefined behavior.
memcpy(mHeader.mToken, &aToken[0], len);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use std::vector::data():

Suggested change
if (len > 0)
{
// Accessing &aToken[0] when size is 0 is undefined behavior.
memcpy(mHeader.mToken, &aToken[0], len);
}
memcpy(mHeader.mToken, aToken.data(), len);

Comment on lines +649 to 657

// Add Query ID TLV
{
uint16_t queryId = mNextQueryId++;
SuccessOrExit(error = AppendTlv(request, {tlv::Type::kNetworkDiagQueryID, queryId}));
}

LOG_DEBUG(LOG_REGION_MESHDIAG, "sent DIAG_GET.qry with Query ID {}", mNextQueryId - 1);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Add Query ID TLV
{
uint16_t queryId = mNextQueryId++;
SuccessOrExit(error = AppendTlv(request, {tlv::Type::kNetworkDiagQueryID, queryId}));
}
LOG_DEBUG(LOG_REGION_MESHDIAG, "sent DIAG_GET.qry with Query ID {}", mNextQueryId - 1);
// Add Query ID TLV
{
uint16_t queryId = mNextQueryId++;
SuccessOrExit(error = AppendTlv(request, {tlv::Type::kNetworkDiagQueryID, queryId}));
LOG_DEBUG(LOG_REGION_MESHDIAG, "sent DIAG_GET.qry with Query ID {}", queryId);
}

if (error != ErrorCode::kNone)
{
LOG_ERROR(LOG_REGION_MESHDIAG, "DecodeNetworkData: failed to get Prefix TLV list: {}", error.ToString());
goto exit;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and else where, please use ExitNow()

Suggested change
goto exit;
ExitNow();

{
LOG_WARN(LOG_REGION_JOINER_SESSION, "session(={}) insufficient buffer size {}, {} needed",
static_cast<void *>(&mJoinerSession), aMaxLen, buf.size());
mRecvBufs.pop();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bukepo Is this fix correct?

{
LOG_WARN(LOG_REGION_JOINER_SESSION, "session(={}) insufficient buffer size {}, {} needed",
static_cast<void *>(&mJoinerSession), aMaxLen, buf.size());
mRecvBufs.pop();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rochaferraz we should really submit fixes like this in separate PRs. Otherwise, reviewers will be very likely going to ignore those changes.

@wgtdkp wgtdkp requested a review from ZhangLe2016 February 27, 2026 07:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants