
Conversation

@visz11
Collaborator

@visz11 visz11 commented Dec 11, 2025

CodeAnt-AI Description

Add deterministic topic and group hashing used by the group coordinator

What Changed

  • Added computeTopicHash that produces a Murmur3-based hash from topic id, name, partition count, and per-partition racks (racks sorted and joined with ';'), with partitions processed in sorted order
  • Added computeGroupHash that combines topic hashes in a stable order (topics sorted by name) to produce a group-level hash
  • Added unit tests that assert the hashes change when topic id, name, partition count, partition order, or rack order change, and that group hash is sensitive to topic order
  • Updated import control to allow the hashing library used by these methods
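
The canonical ordering described above can be sketched as follows. This is an illustrative stand-in only: it uses the JDK's CRC32 in place of Guava's Murmur3-128, and the class and parameter names (HashSketch, racksByPartition) are invented for the example, not taken from the PR.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

public class HashSketch {
    // Stand-in for Guava's Murmur3-128: CRC32 from the JDK, used only to
    // illustrate the canonical ordering; the real code hashes with Murmur3.
    static long hashBytes(byte[] bytes) {
        CRC32 crc = new CRC32();
        crc.update(bytes);
        return crc.getValue();
    }

    // Per-topic hash: magic byte, name, partition count, then per-partition
    // racks, with partitions in sorted order and racks sorted and joined by ';'.
    static long computeTopicHash(String topicName, Map<Integer, List<String>> racksByPartition) {
        StringBuilder sb = new StringBuilder();
        sb.append((char) 0)                  // magic byte
          .append(topicName)                 // topic name
          .append(racksByPartition.size());  // number of partitions
        new TreeMap<>(racksByPartition).forEach((partition, racks) ->
            sb.append(partition).append(String.join(";", racks.stream().sorted().toList())));
        return hashBytes(sb.toString().getBytes(StandardCharsets.UTF_8));
    }

    // Group hash: combine per-topic hashes with topics sorted by name, so the
    // result does not depend on map iteration order.
    static long computeGroupHash(Map<String, Long> topicHashes) {
        StringBuilder sb = new StringBuilder();
        new TreeMap<>(topicHashes).forEach((name, hash) ->
            sb.append(name).append('=').append(hash).append('|'));
        return hashBytes(sb.toString().getBytes(StandardCharsets.UTF_8));
    }
}
```

Because both helpers sort their inputs before hashing, reordering partitions, racks, or map entries leaves the result unchanged, which is the stability property the tests assert.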

Impact

✅ Detects topic metadata changes affecting group hash
✅ Stable group hashing regardless of map or partition iteration order
✅ Fewer incorrect group equality matches due to nondeterministic ordering

💡 Usage Guide

Checking Your Pull Request

Every pull request you open is automatically analyzed: we check for security issues, infrastructure misconfigurations, and common code problems to make sure your changes are solid and won't cause trouble later.

Talking to CodeAnt AI

Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:

@codeant-ai ask: Your question here

This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.

Example

@codeant-ai ask: Can you suggest a safer alternative to storing this secret?

Preserve Org Learnings with CodeAnt

You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:

@codeant-ai: Your feedback here

This helps CodeAnt AI learn and adapt to your team's coding style and standards.

Example

@codeant-ai: Do not flag unused imports.

Retrigger review

Ask CodeAnt AI to review the PR again, by typing:

@codeant-ai: review

Check Your Repository Health

To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.

Summary by CodeRabbit

  • Chores

    • Added Guava library as a dependency to support hashing functionality.
  • Tests

    • Added comprehensive test coverage for hash computation methods, validating hash stability and sensitivity to input variations.

✏️ Tip: You can customize this high-level summary in your review settings.

@codeant-ai

codeant-ai bot commented Dec 11, 2025

CodeAnt AI is reviewing your PR.



@visz11
Collaborator Author

visz11 commented Dec 11, 2025

@refacto-visz


@github-actions github-actions bot added the build label Dec 11, 2025
@gemini-code-assist

Summary of Changes

Hello @visz11, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the group coordinator by introducing a mechanism to compute cryptographic hashes for Kafka topics and consumer groups. This functionality is designed to provide a quick and reliable way to detect changes in topic configurations or the composition of consumer groups, which is vital for maintaining consistent internal state and optimizing operations. The implementation leverages the Guava Hashing library for its proven hashing capabilities.

Highlights

  • New Hashing Functionality: Introduced methods to compute unique hashes for Kafka topics and consumer groups, enabling efficient change detection and state management.
  • Guava Hashing Integration: Integrated the Guava Hashing library to provide robust and well-tested hashing algorithms, specifically Murmur3 for topic hashing and combineOrdered for group hashing.
  • Comprehensive Unit Tests: Added a new dedicated test file (GroupTest.java) with extensive unit tests to validate the correctness and behavior of the new computeTopicHash and computeGroupHash methods under various conditions.
  • Build System Updates: Updated build.gradle and gradle/dependencies.gradle to include the Guava library as a dependency, and checkstyle/import-control-group-coordinator.xml to allow its imports.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature | Command | Description
Code Review | /gemini review | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state.
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help | /gemini help | Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@coderabbitai

coderabbitai bot commented Dec 11, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

This pull request adds Guava 33.4.0-jre as a dependency to the group-coordinator module and introduces two static helper methods—computeGroupHash() and computeTopicHash()—for computing deterministic hashes over group and topic metadata using Murmur3-128 hashing. Comprehensive test coverage validates hash stability and sensitivity to metadata variations.

Changes

Cohort / File(s) | Change Summary

Dependency & Configuration
gradle/dependencies.gradle, build.gradle, checkstyle/import-control-group-coordinator.xml
Added Guava 33.4.0-jre to the versions map and declared it as an implementation dependency in group-coordinator; updated checkstyle import rules to permit com.google.common.hash imports.

Core Hash Computation Methods
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/Group.java
Added two public static methods: computeTopicHash() for Murmur3-128 hashing of topic metadata and per-partition replica racks, and computeGroupHash() for combining ordered topic name→hash map entries.

Hash Computation Tests
group-coordinator/src/test/java/org/apache/kafka/coordinator/group/GroupTest.java
Introduced a new test class validating hash computation logic across seven test cases covering canonical hash computation, magic byte sensitivity, partition/rack ordering, and metadata field variations.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Verify the canonical byte sequence construction in computeTopicHash() matches the intended hash algorithm
  • Confirm test cases adequately cover edge cases and potential hash collisions
  • Validate checkstyle rule placement and that import paths are correctly allow-listed
  • Review hashing logic for determinism and compatibility with Guava's Murmur3-128 implementation

Poem

🐰 With Guava's hash, we compute with care,
Topics and groups both squared and fair,
Ordered bytes dance the Murmur's way,
Tests ensure they're right, hooray!

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

Docstring Coverage: ⚠️ Warning. Docstring coverage is 27.27%, below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve coverage.
Description check: ⚠️ Warning. The pull request has no description provided by the author; only the default template text remains. Add a detailed description covering the purpose of adding topic and group hash computation, the changes made, the testing strategy, and how this relates to KAFKA-17747.

✅ Passed checks (1 passed)

Title check: ✅ Passed. The title accurately and concisely describes the main change: adding deterministic hashing methods for topics and groups in the Kafka group coordinator module.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch clone-KAFKA-17747-2


Comment @coderabbitai help to get the list of available commands and usage tips.

@codeant-ai codeant-ai bot added the size:L This PR changes 100-499 lines, ignoring generated files label Dec 11, 2025

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces utility methods to compute hashes for topics and groups, which involves adding Guava as a dependency. The implementation of computeTopicHash has a potential issue with UUID hashing that could lead to collisions, and there's a minor opportunity for code simplification in the stream processing. The new tests will also need updates to reflect the suggested change in the hashing logic. Otherwise, the changes look good and the tests are comprehensive.

HashFunction hf = Hashing.murmur3_128();
Hasher topicHasher = hf.newHasher()
    .putByte((byte) 0) // magic byte
    .putLong(topicImage.id().hashCode()) // topic Id


critical

Using topicImage.id().hashCode() for hashing the Uuid can lead to an increased chance of collisions. Uuid.hashCode() returns an int, which is then cast to a long, but it discards information from the full 128-bit UUID. To create a more robust hash, you should use the full UUID value.

Consider changing the hasher chain to include both parts of the UUID:

Hasher topicHasher = hf.newHasher()
    .putByte((byte) 0) // magic byte
    .putLong(topicImage.id().getMostSignificantBits())
    .putLong(topicImage.id().getLeastSignificantBits())
    .putString(topicImage.name(), StandardCharsets.UTF_8) // topic name
    .putInt(topicImage.partitions().size()); // number of partitions

This will also require updating the corresponding tests in GroupTest.java.

Comment on lines +260 to +261
.filter(Optional::isPresent)
.map(Optional::get)


medium

For improved readability and conciseness, the stream operation .filter(Optional::isPresent).map(Optional::get) can be simplified using flatMap.

//...
.map(BrokerRegistration::rack)
.flatMap(Optional::stream)
.sorted()
//...

This requires Java 9+.
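
A minimal self-contained demonstration of that simplification follows; the class and method names are invented for the example, and the Optional-valued rack list stands in for the BrokerRegistration::rack lookups in the real code.

```java
import java.util.List;
import java.util.Optional;

public class RackStreamExample {
    // flatMap(Optional::stream) drops empty Optionals and unwraps present ones
    // in a single step, replacing filter(Optional::isPresent).map(Optional::get).
    // Requires Java 9+ for Optional.stream().
    public static List<String> sortedRacks(List<Optional<String>> racks) {
        return racks.stream()
            .flatMap(Optional::stream)
            .sorted()
            .toList();
    }
}
```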

HashFunction hf = Hashing.murmur3_128();
Hasher topicHasher = hf.newHasher()
    .putByte((byte) 0) // magic byte
    .putLong(FOO_TOPIC_ID.hashCode()) // topic Id


medium

Following the recommended change in Group.computeTopicHash to use the full 128-bit UUID for hashing, this line and similar lines in other tests (testComputeTopicHashWithDifferentMagicByte, testComputeTopicHashWithDifferentPartitionOrder, testComputeTopicHashWithDifferentRackOrder) should be updated.

Instead of putLong(FOO_TOPIC_ID.hashCode()), you should use:

.putLong(FOO_TOPIC_ID.getMostSignificantBits())
.putLong(FOO_TOPIC_ID.getLeastSignificantBits())

@visz11
Collaborator Author

visz11 commented Dec 11, 2025

@refacto-visz

@refacto-visz

refacto-visz bot commented Dec 11, 2025

Refacto is reviewing this PR. Please wait for the review comments to be posted.

@codeant-ai

codeant-ai bot commented Dec 11, 2025

Nitpicks 🔍

🔒 No security issues identified
⚡ Recommended areas for review

  • Topic-id hashing
    The code uses topicImage.id().hashCode() and writes it via Hasher.putLong. Using the Java object hashCode as part of a persistent/deterministic Murmur3 hash may be unstable across JVM implementations, and reduces the entropy/uniqueness of the topic id contribution. Prefer encoding the canonical id representation (e.g. id().toString() or explicit UUID bits) into the hasher to ensure determinism and avoid accidental collisions.

  • Empty map handling
    computeGroupHash calls Hashing.combineOrdered(...) with a stream that may produce an empty list when topicHashes is empty. Verify the behavior for empty input (combineOrdered on an empty Iterable may not produce a stable value or could throw). Consider explicitly handling null/empty maps to return a defined constant/seed.

  • Broad package allowance
    The PR adds a blanket allow for the whole com.google.common.hash package. Allowing an entire third-party package increases the import surface for this module; consider restricting the import-control rule to only the exact classes actually used by the new code (or using an exact-match), to preserve layering and minimize accidental dependencies.

  • Implicit map iteration order assumption
    The tests call Group.computeGroupHash(Map.of(...)) and then build an expected hash using an ordered list. Map.of(...) does not guarantee an iteration order that matches test assumptions across all JVM implementations. This may cause flaky assertions if Group.computeGroupHash depends on map entry order.

  • Non-deterministic test data
    The test uses Uuid.randomUuid() to create FOO_TOPIC_ID. This introduces randomness at class-load time and can lead to non-deterministic test behavior or make reproducing failures harder. Prefer a fixed, deterministic UUID for stable tests.

  • Hard-coded metadata strings
    The expected topic-hash builder uses hard-coded rack strings like "rack0;rack1" instead of deriving them from FOO_METADATA_IMAGE. If metadata builder implementation changes, these tests may break even if Group.computeTopicHash is correct. Compute expected rack strings from the metadata image to avoid brittle tests.
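
A guarded variant addressing the empty-map concern above might look like the following sketch. The combiner here is a simple stand-in fold, not Guava's combineOrdered, and the sentinel constant and class name are assumptions made for illustration.

```java
import java.util.Map;

public class GroupHashGuard {
    // Hypothetical sentinel for the empty case; Guava's Hashing.combineOrdered
    // rejects an empty iterable, so a defined constant is returned instead.
    static final long EMPTY_GROUP_HASH = 0L;

    static long computeGroupHash(Map<String, Long> topicHashes) {
        if (topicHashes.isEmpty()) {
            return EMPTY_GROUP_HASH;
        }
        // Stand-in ordered combiner: fold the entries sorted by topic name,
        // so iteration order of the input map never affects the result.
        return topicHashes.entrySet().stream()
            .sorted(Map.Entry.comparingByKey())
            .mapToLong(e -> 31L * e.getKey().hashCode() + e.getValue())
            .reduce(EMPTY_GROUP_HASH, (a, b) -> a * 1000003L + b);
    }
}
```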

@refacto-visz

refacto-visz bot commented Dec 11, 2025

Code Review: UUID Hashing Implementation

PR Confidence Score: 🟨 4 / 5

👍 Well Done
Deterministic Hashing Logic

The implementation correctly uses sorting for partitions, racks, and topics to ensure the resulting hashes are deterministic and stable.

Thorough Test Coverage

The new tests are excellent, verifying not just functionality but also sensitivity to all inputs and the importance of ordering for determinism.

📁 Selected files for review (5)
  • build.gradle
  • checkstyle/import-control-group-coordinator.xml
  • gradle/dependencies.gradle
  • group-coordinator/src/main/java/org/apache/kafka/coordinator/group/Group.java
  • group-coordinator/src/test/java/org/apache/kafka/coordinator/group/GroupTest.java
📝 Additional Comments
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/Group.java (2)
Hash Algorithm Versioning

Magic byte versioning uses hardcoded value without algorithm identification. Future hash algorithm changes could create compatibility issues if version byte doesn't distinguish between different hash implementations.

Standards:

  • CWE-327
  • NIST-SSDF-PW.1
Null Pointer Protection

Hash computation lacks input validation for null topicImage or clusterImage parameters. Null inputs could cause NullPointerException during hash calculation leading to service disruption.

Standards:

  • CWE-476
  • NIST-SSDF-PW.1

HashFunction hf = Hashing.murmur3_128();
Hasher topicHasher = hf.newHasher()
    .putByte((byte) 0) // magic byte
    .putLong(topicImage.id().hashCode()) // topic Id

UUID Hashing Weakness

The code uses topicImage.id().hashCode(), which converts a 128-bit UUID to a 32-bit integer before hashing. This discards 96 bits of information, significantly increasing the probability of hash collisions for different topic UUIDs, potentially causing incorrect topic state detection.

    .putLong(topicImage.id().getMostSignificantBits())
    .putLong(topicImage.id().getLeastSignificantBits()) // full UUID bits

Committable suggestion:

    -        .putLong(topicImage.id().hashCode()) // topic Id
    +        .putLong(topicImage.id().getMostSignificantBits())
    +        .putLong(topicImage.id().getLeastSignificantBits()) // full UUID bits
Standards
  • ISO-IEC-25010-Functional-Correctness-Appropriateness
  • ISO-IEC-25010-Reliability-Fault-Tolerance


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/Group.java (1)

239-267: Topic hash design looks sound; consider using full UUID bits instead of hashCode() and reusing the HashFunction

The overall structure of computeTopicHash (fixed magic byte, topic id + name + partition count, partitions sorted by id, racks sorted per partition) gives:

  • Deterministic hashes for a given metadata/rack layout.
  • Sensitivity to changes in topic identity, partitioning, or rack distribution, while ignoring replica order within a partition.

Two possible refinements:

  1. Use full UUID instead of hashCode()

     Currently you do:

         .putLong(topicImage.id().hashCode())

     Uuid.hashCode() collapses a 128-bit UUID down to 32 bits, which is then zero-extended into a long. If you ever rely on differences in topic IDs alone (e.g., same name/partitions/racks but a different ID across images), this unnecessarily increases collision risk. You could instead feed both UUID halves:

         -        .putLong(topicImage.id().hashCode()) // topic Id
         +        .putLong(topicImage.id().getMostSignificantBits())
         +        .putLong(topicImage.id().getLeastSignificantBits())

     (Adjust if TopicImage.id() is not Uuid but exposes equivalent accessors.)

     This keeps the intent ("include topic id in the hash") while preserving the full 128 bits.

  2. Optional: reuse a static HashFunction instance

     Hashing.murmur3_128() is pure and thread-safe; you could store it in a private static final HashFunction and call MURMUR3_128.newHasher() here to avoid re-creating the function object on every invocation. The impact is small but easy to adopt if this runs in a hot path.

These are behavioural-compatible improvements except for (1), which changes the hash definition. If the topic hash is persisted or part of any cross-version protocol, you’d need to decide now whether you want the stronger ID contribution and, if so, lock it in before the feature ships.

group-coordinator/src/test/java/org/apache/kafka/coordinator/group/GroupTest.java (1)

40-188: Strong coverage of hash behaviour; consider a couple of small robustness tweaks

The tests exercise the key properties of both hashing helpers well:

  • testComputeTopicHash matches the exact Murmur3 input sequence.
  • Variant tests ensure changes in magic byte, partition order, rack order, and individual fields all perturb the hash.
  • testComputeGroupHash and the order-variant test correctly lock in the “sorted by topic name, then combineOrdered” semantics.

Two optional improvements you might consider:

  1. Make the topic ID deterministic in tests
    Using Uuid.randomUuid() for FOO_TOPIC_ID is fine, but swapping it for a fixed UUID (e.g., Uuid.fromString("...") or a constructor) would make failures easier to reproduce and reason about if someone later needs to compare expected hash values across runs.

  2. Add explicit edge-case tests

    • computeGroupHash with an empty topicHashes map, to document and lock in the expected result.
    • computeTopicHash for a topic where brokers have no rack information, to clarify that the hash only depends on racks when present and what the behaviour is when all racks are missing.

These would make the contract of the helpers clearer without changing the existing behaviour.
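
The first suggestion can be illustrated with the JDK's java.util.UUID, which exposes accessors similar to Kafka's Uuid; the class name and the constant's value are arbitrary choices for the example.

```java
import java.util.UUID;

public class DeterministicIds {
    // A fixed UUID keeps expected hash values reproducible across test runs;
    // this particular value carries no meaning and is chosen for illustration.
    static final UUID FOO_TOPIC_ID =
        UUID.fromString("00000000-0000-0000-0000-000000000001");

    // Both 64-bit halves, in the order a hasher would consume them.
    static long[] idBits(UUID id) {
        return new long[] {id.getMostSignificantBits(), id.getLeastSignificantBits()};
    }
}
```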

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ed79212 and 36cb999.

📒 Files selected for processing (5)
  • build.gradle (1 hunks)
  • checkstyle/import-control-group-coordinator.xml (1 hunks)
  • gradle/dependencies.gradle (2 hunks)
  • group-coordinator/src/main/java/org/apache/kafka/coordinator/group/Group.java (2 hunks)
  • group-coordinator/src/test/java/org/apache/kafka/coordinator/group/GroupTest.java (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
group-coordinator/src/test/java/org/apache/kafka/coordinator/group/GroupTest.java (2)
clients/src/main/java/org/apache/kafka/common/Uuid.java (1)
  • Uuid (32-196)
metadata/src/main/java/org/apache/kafka/image/MetadataImage.java (1)
  • MetadataImage (33-200)
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/Group.java (5)
metadata/src/main/java/org/apache/kafka/image/ClusterImage.java (1)
  • ClusterImage (36-108)
metadata/src/main/java/org/apache/kafka/image/TopicImage.java (1)
  • TopicImage (37-92)
metadata/src/main/java/org/apache/kafka/metadata/BrokerRegistration.java (1)
  • BrokerRegistration (44-398)
core/src/test/scala/unit/kafka/log/LogManagerTest.scala (1)
  • topicImage (1390-1422)
core/src/main/scala/kafka/coordinator/group/CoordinatorPartitionWriter.scala (1)
  • hashCode (51-53)
🔇 Additional comments (3)
checkstyle/import-control-group-coordinator.xml (1)

80-80: Import-control allowance for Guava hashing looks correct

Allowing com.google.common.hash under the coordinator.group subpackage is consistent with the new usage in Group and GroupTest and keeps the scope tight.

build.gradle (1)

1410-1424: Guava dependency scope for group-coordinator is reasonable

Adding implementation libs.guava only to :group-coordinator is appropriate since the new helpers return primitives and don’t expose Guava types. Please just double-check that no public API in this module (or its -api sibling) directly uses Guava types, so we keep Guava as an internal implementation detail.

group-coordinator/src/main/java/org/apache/kafka/coordinator/group/Group.java (1)

223-237: Deterministic group hash via sorted topics + combineOrdered

Sorting the topicHashes entries by topic name before calling Hashing.combineOrdered gives a stable group hash independent of Map iteration order, and the API returns a primitive long, so Guava stays internal here. This looks correct for the intended “ordered combination of per-topic hashes” use case.

Comment on lines +64 to 65
guava: "33.4.0-jre",
httpclient: "4.5.14",

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verification script for Guava dependency review

echo "=== 1. Checking for Guava in LICENSE-binary ==="
if [ -f "LICENSE-binary" ]; then
  echo "LICENSE-binary exists. Searching for Guava..."
  rg -i "guava" LICENSE-binary -n || echo "Guava not found in LICENSE-binary"
else
  echo "LICENSE-binary does not exist"
fi

echo ""
echo "=== 2. Confirming libs.guava wire-up ==="
rg "libs\.guava" . -n

echo ""
echo "=== 3. Checking gradle/dependencies.gradle structure ==="
sed -n '60,70p' gradle/dependencies.gradle
echo "---"
sed -n '148,155p' gradle/dependencies.gradle

Repository: coderabbit-test/kafka

Length of output: 1247


Add Guava license entry to LICENSE-binary

Guava 33.4.0-jre is wired correctly in gradle/dependencies.gradle and is already in use (build.gradle:1423). However, per the dependency file comment requiring LICENSE-binary updates for new third-party dependencies, Guava is missing from LICENSE-binary. Add an entry with the correct Apache 2.0 license text for Guava 33.4.0-jre.

🤖 Prompt for AI Agents
In gradle/dependencies.gradle around lines 64-65, Guava 33.4.0-jre is declared
but missing from LICENSE-binary; add a new entry to LICENSE-binary for "Guava
33.4.0-jre" that includes the full Apache License, Version 2.0 text (or an exact
copy per project convention), the copyright/attribution for Google LLC (or the
upstream copyright listed in Guava's distribution), a reference to the Guava
project and version (e.g., https://github.com/google/guava and 33.4.0-jre), and
any SPDX identifier (Apache-2.0) so the LICENSE-binary accurately reflects this
third-party dependency.

Comment on lines +230 to +236

    return Hashing.combineOrdered(
        topicHashes.entrySet()
            .stream()
            .sorted(Map.Entry.comparingByKey())
            .map(e -> HashCode.fromLong(e.getValue()))
            .toList()
    ).asLong();

Suggestion: When the topic hash map is empty, calling Hashing.combineOrdered with an empty list will throw an IllegalArgumentException at runtime, so the group hash computation must handle the empty-case explicitly and return a deterministic value instead of delegating to combineOrdered. [logic error]

Severity Level: Minor ⚠️

Suggested change:

    -        return Hashing.combineOrdered(
    -            topicHashes.entrySet()
    -                .stream()
    -                .sorted(Map.Entry.comparingByKey())
    -                .map(e -> HashCode.fromLong(e.getValue()))
    -                .toList()
    -        ).asLong();
    +        List<HashCode> hashCodes = topicHashes.entrySet()
    +            .stream()
    +            .sorted(Map.Entry.comparingByKey())
    +            .map(e -> HashCode.fromLong(e.getValue()))
    +            .toList();
    +        if (hashCodes.isEmpty()) {
    +            return 0L;
    +        }
    +        return Hashing.combineOrdered(hashCodes).asLong();
Why it matters? ⭐

The suggestion is correct: Guava's Hashing.combineOrdered throws if given an empty iterable, so the current implementation can throw at runtime for an empty topicHashes map. The proposed change makes the behavior deterministic for that edge case and prevents a potential IllegalArgumentException. This is a real bug fix, not a mere stylistic change.

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** group-coordinator/src/main/java/org/apache/kafka/coordinator/group/Group.java
**Line:** 230:236
**Comment:**
	*Logic Error: When the topic hash map is empty, calling `Hashing.combineOrdered` with an empty list will throw an IllegalArgumentException at runtime, so the group hash computation must handle the empty-case explicitly and return a deterministic value instead of delegating to `combineOrdered`.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

@codeant-ai

codeant-ai bot commented Dec 11, 2025

CodeAnt AI finished reviewing your PR.

@visz11 visz11 closed this Dec 15, 2025

Labels

build size:L This PR changes 100-499 lines, ignoring generated files
