Ensure proper retry and backoff for newly created monitor topics #390

@gitlw

Description

As shown in the conversation at https://linkedin-randd.slack.com/archives/C04FMP0HB17/p1671222219329569,
if a new monitoring topic has just been created in a cluster, the AdminClient.describeTopics API can fail with UnknownTopicOrPartitionException, which causes the whole process to crash. Below are the call sites known to trigger the exception (there may be more):

_adminClient.describeTopics(Collections.singleton(_topic)).all().get().get(_topic).partitions();

List<TopicPartitionInfo> partitions = topicDescriptions.get(_requestTimeout.toMillis(), TimeUnit.MILLISECONDS).partitions();

We need to make sure that every piece of logic calling the describeTopics API retries with an appropriate backoff, to cover the case where the topic has only just been created.
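
One possible shape for such a retry is sketched below. It is only an illustration under stated assumptions, not the project's implementation: the class name RetryWithBackoff, the method call, and all parameter names are hypothetical, and the helper is deliberately written against the JDK only so it can be read and run without the Kafka client on the classpath. It retries an action while a caller-supplied predicate classifies the failure as retriable, doubling the sleep between attempts up to a cap.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

// Hypothetical helper, not part of the existing code base.
final class RetryWithBackoff {
    private RetryWithBackoff() {}

    /**
     * Runs the action, retrying with exponential backoff while the thrown
     * exception is classified as retriable. Rethrows the last exception once
     * maxAttempts is reached or a non-retriable exception is seen.
     */
    static <T> T call(Callable<T> action,
                      Predicate<Exception> isRetriable,
                      int maxAttempts,
                      long initialBackoffMs,
                      long maxBackoffMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoffMs = initialBackoffMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                // Give up on the final attempt or on an unexpected failure.
                if (attempt >= maxAttempts || !isRetriable.test(e)) {
                    throw e;
                }
                // Wait before the next attempt, then double the delay up to the cap.
                Thread.sleep(backoffMs);
                backoffMs = Math.min(backoffMs * 2, maxBackoffMs);
            }
        }
    }
}
```

At the call sites listed above, the action would presumably wrap the describeTopics(...) call together with its get(), and the predicate would check that the exception is an ExecutionException whose getCause() is an UnknownTopicOrPartitionException, since the failure surfaces through the returned future. The attempt count and backoff bounds are tuning choices that this issue does not specify.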
