Ensure proper retry and backoff for newly created monitor topics

As shown in the conversation at https://linkedin-randd.slack.com/archives/C04FMP0HB17/p1671222219329569,
if a monitoring topic has just been created in a cluster, the AdminClient.describeTopics API can fail with UnknownTopicOrPartitionException, which crashes the whole process. The call sites below can trigger the exception (there may be more):

https://github.com/linkedin/kafka-monitor/blob/7f99c095c2ceb2d09b0e490fa138a68fac849bba/src/main/java/com/linkedin/xinfra/monitor/services/MultiClusterTopicManagementService.java#L455

https://github.com/linkedin/kafka-monitor/blob/7f99c095c2ceb2d09b0e490fa138a68fac849bba/src/main/java/com/linkedin/xinfra/monitor/services/MultiClusterTopicManagementService.java#L338

We need to make sure that the logic calling the describeTopics API retries with an appropriate backoff when the topic has only just been created.
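
One possible shape for such a retry is sketched below. It is not taken from the kafka-monitor code base: the class name `RetryWithBackoff`, its `call` method, and all parameter names are hypothetical, and the attempt count and backoff values are placeholders. The helper uses only the JDK so it can be read in isolation. It unwraps `ExecutionException` because `KafkaFuture.get()` reports failures that way.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.function.Predicate;

/**
 * Hypothetical helper (not part of kafka-monitor): retries an action with
 * exponential backoff while the failure is classified as retriable.
 *
 * Assumed usage around the describe call, with illustrative values:
 *   RetryWithBackoff.call(
 *       () -> adminClient.describeTopics(Collections.singleton(topic)).all().get(),
 *       t -> t instanceof UnknownTopicOrPartitionException,
 *       5, 100L, 2000L);
 */
public final class RetryWithBackoff {
    private RetryWithBackoff() {}

    public static <T> T call(Callable<T> action,
                             Predicate<Throwable> isRetriable,
                             int maxAttempts,
                             long initialBackoffMs,
                             long maxBackoffMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoffMs = initialBackoffMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                // Futures wrap the real failure in ExecutionException; inspect the cause.
                Throwable cause = (e instanceof ExecutionException && e.getCause() != null)
                        ? e.getCause() : e;
                // Give up on the last attempt or on a non-retriable failure,
                // rethrowing the original exception unchanged.
                if (attempt >= maxAttempts || !isRetriable.test(cause)) {
                    throw e;
                }
                // Wait, then double the delay up to the configured cap.
                Thread.sleep(backoffMs);
                backoffMs = Math.min(backoffMs * 2, maxBackoffMs);
            }
        }
    }
}
```

Bounding the number of attempts keeps a genuinely missing topic from blocking the service forever, while the cap on the delay keeps recovery quick once the topic metadata has propagated.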

Ensure proper retry and backoff for newly created monitor topics #390