Skip to content
Open
Show file tree
Hide file tree
Changes from 37 commits
Commits
Show all changes
47 commits
Select commit Hold shift + click to select a range
871a6d7
optimize connect
lokidundun Nov 14, 2025
5bda6c2
run mvn spotless
lokidundun Nov 14, 2025
b8f436d
the wrong test should be removed
lokidundun Nov 14, 2025
38c21d7
fix:fix the wrong test to make ci successfully
lokidundun Nov 14, 2025
ae81603
fix the race condition
lokidundun Nov 15, 2025
713c172
add lock
lokidundun Nov 15, 2025
d7ca2d6
run mvn
lokidundun Nov 15, 2025
e90a2fc
Merge branch '2.x' into feature/reconnect
maple525866 Nov 15, 2025
b98592b
initialized reconnectTask and reuse timerExecutor
lokidundun Nov 17, 2025
e004dd3
Merge branch 'feature/reconnect' of https://github.com/lokidundun/inc…
lokidundun Nov 17, 2025
8ab62d9
log change
lokidundun Nov 17, 2025
1a1d4b2
add a new reconnectLock
lokidundun Nov 17, 2025
90864b2
Merge branch '2.x' into feature/reconnect
lokidundun Nov 19, 2025
dc0b49a
Merge branch '2.x' into feature/reconnect
lokidundun Nov 20, 2025
29c865f
Merge branch '2.x' into feature/reconnect
lokidundun Nov 21, 2025
e1653f5
add tests
lokidundun Nov 21, 2025
2183a20
fix
lokidundun Nov 21, 2025
715e35c
Merge branch '2.x' into feature/reconnect
lokidundun Nov 21, 2025
3b261b8
Merge branch '2.x' into feature/reconnect
lokidundun Nov 24, 2025
9a71eec
add some tests
lokidundun Nov 24, 2025
6d24bf8
run spotless
lokidundun Nov 24, 2025
1faddba
Merge branch 'feature/reconnect' of https://github.com/lokidundun/inc…
lokidundun Nov 24, 2025
f1f4e76
fix ci
lokidundun Nov 24, 2025
57f6dae
Merge branch '2.x' into feature/reconnect
lokidundun Nov 25, 2025
1374211
Merge branch '2.x' into feature/reconnect
lokidundun Nov 26, 2025
a39988e
improve coverage
lokidundun Nov 26, 2025
b75b01c
run spotless
lokidundun Nov 26, 2025
e832b3a
Merge branch 'feature/reconnect' of https://github.com/lokidundun/inc…
lokidundun Nov 26, 2025
b9ae7a6
optimize tests
lokidundun Dec 1, 2025
070122f
Merge branch '2.x' into feature/reconnect
lokidundun Dec 1, 2025
3248451
Merge branch '2.x' into feature/reconnect
lokidundun Dec 3, 2025
dd736a9
Merge branch '2.x' into feature/reconnect
lokidundun Dec 9, 2025
f1c7560
run mvn spotless
lokidundun Dec 9, 2025
7c705dc
Merge branch '2.x' into feature/reconnect
lokidundun Dec 10, 2025
5f183c0
Merge branch '2.x' into feature/reconnect
lokidundun Dec 10, 2025
0f28ad7
Merge branch '2.x' into feature/reconnect
lokidundun Dec 18, 2025
c194948
Merge branch '2.x' into feature/reconnect
lokidundun Dec 24, 2025
57d48ed
Merge branch '2.x' into feature/reconnect
funky-eyes Dec 24, 2025
abf9b75
optimize:optimize the init and destroy method
lokidundun Dec 24, 2025
f9da76e
Merge branch '2.x' into feature/reconnect
lokidundun Dec 29, 2025
bf60ee4
use the shareExcutor to reconnect
lokidundun Jan 7, 2026
003d4ab
Merge branch 'feature/reconnect' of https://github.com/lokidundun/inc…
lokidundun Jan 7, 2026
ba272ac
Merge branch '2.x' of https://github.com/apache/incubator-seata into …
lokidundun Jan 15, 2026
4a6e225
log change
lokidundun Jan 15, 2026
eb78486
Merge branch '2.x' into feature/reconnect
lokidundun Jan 16, 2026
c3921e7
Merge branch '2.x' into feature/reconnect
lokidundun Jan 19, 2026
4f5302d
optimize the reconnect
lokidundun Jan 19, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions changes/en-us/2.x.md
Original file line number Diff line number Diff line change
Expand Up @@ -91,6 +91,7 @@ Add changes here for all PR submitted to the 2.x branch.
- [[#7741](https://github.com/apache/incubator-seata/pull/7741)] supports publishing image based on JDK 25
- [[#7743](https://github.com/seata/seata/pull/7743)] upgrade Apache Tomcat dependency from 9.0.108 to 9.0.109
- [[#7740](https://github.com/apache/incubator-seata/pull/7740)] enhance HttpClient to support h2c
- [[#7783](https://github.com/apache/incubator-seata/pull/7783)] Optimize RM/TM client reconnect
- [[#7807](https://github.com/apache/incubator-seata/pull/7807)] support mariadb 3.x
- [[#7781](https://github.com/apache/incubator-seata/pull/7781)] highlight pmd-check log
- [[#7704](https://github.com/apache/incubator-seata/pull/7704)] fix frontend security vulnerabilities
Expand Down
1 change: 1 addition & 0 deletions changes/zh-cn/2.x.md
Original file line number Diff line number Diff line change
Expand Up @@ -90,6 +90,7 @@
- [[#7741](https://github.com/apache/incubator-seata/pull/7741)] 支持发布基于JDK 25的镜像
- [[#7743](https://github.com/seata/seata/pull/7743)] 将 Apache Tomcat 依赖项从 9.0.108 升级到 9.0.109
- [[#7740](https://github.com/apache/incubator-seata/pull/7740)] 优化http工具类使之支持h2c协议
- [[#7783](https://github.com/apache/incubator-seata/pull/7783)] 优化 RM/TM 客户端重连
- [[#7807](https://github.com/apache/incubator-seata/pull/7807)] 支持 mariadb 3.x
- [[#7781](https://github.com/apache/incubator-seata/pull/7781)] 高亮 pmd 检查日志信息
- [[#7704](https://github.com/apache/incubator-seata/pull/7704)] 修复前端依赖漏洞
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,7 @@
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;
Expand Down Expand Up @@ -96,6 +97,10 @@ public abstract class AbstractNettyRemotingClient extends AbstractNettyRemoting
protected final Condition mergeCondition = mergeLock.newCondition();
protected volatile boolean isSending = false;

private final Runnable reconnectTask;
private AtomicBoolean timerStarted = new AtomicBoolean(false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite clear on the purpose of this PR. Currently, timerStarted is an instance variable and not shared, which means both RM and TM will each create their own reconnectTask. The reconnectTask itself is also an instance variable with no sharing.

For the old logic, what is the point of adding timerStarted? Wouldn't it be better to unify the reconnect thread pool for both RM and TM into a single one, so that there is only one task per entity (one for RM and one for TM) instead of starting two separate thread pools?

private final ReentrantLock reconnectLock = new ReentrantLock();

/**
* When sending message type is {@link MergeMessage}, will be stored to mergeMsgMap.
*/
Expand All @@ -120,29 +125,30 @@ public abstract class AbstractNettyRemotingClient extends AbstractNettyRemoting

@Override
public void init() {
timerExecutor.scheduleAtFixedRate(
() -> {
try {
clientChannelManager.reconnect(getTransactionServiceGroup());
} catch (Exception ex) {
LOGGER.warn("reconnect server failed. {}", ex.getMessage());
}
},
SCHEDULE_DELAY_MILLS,
SCHEDULE_INTERVAL_MILLS,
TimeUnit.MILLISECONDS);
if (this.isEnableClientBatchSendRequest()) {
mergeSendExecutorService = new ThreadPoolExecutor(
MAX_MERGE_SEND_THREAD,
MAX_MERGE_SEND_THREAD,
KEEP_ALIVE_TIME,
TimeUnit.MILLISECONDS,
new LinkedBlockingQueue<>(),
new NamedThreadFactory(getThreadPrefix(), MAX_MERGE_SEND_THREAD));
mergeSendExecutorService.submit(new MergedSendRunnable());
reconnectLock.lock();
try {
if (timerStarted.compareAndSet(false, true)) {
timerExecutor.scheduleAtFixedRate(
reconnectTask, SCHEDULE_DELAY_MILLS, SCHEDULE_INTERVAL_MILLS, TimeUnit.MILLISECONDS);
LOGGER.info("Reconnect timer started (role: {})", transactionRole.name());
}

if (this.isEnableClientBatchSendRequest()) {
mergeSendExecutorService = new ThreadPoolExecutor(
MAX_MERGE_SEND_THREAD,
MAX_MERGE_SEND_THREAD,
KEEP_ALIVE_TIME,
TimeUnit.MILLISECONDS,
new LinkedBlockingQueue<>(),
new NamedThreadFactory(getThreadPrefix(), MAX_MERGE_SEND_THREAD));
mergeSendExecutorService.submit(new MergedSendRunnable());
}
Copy link

Copilot AI Dec 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mergeSendExecutorService can be initialized multiple times if init() is called multiple times. While the timerStarted flag prevents duplicate timer scheduling, there's no similar protection for mergeSendExecutorService. This could lead to thread pool leaks if init() is called multiple times without calling destroy() in between. Consider adding a check to prevent re-initialization of the executor service, similar to the timer protection.

Copilot uses AI. Check for mistakes.

super.init();
clientBootstrap.start();
} finally {
reconnectLock.unlock();
}
super.init();
clientBootstrap.start();
}

public AbstractNettyRemotingClient(
Expand All @@ -155,6 +161,19 @@ public AbstractNettyRemotingClient(
clientBootstrap.setChannelHandlers(new ClientHandler(), new ChannelEventHandler(this));
clientChannelManager = new NettyClientChannelManager(
new NettyPoolableFactory(this, clientBootstrap), getPoolKeyFunction(), nettyClientConfig);
this.reconnectTask = () -> {
if (!timerStarted.get()) {
return;
}
try {
String serviceGroup = getTransactionServiceGroup();
if (StringUtils.isNotBlank(serviceGroup)) {
clientChannelManager.reconnect(serviceGroup);
}
} catch (Throwable t) {
LOGGER.error("Reconnect task failed for role: {}", transactionRole.name(), t);
}
};
Copy link

Copilot AI Dec 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reconnectTask checks timerStarted flag at the beginning and returns early if false. However, there's a potential race condition: if destroy() is called and sets timerStarted to false while reconnectTask is already past the check (line 165) but before executing reconnect (line 171), the reconnect could still execute after destroy. Consider checking the timerStarted flag again after acquiring any necessary resources or before the actual reconnect call.

Copilot uses AI. Check for mistakes.
}

@Override
Expand Down Expand Up @@ -271,11 +290,19 @@ public void destroyChannel(String serverAddress, Channel channel) {

@Override
public void destroy() {
clientBootstrap.shutdown();
if (mergeSendExecutorService != null) {
mergeSendExecutorService.shutdown();
reconnectLock.lock();
try {
if (timerStarted.compareAndSet(true, false)) {
LOGGER.info("Reconnect timer stopped (role: {})", transactionRole.name());
}
clientBootstrap.shutdown();
if (mergeSendExecutorService != null) {
mergeSendExecutorService.shutdown();
}
super.destroy();
Copy link

Copilot AI Dec 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After setting timerStarted to false, there's no synchronization to ensure that any in-flight reconnectTask executions have completed before calling super.destroy() which shuts down the timerExecutor. This could cause the scheduled reconnectTask to throw RejectedExecutionException or attempt to use resources that are being shut down. Consider using shutdownNow() on the scheduled future or adding a small delay to allow in-flight tasks to complete.

Copilot uses AI. Check for mistakes.
} finally {
reconnectLock.unlock();
}
super.destroy();
}

public void setTransactionMessageHandler(TransactionMessageHandler transactionMessageHandler) {
Expand Down
Loading
Loading