Skip to content

Introduce enableKeepAlive() and readTimeout() in JDBC Runtime config#52797

Open
gastaldi wants to merge 1 commit intoquarkusio:mainfrom
gastaldi:jdbc_keep_alive
Open

Introduce enableKeepAlive() and readTimeout() in JDBC Runtime config#52797
gastaldi wants to merge 1 commit intoquarkusio:mainfrom
gastaldi:jdbc_keep_alive

Conversation

@gastaldi
Copy link
Copy Markdown
Member

@gastaldi gastaldi commented Feb 26, 2026

@quarkus-bot quarkus-bot bot added area/agroal area/jdbc Issues related to the JDBC extensions labels Feb 26, 2026
@gastaldi gastaldi requested review from barreiro and gsmet February 26, 2026 21:45
@quarkus-bot

This comment has been minimized.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Feb 26, 2026

🎊 PR Preview 8749ffe has been successfully built and deployed to https://quarkus-pr-main-52797-preview.surge.sh/version/main/guides/

  • Images of blog posts older than 3 months are not available.
  • Newsletters older than 3 months are not available.

@quarkus-bot

This comment has been minimized.

@gastaldi gastaldi requested review from Sanne and michalvavrik and removed request for Sanne February 27, 2026 00:35
Copy link
Copy Markdown
Member

@michalvavrik michalvavrik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not really code owner or knowledgeable about this code. I checked links that you provided and the timeouts should definitely do the trick. I remember I had some reproducer that I linked to the issue, so I can check it out over weekend and run it against your PR. But if you did that already, I will just stay silent :-)

Thanks for looking into this!

Copy link
Copy Markdown
Member

@gsmet gsmet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I added some questions inline.

* Setting this to a value greater than zero can help to prevent connection leaks and to ensure that connections are not
* left open indefinitely.
*/
Optional<Duration> socketTimeout();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure socketTimeout is about that? Because usually socket timeout is the time you allow for the connection to be established, not the time a connection can be idle?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two types of socket timeout, one for connection and the read timeout. I'll rename this to readTimeout() for clarity

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so it has nothing to do with idle time? I don't know why, I thought you were talking about the time we could keep a connection idle. It's not that, right?

Now are we sure what we advertise as readTimeout is actually handled as a read timeout, by the databases where you use their socketTimeout JDBC property under the hood?

I'm asking because both are very different things to me so I wonder if we are confusing these two notions? Or if I missed something entirely, which is very much possible :).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so it has nothing to do with idle time? I don't know why, I thought you were talking about the time we could keep a connection idle. It's not that, right?

Correct.

Now are we sure what we advertise as readTimeout is actually handled as a read timeout, by the databases where you use their socketTimeout JDBC property under the hood?

Correct. The timeout value used for socket read operations. Oracle for example allows you to define timeouts for connection (oracle.net.CONNECT_TIMEOUT) and read (which is the property we're setting here)

I'm asking because both are very different things to me so I wonder if we are confusing these two notions? Or if I missed something entirely, which is very much possible :).

Fair enough, I'm open to suggestions on what makes it less confusing :)

@gsmet gsmet requested a review from yrodiere March 2, 2026 08:40
@gastaldi gastaldi changed the title Introduce enableKeepAlive() and socketTimeout() in JDBC Runtime config Introduce enableKeepAlive() and readTimeout() in JDBC Runtime config Mar 2, 2026
@quarkus-bot

This comment has been minimized.

@quarkus-bot

This comment has been minimized.

@quarkus-bot

This comment has been minimized.

@quarkus-bot

This comment has been minimized.

@quarkus-bot

This comment has been minimized.

@quarkus-bot

This comment has been minimized.

@quarkus-bot

This comment has been minimized.

@gastaldi
Copy link
Copy Markdown
Member Author

gastaldi commented Mar 4, 2026

I confirm this passes with the reproducer provided in the original issue. It should be good to be merged when approved

These are the options I included in the reproducer available in michalvavrik/quarkus-test-suite@8eac3de

quarkus.datasource.validation-query-timeout.jdbc.enable-keep-alive=true
quarkus.datasource.validation-query-timeout.jdbc.read-timeout=6s

@quarkus-bot
Copy link
Copy Markdown

quarkus-bot bot commented Mar 4, 2026

Status for workflow Quarkus Documentation CI

This is the status report for running Quarkus Documentation CI on commit 8bf9a52.

✅ The latest workflow run for the pull request has completed successfully.

It should be safe to merge provided you have a look at the other checks in the summary.

Warning

There are other workflow runs running, you probably need to wait for their status before merging.

@quarkus-bot
Copy link
Copy Markdown

quarkus-bot bot commented Mar 4, 2026

Status for workflow Quarkus CI

This is the status report for running Quarkus CI on commit 8bf9a52.

✅ The latest workflow run for the pull request has completed successfully.

It should be safe to merge provided you have a look at the other checks in the summary.

You can consult the Develocity build scans.

Copy link
Copy Markdown
Member

@yrodiere yrodiere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a great feature to have in its own right, but... when it comes to fixing https://issues.redhat.com/browse/QUARKUS-5881, are you sure it does the trick?

From what I understand, you're introducing a timeout for every single query executed by the application. Some of which can be quite long (e.g. in some overnight batch jobs), so the read timeout will typically need to be set quite high.

QUARKUS-5881, on the other hand, was about timeouts for validation queries, which should typically be quite quick, so we'd typically want a very low read timeout.

All this to say... in order for this PR to actually fix QUARKUS-5881, I believe it should make sure to set the read timeout to the (supposedly lower) "validation query timeout" in connection validators. See https://github.com/quarkusio/quarkus/pull/47233/changes#diff-ae22f0b05b345fab3d41e46504afedf53b79b6b3b4a12411bde2c8ae60c2c234

Does that make sense?

@gastaldi
Copy link
Copy Markdown
Member Author

gastaldi commented Mar 5, 2026

Enabling keep alive is what will fix the issue, the read timeout is only a safeguard mechanism to make sure the connection isn't stuck

@yrodiere
Copy link
Copy Markdown
Member

yrodiere commented Mar 5, 2026

So the recommended solution for users in the case described in QUARKUS-5881 would be to use keep-alive, and not necessarily validation queries. Got you.

Still, unless I'm missing something, this does mean that validation queries may get stuck beyond the validation query timeout, unless we set a read timeout.
Enabling keep-alive will make this problem less likely, because supposedly the keep-alive mechanism will sometime / most of the time run into a timeout and remove the connection before validation queries even happen; but still, in some cases, validation query timeouts will not be complied with.

Or am I still misunderstanding? Sorry :x

@gastaldi
Copy link
Copy Markdown
Member Author

gastaldi commented Mar 5, 2026

Still, unless I'm missing something, this does mean that validation queries may get stuck beyond the validation query timeout, unless we set a read timeout.

Yes, that's still possible.

Enabling keep-alive will make this problem less likely, because supposedly the keep-alive mechanism will sometime / most of the time run into a timeout and remove the connection before validation queries even happen; but still, in some cases, validation query timeouts will not be complied with.

Correct.

There is no way to set the read timeout per query, as this is a Connection setting that operates in the network level (usually passed in the URL or via JDBC property). The behavior in setQueryTimeout is driver-dependent so I see the driver's socket/read timeout as a reliable protection against any hung queries (including validation).

@gastaldi gastaldi requested a review from barreiro March 5, 2026 16:17
@xstefank
Copy link
Copy Markdown
Member

@gastaldi @yrodiere this PR seems to be stuck? So any resolution? Will you merge this and provide another fix for the JIRA?

@gastaldi
Copy link
Copy Markdown
Member Author

@xstefank it's waiting for an approval from @yrodiere, I've believe I've answered his questions

@yrodiere
Copy link
Copy Markdown
Member

yrodiere commented Mar 30, 2026

@xstefank it's waiting for an approval from @yrodiere, I've believe I've answered his questions

Uh, no, it's waiting for an approval from Luis, following my claim that the issue is not solved.

I'm +1 to merge this, but -1 to close the corresponding issues.

My understanding of your answers is that this patch does not solve any of the issues the description claims to solve.

It just adds a feature which, while nice, requires that users use a common timeout for every single request in their application. In many cases that just can't work, because you can have some valid long-running queries (e.g. an export for business intelligence purposes taking 30s) while you'd like validation queries to be cut short quite quickly (e.g. 2s).

The way I see it, the only way to fix the problem would be if Agroal makes sure to cancel a validation query when it takes more than the specified validation query timeout. I know Agroal executes many tasks asynchronously through a threadpool, so I guess that would be one way.

EDIT: I will add that my approval is not necessary to carry on -- George and Luis own the Agroal extension and are free to ignore my comments if they disagree.

@barreiro
Copy link
Copy Markdown
Contributor

@gastaldi another option is to wait for agroal/agroal#170 for the network timeout setting. queries that expect to take long can override that.

@gastaldi
Copy link
Copy Markdown
Member Author

gastaldi commented Mar 31, 2026

@barreiro right, it may solve for the readTimeout, but not for enabling keep-alive. Both properties in this PR are set as JDBC properties and the behavior is defined in the JDBC driver implementation.

@gastaldi
Copy link
Copy Markdown
Member Author

However, that's also a Connection setting, not a per-query one, so the same attention must be taken when using it as @yrodiere pointed out

@barreiro
Copy link
Copy Markdown
Contributor

it'a a connection setting but the pool resets the value, making it possible that some queries enjoy bigger timeouts.

@gastaldi gastaldi requested a review from gsmet March 31, 2026 12:00
@gastaldi
Copy link
Copy Markdown
Member Author

I need an approval from a committer to merge this, @gsmet @yrodiere ?

@yrodiere
Copy link
Copy Markdown
Member

Like I said:

I'm +1 to merge this, but -1 to close the corresponding issues.

Do you intend to consider the corresponding issues closed? If so I simply disagree 🤷

@gastaldi
Copy link
Copy Markdown
Member Author

@yrodiere as I mentioned in #52797 (comment), the real fix is enabling the keep-alive mechanism. This is also a solution documented in https://access.redhat.com/solutions/7117794 and mentioned in https://redhat.atlassian.net/browse/QUARKUS-5881

@yrodiere
Copy link
Copy Markdown
Member

But:

Still, unless I'm missing something, this does mean that validation queries may get stuck beyond the validation query timeout, unless we set a read timeout.

Yes, that's still possible.

Enabling keep-alive will make this problem less likely, because supposedly the keep-alive mechanism will sometime / most of the time run into a timeout and remove the connection before validation queries even happen; but still, in some cases, validation query timeouts will not be complied with.

Correct.

So QUARKUS-5881 is not fixed, it's mitigated -- there will be times where the DB went down between two keep-alives, and then we'll again be in the case where "These connections will fail with a timeout error when the application attempts to use them next time."

What's more, #49976 is about validation query timeout not being complied with. As far as I can tell, it still won't be -- people need to use different features.
So we're not fixing that timeout, and perhaps should remove it or have some sort of plan?

I know, I know: the answer is to use a global read-timeout. And then, like I said:

In many cases that just can't work, because you can have some valid long-running queries (e.g. an export for business intelligence purposes taking 30s) while you'd like validation queries to be cut short quite quickly (e.g. 2s).

So, yeah. If you want to merge this, as is, IMO we need to:

  1. Clarify that QUARKUS-5881 is not completely fixed, just mitigated: the keep-alive will reduce the likeliness, and read-timeout will reduce the impact, but an application that, say, needs some queries to take 2 minutes will still experience, from time to time, a connection being stuck for 2 minutes, even if the particular query it was running was expected to take 100ms.
  2. Close Datasource connection validation query timeout doesn't work for PostgreSQL, MariaDB and MySQL databases #49976 as not planned.
  3. Open an issue about deprecating/removing the validation query timeout, since it does not service its purpose.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/agroal area/jdbc Issues related to the JDBC extensions triage/flaky-test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Datasource connection validation query timeout doesn't work for PostgreSQL, MariaDB and MySQL databases

6 participants