Performance Degradation Observed When Upgrading org.graalvm.js from 22.0.0.2 to 23.1.7 #11728

@Tonys-L

Describe GraalVM and your environment:

  • GraalVM version or commit id if built from source: [23.1.7]
  • CE or EE: [CE]
  • JDK version: [JDK17]
  • OS and OS Version: [Windows 10]
  • Architecture: [x86]
  • The output of java -Xinternalversion:

Describe the issue
We are using GraalVM's JS engine to execute dynamically provided JavaScript scripts. Our backend uses a thread pool with 10 threads to execute script tasks. Since org.graalvm.polyglot.Context is not thread-safe, we create a new Context for each script execution.

After upgrading from org.graalvm.js:22.0.0.2 to org.graalvm.js:23.1.7, we observed a 50% drop in performance. Through profiling and analysis, we traced the bottleneck to the Context initialization phase.

🔍 Performance Bottleneck

From the thread dump, we found that script execution threads are blocked at:

com.oracle.truffle.polyglot.InstrumentCache.load(InstrumentCache.java:144)

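This is a synchronized block. The thread dump below shows the threads waiting on the monitor of the InstrumentCache class object itself, so all 10 threads are serialized during Context creation, which causes contention and the performance degradation.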

"DeviceServiceExecutor-22" #318 prio=5 os_prio=0 cpu=27730.25ms elapsed=88.06s tid=0x0000007fa0017900 nid=0x6736 waiting for monitor entry  [0x0000007ed9a22000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at com.oracle.truffle.polyglot.InstrumentCache.load(InstrumentCache.java:144)
	- waiting to lock <0x00000000f5062938> (a java.lang.Class for com.oracle.truffle.polyglot.InstrumentCache)
	at com.oracle.truffle.polyglot.PolyglotEngineImpl.initializeInstruments(PolyglotEngineImpl.java:931)
	at com.oracle.truffle.polyglot.PolyglotEngineImpl.<init>(PolyglotEngineImpl.java:285)
	at com.oracle.truffle.polyglot.PolyglotImpl.buildEngine(PolyglotImpl.java:355)
	at org.graalvm.polyglot.Engine$Builder.build(Engine.java:756)
	at org.graalvm.polyglot.Context$Builder.build(Context.java:1925)
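
To illustrate the contention pattern seen in the dump, here is a minimal, stdlib-only sketch. It is not the Truffle code: `FakeInstrumentCache`, its `load()` method, and the 20 ms sleep are hypothetical stand-ins, chosen only to show how a block synchronized on a class object forces 10 pool threads through one at a time.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class ClassMonitorContentionDemo {

    // Hypothetical stand-in for a cache whose loader is guarded by the class monitor.
    static final class FakeInstrumentCache {
        static final AtomicInteger inside = new AtomicInteger();
        static final AtomicInteger maxInside = new AtomicInteger();

        static void load() throws InterruptedException {
            synchronized (FakeInstrumentCache.class) {
                // Track how many threads are inside the critical section at once.
                int now = inside.incrementAndGet();
                maxInside.accumulateAndGet(now, Math::max);
                Thread.sleep(20); // simulated loading work (assumed cost)
                inside.decrementAndGet();
            }
        }
    }

    /** Starts `threads` concurrent load() calls and returns the elapsed wall time in ms. */
    static long runConcurrentLoads(int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    start.await(); // release all workers at the same moment
                    FakeInstrumentCache.load();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        long t0 = System.nanoTime();
        start.countDown();
        done.await();
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - t0);
        pool.shutdown();
        return elapsedMs;
    }

    public static void main(String[] args) throws InterruptedException {
        long elapsed = runConcurrentLoads(10);
        System.out.println("elapsed ms: " + elapsed
                + ", max threads inside: " + FakeInstrumentCache.maxInside.get());
    }
}
```

In this sketch the elapsed time grows roughly with the number of threads rather than staying near a single call's cost, which is the same shape of slowdown we see when every task builds its own Context.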

🧩 Root Cause Analysis

We traced the issue further into the internal loading mechanism:
EngineAccessor.locatorOrDefaultLoaders() ➜ TruffleLocator.loaders() ➜ Truffle.getRuntime().getCapability(TruffleLocator.class) ➜ DefaultTruffleRuntime.Loader.load()

In GraalVM 22.0.0.2, the DefaultTruffleRuntime.Loader.load() method calls ServiceLoader.load() once.

[Image: DefaultTruffleRuntime.Loader.load() in GraalVM 22.0.0.2]

In GraalVM 23.1.7, it calls ServiceLoader.load() twice, which increases the overhead, especially under high concurrency.

[Image: DefaultTruffleRuntime.Loader.load() in GraalVM 23.1.7]
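
As a rough illustration of why a second `ServiceLoader.load()` call is not free, the sketch below uses only the JDK. Each `ServiceLoader.load()` call returns a fresh, lazy loader with its own empty provider cache, so iterating it looks up and instantiates the providers again. The helper `instantiateProviders` and the choice of `FileSystemProvider` as the service type are assumptions made for the example; this is not the Truffle loader code.

```java
import java.nio.file.spi.FileSystemProvider;
import java.util.ServiceLoader;

final class ServiceLoaderRepeatDemo {

    /**
     * Calls ServiceLoader.load() `loads` times and iterates each result fully.
     * Returns the total number of provider instances created across all loads.
     */
    static <S> int instantiateProviders(Class<S> service, int loads) {
        int total = 0;
        for (int i = 0; i < loads; i++) {
            // A new loader does not share cached providers with earlier loaders.
            ServiceLoader<S> loader = ServiceLoader.load(service);
            for (S provider : loader) {
                total++; // iterating triggers lookup and instantiation
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int once = instantiateProviders(FileSystemProvider.class, 1);
        int twice = instantiateProviders(FileSystemProvider.class, 2);
        System.out.println("one load: " + once + ", two loads: " + twice);
    }
}
```

If that lookup happens while a shared lock is held, the extra pass lengthens the time every waiting thread spends blocked.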

🚨 Impact

  • All threads are serialized when creating Context instances.
  • Performance degradation is significant under concurrent execution.
  • This issue makes it difficult to scale script execution in high-throughput environments.

❓ Question

How can we resolve this performance bottleneck in a multi-threaded environment?
More specifically:

  • Is there a way to avoid synchronization in InstrumentCache.load()?
  • Can we reduce the number of ServiceLoader.load() calls in DefaultTruffleRuntime.Loader.load()?
  • Are there safe ways to reuse Context or pre-initialize engines to avoid contention?
