[Bugfix, UE4] SensorData's frame, timestamp and transform are wrong for camera sensors #8935

Open: wants to merge 1 commit into base: ue4-dev

@tdegraaff tdegraaff commented May 23, 2025

After migrating an internal tool from CARLA 0.9.13 to 0.9.15, I noticed an issue with the metadata that comes with SensorData in 0.9.15.
For camera-based sensors, the frame, timestamp and transform do not correspond to the actual image. This is quite a severe issue for things like sensor fusion (since we cannot reliably synchronize different sensors) or 3D reconstruction (where accurate transforms are necessary).

I created a minimal script, sensor_metadata_test.zip, that showcases this problem. It spawns a camera and moves it along the x-axis by one meter per tick for ten ticks. It then logs the frame, timestamp and x-position of the sensor according to the WorldSnapshot and according to the received SensorData. The images are also stored on disk so that they can be inspected.
The script runs on at least 0.9.13 through the current ue4-dev branch and shows the differences in what the SensorData reports about the sensor:

0.9.13

In 0.9.13, everything is as expected (on Linux and Windows). WorldSnapshot and SensorData report identical frames, timestamps and x-positions. The stored images also show that the camera moves by one step every frame.

Timing results (0.9.13)
        Linux                        Windows
World:
frame | time     | x_pos     frame | time    | x_pos
    8 | 9.89307  | 1.0           8 | 2.80794 | 1.0
    9 | 9.94307  | 2.0           9 | 2.85794 | 2.0
   10 | 9.99307  | 3.0          10 | 2.90794 | 3.0
   11 | 10.04307 | 4.0          11 | 2.95794 | 4.0
   12 | 10.09307 | 5.0          12 | 3.00794 | 5.0
   13 | 10.14307 | 6.0          13 | 3.05794 | 6.0
   14 | 10.19307 | 7.0          14 | 3.10794 | 7.0
   15 | 10.24307 | 8.0          15 | 3.15794 | 8.0
   16 | 10.29307 | 9.0          16 | 3.20794 | 9.0
   17 | 10.34307 | 10.0         17 | 3.25794 | 10.0

Sensor:
frame | time     | x_pos     frame | time    | x_pos
    8 | 9.89307  | 1.0           8 | 2.80794 | 1.0
    9 | 9.94307  | 2.0           9 | 2.85794 | 2.0
   10 | 9.99307  | 3.0          10 | 2.90794 | 3.0
   11 | 10.04307 | 4.0          11 | 2.95794 | 4.0
   12 | 10.09307 | 5.0          12 | 3.00794 | 5.0
   13 | 10.14307 | 6.0          13 | 3.05794 | 6.0
   14 | 10.19307 | 7.0          14 | 3.10794 | 7.0
   15 | 10.24307 | 8.0          15 | 3.15794 | 8.0
   16 | 10.29307 | 9.0          16 | 3.20794 | 9.0
   17 | 10.34307 | 10.0         17 | 3.25794 | 10.0

0.9.15

In 0.9.15, WorldSnapshot is again as expected, but SensorData returns wrong values. On Linux, I quite consistently get an offset of 1 frame, which also affects the timestamp and transform. On Windows, the problem is even worse: frames, timestamps and x-positions are duplicated or skipped. The stored images still show that the camera moves by one step every frame, i.e. the images themselves are correct but are associated with wrong metadata.

I also tested the current ue4-dev branch, and the issue still exists there.

Timing results (0.9.15)
        Linux                        Windows
World:
frame | time     | x_pos     frame | time    | x_pos
 2992 | 10.03505 | 1.0         685 | 4.37873 | 1.0
 2993 | 10.08505 | 2.0         686 | 4.42873 | 2.0
 2994 | 10.13505 | 3.0         687 | 4.47873 | 3.0
 2995 | 10.18505 | 4.0         688 | 4.52873 | 4.0
 2996 | 10.23505 | 5.0         689 | 4.57873 | 5.0
 2997 | 10.28505 | 6.0         690 | 4.62873 | 6.0
 2998 | 10.33505 | 7.0         691 | 4.67873 | 7.0
 2999 | 10.38505 | 8.0         692 | 4.72873 | 8.0
 3000 | 10.43505 | 9.0         693 | 4.77873 | 9.0
 3001 | 10.48505 | 10.0        694 | 4.82873 | 10.0

Sensor:
frame | time     | x_pos     frame | time    | x_pos
 2993 | 10.08505 | 2.0         686 | 4.42873 | 2.0
 2994 | 10.13505 | 3.0         687 | 4.52873 | 4.0
 2995 | 10.18505 | 4.0         688 | 4.57873 | 5.0
 2996 | 10.23505 | 5.0         689 | 4.62873 | 6.0
 2997 | 10.28505 | 6.0         690 | 4.67873 | 7.0
 2998 | 10.33505 | 7.0         691 | 4.67873 | 7.0
 2999 | 10.38505 | 8.0         691 | 4.77873 | 9.0
 3000 | 10.43505 | 9.0         693 | 4.82873 | 10.0
 3001 | 10.48505 | 10.0        694 | 4.82873 | 10.0
 3001 | 10.48505 | 10.0        694 | 4.82873 | 10.0

Problem

I dug into the history of the sensor data streaming code and found that the issue probably arose from the changes in this commit. As the comment in PixelReader.h states, creating the stream as part of the lambda's closure means it is still executed on the game thread, so the frame, timestamp and sensor transform are accurate when the FAsyncDataStreamTmpl is instantiated here. The mentioned change violates this assumption, since the stream is now created inside the lambda body, which is executed asynchronously on the render thread. Thus, the metadata can be queried too late, which associates the wrong frame, timestamp and transform with the sensor data that is sent.

Solution

I was not sure why the creation of the stream was moved from the closure into the lambda, but when I reverted this, CARLA crashed when spawning and listening to a sensor. So I left the stream creation untouched and implemented a simple solution:

  • Add setters for the timestamp and transform of a FDataStreamTmpl (as already exists for the frame)
  • Query the frame, timestamp and transform in the closure of the outermost lambda (thus executed on the game thread) and set those values on the stream after its instantiation

If there is a cleverer fix, please let me know. With this implementation, the timings are correct again (this time only tested on Windows):

Timing results (after fix)
World:
frame | time     | x_pos
 3756 | 54.56837 | 1.0
 3757 | 54.61837 | 2.0
 3758 | 54.66837 | 3.0
 3759 | 54.71837 | 4.0
 3760 | 54.76837 | 5.0
 3761 | 54.81837 | 6.0
 3762 | 54.86837 | 7.0
 3763 | 54.91837 | 8.0
 3764 | 54.96837 | 9.0
 3765 | 55.01837 | 10.0

Sensor:
frame | time     | x_pos
 3756 | 54.56837 | 1.0
 3757 | 54.61837 | 2.0
 3758 | 54.66837 | 3.0
 3759 | 54.71837 | 4.0
 3760 | 54.76837 | 5.0
 3761 | 54.81837 | 6.0
 3762 | 54.86837 | 7.0
 3763 | 54.91837 | 8.0
 3764 | 54.96837 | 9.0
 3765 | 55.01837 | 10.0

0.10.0

From what I can see, this issue should also exist in 0.10.0: PixelReader.h @ ue5-dev
Since I am not currently working with 0.10.0, someone from the CARLA team could consider porting this PR to ue5-dev (if it is accepted as is).

Where has this been tested?

  • Platform(s): Windows 11
  • Python version(s): 3.8
  • Unreal Engine version(s): 4.26
  • Tested in Editor and Packaged

…ng to the actually sent image for camera sensors.
@tdegraaff tdegraaff requested a review from a team as a code owner May 23, 2025 07:00