Goal-conditioned Super Mario Bros. experiments built around a LeWM-style world model.
This repo keeps the publishable core only:
- export exact FM2 playback from FCEUX into `.npz` episodes
- convert per-frame episodes into blocked LeWM trajectories
- train a reward-free world model from offline traces
- evaluate offline losses
- run local goal-conditioned live demos
- record demo videos with start/goal reference panels
What this repo is not:
- not a full-game autonomous Mario agent
- not a reinforcement learning project
- not a full emulator bundle
The main demo replays an existing trace up to a chosen point, then lets the model take over for a short local segment toward a target image.
This repo does not include:
- FCEUX binaries
- the Super Mario Bros. ROM
- large local datasets or checkpoints
You need to provide:
- an FCEUX install directory containing `fceux64.exe`
- a compatible Super Mario Bros. ROM (SHA256: F61548FDF1670CFFEFCC4F0B7BDCDD9EABA0C226E3B74F8666071496988248DE)
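One way to check a ROM file against the SHA256 listed above is to hash it with the standard library. This is a minimal sketch and not part of the repo; the helper names `sha256_of_file` and `rom_matches` are hypothetical.

```python
import hashlib
from pathlib import Path

# SHA256 of the expected ROM, copied from the requirements list above.
EXPECTED_SHA256 = "F61548FDF1670CFFEFCC4F0B7BDCDD9EABA0C226E3B74F8666071496988248DE"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the uppercase hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def rom_matches(path, expected=EXPECTED_SHA256):
    """True if the file exists and its digest equals the expected hash."""
    path = Path(path)
    return path.is_file() and sha256_of_file(path) == expected.upper()
```

Calling `rom_matches("fceux/SMB.nes")` before exporting traces gives an early warning if the ROM differs from the one the traces were recorded against.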
By default, the scripts expect:
- FCEUX under `./fceux`
- the ROM at `./fceux/SMB.nes`
- traces under `./traces`
Those paths can be overridden with command-line flags.
Core Python package:
- `mario_lewm/__init__.py`
- `mario_lewm/dataset.py`
- `mario_lewm/fm2.py`
- `mario_lewm/model.py`
- `mario_lewm/planning.py`
Supported scripts:
- `export_fceux_dataset.py`
- `build_lewm_mario_dataset.py`
- `split_mario_dataset.py`
- `precompute_mario_dataset.py`
- `train_mario.py`
- `test_mario.py`
- `show_goal_frames.py`
- `demo_mario_goal_live_fixed.py`
- `demo_mario_goal_live_record.py`
- `fceux_export_trace.lua`
- `fceux_live_bridge_fixed.lua`
- `fceux_live_bridge_record.lua`
Sample rendered videos are available under `demo/`, for example:
- `demo/demo1.mp4`
- `demo/demo5.mp4`
- `demo/demo9.mp4`
These were produced with the recorded local-goal demo pipeline and are meant as examples, not as required repo assets.
If your Markdown renderer does not show embedded video, use the direct file links above.
Install the Python dependencies:

    python -m pip install -r requirements.txt

You also need `ffmpeg` on `PATH` for recorded demos.
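A quick way to confirm that `ffmpeg` is reachable before starting a recording run is to look it up on `PATH` from Python. This is a small sketch, not a script shipped with the repo; `find_missing_tools` is a hypothetical helper name.

```python
import shutil


def find_missing_tools(tools=("ffmpeg",)):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = find_missing_tools()
if missing:
    print("Missing from PATH:", ", ".join(missing))
```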
Export raw per-frame episodes from the FM2 traces:

    python export_fceux_dataset.py --trace-root traces --output-dir mario_dataset_raw_lewm --max-frames 2000 --capture-initial-frame

This produces `.npz` files with:
- RGB frames from exact FM2 playback
- per-frame controller rows
- metadata copied from the FM2 header
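To see what an exported episode contains, you can list the arrays stored in the `.npz` file with NumPy. The sketch below builds a tiny stand-in episode in memory so it runs without a dataset; the key names `frames` and `actions`, the frame size, and the 8-button row width are assumptions for illustration, so check `data.files` on a real export for the actual layout.

```python
import io

import numpy as np


def summarize_episode(npz_source):
    """Return {array name: (shape, dtype)} for every array in an .npz file."""
    with np.load(npz_source, allow_pickle=False) as data:
        return {name: (data[name].shape, str(data[name].dtype)) for name in data.files}


# Stand-in episode with hypothetical key names and shapes.
buffer = io.BytesIO()
np.savez_compressed(
    buffer,
    frames=np.zeros((4, 240, 256, 3), dtype=np.uint8),  # RGB frames
    actions=np.zeros((4, 8), dtype=np.uint8),           # one controller row per frame
)
buffer.seek(0)
summary = summarize_episode(buffer)
print(summary)
```

`summarize_episode` also accepts a file path, for example one of the files written to `mario_dataset_raw_lewm`.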
Build the blocked dataset:

    python build_lewm_mario_dataset.py --dataset-root mario_dataset_raw_lewm --output-dir mario_dataset_lewm --frame-skip 5

The blocked dataset uses:
- one blocked action = 5 raw emulator frames
- one blocked frame sequence aligned to those action blocks
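The blocking described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the logic of `build_lewm_mario_dataset.py`: it assumes the episode holds one more frame than controller rows (the captured initial frame), concatenates the 5 controller rows of a block into one action vector, and drops any trailing partial block. `block_episode` is a hypothetical name.

```python
import numpy as np


def block_episode(frames, actions, frame_skip=5):
    """Group per-frame data into blocks of `frame_skip` raw emulator frames.

    frames:  (T + 1, ...) array, frame 0 is the state before any input.
    actions: (T, buttons) array, one controller row per raw frame.
    Returns (blocked_frames, blocked_actions) with N + 1 frames and N actions.
    """
    num_blocks = len(actions) // frame_skip
    usable = num_blocks * frame_skip  # drop a trailing partial block
    # Assumption: one blocked action is the concatenation of its raw rows.
    blocked_actions = actions[:usable].reshape(num_blocks, frame_skip * actions.shape[1])
    # Keep the frame at each block boundary: indices 0, 5, 10, ...
    blocked_frames = frames[: usable + 1 : frame_skip]
    return blocked_frames, blocked_actions


frames = np.arange(12).reshape(12, 1)        # 12 dummy frames labelled 0..11
actions = np.zeros((11, 8), dtype=np.uint8)  # 11 controller rows
blocked_frames, blocked_actions = block_episode(frames, actions)
print(blocked_frames.ravel(), blocked_actions.shape)
```

With 11 raw controller rows and `frame_skip=5`, the sketch keeps two complete blocks and the three frames that bound them.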
Split the blocked dataset by trace. This keeps one or more whole traces out of training so demos and evaluation can use held-out data.

    python split_mario_dataset.py --dataset-root mario_dataset_lewm --train-dir mario_dataset_lewm_train --test-dir mario_dataset_lewm_test --test-name 141presses2

Precompute resized frames for training:

    python precompute_mario_dataset.py --dataset-root mario_dataset_lewm_train --output-dir mario_dataset_lewm_train_224 --image-size 224

Train:

    python train_mario.py --dataset-root mario_dataset_lewm_train --precomputed-root mario_dataset_lewm_train_224 --output-dir mario_runs/run_lewm_mario --epochs 100 --batch-size 128 --num-workers 6 --save-every 20 --npz-load-mode lazy --max-cached-episodes 4 --batching episode --log-every-steps 20 --compile

Resume:

    python train_mario.py --dataset-root mario_dataset_lewm_train --precomputed-root mario_dataset_lewm_train_224 --output-dir mario_runs/run_lewm_mario --epochs 100 --batch-size 128 --num-workers 6 --save-every 20 --npz-load-mode lazy --max-cached-episodes 4 --batching episode --log-every-steps 20 --compile --resume-latest

Evaluate offline losses on the held-out trace:

    python test_mario.py --checkpoint mario_runs/run_lewm_mario/best.pt --dataset-root mario_dataset_lewm_test --mode offline --batch-size 32 --num-workers 0

Preview the start and goal frames:

    python show_goal_frames.py --dataset-root mario_dataset_lewm_test 141presses2 60 65

Run a live goal-conditioned demo:

    python demo_mario_goal_live_fixed.py --checkpoint mario_runs/run_lewm_mario/best.pt --dataset-root mario_dataset_lewm_test --episode-name 141presses2 --start-index 60 --goal-index 65 --horizon 5 --replan-every 1 --max-steps 10 --visual-debug

Record a demo video:

    python demo_mario_goal_live_record.py --checkpoint mario_runs/run_lewm_mario/best.pt --dataset-root mario_dataset_lewm_test --episode-name 141presses2 --start-index 60 --goal-index 65 --horizon 5 --replan-every 1 --max-steps 10 --output-dir demo_141presses2

Batch several recordings:

    python demo_mario_goal_live_record.py --checkpoint mario_runs/run_lewm_mario/best.pt --dataset-root mario_dataset_lewm_test --episode-name 141presses2 --start-index 60 --goal-index 65 --horizon 5 --replan-every 1 --max-steps 10 --output-dir demo_141presses2_batch --num-runs 10

Notes:
- The demos are local goal-conditioned control, not full-level gameplay.
- The video starts with trace replay, then switches to model control.
- The planner does not use rewards or RL.
- Planning uses a target image and latent-distance cost only.
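To make the last two notes concrete, here is a toy sketch of reward-free planning with a latent-distance cost. It uses plain random shooting over discrete action ids and a hand-written one-dimensional latent step function. All of these are assumptions chosen to keep the example self-contained; the optimizer, action encoding, and learned dynamics in `mario_lewm/planning.py` may differ.

```python
import numpy as np


def plan_random_shooting(z_start, z_goal, step_fn, num_actions,
                         horizon=5, num_candidates=256, rng=None):
    """Return (first_action, cost) of the best sampled action sequence.

    The cost is the squared L2 distance between the predicted final latent
    and the goal latent. No reward signal is involved.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Sample candidate sequences of action ids, shape (num_candidates, horizon).
    candidates = rng.integers(0, num_actions, size=(num_candidates, horizon))
    best_cost, best_sequence = np.inf, None
    for sequence in candidates:
        z = z_start
        for action in sequence:
            z = step_fn(z, action)  # roll the latent forward one blocked step
        cost = float(np.sum((z - z_goal) ** 2))
        if cost < best_cost:
            best_cost, best_sequence = cost, sequence
    return int(best_sequence[0]), best_cost


# Toy stand-in for a learned latent dynamics model: action 0 stays, 1 moves right.
MOVES = np.array([0.0, 1.0])


def toy_step(z, action):
    return z + MOVES[action]


first_action, cost = plan_random_shooting(
    z_start=np.array([0.0]), z_goal=np.array([3.0]),
    step_fn=toy_step, num_actions=2, horizon=3,
    rng=np.random.default_rng(0),
)
print(first_action, cost)
```

In a real loop the goal latent would come from encoding the target image, and with `--replan-every 1` only the first action of each plan is executed before planning again from the newly observed frame.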