A real-time procedural Metal music generation system based on fine-tuned MusicVAE models, integrated into Unity.
Generative AI for MIDI Sequences is a thesis project focused on extending the capabilities of generative music models for stylistically complex genres — specifically Metal music — and integrating real-time procedural generation into a Unity-based interactive environment.
Despite significant advancements in AI-based music generation, general-purpose pre-trained models often struggle when applied to highly structured and stylistically demanding genres.
Preliminary experiments using pre-trained MusicVAE models produced convincing results in Classical and Jazz domains. However, when applied to Metal, significant limitations emerged:
- Lack of rhythmic consistency
- Weak harmonic coherence
- Absence of genre-specific features (such as complex drum patterns and fast tempo structures)
These issues were directly linked to the limited representation of Metal music within the original training datasets.
The primary objective of this thesis was to develop a system capable of generating dynamic and stylistically coherent Metal music in real time, integrated within a Unity game environment.
Key goals achieved:
- ✅ Creation of a specialized Metal MIDI dataset
- ✅ Fine-tuning MusicVAE models for genre-specific generation
- ✅ Real-time integration through OSC communication
- ✅ Development of a responsive procedural music system for interactive environments
This project demonstrates how targeted fine-tuning on domain-specific datasets significantly extends the expressive capabilities of generative models, allowing them to operate effectively in stylistically complex domains.
The final result is a functional prototype in which procedural generation dynamically responds to user interaction inside a game environment.
- Install Anaconda.
- Create a Python 3.10 virtual environment.
- Install the required dependencies (`tensorflow`, `magenta`, etc.).
⚠️ Note: Magenta and TensorFlow require careful dependency management. Python 3.10 is strictly required for compatibility.
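For reference, a typical setup sequence looks like the following (the environment name is a placeholder; installing `magenta` declares its own TensorFlow dependency, so a compatible build is usually pulled in automatically):

```bash
# Create and activate an isolated Python 3.10 environment (name is a placeholder).
conda create -n metal-vae python=3.10
conda activate metal-vae

# Install Magenta; it pins its own TensorFlow version,
# so manual version management is usually unnecessary.
pip install magenta
```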
- Download your Metal MIDI files.
- Place them inside your designated dataset directory.
- To separate instrumental tracks (Guitar, Bass, Drums), run: 👉 `Scripts/splitter.py`
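For illustration, a per-instrument split can be sketched with `pretty_midi`. The General MIDI program ranges and output layout below are assumptions for the sketch, not necessarily what `Scripts/splitter.py` does internally:

```python
# Illustrative sketch of per-instrument splitting (assumption: General MIDI
# program ranges identify guitar and bass tracks; Scripts/splitter.py may differ).
import pathlib
import pretty_midi

def split_midi(path: pathlib.Path, out_dir: pathlib.Path) -> None:
    midi = pretty_midi.PrettyMIDI(str(path))
    selectors = {
        "drums":  lambda inst: inst.is_drum,
        "guitar": lambda inst: not inst.is_drum and 24 <= inst.program <= 31,  # GM guitars
        "bass":   lambda inst: not inst.is_drum and 32 <= inst.program <= 39,  # GM basses
    }
    for name, keep in selectors.items():
        tracks = [inst for inst in midi.instruments if keep(inst)]
        if not tracks:
            continue  # this file has no matching instrument
        out = pretty_midi.PrettyMIDI()
        out.instruments.extend(tracks)
        (out_dir / name).mkdir(parents=True, exist_ok=True)
        out.write(str(out_dir / name / path.name))

for midi_file in pathlib.Path("dataset/raw").glob("*.mid"):
    split_midi(midi_file, pathlib.Path("dataset/split"))
```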
After splitting the tracks, convert each instrument dataset into the TFRecord format required by MusicVAE:
- 🎸 Guitar: `Scripts/convert_guitar_to_tf.py`
- 🎸 Bass: `Scripts/convert_bass_to_tf.py`
- 🥁 Drums: `Scripts/convert_drums_to_tf.py`
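Under the assumption that these scripts follow Magenta's standard pipeline, the core of the conversion is turning each MIDI file into a serialized `NoteSequence` proto inside a TFRecord file, roughly:

```python
# Sketch of MIDI -> TFRecord conversion (assumed to mirror what the
# convert_*_to_tf.py scripts do via Magenta's standard NoteSequence pipeline).
import pathlib
import note_seq
import tensorflow as tf

def midi_dir_to_tfrecord(midi_dir: str, output_path: str) -> None:
    with tf.io.TFRecordWriter(output_path) as writer:
        for midi_path in sorted(pathlib.Path(midi_dir).glob("*.mid")):
            try:
                sequence = note_seq.midi_file_to_note_sequence(str(midi_path))
            except note_seq.MIDIConversionError:
                continue  # skip corrupt or unreadable MIDI files
            writer.write(sequence.SerializeToString())

midi_dir_to_tfrecord("dataset/split/guitar", "dataset/guitar_notesequences.tfrecord")
```

Magenta also ships a `convert_dir_to_note_sequences` console script that performs this same step on a whole directory tree.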
The following pre-trained MusicVAE models were selected for fine-tuning based on the instrument:
| Instrument | Base Model | Training Script |
|---|---|---|
| 🎸 Guitar | `cat-mel_2bar_big` | `Scripts/train_guitar.py` |
| 🎸 Bass | `cat-mel_2bar_med_chords` | `Scripts/train_bass.py` |
| 🥁 Drums | `cat-drums_2bar_small` | `Scripts/train_drums.py` |
Configuration: Training parameters (`num_steps`, `batch_size`, `checkpoint_interval`) can be configured directly inside each script.
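Assuming the training scripts wrap Magenta's standard `music_vae_train` entry point, a typical invocation looks like this (paths and hyperparameter values are placeholders):

```bash
# Hedged sketch: fine-tune the drum model on the converted Metal dataset.
# run_dir, examples_path, and hparams values are placeholders to adapt.
music_vae_train \
  --config=cat-drums_2bar_small \
  --run_dir=checkpoints/drums \
  --mode=train \
  --examples_path=dataset/drums_notesequences.tfrecord \
  --hparams=batch_size=64,learning_rate=0.0005 \
  --num_steps=50000
```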
After training, you can generate new, original MIDI sequences from the fine-tuned checkpoints.
Customization: The generation process is highly flexible. By modifying the parameters within the generation script, you can adjust musical progressions, structural logic, and core generation data to fit your specific needs.
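As a minimal sketch (the checkpoint path, batch size, and temperature are placeholder values), sampling from a fine-tuned model with Magenta's `TrainedModel` API looks like:

```python
# Minimal sampling sketch using Magenta's TrainedModel API
# (checkpoint path and sampling values are placeholders).
import note_seq
from magenta.models.music_vae import configs
from magenta.models.music_vae.trained_model import TrainedModel

config = configs.CONFIG_MAP["cat-drums_2bar_small"]
model = TrainedModel(
    config,
    batch_size=4,
    checkpoint_dir_or_path="checkpoints/drums",  # fine-tuned checkpoint directory
)

# Draw 4 latent samples and decode them into 2-bar sequences; lower
# temperature gives more conservative output, higher gives more varied output.
samples = model.sample(n=4, length=32, temperature=0.9)
for i, sequence in enumerate(samples):
    note_seq.sequence_proto_to_midi_file(sequence, f"generated_drums_{i}.mid")
```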
The system uses the OSC (Open Sound Control) protocol to trigger audio samples dynamically based on AI-generated data.
- Install OSCJack: Download and import OSCJack into Unity following the official instructions.
- OSCManager: Create an empty GameObject named `OSCManager`.
- Player Objects: Create 5 child GameObjects inside `OSCManager`: `GuitarPlayer`, `BassPlayer`, `KickPlayer`, `SnarePlayer`, and `HiHatPlayer`.
- AudioSources: Add an `AudioSource` to each Player and assign a short (one-shot) audio sample corresponding to the instrument.
- MetalReceiver: Apply the script 👉 `Scripts/MetalReceiver.cs` to the `OSCManager` and link the Players in the Inspector.
- SoundTrackManager: Apply the script 👉 `Scripts/SoundTrackManager.cs` within Unity. This component manages the overall soundtrack flow and handles the card selection logic, allowing the music to react to player choices.
- Start the OSC Server: Open the Anaconda Prompt, activate your environment, and run the server script: 👉 `Scripts/playbackOSC.py`. This script acts as the server that generates the Metal music and sends the data to Unity via OSC (a minimal sketch of the sending side follows this list).
- Play in Unity: Press Play in the Unity Editor to listen to the generated output.
- Parameter Tweaking: You can modify generation parameters, musical progressions, and other settings within the code to customize the musical results as needed.
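For reference, the sending side can be as simple as the following sketch. The OSC addresses and port here are assumptions; they must match whatever `Scripts/MetalReceiver.cs` and OSCJack are configured to listen for:

```python
# Sketch of the Python -> Unity OSC bridge using the python-osc package.
# Addresses (/guitar, /kick, ...) and port 9000 are assumptions that must
# match the OSCJack receiver configuration inside Unity.
import time
from pythonosc.udp_client import SimpleUDPClient

client = SimpleUDPClient("127.0.0.1", 9000)

def trigger(address: str, pitch: int, velocity: int) -> None:
    """Send one note event; Unity maps it to a one-shot AudioSource."""
    client.send_message(address, [pitch, velocity])

# Play a bar of example events at 180 BPM (eighth notes).
for pitch in (40, 40, 43, 40):
    trigger("/guitar", pitch, 110)
    trigger("/kick", 36, 127)
    time.sleep(60 / 180 / 2)
```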
⚠️ Disclaimer: This project is currently a research prototype and has not yet been released as a production-ready system.
It serves as a comprehensive proof-of-concept demonstrating:
- Genre-specialized generative AI
- Real-time procedural music systems
- Interactive AI-driven audio design
- 📈 Expansion of the Metal dataset for broader stylistic coverage.
- 🔗 Multi-instrument conditioning for tighter band cohesion.
- 🎭 Emotional modulation models to drive music based on game tension.
- 🏥 Validation in therapeutic environments (e.g., active music therapy).
