rookie0607/Awesome-Reasoning-LALMs

Awesome Reasoning in Large Audio Language Models (LALMs) 🎀🧠

A curated collection of research papers and resources on reasoning in Large Audio Language Models (LALMs). Topics include speech processing, chain-of-thought techniques, and audio-visual understanding.

🔥 Hot Topics
Multimodal Reasoning • Chain-of-Thought Prompting • Speech Translation • Emotion Recognition • Reinforcement Learning

Papers 📄

Chain-of-Thought Reasoning

| Title | Conference | Code | Highlights |
| --- | --- | --- | --- |
| Audio-CoT: Exploring Chain-of-Thought Reasoning in Large Audio Language Model | Preprint | | Pioneering work on CoT in audio models |
| CoT-ST: Enhancing LLM-based Speech Translation with Multimodal Chain-of-Thought | INTERSPEECH 2024 | | Multimodal CoT framework |
| Chain-of-thought prompting for speech translation | EMNLP 2024 | | Zero-shot speech translation |

Multimodal Reasoning

| Title | Conference | Code | Highlights |
| --- | --- | --- | --- |
| Audio-Reasoner: Improving Reasoning Capability in Large Audio Language Models | ICASSP 2025 | | Dedicated audio reasoning architecture |
| Internalizing ASR with Implicit Chain of Thought for Efficient Speech-to-Speech Conversational LLM | ACL 2024 | | End-to-end speech conversation |

Model Optimization

| Title | Conference | Code | Highlights |
| --- | --- | --- | --- |
| Reinforcement Learning Outperforms Supervised Fine-Tuning: A Case Study on Audio QA | NeurIPS 2024 | | RL vs SFT comparison |
| Steering Language Model to Stable Speech Emotion Recognition via Contextual Perception and CoT | ICMI 2024 | | Emotion recognition stability |

▶ Research Areas

  • Multimodal Chain-of-Thought
    Exploring reasoning paths through audio-text interactions
  • Speech Translation
    Enhancing translation quality with reasoning mechanisms
  • Emotion Understanding
    Stable emotion recognition through contextual reasoning
  • Efficient Architectures
    Optimizing model structures for real-time applications

▶ Projects

Coming soon...
Submit your project!

Datasets 📂

Coming soon...
Suggest a dataset!

Contributing

We welcome contributions!

Ways to contribute:

  • Add missing papers (with summary)
  • Suggest new categories
  • Add dataset/resources section
  • Improve documentation

License 📜

This repository is licensed under CC-BY-4.0.
The listed papers keep their own licenses; check each one for its usage terms.


Maintained with ❤️ by [SK-HUANG]
Last updated: 2025/3/21
