Add initial Sphinx documentation structure

berangerthomas · berangerthomas · commit 0418b81da059 · 2025-10-14T13:27:55.000+02:00
diff --git a/docs/source/api/index.rst b/docs/source/api/index.rst
@@ -0,0 +1,10 @@
+##############
+API Reference
+##############
+
+This section contains the API documentation generated automatically from the source code docstrings.
+
+.. toctree::
+   :maxdepth: 2
+
+   modules
diff --git a/docs/source/api/modules.rst b/docs/source/api/modules.rst
@@ -0,0 +1,77 @@
+.. _api-reference:
+
+###############
+Project Modules
+###############
+
+main
+====
+
+.. automodule:: main
+   :members:
+
+stellascript.orchestrator
+=========================
+
+.. automodule:: stellascript.orchestrator
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+stellascript.config
+===================
+
+.. automodule:: stellascript.config
+   :members:
+
+stellascript.cli
+================
+
+.. automodule:: stellascript.cli
+   :members:
+
+stellascript.logging_config
+===========================
+
+.. automodule:: stellascript.logging_config
+   :members:
+
+stellascript.audio.capture
+==========================
+
+.. automodule:: stellascript.audio.capture
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+stellascript.audio.enhancement
+==============================
+
+.. automodule:: stellascript.audio.enhancement
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+stellascript.processing.transcriber
+===================================
+
+.. automodule:: stellascript.processing.transcriber
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+stellascript.processing.diarizer
+================================
+
+.. automodule:: stellascript.processing.diarizer
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+stellascript.processing.speaker_manager
+=======================================
+
+.. automodule:: stellascript.processing.speaker_manager
+   :members:
+   :undoc-members:
+   :show-inheritance:
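Note on the directives above: `stellascript.config`, `stellascript.cli`, and `stellascript.logging_config` use `:members:` alone, while the other modules also set `:undoc-members:`. With `:members:` alone, autodoc lists only members that carry a docstring. The sketch below is a rough emulation of that distinction using `inspect`, not autodoc's actual algorithm; `ExampleComponent` and `split_members` are hypothetical names that do not exist in the project.

```python
import inspect


class ExampleComponent:
    """Hypothetical class standing in for a documented project class."""

    def documented(self):
        """Return a fixed greeting."""
        return "hello"

    def undocumented(self):
        return "no docstring here"


def split_members(obj):
    """Partition the public functions of obj by whether they have a docstring."""
    with_doc, without_doc = [], []
    for name, member in inspect.getmembers(obj, inspect.isfunction):
        if name.startswith("_"):
            continue  # private names are skipped, as autodoc does by default
        (with_doc if member.__doc__ else without_doc).append(name)
    return with_doc, without_doc


with_doc, without_doc = split_members(ExampleComponent)
print(":members: alone would list:", with_doc)
print(":undoc-members: would add:", without_doc)
```

If undocumented functions in the three modules should still appear in the reference, adding `:undoc-members:` to their directives would be one way to get there.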
diff --git a/docs/source/conf.py b/docs/source/conf.py
@@ -0,0 +1,42 @@
+# Configuration file for the Sphinx documentation builder.
+#
+# For the full list of built-in configuration values, see the documentation:
+# https://www.sphinx-doc.org/en/master/usage/configuration.html
+
+import os
+import sys
+sys.path.insert(0, os.path.abspath('../..'))
+
+# -- Project information -----------------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
+
+project = 'StellaScript'
+copyright = '2025, Béranger Thomas'
+author = 'Béranger Thomas'
+release = '1.0.0'
+
+# -- General configuration ---------------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
+
+extensions = [
+    'sphinx.ext.autodoc',
+    'sphinx.ext.napoleon',
+    'sphinx.ext.viewcode',
+    'sphinx.ext.todo',
+]
+
+templates_path = ['_templates']
+exclude_patterns = []
+
+language = 'en'
+
+# -- Options for HTML output -------------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
+
+html_theme = 'sphinx_rtd_theme'
+html_static_path = ['_static']
+
+# -- Options for todo extension ----------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/extensions/todo.html#configuration
+
+todo_include_todos = True
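Note on `sys.path.insert(0, os.path.abspath('../..'))`: the relative path is resolved against the current working directory. This works because Sphinx runs `conf.py` with the working directory set to the configuration directory. A possible alternative, sketched below, anchors the path on the file location instead. The helper name `project_root_from` is hypothetical, and the demonstration builds a throwaway directory that only mirrors the `docs/source/conf.py` layout.

```python
import tempfile
from pathlib import Path


def project_root_from(conf_file):
    """Return the directory two levels above the folder containing conf_file."""
    return Path(conf_file).resolve().parents[2]


# Demonstration with a temporary layout mirroring <root>/docs/source/conf.py.
with tempfile.TemporaryDirectory() as tmp:
    conf = Path(tmp) / "docs" / "source" / "conf.py"
    conf.parent.mkdir(parents=True)
    conf.touch()
    root_matches = project_root_from(conf) == Path(tmp).resolve()

print(root_matches)
```

Inside `conf.py` this would presumably be used as `sys.path.insert(0, str(project_root_from(__file__)))`, assuming `__file__` is available when Sphinx evaluates the configuration.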
diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -0,0 +1,32 @@
+.. StellaScript documentation master file.
+
+##########################
+StellaScript Documentation
+##########################
+
+Introduction
+============
+
+StellaScript is a Python application designed for audio transcription and speaker diarization. Its primary goal is to provide an accurate and efficient tool for converting audio streams, whether pre-recorded or captured live, into structured text while identifying the different speakers.
+
+The system is based on a modular architecture and integrates several state-of-the-art machine learning models for its key features:
+
+*   **Speech Recognition**: Utilizes OpenAI's Whisper model, through the ``whisperx`` library for optimized performance, to ensure accurate transcription.
+*   **Speaker Diarization**: Integrates the ``pyannote.audio`` pipeline for audio segmentation and turn-taking identification.
+*   **Speaker Identification**: Generates voice embeddings with ``SpeechBrain`` to differentiate and track speakers consistently.
+
+This documentation aims to provide a technical overview of the project, its architecture, and its API.
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Contents
+
+   technical/index
+   api/index
+
+Indices and tables
+==================
+
+* :ref:`genindex`
+* :ref:`modindex`
+* :ref:`search`
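Note on the introduction in `index.rst`: the "Speaker Identification" bullet mentions voice embeddings used to track speakers consistently. The sketch below illustrates that idea in plain Python, assuming cosine similarity and a fixed threshold. The class name, the threshold value, and the `SPEAKER_NN` labels are all hypothetical; the project's actual matching logic is not shown in this commit and may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SpeakerRegistrySketch:
    """Hypothetical registry mapping speaker labels to reference embeddings."""

    def __init__(self, threshold=0.75):
        self.threshold = threshold  # assumed value, for illustration only
        self.voiceprints = {}       # label -> reference embedding

    def identify(self, embedding):
        """Return the best-matching known label, or register a new speaker."""
        best_label, best_score = None, -1.0
        for label, reference in self.voiceprints.items():
            score = cosine_similarity(embedding, reference)
            if score > best_score:
                best_label, best_score = label, score
        if best_label is not None and best_score >= self.threshold:
            return best_label
        label = f"SPEAKER_{len(self.voiceprints):02d}"
        self.voiceprints[label] = list(embedding)
        return label


registry = SpeakerRegistrySketch()
first = registry.identify([1.0, 0.0, 0.0])   # unknown voice: registered
again = registry.identify([0.9, 0.1, 0.0])   # close to the first voiceprint
other = registry.identify([0.0, 1.0, 0.0])   # orthogonal: new speaker
print(first, again, other)
```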
diff --git a/docs/source/technical/architecture.rst b/docs/source/technical/architecture.rst
@@ -0,0 +1,46 @@
+###################################
+StellaScript Project Architecture
+###################################
+
+This document details the structure of the StellaScript project, the role of each file, and how the modules interact to perform audio transcription and diarization.
+
+Overview
+========
+
+The project is structured around a main module, ``stellascript``, which contains all the application logic. Execution is initiated by ``main.py`` at the project root, which acts as the entry point.
+
+Root Files
+==========
+
+-   ``main.py``: **Application entry point.** It is responsible for parsing command-line arguments, initializing the orchestrator, and launching the transcription process (either live or from a file).
+-   ``README.md``: **Main documentation.** Provides an overview of the project, installation instructions, and usage guidelines.
+-   ``pyproject.toml`` & ``uv.lock``: **Dependency management.** ``pyproject.toml`` declares the Python libraries the project requires, and ``uv.lock`` pins the exact versions resolved for them.
+-   ``.gitignore``: Configuration file for Git, specifying files and folders to be ignored.
+-   ``LICENSE``: Contains the MIT license under which the project is distributed.
+
+``stellascript`` Module (Application Core)
+============================================
+
+The ``stellascript/`` directory contains the main source code of the application, organized into several modules and sub-modules.
+
+-   ``orchestrator.py``: **The conductor.** This is the central file of the project. Its ``StellaScriptTranscription`` class manages the entire processing pipeline: it initializes the various components (transcriber, diarizer, etc.) and coordinates their interactions, whether for real-time or file-based processing.
+-   ``config.py``: **Central configuration.** Gathers the technical constants and parameters used across the application (e.g., sampling rate, audio buffer duration, voice detection thresholds), so that the application's behavior can be adjusted from a single location.
+-   ``cli.py``: **Command-line interface.** Defines all the arguments that the user can pass to the program (such as ``--file``, ``--language``, ``--mode``) and ensures they are correctly interpreted.
+-   ``logging_config.py``: **Logging configuration.** Sets up the logging system to display informational messages, warnings, or errors during execution, which is crucial for debugging.
+
+``stellascript/audio`` Sub-module
+------------------------------------
+
+This sub-module is dedicated to handling raw audio data.
+
+-   ``capture.py``: **Audio capture.** Manages interaction with the microphone to record the audio stream in real-time.
+-   ``enhancement.py``: **Audio enhancement.** Contains the logic for applying audio cleaning models, such as ``DeepFilterNet`` or ``Demucs``, to reduce background noise and improve voice clarity before transcription.
+
+``stellascript/processing`` Sub-module
+-----------------------------------------
+
+This sub-module contains the components responsible for analyzing and processing the audio.
+
+-   ``transcriber.py``: **Transcription module.** Encapsulates the speech recognition model (Whisper via ``whisperx``). Its sole responsibility is to take an audio segment and convert it into text.
+-   ``diarizer.py``: **Diarization module.** Its role is to answer the question: "who is speaking and when?". It uses either the ``pyannote.audio`` pipeline or a combination of VAD (Voice Activity Detection) and clustering to segment the audio by speaker.
+-   ``speaker_manager.py``: **Speaker manager.** Works closely with the ``diarizer``, especially for the ``cluster`` method. It is responsible for creating and managing "voiceprints" (embeddings) to identify and differentiate speakers consistently.
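Note on the module descriptions above: the orchestrator is described as initializing the components and coordinating their interactions. The sketch below shows one way such coordination could look, with the diarizer and transcriber replaced by stub functions. Every name in it (`Segment`, `PipelineSketch`, `fake_diarize`, `fake_transcribe`) is hypothetical; it is not the `StellaScriptTranscription` class and makes no claim about its real interface.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One speaker turn with its transcribed text."""
    start: float
    end: float
    speaker: str
    text: str = ""


class PipelineSketch:
    """Hypothetical stand-in for the orchestrator's coordinating role."""

    def __init__(self, diarize, transcribe):
        self.diarize = diarize        # callable: (samples, rate) -> [(start, end, speaker)]
        self.transcribe = transcribe  # callable: (samples) -> str

    def run(self, samples, sample_rate):
        """Diarize first, then transcribe each speaker turn separately."""
        segments = []
        for start, end, speaker in self.diarize(samples, sample_rate):
            chunk = samples[int(start * sample_rate):int(end * sample_rate)]
            segments.append(Segment(start, end, speaker, self.transcribe(chunk)))
        return segments


def fake_diarize(samples, sample_rate):
    """Stub diarizer: pretends two speakers each talk for one second."""
    return [(0.0, 1.0, "SPEAKER_00"), (1.0, 2.0, "SPEAKER_01")]


def fake_transcribe(chunk):
    """Stub transcriber: reports the chunk size instead of real text."""
    return f"<{len(chunk)} samples>"


pipeline = PipelineSketch(fake_diarize, fake_transcribe)
result = pipeline.run([0.0] * 32000, sample_rate=16000)
for segment in result:
    print(segment.speaker, segment.start, segment.end, segment.text)
```

Passing the components in as callables keeps each one replaceable, which matches the single-responsibility split described for `transcriber.py` and `diarizer.py`.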
diff --git a/docs/source/technical/index.rst b/docs/source/technical/index.rst
@@ -0,0 +1,10 @@
+###################
+Technical Section
+###################
+
+This section provides detailed information about the internal architecture, design choices, and methodologies used in the StellaScript project.
+
+.. toctree::
+   :maxdepth: 2
+
+   architecture