artiomist/ebook2audiobook-ipa
ebook2audiobook-ipa

An IPA-based workaround for ebook2audiobook that fixes Russian word stress.

ebook2audiobook (https://github.com/DrewThomasson/ebook2audiobook), made by Drew Thomasson, is a great tool for converting an e-book to an audiobook. For Russian, however, there is a bug with word stress: DrewThomasson/ebook2audiobook#597

This script is a workaround that uses an IPA transcriptor and the xtts-ru-ipa model.

  • After a few tests, intonation seems a bit better with the original model, while word stress is better with the IPA model.
  • Tested on Windows.
  • The examples assume the user path is C:\Users\admin\
  • On my PC the GPU option (step 5) is more than twice as fast as the CPU option (~20,000 s on GPU vs. ~54,000 s on CPU).
  1. pip install git+https://github.com/omogr/omogre.git

    This installs Omogre, a Russian accentuator and IPA transcriptor: https://github.com/omogr/omogre

  2. Download the Russian IPA model files from https://huggingface.co/omogr/xtts-ru-ipa/tree/main and zip them as "xtts-ru-ipa.zip". The archive must contain the 4 files (ref.wav, model.pth, config.json, vocab.json) at its top level, with no folder inside. This is an IPA model that uses this corpus: https://ruslan-corpus.github.io/
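A zip made by right-clicking the download folder usually ends up with that folder inside it, which is exactly what this step warns against. As a minimal sketch (the function name and the folder argument are hypothetical, not part of this repository), the archive can be built so that the four files sit at its root:

```python
import zipfile
from pathlib import Path

# The four files the custom-model zip must contain, as listed in step 2.
REQUIRED_FILES = ("ref.wav", "model.pth", "config.json", "vocab.json")


def build_flat_model_zip(model_dir, zip_path):
    """Zip the four model files at the archive root, with no enclosing folder.

    `model_dir` is wherever the files were downloaded (an assumption of this
    sketch); `zip_path` is the archive to create.
    """
    model_dir = Path(model_dir)
    missing = [name for name in REQUIRED_FILES if not (model_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing model files: {missing}")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in REQUIRED_FILES:
            # arcname=name stores the file at the top level of the archive.
            archive.write(model_dir / name, arcname=name)
    return Path(zip_path)
```

For example, `build_flat_model_zip(r"C:\Users\admin\Downloads\xtts-ru-ipa", r"C:\Users\admin\Downloads\xtts-ru-ipa.zip")` would produce the archive, assuming the files were saved to that folder.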

  3. Convert the e-book from any format to EPUB:

    C:\Users\admin\scoop\shims\ebook-convert.EXE C:\Users\admin\Downloads\XXX.fb2 C:\Users\admin\Downloads\XXX.epub
  4. pip install ebooklib bs4 num2words

    # run once; these packages are needed by transcribe_epub_book_file.py

    python "C:\Users\admin\Downloads\transcribe_epub_book_file.py" "C:\Users\admin\Downloads\XXX.epub"
    • This creates the file "C:\Users\admin\Downloads\XXX_processed.epub".
    • It transcribes Russian text for the xtts-ru-ipa model using the IPA transcriptor installed in step 1, and processes only the index_split_XXX.xhtml files.
    • English text is kept intact.
    • Numbers are converted to words before transcribing.
    • HTML markup inside the XHTML files should be ignored (kept intact).
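The behaviour listed above can be illustrated with a small sketch. This is not the code of transcribe_epub_book_file.py. It is a simplified, standard-library-only illustration of the idea. The names `is_chapter_file` and `transform_russian` are hypothetical, as is the assumption that XXX in index_split_XXX.xhtml is a run of digits. The `transcribe` argument stands in for whatever callable turns Russian text into IPA (in the real script, Omogre).

```python
import posixpath
import re

# Assumption: chapter files are named index_split_<digits>.xhtml.
CHAPTER_RE = re.compile(r"index_split_\d+\.xhtml")
# Anything of the form <...> is treated as markup and left untouched.
TAG_RE = re.compile(r"(<[^>]*>)")
# A run of Cyrillic words (U+0301 is the combining stress mark),
# joined by single spaces or hyphens.
CYRILLIC_RE = re.compile(r"[А-Яа-яЁё\u0301]+(?:[ \-][А-Яа-яЁё\u0301]+)*")


def is_chapter_file(name):
    """Return True for index_split_XXX.xhtml entries inside the EPUB."""
    return CHAPTER_RE.fullmatch(posixpath.basename(name)) is not None


def transform_russian(xhtml, transcribe):
    """Apply `transcribe` to Cyrillic runs only; keep tags and Latin text."""
    parts = TAG_RE.split(xhtml)
    out = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            # Odd indices are the captured tags: keep them intact.
            out.append(part)
        else:
            out.append(CYRILLIC_RE.sub(lambda m: transcribe(m.group(0)), part))
    return "".join(out)
```

Calling `transform_russian('<p class="a">Привет, world!</p>', str.upper)` changes only the Russian word and leaves the tag and the English text as they were. The sketch does not handle comments, `<script>` or `<style>` content, or number-to-word conversion, which would have to happen before transcription.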
  5. Run with GPU:

docker run --pull always --rm --gpus all -p 7860:7860 athomasson2/ebook2audiobook 
  6. Run with CPU:
  • In a terminal, run
    ebook2audiobook/ebook2audiobook.cmd
  • Select C:\Users\admin\Downloads\XXX_processed.epub
  • Select Russian as the language (the default language can be set in ./lib/lang.py).
  • Optional: change the clone voice, e.g. to Morgan Freeman (I like the voice).
  • Load xtts-ru-ipa.zip into the custom model zip field and select it.
  • Process the audiobook.
