ImGoingToBakeMyGPU
diff --git a/‎.env
Lines changed: 1 addition & 0 deletions b/‎.env
diff --git a/‎.github/workflows/unitest.yml
Lines changed: 1 addition & 1 deletion b/‎.github/workflows/unitest.yml
diff --git a/‎.gitignore
Lines changed: 5 additions & 0 deletions b/‎.gitignore
diff --git a/‎README.md
Lines changed: 70 additions & 97 deletions b/‎README.md
diff --git a/‎Retrieval_based_Voice_Conversion_WebUI.ipynb
Lines changed: 1 addition & 1 deletion b/‎Retrieval_based_Voice_Conversion_WebUI.ipynb
diff --git a/‎Retrieval_based_Voice_Conversion_WebUI_v2.ipynb
Lines changed: 1 addition & 1 deletion b/‎Retrieval_based_Voice_Conversion_WebUI_v2.ipynb
diff --git a/‎assets/indices/.gitignore
Lines changed: 2 additions & 0 deletions b/‎assets/indices/.gitignore
diff --git a/‎configs/config.json
Lines changed: 1 addition & 1 deletion b/‎configs/config.json
diff --git a/‎configs/config.py
Lines changed: 16 additions & 13 deletions b/‎configs/config.py
diff --git a/‎configs/inuse/.gitignore
Lines changed: 4 additions & 0 deletions b/‎configs/inuse/.gitignore
@@ -5,4 +5,5 @@ no_proxy = localhost, 127.0.0.1, ::1
 weight_root = assets/weights
 weight_uvr5_root = assets/uvr5_weights
 index_root = logs
+outside_index_root = assets/indices
 rmvpe_root = assets/rmvpe
@@ -33,4 +33,4 @@ jobs:
         python infer/modules/train/preprocess.py logs/mute/0_gt_wavs 48000 8 logs/mi-test True 3.7
         touch logs/mi-test/extract_f0_feature.log
         python infer/modules/train/extract/extract_f0_print.py logs/mi-test $(nproc) pm
-        python infer/modules/train/extract_feature_print.py cpu 1 0 0 logs/mi-test v1
+        python infer/modules/train/extract_feature_print.py cpu 1 0 0 logs/mi-test v1 True
@@ -21,3 +21,8 @@ rmvpe.pt
 
 # To set a Python version for the project
 .tool-versions
+
+/runtime
+/assets/weights/*
+ffmpeg.*
+ffprobe.*
@@ -14,35 +14,28 @@
 
 [![Discord](https://img.shields.io/badge/RVC%20Developers-Discord-7289DA?style=for-the-badge&logo=discord&logoColor=white)](https://discord.gg/HcsmBBGyVk)
 
-[**更新日志**](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/blob/main/docs/Changelog_CN.md) | [**常见问题解答**](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/wiki/%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98%E8%A7%A3%E7%AD%94) | [**AutoDL·5毛钱训练AI歌手**](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/wiki/Autodl%E8%AE%AD%E7%BB%83RVC%C2%B7AI%E6%AD%8C%E6%89%8B%E6%95%99%E7%A8%8B) | [**对照实验记录**](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/wiki/Autodl%E8%AE%AD%E7%BB%83RVC%C2%B7AI%E6%AD%8C%E6%89%8B%E6%95%99%E7%A8%8B](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/wiki/%E5%AF%B9%E7%85%A7%E5%AE%9E%E9%AA%8C%C2%B7%E5%AE%9E%E9%AA%8C%E8%AE%B0%E5%BD%95)) | [**在线演示**](https://modelscope.cn/studios/FlowerCry/RVCv2demo)
+[**更新日志**](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/blob/main/docs/cn/Changelog_CN.md) | [**常见问题解答**](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/wiki/%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98%E8%A7%A3%E7%AD%94) | [**AutoDL·5毛钱训练AI歌手**](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/wiki/Autodl%E8%AE%AD%E7%BB%83RVC%C2%B7AI%E6%AD%8C%E6%89%8B%E6%95%99%E7%A8%8B) | [**对照实验记录**](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/wiki/Autodl%E8%AE%AD%E7%BB%83RVC%C2%B7AI%E6%AD%8C%E6%89%8B%E6%95%99%E7%A8%8B](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/wiki/%E5%AF%B9%E7%85%A7%E5%AE%9E%E9%AA%8C%C2%B7%E5%AE%9E%E9%AA%8C%E8%AE%B0%E5%BD%95)) | [**在线演示**](https://modelscope.cn/studios/FlowerCry/RVCv2demo)
+
+</div>
+
+------
 
 [**English**](./docs/en/README.en.md) | [**中文简体**](./README.md) | [**日本語**](./docs/jp/README.ja.md) | [**한국어**](./docs/kr/README.ko.md) ([**韓國語**](./docs/kr/README.ko.han.md)) | [**Français**](./docs/fr/README.fr.md)| [**Türkçe**](./docs/tr/README.tr.md)
 
-</div>
+点此查看我们的[演示视频](https://www.bilibili.com/video/BV1pm4y1z7Gm/) !
+
+训练推理界面：go-web.bat
+
+![image](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/assets/129054828/092e5c12-0d49-4168-a590-0b0ef6a4f630)
+
+实时变声界面：go-realtime-gui.bat
+
+![image](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/assets/129054828/143246a9-8b42-4dd1-a197-430ede4d15d7)
 
 > 底模使用接近50小时的开源高质量VCTK训练集训练，无版权方面的顾虑，请大家放心使用
 
 > 请期待RVCv3的底模，参数更大，数据更大，效果更好，基本持平的推理速度，需要训练数据量更少。
 
-<table>
-   <tr>
-		<td align="center">训练推理界面</td>
-		<td align="center">实时变声界面</td>
-	</tr>
-  <tr>
-		<td align="center"><img src="https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/assets/129054828/092e5c12-0d49-4168-a590-0b0ef6a4f630"></td>
-    <td align="center"><img src="https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/assets/129054828/730b4114-8805-44a1-ab1a-04668f3c30a6"></td>
-	</tr>
-	<tr>
-		<td align="center">go-web.bat</td>
-		<td align="center">go-realtime-gui.bat</td>
-	</tr>
-  <tr>
-    <td align="center">可以自由选择想要执行的操作。</td>
-		<td align="center">我们已经实现端到端170ms延迟。如使用ASIO输入输出设备，已能实现端到端90ms延迟，但非常依赖硬件驱动支持。</td>
-	</tr>
-</table>
-
 ## 简介
 本仓库具有以下特点
 + 使用top1检索替换输入源特征为训练集特征来杜绝音色泄漏
@@ -54,52 +47,47 @@
 + 使用最先进的[人声音高提取算法InterSpeech2023-RMVPE](#参考项目)根绝哑音问题。效果最好（显著地）但比crepe_full更快、资源占用更小
 + A卡I卡加速支持
 
-点此查看我们的[演示视频](https://www.bilibili.com/video/BV1pm4y1z7Gm/) !
-
 ## 环境配置
 以下指令需在 Python 版本大于3.8的环境中执行。  
 
-### Windows/Linux/MacOS等平台通用方法
-下列方法任选其一。
-#### 1. 通过 pip 安装依赖
-1. 安装Pytorch及其核心依赖，若已安装则跳过。参考自: https://pytorch.org/get-started/locally/
+(Windows/Linux)  
+首先通过 pip 安装主要依赖:
 ```bash
+# 安装Pytorch及其核心依赖，若已安装则跳过
+# 参考自: https://pytorch.org/get-started/locally/
 pip install torch torchvision torchaudio
-```
-2. 如果是 win 系统 + Nvidia Ampere 架构(RTX30xx)，根据 #21 的经验，需要指定 pytorch 对应的 cuda 版本
-```bash
-pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
-```
-3. 根据自己的显卡安装对应依赖
-- N卡
-```bash
-pip install -r requirements.txt
-```
-- A卡/I卡
-```bash
-pip install -r requirements-dml.txt
-```
-- A卡ROCM(Linux)
-```bash
-pip install -r requirements-amd.txt
-```
-- I卡IPEX(Linux)
-```bash
-pip install -r requirements-ipex.txt
+
+#如果是win系统+Nvidia Ampere架构(RTX30xx)，根据 #21 的经验，需要指定pytorch对应的cuda版本
+#pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
 ```
 
-#### 2. 通过 poetry 来安装依赖
-安装 Poetry 依赖管理工具，若已安装则跳过。参考自: https://python-poetry.org/docs/#installation
+可以使用 poetry 来安装依赖：
 ```bash
+# 安装 Poetry 依赖管理工具, 若已安装则跳过
+# 参考自: https://python-poetry.org/docs/#installation
 curl -sSL https://install.python-poetry.org | python3 -
+
+# 通过poetry安装依赖
+poetry install
 ```
-通过poetry安装依赖
+
+你也可以通过 pip 来安装依赖：
 ```bash
-poetry install
+N卡：
+  pip install -r requirements.txt
+
+A卡/I卡：
+  pip install -r requirements-dml.txt
+
+A卡Rocm（Linux）：
+  pip install -r requirements-amd.txt
+
+I卡IPEX（Linux）：
+  pip install -r requirements-ipex.txt
 ```
 
-### MacOS
-可以通过 `run.sh` 来安装依赖
+------
+Mac 用户可以通过 `run.sh` 来安装依赖：
 ```bash
 sh ./run.sh
 ```
@@ -109,48 +97,48 @@ RVC需要其他一些预模型来推理和训练。
 
 你可以从我们的[Hugging Face space](https://huggingface.co/lj1995/VoiceConversionWebUI/tree/main/)下载到这些模型。
 
-### 1. 下载 assets
-以下是一份清单，包括了所有RVC所需的预模型和其他文件的名称。你可以在`tools`文件夹找到下载它们的脚本。
+以下是一份清单，包括了所有RVC所需的预模型和其他文件的名称:
+```bash
+./assets/hubert/hubert_base.pt
 
-- ./assets/hubert/hubert_base.pt
+./assets/pretrained 
 
-- ./assets/pretrained 
+./assets/uvr5_weights
 
-- ./assets/uvr5_weights
+想测试v2版本模型的话，需要额外下载
 
-想使用v2版本模型的话，需要额外下载
+./assets/pretrained_v2
 
-- ./assets/pretrained_v2
+如果你正在使用Windows，则你可能需要这个文件，若ffmpeg和ffprobe已安装则跳过; ubuntu/debian 用户可以通过apt install ffmpeg来安装这2个库, Mac 用户则可以通过brew install ffmpeg来安装 (需要预先安装brew)
 
-### 2. 安装 ffmpeg
-若ffmpeg和ffprobe已安装则跳过。
+./ffmpeg
 
-#### Ubuntu/Debian 用户
-```bash
-sudo apt install ffmpeg
-```
-#### MacOS 用户
-```bash
-brew install ffmpeg
-```
-#### Windwos 用户
-下载后放置在根目录。
-- 下载[ffmpeg.exe](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffmpeg.exe)
+https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffmpeg.exe
 
-- 下载[ffprobe.exe](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffprobe.exe)
+./ffprobe
 
-### 3. 下载 rmvpe 人声音高提取算法所需文件
+https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffprobe.exe
 
-如果你想使用最新的RMVPE人声音高提取算法，则你需要下载音高提取模型参数并放置于RVC根目录。
+如果你想使用最新的RMVPE人声音高提取算法，则你需要下载音高提取模型参数并放置于RVC根目录
 
-- 下载[rmvpe.pt](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/rmvpe.pt)
+https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/rmvpe.pt
 
-#### 下载 rmvpe 的 dml 环境(可选, A卡/I卡用户)
+    A卡I卡用户需要的dml环境要请下载
 
-- 下载[rmvpe.onnx](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/rmvpe.onnx)
+    https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/rmvpe.onnx
 
-### 4. AMD显卡Rocm(可选, 仅Linux)
+```
+之后使用以下指令来启动WebUI:
+```bash
+python infer-web.py
+```
+如果你正在使用Windows 或 macOS，你可以直接下载并解压`RVC-beta.7z`，前者可以运行`go-web.bat`以启动WebUI，后者则运行命令`sh ./run.sh`以启动WebUI。
 
+对于需要使用IPEX技术的I卡用户，请先在终端执行`source /opt/intel/oneapi/setvars.sh`（仅Linux）。
+
+仓库内还有一份`小白简易教程.doc`以供参考。
+
+## AMD显卡Rocm相关（仅Linux）
 如果你想基于AMD的Rocm技术在Linux系统上运行RVC，请先在[这里](https://rocm.docs.amd.com/en/latest/deploy/linux/os-native/install.html)安装所需的驱动。
 
 若你使用的是Arch Linux，可以使用pacman来安装所需驱动：
@@ -167,25 +155,10 @@ export HSA_OVERRIDE_GFX_VERSION=10.3.0
 sudo usermod -aG render $USERNAME
 sudo usermod -aG video $USERNAME
 ````
-
-## 开始使用
-### 直接启动
-使用以下指令来启动 WebUI
+之后运行WebUI：
 ```bash
 python infer-web.py
 ```
-### 使用整合包
-下载并解压`RVC-beta.7z`
-#### Windows 用户
-双击`go-web.bat`
-#### MacOS 用户
-```bash
-sh ./run.sh
-```
-### 对于需要使用IPEX技术的I卡用户(仅Linux)
-```bash
-source /opt/intel/oneapi/setvars.sh
-```
 
 ## 参考项目
 + [ContentVec](https://github.com/auspicious3000/contentvec/)
 
@@ -290,7 +290,7 @@
     "\n",
     "!python3 extract_f0_print.py logs/{MODELNAME} {THREADCOUNT} {ALGO}\n",
     "\n",
-    "!python3 extract_feature_print.py cpu 1 0 0 logs/{MODELNAME}"
+    "!python3 extract_feature_print.py cpu 1 0 0 logs/{MODELNAME} True"
    ]
   },
   {
 
@@ -309,7 +309,7 @@
     "\n",
     "!python3 extract_f0_print.py logs/{MODELNAME} {THREADCOUNT} {ALGO}\n",
     "\n",
-    "!python3 extract_feature_print.py cpu 1 0 0 logs/{MODELNAME}"
+    "!python3 extract_feature_print.py cpu 1 0 0 logs/{MODELNAME} True"
    ]
   },
   {
 
@@ -0,0 +1,2 @@
+*
+!.gitignore
@@ -1 +1 @@
-{"pth_path": "assets/weights/kikiV1.pth", "index_path": "logs/kikiV1.index", "sg_input_device": "VoiceMeeter Output (VB-Audio Vo (MME)", "sg_output_device": "VoiceMeeter Input (VB-Audio Voi (MME)", "sr_type": "sr_model", "threhold": -60.0, "pitch": 12.0, "rms_mix_rate": 0.5, "index_rate": 0.0, "block_time": 0.2, "crossfade_length": 0.08, "extra_time": 2.00, "n_cpu": 4.0, "use_jit": false, "use_pv": false, "f0method": "fcpe"}
+{"pth_path": "assets/weights/kikiV1.pth", "index_path": "logs/kikiV1.index", "sg_hostapi": "MME", "sg_wasapi_exclusive": false, "sg_input_device": "VoiceMeeter Output (VB-Audio Vo", "sg_output_device": "VoiceMeeter Input (VB-Audio Voi", "sr_type": "sr_device", "threhold": -60.0, "pitch": 12.0, "rms_mix_rate": 0.5, "index_rate": 0.0, "block_time": 0.15, "crossfade_length": 0.08, "extra_time": 2.0, "n_cpu": 4.0, "use_jit": false, "use_pv": false, "f0method": "fcpe"}
@@ -2,6 +2,7 @@
 import os
 import sys
 import json
+import shutil
 from multiprocessing import cpu_count
 
 import torch
@@ -58,13 +59,17 @@ def __init__(self):
             self.dml,
         ) = self.arg_parse()
         self.instead = ""
+        self.preprocess_per = 3.7
         self.x_pad, self.x_query, self.x_center, self.x_max = self.device_config()
 
     @staticmethod
     def load_config_json() -> dict:
         d = {}
         for config_file in version_config_list:
-            with open(f"configs/{config_file}", "r") as f:
+            p = f"configs/inuse/{config_file}"
+            if not os.path.exists(p):
+                shutil.copy(f"configs/{config_file}", p)
+            with open(f"configs/inuse/{config_file}", "r") as f:
                 d[config_file] = json.load(f)
         return d
 
@@ -123,15 +128,13 @@ def has_xpu() -> bool:
     def use_fp32_config(self):
         for config_file in version_config_list:
             self.json_config[config_file]["train"]["fp16_run"] = False
-            with open(f"configs/{config_file}", "r") as f:
+            with open(f"configs/inuse/{config_file}", "r") as f:
                 strr = f.read().replace("true", "false")
-            with open(f"configs/{config_file}", "w") as f:
+            with open(f"configs/inuse/{config_file}", "w") as f:
                 f.write(strr)
-        with open("infer/modules/train/preprocess.py", "r") as f:
-            strr = f.read().replace("3.7", "3.0")
-        with open("infer/modules/train/preprocess.py", "w") as f:
-            f.write(strr)
-        print("overwrite preprocess and configs.json")
+            logger.info("overwrite " + config_file)
+        self.preprocess_per = 3.0
+        logger.info("overwrite preprocess_per to %d" % (self.preprocess_per))
 
     def device_config(self) -> tuple:
         if torch.cuda.is_available():
@@ -161,10 +164,7 @@ def device_config(self) -> tuple:
                 + 0.4
             )
             if self.gpu_mem <= 4:
-                with open("infer/modules/train/preprocess.py", "r") as f:
-                    strr = f.read().replace("3.7", "3.0")
-                with open("infer/modules/train/preprocess.py", "w") as f:
-                    f.write(strr)
+                self.preprocess_per = 3.0
         elif self.has_mps():
             logger.info("No supported Nvidia GPU found")
             self.device = self.instead = "mps"
@@ -247,5 +247,8 @@ def device_config(self) -> tuple:
                     )
                 except:
                     pass
-        print("is_half:%s, device:%s" % (self.is_half, self.device))
+        logger.info(
+            "Half-precision floating-point: %s, device: %s"
+            % (self.is_half, self.device)
+        )
         return x_pad, x_query, x_center, x_max
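
Note on the `configs/config.py` hunks above: the change stops rewriting `infer/modules/train/preprocess.py` on disk and instead keeps the value in a `preprocess_per` attribute (3.7 by default, 3.0 for the fp32 path or when GPU memory is 4 GB or less). The selection logic can be sketched as a small pure function. The function name `choose_preprocess_per` and its parameters are hypothetical, introduced only for illustration, and are not part of the repository.

```python
def choose_preprocess_per(gpu_mem_gb=None, force_fp32=False):
    """Pick the preprocess value at runtime instead of patching a source file.

    Hypothetical helper: mirrors the two conditions visible in the diff
    (fp32 fallback, or GPU memory <= 4 GB) and returns 3.0 in those cases,
    otherwise the default 3.7.
    """
    default_per, reduced_per = 3.7, 3.0
    if force_fp32 or (gpu_mem_gb is not None and gpu_mem_gb <= 4):
        return reduced_per
    return default_per


if __name__ == "__main__":
    per = choose_preprocess_per(gpu_mem_gb=4)
    # "%s" keeps the decimal part; "%d" would truncate a float such as 3.0 to "3".
    print("overwrite preprocess_per to %s" % per)
```

One detail worth checking in review: the new log call in the hunk formats `self.preprocess_per` with `%d`, which renders a float like 3.0 as `3`; `%s` would preserve the value as written.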
@@ -0,0 +1,4 @@
+*
+!.gitignore
+!v1
+!v2
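
Note on the `configs/inuse` changes: `load_config_json` now reads from `configs/inuse/` and copies the tracked template from `configs/` the first time a file is missing, so later in-place edits (such as the fp32 overwrite) touch only the untracked working copy. A minimal standalone sketch of that copy-on-first-use pattern follows. The directory parameters and the `os.makedirs` call are assumptions added for illustration; the diff itself relies on the `v1` and `v2` entries that the new `.gitignore` leaves un-ignored rather than creating the directories explicitly.

```python
import json
import os
import shutil


def load_config_json(config_files, template_dir="configs", inuse_dir="configs/inuse"):
    """Load JSON configs from a working copy, seeding it from templates if absent.

    config_files: relative paths such as "v1/example.json" (hypothetical name).
    """
    loaded = {}
    for name in config_files:
        template_path = os.path.join(template_dir, name)
        working_path = os.path.join(inuse_dir, name)
        if not os.path.exists(working_path):
            # Assumption: create the target directory if it does not exist yet.
            os.makedirs(os.path.dirname(working_path), exist_ok=True)
            shutil.copy(template_path, working_path)
        with open(working_path, "r") as f:
            loaded[name] = json.load(f)
    return loaded
```

With this layout the tracked templates stay clean in `git status`, and deleting a file under `configs/inuse/` resets it to the template on the next load.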