h0103661 2022-08-10 15:27
浏览 326
已结题

使用VITS训练时torch回报AssertionError

问题遇到的现象和发生背景

我在使用VITS训练时,torch pad回报了以下错误

AssertionError: 4D tensors expect 4 values for padding

https://github.com/jaywalnut310/vits

我使用的VITS-japanese
(训练部份基本没修改)

torch版本为1.6.0,是从requirements.txt下的

问题相关代码,请勿粘贴截图

执行代码:

test.json是修改了listfile路径的VITS-japanese/config/nan.json
python train.py -c config/test.json -m test
运行结果及报错内容
Process SpawnProcess-1:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/torch/multiprocessing/spawn.py", line 20, in _wrap
    fn(i, *args)
  File "/content/vits-japanese/train.py", line 117, in run
    train_and_evaluate(rank, epoch, hps, [net_g, net_d], [optim_g, optim_d], [scheduler_g, scheduler_d], scaler, [train_loader, eval_loader], logger, [writer, writer_eval])
  File "/content/vits-japanese/train.py", line 137, in train_and_evaluate
    for batch_idx, (x, x_lengths, spec, spec_lengths, y, y_lengths) in enumerate(train_loader):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 363, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 989, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1014, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.7/dist-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 185, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/vits-japanese/data_utils.py", line 97, in __getitem__
    return self.get_audio_text_pair(self.audiopaths_and_text[index])
  File "/content/vits-japanese/data_utils.py", line 62, in get_audio_text_pair
    spec, wav = self.get_audio(audiopath)
  File "/content/vits-japanese/data_utils.py", line 81, in get_audio
    center=False)
  File "/content/vits-japanese/mel_processing.py", line 71, in spectrogram_torch
    y = torch.nn.functional.pad(y.unsqueeze(1), (int((n_fft-hop_size)/2), int((n_fft-hop_size)/2)), mode='reflect')
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py", line 3567, in _pad
    assert len(pad) == 4, '4D tensors expect 4 values for padding'
AssertionError: 4D tensors expect 4 values for padding
我的解答思路和尝试过的方法

我追踪了tensor大小
data_utils.py.get_audio()中的原始audio:

torch.Size([69506, 2])

data_utils.py.get_audio()中经过unsqueeze(0)修改的audio_norm:

torch.Size([1, 69506, 2])

mel_processing.py.spectrogram_torch()中放入pad()前的y.unsqueeze(1):

torch.Size([1, 1, 69506, 2])

padding大小和预设config相同((1024-256)/2):

(384,384)
我想要达到的结果

原始代码我没修改过应该都没问题

是我输入的wav音频有问题?

  • 写回答

1条回答 默认 最新

  • h0103661 2022-08-10 18:43
    关注

    我尝试用 (0,0,384,384) 填充padding size,但在下一个 stft() 中出现“expected a 1D or 2D tensor of floating types”错误,似乎是输入问题而不是VITS code有错误。

    问题出在音频档,我将所有wav重新取样之后就正常了

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论 编辑记录

报告相同问题?

问题事件

  • 系统已结题 8月18日
  • 已采纳回答 8月10日
  • 创建了问题 8月10日

悬赏问题

  • ¥15 metadata提取的PDF元数据,如何转换为一个Excel
  • ¥15 关于arduino编程toCharArray()函数的使用
  • ¥100 vc++混合CEF采用CLR方式编译报错
  • ¥15 coze 的插件输入飞书多维表格 app_token 后一直显示错误,如何解决?
  • ¥15 vite+vue3+plyr播放本地public文件夹下视频无法加载
  • ¥15 c#逐行读取txt文本,但是每一行里面数据之间空格数量不同
  • ¥50 如何openEuler 22.03上安装配置drbd
  • ¥20 ING91680C BLE5.3 芯片怎么实现串口收发数据
  • ¥15 无线连接树莓派,无法执行update,如何解决?(相关搜索:软件下载)
  • ¥15 Windows11, backspace, enter, space键失灵