doupo2157 2018-11-23 19:08

I have wav files from 0 to 99; what is the best logic for concatenating them so they sound good? [closed]

For example, given the number 1736 and 100 .wav files (0.wav, 1.wav, etc.), how should I concatenate the audio so that it sounds more "fluid"? Most of the time there is a gap between the numbers and the result sounds very "hard". I want it to sound as if a real person were saying it, or as close to that as possible (excluding the sound quality).

This can be in any language, PHP, Python, etc. I just need the logic/algorithm.

Not sure if it's a vague question; feel free to tell me so I can remove it if that's the case.

Thanks.

1 answer

  • dpj775835868 2018-11-23 19:23

    The issue you're likely having is intonation.

    When speaking, the rising and falling tones help indicate phrasing. If I say, "one, seven, three, six", and end with a falling tone (pitch going down), it sounds final and the listener knows they've heard all the digits. If I end with a rising tone (pitch going up), it sounds like I'm asking a question, which is weird to the listener since the numbers aren't a question.

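    To make this sound more natural, at a minimum you'll need to record each number with different intonations and, when concatenating, pick the variant that fits its position in the sequence (for example, a falling tone only on the last number).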
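    The selection step can be sketched roughly as follows. This is only an illustration under stated assumptions, not a definitive implementation: the `_mid` / `_final` file naming is hypothetical (the question only mentions files like 0.wav and 1.wav), the number is assumed to be a non-negative integer read in two-digit chunks (1736 becomes 17, 36), and all clips are assumed to share the same sample rate, sample width and channel count.

```python
import wave


def split_number(number):
    """Split e.g. 1736 into [17, 36] so each chunk maps to a 0-99 recording."""
    digits = str(number)
    # Assumption: with an odd digit count the first chunk is one digit (736 -> 7, 36).
    head = len(digits) % 2
    chunks = [digits[:head]] if head else []
    chunks += [digits[i:i + 2] for i in range(head, len(digits), 2)]
    return [int(chunk) for chunk in chunks]


def pick_files(number):
    """Choose the falling-tone take for the last chunk, the neutral take otherwise."""
    chunks = split_number(number)
    last = len(chunks) - 1
    # Hypothetical naming scheme: "17_mid.wav" and "17_final.wav" per number.
    return [
        f"{chunk}_{'final' if index == last else 'mid'}.wav"
        for index, chunk in enumerate(chunks)
    ]


def concatenate(paths, out_path):
    """Append the audio frames of every clip into one output file."""
    with wave.open(out_path, "wb") as out:
        for index, path in enumerate(paths):
            with wave.open(path, "rb") as clip:
                if index == 0:
                    # Copy the format (channels, sample width, rate) from the first clip.
                    out.setparams(clip.getparams())
                out.writeframes(clip.readframes(clip.getnframes()))
```

    With recordings named that way, `concatenate(pick_files(1736), "1736.wav")` would join 17_mid.wav and 36_final.wav. Note that a chunk such as "05" is mapped to the recording for 5, which may not be how you want it read.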

    There's a second problem, though, with phrasing. Natural speech sounds best when the speaker keeps air moving continuously and uses articulation to enunciate the words. If you were to record a radio announcer and play the recording back with all of the higher frequencies filtered out, so that you couldn't hear the articulation, you would hear something close to a continuous tone that changes a bit in pitch. This isn't something you'll get by concatenating audio files together. The best you can do is have a proper speech engine speak the number.
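    If you go the speech engine route, the only logic left on your side is turning the number into text. A minimal sketch, assuming you want the digit-by-digit reading from the example above ("one, seven, three, six") and that the engine you choose uses punctuation as a phrasing hint, which varies between engines. The function name is made up for illustration, and the call into the engine itself is left out because it depends on the engine.

```python
DIGIT_WORDS = [
    "zero", "one", "two", "three", "four",
    "five", "six", "seven", "eight", "nine",
]


def digits_to_phrase(number):
    """Spell a non-negative integer digit by digit as one punctuated phrase."""
    words = [DIGIT_WORDS[int(digit)] for digit in str(number)]
    # Commas separate the digits; the final period marks the end of the phrase.
    return ", ".join(words) + "."
```

    For 1736 this produces the string "one, seven, three, six.", which you would then hand to the speech engine instead of stitching recordings together.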

    Selected as the best answer by the asker.
