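Speech Recognition

  • Synthesizing Speech from Text
  • Voice Cloning Project
  • Speech to Text
  • Bug Fixes
  • Text to Speech
  • Text to Speech Without Saving a File
  • Repeating What You Just Said
  • Chat Box
  • Environment Information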


A tutorial video is available on Bilibili. Online audio format conversion:
https://www.aconvert.com/cn/audio/m4a-to-mp3/ or use Format Factory.

Synthesizing Speech from Text

from gtts import gTTS
import os
tts = gTTS(text='hxd!今天你比昨天更博学了吗', lang='zh-TW')
tts.save("lsy.mp3")
os.system("lsy.mp3")
# Requires a VPN. On success os.system returns 0; then open lsy.mp3 in the current directory.
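
The os.system("lsy.mp3") call above relies on Windows opening a file by its association. As a rough sketch for other platforms, the helper below picks an opener based on the platform string. The names open_command and open_with_default_app are made up for illustration, and using xdg-open assumes a Linux desktop with xdg-utils installed.

```python
import os
import subprocess
import sys


def open_command(path, platform=sys.platform):
    """Return the command that opens `path` with the default app, or None on Windows."""
    if platform.startswith("win"):
        return None  # Windows uses os.startfile instead of an external command
    if platform == "darwin":
        return ["open", path]
    return ["xdg-open", path]  # assumption: a Linux desktop with xdg-utils


def open_with_default_app(path):
    """Open `path` in the system's default player (hypothetical helper)."""
    command = open_command(path)
    if command is None:
        os.startfile(path)  # only available on Windows
    else:
        subprocess.run(command, check=False)
```

Calling open_with_default_app("lsy.mp3") after tts.save("lsy.mp3") would then replace the os.system line.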

Voice Cloning Project

GitHub link: https://hub.fastgit.org/CorentinJ/Real-Time-Voice-Cloning

For the Chinese voice-cloning implementation, see the author's Zhihu, Bilibili, and GitHub pages.

Install tensorflow-gpu

pip install -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com tensorflow-gpu==2.2.0

Install flask and flask_restx

pip install -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com flask
pip install -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com flask_restx

ModuleNotFoundError: No module named 'unidecode'
Install whichever module the message says is missing.

Start:
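
Rather than finding missing modules one error at a time, you can check them all up front. The sketch below uses the standard library's importlib.util.find_spec. The list of names is only an example taken from the packages mentioned in this post, and note that these are import names, which can differ from pip names (for example, tensorflow-gpu is imported as tensorflow).

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names (not pip names) mentioned in this post; adjust as needed.
required = ["tensorflow", "flask", "flask_restx", "unidecode"]
for name in missing_modules(required):
    print(f"missing: {name}")
```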

python web.py

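Recordings larger than 4 MB are not supported; the best length is 5 to 15 seconds.
By default the first model found is used; if you are comfortable with code, you can change this in web_init_.py.
For audio editing on Windows, Audacity is the first choice.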
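
To check a recording against these limits before uploading, something like the following could work. It is only a sketch: it assumes the recording is an uncompressed PCM WAV file (the only kind the standard wave module reads), it treats 4 MB as 4 * 1024 * 1024 bytes, and check_recording is a made-up helper name.

```python
import os
import wave

MAX_BYTES = 4 * 1024 * 1024   # the 4 MB limit mentioned above (assumed binary MB)
BEST_SECONDS = (5.0, 15.0)    # the recommended 5 to 15 second range


def check_recording(path):
    """Return (size_ok, duration_ok, seconds) for a PCM WAV file."""
    size_ok = os.path.getsize(path) <= MAX_BYTES
    with wave.open(path, "rb") as wav:
        # Duration is the number of frames divided by frames per second
        seconds = wav.getnframes() / wav.getframerate()
    duration_ok = BEST_SECONDS[0] <= seconds <= BEST_SECONDS[1]
    return size_ok, duration_ok, seconds
```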

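AttributeError: module 'umap' has no attribute 'UMAP'
See the referenced fix for this error.

ERROR: Cannot uninstall 'llvmlite'
See the referenced fix for this one too, and watch for error messages while running it. Force the installation with the ignore-installed flag: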

pip install --ignore-installed llvmlite

Then run the earlier install command again.

To launch the GUI, use:

python demo_toolbox.py -d dataset

dataset is a folder you create yourself; it holds the audio files to clone, in WAV format.



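Cloning works fairly well with samples from the dataset; my own recordings (which tend to pick up some noise) give mediocre results. Denoising the recordings first should improve the results somewhat.

Speech to Text

Check the library version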

import speech_recognition
print(speech_recognition.__version__)

Error: ModuleNotFoundError: No module named 'speech_recognition'
Fix:

pip install SpeechRecognition

Success: after installation the import works.

Bug Fixes

AttributeError: Could not find PyAudio; check installation

Following a Stack Overflow answer, the fix is:

pip install pipwin
pipwin install PyAudio

Record audio and convert it to text (requires a VPN):

import speech_recognition
r = speech_recognition.Recognizer()
with speech_recognition.Microphone() as source:
    r.adjust_for_ambient_noise(source)  # calibrate for ambient noise
    audio = r.listen(source)
print(r.recognize_google(audio, language='zh-TW'))

For English, use language='en'.

Result: the call returns the recognized text as a string.

Text to Speech

Convert text to an audio file:

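import os
from gtts import gTTS
os.makedirs('myaudio', exist_ok=True)  # the target folder must exist before saving
tts = gTTS(text='When you stand on the edge,so young and hopeless.', lang='en')
tts.save('myaudio/stand.mp3')  # forward slash avoids the invalid '\s' escape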

Play the audio file

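import time
from pygame import mixer
mixer.init()
mixer.music.load('myaudio/stand.mp3')
mixer.music.play()
while mixer.music.get_busy():  # play() returns immediately; wait until playback ends
    time.sleep(0.1)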

Text to Speech Without Saving a File

import tempfile
from gtts import gTTS
from pygame import mixer
mixer.init()

def speak(sentence):
    # Borrow a unique temp-file name; the audio itself goes to "<name>.mp3"
    with tempfile.NamedTemporaryFile(delete=True) as fp:
        tts = gTTS(text=sentence, lang='zh')
        tts.save("{}.mp3".format(fp.name))
        mixer.music.load("{}.mp3".format(fp.name))
        mixer.music.play()  # returns immediately; playback continues in the background

speak('与世界交手的这些年,你是否兴致盎然,光彩依旧?')
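
One thing to be aware of: NamedTemporaryFile only deletes the file it created, while the audio is written to a second file with ".mp3" appended, which stays in the temp directory. If that matters, a variation is to write into a temporary directory that is removed as a whole. This is a sketch, not the original method: temp_mp3_path and speak_and_clean_up are made-up names, and mixer.music.unload() assumes pygame 2.0 or later.

```python
import os
import tempfile
import time
from contextlib import contextmanager


@contextmanager
def temp_mp3_path(stem="speech"):
    """Yield a path inside a temporary directory that is removed on exit."""
    with tempfile.TemporaryDirectory() as tmpdir:
        yield os.path.join(tmpdir, stem + ".mp3")


def speak_and_clean_up(sentence, lang="zh"):
    """Speak `sentence`, then delete the audio file (hypothetical helper)."""
    # Imported lazily so temp_mp3_path can be used without gTTS/pygame installed.
    from gtts import gTTS
    from pygame import mixer

    mixer.init()
    with temp_mp3_path() as path:
        gTTS(text=sentence, lang=lang).save(path)
        mixer.music.load(path)
        mixer.music.play()
        while mixer.music.get_busy():  # wait for playback to finish
            time.sleep(0.1)
        mixer.music.unload()  # release the file before its directory is deleted
```

Unlike the version above, this blocks until playback ends, which is what allows the file to be removed safely.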

Repeating What You Just Said

That is: record audio, convert speech to text, then convert text back to speech.
This requires a VPN.

import speech_recognition
import tempfile
from gtts import gTTS
from pygame import mixer

mixer.init()

def listenTo():
    # Record one utterance from the microphone and return it as text
    r = speech_recognition.Recognizer()
    with speech_recognition.Microphone() as source:
        r.adjust_for_ambient_noise(source)  # calibrate for ambient noise
        audio = r.listen(source)
    return r.recognize_google(audio, language='zh')

def speak(sentence):
    # Synthesize the sentence to "<temp name>.mp3" and start playing it
    with tempfile.NamedTemporaryFile(delete=True) as fp:
        tts = gTTS(text=sentence, lang='zh')
        tts.save("{}.mp3".format(fp.name))
        mixer.music.load("{}.mp3".format(fp.name))
        mixer.music.play()

speak(listenTo())

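Chat Box

# Canned replies keyed by the exact sentence the recognizer returns
input_output = {
    '今天心情很一般':'人生如逆旅,我亦是行人。加油!',
    '什么是勇气':'压力下保持优雅',
    '最美城市是哪个':'重庆'
}
# Reuses listenTo() and speak() from the previous section; unknown input gets the fallback reply
speak(input_output.get(listenTo(),'你在说啥呀,麻烦再说一遍'))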
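
The lookup above only matches when the recognized sentence equals a key exactly. If the recognizer happens to return extra spaces or punctuation (an assumption, I have not checked how often it does), stripping those before the lookup makes matching a little more forgiving. The sketch below uses only the standard library; normalize and reply are made-up names.

```python
import string

# Characters dropped before matching: ASCII punctuation, whitespace,
# and a few common full-width Chinese punctuation marks.
_DROP = string.punctuation + string.whitespace + ",。!?、;:"
_TABLE = str.maketrans("", "", _DROP)


def normalize(text):
    """Remove punctuation and whitespace so near-identical sentences compare equal."""
    return text.translate(_TABLE)


def reply(heard, table, default='你在说啥呀,麻烦再说一遍'):
    """Look up `heard` in `table` after normalizing both sides."""
    normalized = {normalize(key): value for key, value in table.items()}
    return normalized.get(normalize(heard), default)


input_output = {
    '今天心情很一般': '人生如逆旅,我亦是行人。加油!',
    '什么是勇气': '压力下保持优雅',
    '最美城市是哪个': '重庆',
}
print(reply('什么是勇气?', input_output))
```

With the functions from the previous section, the last line of the chat box would become speak(reply(listenTo(), input_output)).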

Environment Information

Generate the current environment information:

pip freeze > requirements.txt

File contents:

absl-py==0.15.0
alabaster==0.7.12
anaconda-client==1.7.2
anaconda-navigator==1.9.12
anaconda-project==0.8.3
aniso8601==9.0.1
appdirs==1.4.4
argh==0.26.2
asn1crypto==1.3.0
astroid @ file:///C:/ci/astroid_1592487315634/work
astropy==4.0.1.post1
astunparse==1.6.3
atomicwrites==1.4.0
attrs==19.3.0
audioread==2.1.9
autopep8 @ file:///tmp/build/80754af9/autopep8_1592412889138/work
Babel==2.8.0
backcall==0.2.0
backports.functools-lru-cache==1.6.1
backports.shutil-get-terminal-size==1.0.0
backports.tempfile==1.0
backports.weakref==1.0.post1
backports.zoneinfo==0.2.1
bcrypt==3.1.7
beautifulsoup4==4.9.1
bitarray @ file:///C:/ci/bitarray_1594751093906/work
bkcharts==0.2
bleach==3.1.5
bokeh @ file:///C:/ci/bokeh_1593178781838/work
boto==2.49.0
Bottleneck==1.3.2
brotlipy==0.7.0
cachetools==4.2.4
certifi==2020.6.20
cffi==1.14.0
chardet==3.0.4
click==7.1.2
cloudpickle @ file:///tmp/build/80754af9/cloudpickle_1594141588948/work
clyent==1.2.2
colorama==0.4.3
comtypes==1.1.7
conda==4.8.3
conda-build==3.18.11
conda-package-handling==1.7.0
conda-verify==3.4.2
contextlib2==0.6.0.post1
cryptography==2.9.2
cycler==0.10.0
Cython @ file:///C:/ci/cython_1594829190914/work
cytoolz==0.10.1
dask @ file:///tmp/build/80754af9/dask-core_1594156306305/work
decorator==4.4.2
defusedxml==0.6.0
diff-match-patch @ file:///tmp/build/80754af9/diff-match-patch_1594828741838/work
distributed @ file:///C:/ci/distributed_1594742844291/work
docopt==0.6.2
docutils==0.16
entrypoints==0.3
et-xmlfile==1.0.1
fastcache==1.1.0
filelock==3.0.12
flake8==3.8.3
Flask==2.0.2
Flask-Cors==3.0.10
flask-restx==0.5.1
Flask-WTF==0.15.1
fsspec==0.7.4
future==0.18.2
gast==0.3.3
gevent @ file:///C:/ci/gevent_1593005471151/work
glob2==0.7
gmpy2==2.0.8
google-auth==1.35.0
google-auth-oauthlib==0.4.6
google-pasta==0.2.0
greenlet==0.4.16
grpcio==1.41.1
gTTS==2.2.3
h5py==2.10.0
HeapDict==1.0.1
html5lib @ file:///tmp/build/80754af9/html5lib_1593446221756/work
idna @ file:///tmp/build/80754af9/idna_1593446292537/work
imageio @ file:///tmp/build/80754af9/imageio_1594161405741/work
imagesize==1.2.0
importlib-metadata @ file:///C:/ci/importlib-metadata_1593446511143/work
inflect==5.3.0
intervaltree @ file:///tmp/build/80754af9/intervaltree_1594361675072/work
ipykernel @ file:///C:/ci/ipykernel_1594745408489/work/dist/ipykernel-5.3.2-py3-none-any.whl
ipython @ file:///C:/ci/ipython_1593447482397/work
ipython_genutils==0.2.0
ipywidgets==7.5.1
isort==4.3.21
itsdangerous==2.0.1
jdcal==1.4.1
jedi @ file:///C:/ci/jedi_1592833825077/work
Jinja2==3.0.2
joblib @ file:///tmp/build/80754af9/joblib_1594236160679/work
Js2Py==0.71
json5==0.9.5
jsonschema==3.2.0
jupyter==1.0.0
jupyter-client @ file:///tmp/build/80754af9/jupyter_client_1594826976318/work
jupyter-console==6.1.0
jupyter-core==4.6.3
jupyterlab==2.1.5
jupyterlab-server @ file:///tmp/build/80754af9/jupyterlab_server_1594164409481/work
Keras-Preprocessing==1.1.2
keyring @ file:///C:/ci/keyring_1593109799227/work
kiwisolver==1.2.0
lazy-object-proxy==1.4.3
libarchive-c==2.9
librosa==0.8.1
llvmlite==0.37.0
locket==0.2.0
lxml @ file:///C:/ci/lxml_1594822774489/work
Markdown==3.3.4
MarkupSafe==2.0.1
matplotlib @ file:///C:/ci/matplotlib-base_1592837548929/work
mccabe==0.6.1
menuinst==1.4.16
mistune==0.8.4
mkl-fft==1.1.0
mkl-random==1.1.1
mkl-service==2.3.0
mock==4.0.2
more-itertools==8.4.0
mpmath==1.1.0
msgpack==1.0.0
multipledispatch==0.6.0
navigator-updater==0.2.1
nbconvert==5.6.1
nbformat==5.0.7
networkx @ file:///tmp/build/80754af9/networkx_1594377231366/work
nltk @ file:///tmp/build/80754af9/nltk_1592496090529/work
nose==1.3.7
notebook==6.0.3
numba==0.54.1
numexpr==2.7.1
numpy==1.19.3
numpydoc @ file:///tmp/build/80754af9/numpydoc_1594166760263/work
oauthlib==3.1.1
olefile==0.46
openpyxl @ file:///tmp/build/80754af9/openpyxl_1594167385094/work
opt-einsum==3.3.0
packaging==20.4
pandas @ file:///C:/ci/pandas_1592833613419/work
pandocfilters==1.4.2
paramiko==2.7.1
parso==0.7.0
partd==1.1.0
path==13.1.0
pathlib2==2.3.5
pathtools==0.1.2
patsy==0.5.1
pep8==1.7.1
pexpect==4.8.0
pickleshare==0.7.5
Pillow @ file:///C:/ci/pillow_1594298230227/work
pipwin==0.5.1
pkginfo==1.5.0.1
playsound==1.3.0
pluggy==0.13.1
ply==3.11
pooch==1.5.2
prometheus-client==0.8.0
prompt-toolkit==3.0.5
protobuf==3.19.1
psutil==5.7.0
py @ file:///tmp/build/80754af9/py_1593446248552/work
pyasn1==0.4.8
pyasn1-modules==0.2.8
PyAudio @ file:///C:/Users/Wu/pipwin/PyAudio-0.2.11-cp38-cp38-win_amd64.whl
pycodestyle==2.6.0
pycosat==0.6.3
pycparser @ file:///tmp/build/80754af9/pycparser_1594388511720/work
pycurl==7.43.0.5
pydocstyle @ file:///tmp/build/80754af9/pydocstyle_1592848020240/work
pyflakes==2.2.0
pygame==2.0.3
Pygments==2.6.1
pyjsparser==2.7.1
pylint @ file:///C:/ci/pylint_1592482039483/work
PyNaCl @ file:///C:/ci/pynacl_1595000047588/work
pynndescent==0.5.5
pyodbc===4.0.0-unsupported
pyOpenSSL @ file:///tmp/build/80754af9/pyopenssl_1594392929924/work
pyparsing==2.4.7
pypinyin==0.43.0
PyPrind==2.11.3
pyreadline==2.1
pyrsistent==0.16.0
pySmartDL==1.3.4
PySocks==1.7.1
pytest==5.4.3
python-dateutil==2.8.1
python-jsonrpc-server @ file:///tmp/build/80754af9/python-jsonrpc-server_1594397536060/work
python-language-server @ file:///C:/ci/python-language-server_1594162130238/work
pytz==2020.1
pytz-deprecation-shim==0.1.0.post0
PyWavelets==1.1.1
pywin32==227
pywin32-ctypes==0.2.0
pywinpty==0.5.7
PyYAML==5.3.1
pyzmq==19.0.1
QDarkStyle==2.8.1
QtAwesome==0.7.2
qtconsole @ file:///tmp/build/80754af9/qtconsole_1592848611704/work
QtPy==1.9.0
regex @ file:///C:/ci/regex_1593419644658/work
requests @ file:///tmp/build/80754af9/requests_1592841827918/work
requests-oauthlib==1.3.0
resampy==0.2.2
rope==0.17.0
rsa==4.7.2
Rtree==0.9.4
ruamel_yaml==0.15.87
scikit-image==0.16.2
scikit-learn @ file:///C:/ci/scikit-learn_1592853510272/work
scipy==1.4.1
seaborn==0.10.1
Send2Trash==1.5.0
simplegeneric==0.8.1
singledispatch==3.4.0.3
sip==4.19.13
six==1.15.0
snowballstemmer==2.0.0
sortedcollections==1.2.1
sortedcontainers==2.2.2
sounddevice==0.4.3
SoundFile==0.10.3.post1
soupsieve==2.0.1
speech-recognition==0.1
SpeechRecognition==3.8.1
Sphinx @ file:///tmp/build/80754af9/sphinx_1594223420021/work
sphinxcontrib-applehelp==1.0.2
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-htmlhelp==1.0.3
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-serializinghtml==1.1.4
sphinxcontrib-websupport @ file:///tmp/build/80754af9/sphinxcontrib-websupport_1593446360927/work
spyder @ file:///C:/ci/spyder_1594830825244/work
spyder-kernels @ file:///C:/ci/spyder-kernels_1594751670175/work
SQLAlchemy @ file:///C:/ci/sqlalchemy_1593445271541/work
statsmodels==0.11.1
sympy @ file:///C:/ci/sympy_1594234545115/work
tables==3.6.1
tblib==1.6.0
tensorboard==2.2.2
tensorboard-plugin-wit==1.8.0
tensorflow-gpu==2.2.0
tensorflow-gpu-estimator==2.2.0
termcolor==1.1.0
terminado==0.8.3
testpath==0.4.4
threadpoolctl @ file:///tmp/tmp9twdgx9k/threadpoolctl-2.1.0-py3-none-any.whl
toml @ file:///tmp/build/80754af9/toml_1592853716807/work
toolz==0.10.0
torch @ file:///D:/Anaconda-Jupyter/install/torch-1.8.0%2Bcu111-cp38-cp38-win_amd64.whl
torchaudio @ file:///D:/Anaconda-Jupyter/install/torchaudio-0.8.0-cp38-none-win_amd64.whl
torchvision @ file:///D:/Anaconda-Jupyter/install/torchvision-0.9.0%2Bcu111-cp38-cp38-win_amd64.whl
tornado==6.0.4
tqdm @ file:///tmp/build/80754af9/tqdm_1593446365756/work
traitlets==4.3.3
typing-extensions @ file:///tmp/build/80754af9/typing_extensions_1592847887441/work
tzdata==2021.5
tzlocal==4.1
ujson==1.35
umap-learn==0.5.2
unicodecsv==0.14.1
Unidecode==1.3.2
urllib3==1.25.9
watchdog @ file:///C:/ci/watchdog_1593447437088/work
wcwidth @ file:///tmp/build/80754af9/wcwidth_1593447189090/work
webencodings==0.5.1
webrtcvad-wheels==2.0.10.post2
Werkzeug==2.0.2
widgetsnbextension==3.5.1
win-inet-pton==1.1.0
win-unicode-console==0.5
wincertstore==0.2
wrapt==1.11.2
WTForms==2.3.3
xlrd==1.2.0
XlsxWriter==1.2.9
xlwings==0.19.5
xlwt==1.3.0
xmltodict==0.12.0
yapf @ file:///tmp/build/80754af9/yapf_1593528177422/work
zict==2.0.0
zipp==3.1.0
zope.event==4.4
zope.interface==4.7.1
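
Many entries in the list above use the "name @ file:///..." form, which points at build paths on the machine where the package was installed, so those lines will probably not install elsewhere as written. As a small sketch, the function below separates such lines from the version-pinned ones. split_requirements is a made-up name, and the sample lines are copied from the list above.

```python
def split_requirements(lines):
    """Split pip-freeze lines into (pinned, local) lists."""
    pinned, local = [], []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if " @ " in line:
            local.append(line)   # direct reference such as a file:/// path
        else:
            pinned.append(line)  # ordinary pinned requirement
    return pinned, local


sample = [
    "gTTS==2.2.3",
    "astroid @ file:///C:/ci/astroid_1592487315634/work",
    "SpeechRecognition==3.8.1",
    "pyodbc===4.0.0-unsupported",
]
pinned, local = split_requirements(sample)
print(pinned)
print(local)
```

To run it on the real file, pass open("requirements.txt") as the lines argument.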