[AI] How to Build a Budget Knockoff Jarvis? - Spoken Human Robot Interaction

Published: 2024/8/1 | ChatGpt | 豆豆

Spoken Human Robot Interaction

  • 1. Interaction Overview
  • 2. Demo
    • 1. Scenario
    • 2. Dialogue Results
      • Dialogue 1
      • Dialogue 2
      • Dialogue 3
      • Dialogue 4
  • 3. Implementation

Follow! Like! Comment! Bookmark! Thanks!


How do you build a budget knockoff Jarvis?


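1. Interaction Overview

  • Step 1: Feed speech into the computer through an audio input device. Here I use sr.Microphone() from the speech_recognition library (imported as sr) to open the computer's microphone, then sr.Recognizer().listen() to capture the speech coming from it.

  • Step 2: Use a speech recognition library to turn the captured speech into text. sr.Recognizer().recognize_google() obtains the text through Google's speech recognition service.

  • Step 3: Parse the text with spaCy's en_core_web_sm model and draw the dependency tree.

  • Step 4: Based on each word's part of speech, or on specific keywords, have the computer answer with a preset reply (the eSpeak tool can read the text aloud). Details in the dialogue such as the time or a quantity can be swapped freely, and the computer will still pick them up.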
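
The four steps above can be sketched as one conversational turn. This is only an illustrative skeleton, not the author's code: `run_turn` and its parameter names are hypothetical, and the lambdas are stand-ins for the microphone, the Google recognizer, the spaCy parse, and eSpeak, so the flow can be followed without any hardware or network.

```python
from typing import Callable, List


def run_turn(
    prompt: str,
    listen: Callable[[], bytes],
    transcribe: Callable[[bytes], str],
    parse: Callable[[str], List[str]],
    respond: Callable[[List[str]], str],
    speak: Callable[[str], None],
) -> str:
    """Run one question-and-answer turn and return Jarvis's reply."""
    speak(prompt)              # Jarvis asks the question
    audio = listen()           # step 1: capture speech
    text = transcribe(audio)   # step 2: speech to text
    tokens = parse(text)       # step 3: analyse the text
    reply = respond(tokens)    # step 4: choose a preset reply
    speak(reply)
    return reply


# Stand-ins (assumptions) so the turn runs without a microphone or eSpeak.
spoken: List[str] = []
reply = run_turn(
    prompt="Sir, How many friends will come?",
    listen=lambda: b"fake audio",
    transcribe=lambda audio: "maybe six people",
    parse=lambda text: text.split(),
    respond=lambda tokens: "Noted, sir: " + " ".join(tokens),
    speak=spoken.append,
)
print(spoken)
```

Passing each stage in as a function keeps the dialogue logic separate from the audio devices, which makes it possible to try out the reply rules with typed text before plugging in the real libraries.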

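2. Demo

1. Scenario

The scenario I set up: my computer is a budget Jarvis (truly budget) that manages my house. Some friends are coming over, and I ask Jarvis to make a few preparations in advance.

2. Dialogue Results

For convenience, everything here uses English models and voices; readers are free to swap in another language pack, such as Chinese.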

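Dialogue 1

Jarvis: What can I do for you sir?
Me: today my friends will come to my house


The next exchange works for any number of guests: Jarvis picks out the number word in the utterance and repeats it in later replies.

Jarvis: Sir, How many friends will come?
Me: maybe six people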
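
To show how a number can be pulled out of an answer like the one above, here is a minimal pure-Python sketch. It is a stand-in for the part-of-speech check described in step 4, not the spaCy-based approach itself: `NUMBER_WORDS` and `find_number` are hypothetical names, and the small word list is an assumption that only covers one to ten plus plain digits.

```python
# Hypothetical lookup table; a real part-of-speech tagger covers far more.
NUMBER_WORDS = {
    "one", "two", "three", "four", "five",
    "six", "seven", "eight", "nine", "ten",
}


def find_number(text, default=None):
    """Return the last number-like word in `text`, or `default` if none."""
    found = default
    for word in text.lower().split():
        word = word.strip(".,!?")          # drop trailing punctuation
        if word in NUMBER_WORDS or word.isdigit():
            found = word                   # later numbers override earlier ones
    return found


print(find_number("maybe six people"))
```

Returning a default instead of raising keeps the later replies from failing when the recognizer hears no number at all.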

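Dialogue 2

Here "isyncr" is actually "I think"; Google's speech recognition is not especially accurate (surely it can't be that my pronunciation is off!).

You can answer with any one of the three drinks, and Jarvis repeats your choice back once it has recognized it.

Jarvis: and what drinks do I need to prepare?cola, tea or coffee?
Me: isyncr coffee is the best
Jarvis: got it sir, I will prepare six cups of coffee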
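
Because the recognizer can garble words, the drink lookup benefits from being tolerant. The sketch below is one possible approach and an assumption on my part, not what the article's code does: it matches whole words and uses the standard library's `difflib.get_close_matches` to accept near misses. `DRINKS` and `pick_drink` are hypothetical names, and the 0.8 cutoff is an arbitrary choice.

```python
import difflib

DRINKS = ("cola", "tea", "coffee")


def pick_drink(text):
    """Return the first drink mentioned in `text`, or None if there is none."""
    for word in text.lower().split():
        word = word.strip(".,!?")
        # Accept exact words and close misspellings such as "coffe".
        match = difflib.get_close_matches(word, DRINKS, n=1, cutoff=0.8)
        if match:
            return match[0]
    return None


print(pick_drink("isyncr coffee is the best"))
```

Matching whole words also avoids a trap of plain substring tests: the letters "tea" occur inside "instead", so `'tea' in text` would report tea for a sentence that never mentions it.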

Dialogue 3

Jarvis: By the way, sir, when they will come?
Me: if I remember correctly live well, either 5 p.m.
Jarvis: Sir, please confirm, your six friends will come at 5. pm and I will prepare six cups of coffee for them
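
The arrival time can also be picked out with a regular expression instead of part-of-speech tags. This is a hedged alternative sketch, not the article's method: `TIME_RE` and `parse_time` are hypothetical names, and defaulting to "pm" when no period is heard is an assumption that follows the dialogue above.

```python
import re

# An hour with optional minutes, optionally followed by a.m./p.m. (or am/pm).
TIME_RE = re.compile(r"(\d{1,2}(?::\d{2})?)\s*(a\.?m|p\.?m)?", re.IGNORECASE)


def parse_time(text, default_period="pm"):
    """Return a string such as '5 pm', or None if no time is mentioned."""
    match = TIME_RE.search(text)
    if match is None:
        return None
    clock, period = match.groups()
    # Normalise 'p.m' / 'PM' to 'pm'; fall back to the default when missing.
    period = (period or default_period).replace(".", "").lower()
    return clock + " " + period


print(parse_time("if I remember correctly live well, either 5 p.m."))
```

Requiring the period to follow a number keeps the ordinary word "am" (as in "I am coming at 5") from being mistaken for a morning time.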

Dialogue 4

Jarvis: And Sir, your house looks not clean, can I clean it now?
Me: yes of course please
Jarvis: As your wish, please wait a moment
Jarvis: Sir, your house is clean now!
Jarvis: Have a nice day sir!
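
A yes/no confirmation like this one is easy to get subtly wrong in Python, so here is a small sketch of a whole-word check. `AFFIRMATIVE` and `is_affirmative` are hypothetical names chosen for illustration.

```python
AFFIRMATIVE = {"yes", "yeah"}


def is_affirmative(text):
    """Return True if any whole word in `text` is an affirmative word."""
    words = {word.strip(".,!?") for word in text.lower().split()}
    return not words.isdisjoint(AFFIRMATIVE)


# Pitfall: the expression below parses as 'yes' or ('yeah' in text).
# The non-empty string 'yes' is truthy, so the condition holds for any input.
always_true = bool("yes" or "yeah" in "no thanks")

print(is_affirmative("yes of course please"), is_affirmative("no thanks"), always_true)
```

Comparing whole words rather than substrings also means a sentence such as "my eyes hurt" is not read as consent, even though it contains the letters "yes".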

3. Implementation

I also added a downloaded Jarvis "Welcome home" voice clip at the start. The code and the voice pack can be found here.

import subprocess
import winsound  # Windows-only standard library module

import en_core_web_sm
import speech_recognition as sr
from nltk import Tree
from spacy.symbols import NUM, VERB

nlp = en_core_web_sm.load()  # load the spaCy model once and reuse it
recognizer = sr.Recognizer()


# Build an nltk tree from a spaCy token and its children
def to_nltk_tree(node):
    if node.n_lefts + node.n_rights > 0:
        return Tree(node.orth_, [to_nltk_tree(child) for child in node.children])
    return node.orth_


def dependency_tree(doc):
    print('---------------')
    print('Dependency Graph')
    for sent in doc.sents:
        tree = to_nltk_tree(sent.root)
        if isinstance(tree, Tree):
            tree.pretty_print()
        else:
            print(tree)  # a one-word sentence has no children to draw
    print('---------------')


def say(text):
    # Print the reply and read it aloud with eSpeak
    print(text)
    subprocess.run(['espeak', text])


def listen():
    # Record one utterance and return Google's transcription ('' on failure)
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        print("Jarvis could not understand your audio")
    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition service; {0}".format(e))
    return ''


def ask(question):
    # Speak a question, then return the answer text and its parsed spaCy doc
    say(question)
    text = listen()
    print(text)
    doc = nlp(text)
    if text:
        dependency_tree(doc)
    return text, doc


def clean():
    winsound.PlaySound('Jarvis/Jarvis-System/Garbage cleared.wav', winsound.SND_FILENAME)
    say('Sir, your house is clean now!')


winsound.PlaySound('Jarvis/Jarvis-System/Welcome Home Sir(No Song).wav', winsound.SND_FILENAME)

# today my friends will come to my house
text, doc = ask('What can I do for you sir?')

friends_num = 'some'  # fallback when no number is recognized
if any(word.pos == VERB and word.text == 'come' for word in doc):
    # I remember six people
    text, doc = ask('Sir, How many friends will come?')
    for word in doc:
        if word.pos == NUM:
            friends_num = word.text

# I think tea is the best
drink = None
while drink is None:
    text, doc = ask('and what drinks do I need to prepare? cola, tea or coffee?')
    drink = next((d for d in ('tea', 'cola', 'coffee') if d in text.lower()), None)
    if drink is None:
        say('Sir, can you please say it again?')
say('got it sir, I will prepare ' + friends_num + ' cups of ' + drink)

# If I remember correctly, they will come at 5 pm
text, doc = ask('By the way, sir, when they will come?')
time_num = ''.join(word.text + '.' for word in doc if word.pos == NUM)
m = 'am' if 'a.m.' in text.lower() else 'pm'  # default to pm
say('Sir, please confirm, your ' + friends_num + ' friends will come at '
    + time_num + ' ' + m + ' and I will prepare ' + friends_num
    + ' cups of ' + drink + ' for them')

text, doc = ask('And Sir, your house looks not clean, can I clean it now?')
if 'yes' in text.lower() or 'yeah' in text.lower():
    say('As your wish, please wait a moment')
    clean()
say('Have a nice day sir!')
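
To get a feel for what the dependency tree in the code represents without installing spaCy or nltk, here is a small pure-Python sketch. The function names are hypothetical, and the head indices below are written by hand for illustration; they are not actual spaCy output. The sketch follows spaCy's convention that the root token is its own head.

```python
def build_children(heads):
    """Map each token index to its child indices and find the root."""
    children = {i: [] for i in range(len(heads))}
    root = None
    for i, head in enumerate(heads):
        if head == i:          # the root points at itself
            root = i
        else:
            children[head].append(i)
    return root, children


def render(words, heads):
    """Return the tree as indented text, one token per line."""
    root, children = build_children(heads)
    lines = []

    def walk(index, depth):
        lines.append("  " * depth + words[index])
        for child in children[index]:
            walk(child, depth + 1)

    walk(root, 0)
    return "\n".join(lines)


# Hand-written example parse: "my" -> "friends" -> "come" <- "will"
words = ["my", "friends", "will", "come"]
heads = [1, 3, 3, 3]
print(render(words, heads))
```

The recursion has the same shape as `to_nltk_tree`: each token is rendered first, followed by the subtrees of its children.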

Of course, readers can build on this framework and add an RNN to make Jarvis's replies smarter!

