Virtual Jarvis. Typle: voice control for your computer

Most users know that Siri is considered the most popular personal assistant and question-answering technology on iOS devices. Fortunately, Siri is not the only such system on the market. Fans of Marvel's fiction and comics are offered Jarvis, the personal assistant from the movie "Iron Man".

Anyone who has seen "Iron Man" certainly knows Tony Stark's butler, whose name is Jarvis. With this app, the user can call on a virtual servant on their own portable device. The Jarvis program is also a unique development in that it uses the voice and the image of the Jarvis character.

Work with the Jarvis utility begins with ordinary audio instructions on using and managing the tool. At the end of setup, the user needs to specify their gender (so that the virtual assistant can address the device owner correctly) and choose a temperature unit (Kelvin, Fahrenheit or, of course, Celsius).


You can view a detailed list of commands by tapping the icons in the upper corner of the display. Every command must begin with the address "Jarvis" and usually consists of a single keyword or short phrase (for example, "Jarvis, weather forecast"). Jarvis can also notify the device owner of upcoming meetings and display the current time. A variety of audio names can be created in the program.

It is important to note that the Jarvis utility provides additional features to owners of the "Iron Man" film on optical disc. For example, the user can easily control playback of the film through this virtual butler.



Today we will talk about speech. Would you like to control your computer by voice, without using your fingers? Or, as they say, by the power of thought? True, we will not be controlling the computer by thought, but voice control is quite realistic.

Typle is one of the best programs for controlling a computer by voice. On sites that host it, the opinions in the comments agree on this.

True, it has its shortcomings, but more on that later. If you are interested, read my review.

You can download the program here: http://freesoft.ru/typle

How do you use it? First, launch it and look at the main control buttons:

The program greets us and immediately gives hints on how to use Typle. First, click the Add button and enter a word, such as "Open". To do this, say the word into the microphone:

Then click Add. The word "Open" is now saved in the program. You can say any other words into the microphone; the main thing is not to get confused.

The next step is adding commands. To do this, go to this item:

Then tick the item we need:

Select the program, application or action and click the red record button. If the computer recognized our voice, click "Add":

Now one voice command is visible in our profile. In this case it is the one that opens 7-Zip:

Now click the final button, "Start talk", and say the phrase "Open seven zip". In my case everything works and 7-Zip opens. Remember the phrase "Open sesame"? This is roughly the same thing.

The program does not always work properly. The mighty Russian language has not yet been fully mastered by the programmers who work on speech... but it is still nice when the computer listens to you.

So for testing and plain curiosity, Typle is a perfect fit.

In this video you can see the history of the first voice engines and what still needs work:

There are other analogs of the program with such terrible names as Gorynych, Perpetuum, Dictograph and Voice Commander. But none of them is quite "it". Do not criticize a decent program.

It took me 5 minutes to master this program. That is a fairly long time (I usually figure out such programs in 1-2 minutes). If you have questions, write. See you soon, friends :)!


For a long time I could not let go of the idea of my own "Jarvis" and of controlling household appliances by voice. Finally I got around to creating this miracle. I did not have to think long about the "brains": a Raspberry Pi fits.

So, the hardware:

  • Raspberry Pi 3 Model B
  • Logitech USB camera

Implementation

Our assistant will work on the same principle as Alexa / Hub:
  1. Activate offline on a specific word
  2. Recognize the command in the cloud
  3. Run the command
  4. Report on the work done or provide the requested information
Because my camera is supported out of the box, I did not have to mess around with drivers, so we go straight to the software part.
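
The four steps above can be sketched as one pass of a loop. This is only an outline under assumptions: `assistant_step` and the four callables it receives are hypothetical names used for illustration, not part of any library, and the example feeds it stand-in functions instead of real audio or network calls.

```python
def assistant_step(heard_hotword, recognize, execute, report):
    """Run one pass of the activate -> recognize -> run -> report cycle.

    All four arguments are hypothetical callables supplied by the caller.
    Returns True if a command was handled, False otherwise.
    """
    if not heard_hotword():        # 1. offline activation on a keyword
        return False
    command = recognize()          # 2. cloud recognition of the command
    if not command:
        return False
    result = execute(command)      # 3. run the command
    report(result)                 # 4. report the outcome to the user
    return True


# Example with stand-in functions instead of real audio and network calls.
log = []
handled = assistant_step(
    heard_hotword=lambda: True,
    recognize=lambda: "light",
    execute=lambda cmd: "done: " + cmd,
    report=log.append,
)
print(handled, log)
```

Passing the stages in as functions keeps the slow parts (recording, cloud requests) replaceable, which makes the control flow easy to try out without a microphone.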

Offline activation

Activation is done with CMU Sphinx. All would be well, but out of the box recognition is very slow, more than 10 seconds, which is absolutely unsuitable. To solve the problem, you need to clear the dictionary of unnecessary words.

Install everything you need:

pip3 install SpeechRecognition
pip3 install pocketsphinx
Next:

sudo nano /usr/local/lib/python3.4/dist-packages/speech_recognition/pocketsphinx-data/en-US/pronounciation-dictionary.dict
We remove everything except the one entry we need, Jarvis:

jarvis JH AA R V AH S
Now PocketSphinx recognizes pretty quickly.
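
If editing the dictionary by hand is inconvenient, the same pruning can be sketched in a few lines of Python. This is an illustration only: `prune_dictionary` is a hypothetical helper, it assumes each entry looks like the line above (a word followed by its phonemes), and it is best tried on a backup copy of the dictionary.

```python
def prune_dictionary(lines, keep_words):
    """Keep only dictionary entries whose headword is in keep_words.

    Each line is expected to look like 'word PH1 PH2 ...'. Alternative
    pronunciations written as 'word(2)' are matched by their base word.
    """
    keep = {w.lower() for w in keep_words}
    kept = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        headword = parts[0].split("(")[0].lower()
        if headword in keep:
            kept.append(line.rstrip("\n"))
    return kept


sample = [
    "jar JH AA R",
    "jarvis JH AA R V AH S",
    "java JH AA V AH",
]
print(prune_dictionary(sample, ["jarvis"]))
```

To apply it to a real file, read the lines with `open(path).readlines()` and write the returned list back, one entry per line.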

Speech recognition

At first the idea was to use the Google service, which SpeechRecognition supports. But as it turned out, Google charges for this and does not work with individuals.

Fortunately, Yandex also provides this capability, free and extremely simple.

We register and get an API key. All the work can be done with curl.

curl -X POST -H "Content-Type: audio/x-wav" --data-binary "@file" "https://asr.yandex.net/asr_xml?uuid=ya_uuid&key=ya_api_key&topic=queries"
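
For those who prefer to stay inside Python, the same request can be prepared with the standard library. This is a minimal sketch, not the service's official client: `build_asr_request` is a hypothetical helper, the credentials are placeholders, and the request is only built here, not sent.

```python
import urllib.parse
import urllib.request

ASR_URL = "https://asr.yandex.net/asr_xml"  # endpoint from the curl command


def build_asr_request(wav_bytes, uuid, api_key, topic="queries"):
    """Build (but do not send) the POST request for a WAV recording."""
    query = urllib.parse.urlencode({"uuid": uuid, "key": api_key, "topic": topic})
    return urllib.request.Request(
        ASR_URL + "?" + query,
        data=wav_bytes,
        headers={"Content-Type": "audio/x-wav"},
        method="POST",
    )


# Placeholder credentials; real values come from the Yandex registration.
request = build_asr_request(b"RIFF", uuid="ya_uuid", api_key="ya_api_key")
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request).read()
```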

Speech synthesis

Here Yandex helps us again. We send the text and get back a file with the synthesized speech.

curl "https://tts.voicetech.yandex.net/generate?format=wav&lang=ru-RU&speaker=zahar&emotion=good&key=ya_api_key" -G --data-urlencode "text=text" > file
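
The important detail in this call is that the text must be URL-encoded, which curl does through `--data-urlencode`. A rough Python equivalent of building that URL is sketched below. `build_tts_url` is a hypothetical helper, the key is a placeholder, and the host spelling `tts.voicetech.yandex.net` and the `emotion` parameter name are my assumptions about what the command intends.

```python
import urllib.parse

# Assumed host spelling; adjust if your account documentation says otherwise.
TTS_URL = "https://tts.voicetech.yandex.net/generate"


def build_tts_url(text, api_key, speaker="zahar", emotion="good", lang="ru-RU"):
    """Return a GET URL for the synthesis request with the text safely encoded."""
    params = {
        "format": "wav",
        "lang": lang,
        "speaker": speaker,
        "emotion": emotion,
        "key": api_key,
        "text": text,
    }
    return TTS_URL + "?" + urllib.parse.urlencode(params)


url = build_tts_url("Anything else?", "ya_api_key")
print(url)
```

Encoding the text this way also avoids the quoting problems that appear when a phrase containing quotes or ampersands is pasted straight into a shell command.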

Jarvis

We put everything together and get the following script.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import speech_recognition as sr
from xml.dom import minidom
import sys
import random

r = sr.Recognizer()
ya_uuid = ""
ya_api_key = ""

# os.system('echo "Assistant started" | festival --tts --language english')


def convert_ya_asr_to_key():
    # Take the first recognition variant from the saved Yandex reply.
    xmldoc = minidom.parse("./asr_answer.xml")
    itemlist = xmldoc.getElementsByTagName("variant")
    if len(itemlist) > 0:
        return itemlist[0].firstChild.nodeValue
    else:
        return False


def jarvis_on():
    # Check offline whether the recording contains the activation word.
    with sr.WavFile("send.wav") as source:
        audio = r.record(source)
    try:
        t = r.recognize_sphinx(audio)
        print(t)
    except (LookupError, sr.UnknownValueError):
        # Nothing recognized: treat it as "not activated".
        print("Could not understand audio")
        return False
    return t == "jarvis"


def jarvis_say(phrase):
    # Synthesize the phrase with Yandex and play the result.
    os.system('curl "https://tts.voicetech.yandex.net/generate?format=wav&lang=ru-RU&speaker=zahar&emotion=good&key=' + ya_api_key + '" -G --data-urlencode "text=' + phrase + '" > jarvis_speech.wav')
    os.system("aplay jarvis_speech.wav")


def jarvis_say_good():
    phrases = ["Ready", "Done", "As you wish", "Yes, sir", "Anything else?"]
    randitem = random.choice(phrases)
    jarvis_say(randitem)


try:
    while True:
        os.system("arecord -B --buffer-time=1000000 -f dat -r 16000 -d 3 -D plughw:1,0 send.wav")
        if jarvis_on():
            os.system("aplay jarvis_on.wav")
            os.system("arecord -B --buffer-time=1000000 -f dat -r 16000 -d 3 -D plughw:1,0 send.wav")
            os.system('curl -X POST -H "Content-Type: audio/x-wav" --data-binary "@send.wav" "https://asr.yandex.net/asr_xml?uuid=' + ya_uuid + '&key=' + ya_api_key + '&topic=queries" > asr_answer.xml')
            command_key = convert_ya_asr_to_key()
            if (command_key):
                if (command_key in ['key_word', 'key_word1', 'key_word2']):
                    os.system('')
                    jarvis_say_good()
                    continue
except Exception:
    jarvis_say("Something went wrong")
What is going on here? We run an infinite loop, record three seconds with arecord and send them to Sphinx for recognition. If the word "jarvis" is found in the file

if jarvis_on():
we play the pre-recorded activation alert file.

We record another 3 seconds and send them to Yandex, which returns our command in response. Then we perform actions based on the command.
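
To make the "get the command in response" step more concrete, here is a small sketch of pulling the recognized text out of the reply. The XML shape in `SAMPLE_REPLY` is an assumption for illustration (a root element with `variant` children carrying a `confidence` attribute), and `best_variant` is a hypothetical helper. Unlike the script, which takes the first variant, this version picks the one with the highest confidence.

```python
import xml.etree.ElementTree as ET

# Assumed shape of a successful reply; real replies may carry more fields.
SAMPLE_REPLY = """<recognitionResults success="1">
    <variant confidence="0.9">turn on the light</variant>
    <variant confidence="0.4">turn on the night</variant>
</recognitionResults>"""


def best_variant(xml_text):
    """Return the text of the highest-confidence <variant>, or None."""
    root = ET.fromstring(xml_text)
    variants = root.findall("variant")
    if not variants:
        return None
    best = max(variants, key=lambda v: float(v.get("confidence", "0")))
    return (best.text or "").strip().lower()


print(best_variant(SAMPLE_REPLY))
```

Lowercasing and stripping the text makes the later comparison against the keyword lists less sensitive to how the service formats its answer.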

That is really all there is to it. You can come up with a great many execution scenarios.

Use cases

Now some examples of real use.
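
As the number of scenarios grows, a chain of `if` blocks gets long. One possible alternative is a lookup table, sketched below. `COMMANDS` and `find_command` are hypothetical names, the phrases are examples, and the paths are borrowed from the Philips Hue example in this article.

```python
import shlex

# Hypothetical mapping from recognized phrases to shell commands.
COMMANDS = {
    ("light", "turn on the light"): "python3 /home/pi/smarthome/hue/hue.py a1167aa91-on-0",
    ("turn off the light",): "python3 /home/pi/smarthome/hue/hue.py off",
}


def find_command(phrase, table=COMMANDS):
    """Return the argv list for a recognized phrase, or None if unknown."""
    normalized = phrase.strip().lower()
    for phrases, command in table.items():
        if normalized in phrases:
            return shlex.split(command)
    return None


print(find_command("Turn off the light"))
```

The returned list can be handed to `subprocess.run(argv)`, which avoids building shell strings by concatenation.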

Philips Hue

Install:

pip install phue
In the Hue application, set a static IP for the bridge:

Run:

#!/usr/bin/python
import sys
from phue import Bridge

b = Bridge("192.168.0.100")  # Enter bridge IP here.

# If running for the first time, press button on bridge and run with b.connect() uncommented
# b.connect()

print(b.get_scene())
We write down the IDs of the scenes we want; they look like "470d4c3c8-on-0".

The final version of the script:
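
Picking IDs out of a long printout by eye is tedious, so a small filter can help. The sketch below assumes the scene listing is a dictionary keyed by scene ID whose values contain a `name` field; that shape, the sample data and the helper `scene_ids_by_name` are all illustrative assumptions rather than something taken from the phue documentation.

```python
def scene_ids_by_name(scenes, wanted):
    """Return the IDs of scenes whose name contains `wanted` (case-insensitive)."""
    needle = wanted.lower()
    return sorted(
        scene_id
        for scene_id, info in scenes.items()
        if needle in info.get("name", "").lower()
    )


# Made-up sample in the assumed shape of the bridge's scene listing.
sample_scenes = {
    "0000aaaa1-on-0": {"name": "Bright on 0", "lights": ["1", "2"]},
    "0000bbbb2-on-0": {"name": "Dimmed on 0", "lights": ["1", "2"]},
}
print(scene_ids_by_name(sample_scenes, "dimmed"))
```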

#!/usr/bin/python
import sys
from phue import Bridge

b = Bridge("192.168.0.100")  # Enter bridge IP here.

# If running for the first time, press button on bridge and run with b.connect() uncommented
# b.connect()

# The list of light IDs is missing from the source; [1, 2, 3] is a placeholder, use your own.
LIGHT_IDS = [1, 2, 3]

if (sys.argv[1] == "off"):
    b.set_light(LIGHT_IDS, "on", False)
else:
    b.activate_scene(1, sys.argv[1])
Add to Jarvis:

if (command_key in ["light", "turn on light"]):
    os.system("python3 /home/pi/smarthome/hue/hue.py a1167aa91-on-0")
    jarvis_say_good()
    continue
if (command_key in ["dim light", "mute light"]):
    os.system("python3 /home/pi/smarthome/hue/hue.py AC637E2F0-on-0")
    jarvis_say_good()
    continue
if (command_key in ["turn off the light", "switch off the light"]):
    os.system("python3 /home/pi/smarthome/hue/hue.py off")
    jarvis_say_good()
    continue

LG TV

We take the script from here. After the first run and entering the pairing code, the code itself does not change, so you can cut that part out of the script and leave only the control part.

Add to Jarvis:

# 1 - POWER
# 24 - VOLUME_UP
# 25 - VOLUME_DOWN
# 400 - 3D_VIDEO
if (command_key in ["turn on the TV", "turn off the TV"]):
    os.system("python3 /home/pi/smarthome/tv/tv2.py 1")
    jarvis_say_good()
    continue
if (command_key in ["add volume", "louder"]):
    os.system("python3 /home/pi/smarthome/tv/tv2.py 24")
    jarvis_say_good()
    continue

Radio

sudo apt-get install mpg123
Add to Jarvis:

if (command_key in ["news", "turn on news", "what's happening"]):
    os.system('mpg123 URL')
    continue
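
One thing to keep in mind: `os.system` waits until the command exits, so a player that keeps streaming would presumably hold up the listening loop. A possible workaround is to start the player in the background and stop it on request. The `Player` class below is a hypothetical sketch; in the example a sleeping Python process stands in for `["mpg123", URL]`.

```python
import subprocess
import sys


class Player:
    """Start and stop one background process (for example, a stream player)."""

    def __init__(self):
        self.process = None

    def start(self, argv):
        self.stop()  # never run two players at once
        self.process = subprocess.Popen(argv)

    def stop(self):
        if self.process and self.process.poll() is None:
            self.process.terminate()
            self.process.wait()
        self.process = None

    def is_playing(self):
        return self.process is not None and self.process.poll() is None


player = Player()
# Stand-in for ["mpg123", URL]: a child Python process that just sleeps.
player.start([sys.executable, "-c", "import time; time.sleep(30)"])
print(player.is_playing())
player.stop()
print(player.is_playing())
```

With this in place, a "stop the radio" phrase could simply call `player.stop()`.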
You can also install homebridge and control everything through Siri, if you do not get around to finishing Jarvis.

As for the quality of speech recognition: it is not Alexa, of course, but at a distance of 5 meters the share of correct recognitions is decent. The main problem is that speech from the TV or speakers is recorded together with the commands and interferes with recognition.

That's all, thanks.

Tags:

  • Raspberry Pi
  • Python