{"id":1545,"date":"2015-10-05T23:01:12","date_gmt":"2015-10-05T22:01:12","guid":{"rendered":"http:\/\/quantum-bits.org\/?p=1545"},"modified":"2022-08-12T17:29:21","modified_gmt":"2022-08-12T16:29:21","slug":"e-v-e-is-witty","status":"publish","type":"post","link":"https:\/\/www.quantum-bits.org\/?p=1545","title":{"rendered":"E.V.E is witty"},"content":{"rendered":"<p><a href=\"http:\/\/quantum-bits.org\/?p=1503\" target=\"_blank\" rel=\"noopener\">Yesterday<\/a>, I tested a tiny USB microphone for E.V.E, and managed to recognize the recorded sentence with the help of the <a href=\"https:\/\/wit.ai\" target=\"_blank\" rel=\"noopener\">wit.ai<\/a> platform (even though the quality of the microphone was a little poor).<\/p>\n<p>Let&#8217;s dig a little deeper into wit.ai and see how it could be helpful for E.V.E.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-1297\" src=\"http:\/\/quantum-bits.org\/wp-content\/uploads\/2015\/09\/rasbperrypi-hacks.jpg\" alt=\"rasbperrypi-hacks\" width=\"985\" height=\"503\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/09\/rasbperrypi-hacks.jpg 985w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/09\/rasbperrypi-hacks-300x153.jpg 300w\" sizes=\"(max-width: 985px) 100vw, 985px\" \/><\/p>\n<p><strong>Trying another USB Microphone<\/strong><\/p>\n<p>This morning I received the second USB microphone I ordered on <a href=\"http:\/\/www.dx.com\" target=\"_blank\" rel=\"noopener\">www.dx.com<\/a>. It&#8217;s a little more expensive (5.33 \u20ac), but still very cheap for experimentation purposes. It&#8217;s a lot bigger than the tiny one I used yesterday. It is flexible too. Once connected, it will face the user (unlike the other one, which is&nbsp;stuck in the back of E.V.E). 
Hopefully, it&#8217;ll provide better input for voice recognition.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-1546\" src=\"http:\/\/quantum-bits.org\/wp-content\/uploads\/2015\/10\/second-mic.png\" alt=\"second-mic\" width=\"1280\" height=\"769\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/second-mic.png 1280w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/second-mic-300x180.png 300w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/second-mic-1024x615.png 1024w\" sizes=\"(max-width: 1280px) 100vw, 1280px\" \/><\/p>\n<p>I hooked the mic to the RPi and booted up the system. The system logs identified the microphone as a C-Media USB PnP Sound Device, with a HID interface. Just like the tiny one:<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-1551\" src=\"http:\/\/quantum-bits.org\/wp-content\/uploads\/2015\/10\/dmesg2.png\" alt=\"dmesg2\" width=\"897\" height=\"99\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/dmesg2.png 897w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/dmesg2-300x33.png 300w\" sizes=\"(max-width: 897px) 100vw, 897px\" \/><\/p>\n<p>Running the <code>lsusb<\/code> command confirmed it:<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-1553\" src=\"http:\/\/quantum-bits.org\/wp-content\/uploads\/2015\/10\/lsusb2.png\" alt=\"lsusb2\" width=\"787\" height=\"98\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/lsusb2.png 787w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/lsusb2-300x37.png 300w\" sizes=\"(max-width: 787px) 100vw, 787px\" \/><\/p>\n<p>And&nbsp;<code>arecord<\/code> showed the same&nbsp;card number and sub-device id:<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-1510\" 
src=\"http:\/\/quantum-bits.org\/wp-content\/uploads\/2015\/10\/arecord-l.png\" alt=\"arecord-l\" width=\"510\" height=\"117\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/arecord-l.png 510w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/arecord-l-300x69.png 300w\" sizes=\"(max-width: 510px) 100vw, 510px\" \/><\/p>\n<p>Without even cranking up the gain of the device with <code>alsamixer<\/code>, I immediately recorded a new phrase (saying &#8220;Quelle heure est-il ? \/ What time is it?&#8221;), a couple of meters away from E.V.E:<\/p>\n<ul>\n<li><code>arecord -D plughw:1 -f dat -r 48000 record2.wav<\/code><\/li>\n<\/ul>\n<p>I applied the same filtering process (trimming the recording, boosting the volume, and applying low-pass and high-pass filters). It gave me a much better recording than yesterday&#8217;s:<img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-1549\" src=\"http:\/\/quantum-bits.org\/wp-content\/uploads\/2015\/10\/filtered2.png\" alt=\"filtered2\" width=\"357\" height=\"141\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/filtered2.png 357w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/filtered2-300x118.png 300w\" sizes=\"(max-width: 357px) 100vw, 357px\" \/><\/p>\n<!--[if lt IE 9]><script>document.createElement('audio');<\/script><![endif]-->\n<audio class=\"wp-audio-shortcode\" id=\"audio-1545-1\" preload=\"none\" style=\"width: 100%;\" controls=\"controls\"><source type=\"audio\/wav\" src=\"http:\/\/quantum-bits.org\/wp-content\/uploads\/2015\/10\/filtered2.wav?_=1\" \/><a href=\"http:\/\/quantum-bits.org\/wp-content\/uploads\/2015\/10\/filtered2.wav\">http:\/\/quantum-bits.org\/wp-content\/uploads\/2015\/10\/filtered2.wav<\/a><\/audio>\n<p>&nbsp;<\/p>\n<p>E.V.E is now looking a little funky, with some kind of ear pointing out on her&nbsp;left side, like an old-style hearing aid.<\/p>\n<p>But I guess it&#8217;s still ok. 
After all, it&#8217;s a prototype. I&#8217;ll decide later on whether to keep this microphone or the tiny one.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-1547\" src=\"http:\/\/quantum-bits.org\/wp-content\/uploads\/2015\/10\/rpi-2ndmic.jpg\" alt=\"rpi-2ndmic\" width=\"1280\" height=\"853\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/rpi-2ndmic.jpg 1280w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/rpi-2ndmic-300x200.jpg 300w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/rpi-2ndmic-1024x682.jpg 1024w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/rpi-2ndmic-750x500.jpg 750w\" sizes=\"(max-width: 1280px) 100vw, 1280px\" \/><\/p>\n<p><strong>Playing with wit.ai<\/strong><\/p>\n<p>I used my account to&nbsp;log&nbsp;into wit.ai&#8217;s back-office and created a new App named &#8220;e.v.e&#8221;. I selected &#8220;fr&#8221; as the default language, since I want to interact with E.V.E in my native tongue (sorry about this, you&#8217;ll see a few French sentences in this post; I&#8217;ll try to keep things clear nonetheless):<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-1555\" src=\"http:\/\/quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai-eve1.png\" alt=\"witai-eve1\" width=\"987\" height=\"537\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai-eve1.png 987w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai-eve1-300x163.png 300w\" sizes=\"(max-width: 987px) 100vw, 987px\" \/><\/p>\n<p>It gave me a new Access Token (blurred in the previous screenshot). 
I changed yesterday&#8217;s recognition script accordingly.<\/p>\n<p>Then I created a new intent called &#8220;what_time_is_it&#8221; and associated three expressions&nbsp;(three different ways to ask for the time in French) to it:<\/p>\n<ul>\n<li>&#8220;Quelle heure est-il ?&#8221; (What time is it?)<\/li>\n<li>&#8220;Il est quelle heure ?&#8221; (What&#8217;s the time?)<\/li>\n<li>&#8220;Tu as l&#8217;heure ?&#8221; (Have you got the time?)<\/li>\n<\/ul>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-1558\" src=\"http:\/\/quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai-intent1.png\" alt=\"witai-intent\" width=\"973\" height=\"596\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai-intent1.png 973w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai-intent1-300x184.png 300w\" sizes=\"(max-width: 973px) 100vw, 973px\" \/><\/p>\n<p>I tried this newly created intent with the expression &#8220;Quelle heure est-il ?&#8221; previously recorded with the brand new microphone.&nbsp;I ran yesterday&#8217;s script (after changing the Access Token to use the &#8220;e.v.e&#8221; wit.ai App and recognize French speech). I got this:<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-1552\" src=\"http:\/\/quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai2.png\" alt=\"witai2\" width=\"380\" height=\"309\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai2.png 380w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai2-300x244.png 300w\" sizes=\"(max-width: 380px) 100vw, 380px\" \/><\/p>\n<p>Just like yesterday, it gave me back the interpreted meaning of the recording (<code>\"_text\" : \"quelle heure est-il\"<\/code>). 
But this time, it also gave me a confidence level (<code>0.51<\/code>) and the recognized intent (<code>\"intent\" : \"what_time_is_it\"<\/code>).<\/p>\n<p>Nice!<\/p>\n<p>It would have returned the same intent if the recording had corresponded to one of the other expressions (&#8220;Tu as l&#8217;heure ?&#8221; or &#8220;Il est quelle heure ?&#8221;).<\/p>\n<p>You can also trace all this using wit.ai&#8217;s log system:<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-1559\" src=\"http:\/\/quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai-logs.png\" alt=\"witai-logs\" width=\"1007\" height=\"650\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai-logs.png 1007w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai-logs-300x194.png 300w\" sizes=\"(max-width: 1007px) 100vw, 1007px\" \/><\/p>\n<p>Of course, it&#8217;s not easy to create <em>all<\/em> the necessary intents. One nice feature of wit.ai is the Inbox. It lets you discover the intents you need: each expression sent to your wit.ai App by your users will end up in the App&nbsp;Inbox:<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-1560\" src=\"http:\/\/quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai-inbox.png\" alt=\"witai-inbox\" width=\"989\" height=\"478\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai-inbox.png 989w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai-inbox-300x145.png 300w\" sizes=\"(max-width: 989px) 100vw, 989px\" \/><\/p>\n<p>From there, you can create the necessary intents from the collected expressions in your Inbox (or find other intents from the community).&nbsp;From the Inbox, you can also train your intents (by means of validation or correction), and wit.ai will get better at understanding things.<\/p>\n<p>An intent like &#8220;What time is it?&#8221; is easy to manage. 
Several expressions might correspond to the same intent (telling what time it is), but that is easily handled by associating all of them with the intent. Now, if the user says &#8220;Wake me up tomorrow at 5 am&#8221; or &#8220;Set the alarm to 5 am&#8221;, this adds another level of complexity.<\/p>\n<p>It&#8217;s not enough to know that the <code>set_alarm<\/code> intent is involved; you also need to capture the date and time at which the alarm should be set.&nbsp;This is where wit.ai&#8217;s &#8220;entity&#8221; concept comes into play:<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-1561\" src=\"http:\/\/quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai-entities.png\" alt=\"witai-entities\" width=\"709\" height=\"214\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai-entities.png 709w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai-entities-300x91.png 300w\" sizes=\"(max-width: 709px) 100vw, 709px\" \/><\/p>\n<p>In addition to determining the user intent, wit.ai tries to capture and normalize these entities for you (if you ask for it). You&#8217;ll get your entities back in wit.ai&#8217;s JSON result.<\/p>\n<p>As you can see in the <code>lights<\/code> example above, an entity value is not necessarily tied to a specific phrase in the sentence. <code>on_off = off<\/code> is inferred from the sentence as a whole, unlike <code>temperature<\/code>, which is inferred from the &#8220;68\u00b0F&#8221; phrase in the sentence.<\/p>\n<p>From wit.ai&#8217;s back-office, you can add your own&nbsp;entities (enum-based, composite, &#8230;) to any expression.&nbsp;You can also use built-in entities. 
Here is an excerpt:<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-1563\" src=\"http:\/\/quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai-builtinentities.png\" alt=\"witai-builtinentities\" width=\"971\" height=\"983\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai-builtinentities.png 971w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2015\/10\/witai-builtinentities-296x300.png 296w\" sizes=\"(max-width: 971px) 100vw, 971px\" \/><\/p>\n<p>This really looks good. I think I&#8217;m about to make up my mind and select wit.ai for E.V.E&#8217;s voice recognition.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Yesterday, I tested a tiny USB microphone for E.V.E, and managed to recognize the recorded sentence with the help of the wit.ai platform (even though the quality of the microphone &#8230;<\/p>\n","protected":false},"author":1,"featured_media":3851,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ngg_post_thumbnail":0},"categories":[21],"tags":[],"_links":{"self":[{"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=\/wp\/v2\/posts\/1545"}],"collection":[{"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1545"}],"version-history":[{"count":0,"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=\/wp\/v2\/posts\/1545\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=\/wp\/v2\/media\/3851"}],"wp:attachment":[{"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1545"}],"wp:
term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1545"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1545"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}