Good Monday morning! Today I’m just going to chill out a little bit and play
around with the Watson voice APIs. I am mpj, and you are watching Fun Fun
Function.
Alright, so IBM Watson is IBM’s cloud machine learning platform thingy. I’m
honestly not sure; IBM has always had really confusing branding for
everything. Yesterday on the Twitch stream I played around with these APIs in
preparation for a hack I’m doing this weekend, and I figured that the API was
pretty impressive, and it turned out really nicely when you did it in a web
app. So I figured that I would just do a quick video where I demonstrate how
to use the IBM Watson voice processing inside a web app, because it was
pretty fun.

This is not a sponsored video or anything. I just happened to use the Watson
APIs because they had a couple of features that I need for the hack. I’m
pretty sure that you can do this with most streaming APIs, but what I found
particularly impressive with the Watson implementation is that they have a
pretty nice JavaScript API that you can use in the browser, and it streams
the result over a WebSocket, so it’s incredibly snappy considering that the
voice recognition is done over the network.
Okay, so let’s get started. Let’s say create-react-app my-voice-thing, and
while that is running I’m going to pull up a browser and search for the
module.
[Music]
That’s not the one we’re looking for. watson-speech npm. Yeah, this is the
one, this is the one we’re looking for.
Uh-huh, cool. Let’s jump to the GitHub repo. Okay, back to the terminal: cd
my-voice-thing, and go npm start to make sure that things are working. Yes,
it works.

Yeah, I have a cold. I don’t know if the microphone picks up my snot, but my
head is full of snot.

All right, the first thing that we need to do is look at the examples here.
They have an example server here, because we need to have access to the web
service and tokens and stuff like that.
And so we need to store the secrets on the server. What we’re gonna do is
just grab this server.js here. Looks like that. I’m gonna steal that and
create a server.js. This is just a hack, this is just an exploration, so
we’re not bothering with quality and crap like that.

They have a bunch of stuff we don’t need. They are serving and browserifying
and stuff here; we don’t need that for this example. We’re not gonna use
dotenv either, because I have already set the secrets for the API in my
environment variables in my .bash_profile, so that I don’t accidentally share
my secrets with you fine people. You’re all fine people, except one of you
who is a criminal, and that person is destroying things.
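As a rough sketch of keeping the secrets in environment variables, something like the following could read them on the server. The variable names SPEECH_TO_TEXT_USERNAME and SPEECH_TO_TEXT_PASSWORD and the helper readCredentials are assumptions made for illustration, not names confirmed by the video.

```javascript
// Read the speech-to-text credentials from an environment object.
// The variable names below are assumed for illustration.
function readCredentials(env) {
  const username = env.SPEECH_TO_TEXT_USERNAME;
  const password = env.SPEECH_TO_TEXT_PASSWORD;

  // Collect every missing variable so the error message is complete.
  const missing = [];
  if (!username) missing.push('SPEECH_TO_TEXT_USERNAME');
  if (!password) missing.push('SPEECH_TO_TEXT_PASSWORD');
  if (missing.length > 0) {
    throw new Error('Missing environment variables: ' + missing.join(', '));
  }

  return { username: username, password: password };
}

// Usage on the server (not executed here):
// const credentials = readCredentials(process.env);
```

Failing loudly at startup when a variable is missing tends to be easier to debug than a rejected token request later on.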
Let’s remove this thing. This is the browserify stuff; we don’t need that.
Then the token endpoints: we need this speech-to-text thing here, but we
don’t need the text-to-speech thing. We don’t care about that at all.
Currently, most of this looks fine, I think. This thing: we can’t use port
3000, because that’s what’s being used by the create-react-app dev server, so
I’m going to change that to 3002. Other than that, I think that all we need
to do is install all of these things here. So let’s do npm i express, and we
need watson-developer-cloud, and we need vcap_services. I don’t know what
that is, but it’s used here and I am too lazy to investigate it. And I think
that’s it. Boom.
Installing, please stand by. Sorry about the studio being messy again, by the
way. It’s because I’m painting that wall there, so I had to move the desk
here and move a lot of boxes there, and I’m also painting there. It’s going
to be nice eventually, I promise. It’s just a constant one-more-thing kind of
situation with the studio.

Okay, let’s see if the server runs: node src/server.js. Bam, it didn’t run.
It broke: webpack is not defined. It’s setting up the webpack compiler. We
don’t need that, so we’ve got to remove it. Let’s try to run it again. Still
fails: no such file or directory, localhost.

Okay, all right, the point is that Chrome requires HTTPS to access the
user’s microphone unless it’s a localhost URL, so this is a basic server on
port 3001 using a self-signed certificate. We probably don’t need that, to be
honest, since we’re just gonna do explorations on localhost, but they
actually do provide some certificates here, so I’m going to just steal
these. This was localhost.cert, and we also need localhost.pem. There we go,
let’s see if this works.
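The HTTPS-unless-localhost rule mentioned here can be sketched as a small check. This is a simplified approximation of the browser's behaviour, not Chrome's actual logic, and the function name micLikelyAllowed is made up for illustration.

```javascript
// Simplified approximation: microphone access is expected to work on HTTPS
// pages, or on plain HTTP only when the host is the local machine.
function micLikelyAllowed(pageUrl) {
  const url = new URL(pageUrl);
  if (url.protocol === 'https:') {
    return true;
  }
  return url.hostname === 'localhost' || url.hostname === '127.0.0.1';
}

console.log(micLikelyAllowed('http://localhost:3000'));   // true
console.log(micLikelyAllowed('http://example.com'));      // false
console.log(micLikelyAllowed('https://example.com'));     // true
```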
All right, cool. IBM Watson speech JS SDK example app and token server, live
at localhost:3002. That’s a catchy name, innit? Ah, right.
This means that we can jump to the example code here. Let’s have a look,
let’s have a look. Okay, I’m losing myself here. They are here in the static
directory. Scrolling down to microphone streaming, object extracted to
console. I’m just gonna copy-paste this part here, and we’re gonna jump into
our app.

I want to trigger this on a button press, so I’m going to delete this, and
we create a button that says Listen to microphone, and onClick it’s going to
call something like onListenClick, and then that goes there. Yeah, so you
know what, let’s just try that.
WatsonSpeech is not defined, and onListenClick is not defined. No, it’s
gonna be this.onListenClick. WatsonSpeech is not defined. Right, well, it’s
not defined; we need to get that from somewhere.

Actually, if we go back to the root of the npm module, we’ll see that there’s
a watson-speech module, and you can require sub-parts of it. See, here we’re
requiring watson-speech/speech-to-text/recognize-microphone, and if you have
a look over here, you see that it’s the same structure as this thing here. So
if you load it using some method like Bower, it will load it into the global
scope, like we did JavaScript back in the dark ages. But we use npm here, so
we’re gonna do that.

Let’s steal that, let’s pull that in here, and we’re going to use import
recognizeMic from, something like that. See if this works. WatsonSpeech is
not defined; now we’re gonna use this from here. See what that looks like.
Module not found: watson-speech. No, because we haven’t installed that
module quite yet.
So let’s cancel this, shut down the React dev server, and go npm i
watson-speech. See what that gives us. Installing, please stand by.

Okay, let’s start the server again and see what happens. It’s really nice
that the React development thing just automatically reloads. Let’s open up
the console over here, because I know that my face will be down here, and
click Listen to microphone. Okay, broke. Breakage. Okay: failed to construct
WebSocket, the URL contains a fragment identifier.

Oh, you know what, I’m betting that the problem is that this token here is
messing things up. Let’s console.log the token and see what happens. Listen
to microphone, scrolling up. Hey, the token is... okay.
The token has a lot of HTML in it. Oh, okay. So we are on localhost:3000, but
the server that we created, the one that is supposed to provide the token,
is actually on 3002, remember? Because we are running our React development
server on 3000, we changed that. So I’m just going to do a dirty thing here
and put in localhost:3002, and that is going to improve things.
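To make the mix-up concrete, here is a small sketch of how a relative token path resolves against the page's origin. The path /api/speech-to-text/token and the helper names are assumptions for illustration; only the two ports come from the video.

```javascript
// Assumed token path, used only for illustration.
const TOKEN_PATH = '/api/speech-to-text/token';

// Resolve a path or URL the way a browser would for a page on pageOrigin.
function resolveTokenUrl(pathOrUrl, pageOrigin) {
  return new URL(pathOrUrl, pageOrigin).href;
}

// A token is an opaque string; a body starting with an HTML tag suggests
// the wrong server answered the request.
function looksLikeHtml(body) {
  return /^\s*<(!doctype|html)/i.test(body);
}

const pageOrigin = 'http://localhost:3000';  // create-react-app dev server
const tokenServer = 'http://localhost:3002'; // token server

// Relative path: ends up at the dev server, which answers with its HTML page.
console.log(resolveTokenUrl(TOKEN_PATH, pageOrigin));
// http://localhost:3000/api/speech-to-text/token

// Absolute URL: goes to the token server regardless of the page origin.
console.log(resolveTokenUrl(tokenServer + TOKEN_PATH, pageOrigin));
// http://localhost:3002/api/speech-to-text/token
```

Hard-coding the absolute URL is the quick and dirty route taken here; a dev-server proxy would be another option, but that is outside what the video shows.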
Listen to microphone, see what we get. Okay, it just failed to fetch. That’s
nice. Oh: no Access-Control-Allow-Origin. It’s cross-origin requests. Since
we are developing and playing around here, we’re just going to allow all the
cross-origin requests in the world. You can do that by just requiring the
cors module and then using that in Express. Also, coffee. And, whoops, we’re
starting the wrong thing.
Let’s do npm i cors so that we actually get the module. Then we’re going to
restart, no, not the app dev server, our token server. Cool. Installing,
please stand by. node src/server.js.
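The video does this with Express and the cors module. As a dependency-free sketch of the same idea using only Node's built-in http module, the wrapper below adds the allow-all header to every response. The names allowAllOrigins, createTokenHandler and startTokenServer are made up for illustration, and getToken is a hypothetical stand-in for whatever call fetches the Watson token.

```javascript
const http = require('http');

// Wrap a request handler so every response allows requests from any origin.
// Fine for local experiments, too permissive for a production server.
function allowAllOrigins(handler) {
  return function (req, res) {
    res.setHeader('Access-Control-Allow-Origin', '*');
    handler(req, res);
  };
}

// Build a handler that answers with a token. getToken is a hypothetical
// callback-style function: getToken(function (err, token) { ... }).
function createTokenHandler(getToken) {
  return function (req, res) {
    getToken(function (err, token) {
      if (err) {
        res.statusCode = 500;
        res.end('Error retrieving token');
        return;
      }
      res.end(token);
    });
  };
}

// Start the token server on the given port (not called in this sketch).
function startTokenServer(getToken, port) {
  const server = http.createServer(allowAllOrigins(createTokenHandler(getToken)));
  server.listen(port);
  return server;
}

// Usage (not executed here): startTokenServer(myGetToken, 3002);
```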
We’re up and running. Okay, let’s click that again. Okay. All right, cool,
we’re getting some alternatives here. Okay, you see here that it’s actually
doing some really bad parsing of what it is that I am saying. Okay, here,
cool, it just basically parsed what I said. Hesitation. I like this
hesitation thing. You see here that it’s actually me doing some really bad
hesitation; I hesitate a lot while speaking.

Now I want to grab this and actually render it to screen, because that would
be really cool. So let’s go back to this thing and have a look at what this
looks like. The data object has an alternatives array, and each array item
has a transcript property, so we’re going to use that. See if we can just add
a div here, and the div is going to just contain state, I’m sorry,
this.state.text.

Oh, you know what, this won’t work, because state is going to be null when
it’s in its initial state. So we have to do a constructor here, and we have
to set this.state to an empty object, I suppose, and then we also need to
call super here, because we need to do that in classes in JavaScript. See
what this looks like. Okay, it’s not breaking anymore, at least.
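The constructor fix can be sketched in plain JavaScript. The Component class below is a hypothetical stand-in so the snippet runs in Node without React, and App and BrokenApp are illustrative names; the point is only that super has to be called before this is used, and that an empty initial state makes reading this.state.text safe.

```javascript
// Hypothetical stand-in for a component base class, so this runs in Node.
class Component {
  constructor(props) {
    this.props = props;
  }
}

class App extends Component {
  constructor(props) {
    super(props);    // must run before `this` is touched in a subclass
    this.state = {}; // empty initial state: this.state.text is just undefined
  }

  currentText() {
    return this.state.text;
  }
}

class BrokenApp extends Component {
  // No constructor, so this.state is never set.
  currentText() {
    return this.state.text; // throws a TypeError
  }
}

console.log(new App({}).currentText()); // undefined

try {
  new BrokenApp({}).currentText();
} catch (err) {
  console.log(err instanceof TypeError); // true
}
```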
Now, when we get data, we’re going to grab data.alternatives and the first
alternative, I think that’s okay, and then I think it’s, oops, and then just
go this.state.text. See what that looks like. All right.
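Going by the shape described here (a data object with an alternatives array whose items carry a transcript property), pulling out the first transcript might look like the sketch below. The helper name firstTranscript and the sample object are made up for illustration.

```javascript
// Return the transcript of the first alternative, or an empty string when
// the result has no alternatives yet.
function firstTranscript(data) {
  const alternatives = (data && data.alternatives) || [];
  return alternatives.length > 0 ? alternatives[0].transcript : '';
}

// Made-up sample in the shape described above.
const sample = {
  alternatives: [
    { transcript: 'listen to microphone' },
    { transcript: 'listen to my crow phone' },
  ],
};

console.log(firstTranscript(sample)); // listen to microphone
console.log(firstTranscript({}) === ''); // true
```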
Cannot set property text of undefined. No, okay, first of all I have the
wrong... okay, all right, there, bam. So: this.setState is not a function.
No, it’s because this is now incorrectly scoped, because when we do this
onListenClick here, first of all we need to bind this to this, because...
it’s all a big mess. If you are interested in learning about bind and this, I
made a series on that here. We’re not gonna go into why this is breaking
here. You need to understand this when you develop JavaScript, and it’s very
tricky, so check out that video if you are confused.

Then we also need to make this an arrow function, maybe. No, actually, let’s
not. Let’s not do code unless we absolutely have to. Let’s see if it works.
So this is an arrow function. Okay, I think that might work. No, nope, we
still need to make this an arrow function as well, so that the scope of this
is preserved all through the chain.

Cool, now it’s updating the div. It thought I said Dave, but I said div. But
this is pretty cool, right? If I speak very clearly and eloquently, then it
actually understands what I am saying, but if I’m speaking here slowly, it
just... bleh. Closing the console. This is basically what I wanted to show
you.
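The arrow-function fix from a moment ago can be reproduced without React or Watson. In this sketch, createFakeStream is a made-up stand-in for the speech stream that, like Node's EventEmitter, calls handlers with this set to the stream, and Transcriber is a hypothetical class playing the role of the component.

```javascript
// Made-up stand-in for the speech stream. It ignores the event name and
// calls every handler with `this` set to the stream itself.
function createFakeStream() {
  const handlers = [];
  const stream = {
    on(event, handler) { handlers.push(handler); return stream; },
    emit(data) { handlers.forEach((handler) => handler.call(stream, data)); },
  };
  return stream;
}

class Transcriber {
  constructor(stream) {
    this.stream = stream;
    this.state = {};

    // Arrow functions at both levels: `this` is the Transcriber all the way down.
    this.onListenClick = () => {
      this.stream.on('data', (data) => {
        this.setState({ text: data.alternatives[0].transcript });
      });
    };

    // Inner regular function: inside the handler `this` is the stream,
    // so this.setState is not a function.
    this.onListenClickBroken = () => {
      this.stream.on('data', function (data) {
        this.setState({ text: data.alternatives[0].transcript });
      });
    };
  }

  setState(patch) {
    this.state = Object.assign({}, this.state, patch);
  }
}

const good = new Transcriber(createFakeStream());
const clickHandler = good.onListenClick; // detached, the way a click handler is passed around
clickHandler();
good.stream.emit({ alternatives: [{ transcript: 'hello div' }] });
console.log(good.state.text); // hello div
```

Calling onListenClickBroken and then emitting data throws a TypeError, which mirrors the setState error seen in the video.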
Ah, let me just actually style this up a little bit, because I think it’s
cool. There. Alright, what does this look like? It’s now listening to the
microphone. Essentially, this is what I wanted to show you: React and Watson
voice recognition, integrated. It’s surprisingly fast considering that this
is all done over the network.

And this API is free. You can sign up: just google the Watson speech API and
click service credentials. Let me actually just show you the example server.
These things here, it’s just the speech-to-text username and the
speech-to-text password. I’ve set these in my bash profile so that they are
accessible here, and you can just get your own speech-to-text username and
speech-to-text password by signing up for Watson Developer Cloud.

I must say, this is really fast, really impressive. Maybe there are other
voice APIs that are also equally impressive, but I thought it was fun to show
you what I’ve been doing.

And that is it for this cold episode of Fun Fun Function. I release these
every Monday morning, 08:00 GMT. You can subscribe here so that you don’t
miss it, or you can look at another episode right now by clicking here. I am
mpj, now having a cold. Until next Monday morning, thank you.