
GOTO 2016 • Machine Learning with Google Cloud Platform • Kaz Sato

I'm Kaz Sato, a developer advocate for Google Cloud Platform, and I'm also a tech lead for the data analytics team. I have been working at Google for over five years, and for the last one and a half years I have been working as a developer advocate, giving presentations like this at many events. In this session I'd like to cover these agenda items: first I'd like to introduce the concepts of neural networks and deep learning and how they work, with some great demonstrations; then I'd like to show how Google has been deploying those neural network technologies in the Google cloud; and finally I'll cover the technologies that are actually provided as products on Google Cloud Platform. So, what is a neural network?
A neural network is a function that can learn from a training data set. If you want a neural network to do image recognition, you can take, for example, a cat image, convert it into a raw vector, and put that vector into the network; eventually you get an output vector which represents the labels of the detected objects, such as "cat" or "human face". It is designed to mimic the behavior of neurons inside the human brain by using matrix operations. It really has only very basic matrix operations; there is no sophisticated or fancy mathematics going on. Everything you do with neural networks is the kind of matrix operation we learned in high school. For example, the input vector may represent the image of a cat, with the pixel data converted into a vector, and you get another output vector as a result that represents the label of the detected image. In this case you would get a number close to 1.0, which indicates that the neural network thinks the image must be a cat. What's going on inside a neural network is very simple: all you are doing with a neural network is a matrix operation like Wx + b = y, where W holds the weights and b the biases. And actually you don't have to care about those Ws and bs at all; you let the computer find and calculate them. All you have to care about is what kind of data you want to put into the network and what kind of result you want to get.
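That single matrix operation can be sketched in a few lines of NumPy. Everything here is invented for illustration: a real image vector would have thousands of pixel values, and W and b would come from training rather than from a random generator.

```python
import numpy as np

# A toy "network": one fully connected layer computing y = Wx + b.
# Sizes and label names are made up: a 4-pixel "image", 2 output labels.
rng = np.random.default_rng(0)

x = np.array([0.2, 0.9, 0.4, 0.7])   # input vector (pixel values)
W = rng.standard_normal((2, 4))      # weights (found by training in practice)
b = np.zeros(2)                      # biases

y = W @ x + b                        # the whole computation: a matrix operation

def softmax(v):
    # squash raw scores into values between 0 and 1 that sum to 1
    e = np.exp(v - v.max())
    return e / e.sum()

probs = softmax(y)                   # a score close to 1.0 means "the network thinks so"
print(dict(zip(["cat", "human face"], probs.round(3))))
```

The point is that nothing fancier than a matrix multiply and an add happens per layer; training is just the search for good values of W and b.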
Let's take a look at some interesting demonstrations of neural networks. For example, suppose you have a problem like this, a very simple classification: how can you use a neural network to classify these two different data sets? I'm not sure what those data points mean, but let's imagine they are the weights and heights of people, so that if a person's weight and height are both low, you can think he or she must be a child, and otherwise he or she could be an adult. How can you classify them? If you try to use a neural network for this problem, you can just apply the same equation, Wx + b = y, to do the classification. You put the weight number and the height number in here as a vector, and you get an output vector like this, where if a single data point is classified as an adult you would have a one here, and if it's a child you would have a one there. The thing is, the computer tries to find the optimal combination of the parameters, the weights and biases, by itself; you don't have to think about what kind of parameters to set in the neural network. The computer does it for you. Let's take a look at the actual demonstration.
Did you see that? So instead of a human instructing the computer, you don't have to teach the computer how to solve these problems; everything you have to do is provide the training data set, so that the computer thinks by itself to optimize the combination of the parameters, the weights and biases. You can see that the computer tries to change the weights to do the classification at an optimal success rate. The computer is using an algorithm called gradient descent: it tries to increase or decrease each weight and bias to move the combination toward higher accuracy, or a lower loss rate. It's just like the way we learn things from our parents, or the way junior people in a company learn from senior people: these juniors, or these children, learn from many, many mistakes. The computer makes many, many mistakes in the initial stage, but if you provide much more training data, the computer uses the gradient descent algorithm to minimize the failures. That's how it works.
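The gradient descent loop from this demo can be sketched on the weight-and-height toy problem. The data, learning rate, and step count below are all made up for illustration, and the model is a single linear unit with a sigmoid output, a minimal stand-in for the demo's network.

```python
import numpy as np

# Toy version of the demo: classify (weight, height) points as adult vs
# child with a linear unit y = sigmoid(Wx + b), trained by gradient descent.
rng = np.random.default_rng(1)

# invented, normalized data: children cluster low, adults cluster high
children = rng.normal([-1.0, -1.0], 0.3, size=(50, 2))
adults   = rng.normal([ 1.0,  1.0], 0.3, size=(50, 2))
X = np.vstack([children, adults])
t = np.array([0] * 50 + [1] * 50)     # target label: 0 = child, 1 = adult

W = np.zeros(2)
b = 0.0
lr = 0.5                              # learning rate

for step in range(200):
    z = X @ W + b
    p = 1.0 / (1.0 + np.exp(-z))      # predicted probability of "adult"
    grad = p - t                      # gradient of the log loss w.r.t. z
    # nudge each weight and bias a little: the "increase or decrease"
    # toward lower loss that the talk describes
    W -= lr * (X.T @ grad) / len(X)
    b -= lr * grad.mean()

accuracy = ((p > 0.5) == t).mean()
print(f"accuracy after training: {accuracy:.2f}")
```

The computer starts out making many mistakes and, step by step, adjusts W and b until the failures are minimized, exactly the behavior visible in the demo.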
Let's take a look at another interesting demonstration of neural networks, where you have another training data set like this. I'm not sure what it means, but let's say we have a data set which requires a complex classification. If you were programming against this data set directly, you might want to use the equation of a circle to separate the two data sets and their arrangement. But by using a neural network, you can just let the computer figure out how to solve it. You saw that the computer was trying to create a pattern to classify those data sets by using so-called neurons in the hidden layers. In the first example we didn't have any hidden layers, but with this complex data you need to use hidden layers. That means that between the input data and the output neurons you have additional layers which hold many more neurons, and each neuron does a very simple thing: this neuron only classifies whether a data point is in the bottom-left red area or the upper-right area, and that neuron only classifies whether a data point is on the left or on the right, just like that. But by combining those outputs at the last neuron, the neural network can compose a much more complex pattern like this. And if you have more and more neurons inside the hidden layers, the network's classification can be much more accurate, like this. By adding more hidden layers with more neurons you have to spend much more computation power, but at the same time the neural network can compose much more complex patterns and extract those patterns from a large data set.
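How simple neurons combine into a complex pattern can be shown with hand-picked weights: each hidden neuron below checks one side of a square region, and the output neuron fires only when all four agree. The step-function neurons and these particular weights are my own illustration, not anything the demo actually trains.

```python
import numpy as np

def step(v):
    # a crude neuron activation: fires (1.0) when its input is positive
    return (np.asarray(v) > 0).astype(float)

# hidden layer: 4 neurons, each one a straight-line decision
# checking one side of the square |x| < 1, |y| < 1
W1 = np.array([[ 1.0,  0.0],    # is x > -1 ?
               [-1.0,  0.0],    # is x <  1 ?
               [ 0.0,  1.0],    # is y > -1 ?
               [ 0.0, -1.0]])   # is y <  1 ?
b1 = np.ones(4)

# output neuron: combines the four simple decisions into one complex one
W2 = np.ones(4)

def inside_square(point):
    h = step(W1 @ np.asarray(point, dtype=float) + b1)  # 4 simple line checks
    return float(W2 @ h > 3.5)                          # fire only if all 4 agree

print(inside_square([0.0, 0.0]))   # center of the square
print(inside_square([2.0, 0.0]))   # outside, to the right
```

No single neuron "knows" the square; the complex region emerges only from combining their outputs, which is the role the hidden layer plays in the demo.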
How about this one? Let's try it. This is a data pattern called a double spiral. If you are a programmer and your director or a customer asked you to classify this kind of data set, what kind of program code would you write? Do you want to write many if statements or switch statements with many thresholds, checking the x and y positions? No, I don't want to do that. Instead I would use a neural network, so that the neural network looks at the data points in the data set and finds the optimal patterns hidden inside the training data. That's right: this is where neural networks can exceed human performance, human programmers' performance. They can extract the hidden patterns inside the training data set, and you don't have to specify any distinctive features; you don't have to extract any features by hand. Instead the computer can find the patterns from the training data set. That is why people are so excited about neural networks and deep learning. How about this?
If you have the problem of identifying handwritten text, you can still use the very simple Wx + b = y kind of neural network to classify this handwriting. The network comes up with these complex patterns, classifying those images into output vectors with labels like an eight, a seven, or a six. If you want more accuracy, you add more hidden layers, so that you could go from something like 85 to 95 to 98 percent. And how about this: how can you classify these cat images by using neural networks? You have to have many more layers in the neural network; that is a so-called deep neural network.
This diagram is a model called Inception version 3, which was published by Google last year, where we used 240 hidden layers in a single neural network design. It takes much more computation power and time, but it can compose much more complex patterns like this. The neurons closer to the input vector learn very simple patterns, like vertical lines or horizontal lines, but the neurons closer to the output vector can learn much more complex patterns or compositions, such as eyes, a nose, or a human face. Again, we didn't embed any features or patterns in the neural network before training; everything is trained from the data by using computation power.
So that's how neural networks and deep learning work. But as I mentioned, it takes a lot of computation time and a lot of training data to use deep learning in production projects. Those are the two big challenges right now for users of deep learning, and this is why deep learning has not been so popular with you all yet. Once these problems are solved, once you have plenty of computation power and plenty of training data, you can easily apply neural networks and deep learning to your existing problems. If you are a game programmer, you may want to apply deep learning to analyzing your game server logs, to check whether a player could be a cheating player or a spammer. Or if you are a web designer or a web systems engineer for an ad system, you may want to feed your ad conversion or click-through-rate logs to a neural network, so that the computer can learn from the ad logs and get more optimization. But you have to have computation power and training data.
That's the reason why we started using the Google cloud to train large-scale neural networks. Google has hundreds of thousands of machines in our data centers globally, and we have been building those data centers as a computer, not just as a bunch of computers sitting in a building. We design each cluster, which holds something like ten or twenty thousand servers, to work as a single computer with multiple CPUs. That's the reason why it's not so hard for us to deploy large-scale neural networks, or other kinds of big data processing, on our Google cloud.

There are two very fundamental technologies inside Google that support the data center as a computer. One is the network: we have been building our own hardware for the network switch fabric, which is called the Jupiter network. We are not using commercial network switches in most cases; Cisco or Juniper routers are not the mainstream of our network backbone. We have been building our own hardware that can hold something like a hundred thousand 10-gigabit Ethernet ports, which comes to 1.2 petabits per second per data center. That is the network we have at Google. There is also a container technology called Borg. Borg is our proprietary container technology, which we have been using for over ten years to deploy almost all Google services, such as Google Search or Gmail. A Borg cluster can hold up to 10,000 or 20,000 physical servers, so that you can do large-scale job scheduling, scheduling the CPU cycles, memory space, and disk I/O at that scale. That's the reason why you can deploy a single application, like neural network training or big data processing, onto maybe hundreds or thousands of machines with a single line of a Borg command.
Google Brain is the project where we started applying this Google cloud technology to building large-scale neural networks. The project started in 2011, and right now Google Brain is used by many, many production projects at Google. First, the scalability of the Google Brain infrastructure: for example, RankBrain. RankBrain is an algorithm we have been using for the ranking of the Google Search service since last year. It runs on the Google Brain infrastructure with five hundred nodes and can perform three hundred times faster than a single node. That means that if you are training your deep learning model on a single server, you would take 300 times longer than Google engineers do. And Inception, the model for visual recognition, can use 50 GPUs to accelerate performance to 40 times faster. Those are the reasons why Google has been so strong at applying deep learning in production projects, such as AlphaGo. Earlier this year we had the series of Go matches with the Go professional; AlphaGo was using the Google Brain infrastructure for training as well as for prediction during the matches. Google Search has been using RankBrain since last year, and we have been using machine learning technologies for optimizing data center operations, and also for natural language processing, visual recognition, and speech recognition, such as in Google Photos and the voice recognition on Android. We have over 60 production projects that have been using Google Brain and deep learning over the last couple of years. Now we have started externalizing this power of Google Brain to external developers.
The first product is called the Cloud Vision API, and the second product is called the Cloud Speech API. The Cloud Vision API is an image analysis API that provides a pre-trained model, so you don't have to train your own neural network, and you also don't have to have any machine learning skill set. It's just a REST API: you upload your photo image to the API and you receive a JSON result in a few seconds that holds the analysis results. It's free to start trying out, up to 1,000 images per month, and it's generally available right now, so it's ready to be used in production projects. It has six different kinds of features it can detect. Label detection means that it can put labels or categories on any image you upload; for example, if you upload a cat image, the API will return labels such as "cat" or "pet". Face detection can detect the location of faces in an image. OCR can convert the text in an image into a string. Explicit content detection means that you can check whether the image contains adult or violent content. Landmark detection can detect the location of popular places shown in images. And logo detection can detect product and corporate logos.
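As a concrete sketch, here is roughly what a Cloud Vision API request body looks like. The actual call is an HTTP POST to the images:annotate endpoint with an API key, which is omitted here; the image bytes are a placeholder, and the feature list simply names the six detections described above.

```python
import base64
import json

# Placeholder bytes; in practice: open("cat.jpg", "rb").read()
image_bytes = b"fake-image-bytes"

# JSON body for POST https://vision.googleapis.com/v1/images:annotate
payload = {
    "requests": [{
        "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
        "features": [
            {"type": "LABEL_DETECTION", "maxResults": 5},
            {"type": "FACE_DETECTION"},
            {"type": "TEXT_DETECTION"},         # OCR
            {"type": "SAFE_SEARCH_DETECTION"},  # explicit content
            {"type": "LANDMARK_DETECTION"},
            {"type": "LOGO_DETECTION"},
        ],
    }]
}
print(json.dumps(payload, indent=2))
```

The response is plain JSON as well, one annotation list per requested feature, which is why no machine learning skill set is needed to consume it.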
I'd like to show a demonstration by video first. This is a demonstration using a Raspberry Pi robot that sends images to the Vision API. The video's narration: Cloud Vision provides powerful image analytics capabilities as easy-to-use APIs. It enables application developers to build the next generation of applications that can see and understand the content within the images. The service is built on powerful computer vision models that power several Google services. It enables developers to detect a broad set of entities within an image, from everyday objects to faces and product logos. The service is so easy to use that, as one example use case, you can have a Raspberry Pi robot like the GoPiGo calling the Cloud Vision API directly, so the robot can send the images taken by its camera to the cloud and get the analysis results in real time. It detects faces in the image along with the associated emotions. The Cloud Vision API is also able to detect entities within the image. Now let's see how face detection works: Cloud Vision detects faces in the picture and returns the positions of the eyes, nose, and mouth, so you can program the robot to follow a face. It also detects emotions such as joy, anger, surprise, and sorrow, so the robot moves toward smiling faces and avoids angry or surprised faces. One of the very interesting features of the Cloud Vision API is entity detection, which means it detects any objects, as you see. Cloud Vision lets developers take advantage of Google's latest machine learning technologies quite easily. Please go to …/vision to learn more.
I have another interesting demonstration made using the Vision API. This is called the Vision Explorer demonstration, where we took 80,000 images downloaded from Wikimedia Commons, uploaded them to Google Cloud Storage, and applied Vision API analysis. Here we have clusters of those 80,000 images, and each cluster has labels such as "snow", "transport", or "residential area", meaning it is a cluster of similar images. For example, let's go to "plant". Each single dot represents a thumbnail of an uploaded image, so if you go to a cluster, there must be some cluster of... oh, it's not showing. Why? Let me redraw this; maybe it's because of the machine I'm using. Okay, let's go directly to the cat cluster. In this cluster we have many, many cats, and close to the cat cluster we have the cluster for dogs. Let's go back to the cat cluster, and if you click a thumbnail image, you'll see the analysis result from the API. Right, the API thinks this must be an image of a cat, and it's a cat as a pet, or it must be a British Shorthair. These are some of the things you can do with the deep learning technology. Those results are returned in JSON format, and with that structure we can show them in a GUI. If an image contains any text inside it, we can convert it into a string; for example, with this image you can get a string like "three kangaroos crossing next…" from the sign. If images contain faces, then, although this API doesn't support any personal identification or recognition, it can detect the location of the faces with landmark locations such as the nose and mouth, and it can also recognize emotions such as joy, sorrow, anger, and surprise, each with a confidence level. And if your picture contains a popular place, such as this stadium, the API can return the name of the landmark: the API thinks this must be an image of Citi Field stadium in New York City, with the longitude and latitude, so you can easily put a marker on Google Maps. It's too slow, so I'm cutting it off. You can also detect product and corporate logos, like this Android logo. So this was the Vision API; it's ready to be used in any application.
The other API is called the Speech API, which also provides a pre-trained model, for voice recognition, so you don't have to have any skill set or experience with voice recognition or with training neural networks for it. It's just a REST API and a gRPC API, so you can upload your audio data to the API and you'll receive the result in a few seconds. It supports over 80 languages and dialects, and it supports both real-time recognition and batch recognition. The API is still in limited preview, so you have to sign up with the form on the Speech API page for limited preview access, but we hope to make it a public beta maybe in a couple of weeks, I suppose. Let me show some demonstration. I'm not sure if this works at the event or not, because this is the first time I'm trying it here, and I have some accent problems, so I'm really not sure if this works, but let's try. "Hello, this is a testing of..." I think it's not working; maybe the network is getting slow. "Hello, this is a test of voice recognition by Google Cloud machine learning." Oh yeah. So the final result is, you know, not bad, right? And you also saw the fast response: you can get the recognition result in roughly one second, maybe 0.5 seconds, in real time.
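For the batch (non-streaming) call, the request body looks roughly like this. The field names below follow the later v1 REST API (speech:recognize); the limited-preview API at the time of the talk used slightly different names, so treat this strictly as an illustrative sketch, with placeholder audio bytes.

```python
import base64
import json

# Placeholder; in practice: open("hello.wav", "rb").read()
audio_bytes = b"fake-pcm-audio"

# JSON body for a batch speech recognition request
payload = {
    "config": {
        "encoding": "LINEAR16",        # raw 16-bit PCM audio
        "sampleRateHertz": 16000,
        "languageCode": "en-US",       # one of the 80+ supported languages
    },
    "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
}
print(json.dumps(payload, indent=2))
```

Real-time recognition uses the gRPC streaming interface instead, sending audio chunks as they are captured, which is how the demo gets results back in well under a second.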
So those are the APIs, but they are pre-trained models, which means you cannot train your own model with them. One frequently asked question about these APIs is whether Google will be looking at the uploaded images or audio data to train its own models or to do more research. The answer is no: these APIs, and all of our products provided on Google Cloud Platform, are under the GCP terms and conditions, which have specific sections about customer data. We don't look at customer data except in very special cases, for troubleshooting or emergency situations. So basically we don't look at your data uploaded to the cloud; but at the same time, that means these APIs cannot do any training on your data for your applications. That's the reason why we provide other options for machine learning: TensorFlow and Cloud Machine Learning, the frameworks and platforms that can be used to train on your own data set, to train your own machine learning models and neural networks. What is TensorFlow?
TensorFlow is an open-source library for machine intelligence. We published the library last November, and it is the actual framework we are using right now in Google Research and the Google Brain team, so it's not something outdated or stale; it's the latest machine learning framework we are using at Google. For example, if you want to design the simple network we saw, Wx + b = y, you can use Python to write it in a single line of code, like this: you put the image of a cat in here as this vector, and you get an output vector that represents the labels of the detected objects, like a cat or a human face, and you let the computer find the weights and biases. It's that simple. It's also really simple to train your network, because you can just write a single line to have the network trained on your training data set by specifying an algorithm like gradient descent. You don't have to implement your own procedural code for each piece of optimization logic. Actually, I'm not good at math or these machine learning algorithms, but I can still just copy and paste the sample code to my laptop and play with my own data sets, MNIST and so forth. You can just let the TensorFlow runtime do the optimization.
TensorFlow also provides a very good visualization tool. One of the problems we had at Google in applying neural networks to production was debugging: if you have many hidden layers inside the neural network, you have to check all the stages of the parameters, whether the parameters are converging in the right direction or going astray and ending up with wrong numbers such as NaN or zero everywhere. So it's really important to visualize what's happening inside your network, and TensorFlow provides the tool for that. Portability is another important aspect of the framework. Once you have defined your neural network with the Python code of TensorFlow, you can start running your network's training or prediction on your laptop, like a Mac or Windows laptop. But you will find that your laptop is too slow for training deep neural networks, so you may soon want to buy some GPU cards and install maybe two or three of them in a single box. Still, it usually takes a few days, and some people spend a few weeks, to do the training of their neural networks; it takes that much computation time. In that case you can upload your TensorFlow graph to the Google cloud, so that you can utilize the power of the tens or maybe hundreds of GPU instances we are running in the Google cloud. And once you have finished your training, the parameter set may fit into hundreds or even tens of megabytes, and then you can easily copy that parameter set onto smaller devices, mobile or IoT devices such as Android, iOS, or maybe a Raspberry Pi, so that those devices can do the prediction, like image recognition or voice recognition, without using any internet connection. Everything can be implemented within the framework of TensorFlow.
At the last Google I/O, which was about one month ago, we announced a new technology called the Tensor Processing Unit. This is not a replacement, but maybe a complementary technology, for the GPU and CPU. So far, deep neural network researchers and developers outside Google have mostly been using GPUs for training neural networks, because it's all matrix operations, and by using GPUs you can accelerate the matrix operations to ten or maybe 40 times faster. That's what typical neural network users are doing right now. But the problem with GPUs is power consumption: each one consumes around 100 or 200 watts per GPU card, and we were using thousands of them in a Google data center, so power consumption was becoming the largest problem. By designing an ASIC, a dedicated chip built specifically for TensorFlow and deep neural networks, we were able to reduce the power consumption and gain ten times better performance per watt as a result. We also use special techniques such as bit quantization: rather than using 32 bits or 16 bits to calculate every matrix operation, we quantize down to 8 bits, where there's not so much loss of accuracy, so that you can fit much bigger parameter sets into a very small memory footprint.
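The 8-bit quantization idea can be sketched as a simple linear (min/max) quantization of a parameter array. This is an illustration of the general technique, not the TPU's actual scheme; the array size and values are invented.

```python
import numpy as np

# Quantize 32-bit float parameters down to 8-bit integers:
# 4x less memory, at the cost of a small rounding error.
rng = np.random.default_rng(2)
weights = rng.standard_normal(1000).astype(np.float32)   # 4 bytes each

lo, hi = weights.min(), weights.max()
scale = (hi - lo) / 255.0                                # map range onto 0..255
q = np.round((weights - lo) / scale).astype(np.uint8)    # 1 byte each

restored = q.astype(np.float32) * scale + lo             # dequantize
max_err = np.abs(weights - restored).max()               # at most ~scale/2

print(f"memory: {weights.nbytes} -> {q.nbytes} bytes")
print(f"worst-case error: {max_err:.4f}")
```

The worst-case error is about half of one quantization step, which for typical weight distributions is small relative to the range, matching the talk's claim that accuracy barely suffers.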
We have been using TPUs in many production projects already: RankBrain, AlphaGo, and the speech recognition behind Google Photos have all been using TPUs for the last couple of months; actually, we have been using TPUs for less than one year. I have been describing the power of the Google Brain infrastructure, the numbers of CPUs, GPUs, and TPUs. If you want to utilize the power of the Google Brain infrastructure, here's the product we provide, which is called Cloud Machine Learning.
Cloud Machine Learning is a fully managed distributed training environment for your TensorFlow graphs. Once you have written your TensorFlow graph and run it on your laptop, you can upload the same TensorFlow graph to Google Cloud Machine Learning and specify the number of nodes you want to use with the service, such as 20 or 50 nodes, to accelerate the training. Right now Cloud ML is in limited preview, so you have to sign up to start trying it out, but we suppose that general availability, or a public beta, will come sometime later this year. If you go to YouTube you can find the actual demonstration of Cloud ML, where the presenter showed an actual TensorFlow-based neural network that takes 8 hours to train with a single node; if you upload the same TensorFlow graph to Cloud ML, you can accelerate the performance by up to 15 times, which means you could get the result of the training within 30 minutes rather than waiting for 8 hours. That is the speed we are seeing inside Google for deep learning deployments, and we are externalizing this power to you, so you can utilize it for solving your own problems. Cloud ML can also be used in production, not only for training: the demo showed that Cloud ML could be used for batch predictions at around 300 per second.
So those are the topics I have covered. We have two different kinds of products: one is the ML APIs, like the Vision API and the Speech API, where you can just upload your own data to the cloud and get the results back in a few seconds; and if you want to train your own neural networks, you can try using TensorFlow or Cloud Machine Learning, so that you can accelerate the training of your own neural network. If you take a look at the links in the resources for this session, you can start trying out those products right now. Thank you so much.
Q: I've got two questions. First of all, a few steps back on machine learning and neural networks: you talked about more hidden layers for more complex patterns, but are there rules of thumb? At university we were told to use your feeling to decide how many layers to use. Any tips on that?

A: Yeah, that's actually a really good question. So the question is: is there any good practice for designing your neural network? As far as I know, there's no single theory for optimizing the design of a neural network. Everybody, even at Google, when I ask the Google research team, people say, you know, let's start with five hidden layers and see how it works. That's the largest challenge we have right now for deploying neural networks on your own data or applications: you have to do many, many trials, just like the people in pharmaceutical companies trying to create a new drug. You have to try different combinations of hyperparameters. Hyperparameters means parameters such as the number of hidden layers, the number of neurons, or the way you import the data and extract features. You have to try every different combination; that's the problem. And it also takes much, much more computation power. I think that's all I can say; there's no theory behind it.
Q: A little bit later in the presentation, about which use cases will be supported in the near future: going from training to runtime, can the network, the W and b, be exported in order to reuse it in embedded processing, for example?

A: Okay, yeah. For the first question, which is about online training: I think it's on the roadmap, or maybe the wish list, but currently it's not supported. It's possible that we'll be supporting online training, where your neural network is gradually trained by online data. And the second question is about exporting: yes, you can export the trained parameter sets, so that you can use the parameters for your own use cases, such as importing the parameters into IoT devices, or you can even copy that data set to a different cloud, like AWS, to run your predictions there. Did that cover the questions? Does anyone else have any questions? No? Then thank you very much.

Thank you so much.