Using async generators to stream data in JavaScript


Good morning! In this video we're going to look at how you can use async generators to stream data. I think that async generators are one of the coolest things to land in JavaScript, and I think they have huge implications for how we do programming, but there are also a lot of moving parts, so I want us to think back on the series of episodes that led us here. If you want to watch from the start, there's a link in the episode description.

In the first episode we looked at the concept of an iterator. We talked about how we can iterate over things with constructs such as the for...of loop because they provide an iterator: an array provides an iterator, the arguments object provides an iterator, and we can also create our own custom objects that provide an iterator, so that anything can iterate over them.
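As a reminder of what such a custom object looks like, here is a minimal sketch. The name countingRange is made up for this illustration and does not come from the earlier episodes.

```javascript
// A hand-written iterable that counts from `start` up to and including `end`.
// `countingRange` is a hypothetical name used only for this sketch.
function countingRange(start, end) {
  return {
    // for...of calls this method once to obtain an iterator.
    [Symbol.iterator]() {
      let current = start;
      return {
        // Each call to next() hands back an object with `value` and `done`.
        next() {
          if (current <= end) {
            return { value: current++, done: false };
          }
          return { value: undefined, done: true };
        }
      };
    }
  };
}

for (const n of countingRange(1, 3)) {
  console.log(n); // 1, then 2, then 3
}
```

Anything that understands the iterator protocol (for...of, the spread operator, Array.from) can consume this object without knowing how it works inside.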
In the episode after that we looked at generators, which are a way for us to create these iterators in a terser, more syntactically sugary way. Generators themselves are not some special magical concept. In the end they just produce an iterator, so we don't need generators to produce iterators; we can also write them by hand if we want to.
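A small sketch of that point: the generator below produces an ordinary iterator with a next method, nothing more. The name countUpTo is invented for this example.

```javascript
// A counter written as a generator function.
// `countUpTo` is a hypothetical name used only for this sketch.
function* countUpTo(start, end) {
  for (let current = start; current <= end; current++) {
    yield current; // pause here until the consumer asks for the next item
  }
}

// Calling the generator function gives us an iterator object.
const iterator = countUpTo(1, 2);
console.log(iterator.next()); // { value: 1, done: false }
console.log(iterator.next()); // { value: 2, done: false }
console.log(iterator.next()); // { value: undefined, done: true }
```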
Then in the episode after that we talked about async iterators. An async iterator is just like a normal iterator, except that it deals with promises, so every time you call next it can go and fetch a new item over the network. But what I failed to do in the last episode was to clearly explain why and when you would want to use an async iterator. I had been selling the concept of an iterator earlier as a way to make a custom thing that you can iterate over, but in the async iterators example I used a fake database that just fetched a couple of rows, and a lot of people asked in the comments: well, an array is already iterable, it already works in the for...of loop.
Why can't I just do Promise.all on these database rows, return a promise that resolves to an array, and then iterate over that? What's the point of using an async iterable? I am so glad this question was asked. I really like it when people question new technologies and ask why does this exist, why do we have this, what is the purpose of this, instead of just using things mindlessly. Therefore, in this video we are going to use an example where an array would be a completely impossible data structure to use.

An array is kind of like a box, or rather you can imagine it as a set of boxes that are glued together. In some languages it has a fixed length, and in some languages you can just glue new boxes onto the end, but it is, like I said, a box: you put things in the array, inside these little boxes, and there they are. The limitation of the array is that you get either all or nothing. If you have an API for some database or some file reader, and you call something like read file and that gives you an array, that means that all of the rows have to be read.
You get everything at once, which might be fine. That's a very simple way of reasoning about something like a file: I read the rows, I get the rows. I read the users table, I get all the users. I now have the users and I can iterate over them. Great, that's amazing, that's simple. Why not always use an array?
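The all-at-once approach from the comments could be sketched roughly like this. Both fetchRow and readAllRows are hypothetical stand-ins, not code from the earlier episode.

```javascript
// Hypothetical stand-in for a database call that fetches one row by id.
function fetchRow(id) {
  return Promise.resolve({ id, name: `user-${id}` });
}

// Resolves only when every single row has arrived: all or nothing.
function readAllRows(ids) {
  return Promise.all(ids.map(fetchRow));
}

readAllRows([1, 2, 3]).then((rows) => {
  // By the time we get here, the whole array is sitting in memory.
  for (const row of rows) {
    console.log(row.name);
  }
});
```

This is perfectly fine for a handful of rows. The catch is that nothing can be shown or processed until the last row is in, and everything has to fit in memory at the same time.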
Imagine that the array is really big. What do you mean, big? You mean like one hundred thousand rows? Because that's nothing to an array. That is transferred really fast even over a mobile connection, and even a cell phone can handle 100,000 rows in an array really easily. Computers are really powerful, even the small ones. Yeah, that's true.
Computers are super fast and they are getting faster all the time, but one thing that grows even faster than the speed and performance of computers is the size of the data sets that we have to handle. So when thinking about async generators, I need you to think beyond small data sets like 100,000 rows, because that's not big. I want you to think 30 gigabytes. I want you to think 100 gigabytes. Things that cannot fit in memory, that you can't read all at once, even in the foreseeable future. And even when we get to computers that have 100 gigabytes of RAM, we are going to have data sets that are several terabytes of data. A data set might even be practically infinite, and in this case I'm going to use the data set I can think of that is closest to infinity, and that is all the cat pictures in the world. All the kittens.

This is where the concept of a stream comes in. We ask for a very small part of the data. We ask for the start of a rope and then we just start pulling the rope; we don't ask for the entire rope at once. The interesting part about the concept of an iterator, which, remember, is just an object that we can keep asking for the next item, is that it can express an array, because an array fits that interface: while we have everything in memory, it still works to apply the interface of asking for the next thing, and the next thing, and the next thing. But an iterator can also be used to express a stream. We just constantly ask for the next thing, but under the hood it doesn't really have all the things yet; it just asks for things as you go, and that's the power of a stream. We can use an iterator to express that. Or "express" is perhaps the wrong way of putting it: a stream can conform to the iterator protocol. So async iterators are basically a way for us to express streams, and the purpose of streams is that we can deal with things as they come along. It might be really huge data sets, or just data sets of a couple of megabytes that you want to show gradually in the interface so that the user doesn't have to wait.
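To make the rope idea concrete, here is a small synchronous sketch in which the same consumer works on an array and on an endless sequence. The names takeFirst and naturalNumbers are invented for this illustration.

```javascript
// Takes at most `count` items from anything that follows the iterator protocol.
function takeFirst(iterable, count) {
  const taken = [];
  if (count <= 0) return taken;
  for (const item of iterable) {
    taken.push(item);
    if (taken.length >= count) break; // stop pulling the rope
  }
  return taken;
}

// An endless iterable: each value is computed only when next() is called,
// so the whole sequence never has to exist in memory.
function naturalNumbers() {
  return {
    [Symbol.iterator]() {
      let n = 0;
      return {
        next() {
          return { value: n++, done: false };
        }
      };
    }
  };
}

console.log(takeFirst(['a', 'b', 'c'], 2)); // [ 'a', 'b' ]
console.log(takeFirst(naturalNumbers(), 3)); // [ 0, 1, 2 ]
```

The consumer cannot tell whether the items were all there from the start or are produced on demand, and that is the whole point of the interface.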
Let's have a look at some code. All right, so this is Observable HQ. It's kind of like a JavaScript playground, it's pretty darn cool, and I use it a lot for prototyping. What you see here is a stream of all the cats on Flickr, which is a photo service that was huge back in the day, but then Yahoo bought it and destroyed it. It's still around, though, and it has a really nice API for demoing stuff like this. You see here that we have the for await loop that we looked at in the last episode, and for every URL of cats it just yields HTML, which is just an image tag.
And then we see here that every time we yield, Observable displays it. Think of this block here as basically an async generator. It's both an async function and a generator, so it has the capability of both awaiting and yielding. It's a super powerful entity, really. But don't spend too much time understanding this line; that's not terribly important for this video. This could just as well have been something like render, whatever. That's not the purpose of this video. The interesting thing is: what the hell is this cats thing, and where is it coming from? I'm going to show you what cats looks like, but I want to stress that I'm showing you nothing new. In the async iterators example we saw exactly all of the moving parts that we're going to see here, but this example is going to look a bit more intimidating because it's using a real API service, and there are slightly more moving parts and complications related to that. But it's basically just the fetch API and async iteration that you're going to see. So, looking at cats here, it just calls flickrTagSearch with the string "cats".
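Outside of Observable, the shape of that block could be sketched like this. The fakeCatUrls source and the renderImages name are assumptions made for this example, and plain strings stand in for Observable's HTML helper.

```javascript
// Hypothetical stand-in for the `cats` stream: an async iterable of image URLs.
async function* fakeCatUrls() {
  yield 'https://example.com/cat-1.jpg';
  yield 'https://example.com/cat-2.jpg';
}

// Both an async function and a generator, so it may await and yield.
async function* renderImages(urls) {
  for await (const url of urls) {
    yield `<img src="${url}">`;
  }
}

(async () => {
  for await (const tag of renderImages(fakeCatUrls())) {
    console.log(tag); // one image tag per cat, as each one arrives
  }
})();
```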
We collapse that and look at what flickrTagSearch looks like. I need to reduce the font size slightly there. We open this up and we see that it's a big function. Let's look at this thing first. This has nothing to do with iterators. It is just a function that returns a promise that resolves to an array of URLs to images, representing one page of results of a tag search on Flickr. Wow, that's a mouthful. It's just calling the Flickr API for a search. We give it a tag, but we also give it a page, because this result set of cats is millions and millions of cat pics, probably billions, so we can't get it all at the same time. We have to give it a page, and every page is 100 cats. You see here that we just call fetch with some URL that is defined in the Flickr API, and then we parse the JSON. We know that the JSON looks like this. It has this weird structure where photo is an array, and for every photo we construct an image URL. This is how the CDN URLs on Flickr are constructed. So what this gives us is a small chunk of the enormous search result that is all the cats on the interwebs. Oh well, on Flickr.
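A rough sketch of that page function follows. The JSON field names (photos.photo, farm, server, id, secret), the image URL pattern, and the fetchJson parameter are all assumptions made for this example, and a canned response replaces the real network call so that it runs without an API key.

```javascript
// Builds an image URL from one photo record. The field names and the URL
// pattern are assumptions for this sketch, not taken from the notebook.
function photoToUrl(photo) {
  return `https://farm${photo.farm}.staticflickr.com/${photo.server}/${photo.id}_${photo.secret}.jpg`;
}

// Resolves to an array of image URLs for one page of results.
// `fetchJson` is a hypothetical helper: (tag, page) => promise of parsed JSON.
async function searchPage(fetchJson, tag, page) {
  const json = await fetchJson(tag, page);
  return json.photos.photo.map(photoToUrl);
}

// Canned response so the sketch runs without a network connection.
async function fakeFetchJson(tag, page) {
  return {
    photos: {
      photo: [{ farm: 1, server: '2', id: `${tag}-${page}`, secret: 'abc' }]
    }
  };
}

searchPage(fakeFetchJson, 'cats', 1).then((urls) => console.log(urls));
```

In the real thing, the injected helper would be a call to fetch followed by parsing the response body as JSON.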
That's as close as we're going to get. Let's scroll down a little bit to get to the meat of the iterator itself, or rather the iterable that is cats. We see here that the function flickrTagSearch takes a tag and returns an object which has just one method, called Symbol.asyncIterator. This is pretty funky: it's a method that uses this symbol as its name. It's less catchy, but a more accurate name for this would be async iterator factory or async iterator maker, because this is what creates an async iterator. Again, this is what the for await loop will call under the hood. When it starts looping, it's going to call this function to get the iterator.

Let's see what's happening here. It sets up a couple of variables to keep track of the page index and the photo index, because it's going to be iterating over these things. It also sets up a cache, because this iterator is not going to fetch one picture at a time. That would be very wasteful, because the way the web works is that there's a latency for every request, so we want to batch the requests a little. A hundred pictures seems kind of reasonable to get at a time: we can't get all the pictures in the world at once, but getting one at a time would be a bit too little. The iterator we want to expose, though, hands out one item at a time, so we're going to keep a buffer under the hood, and that's what the cache is for. We have this little fillCache helper function here that does the Flickr tag search for whatever tag we provided, takes the photos, and stuffs them into the cache variable.
So that's the iterator setup. Then it returns the iterator, and the iterator has a next function. That's where the magic happens. The next function is going to increment the photo index. It's minus one from the start, but then it's going to be notched up by one so that we're on zero. Then it checks whether there is a cache, and whether that index exists in the cache. In this case it's not going to, so it goes down here. It resets the photo index to zero and then it increments the page index. In this case the page index is zero, which means that we're going to end up on page one, which is the first page in the Flickr search results. Pages start at one, not zero.
It calls fillCache with the page index, and then it does a delay here for two seconds. This delay exists for demonstration purposes; otherwise it would just blaze through these images in a very annoying manner. The delay function is basically a wrapper for setTimeout that makes it promise-based: it returns a promise that resolves after a given number of seconds. So this delay here is just a promise that resolves after two seconds. After that has resolved, we return an iteration result object. We're not done, because frankly we're never going to be done iterating kittens; this is largely an infinite result set. The value is going to be the cache at the photo index, and the cache is basically the current page that we have loaded. So that gets us our first cat: it loads the first page and it gives us the first cat when we call next, or rather when the for await loop calls next.
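The delay helper described above could look roughly like this. It is a sketch of the idea rather than the notebook's exact code, and the usage waits only a fraction of a second so that it finishes quickly.

```javascript
// Promise-based wrapper around setTimeout: resolves after `seconds` seconds.
function delay(seconds) {
  return new Promise((resolve) => setTimeout(resolve, seconds * 1000));
}

(async () => {
  console.log('waiting...');
  await delay(0.05); // the video uses two seconds; this is shorter on purpose
  console.log('done');
})();
```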
The next time, it's going to hit this line and see that there is a cache, and that there's also a cat at the photo index, which is now not zero but one. It's the second cat in the page, so it's going to do this artificial delay and then return that cached cat, because, remember, the first 100 cats are cached. It loops through like this for 100 cats, and then it's going to hit this line and see that, hey, we're now out of cats in the cache, so we need to go to the next page and cache that one. So it goes down here, resets the photo index because we're now back at the start, increments the page, and fills the cache with that page. Then we're back iterating in the next page, and it just continues like that.
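Put together, the walkthrough above corresponds to something like the following sketch. The fetchPage stand-in, the tagSearch name, and the page size of three are assumptions made so that the example runs without a network, and the artificial delay is left out.

```javascript
// Hypothetical stand-in for the paged search: three fake URLs per page.
function fetchPage(tag, page) {
  return Promise.resolve([0, 1, 2].map((i) => `${tag}/page-${page}/photo-${i}.jpg`));
}

// Returns an async iterable that exposes paged results one item at a time.
function tagSearch(tag) {
  return {
    [Symbol.asyncIterator]() {
      let pageIndex = 0;   // pages start at 1, so the first fetch bumps this to 1
      let photoIndex = -1; // bumped to 0 on the first call to next()
      let cache = null;    // the page that is currently loaded

      async function fillCache(page) {
        cache = await fetchPage(tag, page);
      }

      return {
        async next() {
          photoIndex++;
          if (cache && photoIndex < cache.length) {
            return { done: false, value: cache[photoIndex] };
          }
          // Out of cached items: start over at the top of the next page.
          photoIndex = 0;
          pageIndex++;
          await fillCache(pageIndex);
          return { done: false, value: cache[photoIndex] };
        }
      };
    }
  };
}

(async () => {
  let seen = 0;
  for await (const url of tagSearch('cats')) {
    console.log(url);
    if (++seen === 4) break; // the stream never ends, so we stop ourselves
  }
})();
```

With three items per page, the fourth item printed comes from page two, which shows the cache being refilled.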
Again, none of this is new. You saw all of this in the previous video. The difference is that this uses a real API instead of a fake API that I created to illustrate a point. There are more moving parts, and it's a more intricate example for that reason, but there are no actual new concepts. And that's it, that's the entirety of our example, and it all results in this constant stream of cats.
In this example we've crafted an iterator by hand, but we can also use generators to create iterators. We have looked at how to create synchronous iterators using synchronous generators, but today we're going to refactor this handcrafted async iterator and create it using an async generator, and it's going to blow your mind how much sleeker this becomes.
So let me expand flickrTagSearch here. I'm going to rewrite this bugger as a generator instead. I'm going to comment this out and start with a normal function, just to start with. We begin with the search call, and we want to pass it this tag here; it's going to be cats. The second argument to flickrTagSearch is the page, so let's call it pageIndex. This variable doesn't exist yet, so let's create that: let pageIndex. This call is asynchronous, it returns a promise, as we can see here, so we want to await that. And in order to use the await keyword we need to make this function async. Oh my god, I'm jumping around, sorry about that.
Now, once we have this array, we want to iterate over it, so let's do that: for const url of this result that we await. Actually, let's break this out, because it might be confusing. Since we're awaiting here, this will be the actual array for the page, so perhaps pageData is a good name. Once we have some page data and we're iterating over it, we want to yield the URL. Remember, this is how generators work: this is how we yield the next item. When we're asked for something, like "I want the next item", we yield it upwards and then we pause the generator, because the generator is sort of a pausable function. The generator won't continue until the external world, the external consumer, calls next on us.

After we're done with this loop, once we have yielded everything in the page data and we're out of data in the page, we just increment pageIndex: pageIndex plus one. And then this generator is going to end. We don't want that, we want it to just continue, so I'm going to wrap this in a while (true). You tend to see this in generators; it's pretty much the vanilla pattern for generators. So now it's just going to continue. There's actually no end condition here. Hypothetically we could run out of pages, but that's not going to happen in this application's case. Cats are infinite. And I've now actually written the entire bloody thing.
Have a look at this. Do you see how little code this is compared to this? Jesus, I need to reduce the font size to show these side by side. So this is the same as this. We haven't added the delay here, but that's the only difference. It's just amazing how much code we can get rid of by having the possibility of the generator. It's pretty darn cool. Let's actually make sure that it works. Let me run this. Oh my god, boom: "the keyword yield is reserved". Oh, of course, sorry about that. We can't use yield because this function here is not a generator yet. We need to make it a generator, otherwise we're not allowed to use yield.
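With the asterisk in place, the refactored function could be sketched as below. As before, fetchPage and tagSearch are hypothetical names, the three-item pages are fake data, the starting page of 1 is an assumption, and the artificial delay is not included.

```javascript
// Hypothetical stand-in for the paged search: three fake URLs per page.
function fetchPage(tag, page) {
  return Promise.resolve([0, 1, 2].map((i) => `${tag}/page-${page}/photo-${i}.jpg`));
}

// `async` allows await inside the function, the asterisk allows yield.
async function* tagSearch(tag) {
  let pageIndex = 1; // assuming pages start at 1
  while (true) {
    const pageData = await fetchPage(tag, pageIndex);
    for (const url of pageData) {
      yield url; // paused here until the consumer calls next() again
    }
    pageIndex = pageIndex + 1;
  }
}

(async () => {
  let seen = 0;
  for await (const url of tagSearch('cats')) {
    console.log(url);
    if (++seen === 4) break; // no end condition inside, so the consumer stops
  }
})();
```

The index bookkeeping and the cache variable from the handwritten version disappear, because the paused generator remembers where it was in the loop.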
There, now it's a generator. Let's run that again. Now you see that there are a lot of cats, and it's kind of jumpy and becomes really confusing, and that is because we haven't added the artificial little delay that was in the last version, like here. So I'm going to add that here. Let me actually increase the font size again so that mobile users can see something. We await a delay, run it again, and now you see that it's loading the cats in this nice sequential manner. Let's scroll down and just gaze upon how extremely sleek this is. This is just so cool.

And that's it. I've linked this notebook in the description so that you can fork it and play around with it yourself. It's very important to actually use these things in order to understand them. You can't just watch me do it; there are just too many moving parts, and you have to experiment with them on your own in order to fully grasp them. That said, don't get stuck on your own. Confusion is to be expected, and your fellow programmers are helpful. If you have a question, post it in the comments down below, or if you want to support the show a little bit and become a patron, you get access to the Fun Fun Forum, which is also an amazing place to ask for help. I hang around there a lot, and the people around there are so friendly and so nice and really good at explaining things, so make sure you use that if you are a patron. There are so many cool implications of async generators, and we're going to be exploring them in future episodes.
If you haven't made the connection yourself yet, you might notice that this is now streams built into JavaScript. We now have this awesome syntax to express streams in a standard manner. So async iterators and async generators effectively do for streams what native promises did for promises. Prior to native promises being built into JavaScript, we had libraries like Q and Bluebird. But now we have this standard way of doing streams, which means that we can have libraries that operate on those streams, and we can mix and match, which is just amazing. We don't have to use these monolithic stream libraries like Highland or Bacon or Rx. We can just use the built-in things, pick little libraries that do transforms, and build our own functions, and those things will be useful for all projects because they all deal with async iterables, which is amazing.
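As a taste of what such small transform functions might look like, here is a sketch of a lazy map and filter over async iterables. The names mapAsync, filterAsync, and numbers are made up for this example and do not refer to any existing library.

```javascript
// Lazily applies `transform` to every item of an async iterable.
async function* mapAsync(source, transform) {
  for await (const item of source) {
    yield transform(item);
  }
}

// Lazily keeps only the items for which `predicate` returns true.
async function* filterAsync(source, predicate) {
  for await (const item of source) {
    if (predicate(item)) yield item;
  }
}

// Hypothetical finite source, so that the sketch terminates.
async function* numbers() {
  yield 1;
  yield 2;
  yield 3;
  yield 4;
}

(async () => {
  const evens = filterAsync(numbers(), (n) => n % 2 === 0);
  const doubled = mapAsync(evens, (n) => n * 2);
  for await (const n of doubled) {
    console.log(n); // 4, then 8
  }
})();
```

Because both functions only rely on the async iterable interface, they would work just as well on an endless stream of cats as on this tiny list of numbers.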
So I think that, depending on what questions people have, we might do higher-order async iterables next time: functions that create other iterables from iterables, like map for iterables, or we could filter the cats or something. I don't know.

You have just watched an episode of Fun Fun Function. I normally try to make these a little bit shorter than this one, but we needed to spend a lot of time on the why and also show a real-world example, and both of those take time. I release this show every Monday morning, 08:00 GMT, so tune in next week. But you will forget that, so you can click subscribe so that you don't miss it, or if you're feeling frisky you can just watch another episode right now. I am MPJ. Until next Monday morning, stay curious.