GOTO 2016 • Seven Secrets of Maintainable Codebases • Adam Tornhill

as part of my day job
I’ve analyzed hundreds of different code
bases most of them open source but also
a lot of proprietary and commercial code
bases and today I want to share some of
my observations with you because there
are some patterns that I see recur over
and over again and those patterns are
independent of programming language or
technology and I want to do this in a
different way because I’m pretty
convinced that we as an industry we know
what good code looks like we know about
the importance of naming testability
cohesion and all that other stuff so
today I want to go beyond that
instead I’m going to start with data
analysis because isn’t it fascinating
that data analysis has become mainstream
and the rise of machine learning has
taught us how to find patterns in
complex phenomena yet we as developers
have never taken those techniques and
turned them on ourselves to understand our
own behavior and how our code bases grow
so that’s what I want to do today I want
to show you how behavioral data about us
as developers can help us make better
decisions decisions that guide us
towards maintainable code bases and the
good news is that you all already have
all the data that you need we’re just
not used to thinking about it that way I’m
talking about the version control
systems our version control data is a
perfect behavioral log of how we as
developers have interacted with our code
so let’s uncover the secrets of version
control data my first observation is
that maintainable code goes way beyond
technology and we as developers we tend
to be pretty good at the technical part of
that but my experience is that we often
miss the social implications of our
designs let’s see why that is important
this is what pretty much every system
I’ve worked with over the past 10 years has
looked like we have a number of big
subsystems usually three that we somehow
integrate and let them exchange
information and at one end we have a big
database now each one of those
subsystems can be fairly large they can
consist of hundreds of thousands or even
millions of lines of code and the scale of that
alone makes it really really hard to
reason about it but this is not just
about scale because today’s systems tend
to be written in multiple different
programming languages and of course they
are developed by multiple programmers
organized into multiple teams and that
leaves everyone with their own view of
how the system looks no one has a
holistic picture but this isn’t really
just about technology because systems
that look like this tend to come with an
organization that looks like that so our
main challenge is to balance these
technical social and organizational
complexities this is actually a hard
problem but it’s a problem that we can
simplify somewhat by understanding the
next point here my next observation is
that all code is equal but some code is
more equal than others and to explain
what I mean by that I want to share a
little story with you this is something
that happened to me a number of years
ago at that time I was working on a
large legacy code base and what we did
was we bought a tool that was able to
measure a bunch of different complexity
measures and produce something called a
technical debt metric so we basically
took that tool threw it at the code base
and out came a prioritized list of every
module in the system and each module had
a technical debt number assigned to it
the interesting thing here was that
there was a clear number one candidate
there was one piece of code that was way
worse than the rest of it however when
we started to investigate that piece of
code it turned out that that code had
been stable for three years we never
needed to touch it it worked wonderfully
in production it just did its job so
should we really spend time and effort
improving that piece of code what can we
expect to gain to me it’s just a big
risk and besides we probably all have
much more urgent matters to attend to
and to find those matters we need to
do something differently we need to take
an evolutionary look at our code here’s
what it looks like please have a look at
these different graphs all of them show
the same thing on the x-axis you have
each file in the system and those files
are sorted according to their change
frequencies that is how many commits
we have made that touched that
particular file and the number of
commits is what you see on the y-axis
now the interesting thing here is that
what you see here are three radically
different code bases written in
completely different programming
languages and with completely different
lifetimes yet they all show the same
pattern you see a power-law distribution
and this is something I’ve seen in every
single code base that I’ve analyzed
but what this means to you is that in
your typical system most of your work
tends to be in a relatively small part
of the code base and most of your code
is in the long tail which means it is
rarely if ever touched and this is
important because it gives us a tool to
prioritize to focus on the code that
really matters so now we’re able to
narrow down the amount of code we need
to inspect if we want to make a real
improvement that matters in terms of
productivity and this is a good starting
point but we need to do even better and
to explain how I need to take a little
risk I’m going to put up a slide and I
think this is the point in the
presentation where you will all
violently disagree with me but here we go
when it comes to maintainable code bases
complexity isn’t the problem
we developers we are so conditioned to
despise bad code we have learned that we
need to refactor the bad code as we see
it and we have learned that we need to
keep all our code clean and nice and
that that is the fastest way forward the
problem I have with that is that it’s just not
true and the graphs on the previous slide
indicates why that may be the case but
still we cannot discard complexity
entirely we just need to find out when
complexity matters and when it doesn’t
and this is a problem I’ve been
struggling with for years then five
years ago I was working again on a
fairly large system with a lot of people
and at the same time I was enrolled at
university where I took a course in
forensic psychology and I think that
forensics in general has a really
beautiful mindset that applies well to
software development as well
but there was one technique in
particular that I really want to show
you this is a technique called
geographical offender profiling
geographical offender profiling is a
technique that we use to catch serial
offenders and you see an example here of
how it looks this is a map of a city a
city that looks pretty much like London
and you see those green dots each one of
those dots represents a crime scene we have
56 different crime scenes and we know
that those 56 crimes are committed by
the same offender how do we know it’s
the same offender well perhaps we have
the hard evidence like DNA or
fingerprints or we have just found that
it’s the same modus operandi the same
method of operation used by the criminal
at the different crime scenes now what
you do in geographical offender
profiling is that you use the
distribution of those crimes to
calculate a probability surface and that
probability surface is in part a
mathematical weighting of the distribution of
the crimes but we also weight that
formula with our knowledge of human
behavior and using this probability
surface we are now able to predict the
most probable home location of our
offender and that’s there in the red
area and that’s what you call a hot spot
and according to the research there’s a
70% chance that our offender will have
his home base there and the reason I
think this applies well to software is
because think about what we have done we
have taken a potentially vast
geographical area and narrowed it down to
a much much smaller part a much smaller
part where we can now focus our human
expertise and still be pretty sure that
we catch that offender so what if we
could do this in software what if we
could take those horrible million lines
of code and narrow them down to a few
hotspots and know that if we focus on
improvement there we get a real effect
let’s see how that may look in software
here we go so this is a geographical
offender profile of a fairly large
system what you see there is
approximately three hundred thousand
lines of code and that data here is
built up from a version control system
our behavioral log and it’s based on two
thousand four hundred different commits
and by identifying patterns in those
commits we were able to project a
probability surface onto our code
and using that probability surface we’re
able to predict the most probable
maintenance savings in that code now I’m
going to walk you through this
visualization in a minute but before I
do that I just want to point out that a
hotspot analysis like this it’s actually
a social analysis because this data is
based on the collective intelligence of
all contributing authors all right so
what you see you see that there are some
large blue circles each one of those
large blue circles represents a
subsystem this is a hierarchical
visualization so it pretty much follows
the folder structure of your
project and when you do detailed
large-scale visualizations it’s also
vital that you keep them interactive so
that we can zoom in and out to the level
of detail of interest and if you zoom in
on one of those subsystems you will see
that we represent each file as a circle
you will also see that those circles
have different sizes and that’s
because size is used to represent
complexity so complexity is something
you measure from the code and we have a
bunch of different complexity measures
to choose from and you can basically
pick any metric you want because what
they all have in common is that they are
equally bad so I tend to go with the
simplest possible thing I tend to pick
the number of lines of code which also
has the advantage of being language
neutral but still I said a minute ago
that complexity alone isn’t a problem so
we need something else we need to
understand if we actually work in that
code or not we need to understand the
change frequency of the code and this is
something you pull out of version
control data and the interesting thing
here of course is when we combine these
two dimensions because now we’re able to
identify complicated code that we also
have to work with often and I will show
you a real-world case study of how this
may look this is a study of Microsoft’s
open source project Roslyn and
Roslyn is a really interesting project
to study because Roslyn is an open
source compiler platform and it actually
includes two compilers itself it
includes the C# compiler written in
C# and the Visual Basic compiler
written in Visual Basic so Roslyn is a
polyglot code base now if we look at the
top hot spots in Roslyn right now you
will see that the number one hot spot is
something called command line tests
written in C# another top hot spot
in Roslyn is called command line tests
and it’s written in Visual Basic
hmmm I wonder if there’s some kind of
relationship between those two in a few
minutes we’re going to see a technique
that helps us answer that question but
for now I just want to point out that
those hotspots they look tiny on-screen
but that’s just because of the size of
Roslyn Roslyn is huge what you see
there is almost four million lines of
code and each one of those command line
tests is a file with 7,000 lines of
code and I would also like to argue that if
you have 7,000 lines of code in a file
called command line tests perhaps that
isn’t a good unit of test and what I
would do in this case is that I would
look for the different responsibilities
and start to break that file down into
separate test suites for example one for
parsing one for validation one for reading
debug flags and if you do that each one
of those new test files will of course
become easier to reason about in
isolation but that’s not the most
important thing the most important thing
is that you end up with an entirely new
context because now if you continue to
do a hotspot analysis like that you will
be able to see which parts of the solution
you managed to stabilize and which
parts that continue to change
so cohesion is a tool that gives you
additional insights into the evolution
of your code another observation you
may have made is that those two modules they
were test code and this is again
something I’ve found over and over again
that the worst offenders in code tend to
be in the test code and I think the main
reason for that is because we developers
we tend to make a mental divide on one
hand we have the application code and we
know it’s vital that we keep it clean
that it’s possible to maintain and
evolve it on the other hand we have the
test code and most of the time we’re
happy if we get around to writing any of
it at all and I think this is a
dangerous fallacy because from a
maintenance perspective there’s really
no difference between test code and
application code and if you have tests
lacking quality they will hold you back
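A hotspot ranking of the kind described above can be approximated from any Git history. The sketch below is not the speaker's tooling. It assumes the log was produced with `git log --name-only --pretty=format:--%h`, counts commits per file as the change frequency, uses lines of code as the complexity measure (as in the talk), and multiplies the two as one hypothetical way to combine them. The function names, file names, and numbers are made up for illustration.

```python
from collections import Counter


def change_frequencies(log_text):
    """Count the commits that touched each file.

    Assumes input from: git log --name-only --pretty=format:--%h
    (commit header lines start with "--", file paths follow).
    """
    counts = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        if line and not line.startswith("--"):
            counts[line] += 1
    return counts


def hotspots(revisions, lines_of_code, top=3):
    """Rank files by revisions * lines of code (an assumed scoring)."""
    scored = [
        (path, revs, lines_of_code[path], revs * lines_of_code[path])
        for path, revs in revisions.items()
        if path in lines_of_code  # skip files that no longer exist
    ]
    return sorted(scored, key=lambda row: row[3], reverse=True)[:top]


# Hypothetical sample data, not taken from a real repository.
SAMPLE_LOG = """--a1b2c3
src/parser.py
tests/test_cli.py

--d4e5f6
tests/test_cli.py

--0718aa
tests/test_cli.py
src/util.py
"""
SAMPLE_LOC = {"src/parser.py": 1200, "tests/test_cli.py": 7000, "src/util.py": 90}

if __name__ == "__main__":
    ranked = hotspots(change_frequencies(SAMPLE_LOG), SAMPLE_LOC)
    for path, revs, loc, score in ranked:
        print(f"{path}: {revs} commits, {loc} lines, score {score}")
```

Any other complexity measure could replace the line count without changing the structure, and a large, frequently changed test file rises to the top of such a ranking just like application code does.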
all right so let’s see what the hotspots
actually brought us when we added the
complexity dimension we’re able to
narrow down the amount of code we need
to investigate even more and I typically
find that we’re able to narrow it down to
just three to six percent depending on
the code base and this is important
because those three to six percent they
tell you which part of the code should
we focus improvements on in order to get
improvements in both
productivity and quality and the reason
I say quality is because hotspots tend
to be a strong predictor of defects all
right so let’s leave the hotspots now
for a while and talk about something
entirely different let’s talk about our
primary tool as software developers and
I’m not talking about the compiler I’m
not talking about the IDE not Emacs not
even vim I’m talking about the brain and
the reason I want to talk about the
brain is because your brain does not
always work in your best interest and to
show you an example I want to do a
little poll here please think back to
the last large project that you worked
on perhaps the project you work on right
now how many of you know where your
hotspots are in that codebase a few of
you ten people maybe please keep your
hand up if you’re 100% certain a few of
you cool great you may well be right
what worries me though is that if 100
years of psychological research has
taught us anything it is that we humans
we can’t really trust our own judgment
and to explain what I mean by that we
need to talk about something different
yes that’s right we need to talk about
gorillas so this is one of my favorite
psychological experiments you may well
have heard about it it’s quite famous it
was done back in the 90s and what the
researchers did here was that they
recorded a video
of two teams that play basketball
against each other and your task as a
participant in that experiment was to
count the number of passes now as you
sat down and watched the two teams play
basketball something a bit strange
happened because suddenly a man dressed
in a gorilla suit walks across the
basketball field stops right in front of
the camera turns towards you and starts
to beat on his chest then he walks off
after you’ve seen that video the
researchers will ask you did you notice
anything particular and you would say of
course a man dressed in a gorilla suit
that’s sure a bit odd but that’s not
what happened it turned out that more
than 50% of the participants failed to
see the gorilla and the follow-up
experiment revealed that people fail to
see the gorilla even when it’s right in
front of their eyes even when the image
of the gorilla hits our retinas we fail
to see it and the reason for that is
because you don’t see with your eyes you
see with your brain and in order to
perceive something we need to focus our
attention on it but your attention was
directed to calculate the number of
passes and the question for us here is
if we humans are capable of
missing something as obvious as a
gorilla what’s the risk that we will
miss the gorillas in our own code what’s
the risk that we will overlook our hot
spots and I think it goes deeper than that
because now I talked about the failure
of the human mind a little bit but it
turns out we humans are actually really
good at some things and one thing we are
exceptionally good at is to
rationalize decisions and beliefs that
we don’t even share so let me explain
how that works this is something
completely different if you took part in
this experiment what happened was that
you get shown two pictures of two
different faces and your task as a
participant is to select the
face you find the most attractive once
you have made your selection the
researchers will hand you a copy of that
photo only they don’t they trick you so
you actually receive a copy of the
photo that you didn’t pick and now they
do something really really evil because
they ask you please motivate your choice
interesting let’s think about that for a
while so we sit there with a copy of a
photo that we didn’t choose and are now
asked to motivate a choice that we
didn’t make and again you would think that
if something like that happens to you
you would of course notice immediately
and again that’s not what happened it
turned out that more than two-thirds of
the participants failed to notice the
swap and if you read the original research
it’s really great because they have a
transcription of the interviews and the
motives that people gave so for example
you had this woman who says yeah I
picked this photo because I really
loved the earrings the photo that she
actually picked doesn’t show any earrings
at all and of course there’s this really
confident man who says yeah I picked
this photo because I really prefer
blondes in reality the photo he picked
showed a dark-haired woman now what I
have just told you about are two
examples of cognitive biases on an
individual level but if you want to see
a real disaster here’s what you do
you take a number of individuals put
them together and call that a team and
to explain what I mean we need to travel
into what perhaps may look like some
unethical corners but I promise you I
will keep this nice so please just relax
let me ask you a question instead how
many of you have been given the advice
that if you want to make a real impact
in a meeting you should speak first just
a few of you well in this context it’s
actually good advice because it turns
out that the
first person who speaks in a meeting
will bias the whole discussion will
bias the whole group but there’s an
even sneakier way to get what you want
and this is something called vocal
minorities and vocal minorities are
based upon the fact that we people when
we hear an opinion repeated over and
over again we come to believe that that
opinion is more popular and widespread
than it actually is and that’s true even
when it’s the same person repeating the
same opinion over and over again so all
I have to do in order to manipulate you
is to keep repeating things like do you
know that Common Lisp is a great
programming language
have you seen Common Lisp it’s an
amazing language now I try to be a good
person so I will only manipulate you in
a good way Common Lisp is indeed great
but what if it was the other way around
for example let’s say someone complains
a lot
have you seen that network module code
it’s crap that network module code we
just have to throw it away it’s so lousy
how do you think that opinion will
affect your idea of where the true
hotspots are and the reason I tell you
this is because I really really really
want to make the case for my next slide
because this is probably the most
important thing I’m going to tell you
today do use your intuition do use your
expertise but make sure to support your
decisions with data all right let’s move
on from gorillas and groups and all this
stuff I’ve talked about change patterns in
our applications and I want to talk about
surprise and the cost of surprise and
the reason I want to talk about surprise
is because surprise is one of the most
expensive things you can put into a
software architecture and there are
different kinds of surprises
I’d like to show you the first kind of
surprise by showing you what has to be
my all-time favorite code this is really
a work of art this is the code from the
Apollo project so this is the code that
actually took us to the moon so please
have a look at this beauty in particular
focus on the comments how many of you
want to go to the moon on that code so
one could argue that this is a surprise
to the end user that’s not the kind of
surprise I want to talk about today I
want to talk about the surprise we
developers leave behind for the poor
maintenance programmer coming after us
and I want to show you how a concept
called temporal coupling helps us
uncover those surprises
now temporal coupling is interesting
because it’s so different from the way
we developers typically talk about
coupling when we developers talk about
coupling what we typically mean is a
dependency between different parts and
pieces temporal coupling is different
because it’s not measured from code
temporal coupling is measured from the
evolution of your code so this is
something we pull out of version control
data and temporal coupling is about
files two or more that keep changing
together over time perhaps even in the
same commit and I want to show you how
that looks by looking at another real
world system this is a case study from ASP.NET MVC and in MVC they tend to
focus a lot on automated tests so if you
look at their code base they actually
have more test code than application
code and as a consequence if you do
a temporal coupling analysis you will
find that most of your temporal couples
are between test code and the unit under
test and this is not a surprise at all
this is actually what we expect in fact
I would be worried if that temporal
dependency wasn’t there because it
probably means our tests aren’t being
kept up to date so what I tend to look
for instead are
couples that have no easy explanation
perhaps something like this so in that
code base we have two different files
one is called script tag helper and one
is called a link tag helper and if you
look at the code you will see that
there’s no immediate dependency between
them yet they keep changing together in
89 percent of all commits how can that
happen when I find something like that I
always look at the code but today I’m
going to delegate that responsibility to
you so what will happen now is that I’m
going to put up a copy of the script tag
helper next to the link tag helper are
you ready
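The coupling figure quoted above can be approximated from a plain Git log. This is a sketch under stated assumptions, not the speaker's tool: it assumes input from `git log --name-only --pretty=format:--%h`, and it defines the degree of coupling as shared commits divided by the average number of commits to the two files, which is one possible definition among several. The file names echo the talk, but the commit data is invented.

```python
from collections import Counter
from itertools import combinations


def parse_commits(log_text):
    """Split a log into one set of changed files per commit."""
    commits, current = [], None
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("--"):  # commit header line
            current = set()
            commits.append(current)
        elif line and current is not None:
            current.add(line)
    return commits


def temporal_coupling(commits, min_shared=2):
    """Return (file_a, file_b, shared_commits, percent), strongest first."""
    revisions = Counter()
    shared = Counter()
    for files in commits:
        revisions.update(files)
        shared.update(combinations(sorted(files), 2))
    coupled = []
    for (a, b), together in shared.items():
        if together < min_shared:
            continue  # ignore pairs that met only by coincidence
        average = (revisions[a] + revisions[b]) / 2
        coupled.append((a, b, together, round(100 * together / average)))
    return sorted(coupled, key=lambda row: row[3], reverse=True)


# Invented commit history for illustration only.
SAMPLE_LOG = """--c1
ScriptTagHelper.cs
LinkTagHelper.cs

--c2
ScriptTagHelper.cs
LinkTagHelper.cs
Startup.cs

--c3
ScriptTagHelper.cs

--c4
LinkTagHelper.cs
ScriptTagHelper.cs
"""

if __name__ == "__main__":
    for a, b, together, percent in temporal_coupling(parse_commits(SAMPLE_LOG)):
        print(f"{a} <-> {b}: {together} shared commits, {percent}%")
```

The `min_shared` threshold is a design choice to filter out files that happened to share a single large commit. The same idea applies at service level if file paths are first mapped to the service that owns them.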
your task is to see if you can spot some
kind of subtle pattern here all right here
we go here’s the script tag helper let’s
put the link tag helper next to it anyone
notice a pattern yeah this is what I
find in quite many cases a dear old
friend of mine copy/paste but I think
I’m being a little bit unfair here
because this is not really copy/paste
because they have done something rare
they have actually updated the copy
pasted property names and even more rare
this is something you almost never find
they’ve updated the copy-pasted
documentation so this my friends this is
not really copy paste this is more like
copy paste with a gold plating but still
temporal coupling is a great tool to
uncover surprises in our own code but we
can do even more with it we can use it
to analyze complete software
architectures and I’d like to show you
an example from micro-services why
micro-services because right now
micro services are all the rage and that
just means that tomorrow’s legacy
systems are going to be micro services
so this is what micro services look like
in PowerPoint
in reality they tend to look much more
like this so we are developers we have
learned we need to abstract away shared
functionality so perhaps we introduce a
shared communication library we notice
shared database access patterns so let’s
abstract away those as well and of
course we want to follow the
recommendation of providing a shared
service template so that all of our
services behave in the same way and
obviously we want to make our services
as easy to consume as possible so let’s
introduce a bunch of client libraries
now each one of those choices may well
be good what you have to watch out for
though is that in micro-services loose
coupling is king as soon as we start to
couple different services to each other
we lose most of the advantages of a
micro service architecture and are left
with an excess mess of complexity so
what I suggest is that we use temporal
coupling not on individual files but on
a service level and we look out for
surprising patterns like this where
multiple services change together due to
a shared library or even worse when
multiple services evolve together
like this and if you do a temporal
coupling analysis like that regularly
you will be able to detect such warning
signs in your architecture early so that
you can react on time so remember that I
started out this presentation by talking
about organizations and I would like to
take it a step further and actually
claim that most organizational problems
are mistaken as technical issues and the
main reason for that is because social
information is something that’s
invisible in the code it’s just not
there and in order to tackle those
problems we need to combine our code
with social information here’s one
approach this is a case study of another
open-source project this is the
development of the Clojure programming
language and this is a visualization
called fractal figures and fractal figures
work like this you consider each file
as a box and each programmer gets
assigned a color and the more that
programmer has contributed to the code the
larger their area of the box now you can
use fractal figures for a lot of
different things for example if we add a
color legend
we get a useful communication tool let’s
say that we join this project we want to
contribute to Clojure and we want to
contribute to the evaluation module in
the top left corner we see that that
code is written by Stuart Halloway so
if we have a question about that code
Stuart probably knows a lot about it
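The data behind such a figure, each developer's share of a file, can be sketched from version control as well. The example below is an illustration under assumptions, not the speaker's implementation: it assumes input from `git log --name-only --pretty=format:--%h--%an` and measures contribution by number of commits, whereas a real analysis might weight by lines added instead. All author and file names are hypothetical.

```python
from collections import Counter, defaultdict


def contributions(log_text):
    """Map each file to a Counter of commits per author.

    Assumes input from: git log --name-only --pretty=format:--%h--%an
    """
    per_file = defaultdict(Counter)
    author = None
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("--"):
            # Header looks like "--<hash>--<author name>".
            author = line.split("--", 2)[2]
        elif line and author is not None:
            per_file[line][author] += 1
    return per_file


def ownership(per_file):
    """Return {file: (main_author, main_author_share, number_of_authors)}."""
    summary = {}
    for path, authors in per_file.items():
        total = sum(authors.values())
        main, commits = authors.most_common(1)[0]
        summary[path] = (main, commits / total, len(authors))
    return summary


# Hypothetical authors and files for illustration only.
SAMPLE_LOG = """--a1--alice
core/eval.clj
core/reader.clj

--b2--alice
core/eval.clj

--c3--bob
core/eval.clj

--d4--carol
core/reader.clj
"""

if __name__ == "__main__":
    for path, (main, share, count) in ownership(contributions(SAMPLE_LOG)).items():
        print(f"{path}: main author {main} ({share:.0%}), {count} authors")
```

The main author suggests whom to ask about a file, and the author count per file is the number a visualization like this makes visible at a glance.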
we also see that Clojure in general is
written by the dark blue developer and
that’s Rich Hickey so if you
have a question about Clojure in general
well Rich Hickey probably knows a thing
or two about it but fractal figures are
useful even without the color legend
because now we want to look out for
surprising patterns like this where we
have 20 further different developers
contributing to the same piece of code
and the reason we want to look out for
that is because research has taught us
that the number of developers behind a
piece of code is one of the best
predictors of the number of quality
issues you will find and fractal figures
help you identify those modules at risk
for defects fractal figures also
explain a lot about our hot spots
sometimes I come across pretty old code
bases code that’s been around for 10-15
years and what I tend to find in those
code bases is that most of the code is
stable and then we have a number of
really red glowing hot spots in the
central parts of that code and when I
find something like that I always look
back in time and I see that those hot
spots have been around for years
so the question is of
course why hasn’t anyone done anything
about them
why haven’t they improved the code do
you know the reason why much existing
code is never improved the reason is
because the fractal figures look like
this so you have so many people that
work in that code all the time which
means you will impact the work of all
those people if you try to redesign that
piece and this leaves us in a very
unfortunate situation that I call
immutable design and please trust me on
this one I’m a functional programmer but
in this context there’s nothing good
with immutability and I find it ironic
that we cannot improve the code because
we are so many people working on it in
parallel and we have to be so many
people because we cannot improve the
code all right let’s move on we’ve
just talked about knowledge distribution
let’s turn an eye towards our blind
spots and I want to tell you a little
story about Paul Phillips that you see
here on screen Paul Phillips used to
work in the Scala code base
he worked on Scala for five years and
during those five years Paul Phillips
was the number one contributor to Scala
then two years ago Paul Phillips gave
this excellent presentation that I
really recommend and have linked there where
he announces his decision to step back
and stop contributing to Scala so for me
this is an excellent opportunity to see
what happens when a main developer
leaves so I did a study on knowledge
loss and this is two years after Paul
Phillips has left here’s what the
knowledge loss looks like in Scala you
see those red areas in this case they
don’t represent any hot spots no they
represent abandoned code that is code
that’s written by a developer who is no
longer part of the organization and this
is something that you of course can use
to reason about knowledge distribution
you see that large subsystem to your
left that’s something called a compiler
which may be important but you can also
use it a bit more proactively and look
for things like this where you have an
entire abandoned subsystem in this case
the read-eval-print loop so use it in
case you know that you’re planning some
features there you need to schedule some
additional time for learning because it
is a hugely increased risk to modify
code that we no longer understand all right
I’m at my final observation now I have
found that to succeed with maintainable code
bases we need to make it fun I work at a
company called Empear and we do most of
our development in Clojure and people
often ask me why did you choose Clojure
and I could of course tell them stuff
like yeah you know we do data analysis
and Clojure it’s an excellent
tool for data analysis the thing is
I didn’t know that back when we started
I was just lucky the reason we chose
Clojure has nothing to do with
that I picked Clojure because it looked fun I
wanted to learn the language and I think
that fun is a much underestimated
driver of design in fact fun is a much
underestimated motivator in the
software industry because fun is
virtually a guarantee that things get
done so if you work in a large code base
always remember to put the fun into it
even if you’re locked down in your
choice of technology and platform
there’s always a lot of supporting code
to write around it a lot of tasks to
automate so pick a mundane task use a
technology of your choice to automate it
and turn it into a learning experience
your code is going to thank you for it
so I’m done now and before I take some
questions I just want to take this opportunity and
say thanks a lot for listening to me and
please remember that Common Lisp is a
great language thanks
so thank you so the first question that
I had like five times how can I get this
how how can I get these metrics myself
on my own code base yeah so that’s a
question that I often get what kind of
tools do I use and well I actually
use my own tools and the reason I do
that is because when I started out with
this there were no tools available that
could do the kind of analysis
I wanted to do so I wrote my own ones
and last year I decided to focus
full-time on that so I now have my
startup Empear where we develop those
tools and we actually have some tools
available now and what will come up soon
is a service so that you can actually
sign up and get an analysis of all your
code from that service and we are
probably launching a preview quite soon
so sign up if you want to try that and
does that hook up against GitHub or
similar yes it does okay good so um
another question here is is number of
times that a file was modified really a
good measure because sometimes you might
within a day modify a file you know 60
times yeah so there were actually two
questions there first of all yes the
number of times the file has changed is
a really really good measure and it’s
actually backed by empirical research
the number of times a module has changed
is a better predictor than any other
metric you can mine from the code but
still it’s of course an important point
because there may be huge differences
in commit styles between authors on a
project and what I typically recommend
is that you try to use a uniform commit
style if you cannot do that there’s an
alternative metric that you can use
called code churn so instead of
calculating the number of commits you
calculate the amount of code that has
changed and that completely removes the
bias possibly introduced by commits
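Code churn as described in this answer can be sketched from `git log --numstat`, which prints added lines, deleted lines, and the path separated by tabs for each changed file. The sketch assumes the log was produced with `git log --numstat --pretty=format:--%h` and sums added plus deleted lines per file. Whether to count deletions is a choice the talk does not specify. The sample data is invented.

```python
from collections import Counter


def code_churn(log_text):
    """Sum added and deleted lines per file.

    Assumes input from: git log --numstat --pretty=format:--%h
    where each file line reads "<added>\t<deleted>\t<path>".
    """
    churn = Counter()
    for line in log_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # commit header or blank line
        added, deleted, path = parts
        if added == "-" or deleted == "-":
            continue  # binary files report "-" instead of line counts
        churn[path] += int(added) + int(deleted)
    return churn


# Invented sample log for illustration only.
SAMPLE_LOG = "\n".join([
    "--a1b2c3",
    "10\t2\tsrc/parser.py",
    "300\t0\ttests/test_cli.py",
    "",
    "--d4e5f6",
    "5\t5\tsrc/parser.py",
    "-\t-\tdocs/logo.png",
])

if __name__ == "__main__":
    for path, lines in code_churn(SAMPLE_LOG).most_common():
        print(f"{path}: {lines} lines changed")
```

Compared with counting commits, this ranking is insensitive to how finely an author slices the work into commits, at the cost of being less intuitive to reason about.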
and that’s how you extract your metrics
yes it is I use those two and I tend to
again stick with the number of commits
if possible because it’s such a
simple metric it’s so intuitive to
reason about it so and the last question
here can you recommend a good place in
the town centre to party afterwards
to go to party yeah I know a lot of
really good pubs down in the south of
Sweden where I come from that won’t help
you so no sorry okay I delegate you
that yeah well I’m not sure I could help
you either Thanks
