Press "Enter" to skip to content

GOTO 2017 • A Crystal Ball to Prioritize Technical Debt • Adam Tornhill


Good morning everyone, and welcome to this talk where we're going to look for a crystal ball that will help us not only identify but also prioritize technical debt. Let's jump right in and see what this is all about.

My starting point is Martin Fowler's well-known definition of technical debt, and what I want us to focus on is the key aspect: just like its financial counterpart, technical debt incurs interest payments. This is something I've found that we in the industry tend to misuse and sometimes misunderstand, and the consequence is that technical debt as a concept is not as useful or helpful as it could be.

Let me show you what I mean by having you do a small experiment. I'm going to put up a piece of code, and your task is to judge its quality. All right, ready? You do look ready, so have a look at this beauty. How many of you think this is excellent code? How many would let this pass a code review? Looks like it's just the two of us. Of course this is not good code, this is not the kind of code we write, and of course we would never write anything like that ourselves. But is it a problem? Is it technical debt? Without more context we just cannot tell, because it's not technical debt unless we have to pay interest on it, and interest, interestingly enough, is a function of time. So in order to identify technical debt, we need to get a time dimension into our code. Where can we get such a thing? In a few minutes I'm going to show you a technique.
But before we go there, I would like to share a little story with you. As part of my day job I go to different companies and analyze their code bases, and a while ago I worked with a company that, prior to my arrival, had used a tool capable of quantifying technical debt. They had taken this tool and thrown it at their 15-year-old codebase, and the tool reported that on this 15-year-old codebase they had accumulated 4,000 years of technical debt. 4,000 years! Just to put that into perspective: 4,000 years ago, that was when Moses parted the Red Sea. So 4,000 years of technical debt may well be accurate, but it's not particularly helpful. I mean, where do we start? Is all debt equal?

I actually think it's a mistake to try to quantify technical debt from code alone, and the reason I say that is because most technical debt isn't even technical. This is something I've found over and over again: we as developers tend to mistake organizational problems for technical issues, and the consequence is that we start to address the symptoms instead of the root cause.
To give you some quick examples, perhaps you have seen this: you come into a new company, someone helps you on your first day and tells you, "Be a bit careful with this part of the code, it's a complete mess, it's really hard to understand." Or my personal favorite, you might have heard this one: "We spend a lot of time merging different feature branches, we need to buy better merge tools." Have you heard that one? Some of you, yeah. As you will see, there may be an organizational side to that, and we're going to discuss it later.

I think the main reason we make misattributions like this is that the organization that builds the code is invisible in the code itself. That's unfortunate, because there is a strong link between technical debt and the organization, and I'm pretty sure we cannot address technical debt without taking a holistic view of it, and that holistic view has to include the social dimension.

Based on what I've told you so far, I put together a kind of wish list of the information I think we need in order to address technical debt. The first thing, our starting point, is to find out which part of the code has the highest interest rate. The second aspect is: does our architecture support the way our system evolves, are we working with or against our architecture? And the third, final and perhaps most important point: how does it look from a social perspective? Are there any productivity bottlenecks, you know, those pieces of the code where five different teams have to work together all the time?
Now, please think back to the project you work on right now. Do you think this kind of information would be useful to you in your day job? I see some of you nodding, that's great, you're going to enjoy the rest of the session. What I want you to focus on here is that none of this information is available in the code itself. Most importantly, we lack some critical information: we lack a time dimension, and we miss social and organizational information. So how can we get that? Where's our crystal ball?

It turns out you already have one, we're just not used to thinking about it that way. I'm talking about our version control data. This is a simple example from Git, and version control is amazing in that it's actually a behavioral log: it tells us exactly what parts of the code we have changed, at what point in time. So we get the time dimension, and based on that time dimension we can calculate a lot of interesting statistics and frequencies. Even more important, version control data is social data: we know exactly which programmer worked in which parts of the code. So let's embrace version control data and see how it helps us find the code with the highest interest rate.
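(To make that concrete: here is a minimal sketch, in Python, of how you could mine change frequencies from a Git repository yourself; it simply counts how many commits touch each file. It is an illustration of the idea, not the speaker's actual tooling.)

    import subprocess
    from collections import Counter

    def change_frequencies(repo_path="."):
        """Count how many commits have touched each file, straight from the Git log."""
        log = subprocess.run(
            ["git", "-C", repo_path, "log", "--name-only", "--pretty=format:"],
            capture_output=True, text=True, check=True,
        ).stdout
        return Counter(line for line in log.splitlines() if line.strip())

    # Usage: print the ten most frequently changed files.
    for path, commits in change_frequencies().most_common(10):
        print(f"{commits:5d}  {path}")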
My starting point here is a concept I call hotspots. Let me show you how that works. This is a hotspot analysis of a well-known Java system, Tomcat. Has anyone worked with or used Tomcat? Yeah, a lot of you, cool. Tomcat is a mature system with a rich history, and I'm going to show you how to interpret this visualization. Have a look at those large blue circles that blink right now: each one of them represents a folder in the Tomcat repository, and you see that there are other, smaller circles inside it; those are subfolders. So this is a hierarchical visualization that follows the folder structure of your code. It's also interactive, which I think is important once we get to systems of scale, so you can zoom in on the level of detail you're interested in. If you do that, you will see that each file is represented as a circle, and the circles have different sizes, because the size is used to represent the complexity of the corresponding code.

Now, how do we measure code complexity? We have a bunch of different ways. I guess most of you are familiar with things like McCabe's cyclomatic complexity, and we also have things like Halstead's volume metrics that we could use. It doesn't really matter what kind of metric you use, because what they all have in common is that they are equally bad: complexity metrics are terrible at measuring complexity. So what I tend to use is the simplest thing that could possibly work, which in this case is the number of lines of code. It has a very high correlation to things like cyclomatic complexity, and besides, this is not the interesting part of the data. The interesting part is: what's the interest rate? So what we're going to do is pull out data from the version control system and look at the change frequency of that code, how often we actually work in each part of it. When we combine those two dimensions, we're able to identify complicated code that we have to work with often, and those are our hotspots.
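(As a sketch of how those two dimensions could be combined, reusing the change_frequencies helper from above: measure size in lines of code as the complexity proxy and rank files by the product of the two. The scoring is a simple illustration, not the exact formula any particular tool uses.)

    import os

    def lines_of_code(path):
        """Crude complexity proxy: number of non-blank lines."""
        with open(path, errors="ignore") as f:
            return sum(1 for line in f if line.strip())

    def hotspots(change_counts, repo_path="."):
        """Rank files by change frequency times size; the top entries are the hotspots."""
        scored = []
        for rel_path, commits in change_counts.items():
            full_path = os.path.join(repo_path, rel_path)
            if os.path.isfile(full_path):  # files deleted from the current version are skipped
                scored.append((commits * lines_of_code(full_path), commits, rel_path))
        return sorted(scored, reverse=True)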
Let's return to Tomcat, and you see that in Tomcat the hotspots are relatively few. Now, just because something is a hotspot doesn't mean it's a problem. What it means is that this is the most important part of the code when it comes to technical debt, so you want to inspect that code and ensure that it's nice: it's clean, it's easy to understand, it's easy to evolve, because that's where you spend most of your time. However, in practice, more often than not I've found that the opposite is true, and that tends to make hotspots great refactoring candidates. There's a fascinating reason why that's the case, and I want to show you why by borrowing a quote from the great George Orwell. Perhaps those of you who have read George Orwell remember that he said that all code is equal, but some code is more equal than others. Now, what would he mean by that statement?
Well, let me show you. Please have a look at the following graphs. They all show the same thing: on the x-axis you have each file in the system, sorted according to its change frequency, that is, how many commits I make to each file, and the number of commits is what you see on the y-axis. Now, if you look at the three examples, you'll see that they represent three radically different code bases. They're implemented in different languages, they target different domains, they use completely different technologies, they're developed by different organizations, they have different lifetimes; everything is different. Yet do you see any patterns? That's right, they all show exactly the same distribution: a power-law distribution. This is something I've found in every single code base I've analyzed; it seems to be the way software evolves. And it's important, because it gives us a tool to prioritize. What it means is that in your typical code base, most of your code is in the long tail: it represents code that you rarely, if ever, need to work with. Most of our development activity is focused on a relatively small part of the code, and that's where we want to focus improvements, in order to be pretty sure that we get a real return on that investment.
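(If you want to see that long tail in your own code base, one rough way is to ask how large a share of all changes lands in the most frequently changed files; a small sketch, again building on the change counts mined earlier:)

    def change_concentration(change_counts, top_fraction=0.05):
        """Share of all file changes that hit the top slice of most-changed files."""
        counts = sorted(change_counts.values(), reverse=True)
        top_n = max(1, int(len(counts) * top_fraction))
        return sum(counts[:top_n]) / sum(counts)

    # In code bases with the power-law shape described above, even the top 5% of
    # files typically account for well over half of all changes.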
Hotspots are a great starting point, but sometimes they are not enough, and I want to show you an example where hotspots don't quite work, with a different case study. This is a hotspot analysis of the .NET Core runtime from Microsoft. When I did this analysis, about six months ago, most of the hotspots were in the JIT package, the just-in-time compilation support, but you also see there's a kind of satellite hotspot here, something called gc.cpp. That's the .NET garbage collector. Now, gc.cpp looks quite innocent here on screen, but that's just because of the scale: .NET Core is huge, what you see on screen is approximately four million lines of code, so gc.cpp is a relatively big file. How big? I had no idea, so let's look at it on GitHub. Wow. Pretty big, right? We need to look at the raw file, and when we do that we see that gc.cpp is a file with 37,000 lines of C++. 37,000 lines of C++! Let me tell you, I used to be a C++ developer, I did C++ for ten years, and yeah, I know what you're thinking: that's ten years I'll never get back. I kind of agree, but at the same time it gives me a lot of respect for 37,000 lines of C++; it's the stuff nightmares are made of.

Besides, how useful is it to know that gc.cpp is a hotspot? Let's pretend I come to your organization and say: hey, look, I found your worst maintenance problem, I've found the technical debt with the highest interest rate, just rewrite this file and everything is fine again. It would be accurate, but not particularly helpful, right? So we need to do better, because how do we act upon such information? The answer is: we don't.
So what I suggest is this: when we identify those big hotspots, we take the source code and parse it into chunks, the individual methods, and then we look at where each commit hits. It's exactly the same measure, but on a function and method level. Let me show you how that looks on gc.cpp. A hotspot analysis on function level reveals that the number one hotspot is a function called grow_brick_card_tables. That function consists of 332 lines of code, which is kind of a lot for a single function, isn't it? But it's definitely much less than the 37,000 lines of the total file, and it's definitely less than the four million lines of code of the total system. Even more important, we are now at a level where we can act upon the information and do a focused refactoring, based on how we have worked with the code so far.
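(A rough way to approximate that function-level view with plain Git is to let git's line-log follow a single function and count the commits that touched it. This only works where Git's default function-header detection applies, such as C and C++ sources, and it is a sketch of the idea rather than the parser-based analysis described in the talk; the gc.cpp path in the usage comment is illustrative.)

    import re
    import subprocess

    def function_revisions(file_path, function_name, repo_path="."):
        """Count the commits that modified a single function, via git's -L line-log."""
        log = subprocess.run(
            ["git", "-C", repo_path, "log",
             f"-L:{function_name}:{file_path}", "--pretty=format:commit %H"],
            capture_output=True, text=True, check=True,
        ).stdout
        hashes = re.findall(r"^commit ([0-9a-f]{40})$", log, flags=re.MULTILINE)
        return len(set(hashes))

    # e.g. function_revisions("src/gc/gc.cpp", "grow_brick_card_tables")  # path illustrative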
We're going to return to the hotspots, but before we do that I would like to try to answer a question: how do you get to a file with 37,000 lines of C++? I really can't let that go. So here's a book recommendation: this is the best book I know on how organizations fail. It's written by Diane Vaughan, and the interesting thing is that she's not a software developer, she's a sociologist, and she used the Challenger accident to coin her theory of the normalization of deviance.

How many of you remember the space shuttle Challenger? About half of you, maybe. So let's do a quick recap of what happened. This is the Challenger on its final launch, that's the actual space shuttle, and what you see here is a huge object, a solid rocket booster. It's so big that it's delivered in three separate segments which are then assembled. And what you see here is a puff of gray smoke. Not a good thing; it's not supposed to be there. What actually happened was that there's a joint between two different segments, and that joint failed to seal. The consequence was that hot rocket gases could escape and impact the structure of the Challenger, so the spaceship basically disintegrated once it got into the air.

When Diane Vaughan writes about this, she tells the story of how, already back in the '70s, when they made the first design tests on the Challenger's solid rocket boosters, it turned out that the actual performance of those booster joints deviated from the predicted performance. What do you do in that case? Well, this is NASA, so they formed a committee, they discussed the problem, and they decided to pass it off as an acceptable risk. Years later, during the first in-flight tests, it was again revealed that the actual performance deviated from the predicted performance. Again they discussed the problem and passed it off as an acceptable risk. And it went on and on like that for years. On the eve of the launch there were some very real concerns raised by some of the engineers working on this stuff, due to the unusually cold temperatures in Florida at the time. Again it was discussed and passed off as an acceptable risk, and the consequence was a tragic loss of lives.

This is what Diane Vaughan calls the normalization of deviance: each time we accept a risk, the deviation becomes the new normal, we get a new point of reference. And this is interesting because it's not really about spaceships, it's about people, and we have plenty of normalization of deviance in software too. Let me give you an example. Let's say you join a new organization and you inherit a file with 6,000 lines of code. Wow, that's a lot. But if you work with it long enough, it becomes your new normal; you start to get to know those lines of code, and besides, seriously, what difference would a few extra hundred lines make? So soon we have seven thousand lines of code, then eight thousand, and so on. So we need to find a way to identify and stop the normalization of deviance.
Here's one technique that I've found really useful for software, something I call complexity trends. The way it works is basically that when I find a hotspot, I pull out all historic revisions of that code from version control, measure the code complexity at each point in time, and plot a trend like this. All systems tell stories; we just need to learn to read them. So here's an example: the blue line shows the growth in lines of code, and the red line is one of the complexity metrics, like cyclomatic complexity. You see that there was some kind of refactoring down here, back in April, but somehow they seem to have failed to address the root cause, because pretty soon the complexity starts to sneak back in, before a dramatic increase here at the beginning of the year. So this is a warning sign that we need to take a step back and consider what's actually happening and the direction in which this code grows. The normalization of deviance is one of the reasons why whistleblowers are so important in organizations, and I've found that complexity trends make great whistleblowers in code.
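(Here is a minimal sketch of how such a trend could be computed: walk the historic revisions of a file and, at each revision, record the number of lines plus a crude complexity proxy, here the total leading indentation. It assumes the file has kept its path over time and is only meant to show the shape of the analysis.)

    import subprocess

    def complexity_trend(file_path, repo_path="."):
        """(date, lines of code, indentation-based complexity) for each revision of a file."""
        revisions = subprocess.run(
            ["git", "-C", repo_path, "log", "--reverse", "--date=short",
             "--pretty=format:%h %ad", "--", file_path],
            capture_output=True, text=True, check=True,
        ).stdout.splitlines()
        trend = []
        for revision in revisions:
            sha, date = revision.split(maxsplit=1)
            shown = subprocess.run(
                ["git", "-C", repo_path, "show", f"{sha}:{file_path}"],
                capture_output=True, text=True,
            )
            if shown.returncode != 0:  # e.g. the file had another name in that revision
                continue
            lines = [l for l in shown.stdout.splitlines() if l.strip()]
            indent = sum(len(l) - len(l.lstrip()) for l in lines)
            trend.append((date, len(lines), indent))
        return trend  # plot both series over time to see the trend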
To sum up this first part: hotspots help us identify the code with the highest interest rate, and the reason they work is because all code isn't equal.

What I want us to do now is climb up the abstraction ladder, look at the architectural level, and try to find out how well the architecture supports the way our system evolves. As you all know, you cannot really talk about architecture without mentioning art, so does anyone recognize this piece of work? None of you? That's a good thing, because it doesn't exist as a fixed piece: this is a liquid crystal light projection by the artist Gustav Metzger, an object that changes all the time, so this work of art looks completely different today than it did yesterday. It's fascinating to watch. The interesting thing is the way it changes: any new form necessitates the destruction of an existing form. And this is exactly how code is supposed to evolve. Successful code will evolve and it will change, and that's a good thing. However, the reason I call it auto-destructive is that changes and new features often become increasingly difficult to implement over time, and some systems just seem to reach a kind of tipping point beyond which they become virtually impossible to maintain. So how can we catch that, how can we detect it? Well, we have our hotspots, but I would say the file level makes it pretty hard to reason about the macro aspects of the system, so we'd like to raise the abstraction level of the analysis.
One way of doing that is this: instead of measuring hotspots on a file and function level, let's introduce logical components, by taking a set of files, aggregating all contributions to them, and putting a logical name on the group. Your logical components can follow the folder structure of your code if you want, and that's quite typical, but you can use any kind of mapping you like. The only important thing is that whatever grouping you introduce, those logical components should carry architectural significance in your context. Once we do that, we can start to reason about how our system evolves.
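(A minimal sketch of such a mapping: assign each file to a logical component by a path prefix, or any other grouping that carries architectural meaning for you, and aggregate the file-level change counts onto the components. The component names and prefixes below are made up for illustration.)

    from collections import Counter

    # Hypothetical mapping from path prefixes to architecturally meaningful names.
    COMPONENTS = {
        "services/recommendations/": "Recommendations",
        "services/payments/": "Payments",
        "webapp/": "Web UI",
    }

    def component_of(path):
        for prefix, name in COMPONENTS.items():
            if path.startswith(prefix):
                return name
        return "Other"

    def component_hotspots(change_counts):
        """Aggregate file-level change frequencies up to the logical components."""
        totals = Counter()
        for path, commits in change_counts.items():
            totals[component_of(path)] += commits
        return totals.most_common()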
Let me show you a specific example from a service-oriented architecture. What I would do in this case is introduce a logical component for each service, and then measure hotspots at that level, the level of services. When we do that, we get help answering one of the most important questions we have about microservices. And this is a serious question, serious in the sense that flame wars have been fought over it and friendships have ended: how big is a microservice supposed to be? 100 lines of code? 1,000? 2,000? We know it's a bit misdirected to reason about service size in terms of lines of code, because what we're actually after is business capabilities; we want each service to implement a single business capability. So the key here is cohesion.

Once we start to measure hotspots on this level, we can identify hotspot services. This is some real data from a real system, and we see that the number one hotspot is something called the recommendation service: we have made a lot of changes to that piece of the code, and it now consists of around five thousand lines of code. Now, just as we calculate complexity trends for individual files, we can aggregate the contributions from multiple files and get a complexity trend for the whole service. We see that this service had some rapid initial development and now seems to have stabilized at that higher level of around five thousand lines of code, with a lot of changes all the time but without any significant growth. This kind of data doesn't really solve any problems; what it does is help you ask the right questions, and the question here is whether a service of five thousand lines of code that changes all the time really represents a single business capability. Perhaps it's a service that's better off split into two parts, or three.
But we can do even more with this kind of data, and I would like to show you how by introducing a concept called temporal coupling, or change coupling. Here's how it works. Let's say we have three different subsystems. The first time we make a change to the system, we modify the fuel injector and the diagnostics module together. The next time, we modify something else. The third time, we're back to modifying the fuel injector and the diagnostics module. Now, if this is a trend that continues, there has to be some kind of relationship between those two subsystems, because they keep changing together. This is something we can use to reason about our architectures.
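(Here is a rough sketch of the calculation: for every commit, note which components it touches, then for each pair of components compare how often they change together with how often they change at all. The thresholds and weighting in real tools are more refined; this just shows the shape of the measure.)

    import subprocess
    from collections import Counter
    from itertools import combinations

    def temporal_coupling(repo_path=".", component_of=lambda path: path):
        """Degree of co-change (0..1) for every pair of components that change together."""
        log = subprocess.run(
            ["git", "-C", repo_path, "log", "--name-only", "--pretty=format:__commit__"],
            capture_output=True, text=True, check=True,
        ).stdout
        commits, touched = [], set()
        for line in log.splitlines():
            if line == "__commit__":
                if touched:
                    commits.append(touched)
                touched = set()
            elif line.strip():
                touched.add(component_of(line))
        if touched:
            commits.append(touched)

        changed, co_changed = Counter(), Counter()
        for components in commits:
            changed.update(components)
            co_changed.update(frozenset(pair) for pair in combinations(sorted(components), 2))

        return {tuple(pair): count / min(changed[c] for c in pair)
                for pair, count in co_changed.items()}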
I want to show you an example from a few different architectural styles, and I would like to start by talking about layers. Why layers? Well, because layered architectures have been so popular that a lot of what we call legacy systems are implemented in layered styles. This is basically the canonical set of building blocks of a Model-View-Controller. I analyze code bases for a living, and when I come to a company I always ask what kind of architectural style they have. They often tell me: yeah, we're using Model-View-Controller, or Model-View-Presenter, or Model-View-ViewModel, or some variation of those. And then you look at the code, and that's never what the code looks like. The reason is that we all know this is not enough: we need a services layer too, of course we need a services layer, and below that we must have a repository layer. And the reason we must have a repository layer is, well, I actually have no idea, so let's call it a best practice, shall we? Then we need an object-relational mapper below that, because we don't want to access the SQL directly, and then maybe we have some SQL. And we all know that in reality it's quite likely we have a few extra layers: we may have a view helper or something like that.

Now, this is interesting, because if we do a temporal coupling measurement at this level, where we consider each layer a logical component, what do our change patterns actually look like? Here's what they look like. I have found, and this is real data from real systems, that somewhere between 30 and 70 percent of all commits ripple through the entire architectural hierarchy. That's interesting because it impacts what we can do with the system. What it means is that few changes are local. An architecture like this is basically built around a technical partitioning, into technical building blocks, but the kind of work we do is feature-oriented, it's end-user-oriented, so there is a conflict between those two styles, and that conflict will keep getting in your way.
Now, I've found that layered architectures may work really well for small teams. Where I've seen them work less well is when the team is a bit larger, and I think the tipping point is as low as four, maybe five people. So if you have a large organization, this may well become a real problem, because now you have all developers working in all parts of the code all the time. Not only does that mean quite an increased coordination overhead, it also puts you at risk of things like unexpected feature interactions. I would also claim that one of the original motivations for layered architectures was separation of concerns, but if most of our commits have to ripple through all the layers, then perhaps, just perhaps, it's the wrong concerns we are separating. And I think this is one of the main reasons for the recent popularity of things like microservices.

You all recognize this, right? Because that's what microservices look like in PowerPoint. In reality we know that microservices are much, much more complex: we share building blocks, we have service templates, we have client APIs, we may have cross-cutting concerns like diagnostics and monitoring, and all that stuff. That additional complexity comes with a risk, and that risk is coupling, which is like the cardinal sin in microservices: tight coupling. That's something we want to avoid, because the moment we start to couple different services to each other, we lose a lot of the advantages of microservices and we're left with an excess mess of complexity. So if we have an important architectural principle like that, why don't we measure it?
Again, let's consider each microservice a logical component, and let's measure temporal coupling at that level, so that we can detect patterns like this: change patterns that ripple across multiple repositories. If we start to use that from the very beginning, we can use it as an early warning system and prevent those things from happening.
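(Across separate repositories there is no shared commit to correlate on, so one way to approximate this, purely as a sketch, is to group commits by a ticket identifier in the commit message, assuming your messages carry one; the naming conventions below are hypothetical.)

    import re
    import subprocess
    from collections import defaultdict

    TICKET_ID = re.compile(r"\b[A-Z]+-\d+\b")  # assumed convention, e.g. "PAY-1234"

    def services_per_ticket(repos):
        """Map ticket id -> set of services whose repositories were changed for it."""
        tickets = defaultdict(set)
        for service, path in repos.items():
            subjects = subprocess.run(
                ["git", "-C", path, "log", "--pretty=format:%s"],
                capture_output=True, text=True, check=True,
            ).stdout.splitlines()
            for subject in subjects:
                for ticket in TICKET_ID.findall(subject):
                    tickets[ticket].add(service)
        # Tickets that touch many services hint at shotgun surgery.
        return sorted(tickets.items(), key=lambda kv: len(kv[1]), reverse=True)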
And if we do that, we can avoid something I call the microservice shotgun surgery pattern. What's the microservice shotgun surgery pattern? It's basically when you want to tweak a single business capability and end up modifying five different services. There are several reasons why that may happen. The ones I've seen most often are, first of all, that those different services may share code that itself isn't stable from an evolutionary perspective. Another reason I've seen is that some services are just leaky abstractions: other services come to depend on their implementation details, which makes them change together. I've also seen that this tends to be a bit more common when the same team is responsible for several services; it seems to become much easier to send more directed information and assume more about the receiver.
receiver so what tools do I use to
analyze these kind of architectures and
to do a hotspot analysis and temporal
coupling analysis well
when I’ve wrote my book like three or
four years ago with your coders crime
scene the warrant really on the tools
available that could do the analysis I
wanted to do so I put together my own
tool suite it’s available on my github
account it’s open source just download
it play around with it it’s a pretty
good starting point if we want to start
to get a feel for what these kind of
analyzers can do to you you can do for
you so what I working right now is
something called code scene that you can
have a look at that code see an i/o I
have a bunch of different case studies
there it’s free to use for open source
projects so if you work in that space
have a look at it there are also some
interesting options from the academic
space my personal favorite is this one I
call it evolution radar which is
particularly strong on change coupling
it’s a really really good tool for that
and finally if you want to create your
own tools I recommend that you have a
look at the moose platform so the moose
platform is a platform for building yak
code analyzers moose is also the best
excuse I know to learn to program in
small talk it’s a wonderful language so
To sum up this part: we have seen that hotspots scale to all different levels, we can use the same measure on a function level and a file level as on the architectural level, and we have seen how to use temporal coupling to evaluate our architectures and how they evolve.

Now I want to spend the final minutes of this presentation exploring the social side of our code. Let me start by asking you: how many of you develop software as part of a team? That's like 99.9 percent of you, so most of you. And that's what we have to do, because we take on larger and larger problems, our systems become more and more complex, and we simply cannot work on them alone, so we need to get together in teams to achieve what we want to achieve. However, what I want you to know is that teamwork is never free; teamwork always comes with a cost.
That cost is what social psychologists call process loss. Process loss is a concept the social psychologists have taken from the field of mechanics, and the idea is that just like a machine cannot operate at 100% efficiency all the time, due to things like friction and heat loss, neither can a team. Let me show you how this model works. We have a bunch of individuals, and together they have a potential productivity. That's never what we get out of a team; the real productivity is always somewhat smaller, and part of the potential is simply lost. How is it lost? Well, the kind of process loss you get depends on the task you're doing, but I would say that in something like software, most of the process loss is coordination and communication overhead. You may also have softer aspects, like motivation losses, which are a very real thing on several real-life projects. The bad news is that you can never avoid process loss entirely, you can never eliminate it, but what we can do is to minimize it, of course.
Let's look at another common source of process loss. This is another concept from social psychology, called the diffusion of responsibility, and it's something you can experience if you're unfortunate enough to witness an emergency or an accident. It turns out that the more bystanders there are, the more people present when something like that happens, the less likely it is that any individual will offer help. Kind of scary, isn't it? There are several reasons for that; one common reason is of course that the more people are present, the more we just assume that, well, someone else will help instead. And we have plenty of diffusion of responsibility in software too.

Now I'm going to talk a little bit about a very, very controversial topic, the kind of topic you never talk about at a conference because it's stupid to do that, so I'm going to do it. Believe it or not, I'm really scared to say this, but I'm going to talk about code ownership. Wow.
I want to talk about it because I want to make it clear that when I say code ownership, I don't mean ownership in the sense of "hey, this is my code, stay away". No, not at all. I mean code ownership in the sense that someone takes personal responsibility for a piece of code, for its quality and for its future. That someone could be an individual, it could be a pair, or it could be a small team within a larger organization. Things like that help us minimize the diffusion of responsibility, and if we have an important principle like that, we should try to supervise and measure it.

Here's one way of doing that. Since we're in version control wonderland, we have access to all that social data, so we know exactly which individual worked on which parts of the code. What we do now is aggregate individuals into teams, because that's the interesting level to operate and measure on. So let's see what a diffusion-of-responsibility metric looks like. It's the same kind of visualization as we saw earlier with the hotspots, only now the color carries a different meaning: the color shows the amount of inter-team coordination. The more often different teams work in the same parts of the code, and the more fragmented their development efforts, the more red the corresponding module and file.
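(A sketch of one such measure: map each commit author to a team, which is information you have to supply yourself, and count for every file how many distinct teams have contributed to it; the files touched by the most teams are candidates for coordination bottlenecks. The author-to-team mapping below is hypothetical.)

    import subprocess
    from collections import defaultdict

    # Hypothetical author-to-team mapping; in practice this comes from your organization.
    TEAM_OF = {"alice@example.com": "Red team", "bob@example.com": "Blue team"}

    def teams_per_file(repo_path="."):
        """For every file, the set of teams whose members have committed to it."""
        log = subprocess.run(
            ["git", "-C", repo_path, "log", "--name-only", "--pretty=format:@%ae"],
            capture_output=True, text=True, check=True,
        ).stdout
        teams_by_file, team = defaultdict(set), None
        for line in log.splitlines():
            if line.startswith("@"):            # author line for the next commit
                team = TEAM_OF.get(line[1:], "Unknown")
            elif line.strip():                  # file changed in that commit
                teams_by_file[line].add(team)
        return sorted(teams_by_file.items(), key=lambda kv: len(kv[1]), reverse=True)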
Now, how do we use this data? Say we find an area with a lot of potential process loss, a lot of excess parallel development by lots of different teams. How do we react to that? Well, it's really hard. There are in general two things I've seen work. The first is that you start to investigate that part of the code from a technical perspective, because you will find that code changes for a reason, and the reason that part of the code attracts so many different teams is probably that it has too many reasons to change: it has too many responsibilities. You will see that if you take that code, identify those responsibilities, and split it according to them, not only do you get a more cohesive system, you also get a system that helps you minimize the coordination needs between different teams. Another thing you may find from time to time is that your organization lacks a team to take on a shared responsibility, so you may actually have to introduce a new team to take on that ownership.

But we can do so much more with version control data. We can start to measure things like Conway's law. Here's an example. This time the contributions of each team are marked up with different colors, so each team has a unique color, and this is a package-by-feature architecture, so basically each circle represents one feature. From the perspective of Conway's law this looks absolutely wonderful, doesn't it? Each team works in perfect isolation, with a minimum of coordination overhead between them. Is that the way we want to go? I'm not sure, because, I have to make a confession here: all the data you have seen so far is from real-world systems, except this one. I had to fake it, because I've never seen that happen in reality. And I don't think it's a good idea to go all that way. I'm not going to talk so much about it here, but I think that when we separate teams too much, we run the risk of falling into a lot of different social traps that may be more expensive than the coordination itself. Please ask me about that afterwards and I'll be happy to talk about it.
But now I want to show you a more realistic example; the next one is real data. This is an organization I worked with a couple of years ago, and what they did was decide to introduce feature teams. They took their existing organization and sliced it into twelve different teams, and then they worked in sprints: at the beginning of each sprint each team got assigned a number of tasks, and then they got to work on the codebase. This is what the contribution patterns looked like after one month. Do you see any patterns? Neither do I. This is basically collective chaos, and the reason I say that is that it's a coordination nightmare: you have basically twelve different teams working in the same parts of the code all the time, for different reasons. Not only that, you need to start to put processes and stuff on top of it just to be able to control the quality at all, and that's going to be really expensive. Another thing that's going to be expensive is that in this case you miss synergies between different features, that is, you miss opportunities to simplify not only the problem domain but also the solution domain. So please align your architecture and your organization; your code is going to thank you for it.
Now, I hope you have enjoyed this journey through the fascinating field of evolving code. Ultimately it's all about writing better software: software that can evolve under the pressure of new features, novel usages and changed circumstances. I'm pretty sure that writing code of that quality is never going to be easy, so we need all the support we can get, and I hope this session has inspired you to investigate the field in more depth. If you want to read more about this, I have two books on the topic: Your Code as a Crime Scene, and my new book, Software Design X-Rays, available from the Pragmatic Bookshelf now. I also blog a lot about this on my site, where I have a lot of case studies from different open-source systems that you can read about. Before I take some questions, I'd like to take this opportunity to say thanks a lot for listening to me, and may the code be with you. Thanks.
Thanks. Yeah, thanks a lot for the talk. There are some questions that came in through the app, and one of them is, I think, quite an interesting and common one: what's the best way to sell the necessity of code refactoring to stakeholders and management?

Sorry, one more time? What's the best way to sell the necessity of code refactoring to stakeholders and management? Oh yeah, that's a good one. It's actually not that hard. What I think we need to do is bridge the gap between us developers and management, and the way to do it is by ensuring that we speak the same language. I think it's mostly our responsibility as developers to give management a glimpse into our world. Many managers are non-technical, they don't code, perhaps they never have, so what I've found is that visualizing this stuff really helps. Hotspots tend to be quite intuitive; most managers understand that yes, this is really where we work all the time. What they are afraid of, of course, is making a decision like "all right, let's go ahead, we'll have a refactoring budget to improve this stuff", and then not getting any effect out of it. If you show them real data, they can base their decisions on data, which makes them feel much more comfortable. I've also found that visualizing things like complexity trends helps a lot, because what I find in practice is that when I identify a hotspot in a large organization and look at the code and see that, yeah, this is really tricky, we really need to do something here, then you look back in time and you see that this code has probably been a problem for years. Trends are really good at visualizing that: you can see that, all right, you worked most of your time in this part of the code, and as you did, it became more and more complicated. All managers understand that that's not a good thing. So that's my recommendation: put numbers on it instead of just relying on a gut feeling, and visualize it.
Another one: can you give examples of process gains, the opposite of process loss, for example when a developer works together with a domain expert on a domain-specific development project? Yes, of course you can, and we do have a lot of process gains that way, but I'm sorry to say that's not something I can measure right now. But yes, it's definitely a real case.

You recommend limiting cross-team coordination; what would such a process look like? I think there's an important distinction to make here, because what we want to do is basically minimize the coordination need between different teams; I think that's really fundamental. At the same time, we don't want to isolate teams, because then we run into a lot of strange social traps, like fundamental attribution errors, and we may get long lead times between the teams. So we need to make a distinction between what I call the operational boundaries and the knowledge boundaries. Your operational boundaries, the parts of the code where you spend most of your time, should be relatively small, and your team should carry a meaning from an architectural perspective; I think that's important. But your knowledge boundaries have to be much, much broader. There are several ways of achieving that. One is to start inviting people from other teams to your code reviews and design workshops, to get to know them and exchange context. Another thing you can do is encourage people to rotate teams: not everyone wants to rotate, but if they want to do it, they should be free to. And the third thing, which I think works really well, is an open-source-style ownership model, where a team owns a piece of code in the sense that it takes responsibility for it: anyone can make changes, but the team has the final say on what goes into the code and what doesn't. That actually works quite well, because it takes away the bottleneck: if I need to make a change to this part of the code, I can do it, I don't have to wait for someone else. So whatever you do, make it a conscious and deliberate decision, because this is not something that will happen automatically. I hope I managed to answer the question.
There were a lot of questions about the tools you were using, the ones you showed on the slides. Which one did you actually use for the visualizations in your presentation? Oh yeah, I used CodeScene for that, because that's kind of my life these days. But for these visualizations you can use whatever tool you want; I started out just using Excel, and you can actually get pretty far with that. D3 is wonderful at visualizing data like this, and that's what CodeScene uses for most of its visualizations.

Isn't focusing on hotspots making them even hotter? It depends on what you do with them, of course. In the short term, when you start to refactor a hotspot, it may indeed lead to increased activity in that hotspot. But what you typically find is that refactoring a really scary hotspot is not something you do like "all right, let's refactor it, let's start this week and finish in two months"; it has to be an iterative process, where you make smaller and smaller improvements over time. So hotspots should actually cool down over time, and many hotspots just go away, because they don't exist in the new design, they're represented by different design elements. And during your refactorings you should see a noticeable drop in complexity, if you start to measure trends like that. This ties back to the question I got about how to sell this stuff to management: you can show them that, look, we're actually getting a measurable effect out of our refactorings.
How can you create code ownership? How do you create code ownership... I don't think you can create it, exactly. I've seen a lot of different styles that work, whatever model you choose. I think the key is really to make it a deliberate choice: decide upon one model and go with it.

Okay, final question: have you used SonarQube? No, I haven't used SonarQube, but I do use a lot of static analysis tools. Back in my days as a .NET consultant, yes, it's true, I used NDepend a lot. Nowadays I do Clojure, and I use other static analysis tools for that. I'm actually a big fan of static analysis; I just think the issue is that we need something on top of it, we need techniques like these to prioritize and say, hey, this is where we should start to address the findings the static analysis tools give us. So it's just another input source, and I think they complement each other quite nicely.

Okay, thanks a lot again, thanks a lot for the great talk, and hope to see you sometime. Yes, thanks, and please come and grab a book; there are stickers too.