Buildstuff Conference Report - day 1
The Long Sad History of Microservices (TM)
Greg Young @gregyoung
Like several of Greg’s recent talks that I’ve seen, this talk too encourages the audience to think a bit harder about what’s new and cool, and in particular, what it all costs. This time Greg covers the popular topic of microservices. It seems like you’re not a cool programmer if you don’t produce microservices instead of monolithic systems.
Greg first covers a bit of history, pointing out that microservices aren’t new at all. As is true for many other “discoveries” in computer programming, microservices find their origin in earlier concepts, like SOA (in a way you could say that “microservices done right” is in fact “SOA done right”), DCOM and CORBA. But he goes further and leads us back to Alan Kay and the concept of “objects” in Smalltalk, the programming language he developed in the 1970s, as well as Carl Hewitt’s more or less contemporary work on the Actor model. Objects should be conceived as little computers, encapsulating state, offering useful behaviors, processing and sending messages amongst each other; add concurrency as defined by the actor model and you get pretty close to the design ideals behind microservices.
Though much of what we do now has been done before, Greg points out that we haven’t learned much from the past. We make many of the same old mistakes (and some new ones too). For example, Greg mentions that it’s almost never a good idea to start out with microservices. Often our business doesn’t run on a scale that justifies the large cost that comes with microservices. When microservices are justified, it is often because we need higher availability (e.g. better handling of failure), yet even then we tend to think it should be about scale. We should never forget that distributed computing is hard because network calls are likely to go wrong.
Many people think that devops and continuous delivery/deployment are things every microservice needs. This isn’t true. These culture shifts again come with a high cost and they may simply not be needed in your context. Of course, the whole continuous delivery process is costly in itself, but you also need to start versioning your microservices (their functions as well as the supported data structures), which is time-consuming to say the least.
Greg suggests that one of the main ideas behind microservices, namely that every microservice has its own data storage, is not right for every situation. For example, combining data from across the network to generate some report is just very expensive. We should consider sharing a database in these specific situations so we can just quickly and pragmatically join some data inside the database. Then again, we know about the difficulties that may arise from sharing a database between services, so we should be careful. Overall, the message is to get rid of binary options (e.g. a single database per microservice versus a single database for the entire organization), and to consider other options instead (e.g. some services share a database). By the way, the “everything should be a container” rule is another binary answer that should be avoided. As many of us have experienced, containerization comes with its own set of trade-offs (expensive during a development phase, networking may be tricky, etc.).
One thing many people do wrong when applying a microservice architecture is relying heavily on synchronous communication. Microservices aren’t meant to communicate synchronously, because it greatly reduces a service’s reliability. From the frontend to the backend we could just make synchronous calls, but after that, everything should happen asynchronously by default. If we want some network call to be blocking, we have to consider alternatives first, and explicitly justify our choice.
There are more trade-offs to be considered in this area: we might listen to events and store copies of external data locally (which makes our service more autonomous, because it is less reliant on data owned by other services), or we might ask for data when it’s needed and get accurate, authoritative data. Depending on a data authority also allows the authority itself to log access to the data, something that’s much harder to accomplish if services rely on events and cache data locally.
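The trade-off described above can be sketched in a few lines of Python. This is only an illustration, not something shown in the talk: the class names `CustomerAuthority` and `LocalCustomerCopy`, the `on_email_changed` handler and the sample data are all made up, and the "network" is replaced by plain method calls.

```python
from dataclasses import dataclass, field


@dataclass
class CustomerAuthority:
    """Hypothetical service that owns customer data and logs every read."""
    emails: dict[str, str] = field(default_factory=dict)
    access_log: list[tuple[str, str]] = field(default_factory=list)

    def get_email(self, caller: str, customer_id: str) -> str:
        # The authority sees every read, so it can audit who asked for what.
        self.access_log.append((caller, customer_id))
        return self.emails[customer_id]


@dataclass
class LocalCustomerCopy:
    """Hypothetical local copy, kept up to date by consuming events."""
    emails: dict[str, str] = field(default_factory=dict)

    def on_email_changed(self, customer_id: str, email: str) -> None:
        # Event handler: store our own copy of the external data.
        self.emails[customer_id] = email

    def get_email(self, customer_id: str) -> str:
        # No call to the authority, so we keep working when it is down,
        # but the value may be stale and the owner never sees this read.
        return self.emails[customer_id]


authority = CustomerAuthority(emails={"42": "old@example.com"})
local = LocalCustomerCopy()
local.on_email_changed("42", "old@example.com")

# The authority changes the data; the event has not reached us yet.
authority.emails["42"] = "new@example.com"

stale = local.get_email("42")
fresh = authority.get_email("billing", "42")
```

Here `stale` still holds the old address while `fresh` holds the new one, and only the second read shows up in `authority.access_log`. Which behavior is preferable depends, as Greg stresses, on the context.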
Every one of Greg’s suggestions comes with the encouragement to think about microservices trade-offs and make decisions that are valid and reasonable in our own context. The benefits of a microservice architecture may be useful, or not, within our own projects. We should never go blindly all-in on some technique or strategy.
Finally, Greg encourages us not to invest too much learning in technologies that may be current for only a short while. In general he estimates that 50% of what we learn now will be irrelevant in 5 years. Instead, we should focus on everything that’s not a fad. We should turn to academia and learn from research papers, for example. These discoveries will likely never date. He mentions Pi-calculus and the Actor model, amongst others.
Metrics Driven Development
Sam Elamin @samelamin
If someone asks you to deliver a feature, it makes sense to ask them: why? (As a software developer I’ve learned only fairly recently that you’re supposed to do this.) If you don’t, then you might end up making all kinds of wrong assumptions, resulting in the code (maybe) delivered on time, (maybe) within budget, but not adding nearly enough value to the business.
We have known for some time now that continuously delivering value is what agile teams are supposed to be doing. We use all kinds of useful techniques for this, like TDD and BDD. TDD should allow us to move fast and do so safely. BDD should help us make the behavior defined in the code reflect the way the stakeholder thinks and talks about the business. Still, what’s missing is some way to know, and see, if what we build actually has some impact.
This is where metrics can play a very useful role: you should try to describe the goal you’re trying to achieve. Describe the goal in a way that allows you to define measurements which can confirm that you’re gradually reaching the goal (or not). For example, if the goal is to convince more people to log in on your system and you think that adding social media login options to your application will help you reach that goal, start measuring the number of people that are logging in. Then you can verify that after releasing the social media login feature you are actually closer to reaching your goal. If not, it might be a good idea to revise your plans. By the way, this is closely related to a technique called Impact mapping (https://www.impactmapping.org/), introduced by Gojko Adzic. Basically, instead of simply implementing all the features that are on the customer’s “shopping list”, developers should collaborate with stakeholders to first describe the underlying business goals. Then you can still work on the features, but verify the stakeholder’s assumptions along the way, by putting some measurements in place.
Sam Elamin gives us some useful suggestions for getting started with generating metrics for our running software. We should start out small. Collecting metrics, rendering them in cool graphs, is quite addictive. You can measure things to find out if you’re going in the right direction (like A/B testing). You should also use metrics for alerts (i.e. respond to sudden changes, unusual patterns in the usage of your system, like a high number of failed login attempts).
There are different kinds of metrics: infrastructure (CPU/memory usage, etc.), application (response times, etc.) and business metrics (sales, new accounts, etc.). You should share metrics (maybe not the infrastructure ones) with stakeholders too. This will allow you to discuss certain problems with them. But it also allows them to notice interesting patterns or trends. Leveraging this nice little feedback loop can help both of you contribute in more useful ways to making the business more successful.
Measuring how your application functions in the real world will give you great satisfaction; you will be happy to find out how many people are actually “using your code”. And besides helping the stakeholders make better decisions, you will become more in control of the system, in particular if you decide to continuously improve the system upon every failure or problem that you noticed by analyzing the metrics and registering anomalies.
Sam recommends simply starting to play with StatsD, Graphite, and the Elasticsearch, Logstash, Kibana (ELK) stack to get a grip on your application.
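To give an idea of how small such a first step can be, here is a minimal sketch of emitting metrics to StatsD from Python without any client library. It relies on StatsD’s plain-text line format (`name:value|type`, with `c` for counters and `ms` for timers) sent over UDP, by default to port 8125. The metric names `logins.failed` and `checkout.response_time` and the helper names are invented for this example, and a StatsD daemon listening on localhost is assumed.

```python
import socket


def format_metric(name: str, value: int, metric_type: str) -> bytes:
    """Build one StatsD line, e.g. b"logins.failed:1|c"."""
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_metric(payload: bytes, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget: UDP means the application never waits for StatsD."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


# A counter for the "failed login attempts" alert mentioned above,
# and a timer (in milliseconds) as an example of an application metric.
failed_login = format_metric("logins.failed", 1, "c")
response_time = format_metric("checkout.response_time", 320, "ms")
```

Calling `send_metric(failed_login)` wherever a login fails is enough to get a graph in Graphite on which an alert can be based. In a real project an existing StatsD client library is probably the better choice; the point here is only that getting started doesn’t take much.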
Alberto Brandolini @ziobrando
Alberto has quite an interesting style of presentation, which isn’t exactly “by the book”. He does not make a lot of eye contact with the audience, he often looks at the screen, he sometimes mutters things. But somehow this presentation style works. Very well, I’d say; it’s informal and funny. It has the appearance of an overflowing source of knowledge and experience from which we are all allowed to tap. This talk had a lot of interesting insights and takeaways, too many to write down here. I recommend watching this talk later, whenever it becomes available.
In this talk, Alberto mentions lots of approaches to software development, from waterfall to iterative development with a product owner as domain expert, to using user stories as the complete specification of a product. He points out many problems with these approaches, and considers several aspects for each: Autonomy, Mastery and Purpose. A team of developers really needs a good score on each of these topics. It’s crucial to have a sense of autonomy, not just being there to “type the code”, or “convert the specs into working software”. Mastery is something developers take a lot of pride in. We want to be good at things, but these things should be worth being good at. Then a development team needs purpose. And this purpose should be something that is a bit more elevated than “delivering asap”.
We have learned that we should be agile. This means we should be responsive to change. Now, many development teams are supposed to be agile, applying scrum practices, but in reality there is a continuous desire for predictability. This goes hand in hand with reducing risk, estimating everything upfront and producing burn-down charts with a straight diagonal, showing how hours get burned up in predictable ways. Alberto points out that, though we can do this, and be very efficient, we ignore an important aspect of our work: “Software development is a learning process. Working code is a side effect” (apparently quoting Dan North). Careful, albeit iterative, planning and focussing on predictability doesn’t allow for learning to take place.
Alberto encourages us to consider areas within a project where we can do experiments and try something entirely different from what we’re used to doing. We can do this for example in the software we create for our core domain (as it is called in DDD). This is a high risk/high value area, where you can really make a difference as a business, hence also as a software development team.
The fact that many developers don’t learn enough on the job is clear from the number of side projects developers usually work on. Of course, side projects and experimental code should never end up as production code. Still, this is what happens when the work developers do is too boring. They will find ways to make their work more interesting, more fulfilling. They might lose themselves in a race for technical excellence, making every piece of code perfect and trying out crazy advanced stuff without considering if anybody is going to like, let alone use, that piece of code.
Alberto suggests that to become experts, developers need to experiment, and get some experience with a certain technique. Developers should try new things. They should be allowed to make mistakes, because mistakes will give them lots of insights. They should also reinvent the wheel, in order to more deeply understand some things.
The best way to prevent developers from living in a vacuum, from picking their own goals, which may or may not align with those of the business, is to bring them in touch with the actual stakeholders and the actual users of the application. This will bring back all the purpose in the work of developers.
Dan North @tastapod
Dan’s talk was the closing keynote for day 1 of BuildstuffLT. I was looking forward to seeing a full talk by Dan in real life and I was not disappointed at all. His presentation style is quite informal too, but his message is nevertheless quite serious and very useful: if you’re aware of what you’re trading off, you can make informed decisions. Also known as: every decision (in software development) involves a trade-off.
A bit more in detail: we are often supposed to pick one side of a supposed dichotomy, like: automated, or manual deployment. We often don’t consider other answers, like: it depends, or first A, then B, or something else entirely, or both, in an appropriate mix. Some examples:
- If we only do automated deployments, nobody will have an understanding of how a deployment actually works and the deployment procedure will be very hard to revise. A suggestion might be to start with manual deployment, and when some part of it gets boring, it means we know enough about this part to automate it.
- If we only do automated testing, we will only write tests that will tell us things we already know. So we definitely also need manual testing and probably people who don’t look at our application from the perspective of a developer.
Dan discussed many other interesting topics, but I find it hard to put them into a nice, readable story. Still, I wanted to write down some interesting points:
- Instead of betting on microservices/components, start with one application. Deploy it, run it, monitor it, and only then decide if you can carry the burden of an extra thing that needs all of that care.
- Code is either 1. new, and people will know what it does, how the application works, what it all means, 2. old, but tested, documented, maintainable, etc., or 3. somewhere in between. The latter is the most dangerous kind of software. How to prevent code from ending up there? Keep deleting it and rewriting it. Dan introduces the concept of software half-life for this: how long do you have to be away for half the code to have changed? If the half-life is long, the code becomes unmaintainable over time, because at some point no one will know what it all means.
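Dan didn’t show how to measure software half-life, but one rough way to approximate it could be to look at the date each line of code was last changed (something you might extract from version control) and take the median age: half of the current lines are younger than that. The sketch below makes exactly that assumption; the function name `half_life_days` and the sample dates are invented for illustration.

```python
from datetime import date
from statistics import median


def half_life_days(last_changed: list[date], today: date) -> float:
    """Approximate half-life as the median age, in days, of all lines.

    Assumption: half of the current lines were touched more recently
    than the median age, so that is roughly how long you would have
    to be away for half the code to have changed.
    """
    ages = [(today - changed).days for changed in last_changed]
    return median(ages)


# One hypothetical "last changed" date per line of code.
lines = [
    date(2020, 1, 1),   # 30 days old
    date(2020, 1, 21),  # 10 days old
    date(2020, 1, 29),  # 2 days old
    date(2020, 1, 11),  # 20 days old
]

result = half_life_days(lines, today=date(2020, 1, 31))  # 15.0
```

A short half-life would suggest that the code is regularly rewritten by people who still know what it means; a long one would be a sign that the code is drifting towards the dangerous middle category.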