I have lots of thoughts about software engineering, some popular, some unpopular, and sometimes about things no-one ever talks about.

Rather than write a blog post about each one, I thought I'd dump some of my thoughts in brief here, and if there's any interest in a particular item I might expand in full in the future.

Context

I loved Jamie Brandon's series Reflections on a decade of coding. It's been nearly a decade since I first learnt to code, so I think it's about the right time to write my own.

He starts off by pointing out that advice has to be taken in the context of where it's coming from. So here's my background.

I have 8 years' experience as a backend developer at companies ranging in size from a 30-person startup to Google. I have never worked on a frontend[1], mission-critical software, or performance-critical software. All of our products have had users, and some level of scale, but never hyperscale.

I have also worked on a number of open source libraries, including the C# compiler.

If your current circumstances don't match that, it's likely my learnings won't be applicable to you.

I've also focused on areas where I think I have some insight, and avoided areas where I don't trust my own opinion any more than the next developer's.

All rules are made to be broken

Each of my thoughts here is a pointer in a particular direction. None of them are universal, and each would require a full-length blog post to describe exactly where it does and does not apply. Instead, treat them as directional: compared to where I think the average developer stands on the issue, I want to move them more to the left or the right.

Topic 1: Programming Languages

a. Languages have advantages. There are better crafted languages and worse crafted languages. Plenty of outages have happened because you used the wrong language. But it is extremely rare for a product to succeed or fail based on the language, so ultimately make a sensible decision and move on.

b. The biggest factors affecting which language to choose are:

  • The deployment model - if you want a light self-contained binary you'll want C/C++/Go/Rust etc., not C#/Java/Python/Node etc. If you want to run in the browser, JavaScript/TypeScript is usually your best bet.
  • Which libraries are available. Often you're forced to use a particular library written in a particular language and that decides the issue.
  • Which languages your team is familiar with.

c. Once that's whittled down your options, there are issues like memory safety, typing, FP, threading model, etc. Everybody likes to go on about these endlessly, and there are plenty of situations where they're important, but they are far less important than the previous issues.

d. Going against the grain imposes high costs. Sure, you can write machine learning pipelines in Java, or compile c++ to wasm for the web. But you will find a lot of libraries missing, and much less documentation or prior art. In very particular situations it's worth it, but in most cases, swallow your pride and use that garbage language you hate so much.

e. Programming languages have particular styles. LINQ is all the rage in C#, but trying to do the same in golang is far less popular, and is looked down upon. It also works less well. For the sake of the other developers working with you, try to stick to the standard style for that programming language.

Topic 2: Microservices

a. In all but the largest services, microservices are likely to cause a significant performance degradation. The advantage of using less RAM is easily outweighed by the cost of going over the network all the time.

b. Microservices make things far harder to debug, since you need to trace calls across multiple services.

c. Microservices make code harder to develop, since testing requires you to spin up multiple services.

d. Microservices make deployments harder, since each microservice might be at any of a number of versions, and you need to work out whether any combination in the cartesian product of those versions is incompatible.

e. Unless you invest a lot of thought and effort into your architecture and setup, microservices are likely to degrade rather than improve reliability because there are more moving parts which can go wrong.

f. Tooling can mitigate some of the issues above, but only somewhat, and requires significant investment in integration and education.

g. It is often (but not always) a lot easier to scale a monolith horizontally than to rewrite it to use microservices.

h. The biggest advantage of microservices is that it encourages you to split your code into vertical slices, with clear boundaries between parts. It's much easier to monitor where code crosses the service boundary and make sure those are actually necessary and well defined. Actually running the code as microservices has far fewer advantages.

i. Microservices are great if you have a clearly defined microservice, which you own, which has users outside your team, and you offer those users a well defined API with good versioning guarantees. Basically, you treat your microservice like a product.

j. Sometimes microservices are necessary in other situations. But they will cause you pain.

Topic 3: Methods/Functions

a. Far better to duplicate code and then merge it later, than to share code, discover that different callers need slightly different things, and then try to modify it to be everything to everyone. DRY is overrated.

b. If a method can be made into a pure function, without any particular loss of performance, make it pure.

c. When choosing where to extract smaller methods from a very long method, do so where the smaller methods will be pure, take as few parameters as possible, and do something which can be clearly defined.

d. It is fine for methods to be long. Especially if you are updating state, it is often better to update the state in multiple places in one method than spread out across multiple methods. However, extracting the bits that don't update the state can make it easier to see the bits that do.

e. It's perfectly fine to use mutation inside a "pure" method, just so long as from the perspective of the caller it's pure.

Topic 4: Architecture

a. Solving a specific problem in a specific context allows you to take shortcuts and skip solving the difficult parts of a general problem. Don't jump to solving a general problem when you don't need to for your use case.

b. At the same time, thinking about how to solve the general problem can give you ideas for how to architect the solution for your specific problem. When implementing it, though, feel free to take shortcuts the general solution wouldn't allow.

c. You might think you've written a generic library suitable for executing arbitrary dependency graphs, but no plan survives first contact with the enemy. If you want your solution to be shared by other people, be prepared to maintain it and continuously upgrade it, in ways that may be detrimental to your own use case. If you don't want to do that, don't expect it to be suitable for other use cases.

d. This extends even to small things: if you only use it for your own use case, you can log wherever you need to, using whichever logging library you usually use. If you want to share it, logging becomes a pain in the ass.

e. Structuring your application into vertical slices, each of which has as few dependencies as it needs and often talks directly to underlying services it needs, tends to work better than an onion style architecture with e.g. a thick shared data access layer which everything uses. A thick data access layer has to be everything to everyone, but most components don't need all that. The vertical slices all talk to each other via well defined APIs instead of arbitrarily calling into each other. This allows each vertical slice to choose the architecture that makes the most sense for itself.

Topic 5: Testing

a. Even when I think something is so simple it clearly doesn't have any bugs, it usually has bugs. Writing tests is the best way of catching those bugs.

b. Manual testing is important for sanity checking your work, but only automated tests can provide thorough coverage and future proofing.

c. Tests should ideally be written at boundary points where they will only break if the specification of the tested component breaks. I.e. refactoring should usually either remove tests completely because you've deleted the subcomponent, or leave them unchanged. Only adding new features should regularly break tests.

d. For large complex components, it is worth investing a significant amount of effort into making it simple to add new tests. For example, in a compiler, have a testing library where you can write some code, and the tests will automatically generate and store the emitted assembly. Ensure that it is trivial to update all these goldens whenever emit changes. That way it is easy for reviewers to see impacts of your changes.

e. Golden tests are a great testing method where they're applicable. Google has an amazing internal golden library which I'm a huge fan of.

f. Even for smaller components, write your tests in a way that lowers the trivial inconvenience of adding a new test. For example, in golang, consider writing your test as a table test, even if you only want to test one case. That way the next time you make a change, the test doesn't need to be refactored for you to add a new test case.
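A sketch of what that looks like (a made-up `Abs` function; real Go tests would take a `*testing.T` and loop with `t.Run`, but this standalone version shows the shape): even with a single case, the next case is one more line in the table:

```go
package main

import "fmt"

func Abs(n int) int {
	if n < 0 {
		return -n
	}
	return n
}

// runAbsTests mirrors a Go table test: each case is one row in the slice.
func runAbsTests() error {
	tests := []struct {
		name string
		in   int
		want int
	}{
		{"negative", -3, 3},
		// Adding the next case needs no refactoring, e.g.:
		// {"zero", 0, 0},
	}
	for _, tc := range tests {
		if got := Abs(tc.in); got != tc.want {
			return fmt.Errorf("%s: Abs(%d) = %d, want %d", tc.name, tc.in, got, tc.want)
		}
	}
	return nil
}

func main() {
	if err := runAbsTests(); err != nil {
		panic(err)
	}
	fmt.Println("ok")
}
```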

g. Unit tests vs end to end tests lie on a spectrum. Every component is made up of sub components, and sometimes you test closer to the root, other times closer to the leaves.

h. All else being equal, tests closer to the root (E2E tests) are better because they ensure that all the subparts are wired up correctly.

i. On the other hand, the larger the component being tested, the slower the test is likely to be, the flakier it's likely to be, the more likely you are to need fakes or mocks, and the harder it is to know what contributed to a failure.

j. Also, some particular edge cases are hard to cover E2E, and easier if you test just the relevant component.

k. IMO if you are able to write E2E tests that are deterministic, reliable, use genuine implementations of all interfaces, and are reasonably fast it is best to put most of your focus on E2E tests, but with ample use of unit tests as sanity checks, to test particular edge cases, to test well defined and well specified components, to test areas where a subcomponent is complicated and you feel it needs the extra coverage, etc.

l. As said above, E2E tests vs unit tests is a spectrum. So applying the above advice implies focusing your tests at the largest component where they can still be deterministic, reliable and fast.

m. Prefer real implementations over fakes, and fakes over mocks.

n. If a test needs to be significantly updated every time you make a change, and is preventing you from refactoring/adding new features, delete the test. Badly written tests have negative value.

o. If a test uses a lot of mocks and fakes, you're likely testing at the wrong level. The test is likely to break whenever you refactor, and likely doesn't actually check what you want it to check. Try to use real implementations instead of mocks/fakes, or to refactor your implementation to split out the business logic from the external calls, and write integration tests to cover the external calls.
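A minimal sketch of that split (all names hypothetical): the discount rule is a pure function testable with real values and no mocks, while the external payment call sits behind a thin interface that is the only thing ever needing a fake:

```go
package main

import "fmt"

// discountedTotal is pure business logic: trivially testable with real inputs.
func discountedTotal(total, loyaltyYears int) int {
	if loyaltyYears >= 5 {
		return total * 90 / 100 // 10% loyalty discount
	}
	return total
}

// Charger is the thin integration boundary. Only this layer needs a fake in
// unit tests; an integration test covers the real implementation separately.
type Charger interface {
	Charge(amount int) error
}

func Checkout(c Charger, total, loyaltyYears int) error {
	return c.Charge(discountedTotal(total, loyaltyYears))
}

// fakeCharger records the charged amount instead of calling a real provider.
type fakeCharger struct{ charged int }

func (f *fakeCharger) Charge(amount int) error {
	f.charged = amount
	return nil
}

func main() {
	f := &fakeCharger{}
	if err := Checkout(f, 200, 6); err != nil {
		panic(err)
	}
	fmt.Println(f.charged) // 180
}
```

The interesting logic never touches the network, so its tests need no test doubles at all.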

p. Integration tests are often flaky. Increasing the number of tests increases the flakiness, so either invest a lot of effort into reducing this flakiness, or use integration tests judiciously to sanity check your wiring instead of testing your overall logic.

q. Flakiness is problematic because it means you ignore real failures. Work hard to reduce it.

r. If you have an integration test which isn't flaky or slow, e.g. because you're integrating with an extremely reliable and fast service, then the integration test isn't problematic, and beats using a fake/mock.

Topic 6: Code Review

a. It takes just as long to review someone's pull request now as it does in an hour. But if you review it in an hour, there's a decent chance they're twiddling their thumbs the entire time. And the longer you leave the code review, the greater the chance of merge conflicts. Unless you're in the middle of a flow, code review should almost always be your top priority, to unblock other people.

b. The purposes of code review include: education (two way), having a second pair of eyes on the code, knowledge sharing (two way), ensuring quality, and stopping people taking shortcuts[2]. It is not to impose your personal coding preferences on the other person. Everyone has their own style. Either officially document that something is the team style and get team buy in, or let it pass.

c. It is perfectly fine to approve a large pull request without a single comment, or to leave dozens of comments on a small one. Keep your standards constant instead of looking for something to pick up on and then calling it quits.

d. If you trust the other developer, leave feedback and then approve. You trust them to handle the feedback, so don't block them unnecessarily. You should still consider reviewing their code after they merge it to calibrate your trust.

e. Code review is important. Put effort into it. If you're just an LGTMing machine, you can be replaced with a rock.

Topic 7: What makes a good developer?

a. A talented developer can fluently translate a high level description of an algorithm into code in a language or ecosystem they are familiar with. Mediocre ones look more like they're trying to tie their shoelaces with their eyes shut, constantly trying to work out what they've done and where to go from here. This is true even for talented Junior developers. If you can't write code fluently, practice until you can. That doesn't mean the code will be working on the first try, but the outline will pretty much be there.

b. Being able to keep a large and complicated codebase/architecture in your head is a superpower. If you know exactly which parts of the code do what, and how they all connect together, you will have a tremendous advantage when designing new features, implementing code changes, or diagnosing bugs. I have no idea if this is something you can practice.

c. As you gain experience you should be able to pattern match what you're doing to more and more problems in the past. You should recognise that X is a perfect problem for a relational database, but in Y you're essentially turning a database into a message queue and it's better to use an actual message queue. But this will only happen if you gain experience in a wide variety of areas, and work with more experienced developers. Don't let yourself have 1 year experience 10 times over instead of 10 years of experience.

Topic 8: Career

a. Tech companies pay orders of magnitude higher salaries for software developers than non-tech companies. Get out of a non-tech company as soon as you can.

b. Getting your first job is really hard, but once you have some experience getting the next job is a lot easier. Accept almost any software development job to start off with[3], then after a year start looking around for something with better pay, a closer location, better working conditions, etc.

c. You generally get larger pay rises by switching between jobs than getting promoted, but the benefits to doing so diminish as you climb higher up the ladder. Still, consider testing the waters every so often to see what you can get.

d. The higher in the career ladder you go the less you'll be judged on your code, and the more you'll be judged on your architecture, your designs, your knowledge, and your product ownership. It's really difficult to advance by just being able to translate design documents into working, tested, well written code. Seek opportunities to write design documents, contribute to product decisions, and to take ownership of large projects, including proactively doing research.

Topic 9: Team structure

a. Whenever you need something from outside your team there tend to be much longer delays than when you need something from inside your team. For that reason try to structure teams such that every team has everything it needs for its day to day work. Some companies have a separate databases team and a separate deployment team who need to approve all changes to the database or run all deployments, and this tends to be a disaster.

b. Within a team, you can either have everyone contribute to a specialised area, with a few members having deeper expertise, or have a dedicated specialist/subteam. IME, the first approach tends to be more effective, but if skill differences are too substantial, the second option might be the only viable choice (e.g. no one on the team has sufficient background in frontend technologies).

  1. ^

    To be fair, I did add a feature to a WPF application, my CTO saw my colour scheme, and banned me from touching the frontend ever again.

  2. ^

    Sometimes I know that something isn't going to pass code review, so I add tests to it or refactor it even though I can't really be bothered.

  3. ^

But only if it's software development. Too many people accept a job as tech support or whatever "temporarily", then find it almost impossible to switch to development.


For that reason try to structure teams such that every team has everything it needs for its day to day work.

 

I would extend that to "have as much control as you can over what you do". I increasingly find that this is key to move fast and produce quality software.

This applies to code and means dependencies should be owned and open to modifications, so the team understands them well and can fix bugs or add features as needed.

This avoids ridiculous situations where bugs are never fixed or shipping very simple features (such as changing a theme for a UI component) is impossible or takes weeks because a framework actively prevents it.

More control and understanding also tends to be better for satisfaction. Of course all this is on a spectrum and should be balanced with other requirements.

I agree with the micro service points except for these:

Performance degradation due to network overhead outweighing RAM savings

The network penalty is real but can be optimized. Not an absolute blocker.

  • Cloud-native services rely on microservices and scale well despite network overhead.
  • Event-driven architectures (Kafka) can mitigate excessive synchronous network calls.
  • Optimized serialization greatly reduces the cost of network calls.
  • Example: Netflix operates at scale with microservices and optimizes around network overhead successfully.

More moving parts = lower reliability

Poorly designed microservices can indeed degrade reliability, for example through cascading failures, but well-designed ones improve it.

  • Failure domains are smaller in microservices, meaning a single failure doesn’t bring down the entire system.
  • Service meshes and circuit breakers improve resilience.

It’s often easier to scale a monolith horizontally than rewrite it as microservices

Monoliths scale well up to a point; microservices help at extreme scales.

  • Monoliths are easier to scale initially, but eventually hit limits (e.g., database bottlenecks, CI/CD slowdowns).
  • Microservices allow independent scaling per service.
  • Example: Twitter and LinkedIn refactored monoliths into microservices due to scaling limits.

So I agree with everything you wrote. Microservices can be extremely reliable and performant, and at hyperscale are often the only choice.

But these things require a lot of design effort, and hardening. They don't happen by default. If you take your monolith, convert it to microservices and deploy it, the chances are your performance will significantly decrease (per the same compute cost), not increase.

I know I sounded very harsh on microservices, but I have nothing against them. It's just that people jump straight to microservices without really understanding the tradeoffs.

Very much agree. And you can get the maintainability benefits of modularisation without the performance overhead with good old refactorings. 

A talented developer can fluently translate a high level description of an algorithm into code in a language or ecosystem they are familiar with.

 

Could you say a little bit more about what "fluency" is in this context? It's doing all the work in this section but I'm not sure I understand what you're trying to communicate. 

What I mean is that once they know the algorithm they want, writing that as code just flows, they can write out 50 lines of code that represents the algorithm in 10 minutes, without having to stop to double check what they're doing after every statement.

Of course their implementation will have bugs, but it will still be approximately correct.

And vice versa, a talented developer can read any reasonably well written code and quickly work out what it's doing.

It's really about fluency in a language, like the difference between talking in your first language Vs one you only know from lessons. Writing code is the bread and butter of your job, so if you can't do it fluently that's a problem.

Writing algorithms that are 50 lines of code seems like one definition of fluency, and one that is probably relevant in compilers/backend, but this also rings a little hollow to me, in terms of the pragmatics of real software engineering.

In my experience, most software engineering is using libraries, not language features; how would you describe fluency over libraries? Is "glue code" like command-line flags or CRUD web app routing subject to this? Should that code also "just flow"? In my experience truly powerful developers are able to do this, but even many Google L5s will just look up this code every time. Is that "fluent"? Is there a better concept to be applying here other than "fluency"?

My impression is that this is outside the scope of what you're describing, "implementing algorithms." This is an important part of software engineering, but I would say it's not entirely overlapping with what I would call "building <things/tools/products>". Would you agree? How would you relate these two aspects?

I think the ability to "just look up this code" is a demonstration of fluency - if your way of figuring out "what happens when I invoke this library function" is "read the source code", that indicates that you are able to fluently read code.

That said, fluently reading code and fluently writing code are somewhat different skills, and the very best developers relative to their toolchain can do both with that toolchain.

In my experience truly powerful developers are able to do this, but even many Google L5s will just look up this code every time.

Indeed I am a Google L5, and I usually do look this stuff up (or ChatGPT it). I think it's more important to remember roughly what libraries do at a high level (what problems they solve, how they differ from other solutions, what can't they do) than trivia about how exactly you use them.

I personally don't feel "fluent" programming this way, and maybe it is my own perfectionism, but this and the other replies, while certainly understandable and defensible, ring a little more hollow than I would like. I think going down below the level of "just know what APIs broadly exist" and actually being fluent at that lower level is usually necessary for the true 10-100x devs I've seen to work at that level. Usually this is achieved by building lots and lots of practical, deployable systems, but this just means it is implicitly taught through experience, and I wonder if there is a better way. Anyway, trying to figure out if it was this popular, but IMO flawed, type of fluency you were referring to was my original question, and I thank you for your answer. 

You are right that writing glue code is a large part of software engineering, and that knowing what the libraries do is an important part of that. But once you know (or think you know) what the libraries do, how quickly do you bash out the code that does that? Do you struggle, or does it just come naturally?

And as faul_sname pointed out, often the quickest way to understand what the library does is to look at it. Is that something you're capable of doing, or are you forced to hope the documentation addresses it?

Other times you want to write a quick test that the library does what you expect. Is that going to take you half an hour, or 2 minutes?

  • "Wrap that in a semaphore"
  • "Can you check if that will cause a diamond dependency"
  • "Can you try deflaking this test? Just add a retry if you need or silence it and we'll deal with it later"
  • "I'll refactor that so it's harder to call it with a string that contains PII"

To me, those instructions are a little like OP's "understand an algorithm" and I would need to do all of them without needing any support from a teammate in a predictable amount of time. The first 2 are 10 minute activities for some level of a rough draft, the 3rd I wrote specifically so it has an upper bound in time, and the "refactor" could take a couple hours but it's still the case that one I recognize it's possible in principle I can jump in and do it.

You did not explicitly state the goal of the advice, I think it would be interesting to distinguish between advice that is meant to increase your value to the company, and advice meant to increase your satisfaction with your work, especially when the two point in opposite directions.

For example it could be that "swallow[ing] your pride and us[ing] that garbage language you hate so much" is good for the company in some cases, but terrible for job satisfaction, making you depressed or angry every time you have to use that silly language/tool.

I think it's more that learning to prioritize effectiveness over aesthetics will make you a more effective software engineer. Sometimes terrible languages are the right tool for the job, and I find it gives me satisfaction to pick the right tool even if I wish we lived in a world where the right tool was also the objectively best language (OCaml, obviously).

You want to be tending your value system so that being good at your job also makes you happy. It sounds like a cop-out but that's really it, really important, and really the truth. Being angry you have to do your job the best way possible is not sustainable.

The goal is writing good software to solve a particular problem. Using Haskell to write an SPA is not going to work well whether you're doing it for someone else or for yourself (assuming you care about the product and it's not just a learning/fun exercise). It is a perfectly valid decision to say that you'll only work on products where Haskell is a good fit, but I would strongly recommend against using Haskell where it's not a good fit in a production setting, and would consider it low-key fraud to do so when somebody else is paying you for your time.

but terrible for job satisfaction, making you depressed or angry every time you have to use that silly language/tool.

My experience is that once you get over yourself, and put in the effort to properly understand the language, best practices, etc. you might not love the language, but you'll find it's actually fine. It's a popular language, people use it, and they've found ways to sand down the rough edges and make the good bits shine. Sure it's got problems but it's not as soul destroying as it looked at first sight, and you'll likely learn a lot anyway.

(I'm not talking about a case where a company forces you to use a deprecated language like COBOL or ColdFusion. I'm talking about a case where you pick the language because it's the best tool for the job.)

This is in general good career advice. You'll lose out on a lot of opportunities if you refuse to put yourself in uncomfortable situations.
