Dustin Ingram

WritingSpeakingGitHubSocial
This is a text version of a talk I gave in June 2018. #

This talk is about the principles of designing software for the "long now", and my experiences working with "long software".

In a former life, prior to being a professional software engineer, I was a woodworker. I made custom furniture and cabinetry, finishwork and millwork.

One day I was working on a window very much like this one. I was taking the old window out, repairing the framing around it, putting a new one in place, and doing all the finish work.

This was in an old house, in Philadelphia, and I realised as I was removing the old window that it had probably been there for about a hundred years.

I also realised that if I did a really good job installing the new window, it would probably be there for about a hundred years as well.

When I started working in technology, I really struggled with the ephemerality of my work. We often build physical objects to exist for a hundred years or more, but we don't really do this with software.

This is Danny Hillis. Danny spent his lifetime building supercomputers, and while Danny is still alive, there are probably more computers of his in museums today than are still running.

I'm sure that Danny also has struggled with the ephemerality of the work that he has done as well.

In the nineties, as we got closer and closer to the new millenium, Danny began to notice something about how we were thinking about the future.

As we got closer to the year 2000, we, as a society, were becoming more and more fixated on a smaller and smaller distance into the future. We were thinking a lot about what would happen in the year 2000, but not much about what would happen after that.

Danny came up with an idea, with the goal of making us think about the future on a longer time scale.

His idea was to build a clock. This clock would only tick once a year, and once every thousand years, a cuckoo would come out. He wanted this to happen, accurately, for the next ten thousand years.

He chose ten thousand years because this is just within the limits of plausibility. We have found human artifacts that are approximately that old, but nothing so complex, and nothing specifically cared for for that long.

Danny was able to make his dream a reality.

This is the Clock of the Long Now. It was finished January 31st, 1999, just in time to ring in the new year.

When Danny was designing and building this clock, he found five key design principles which he felt were most important to making the idea a success.

The first of these principles was longevity.

Aside from just being accurate for ten thousand years, the clock also needed physical longevity -- safe from rust, from flooding, from potential looters if it was made out of valuable materials.

The second principle was maintainability. This clock needs to be maintainable by future generations, with minimal tools, possibly even with bronze-age tools.

This is especially important when designing how to power the clock. It immediately rules out power sources like nuclear energy, which aren't readily maintainable, especially with bronze-age tools.

Instead, the clock is hand-wound, and in the event that no one is around to wind it for a very long time, it can capture energy from the changes in temperature between day and night.

The third principle is transparency. The clock needs to be understandable by it's maintainers, without stopping.

This is especially important considering that it's original creators will be long gone.

The fourth principle is evolvability. The clock needs to be able to be improved over time.

For example, the rotation of the earth is slowing at a noticeable, but unpredictable rate. It should be possible to adjust the clock to compensate for the changing length of day, without stopping the clock.

The final principle is scalability. This means that it should be possible to build working scale prototypes of the clock, and then scale those up for the final version.

In fact, when I told you before that this was the Clock of the Long Now, it was not quite accurate. This is actually a scale prototype of the clock.

This is the size of the actual clock, with gears as large as humans...

...a massive pendulum,

...and all inside a 500-foot deep shaft inside a mountain.

And even this is not the final version of the clock! This is just a 1:1 prototype, currently installed in West Texas.

The final version of this clock will be sunk into this mountain, Mount Washington, in Nevada.

When you're building a giant clock inside of mountain, your biggest concerns are things like earthquakes, the changing rotational speed of the earth, et cetera. These are geologic, planetary concerns!

We don't ever design software to exist at this scale, let alone a one-hundred year scale.

If I were able to run git blame on every line of code necessary to make your laptop run, what do you think would be the oldest unchanged line? Ten years? Twenty? Thirty?

Here's a better question: what is something that you've created, that's still in use? If it's code, how long will it be used? If it's documentation, how long will it be relevant? If it's a technical talk, when is the last time someone will watch it?

As technologists, we don't really want to think about this. Not only does it make us think of our own mortality, but when we're constantly trying to move fast and break things, why bother?

I think the closest I've come is the work I've done on the Python Package Index. PyPI, for those unaware, is the canonical repository for Python packages, and I'm a contributor, maintainer and administrator of the project.

PyPI is what picks up the phone at the other end of the line when you pip install something. It's one of the oldest language-specific package repositories, and like in most programming languages, this centralized repository didn't exist when Python was first created.

There's an apocryphal joke within the Python packaging community that Guido van Rossum, the creator of Python, has never felt the need for something like PyPI, because any Python code that he wanted to use that didn't come with Python, he could just put it in the standard library.

This isn't quite true, but the early contributors to Python really didn't see a need for a centralized package repository. They saw packaging as a solved problem, and couldn't imagine Python being used on a platform that didn't have a platform-level package manager

As it turns out, developers love to write software on platforms without their own package managers. So in 2003 PyPI was created.

It was originally just a prototype. It didn't even host any distributions itself, it just linked to files hosted elsewhere. But it still worked pretty well.

And as we all know, any sufficiently advanced prototype is indistinguishable from production software. So it quickly became "production software".

And it was pretty successful! This is PyPI in 2007, four years later. You can see it's improved a bit.

But this is PyPI in 2018.

I'll put these screenshots side by side. There's eleven years between these two images, but PyPI still looks exactly the same -- not like a modern website at all.

It's a little unfair to compare these visually, because one thing that you can't see in these images is that in these eleven years, PyPI went from hosting about three thousand packages, to more than one hundred and forty thousand.

It went from "a" place to put Python code, to "the" place. This means that there was a lot of work done behind the scenes to keep this prototype working at this larger scale.

The other thing worth mentioning with regards to why PyPI didn't change much is that by it's very nature, it has a bit of a chicken-and-egg problem.

When PyPI was first created, none of the packages that it currently hosts were available, including modern web frameworks, testing frameworks, etc.

So in 2015, some folks decided to rewrite PyPI from scratch. The project officially started in 2015, and I joined it in early 2016.

And while we were making pretty good progress, it was an entirely volunteer-driven project... so it seemed like it might never actually be finished.

Thus, in November of last year, we applied for a Mozilla Open Source Support grant for myself, an infrastructure engineer, a designer and project manager to work on this rewrite full-time.

And, six months later in April, we launched! The project became the canonical PyPI, and we phased out the fifteen-year old legacy codebase.

We did what we had told Mozilla we wanted to do: finish and launch the rewrite. But we had an ulterior motive: to ultimately turn PyPI into software for the long now.

Because as it turns out, the same principles for designing a ten-thousand year clock are generally good principles for designing anything to last a long time.

First, longevity. It should always be possible to pip install the first module ever uploaded to PyPI. And, it should still install the same code.

This is the first project ever uploaded to PyPI. It's not that interesting, aside from the fact that it's fifteen years old and you can still install it with modern tools, thanks to painstaking work to maintain backwards compatibility.

Second is maintainability. Since PyPI is a volunteer-driven project, this means that we must design it in a way to require as little maintenance as possible.

I don't want it to require a lot of my time, and thus I also want it to be maintainable by people that aren't me (and I don't want it to take up a lot of their time, either).

This means that during the course of the grant, we also spent a lot of time working on building community around the project, acquiring new contributors and maintainers.

Third is transparency. It's easy to say "Oh! It's an open-source project, therefore it's transparent".

And while this does go a long way towards being transparent, it's not enough. Adding documentation helps, but often docs are too focused on the "how" something works, or "how" to use it.

For a project like this to be truly transparent, a new person approaching it needs to be able to understand the thoughts, discussions and decisions which lead to understanding "why" the project is designed the way it is.

For us, this meant that we minimized any backchannel communication as much as possible, and recorded everything, including our meetings, and published the records in multiple places where they would be permanently archived.

I also spent a lot of time thinking about how I responded to users on GitHub, making sure that not only was I answering their immediate question, but providing enough context and background for any future visitors to that issue or pull request, for them to understand it as well.

Fourth is evolvability. There's a number of ways which we've made PyPI more "evolvable".

The first is tests. PyPI currently has one hundred percent test coverage.

When I tell people this, they think I'm nuts, that this is possibly overkill. Which would definitely be true if PyPI were not designed to last so much longer than most software.

Having 100% coverage allows any contributor to come and propose changes and not only know if their changes will break things, but also have confidence that their changes will likely not break things for PyPI's hundreds of thousands of users.

It also raises the amount of investment a contributor must make to ship a new feature, which helps prevent us from making changes too quickly that we might not have thought through fully, or might not want to support for the next 15 years.

Second is sponsors. PyPI runs entirely on donated infrastructure from our sponsors.

In order for PyPI to stay running, it needs to be able to evolve in the event that one of our sponsors goes away for some reason.

By being infrastructure-agnostic and following good software practices by abstracting away our dependencies on these services as much as possible, we make it as easy as possible to replace one sponsor with another.

Finally, just having a modern codebase lets us ship the features that our users want, and spend less time fighting fires.

For example, one of the most oft-requested features for PyPI is the ability to write the project description in Markdown. We didn't support this before because when PyPI was first created, Markdown didn't exist, so it wasn't designed with it in mind, and the lack of modern frameworks made it more challenging than it should be.

This was the thing I was most excited to do once we had launched...

...and people loved it. This is a case where evolvability begets popularity, and thus survivability. If we can't give our users the features they want, they're going to go elsewhere.

Finally is scalability. PyPI has definitely had it's prototype, and we've learned a lot from it.

For us, this is more about scalability in the more traditional sense: Can PyPI support the growing demands in the next year? In the next decade?

One thing we've done to sure that PyPI can scale: we put it behind a content distribution network (CDN).

If something on PyPI is going to potentially be viewed more than once, we put it in the cache. This requires lots of extra logic around when to invalidate the cache, but it pays off: as a result PyPI currently handles about two billion requests per day, and serves more than 20 terabytes of bandwidth -- and can handle an order of magnitude more.

All this to say: I'm confident that PyPI will not only continue to exist at least for the next fifteen years, but that it will continue to grow and evolve, and not stagnate like it's predecessor.

I often think back to this window at the beginning of the story.

Like the Clock of the Long Now, it helps give me a greater perspective on what I'm building in a less immediate time frame. It also helps me focus the work that I'm currently doing on the projects that will have the greatest impact on the future.

And last I checked, it's still there. 🙂

Thanks for your time!