Linux at 25: Linus Torvalds on the evolution and future of Linux

The creator of Linux talks in depth about the kernel, community, and how computing will change in the years ahead

The last time I had the occasion to interview Linus Torvalds, it was 2004, and version 2.6 of the Linux kernel had been recently released. I was working on a feature titled “Linux v2.6 scales the enterprise.” The opening sentence was “If commercial Unix vendors weren’t already worried about Linux, they should be now.” How prophetic those words turned out to be.

More than 12 years later -- several lifetimes in the computing world -- Linux can be found in every corner of the tech world. What started as a one-man project now involves thousands of developers. On this, its 25th anniversary, I once again reached out to Torvalds to see whether he had time to answer some questions regarding Linux’s origins and evolution, the pulse of Linux’s current development community, and how he sees operating systems and hardware changing in the future. He graciously agreed.

The following interview offers Torvalds’ take on the future of x86, changes to kernel development, Linux containers, and how shifts in computing and competing OS upgrade models might affect Linux down the line.

Linux’s origins were in low-resource environments, and coding practices were necessarily lean. That’s not the case today in most use cases. How do you think that has affected development practices for the kernel or operating systems in general?

I think your premise is incorrect: Linux's origins were definitely not all that low-resource. The 386 was just about the beefiest machine you could buy as a workstation at the time, and while 4MB or 8MB of RAM sounds ridiculously constrained today, the kind of setup you'd call "necessarily lean," at the time it didn't feel that way at all.

So I felt like I had memory and resources to spare even back 25 years ago, and didn't feel at all constrained by hardware. And hardware kept getting better, so as Linux grew -- and, perhaps more importantly, as the workloads you could use Linux for grew -- we still didn't feel very constrained by hardware resources.

From a development angle, I don't think things have changed all that much. If anything, I think that these days when people are trying to put Linux in some really tiny embedded environments (IoT), we actually have developers today that feel more constrained than kernel developers felt 25 years ago. It sounds odd, since those IoT devices tend to be more powerful than that original 386 I started on, but we've grown (a lot) and people’s expectations have grown, too.

Hardware constraints haven't been the big issue affecting development practices, because the hardware grew up with our development. But we've certainly had other things that affect how we do things.

The fact that Linux is "serious business" obviously changes how you work -- you have more rules and need to be more thoughtful and careful about releases. The sheer number of people involved also radically changes how you develop things: When there were a few tens of developers and we could all email each other patches, things worked very differently from today, when there are thousands of people involved and we obviously need source control management and the whole distributed model that Git has.

Our development model has changed a lot over the quarter century, but I don't think it's been because of hardware constraints.

Do you see any fundamental differences in the younger kernel hackers today versus those of 20 years ago?

It's very hard to be introspective and actually get it right. I don't think the kernel developers are necessarily all that different; I think the scale and maturity of the project itself is the much bigger difference.

Twenty years ago, the kernel was much smaller, and there were fewer developers. That perhaps made it somewhat easier to get into development: There was less complexity to wrap your mind around, and it was easier to stand out and make a (relatively) big difference with a big new feature.

Today, it's a lot harder to find some big feature that hasn't already been done -- the kernel is a fairly mature project, after all. And there are tons of developers who have been around for a long time, so it is harder to stand out. At the same time, we have a lot more infrastructure for new people to get involved with, and there are lots more drivers and hardware support that you can get involved with, so in other respects things have gotten much easier. After all, today you can buy a Raspberry Pi for not very much money and get involved in doing things that 20 years ago were simply not even possible.

The other thing that has changed is obviously that 20 years ago, you'd get involved with Linux purely for the technical challenge. These days, it can easily be seen as a career: It's a big project with a lot of companies involved, and in that sense the market has certainly changed things radically. But I still think you end up having to be a pretty technically minded person to get into kernel programming, and I don't think the kind of person has changed. What has changed is that people who 20 years ago would have gone, "I can't afford to tinker with a toy project, however interesting it might be," now see Linux as a place to find not just a technically interesting challenge, but also a job and a career.

Do you view the blossoming growth of higher-level and interpreted languages and associated coding methods as drawing talented developers away from core internal OS development?

No, not at all. I think it mainly expands the market, but the kind of people who are interested in the low-level details and actual interaction with hardware are still going to gravitate to projects like the kernel.

The higher-level languages mostly reflect the fact that the problem space (and the hardware) has expanded, and while a language like C is still relevant for system programming (and C has evolved a bit over the years, too), there are obviously lots of areas where C is definitely not the right answer and never will be. It's not an either-or situation (and it's not a zero-sum game); it's just a reflection of the kinds of problems and resource constraints different projects have.

What do you think the future holds for x86?

I'm not much for a crystal ball, but there is obviously the big pattern of "small machines grow up," and all the historical comparisons with how the PC grew up and displaced almost everything above it. And everybody is looking at embedded and cellphones, and seeing that grow up and the PC market not growing as much.

That's the obvious story line, and it makes a lot of people excited about the whole x86-vs.-ARM thing: "ARM is going to grow up and displace x86."

At the same time, there are a few pretty big differences, too. One big reason PCs grew up and took over was that it was so easy to develop on them: Not only did you have a whole generation of developers growing up with home computers (and PCs in particular), but even when you were developing for one of those big machines that PCs eventually displaced, you were often using a PC to do so. The back-end machines might have been big serious iron, but the front end was often a PC-class workstation.

When the PCs grew up, they easily displaced the bigger machines because you had all these developers that were used to the PC environment and actually much preferred having the same development environment as their final deployment environment.

That pattern isn't holding for the whole x86-vs.-ARM comparison. In fact, it's reversed: Even if you are developing for the smaller ARM ecosystem, you still are almost certain to be using a PC (be it Linux, MacOS, or Windows) to do development, and you just deploy on ARM.

In that very real sense, in the historical comparison with how x86 PCs took over the computing world, ARM actually looks more like the big hardware that got displaced and less like the PC that displaced it.

What does it all mean? I don't know. I don't see ARM growing up until it is self-sufficient enough as a development platform, and that doesn't seem to be happening. I've been waiting for it for a decade now, and who knows when it will actually happen.

We may be in a situation where you end up with separate architectures for different niches: ARM for consumer electronics and embedded, and x86 for the PC/workstation/server market. With IBM supporting its own architectures forever (hey, S/390 is still around, and Power doesn't seem to be going away either), reality may be less exciting than the architecture Thunderdome ("two architectures enter, one architecture leaves").

The computer market isn't quite the wild and crazy thing it used to be. Yes, smartphones certainly shook things up, but that market is maturing now, too.

What do you think of the projects currently underway to develop OS kernels in languages like Rust (touted for having built-in safeties that C does not)?

That's not a new phenomenon at all. We've had the system people who used Modula-2 or Ada, and I have to say Rust looks a lot better than either of those two disasters.

I'm not convinced about Rust for an OS kernel (there's a lot more to system programming than the kernel, though), but at the same time there is no question that C has a lot of limitations.

To anyone who wants to build their own kernel from scratch, I can just wish them luck. It's a huge project, and I don't think you actually solve any of the really hard kernel problems with your choice of programming language. The big problems tend to be about hardware support (all those drivers, all the odd details about different platforms, all the subtleties in memory management and resource accounting), and anybody who thinks that the choice of language simplifies those things a lot is likely to be very disappointed.

What for you is the biggest priority for driving kernel development: supporting new hardware or CPU features, improving performance, enhancing security, enabling new developer behaviors (such as container technology), or something else?

Me personally? I actually tend to worry most about "development flow" issues, not immediate code issues. Yes, I still get involved in a few areas (mainly the VFS layer, but occasionally VM) where I care about particular performance issues, etc., but realistically that's more of a side hobby than my main job these days.

I admit to still finding new CPU architecture features very interesting -- it's why I started Linux in the first place, after all, and it's still something I follow and love seeing interesting new things happen in. I was very excited about seeing transactional memory features, for example, even if the hype seems to have died down a lot.

But realistically, what I actually work on is the development process itself and maintaining the kernel, not a particular area of code any more. I read email, I do pull requests, I shunt things to the right developer, and I try to make sure the releases happen and people can trust me and the kernel to always be there. And yes, answering email from journalists is something I consider my job, too.

My principal model [with respect to] kernel development is to make sure we get all the details right, that we have the right people working on the right things, and that there aren't any unnecessary things standing in the way of development. If the process works right and the people involved care about quality, the end result will take care of itself, in a sense.

Yes, that is very, very different from what I did 25 years ago, obviously. Back then I wrote all the code myself, and writing code was what I did. These days, most of the code I write is actually pseudo-code snippets in emails, when discussing some issue.

What do you think still needs to be done to improve Linux containers?

I'm actually waiting for them to be more widely used -- right now they are mostly a server-side thing that a lot of big companies use to manage their workloads, but there's all this noise about using them in user distributions, and I really think that kind of use is where you end up finding a lot of new issues and polishing the result.

Server people are used to working around their issues with some quirk that is specific to their very particular load. In contrast, once you end up using containers in more of a desktop/workstation environment, where app distribution, etc., depends on them and everybody is affected, you end up having to get it right. It's why I'm still a big believer in the desktop as a very important platform: It's this general-purpose thing where you can't work around some quirk of a very specific load.

I'm actually hoping that containers will get their head out of the cloud, so to say, and be everywhere. I'm not entirely convinced that will actually happen, but there are obviously lots of people working on it.

We’ve seen Microsoft, Google, and Apple pushing new desktop and mobile OS releases at an unprecedented pace over the past few years. What are your views on the increasingly rapid release cycles for desktop and mobile operating systems?

Well, in the kernel we obviously did our own big release-model change about 10 years ago, when we went from multiyear releases (2.4 to 2.6) to much more of a rolling release (a new release every two months or so, more about "continual improvement" than "big new feature").

Quite frankly, having gone through that big mindset change in kernel development, I really think it's the only good way to do updates. The whole notion of "big releases every year or two" should die, in favor of constant gradual improvements.

Of course, a lot of people want the "big revolutionary release" model -- sometimes for marketing reasons, but often for (wrongheaded) technical reasons, where the big revolutionary release is seen not just as a way to make improvements, but also as a way to break support for older versions or workflows. Too many people seem to think that "radical change" is more interesting and better than "gradual improvement."

Me, I see "radical change" as a failure. If your new version can't seamlessly do everything the old one did, your new version is not an improvement; it's just an annoyance to users.

In the kernel, that shows up as our No. 1 rule: We don't break user space, and we don't regress. It doesn't matter how cool a new feature is or how clever some piece of code is -- if it breaks something that used to work, it's a bug and needs to be fixed.

Once you embrace that kind of model, the logical result is a rolling release model, rather than big upgrades that leave the old code behind.

How do you view the forced upgrades and privacy concerns surrounding Windows 10 and Apple’s moves toward a walled-garden desktop OS? Do you think they may be catalysts for change to the fundamental concept of a desktop OS? Do you think general-purpose desktop operating systems will survive the next five to 10 years?

Oh, I definitely think the general-purpose OS will survive and flourish. Yes, a lot of people don't actually need that kind of general-purpose environment, and there are good reasons to then offer cut-down versions that are (for example) more secure simply because they are a bit more limited and have stricter rules. Lots of people are happy with something that just does web browsing and video and some games, and obviously a limited environment tends to be simpler and cheaper (and limited hardware -- all those phones and tablets are great for what they do, but you need something more for content creation, as opposed to consumption).

But that doesn't make the general-purpose needs go away.

It does mean that many scenarios that used to require full PCs -- but didn't really need them -- will be perfectly happy with the cut-down environment. I'm not surprised that the PC market has been stagnant and shrinking. But I think that's a natural rebalancing, rather than some fundamental decline ("The PC is dying!").

Given the increasingly disparate requirements and workloads for desktop, mobile, and server -- especially in virtualized server environments such as containers -- do you foresee a time when the kernel may be forced to go in two different directions due to blocking architectural issues or perhaps other reasons?

I used to think that was inevitable, but I no longer do.

What made me think so was when SGI was (long ago -- we're talking 15 years or so) working on scaling up Linux to its big systems and talking about running Linux on systems with hundreds of nodes and thousands of CPU cores. I felt that the effort required wouldn't make sense for the normal cases where you only had a handful of CPUs, and that SGI would likely have to keep its own set of external patches for its very special needs.

I was wrong, so very wrong.

Not only were we able to integrate SGI's patches, we actually made the code better by doing so -- even for the non-SGI cases. We had to add a certain amount of abstraction and clean up a lot of our per-CPU data handling, but the result was a more robust source base that was cleaner and better designed.
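For readers who haven't run into the term, "per-CPU data" is the pattern where each processor keeps its own private copy of a variable, so the hot path never needs a shared lock. The sketch below is a generic, hypothetical illustration of that pattern using the kernel's per-CPU helpers (the demo_* names are invented); it is not code from the SGI work, just an example of the kind of abstraction being described.

```c
/*
 * Hypothetical illustration of per-CPU data (not SGI's code).
 * Each CPU gets its own copy of the counter, so the fast path never takes
 * a shared lock; a summary pass walks every CPU's copy.
 */
#include <linux/percpu.h>
#include <linux/cpumask.h>

static DEFINE_PER_CPU(unsigned long, demo_hits);   /* one counter per CPU */

void demo_record_hit(void)
{
	this_cpu_inc(demo_hits);        /* touches only this CPU's copy */
}

unsigned long demo_total_hits(void)
{
	unsigned long sum = 0;
	int cpu;

	for_each_possible_cpu(cpu)      /* slow path: read every CPU's copy */
		sum += per_cpu(demo_hits, cpu);

	return sum;
}
```

The same structure works on a dual-core phone and on a many-node server, which is the point about the cleanup benefiting everyone, not just the biggest machines.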

We currently have a fairly unified kernel that scales from cellphones to supercomputers, and I've grown convinced that unification has actually been one of our greatest strengths: It forces us to do things right, and the different needs for different platforms tend to have a fair amount of commonalities in the end.

For example, the support for SMP came from server needs, but then it became common on the desktop too, and now you can't even find a cellphone without multiple CPUs. The fact that we had good core SMP support ended up being important even for the small machines, and splitting up the code base for architectural reasons (which could have made sense 15 years ago) would have been a huge mistake.

A lot of the power management code and concerns came from laptops and cellphones, but that work now applies to big machines too, where it turns out power use does matter. While there are often things that end up specific to one platform, it's not always obvious, and most of the time those very specific things are in the end small details that can be abstracted away (and compiled away so that they don't even affect platforms that don't want or need them).

What technology, package, or core functionality within the Linux kernel or Linux distributions has lasted far longer than you thought it would or should?

Heh. There are a number of drivers that we still build and support but that I sincerely hope nobody actually uses anymore. I'm not even convinced that they still work -- do people really still use floppies? I think (and hope) the floppy driver is mainly used in virtualized environments, not on actual physical hardware.

But on the whole, legacy code doesn't tend to be all that painful to maintain, so we keep it around.

What are your hopes and fears for the internet of things?

I hope the hype dies down, and we can concentrate on what it actually does ;)

But seriously, I hope the interoperability issues get solved, and I really, really hope there will be more small smart devices that don't put everything in the cloud. Right now, not only do things stop working if the internet connection goes down, but we've already had too many cases of devices that simply stopped working because the cloud service was turned off.

Yes, yes, it's called the internet of things, but even so …. Some of those things take it unnecessarily far.

There has been talk over the past year about self-protection technologies for the kernel. What would that look like?

The most powerful ones are actually hardware assists.

Everybody knows about NX by now (the ability to mark pages nonexecutable), which catches a certain class of attacks. But we also now have SMAP and SMEP (supervisor mode access/execute protection), which close another set of holes, and make it much harder to fool the kernel into accessing (or executing) user space data unintentionally due to a bug.

SMAP in particular was something we Linux kernel people asked for, because it's fairly expensive to do in software, but hardware already has all the required information in the TLB (translation lookaside buffer) when it does the access. Yes, it takes a long time for new CPU features to actually make it to consumers, and SMAP only exists in the most recent Intel CPUs, but it's very much an example of technologies that can protect against a whole class of attacks.
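To make that class of bug concrete, here is a minimal, hypothetical sketch of the pattern SMAP is aimed at: a kernel handler that receives a pointer from user space. The demo_* names are invented for illustration; copy_from_user() is the kernel's real accessor, which performs the copy inside an explicitly opened window (stac/clac on x86) so SMAP stays armed everywhere else.

```c
/* Hypothetical ioctl-style handler; "arg" is a pointer supplied by user space. */
#include <linux/uaccess.h>   /* copy_from_user() */
#include <linux/errno.h>

struct demo_req {
	int flags;
};

long demo_handle_ioctl(void __user *arg)
{
	struct demo_req req;

	/*
	 * The buggy pattern SMAP catches: dereferencing the user pointer
	 * directly from kernel mode, e.g.
	 *
	 *     struct demo_req *bad = (struct demo_req __force *)arg;
	 *     return bad->flags;
	 *
	 * With SMAP enabled, the CPU faults on that access instead of
	 * quietly reading attacker-controlled memory.
	 */

	/* The correct pattern: validate and copy via copy_from_user(). */
	if (copy_from_user(&req, arg, sizeof(req)))
		return -EFAULT;

	return req.flags;
}
```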

There are other tools we use too, both static analysis and having compilers generate extra checking code. The latter can be pretty expensive (depending on which check and what load people run), but we tend to have config options for the expensive cases, so you can run a safer but slower/bigger kernel that does more self-checking.
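As a small user-space illustration of "compilers generate extra checking code," the same idea is easy to see with GCC's or Clang's undefined-behavior sanitizer; the kernel's equivalents sit behind config options (CONFIG_UBSAN and CONFIG_KASAN are examples, though none are named in the interview), which is exactly why the cost can be opted into or out of. The file name below is invented.

```c
/* ubsan_demo.c -- build with:  cc -fsanitize=undefined -O2 ubsan_demo.c
 * The sanitizer inserts a runtime check before the addition and prints a
 * diagnostic when the signed overflow happens, instead of letting it pass
 * silently. Those extra checks cost size and speed, which is the trade-off
 * the kernel config options control.
 */
#include <limits.h>
#include <stdio.h>

int main(int argc, char **argv)
{
	int x = INT_MAX;

	(void)argv;
	x += argc;             /* argc is 1 at runtime: signed overflow, reported by UBSAN */
	printf("%d\n", x);
	return 0;
}
```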

The biggest challenge is making sure the systems are always updated to the latest kernel. (What’s the point of adding security features if the servers aren’t updating to the latest kernel, right?) Are there promising trends toward automatic updates for kernels, or is that purely an upstream problem (Red Hat, Ubuntu)?

Honestly, updating is always going to lag behind. We do things to make it easier: The whole "no regressions" rule is partly there so that people know upgrading should always be safe and won't break what they are doing, but there are also people working on live patching so that you can fix bugs on the fly in some cases.

But one of the reasons for a lot of the hardening work is to hopefully make updating less critical, in that even if there is a bug that would be a security hole, the hardening efforts mitigate it to the point where it's not an acute security issue.

We're definitely not there yet. And there is no such thing as "absolute" security, so nothing will ever be 100 percent. But people are working on making things much better in practice.

The biggest problem is often not even directly technical, but about how people use the technology. It can be very frustrating to see people use old, and quite likely insecure, kernel versions for all the wrong reasons -- because the vendor made it hard or impossible to update sanely. Maybe they used some piece of hardware with a driver that wasn't open source or where the driver wasn't upstream, but some vendor abomination that isn't getting updates. In those kinds of situations upgrading is suddenly much harder. That used to be a big issue in the embedded space -- it's getting better, but it's still an issue.

The "not upgradable and has known security holes" is one of the nightmares people have about IoT. Some of those devices really don't have an upgrade model, and the vendor very much doesn't even care. We used to see that in various home routers, etc., (and "enterprise" routers too, for that matter), but with IoT it's obviously spreading to a lot more equipment in your house.
