Year Four: “I” becomes “We”
2015 was a big year. It was the year ZeroTier graduated from side project incubation.
ZeroTier began in early 2012 as a personal open source project. It arose from two pain points. The first was social and political. Like many I thought the Internet was becoming too centralized, and I figured the difficulty of directly connecting endpoints must be a contributing factor. If things can’t connect directly they have to make use of intermediaries, and this creates more niches for middle-boxes than perhaps ought to exist. The second pain point was professional. In my day job I struggled with the pain of networking: tunnel spaghetti, an alphabet soup of needlessly narrow hard-to-deploy protocols, terrible user (and developer) experience at every level of the stack, and endless ugly hacks to get around badly thought out physical topologies.
I wanted something that made networks as easy to create and join as IRC channels or chat rooms. I’d been entertaining some of my own ideas about peer to peer networking since 2009 so I figured I’d take a serious crack at the problem of simple direct end-to-end networking.
The design I settled on is one of pragmatically minimal centralization, trading just a bit of decentralization for huge gains in speed and user experience. “Decentralize until it hurts, then centralize until it works.” Implementing and testing, the design seemed to work almost shockingly well. A bit of math told me it would scale. I kept going.
The first usable alpha was pushed to GitHub in July 2013, and the first packaged beta binaries for end-user use were released for Macintosh and Linux in February 2014.
The original pitch I made to myself when I decided to create this was to “do a Dropbox” on VPNs. Dropbox achieved amazing success by solving an un-trendy un-sexy problem everyone thought was already solved: file sync. “But can’t I just send an e-mail attachment or upload my file to a web site?” Networking seemed like a similar domain. The problem of virtual networking is already solved by a lot of awful stuff everyone hates.
After going live, ZeroTier got a lot of positive attention. Since I started tracking the number of active devices online in early 2014, it’s had sustained ~10% month-over-month growth in user base.
In early 2015 the product went out and found its own funding. I’d been considering seed financing or crowdfunding, but two users approached me first. “I” is now “we,” the product has received quite a bit of polish, and the vision is a lot larger. What began as a “better VPN” has evolved into a vision of universal software defined networking in every setting and on every device.
Think of Earth as a data center and ZeroTier as a smart switch with cryptographic authentication, VLAN capability, and soon many other features. That’s what we’re building. 2015 was about getting it ready for a larger stage and converting the first few large-scale customers. 2016 will be about enterprise networking, applications, and beyond.
The Internet is only nominally peer to peer. The way we’ve deployed it makes it amazingly hostile to direct connectivity. This is partly a result of security needs. Since so much software is so broken security-wise, an ugly ham-fisted hack called a firewall has become indispensable standard practice. The second reason for the Internet’s hostility to direct communication is a kind of historical feedback loop. Since most Internet applications so far have been client-server at the protocol level, we’ve invested a lot of time and effort into making the Internet reliable for endpoints to act as clients for servers that live in “the cloud.” Comparatively little attention has been paid to the reliability or ease of endpoint-to-endpoint communication.
Developing a reliable direct networking solution that works in today’s real world is probably comparable in difficulty to making a distributed database that won’t get called “maybe.” Consider what stateful NATs might do if they run out of IP:port mappings for a particular endpoint on their network (or globally). It turns out that some implement LIFO behavior, forgetting the most recently learned mappings. Others implement FIFO behavior, expiring old ones. Finally at least a few seem to just forget them randomly and completely irrespective of whether they are being actively used for traffic. Router/gateway manufacturers simply haven’t put much thought into the reliability of these systems for this use case. After all everyone just uses the Internet to make short lived connections to web servers on port 80 or 443, right?
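To make the three eviction behaviors concrete, here is a minimal toy model of a fixed-capacity NAT mapping table. It is a sketch under stated assumptions, not a description of any real router: the `NatTable` class, its `learn` method, and the policy names are all hypothetical. Note that none of the three policies checks whether a mapping is still carrying traffic, which is exactly what makes them painful for long-lived peer to peer paths.

```python
import random
from collections import OrderedDict


class NatTable:
    """Toy fixed-capacity NAT mapping table (hypothetical model, not a real router)."""

    def __init__(self, capacity, policy, seed=0):
        self.capacity = capacity
        self.policy = policy            # "fifo", "lifo", or "random"
        self.mappings = OrderedDict()   # insertion order = order in which mappings were learned
        self.rng = random.Random(seed)  # seeded so the random policy is repeatable

    def learn(self, internal, external_port):
        """Record a mapping; if the table is full, evict one first and return its key."""
        evicted = None
        if internal not in self.mappings and len(self.mappings) >= self.capacity:
            if self.policy == "fifo":
                evicted, _ = self.mappings.popitem(last=False)  # forget the oldest mapping
            elif self.policy == "lifo":
                evicted, _ = self.mappings.popitem(last=True)   # forget the newest mapping
            else:
                evicted = self.rng.choice(list(self.mappings))  # forget an arbitrary mapping
                del self.mappings[evicted]
        self.mappings[internal] = external_port
        return evicted
```

With a capacity of two and mappings learned for "a" then "b", learning "c" evicts "a" under FIFO and "b" under LIFO, while the random policy may drop either one regardless of which is in active use.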
Building a direct networking layer means dealing with gremlins who constantly run around yanking network cables. Now throw in mobility, battery life concerns, ISP traffic shaping, and bandwidth quotas. Most people who try to do this run away screaming. This is a problem with a body count.
Here at ZeroTier we’ve taken a solemn oath to field a peer to peer network that can work reliably enough across both stationary and mobile devices (and in the cloud, etc.) to be trusted for things like point of sale networks and self-driving vehicles.
What we have so far is pretty good, but it’s not that good. We still see edge case flakiness, especially on less-standard networks. Getting that last few percent of reliability is going to take bigger guns. It will take data, analytics, and a more methodical and scientific approach.
The first step is for us to start turning on something called circuit testing. We quietly introduced it as a feature in version 1.1.0, and it allows the network controllers behind networks you’ve joined (making it opt-in) to generate test traffic between members on the same network. This traffic is small and invisible and doesn’t interfere with ordinary network operations, and it reports back some very basic statistics about what is talking to what through what network paths and how often it’s working (see the link for details).
One thing we’ve discovered is that there’s only so much we can do with scripted, synthetic scenarios and tests. The real world is full of network equipment that behaves strangely only under load or only when certain conditions arise. It’s also full of weirdly configured networks. Our customers sometimes encounter problems in the field because their network configurations are so perverse we’d never have imagined building such a thing as a test scenario. (Example: we’ve discovered that stacking multiple different brands of NAT routers is common. This causes networking mandelbugs, but hey, the web still works so the Internet is still up.) Circuit testing will provide real data that will allow us to home in on problems and edge cases that are unanticipated or hard to reproduce in the lab.
The second thing circuit testing will let us do is to provide SRE (site reliability engineering) as a service for distributed networks. In the next few months we’re going to be unveiling this as a product offering. If you want to use a ZeroTier network in a mission critical setting, we can monitor it and react to problems before you notice them.
These two things work together. When we address problems in monitored networks we’ll be taking what we learn and building it into the product. Solve, improve, repeat. Ultimately it’s not possible to exceed the reliability of the underlying physical network, but we do think it’s possible to converge with it to within tiny fractions of a percent. That is our goal.
We’re aware that a few users might have privacy concerns about this, but we’ve already been quite up front about the fact that we are not an anonymity solution like Tor. ZeroTier has encryption and using it doesn’t require you to make an account on zerotier.com, but the protocol doesn’t conceal network meta-data. Nothing less than Tor-like onion routing can achieve that, and onion routing comes with speed penalties and other issues.
Circuit testing only gathers meta-data, and only network controllers for networks you have joined are allowed to do it. As with our core design, we adhere to a philosophy of pragmatic multi-objective optimization. It’s okay to sacrifice just a little bit of one thing for a lot of another. We think making ZeroTier absolutely rock solid is worth a little bit of already easy to obtain public network information. The alternative is to never adequately solve the problem of reliable peer to peer networking, and if that isn’t done we are guaranteed a future in which all traffic is man-in-the-middle’d by design.
Rules and Policies
Right now ZeroTier virtual networks emulate flat Ethernet. It’s possible to make them private and allow only approved devices (enforced with automatically issued certificates) and to set minimal policies around what types of traffic they can carry (IPv4, IPv6, etc.). That’s good enough for a huge array of use cases, but many enterprise users and special purpose use cases require more control.
That’s why in 2016 we’re going to be adding a rules engine to ZeroTier. It will allow network-wide traffic flow rules to be set in a manner similar to a simple OpenFlow-enabled smart switch or “iptables.” You’ll be able to allow only certain IP ports or protocols, prohibit lateral traffic if your application actually is client-server, and so on.
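As a rough illustration of what first-match flow rules of this kind could look like, here is a small sketch. The rules engine does not exist yet, so everything here is an assumption made for the example: the `Packet` and `Rule` types, their fields, and the `evaluate` function are hypothetical and say nothing about the eventual ZeroTier rule format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    protocol: str        # e.g. "tcp" or "udp"
    dst_port: int
    src_is_server: bool  # hypothetical flag: sender is a designated server
    dst_is_server: bool  # hypothetical flag: receiver is a designated server


@dataclass(frozen=True)
class Rule:
    action: str                     # "accept" or "drop"
    protocol: Optional[str] = None  # None matches any protocol
    dst_port: Optional[int] = None  # None matches any port
    lateral: Optional[bool] = None  # True matches traffic between two non-servers

    def matches(self, pkt: Packet) -> bool:
        is_lateral = not (pkt.src_is_server or pkt.dst_is_server)
        return ((self.protocol is None or self.protocol == pkt.protocol)
                and (self.dst_port is None or self.dst_port == pkt.dst_port)
                and (self.lateral is None or self.lateral == is_lateral))


def evaluate(rules, pkt, default="drop"):
    """Return the action of the first matching rule, or the default if none match."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.action
    return default


# Example policy: no client-to-client traffic, and only HTTPS to servers.
RULES = [
    Rule("drop", lateral=True),
    Rule("accept", protocol="tcp", dst_port=443),
]
```

Under this example policy a client talking to a server on TCP port 443 is accepted, the same packet between two clients is dropped as lateral traffic, and anything else falls through to the default drop.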
Rules would be enforced by all members of a network, so compromise of a single device wouldn’t permit them to be broken. To break the rules would require global compromise of a large number of hosts or of the network controller. If that happens you have bigger problems than network-level rule enforcement.
You’ll also be able to conduct security monitoring in your own networks by setting up rules that “tee” traffic of interest to observers. For example: by watching all TCP SYN packets an observer could see all TCP connections on a network without having to actually back-haul and man-in-the-middle all payload traffic. At least some of the security monitoring benefits of centralized choke points can be achieved without centralized choke points.
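The match condition behind the TCP SYN example is cheap to evaluate. The sketch below checks a raw IPv4 packet for a TCP segment with the SYN flag set, using the standard IPv4 and TCP header layouts. The function names are hypothetical, and the check is deliberately simplified: it ignores IPv6 and fragmented packets, and it is not how ZeroTier itself matches traffic.

```python
import struct

TCP_PROTOCOL = 6  # IANA protocol number for TCP
SYN = 0x02        # SYN bit in the TCP flags byte


def is_tcp_syn(ipv4_packet: bytes) -> bool:
    """Return True if the bytes look like an IPv4 packet carrying a TCP SYN."""
    if len(ipv4_packet) < 20:
        return False
    version = ipv4_packet[0] >> 4
    header_len = (ipv4_packet[0] & 0x0F) * 4   # IHL is counted in 32-bit words
    if version != 4 or header_len < 20 or ipv4_packet[9] != TCP_PROTOCOL:
        return False
    if len(ipv4_packet) < header_len + 14:     # need the TCP header through the flags byte
        return False
    flags = ipv4_packet[header_len + 13]       # flags are byte 13 of the TCP header
    return bool(flags & SYN)


def build_test_packet(flags: int, protocol: int = TCP_PROTOCOL) -> bytes:
    """Build a minimal 40-byte packet for testing (checksums left at zero)."""
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, protocol, 0,
                     bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
    tcp = struct.pack("!HHIIBBHHH", 40000, 443, 0, 0, 5 << 4, flags, 65535, 0, 0)
    return ip + tcp
```

An observer that receives only packets passing a test like this sees one small packet per connection attempt instead of the full payload stream.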
SDN for the Cloud and Data Centers
One of the odd beliefs we have here at ZeroTier is that VPN, SDN, and peer to peer networking are all the same problem framed in different ways. We’re evolving into an enterprise networking company because doing the crypto-hippie Internet decentralization thing well turns out to be the same as doing SDN well.
What ZeroTier does on the Internet, it can also do on the intranet. If these really are all the same problem then you can solve them all with one stack and manage them all with one interface.
We already have a number of users using ZeroTier for hybrid cloud. It’s a clear use case: add your cloud resources and your on-premise resources to the same virtual network and you have a location-independent private backplane. Some users also use it to mix and match cloud resources, allowing them to spread their infrastructure across hosts for better diversity or reduced cost.
In 2016 and beyond we plan to do more in this area. Our rules and policies work will make ZeroTier a distributed competitor to OpenFlow-enabled switches. Federation for root servers will allow on-premise hosting of ZeroTier’s “upstream” for reduced latency and better reliability during Internet outages. Finally, we’re researching various options for closing the performance gap between ZeroTier and things like VXLAN and IPsec. One would be to make ZeroTier a kernel module, while another would be to skip the kernel entirely.
Standards Based Peer to Peer Networking
Last month we somewhat quietly released a beta of Network Containers. As the name suggests, we’re initially targeting Docker, rkt, LXC, and runC containers with this technology. It lets you package a complete user-mode network stack inside the container, allowing network virtualization to be deployed without special host access. This is particularly well suited to mixed or multi-tenant container hosting infrastructures where single hosts might run containers belonging to more than one customer, department, or subsystem.
While containers are big hype these days, we weren’t thinking about these days when we built netcon. We were thinking about those days, the ones that come after these.
Our private mission statement is to “directly connect the world’s devices.” By that we mean all of them. We want to make it easy to create arbitrary network topologies joining anything with a CPU or that runs on something with a CPU.
While Network Containers has gained some attention in Linux devops circles, our larger vision for its future is in applications. In 2016 we plan to introduce the ZeroTier Application SDK for both desktop and mobile.
Network Containers will be a central part of our SDK and will allow an application to join virtual networks without kernel support or elevated permissions. Even better, you’ll be able to communicate over these networks as easily as you can communicate over the Internet. Instances of your app can communicate securely with other instances of itself or with anything else that can join a ZeroTier network using standard TCP/IP based protocols and standard network I/O calls and libraries. In most cases you won’t even have to recompile your code. Just add the SDK to your build path and add a few lines to enable it.
There are clear uses for this today like scaling, avoiding bandwidth costs, and easing interoperability. In the future we think there will be more. Thanks to Einstein, cloud back-haul introduces an unavoidable latency penalty. If your data must travel over a thousand miles to and from a data center to reach another computer in the same city, you’re adding a mandatory 20-60 milliseconds to its round trip time. On the horizon are user interface paradigms like virtual and augmented reality. Achieving the best possible immersive experience using these technologies requires latency minimization and therefore shortest path direct networking. These applications are also going to be bandwidth intensive. If it’s expensive to back-haul pictures and video today, it’s going to be even more costly and inconvenient to pay the indirect networking tax for all the users of an immersive telepresence or virtual world application. Massive companies might be able to afford it, but the requirement that all data for everything flow through the cloud imposes an unacceptable cost on small independent developers. Historically these are the ones who innovate most in emerging areas like VR.
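The latency figure can be sanity-checked with a back-of-the-envelope calculation. This sketch assumes light in optical fiber covers roughly 200 km per millisecond (about two thirds of its speed in a vacuum) and that the data center sits the stated distance away from both endpoints. The function name is hypothetical, and the result is a lower bound on propagation delay only: real paths add routing detours, queuing, and processing time.

```python
KM_PER_MILE = 1.609344
FIBER_KM_PER_MS = 200.0  # assumption: light in fiber travels about 200 km per millisecond


def backhaul_rtt_ms(miles_to_data_center: float) -> float:
    """Minimum round trip time between two nearby devices relayed via a distant data center."""
    # One direction is device -> data center -> peer, so the distance is covered twice.
    one_way_km = 2 * miles_to_data_center * KM_PER_MILE
    # The reply retraces the same path, doubling it again.
    return 2 * one_way_km / FIBER_KM_PER_MS
```

Under these assumptions a data center 1,000 miles away adds at least about 32 ms to the round trip, and one 1,500 miles away adds about 48 ms, both inside the 20-60 millisecond range quoted above.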
An Internet of Things You Actually Own
There’s not much of a gap between Network Containers and porting ZeroTier to embedded. If anything, the former was probably harder. Correctly emulating the POSIX socket API was not easy (and it’s not quite done yet).
For embedded Linux that effort often isn’t even required. It can be trivial to run ZeroTier on ARM-based Linux-powered devices. We’re already talking to the makers of IP cameras and other bandwidth-intensive devices that could benefit from a faster direct networking alternative to the “put everything in the cloud” status quo. Why should video from a baby monitor travel 1,500 miles to and from a data center to reach your phone thirty feet away in your bedroom?
Beyond speed, latency, and cost, we foresee other benefits that revolve around privacy and user control. The first age of personal computing, which lasted from the late 1970s to the late 2000s, revolved around the personalization of information processing through personal ownership of “a computer.” The grey box (or laptop) was the center of an individual’s computing world.
The Internet and the cloud (a.k.a. mainframe 2.0) have changed all that and have pushed things toward a centralized model where your devices orbit closed cloud-hosted services.
Here at ZeroTier we’ve spent a bit of time pondering what PC 2.0 might look like. One thing that’s clear is that the amount of “silicon per capita” is increasing, and any one single device is no longer the center of a person’s computing world. If computing is to become personal again, it’s going to do so through the personalization of the cloud.
What if instead of a grey box each person could own one or more private network envelopes into which they could place all their devices? This would be ideally suited to devices with a more open design, devices that are designed for you to control rather than being tethered to an opaque monolith in the cloud.
Once we release our SDK we will be positioned to start realizing this. It will become possible to embed ZeroTier in a device and also in an app for accessing that device. From there we will be exploring ways of making network virtualization even easier to use, allowing non-tech-savvy users to control their own network boundaries intuitively.
Standardizing the ZeroTier Protocol
“But isn’t ZeroTier itself a closed silo?”
Our software and protocols are open. We plan to keep some enterprise software and SaaS offerings private but in terms of the core endpoint connectivity code there is not much up our sleeves. But our “pragmatically minimal centralization” model does get some push-back, and as we move forward we’d like to take measures to address some of our users’ concerns.
The core ZeroTier protocol has been fairly stable for a while. Once we address some lingering issues and are even more confident in its stability we plan to write it up as an RFC. That will allow third party implementations and interoperability, though we would still maintain its reference implementation.
That leaves the root servers. We’re not quite sure exactly what we’re going to do there yet, but we are exploring options such as the creation of an independent non-profit entity or consortium to manage them like the root name servers. We don’t make money off them directly, so we don’t think this would impact our revenue plans very much. If anything it would likely help us convert more customers by providing stronger assurances of the network’s long term stability and viability. We’ll just have to take care to balance this against security, performance, stability, and innovation.
In the long run we’d like the ZeroTier network virtualization protocol to be as much a part of the Internet as DNS, so maybe following that model is best.
Staying Up to Date and Supporting the Project
The best way to stay up to date with ZeroTier is to follow us on Twitter and subscribe to this blog’s RSS feed. You can also follow the main ZeroTierOne repository on GitHub and join our new community. It’s just getting off the ground, but we hope it will soon evolve into an active place where users and interested parties can discuss issues, improvements, and future directions.
If you want to support our work there are several things you can do. The simplest is to use ZeroTier, report issues and provide feedback, and tell other people about it. If you want to support it financially, create an account on our hosted network controller interface, where you can subscribe to a paid network service. You’ll also get an e-mail announcement when our new enterprise offerings are available.