Advanced Computing in the Age of AI | Thursday, May 30, 2024

Hardware Matters, But Software Also Drives Switch Choices 

Breaking into the content caching and delivery market against industry juggernaut Akamai Technologies is about as easy as breaking into the datacenter networking space against Cisco Systems. But, CDN upstart Fastly and networking upstart Arista Networks are teaming up to do just that.

Arista may still be in the Others category when market share numbers for revenue or switch port counts are added up by the market researchers each quarter, but the company's influence is much greater than you might expect based on this data. Arista was founded in 2004 with the express goal of adding server smarts to low-latency 10 Gb/sec Ethernet switches based on merchant silicon from the likes of Fulcrum Microsystems (now part of Intel) and Broadcom.

Serial entrepreneur Andy Bechtolsheim is a co-founder of Sun Microsystems and of Granite Systems, a maker of Gigabit Ethernet switches that was bought by Cisco in 1996 for $220 million. Granite laid the foundations of Cisco's move from routers into switches. Bechtolsheim is chief development officer and chairman at Arista, and co-founded the company with David Cheriton, who designed the Granite ASICs and stayed on at Cisco to work on the Catalyst switch line. Ken Duda, another co-founder, is a software expert who also hails from Granite and who created the company's Extensible Operating System, or EOS, a variant of Linux that has been tweaked and hardened to be an application platform that rides atop the Arista switches. Arista also tapped Jayshree Ullal, who used to run Cisco's Catalyst 4500 and 6500 and Nexus 7000 high-end core switches, to be president and CEO when Bechtolsheim came on board full time in 2008.

That was a long time ago, of course, but Arista continues to grow, and was bragging this week that it has shipped over 2 million Ethernet ports since it first started shipping products nearly six years ago. Most of those ports are running at 10 Gb/sec, and that is against a market that was consuming about 3 million ports of switching gear each quarter in 2013. Arista is privately held, so it doesn't provide revenue figures or specific port counts, but Martin Hull, senior product line manager at the company, did give a little insight into the ramp of various speeds of Ethernet. First, the port count installations are "certainly accelerating," he says.

Fastly is part of that ramp, in fact. And so is its first customer.

The content delivery network was founded by Artur Bergman, who was chief technology officer at Wikia, the offshoot of Wikipedia that hosts over 400,000 separate communities focused on lifestyle, entertainment, and video game topics. Wikia currently has nearly 33 million pages, and handles something on the order of 50 million objects with big spikes in traffic. One of the "gorilla CDNs" was caching the Wikia site – Bergman did not want to name names – and because Wikia is supported by advertising and the performance was poor for this sort of workload, this was a problem. So two and a half years ago, after Bergman quipped that he wanted to start a CDN that would solve Wikia's caching woes, Wikia said it would use his cache if he could build it. And so Bergman started a company and took the open source Varnish cache and used it as the basis of the Fastly CDN service.

The important thing to know as far as switching was concerned is that Wikia was an early adopter of Arista switches, and after that experience, when Bergman set up his own company, he was predisposed to stick with Arista switches. (This is the same way, incidentally, that Sun Microsystems came to rule the early Internet era alongside Oracle. Just about every big dot-com was founded with the pairing of Sun hardware and Solaris operating system and Oracle's database.)

"EOS is an environment that developers like," explains Bergman to EnterpriseTech. "We are software engineers, and it is the kind of environment that we felt good with and that we can extend and control in ways that we want." And if you don't think that network hardware is becoming commoditized, Bergman and just about all of the upstart switch makers using merchant silicon from Intel, Broadcom, or a few others will disagree with you. These companies as well as open source network operating system makers such as Cumulus Networks want to do to switches what Windows and Linux did for X86-based servers: break the operating system free from the hardware and allow for multiple operating systems to be run on the switches. Arista uses commodity network ASICs, but has not yet allowed other network operating systems to run on its hardware

"We looked at the landscape from the software point of view," Bergman continues, talking about the selection process when Fastly started building out its infrastructure. "The hardware is pretty similar between all of the vendors, and at the time, we did not find a switch vendor that had the software infrastructure that we needed. Calling it a bake off might be a little bit too formal, but we did look around."

The niche that Fastly has carved out for itself is to serve up small files and live streaming content very rapidly. The company's stats show that it can get the first byte moving off its network to a user in under 400 microseconds 95 percent of the time. But the Fastly Caching Engine is designed not only to serve up data very quickly, but also to clear a cache very quickly when that data changes. So, for instance, it can take up to seven minutes to clear the cache at a big CDN, but Fastly can do it in 150 milliseconds on average.

The big content delivery networks, explains Bergman, built their infrastructure years ago, and it was designed for streaming out large files off of disk drives in the servers in their networks. This back-end architecture is terrible at serving up lots of small files. Many of them still have Gigabit Ethernet switches lurking back there. So, Fastly crammed its servers with flash-based solid state disks and connected the machines together using the low-latency 10 Gb/sec Arista 7150 switches. These are based on Intel's "Alta" FM6000 series of ASICs, and are arguably the switches that put Arista over the top in the markets where low latency is as important as high bandwidth.

In certain locations where having a high port count is more important than low latency, Fastly is deploying Arista's 7300 series switches, which are based on Broadcom's "Trident-II" ASICs. Arista announced these "spline" switches last November, and the interesting thing about them is that they create a single-tier network, as compared to a two-tier leaf-spine network, in which up to 2,000 servers are all linked to each other by a single hop. While the latency on the 7300 switches is higher and the spline setup cannot scale to 100,000 servers as the leaf-spine setup can, it takes one third the hops to link one server to another with the spline. The spline setup also costs about a third per 10 Gb/sec port.
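To see where the "one third the hops" figure comes from, here is a minimal Python sketch that counts the switches on the shortest path between two servers. The two toy topologies and every node name in them are hypothetical, chosen only to illustrate the single-tier and two-tier layouts; they do not describe Fastly's or Arista's actual networks.

```python
from collections import deque


def switch_hops(links, src, dst):
    """Count the switches on the shortest path between two servers."""
    # Build an undirected adjacency map from the list of cables.
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    # Breadth-first search finds the path with the fewest links.
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # A path of N links passes through N - 1 intermediate switches.
            return dist[node] - 1
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    raise ValueError("no path between the two servers")


# Hypothetical single-tier "spline": both servers plug into the same switch.
spline_links = [("server_a", "spline_1"), ("server_b", "spline_1")]

# Hypothetical two-tier leaf-spine: servers sit in different racks, so
# traffic goes leaf -> spine -> leaf.
leaf_spine_links = [
    ("server_a", "leaf_1"),
    ("server_b", "leaf_2"),
    ("leaf_1", "spine_1"),
    ("leaf_2", "spine_1"),
]

spline_hops = switch_hops(spline_links, "server_a", "server_b")
leaf_spine_hops = switch_hops(leaf_spine_links, "server_a", "server_b")
print(spline_hops, leaf_spine_hops)
```

In this toy model the spline path crosses one switch and the leaf-spine path crosses three, which matches the one-to-three ratio cited above for servers in different racks.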

A network operating system that allows network applications to run on the switch, making the network programmable in a way that it has not been historically, was a defining feature of Arista switches from day one. And the company has been very aggressive about driving down the cost of 10 Gb/sec ports, too. Now that 10 Gb/sec ports are widely available on X86 servers, it is no wonder that the port counts are on the rise, driving switch revenues across all of the players. In the third quarter, over 3 million 10 Gb/sec ports shipped and revenues rose 22.9 percent, according to market watcher IDC. That said, during Q3 over 55 million Gigabit Ethernet ports shipped and revenues were up 6.5 percent sequentially, so this market is not quite dead yet.

Fastly has its content delivery network in 18 co-location datacenters around the world, with eight of them in the United States and two more on the way shortly. The networks have no oversubscription and all told, Fastly has a couple thousand 10 Gb/sec Ethernet ports across its point of presence sites. The company, which has raised $14 million in two rounds of funding from Battery Ventures, O'Reilly Alpha Tech Ventures, and August Capital, has 70 employees and over 150 relatively large customers with what Bergman says is a long tail of smaller customers that the company doesn't count in that number. Twitter uses Fastly to cache its content, as does Wikia still, and Pinterest uses it to cache just the image counter portion of its application but not the images themselves. Bergman did not give out revenue figures, but says that Fastly is more than doubling its size and sales every year.

This is the kind of customer that an upstart like Arista attracts, but don't get the wrong idea. As EnterpriseTech has explained before, Arista's switches are at the heart of the largest single network in the world, which has 200,000 servers all linked on a leaf-spine network. Arista is not at liberty to say who that customer is, but it has 10 Gb/sec leaf switches in the racks linked to 40 Gb/sec spine switches, and then there is a spine of spines also based on 40 Gb/sec switches to push up beyond the 100,000-port limit of the leaf-spine that Arista has designed its gear to support.

Arista has over 2,000 customers, and its early gear was enthusiastically adopted by the financial services sector because of the compactness of the switches, low power usage, and low latency compared to gear from Cisco and Juniper Networks at the time. Arista gear is popular among public cloud operators and service providers, Web 2.0 application providers, and supercomputing facilities – BP is using Arista top of rack and core switches in its latest cluster – and generally, customers are using 10 Gb/sec downlinks to the server and 40 Gb/sec uplinks to the next tier up of switching, according to Hull. Arista is seeing particularly good uptake in the oil and gas industry as well as in life sciences and national labs on the HPC front. The company's largest enterprise customers tend to have between 150 and 200 racks, or somewhere between 6,000 and 8,000 server nodes.

Interestingly, while the bulk of Arista's shipments are still for 10 Gb/sec switches, Hull says "40 Gb/sec is definitely taking off and even 100 Gb/sec is ramping well."

The real question is when 40 Gb/sec will become normal on the server end. "It will happen," says Hull, "but when is the question."

What we can say for sure is that when 40 Gb/sec ports are needed – say, among customers who have to reduce the serialization delay, or the time it takes to get data out of the network card and onto the LAN cable, and who are not all that concerned about port-to-port latency – that will inevitably start driving sales of 100 Gb/sec switches in the spine or aggregation layers of the network.
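Serialization delay is simple arithmetic: the number of bits in a frame divided by the line rate. The short Python sketch below works it out for the three Ethernet speeds discussed here. The 1,500-byte frame size is an assumption made for illustration, and the function name is hypothetical; preamble and inter-frame gap overhead are ignored.

```python
def serialization_delay_us(frame_bytes, link_gbps):
    """Microseconds needed to clock one frame onto the wire."""
    bits = frame_bytes * 8
    # 1 Gb/sec is 1,000 bits per microsecond.
    return bits / (link_gbps * 1_000)


FRAME_BYTES = 1500  # assumed frame size, not a figure from Arista or Fastly

for speed in (10, 40, 100):
    delay = serialization_delay_us(FRAME_BYTES, speed)
    print(f"{speed} Gb/sec: {delay:.2f} microseconds")
```

Under that assumption, a frame takes about 1.2 microseconds to leave a 10 Gb/sec port, 0.3 microseconds at 40 Gb/sec, and 0.12 microseconds at 100 Gb/sec, so the delay shrinks in direct proportion to the port speed.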