Hedge Fund Scraps DIY Lustre Cluster for Terascala
When high performance is needed, some enterprises have no choice but to get out there on the bleeding edge and experiment with new technology and put it into production use. This is precisely the situation that Tradeworx, a hedge fund and proprietary trader, found itself in when it needed a high-bandwidth file system for developers to create and simulate financial models.
If you haven't heard of Tradeworx, the 50-person company based in Red Bank, New Jersey, is influential in the financial services business in a manner that is disproportional to its modest size.
For instance, the company was tapped by the US Securities and Exchange Commission for the fine-grained stock trading data it captures every day. Tradeworx helped the government agency analyze the "flash crash" on May 6, 2010, when the Dow Jones Industrial Average lost more than 600 points in several minutes and recovered those losses a few minutes later.
Tradeworx has also provided some of the data feeds into the new MIDAS system launched by the SEC to compile the tapes from thirteen national equity exchanges and proprietary feeds outside of the exchanges. (MIDAS collects nearly 1 billion records from these feeds and timestamps them down to the microsecond level so the SEC can figure out who did what when, analyzing as much as 100 billion records at a time over a span of six months to a year. And EnterpriseTech will be hunting down the architecture of this machine for a future story, rest assured.)
The dozen quantitative analysts who work at Tradeworx do similar work each day, combing through the ups and downs of the market with the goal of trying to figure out algorithms to outsmart it. This is what hedge funds and high frequency traders do for a living, of course, and it is a never-ending battle. The algorithms are changing all the time because anything one of them does inevitably affects the market as a whole; then everyone reacts, starting the game all over again. (Jacob Loveless, CEO at Lucera, explained this cat and mouse game to EnterpriseTech a month ago after that high frequency trading cloud was launched.)
The developers creating trading strategies at Tradeworx don't have to build their own systems to do market models. A sister company, called Thesys Technologies, does that for them. Tradeworx set up Thesys not just to be its own systems company, but to build trading platforms for its competitors in the hedge fund market. Thesys sells access to the Tradeworx trading platform through an API, and hedge funds can install their own hardware to get market data feeds and trade. Alternatively, they can hire Thesys to design the systems to do the trading and either buy or lease the equipment to do so.
The developers at Tradeworx who are building market models and trading strategies run various kinds of Monte Carlo pricing and risk analysis simulations. "Any kind of strategy you can think of, I am sure they are testing it," says Scott Kornblum, managing director at Thesys.
Several years ago, the techies at Thesys built the quants a homegrown Lustre parallel file system so they could grab market feeds and run very fast simulations against that data. Lustre was chosen because of the performance it offered over other storage options thanks to its parallel I/O access. But the do-it-yourself approach that Thesys took, in part because of the lack of maturity of commercialized Lustre solutions at the time, involved some risk.
"It didn't work out well at all," admits Kornblum. "We ran with it for a couple of years, and it was a support nightmare. We didn't build our Lustre system with enough redundancies. Something would fail, and then the whole house of cards would come down. One disk would fail, and that would cause a whole server to fail, which would take the Lustre file system down. It would take hours for us to get it back online."
This was a research and development system, not a production system, so the intent was to experiment to learn how to use Lustre. But still, the unreliability of the homegrown Lustre setup did not make Thesys or Lustre look good. And so Thesys approached Terascala, one of a handful of companies that have sprung up to offer commercial-grade Lustre support embedded in their storage servers.
The algorithm development cluster that the quants work from runs Linux, but it is not particularly huge, with fifteen nodes and over 150 cores. The Terascala appliance that feeds the cluster data has about 100 TB of capacity, and about three-quarters of a rack is full of object storage servers at the moment, according to Kornblum. The great thing about Lustre is that if you need to add more capacity or more bandwidth, you just add more object storage servers to the setup, which is one of the reasons why Thesys chose Lustre for the quants at Tradeworx to play with in the first place.
Capacity is not a huge issue given the textual nature of market data. The market tapes from all of the stock exchanges take up only about 50 GB per day, according to Kornblum. With the markets open an average of 252 days per year, that's only 12.6 TB of data for a full year.
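The capacity arithmetic can be checked with a few lines of Python. This is only a back-of-the-envelope sketch using the figures Kornblum cites (50 GB per day, 252 trading days, a 100 TB appliance). The function and constant names are made up for illustration, decimal units (1 TB = 1,000 GB) are an assumption, and the headroom estimate assumes the appliance stores nothing but market tapes, which is almost certainly not the case in practice.

```python
# Figures quoted in the article; names are illustrative, not from Thesys.
GB_PER_DAY = 50                 # market tape volume per trading day
TRADING_DAYS_PER_YEAR = 252     # average number of days markets are open
APPLIANCE_CAPACITY_TB = 100     # approximate Terascala appliance capacity
GB_PER_TB = 1000                # assumption: decimal (storage-vendor) units


def yearly_tape_tb(gb_per_day=GB_PER_DAY, trading_days=TRADING_DAYS_PER_YEAR):
    """Return one year of market tape data in terabytes."""
    return gb_per_day * trading_days / GB_PER_TB


def years_of_headroom(capacity_tb=APPLIANCE_CAPACITY_TB):
    """Return how many years of tapes fit, assuming nothing else is stored."""
    return capacity_tb / yearly_tape_tb()


if __name__ == "__main__":
    print(f"Tape data per year: {yearly_tape_tb():.1f} TB")
    print(f"Years of tapes in {APPLIANCE_CAPACITY_TB} TB: {years_of_headroom():.1f}")
```

Under those assumptions, a year of tapes comes to 12.6 TB, so the 100 TB appliance could hold a little under eight years of raw market data before simulation output and other files are counted.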
In fact, it took some "arm twisting," as Kornblum put it, to have the market tape data put on the Lustre system for production as well as development workloads. But the fact that this data is stored on Lustre shows that the parallel file system is not just a scratch pad for supercomputing clusters anymore and is reliable enough to be used for long-term as well as high-speed storage.
Incidentally, the real-time trading systems at Tradeworx do not run on this Lustre cluster, in part because of the initial reputation that the homegrown machine had and in part because real-time trading is not heavily dependent on disk I/O, says Kornblum. For those real-time trading systems, Thesys has chosen Isilon NAS filers running the NFS file system. But there is talk of using the Terascala appliance in more production workloads because it has higher performance and better reliability than the homegrown Lustre cluster did.
The Terascala appliances use Dell PowerEdge R620 servers as metadata storage servers and then either Dell PowerVault disk enclosures or NetApp E-Series disk arrays for object storage servers. (Tradeworx is using the NetApp E4500 and E2600 arrays.) Terascala supports Lustre 2.1.5 today on this iron, which is the current stable release of the file system software, and Peter Piela, vice president of engineering, says that the company is targeting support for the new Lustre 2.5 release by the end of the year.
That Lustre 2.5 update is significant because it has API hooks in it that allow for hierarchical storage managers to reach into Lustre and move data between it and other disk and tape storage. This is distinct from the Intelligent Storage Bridge 1.0 software that Terascala has cooked up, which works in conjunction with workload managers such as Adaptive Computing's Moab, Univa's Grid Engine, and IBM's Platform LSF to stage data from cheap NAS storage to the Lustre file system as workloads require.
The largest Lustre system that Terascala has deployed to date is a 3 PB system that feeds the "HiPerGator" system at the University of Florida. This machine, which was installed in May, has 16,384 Opteron cores and is rated at 150 teraflops. It cost $3.4 million, including the compute cluster, the Lustre file system and its cluster, and installation services.
Piela tells EnterpriseTech that now that Lustre has been commercialized by Terascala and others, it is starting to get traction outside of the traditional HPC world.
Terascala is doing a lot of deals in the manufacturing sector, says Piela, with Westinghouse Electric Company being a long-time and early customer for its nuclear reaction simulations. The oil and gas industry is an up-and-coming vertical for commercial Lustre appliances, and the life sciences sector is also showing some interest. The Translational Genomics Research Institute, or TGen for short, is a non-profit backed by money from Michael Dell's personal fortune, and not surprisingly it uses Dell servers and Terascala Lustre appliances (in this case built on Dell servers) for its genomics work. TGen was involved in the creation of the Intelligent Storage Bridge software, in fact, and uses it to shuttle data between its 280 TB Terascala Lustre appliance and 1.3 PB of Isilon NAS filers of various vintages.