Advanced Computing in the Age of AI | Thursday, April 25, 2024

Improving Computational Capabilities Using Green at CEBio 

Constrained by the limitations of its existing cluster, CEBio (the Center for Excellence in Bioinformatics), part of Brazil’s prestigious FIOCRUZ organization, deployed the Convey HC-1ex hybrid-core computer to upgrade its bioinformatics capabilities and rein in power consumption growth.

Jointly established by FIOCRUZ and the Minas Gerais State government, CEBio’s mission is to bring genomics and cutting-edge bioinformatics to Brazil’s rapidly growing biotech and health care sectors.

CEBio Expands Genomics Capabilities

The new Convey HC-1ex system helps CEBio substantially accelerate bioinformatics applications to tackle computationally intensive problems. The new system is also helping CEBio reduce its carbon footprint, because the HC-1ex requires less power, space and cooling than a comparable cluster.

Assembly of large genomes, such as the bovine and swine genomes CEBio works with, is especially demanding. Convey’s architecture is intended to address the problems CEBio has had completing such tasks on its existing cluster. “We’ve had assemblies we couldn’t complete on our 256-node cluster simply because they were taking too long,” said Dr. Guilherme Oliveira, CEBio coordinator, who is also President of the Brazilian Association for Bioinformatics and Computational Biology and a member of the Board of the International Society for Computational Biology. “We evaluated several platforms and are excited to be working with the Convey hybrid-core system.”

Convey’s hybrid-core architecture achieves performance gains by pairing classic Intel x86 microprocessors with a coprocessor composed of field-programmable gate arrays (FPGAs). Particular algorithms, such as Velvet-based assembly, are optimized and translated into code that can be loaded onto the coprocessor at runtime. The Convey architecture also features a highly parallel memory subsystem designed to remove the memory bottlenecks inherent to commodity servers. The overall result is a speedup for applications that can be parallelized.
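How much overall speedup an offloaded algorithm delivers depends on how much of the total runtime it accounts for. The sketch below uses Amdahl's law, a standard rule of thumb that is not taken from Convey's documentation, to illustrate the relationship. The function name and the example figures are hypothetical and do not describe measured HC-1ex results.

```python
def amdahl_speedup(accelerated_fraction, kernel_speedup):
    """Overall speedup when only part of the runtime is accelerated.

    accelerated_fraction: share of the original runtime spent in the code
        moved to the coprocessor (0.0 to 1.0). Hypothetical input.
    kernel_speedup: how many times faster that code runs once offloaded.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if kernel_speedup <= 0:
        raise ValueError("kernel_speedup must be positive")
    remaining = 1.0 - accelerated_fraction          # runtime left on the host CPU
    offloaded = accelerated_fraction / kernel_speedup  # runtime on the coprocessor
    return 1.0 / (remaining + offloaded)


# Illustrative numbers only: 90% of the runtime offloaded and made 10x faster.
overall = amdahl_speedup(0.9, 10.0)
print(f"overall speedup: {overall:.2f}x")
```

Under these assumptions, the part of the job that stays on the host processor quickly becomes the limiting factor, which is why the gain applies mainly to applications whose dominant work can be parallelized.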

“Speed and power consumption were two of our top concerns. Electricity is expensive and, of course, we want to contain our costs, but it’s also important for us to be ecologically responsible,” said Dr. Oliveira. According to Dr. Oliveira, deployment of the HC-1ex was easy. “Our IT staff was initially concerned because high-performance machines usually don’t work right out of the box. The Convey system did.”

Convey Bioinformatics Suite

The Convey GraphConstructor (CGC), part of Convey’s Bioinformatics Suite of optimized applications, facilitates construction and manipulation of the de Bruijn graphs commonly used in short-read genome assembly applications such as Velvet. Other performance and workflow optimizations include a fast k-mer counting tool that allows quick identification of the optimal k-mer length and coverage cutoffs for de novo assembly. Avoiding low-coverage k-mers, which generally result from sequencing errors, yields faster run times and higher-quality assemblies.
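The ideas in the paragraph above (k-mer counting, a coverage cutoff, and a de Bruijn graph) can be illustrated with a minimal Python sketch. This is a toy example under stated assumptions, not the CGC or Velvet implementation: the function names, the reads, and the cutoff value are all hypothetical, and real assemblers also handle reverse complements, quality scores, and far larger data sets.

```python
from collections import Counter, defaultdict


def count_kmers(reads, k):
    """Count every length-k substring (k-mer) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def filter_low_coverage(counts, min_coverage):
    """Keep only k-mers observed at least min_coverage times."""
    return {kmer: n for kmer, n in counts.items() if n >= min_coverage}


def build_de_bruijn(kmers):
    """Build a de Bruijn graph from k-mers.

    Nodes are (k-1)-mers; each k-mer contributes one edge from its
    prefix (first k-1 bases) to its suffix (last k-1 bases).
    """
    graph = defaultdict(set)
    for kmer in kmers:
        graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


# Toy input (hypothetical): three identical reads plus one read
# carrying a single-base error (T read as G at the fourth position).
reads = ["ACGTAC", "ACGTAC", "ACGTAC", "ACGGAC"]

counts = count_kmers(reads, k=3)
solid = filter_low_coverage(counts, min_coverage=2)  # cutoff is an assumption
graph = build_de_bruijn(solid)

for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

In this toy data set, the k-mers introduced by the erroneous read each appear only once, so the cutoff removes them and the resulting graph has fewer nodes and branches to traverse than the unfiltered one.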

CEBio Is Able to Fulfill Mission

CEBio works with industry, academia and government on a wide array of projects spanning agriculture, animal husbandry, human health and biodiversity. Clients range from small biotechs and academic departments to large Brazilian multinational companies. An aggressive user of next-generation sequencing (NGS) technology, CEBio is heavily involved in de novo sequencing, re-sequencing, and reference mapping.

“Our work covers a wide range of disciplines from basic biology to agribusiness such as cattle and fish, and recently we’ve started working with plants and metagenomics,” said Dr. Oliveira. “We’re also working with private industry to develop new diagnostic tools for cancer research.” The many genomes CEBio has worked on include Bos indicus (cattle), Schistosoma mansoni (a schistosomiasis-causing trematode), HIV, honey bee and wild bees, yeast and Mycobacterium tuberculosis, to name just a few. “The number of species we work with is growing rapidly, the amount of data we must deal with is growing, and we are venturing into transcriptomics as well,” concluded Dr. Oliveira. “We are thrilled to have the Convey system to help us take advantage of these new opportunities.”

Dr. George Vacek is the Director of the Life Sciences Business Unit at Convey Computer.
