Advanced Computing in the Age of AI | Saturday, May 11, 2024

TACC Receives NSF Grant to Deploy New Data Resource 

The Texas Advanced Computing Center (TACC) at The University of Texas at Austin and its partners today announced that they will design, build and deploy Wrangler, a groundbreaking data analysis and management system for the national open science community. Supported by a grant from the National Science Foundation (NSF), which includes $6M for deployment plus additional funding for operations, the new system is scheduled to enter production in January 2015.

The design and implementation of Wrangler respond to developments in technology and research practice that are collectively referred to as Big Data or the Data Deluge, encompassing a variety of needs related to research data storage, analysis, and access in the sciences.

Wrangler features a novel primary storage tier based on NAND flash memory, which will enable reading and writing data at up to one terabyte per second and executing up to 275 million IOPS (input/output operations per second). In addition, Wrangler's 10-petabyte disk storage system will be fully replicated to Indiana University, a partner in the project, providing data access reliability and security. Wrangler will support the popular Hadoop software framework and a full ecosystem of analytics methods and technologies for Big Data.

Dell Inc. and DSSD Inc. are the two strategic partners providing the technologies that make up the core of Wrangler.

In addition to hosting part of the system, Indiana University will participate in operations and training, and will help users optimize network performance between their home institutions and Wrangler. The Computation Institute (CI), a joint initiative of the University of Chicago and Argonne National Laboratory, will integrate its Globus Online service within the Wrangler project to make transferring data to and from Wrangler simple and fast.

Wrangler’s performance and storage capabilities for Big Data applications will be enhanced through tight integration with TACC’s Stampede supercomputer and with NSF Extreme Science and Engineering Discovery Environment (XSEDE) resources around the country. Immediately upon deployment, Wrangler will be a part of the broader XSEDE ecosystem. Integration with Globus Online, the official data transfer mechanism for XSEDE, will provide for rapid, reliable and secure data exchange with other elements of the national cyberinfrastructure.

System Features 

• Massive, replicated, secure high-performance data storage (10 PB at each site).

• A large-scale flash storage tier for analytics, with bandwidth of 1 TB/s and 275M IOPS.

• Embedded processing of more than 3,000 processor cores for data analysis.

• Flexible support for a wide range of software stacks, including Hadoop and relational data.

• Integration with Globus Online services for rapid and reliable data transfer and sharing.

• A fully scalable design that can grow with the number of users and the demands of data applications.

EnterpriseAI