Advanced Computing in the Age of AI | Saturday, April 27, 2024

3D Internet Building Block Championed by Nvidia and Apple Faces Challenges 

(POR666/Shutterstock)

Apple and Nvidia want to dominate the 3D universe, and the world of scientific simulation, with a building block championed by both companies.

The closed-source proponents are going the open standards route to drive the development and adoption of a file format called USD (Universal Scene Description), which is described as the HTML of the metaverse and 3D Internet.

The Alliance for OpenUSD, or AoUSD, wants to make USD a linchpin for creating and rendering 3D worlds, AI-powered graphics, and animated avatars. Those universes could be on the Internet, in virtual reality worlds, in movies, or in graphics-rich scientific simulations.

The other AoUSD founding members include Pixar, Adobe, and Autodesk. No other major chip makers, software companies, or cloud providers are listed as initial supporters. The World Wide Web Consortium and the International Organization for Standardization do not yet consider USD a standard, but AoUSD’s goal is to get it there.

The goal is to "take the open-source project that Pixar created and make a specification that will enable it to become an international standard that can be used by anyone worldwide, like the standards we use today such as JPEG or H.264, HTML or other standards," said Steve May, the chairperson at AoUSD, and chief technology officer at Pixar, during a media briefing.

The file format allows companies to share and reuse 3D assets in virtual worlds or graphics applications. Animators can simply take 3D objects from existing repositories and add them to their projects. The 3D objects could be models, animations, backgrounds, materials, and other assets.
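
As a rough illustration of that kind of reuse, USD has a human-readable text form (.usda) in which one scene can point at an asset stored in another file through a reference. The sketch below builds such a scene as a string in plain Python. The function name build_scene, the prim names, and the path ./chair.usda are hypothetical, and a production pipeline would more likely use the OpenUSD libraries than hand-written text.

```python
def build_scene(asset_path: str, prim_name: str, position: tuple) -> str:
    """Return .usda text for a scene that references an external asset.

    All names are illustrative. The asset file is assumed to exist
    elsewhere and to declare a default prim; this sketch does not create it.
    """
    x, y, z = position
    return f'''#usda 1.0
(
    defaultPrim = "World"
)

def Xform "World"
{{
    def "{prim_name}" (
        prepend references = @{asset_path}@
    )
    {{
        double3 xformOp:translate = ({x}, {y}, {z})
        uniform token[] xformOpOrder = ["xformOp:translate"]
    }}
}}
'''


# Place a shared chair asset one unit along the x axis of the new scene.
scene_text = build_scene("./chair.usda", "Chair", (1, 0, 0))
```

The scene file stays small because it stores only the reference and the placement, not the chair's geometry.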

USD implementations typically use rendering engines that pull out procedural descriptions to patch together scenes from shared assets. Nvidia has collaborated with Pixar on many projects in its metaverse product called Omniverse, which relies on the company’s GPUs to render animations, virtual worlds, and simulations.

The USD file format has been used for decades by Pixar to create animated movies. The company developed USD to reuse animations instead of recreating every pixel from scratch. Now USD plays a starring role in Pixar’s moviemaking efforts.

Creating complex animated scenes involves many workflows, 3D content, software tools, and technology, May said.

"And historically, those tools all used different data and file formats. Pixar wanted to enable a more powerful creative expression for the artist by streamlining workflows to allow the same data re-entry … by all the content creation tools," May said.

USD is a unifier for graphics in supercomputing and entertainment, May said, adding, “I see this as kind of an exciting time as it grows that we can see benefits crossing over between the different areas.”

Nvidia's metaverse strategy hinges on the success of the USD file format, whose composition operators, such as position, orientation, colors, and layers, allow for real-time sharing and collaboration in the metaverse.
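
The layering idea can be pictured with a toy model: several contributors each supply a layer of property values, and a stronger layer's value wins wherever two layers disagree. The Python sketch below is an assumption-laden simplification, not the OpenUSD API; the function compose, the layer dictionaries, and the path /World/Car are all hypothetical.

```python
def compose(layers: list) -> dict:
    """Resolve property values across layers ordered strongest first.

    Toy stand-in for layered composition; real USD composition has
    more operators and rules than this sketch models.
    """
    resolved = {}
    # Walk from weakest to strongest so stronger values overwrite weaker ones.
    for layer in reversed(layers):
        for prim_path, props in layer.items():
            resolved.setdefault(prim_path, {}).update(props)
    return resolved


# A base layer describes the car; a second contributor overrides only its color.
base_layer = {"/World/Car": {"color": "red", "position": (0, 0, 0)}}
artist_layer = {"/World/Car": {"color": "blue"}}

scene = compose([artist_layer, base_layer])
```

Because the override lives in its own layer, the base layer is left untouched, which is what lets several people work on the same scene at once.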

Nvidia has projected Omniverse as a tool for engineers to collaborate in real-time on the creation of equipment like aircraft, cars, and machines. Nvidia’s Earth-2 simulation of climate patterns, which ingests visual data from multiple sources, is based on the USD file format and the Omniverse back-end.

The graphics chip maker is using artificial intelligence and the metaverse to sell more GPUs and software. But the software development platform behind Omniverse, called CUDA, is proprietary. Nvidia locks CUDA customers into its hardware and software in AI and the metaverse, and USD will allow it to create services.

Apple is like Nvidia: customers are locked into its devices, software, and services. But the company’s interest in the standard could stem from its recent introduction of the $3,499 Vision Pro headset, which is a headset computer on which users can watch movies, videoconference, and interact with virtual worlds.

Apple called the Vision Pro its "first spatial computer" and credits it with creating a new spatial computing category. Apple is trying to attract developers to write applications for Vision Pro, and the USD file format could be at the center of it all.

USD "enables realistic augmented reality experiences essential to things like spatial computing," May said.

But USD has many usability issues. There are no practical browser-side implementations yet, and it is heavily reliant on server-side processing. Users of the technology can read USD by creating their own Python and C++ services, after which they can send back the needed client-side information.

Autodesk has taken some steps in creating libraries that will make USD practical in browser-based applications, an Nvidia spokesperson said in an email.

Autodesk has created a prototype for USD JavaScript bindings to enable more client-side processing and has been posting relevant proposals to the GitHub page of OpenUSD. OpenUSD is an open-source repository where the code for USD lives, and AoUSD will specify and codify what is represented in that code base.

Further client-side acceleration for USD will come through native WebGPU support in Hydra, which is a communication and rendering engine for USD. WebGPU is the successor to WebGL and is designed for client-side acceleration to distribute AI computing by leveraging local GPUs, which will reduce the server load. Google recently announced that WebGPU was fully integrated within Chrome.

But for Apple and Nvidia, standardizing USD to be the HTML of the 3D Internet will be challenging. For one, it has to contend with a standard that is dubbed the JPEG of 3D.

Companies are leaning toward a long-established USD alternative called glTF, which is a 3D file format backed by W3C and ISO. The glTF and USD file formats are being discussed by the Metaverse Standards Forum, which was formed earlier this year but does not include Apple and Pixar as members.

"glTF is viewed as a simpler and lighter weight way to represent 3D data. And USD is viewed as the way to make much more complex sorts of scenes and have more people interact with them at the same time," May said.

The glTF format, which is backed by Khronos, has a well-established workflow with lightweight delivery and browser-based acceleration.
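
To give a sense of how lightweight glTF can be, the sketch below writes a minimal glTF 2.0 document as JSON using only the Python standard library. The function name minimal_gltf and the node name are hypothetical, and the document holds a single empty node with no mesh data, so it shows the shape of the format rather than a usable asset.

```python
import json


def minimal_gltf(node_name: str, translation: list) -> str:
    """Return a minimal glTF 2.0 JSON document with one positioned node.

    Only the "asset" object with a version is required by glTF 2.0;
    the scene and node entries are included for illustration.
    """
    document = {
        "asset": {"version": "2.0"},
        "scene": 0,
        "scenes": [{"nodes": [0]}],
        "nodes": [{"name": node_name, "translation": translation}],
    }
    return json.dumps(document, indent=2)


# One named node, moved one unit along the x axis.
gltf_text = minimal_gltf("Chair", [1.0, 0.0, 0.0])
```

A browser-based viewer can parse a file like this with an ordinary JSON parser, which is part of why the format suits web delivery.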

"One of the interesting challenges, and if we embrace this challenge, is can we make USD as lightweight and as optimal for simpler things as glTF? In many ways, it would be ideal if we had kind of one solution for both things. That is going to be an active area of debate in the community," May said.

May contends that USD will be the file format for scientific computing, which requires complicated graphics simulations. That is something that glTF cannot handle in its current form as a web-friendly data interchange format.

“OpenUSD will become the fundamental building block on which all 3D content will be created,” May said, adding, “Industrial applications or scientific visualization applications have overlap with what we do in entertainment.”

The power of USD is the ability to aggregate and modify large numbers of assets and then combine them into a complete picture, May said.

The Linux Foundation’s Joint Development Foundation will manage the AoUSD efforts. AoUSD could potentially partner with other associations such as the Academy Software Foundation (ASWF).

“JDF is really structured to help kind of incubate these early techniques and technologies on the pathway to becoming a standard,” May said.

The Nvidia spokeswoman said ASWF has a working group that will in the future partner with AoUSD to bring USD to the Web.

The stakeholders in both groups will share findings and use cases to inform community priorities.

EnterpriseAI