The future of AI is distributed, said Ion Stoica, co-founder, executive chairman and president of Anyscale, on the first day of VB Transform. And that’s because the growth in model complexity shows no signs of slowing down.
“For the past couple of years, the compute requirements to train a state-of-the-art model, depending on the data set, grow between 10 times and 35 times every 18 months,” he said.
Just five years ago, the largest models fit on a single GPU; fast forward to today, and it takes hundreds or even thousands of GPUs just to fit the parameters of the most advanced models. PaLM, the Pathways Language Model from Google, has 540 billion parameters — and that’s only about half the size of the largest models, which top 1 trillion parameters. The company uses more than 6,000 chips to train its most recent model.
Even if these models stopped growing and GPUs continued to progress at the same rapid rate as in previous years, it would still take about 19 years before a single GPU was powerful enough to run these state-of-the-art models, Stoica added.
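That 19-year figure follows from simple compounding arithmetic. Here is a rough back-of-envelope version, assuming a state-of-the-art model needs on the order of 6,000 accelerators today and that single-chip performance doubles roughly every 18 months — illustrative assumptions based on the figures above, not Stoica’s exact inputs:

```python
import math

# Back-of-envelope sketch: if a frozen state-of-the-art model needs ~6,000
# chips today, how long until one chip is 6,000x faster, assuming a
# Moore's-law-style doubling every 18 months?
chips_needed_today = 6000
doubling_period_years = 1.5

doublings = math.log2(chips_needed_today)   # ~12.6 doublings required
years = doublings * doubling_period_years   # ~18.8 years
print(f"{doublings:.1f} doublings -> about {years:.0f} years")
```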
“Fundamentally, this is a huge gap, which is growing month by month, between the demands of machine learning applications and the capabilities of a single processor or a single server,” he said. “There’s no other way to support these workloads than distributing them. It’s as simple as that. Writing these distributed applications is hard. It’s even harder than before, actually.”
The unique challenges of scaling applications and workloads
There are multiple stages in building a machine learning application, from data labeling and preprocessing to training, hyperparameter tuning, serving, reinforcement learning and so on — and each of these stages needs to scale. Typically, each step requires a different distributed system. To build end-to-end machine learning pipelines or applications, it’s now necessary not only to stitch these systems together but also to manage each of them, and to develop against a variety of APIs. All of this adds a tremendous amount of complexity to an AI/ML project.
The mission of the open-source Ray distributed computing project, and of Anyscale, is to make scaling these distributed computing workloads easier, Stoica said.
“With Ray, we tried to provide a compute framework on which you can build these applications end-to-end,” he said. “Anyscale is basically providing a hosted, managed Ray, and of course security features and tools to make the development, deployment and management of these applications easier.”
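To make that concrete, here is a minimal sketch of what two pipeline stages — parallel preprocessing and a small hyperparameter sweep — can look like on a single Ray cluster. The `preprocess` and `train` functions are placeholder stand-ins for illustration, not Anyscale’s code; only the `ray.init`, `@ray.remote`, `.remote` and `ray.get` calls are Ray’s actual API.

```python
import ray

ray.init()  # start a local Ray instance or connect to an existing cluster

@ray.remote
def preprocess(shard):
    # Placeholder preprocessing: in practice, cleaning/tokenizing a data shard.
    return [record.lower() for record in shard]

@ray.remote
def train(data, learning_rate):
    # Placeholder training run; real code would request accelerators via
    # @ray.remote(num_gpus=...) and return checkpoints/metrics.
    return {"lr": learning_rate, "rows": sum(len(d) for d in data)}

shards = [["Cat", "Dog"], ["Fish", "Bird"]]
prepped = ray.get([preprocess.remote(s) for s in shards])    # scale preprocessing
trials = [train.remote(prepped, lr) for lr in (1e-3, 1e-4)]  # scale the sweep
print(ray.get(trials))
```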
Hybrid stateful and stateless computation
The company recently launched a serverless product, which abstracts away the underlying infrastructure, eliminating the need to worry about where functions will run and easing the burden on developers and programmers as they scale. But with the infrastructure made transparent, functions are limited in what they can do: they run a computation, write the data back to S3, for instance, and then they’re gone. Many applications, however, require stateful operators.
For instance, training, which involves a great deal of data, would become far too expensive if that data were written back to S3 after each iteration, or even just moved from GPU memory into the machine’s main memory, because of the overhead of moving the data in and out and of serializing and deserializing it.
“Ray, from day one, was also built around these kinds of operators, which can keep the state and can update the state continuously, which in software engineering lingo we call ‘actors,’” he said. “Ray has always supported this dual mode of stateless and stateful computation.”
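A brief sketch of that distinction using Ray’s actor API: the class below keeps its weights in the worker’s memory between calls, so nothing has to be written back to S3 or re-serialized between iterations. The `Trainer` class and its dummy update are illustrative placeholders, not code from Ray or Anyscale.

```python
import ray

ray.init()

@ray.remote
class Trainer:
    """Stateful 'actor': parameters live in the worker's memory across calls."""
    def __init__(self, size):
        self.weights = [0.0] * size
        self.steps = 0

    def train_step(self, batch):
        # Dummy update; no serialization or S3 round trip between iterations.
        self.weights = [w + 0.01 * b for w, b in zip(self.weights, batch)]
        self.steps += 1
        return self.steps

trainer = Trainer.remote(size=4)  # the actor's state is created once
for batch in ([1, 2, 3, 4], [4, 3, 2, 1]):
    steps_done = ray.get(trainer.train_step.remote(batch))
print("iterations run:", steps_done)
```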
What inning is AI implementation in?
There’s a temptation to say that AI implementation has finally reached the walking stage, pushed ahead in the AI transformation journey by the recent acceleration in digital growth, but we’ve just seen the tip of the iceberg, Stoica said. There’s still a gap between the current market size and the opportunity, similar to the state of big data about 10 years ago.
“It’s taking time because the time [needed] is not only for developing tools,” he said. “It’s training people. Training experts. That takes even more time. If you look at big data and what happened, eight years ago a lot of universities started to provide degrees in data science. And of course there are a lot of courses now, AI courses, but I think that you’ll see more and more applied AI and data courses, of which there aren’t many today.”
Learn more about how distributed AI is helping companies ramp up their business strategy and catch up on all Transform sessions by registering for a free virtual pass right here.