Parallel computing speeds up data processing, turning projects that once took hours into ones that finish in seconds. It is the basic concept that, alongside advanced semiconductors, ushered in the AI boom, and the growing availability of those chips has made it far more accessible.
At a data science conference held in midtown Manhattan earlier this month, there was a steady crowd of enthusiastic attendees around the Nvidia exhibitor’s table. They weren’t angling for jobs or selfies; mostly, they were excited about the possibilities of parallel computing.
It is the same concept that has made Nvidia the most valuable company in the world, and it was on display at the PyData conference during a short demo by Nvidia engineering manager Rick Ratzel.
Nvidia makes graphics processing units, which are computer chips that process many tasks simultaneously. Hence the term parallel computing.
The chip most people know is the central processing unit, the kind handling the everyday tasks on your laptop right now. Quick and efficient as a CPU is, it typically processes those tasks one at a time, in a specified order.
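To make the serial-versus-parallel distinction concrete, here is a minimal sketch, not taken from the demo, that contrasts a one-at-a-time Python loop with the same work expressed as a single array operation; the GPU portion assumes the CuPy library and an Nvidia GPU, and is shown only as an illustration.

    import numpy as np

    # CPU-style, serial approach: handle each value one at a time, in order.
    values = np.random.rand(10_000_000)
    squared_serial = [v * v for v in values]  # plain Python loop

    # Parallel-friendly approach: express the work as one array operation.
    # NumPy still runs this on the CPU, but the same expression maps cleanly
    # onto a GPU, which applies it to many elements at once.
    squared_vectorized = values * values

    # Illustrative GPU version (assumes CuPy is installed and a GPU is present):
    # import cupy as cp
    # values_gpu = cp.asarray(values)        # copy the data to the GPU
    # squared_gpu = values_gpu * values_gpu  # computed across thousands of cores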
GPUs are ideal for the large-scale data processing required to build and run artificial intelligence models such as OpenAI’s GPT-4, the computing brain behind ChatGPT.
But even before ChatGPT arrived in late 2022, parallel computing was already accelerating the data science behind relevant internet advertising, supply chain optimization, online fraud detection and more.
That’s why Nvidia has had a long relationship with PyData, a conference for developers who use the Python coding language to analyze data.
This year, Ratzel was on hand to introduce a software partnership between Nvidia and popular open-source data-management tools for Python developers.
He started with a dataset of movie reviews and numerical ratings. The aim was to generate good recommendations, which meant matching users whose tastes in films are as similar as possible. The calculation that determines who has similar tastes is not especially complicated, but it has to run across a large amount of data: ratings from 330,000 users.
“It’s huge,” he said.
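As an illustration only (Ratzel’s actual code is not shown in this account), the sketch below runs that kind of user-to-user similarity calculation in pandas over a user-by-movie ratings table; the file name, column names and the use of cosine similarity are assumptions.

    import numpy as np
    import pandas as pd

    # Hypothetical input: one row per (user, movie, rating) triple.
    ratings = pd.read_csv("ratings.csv")  # assumed columns: user_id, movie_id, rating

    # Pivot into a user-by-movie matrix; unrated movies become 0.
    matrix = ratings.pivot_table(index="user_id", columns="movie_id",
                                 values="rating", fill_value=0)

    # Cosine similarity between users: simple math, but with ~330,000 users it
    # touches a huge amount of data (in practice this step would be chunked),
    # which is exactly the kind of workload where a GPU pays off.
    norms = np.linalg.norm(matrix.values, axis=1, keepdims=True)
    normalized = matrix.values / np.where(norms == 0, 1, norms)
    similarity = normalized @ normalized.T  # users with similar tastes score near 1.0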
On a traditional computer with a CPU, the first analysis took two hours to run. With some adjustments, that came down to one hour. Ratzel then switched to a GPU and ran the analysis again: it finished in less than two seconds, a speedup made possible by the parallel computing GPUs provide.
Although the concept has been around since the 1980s, parallel computing was difficult to access until relatively recently. The growing availability of GPUs through cloud providers has made it possible for data scientists to complete their own projects in seconds instead of hours.
With significant time savings, researchers can run more experiments and work on more projects.
“You can see how this changes the way we work,” Ratzel said. “Now we can try a lot of things and do experiments, and we’re using the exact same data and the exact same code.”
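The talk’s exact tooling isn’t specified in this account, but one real-world example of “the exact same code” moving from CPU to GPU is the pandas accelerator mode in Nvidia’s cuDF library (part of the RAPIDS project), sketched below; treat the script name as a placeholder.

    # Run an existing pandas script on an Nvidia GPU without editing the script,
    # using cuDF's pandas accelerator mode (requires cuDF and a compatible GPU):
    #
    #     python -m cudf.pandas analyze_ratings.py
    #
    # Inside a Jupyter notebook, load the extension before importing pandas:
    #
    #     %load_ext cudf.pandas
    #     import pandas as pd
    #     # ...the same analysis code as before...
    #
    # Supported pandas operations are dispatched to the GPU; anything cuDF does
    # not cover falls back to regular CPU pandas, so the results stay the same.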
The calculations GPUs perform to power generative AI are far more complex and extensive than mining existing structured data to recommend movies based on shared characteristics and tastes.
All of that computation creates enormous demand for Nvidia’s GPUs, and that demand is what has made Nvidia’s business so valuable to investors.