
Vultr launches GPU Stack and Container Registry for AI model acceleration

First-of-its-kind GPU Stack with public and private Container Registry provides full AI application lifecycle management from anywhere in the world.

Thursday, 28th September 2023, by Phil Alsop

Vultr has launched the Vultr GPU Stack and Container Registry to enable global enterprises and digital startups alike to build, test and operationalize artificial intelligence (AI) models at scale, in any region of the globe. The GPU Stack supports instant provisioning of the full array of NVIDIA GPUs, while the new Vultr Container Registry makes pre-trained AI models from the NVIDIA NGC catalog globally available for on-demand provisioning, development, training, tuning and inference. Available at Vultr’s 32 cloud data center locations across six continents, the new Vultr GPU Stack and Container Registry speed up collaboration and the development and deployment of AI and machine learning (ML) models.

 

“Vultr is committed to enabling innovation ecosystems around the world – from Silicon Valley and Miami to São Paulo, Tel Aviv, Tokyo, Singapore, London, Amsterdam and beyond – providing instant access to high-performance cloud GPU and cloud computing resources to accelerate AI and cloud-native innovation,” said J.J. Kardwell, CEO of Vultr’s parent company, Constant. “By working closely with NVIDIA and our growing ecosystem of technology partners, we are removing access barriers to the latest technologies, and offering enterprises the first composable, full-stack solution for end-to-end AI application lifecycle management. This enables data science, MLOps, and engineering teams to build on a globally-distributed basis, without worrying about security, latency, local compliance, or data sovereignty requirements.”

“The Vultr GPU Stack and Container Registry provide organizations with instant access to the entire library of pre-trained LLMs on the NVIDIA NGC catalog, so they can accelerate their AI initiatives and provision and scale NVIDIA cloud GPU instances from anywhere,” said Dave Salvator, director of accelerated computing products at NVIDIA.

 

The development and deployment of machine learning (ML) and AI models is complex, and looming regulations on data privacy, sovereignty and compliance make it more so. To eliminate configuration and provisioning bottlenecks, data scientists and MLOps teams need access to best-of-breed tools and technologies to build, test, run and deploy models worldwide.

 

With these challenges in mind, Vultr launched the first-of-its-kind Vultr GPU Stack: a finely tuned and integrated operating system and software environment that instantly provisions the full array of NVIDIA GPUs, pre-configured with the NVIDIA CUDA Toolkit, NVIDIA cuDNN and NVIDIA drivers, for immediate deployment. This solution removes the complexity of configuring GPUs, calibrating them to the specific model requirements of each application and integrating them with the AI model accelerators of choice. Models and frameworks, including PyTorch and TensorFlow, can be brought in from the NVIDIA NGC catalog, Hugging Face or Meta Llama 2. With these resources easily provisioned, data science and engineering teams across the globe can get started on model development and training with the click of a button.
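
For teams that want to confirm what a pre-configured GPU environment actually exposes, a short check along the following lines may help. This is a minimal sketch, not part of Vultr's tooling: the function name `probe_gpu_environment` is hypothetical, and it assumes only that the standard `nvidia-smi` and `nvcc` utilities are on the PATH when the NVIDIA driver and CUDA Toolkit are installed. PyTorch is treated as optional.

```python
import shutil
import subprocess


def probe_gpu_environment():
    """Report which parts of a CUDA-ready environment are visible from Python.

    Hypothetical helper for illustration; it only inspects the local machine.
    """
    report = {
        "nvidia_smi": shutil.which("nvidia-smi") is not None,  # NVIDIA driver utility
        "nvcc": shutil.which("nvcc") is not None,              # CUDA Toolkit compiler
        "gpus": [],          # one "name, driver_version" string per detected GPU
        "torch_cuda": None,  # None means PyTorch is missing or unusable
    }

    if report["nvidia_smi"]:
        try:
            result = subprocess.run(
                ["nvidia-smi", "--query-gpu=name,driver_version",
                 "--format=csv,noheader"],
                capture_output=True, text=True, timeout=10, check=False,
            )
        except (OSError, subprocess.TimeoutExpired):
            result = None
        if result is not None and result.returncode == 0:
            report["gpus"] = [
                line.strip() for line in result.stdout.splitlines() if line.strip()
            ]

    try:
        import torch  # optional dependency
        report["torch_cuda"] = bool(torch.cuda.is_available())
    except Exception:
        # PyTorch is not installed or failed to load; leave the field as None.
        pass

    return report


if __name__ == "__main__":
    for key, value in probe_gpu_environment().items():
        print(f"{key}: {value}")
```

On a machine without NVIDIA hardware the function still returns a report, with the boolean fields set to `False` and an empty GPU list, so it can be run safely before and after provisioning.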

 

Kubernetes Made Easy

Vultr also launched its new Kubernetes-based Vultr Container Registry, fully integrated with the Vultr GPU Stack. Comprising both a public and a private registry, the Vultr Container Registry enables organizations to source NVIDIA ML models from the NVIDIA NGC catalog and provision them to Kubernetes clusters across Vultr’s 32 cloud data center locations. This lets data science, MLOps and engineering teams use pre-trained AI models from anywhere in the world, regardless of the team’s physical location. Meanwhile, the private registry combines public models with an organization’s private datasets so developers can train and tune models on proprietary data and then create their own instance of the model for inference. The trained and tuned model is then stored in the company’s private container registry, where only authorized users can access it. This in turn speeds up global instantiation and tuning of AI models, synchronized across private registries in each region.
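
To make the idea of one tuned model being available from a private registry in each region more concrete, the sketch below builds a per-region container image reference. Everything in it is an assumption for illustration: the function name `regional_image_refs`, the region codes, the registry and image names, and the placeholder host pattern `{region}.registry.example.com` are hypothetical and do not describe Vultr's actual endpoints or synchronization mechanism.

```python
def regional_image_refs(regions, registry, image, tag,
                        host_template="{region}.registry.example.com"):
    """Build one image reference per region for the same tuned model.

    All names and the host pattern are placeholders, not real endpoints.
    """
    refs = {}
    for region in regions:
        host = host_template.format(region=region)
        # Conventional layout: host/registry-namespace/image:tag
        refs[region] = f"{host}/{registry}/{image}:{tag}"
    return refs


if __name__ == "__main__":
    for region, ref in regional_image_refs(
        ["ams", "nrt"], "acme-private", "tuned-model", "v1"
    ).items():
        print(region, ref)
```

Keeping the image name and tag identical across regions is the design choice being illustrated: a Kubernetes cluster in any location can then pull the same tuned model from its nearest registry by changing only the host.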