F5 and NVIDIA expand collaboration on AI infrastructure

F5 and NVIDIA join forces to improve AI infrastructure by increasing token throughput, reducing latency, and enabling secure multi-tenant platforms.

F5, a provider of application and API delivery and security solutions, has announced expanded capabilities in collaboration with NVIDIA to enhance AI inference infrastructures. This collaboration integrates F5 BIG-IP Next for Kubernetes with NVIDIA BlueField-3 DPUs, creating a telemetry-aware infrastructure layer. The integration is designed to increase token throughput through improved GPU utilisation, reduce latency, and support secure multi-tenant AI platforms at scale.

In AI systems, tokens are measurable units of AI output, such as words or data fragments generated during inference. The rate at which tokens are produced affects user experience, infrastructure efficiency, and revenue per accelerator. As businesses and GPU-as-a-Service (GPUaaS) providers adopt AI, getting more tokens from the same hardware becomes an important consideration. The solution from F5 and NVIDIA targets two metrics in particular: token throughput and cost per token.
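The link between token throughput and cost per token is simple arithmetic, and a short Python sketch makes it concrete. The function name, the hourly accelerator cost, and the throughput figures below are all hypothetical values chosen for illustration; none of them come from F5, NVIDIA, or The Tolly Group.

```python
def cost_per_million_tokens(hourly_cost: float, tokens_per_second: float) -> float:
    """Return the cost of generating one million tokens.

    hourly_cost: cost of running one accelerator for an hour (hypothetical currency units).
    tokens_per_second: sustained token throughput of that accelerator.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost / tokens_per_hour * 1_000_000


# Illustrative figures only: the same accelerator at two throughput levels.
baseline = cost_per_million_tokens(hourly_cost=4.0, tokens_per_second=2000)
improved = cost_per_million_tokens(hourly_cost=4.0, tokens_per_second=2500)

print(f"baseline: {baseline:.4f} per million tokens")
print(f"improved: {improved:.4f} per million tokens")
```

With the hourly cost held fixed, any increase in throughput lowers the cost per token in proportion, which is why the two metrics are usually discussed together.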

The shift from application-centric to agent-driven AI workflows requires architectural approaches that improve token throughput and reduce costs. BIG-IP Next for Kubernetes now uses NVIDIA NIM statistics and GPU telemetry to route inference requests. This matches workloads with appropriate accelerators in real time, with the aim of improving utilisation and reducing latency.
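As a rough illustration of what telemetry-aware routing means in general, the Python sketch below picks the least-loaded backend from a set of reported metrics. It is not F5's or NVIDIA's implementation: the `Backend` fields, the weights in `score`, and the sample values are all assumptions made for this example, and a real system would read live telemetry rather than hard-coded numbers.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical snapshot of one inference backend's telemetry."""
    name: str
    gpu_utilisation: float  # fraction of GPU in use, 0.0 to 1.0
    queue_depth: int        # requests waiting to be served
    kv_cache_usage: float   # fraction of KV cache in use, 0.0 to 1.0


def score(backend: Backend) -> float:
    """Lower is better. The weights are arbitrary illustrative choices."""
    queue_pressure = min(backend.queue_depth / 10, 1.0)  # cap at 1.0
    return (0.5 * backend.gpu_utilisation
            + 0.3 * backend.kv_cache_usage
            + 0.2 * queue_pressure)


def pick_backend(backends: list[Backend]) -> Backend:
    """Route the next request to the backend with the lowest load score."""
    if not backends:
        raise ValueError("no backends available")
    return min(backends, key=score)


# Made-up telemetry for three backends.
backends = [
    Backend("gpu-a", gpu_utilisation=0.90, queue_depth=8, kv_cache_usage=0.80),
    Backend("gpu-b", gpu_utilisation=0.40, queue_depth=2, kv_cache_usage=0.30),
    Backend("gpu-c", gpu_utilisation=0.60, queue_depth=0, kv_cache_usage=0.95),
]
chosen = pick_backend(backends)
print(chosen.name)
```

A round-robin balancer would ignore these signals and could send a request to an already saturated accelerator; scoring on telemetry is one simple way to avoid that.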

Tests validated by The Tolly Group demonstrated increased token throughput, faster time to first token (TTFT), and reduced request latency. Offloading functions such as networking and AI-aware load balancing to NVIDIA BlueField-3 DPUs preserves host CPU capacity and leaves GPUs free for high-throughput inference. According to the companies, this increases token yield and reduces costs without requiring modifications to AI models.

AI applications require traffic control beyond traditional load balancing. BIG-IP Next for Kubernetes now supports inference-aware routing for agent-driven AI tasks. Integration with the NVIDIA DOCA Platform Framework simplifies deployment and management of NVIDIA BlueField DPUs. Together, these capabilities are intended to let organisations share GPU infrastructure securely across business units or clients while maintaining performance and service predictability.

The collaboration between F5 and NVIDIA aims to provide tools to monitor token consumption, improve traffic flow, and optimise infrastructure utilisation. The goal is to help organisations get more out of their GPUs and better align resources with AI workloads.

By combining NVIDIA infrastructure telemetry and DPU acceleration with F5 operational intelligence, enterprises can adapt AI infrastructures for more efficient, multi-tenant, and agent-driven workloads.
