I put a datacenter GPU in my gaming PC

TL;DR

A gamer installed a Tesla V100 SXM2 data center GPU in a gaming PC using an SXM2-to-PCIe adapter, doubling the system’s total VRAM to 32GB for AI tasks at a cost of about £200. The setup required hardware modifications, including rewiring the adapter’s cooling fan so its speed could be controlled. It demonstrates a cost-effective way to access high-bandwidth GPUs for machine learning.

A gamer has installed a Tesla V100 SXM2 data center GPU in a consumer gaming PC alongside an existing RTX 4080, giving the system a combined 32GB of VRAM for large language model inference at a significantly lower cost than high-end consumer GPUs.

The Tesla V100 SXM2, a data center GPU designed for NVIDIA’s server platforms, was adapted for use in a standard PC through an SXM2-to-PCIe adapter purchased for approximately £50. The user paid about £150 for the GPU itself on eBay, bringing the total to roughly £200. Combined with the RTX 4080 already in the system, the setup provides 32GB of VRAM for running large AI models.

The V100 used here has 16GB of HBM2 memory, 5120 CUDA cores, and a memory bandwidth of 900 GB/s, which exceeds that of many consumer GPUs, including the RTX 4080’s 736 GB/s. The adapter is not officially supported, and the SXM2 module has no display outputs or standard power connectors.
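To see why bandwidth matters for language model inference: generating each token requires reading roughly all of the model weights from memory once, so bandwidth divided by model size gives a rough ceiling on tokens per second. The sketch below applies that rule of thumb to the two bandwidth figures above. The 8 GB model size is a hypothetical example, and real throughput will be lower because of compute, caching, and interconnect overheads.

```python
def decode_ceiling_tokens_per_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Rough upper bound on single-stream decode speed for a memory-bound model.

    Assumes every generated token reads all weights from VRAM exactly once.
    """
    if weights_gb <= 0:
        raise ValueError("weights_gb must be positive")
    return bandwidth_gb_s / weights_gb


# Bandwidth figures quoted in the article; the model size is a hypothetical example.
GPUS = {"Tesla V100 SXM2": 900.0, "RTX 4080": 736.0}
WEIGHTS_GB = 8.0  # assumption: all weights resident on a single GPU

for name, bandwidth in GPUS.items():
    ceiling = decode_ceiling_tokens_per_s(bandwidth, WEIGHTS_GB)
    print(f"{name}: at most ~{ceiling:.0f} tokens/s")
```

The estimate ignores the KV cache and any traffic between cards, so it is best read as a way to compare GPUs rather than to predict measured speeds.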

To tame the adapter’s cooling fan, the user wired it to a motherboard fan header, which allowed its speed to be controlled and brought the noise down from 82 decibels to a manageable level. The V100 was then installed alongside an RTX 4080, doubling total VRAM and enabling tensor splitting across both cards for large language model inference; a 27-billion-parameter model ran at 32 tokens per second.
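As a rough illustration of how a split across the two cards might be sized, the sketch below estimates the weight footprint of a 27-billion-parameter model and divides it between two 16GB GPUs in proportion to their VRAM. The bits-per-weight value and the per-GPU reserve are assumptions for illustration, not figures from the user’s setup; the source does not say which inference software or quantization was used.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8.0


def split_by_vram(total_gb: float, vram_gb: list[float]) -> list[float]:
    """Divide a model's footprint across GPUs in proportion to each GPU's VRAM."""
    total_vram = sum(vram_gb)
    return [total_gb * v / total_vram for v in vram_gb]


VRAM_GB = [16.0, 16.0]   # RTX 4080 and V100, per the article
RESERVE_GB = 2.0         # assumption: per-GPU headroom for KV cache and buffers
BITS_PER_WEIGHT = 4.5    # assumption: a 4-bit quantization with some overhead

model_gb = weights_gb(27, BITS_PER_WEIGHT)
shares = split_by_vram(model_gb, VRAM_GB)
fits = all(s + RESERVE_GB <= v for s, v in zip(shares, VRAM_GB))

print(f"weights: {model_gb:.1f} GB, per-GPU shares: {[round(s, 1) for s in shares]}")
print("fits" if fits else "does not fit")
```

Under these assumptions the model would not fit on either 16GB card alone once headroom is reserved, which is why pooling both cards matters.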

Why It Matters

This development demonstrates a cost-effective way for enthusiasts and researchers to access high-bandwidth, high-memory GPUs typically reserved for data centers, enabling advanced AI and machine learning tasks on consumer hardware at a fraction of the cost.

It highlights the potential for hardware hacking and adaptation, which could influence how individuals and small labs approach large-scale AI inference without investing in expensive, purpose-built hardware.

Background

High-end consumer GPUs like the RTX 4090 and RTX 5090 offer substantial VRAM and bandwidth but come at high prices, often exceeding £2,000. Data center GPUs like the Tesla V100 are designed for server environments, with proprietary form factors and connectors, making them inaccessible without adapters. Enthusiasts have previously explored repurposing such hardware for personal use, but this case is notable for its low cost and practical approach.

The V100 was released in 2017, but it can still outperform many modern consumer cards in specific AI tasks, particularly those limited by memory bandwidth, which is critical for large language model inference.

“For about £200 total, I had a 16GB VRAM GPU that could slot into my motherboard alongside my RTX 4080. That is 32GB of total VRAM.”

— the user

“The fan on this adapter is not subtle. It is not quiet. It is a loud fan designed for server environments, but I managed to control it with PWM.”

— the user

What Remains Unclear

It remains unclear how stable and durable this setup will be over the long term, as it relies on an unofficial adapter and manual fan control. Compatibility and driver support may pose ongoing challenges, and the user has not yet tested sustained workloads or durability over time.

What’s Next

The user plans to further optimize cooling and explore more advanced control over the GPU. There may also be community interest in similar hardware hacks, potentially leading to more accessible methods for repurposing data center GPUs in consumer systems.
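One form that finer cooling control could take is a temperature-based fan curve. The sketch below maps a GPU temperature to a PWM duty cycle with a clamped linear ramp. The temperature thresholds and duty limits are hypothetical values chosen for illustration, and the code does not read sensors or drive real hardware.

```python
def fan_duty_percent(temp_c: float,
                     min_temp: float = 40.0, max_temp: float = 80.0,
                     min_duty: float = 30.0, max_duty: float = 100.0) -> float:
    """Map a temperature in Celsius to a PWM duty cycle in percent.

    Below min_temp the fan idles at min_duty; above max_temp it runs at
    max_duty; in between, the duty rises linearly.
    """
    if temp_c <= min_temp:
        return min_duty
    if temp_c >= max_temp:
        return max_duty
    fraction = (temp_c - min_temp) / (max_temp - min_temp)
    return min_duty + fraction * (max_duty - min_duty)


for temp in (35, 60, 85):
    print(f"{temp} C -> {fan_duty_percent(temp):.0f}% duty")
```

A nonzero minimum duty is a deliberate choice here, since a passively cooled server module with a stalled fan can overheat quickly.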

Key Questions

Can I use a data center GPU in my gaming PC?

Yes, with an appropriate adapter and some hardware modifications, such as fan control, it is possible to install a data center GPU like the Tesla V100 in a consumer PC.

Is this setup stable for long-term use?

This depends on the quality of the adapter, cooling, and driver support. It is experimental and may not be suitable for critical or prolonged workloads without further testing.

How does the performance compare to consumer GPUs?

The data center GPU offers higher memory bandwidth than many consumer GPUs, and its added VRAM benefits large AI models, but it may not outperform high-end consumer GPUs in gaming or general tasks. Its main advantage is in AI inference tasks that are limited by memory bandwidth.

Source: Hacker News
