TL;DR
A gamer installed a Tesla V100 SXM2 data center GPU in a gaming PC using an adapter, doubling VRAM for AI tasks at a low cost. The setup involves hardware modifications, including fan control adjustments. This demonstrates a cost-effective way to access high-bandwidth GPUs for machine learning.
A gamer has installed a Tesla V100 SXM2 data center GPU in a consumer gaming PC alongside an existing RTX 4080, giving the system a combined 32GB of VRAM for large language model inference at a significantly lower cost than a high-end consumer GPU.
The Tesla V100 SXM2, a data center GPU built for NVIDIA's SXM2 server sockets, was adapted for a standard PC with an SXM2-to-PCIe adapter bought for approximately £50. The user paid about £150 for the GPU itself on eBay, bringing the total to roughly £200. Paired with the RTX 4080 already in the system, the setup provides 32GB of combined VRAM for running large AI models.
The V100 has 16GB of HBM2 memory, 5120 CUDA cores, and 900 GB/s of memory bandwidth, which exceeds that of many consumer GPUs, including the 736 GB/s cited for the RTX 4080. The adapter is not officially supported, and the SXM2 module itself has no display outputs or standard PCIe power connectors.
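To see why memory bandwidth matters here, the following is a minimal back-of-the-envelope sketch, not a benchmark. It assumes token generation is limited by reading every model weight once per token, that the two GPUs handle their share of the weights one after another, and that the model is quantized to about 0.5 bytes per parameter. The quantization level, the even split, and the function name `max_decode_tokens_per_s` are assumptions for illustration; the source does not state them.

```python
def max_decode_tokens_per_s(weight_bytes_per_gpu, bandwidth_gb_s_per_gpu):
    """Rough ceiling on tokens/s if each token reads every weight once.

    Assumes the GPUs work through their share of the weights in sequence,
    so the time per token is the sum of each GPU's read time.
    """
    seconds_per_token = sum(
        weight_bytes / (bandwidth_gb_s * 1e9)
        for weight_bytes, bandwidth_gb_s in zip(
            weight_bytes_per_gpu, bandwidth_gb_s_per_gpu
        )
    )
    return 1.0 / seconds_per_token


# Assumption: 27e9 parameters at ~0.5 bytes each (4-bit), split evenly.
model_bytes = 27e9 * 0.5
half = model_bytes / 2

# Bandwidth figures are the ones quoted in the article (GB/s).
estimate = max_decode_tokens_per_s([half, half], [900.0, 736.0])
print(f"bandwidth-bound ceiling: {estimate:.0f} tokens/s")
```

With these assumed inputs the ceiling lands near 60 tokens per second, so the reported 32 tokens per second sits plausibly below it once compute, PCIe transfers, and other overheads are taken into account.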
To tame the adapter's cooling fan, the user wired it to a motherboard fan header, which allowed PWM speed control and brought the noise down from 82 decibels to a manageable level. The V100 was then installed alongside an RTX 4080, doubling total VRAM and enabling tensor splitting across both cards for large language model inference; a 27-billion-parameter model ran at 32 tokens per second.
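The idea behind splitting a model across two cards can be sketched as dividing its layers in proportion to each GPU's VRAM. The snippet below is an illustrative sketch only: the source does not say which inference runtime was used, and the helper `split_layers` and the layer count of 60 are hypothetical. Some runtimes accept a comparable ratio directly (llama.cpp, for example, has a `--tensor-split` option), but whether that tool was used here is an assumption.

```python
def split_layers(n_layers, vram_gb_per_gpu):
    """Assign whole layers to GPUs in proportion to each GPU's VRAM."""
    total = sum(vram_gb_per_gpu)
    # Round down first so we never hand out more layers than exist.
    counts = [int(n_layers * v / total) for v in vram_gb_per_gpu]
    # Give any layers lost to rounding to the largest GPUs first.
    order = sorted(range(len(counts)), key=lambda i: -vram_gb_per_gpu[i])
    for i in range(n_layers - sum(counts)):
        counts[order[i % len(order)]] += 1
    return counts


# Hypothetical 60-layer model on a 16GB V100 plus a 16GB RTX 4080.
layers = split_layers(60, [16, 16])
ratio = ",".join(str(n) for n in layers)
print(layers, ratio)
```

Because both cards in the article have 16GB, the split is even; with mismatched cards the same function would skew the layer counts toward the larger one.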
Why It Matters
This development demonstrates a cost-effective way for enthusiasts and researchers to access high-bandwidth, high-memory GPUs typically reserved for data centers, enabling advanced AI and machine learning tasks on consumer hardware at a fraction of the cost.
It highlights the potential for hardware hacking and adaptation, which could influence how individuals and small labs approach large-scale AI inference without investing in expensive, purpose-built hardware.

Background
High-end consumer GPUs like the RTX 4090 and RTX 5090 offer substantial VRAM and bandwidth but come at high prices, often exceeding £2,000. Data center GPUs like the Tesla V100 are designed for server environments, with proprietary form factors and connectors, so they cannot be installed in a standard PC without an adapter. Enthusiasts have previously explored repurposing such hardware for personal use, but this case is notable for its low cost and practical approach.
The V100 was released in 2017. Its memory bandwidth still exceeds that of many modern consumer cards, which matters because large language model inference is often limited by how quickly model weights can be read from memory.
“For about £200 total, I had a 16GB VRAM GPU that could slot into my motherboard alongside my RTX 4080. That is 32GB of total VRAM.”
— the user
“The fan on this adapter is not subtle. It is not quiet. It is a loud fan designed for server environments, but I managed to control it with PWM.”
— the user

What Remains Unclear
It remains unclear how stable the setup will be over the long term, since it relies on an unofficial adapter and manual fan control. Compatibility and driver support may pose ongoing challenges, and the user has not yet tested sustained workloads or long-term durability.

What’s Next
The user plans to further optimize cooling and explore more advanced control over the GPU. There may also be community interest in similar hardware hacks, potentially leading to more accessible methods for repurposing data center GPUs in consumer systems.
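One form that more advanced fan control could take is a temperature-based fan curve. The sketch below shows only the interpolation logic; motherboard firmware or fan-control software normally does this itself. The function `fan_duty_percent` and the curve points are hypothetical and are not taken from the user's setup.

```python
def fan_duty_percent(temp_c, curve):
    """Linearly interpolate a PWM duty cycle (0-100) from (temp_c, duty) points."""
    points = sorted(curve)
    # Clamp to the first and last points outside the curve's range.
    if temp_c <= points[0][0]:
        return points[0][1]
    if temp_c >= points[-1][0]:
        return points[-1][1]
    # Find the segment containing temp_c and interpolate within it.
    for (t0, d0), (t1, d1) in zip(points, points[1:]):
        if t0 <= temp_c <= t1:
            return d0 + (d1 - d0) * (temp_c - t0) / (t1 - t0)


# Hypothetical curve: quiet at idle, full speed at 80 °C.
curve = [(40, 30), (60, 50), (80, 100)]
duty = fan_duty_percent(70, curve)
print(f"duty at 70 °C: {duty:.0f}%")
```

A gentler slope at low temperatures keeps the server-grade fan quiet at idle while still ramping to full speed under sustained load.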

Key Questions
Can I use a data center GPU in my gaming PC?
Yes, with an appropriate adapter and some hardware modifications, such as fan control, it is possible to install a data center GPU like the Tesla V100 in a consumer PC.
Is this setup stable for long-term use?
This depends on the quality of the adapter, cooling, and driver support. It is experimental and may not be suitable for critical or prolonged workloads without further testing.
How does the performance compare to consumer GPUs?
The V100 offers higher memory bandwidth than many consumer GPUs and adds VRAM to the system, which benefits large AI models, but it may not outperform high-end consumer GPUs in gaming or general tasks. Its main advantage is in bandwidth-bound AI inference.
Source: Hacker News