Homemade Nvidia RTX 5090 Inference Machine

Hi!

This is going to be another quick one. I decided it was time to upgrade my inference (and training) capability at home, now that I’m no longer part of a large company with substantial cloud access. So I decided to build my own box with sufficient power and memory to run small and medium-sized transformers. While I am going to name a few brands and part numbers here, I paid for all of this myself and I am in no way associated with, or compensated by, any of the companies mentioned. This is not an advertisement, just my own experience. YMMV, objects in the rear-view mirror, you know the drill. Don’t take this as “the truth” or any form of guarantee that this will work for you. I’m quite experienced in this, having built one of the first PC-based Linux clusters in Europe back in 1996, so I do things others might not want to do.

I started my design from the most recent PC-building proposal from c’t, my go-to publication for computer topics here in Germany. The article itself is paywalled, but they have a public project website at https://ct.de/yay1. I kept the general idea but modified a few details to get a bit more power. I ended up with the following spec:

  • AMD Ryzen 9 9950X, 16C/32T, 4.30-5.70GHz
  • Noctua NH-U14S CPU cooler
  • ASUS Prime X870-P mainboard
  • Crucial UDIMM 32GB, DDR5-5600, CL46-45-45
  • Samsung SSD 990 PRO 2TB, M.2 2280 / M-Key / PCIe 4.0 x4
  • be quiet! Pure Power 13 M 1000W
  • Fractal Design Focus 2 Black Solid Tower
  • ZOTAC GeForce RTX 5090 SOLID PCIe x16 5.0 GPU

The core is the ‘Blackwell’ RTX 5090. Since it draws up to 575 W, I upgraded the power supply to a 1000 W model. I also selected a model with the new 16-pin PCIe 5.1 600 W power connector to avoid any heat issues with intermediate cable adapters for the older 12 V GPU power outputs. So far this has worked out well. I also upgraded the RAM. I kept the Noctua cooler, although it is rated slightly below the TDP of the CPU; no issues so far, and it is very quiet. I took the mainboard version without Wi-Fi since the box will be attached to wired Ethernet only. I swapped the case since the one used in the c’t project wasn’t available.

Putting it together took some time, and the initial boot attempt looked like it had failed: I did not get a video signal, neither from the motherboard HDMI nor from the Zotac. I removed the Zotac again and checked all connections, but nothing seemed to be wrong. It turns out that the initial self-test of the board and CPU takes a *long* time; after waiting a few minutes, I was greeted with the AMI firmware screen. As everything looked fine, I then reinstalled the Zotac and, again after a *long* wait, the firmware screen showed up on its HDMI out. So far so good.

I then installed Ubuntu 25.04, which went without problems. I repartitioned the drive to set aside 256 GB of swap and a separate boot partition. The Realtek network interface worked without issues. After a couple of reboots, installing openssh-server, and disabling the graphical boot with “systemctl set-default multi-user.target”, I had a very quiet box idling along. I plugged the box into a Tasmota power sensor, which showed about 55 W of power consumption at idle. Not great, but OK. I will check later how I can get that further down.
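
If you want to log that idle draw over time, here is a minimal Python sketch that polls the plug via Tasmota’s standard HTTP command API. The address 192.168.1.50 is just a placeholder for whatever your plug uses, and the JSON layout assumes a stock Tasmota energy-monitoring build.

    # Poll a Tasmota power plug every 10 seconds and print the current draw.
    # Assumption: the plug answers Tasmota's HTTP "Status 8" command and
    # reports power under StatusSNS -> ENERGY -> Power (watts).
    import json
    import time
    import urllib.request

    PLUG = "http://192.168.1.50/cm?cmnd=Status%208"  # placeholder address

    while True:
        with urllib.request.urlopen(PLUG, timeout=5) as resp:
            status = json.load(resp)
        watts = status["StatusSNS"]["ENERGY"]["Power"]  # instantaneous draw in W
        print(f"{time.strftime('%H:%M:%S')}  {watts} W")
        time.sleep(10)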

To test the AI/ML capabilities of the box, I installed uv, PyTorch and Ollama. All worked right out of the box. nvidia-smi reports: NVIDIA-SMI 575.64.03, Driver Version: 575.64.03, CUDA Version: 12.9.
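
As a quick sanity check that PyTorch actually sees the card and can run a kernel on it, something small like this is enough (any matmul will do):

    # Verify that PyTorch detects the RTX 5090 and can execute work on it.
    import torch

    print(torch.__version__, torch.version.cuda)
    print(torch.cuda.is_available(), torch.cuda.get_device_name(0))

    x = torch.randn(4096, 4096, device="cuda", dtype=torch.float16)
    y = x @ x                      # runs a matmul on the GPU
    torch.cuda.synchronize()
    print(y.shape, y.dtype)
    print(f"{torch.cuda.memory_allocated() / 2**20:.0f} MiB allocated")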

I then pulled DeepSeek-R1 and ran it with a couple of prompts. While running the model, the machine pulled about 600 W but was still surprisingly quiet, with just a somewhat louder hum from the fans. So I am quite pleased with the result so far.

Hope this helps,

H.

PS: I tested with the following prompt: “why are there infinite irrational numbers between every pair of rational numbers? give a proof”. This is great for keeping DeepSeek-R1 busy, but the result isn’t pretty.
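
If you want to reproduce the test without typing into the ollama CLI, here is a small sketch against Ollama’s local REST API. It assumes the model was pulled under the tag deepseek-r1 and that the daemon listens on its default port 11434; adjust both to your install.

    # Send the test prompt to the local Ollama server and print the reply.
    # Assumptions: model tag "deepseek-r1", default endpoint localhost:11434.
    # A reasoning model can take several minutes on a prompt like this.
    import json
    import urllib.request

    payload = {
        "model": "deepseek-r1",
        "prompt": "why are there infinite irrational numbers between every "
                  "pair of rational numbers? give a proof",
        "stream": False,
    }
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        answer = json.load(resp)
    print(answer["response"])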
