▲I regret building this $3000 Pi AI clusterjeffgeerling.com

361 points by speckx 7 hours ago | 280 comments

densh 3 hours ago [-]

For anyone interested in playing with distributed systems, I'd really recommend getting a single machine with latest 16-core CPU from AMD and just running 8 virtual machines on it. 8 virtual machines, with 4 hyper threads pinned per machine, and 1/8 of total RAM per machine. Create a network between them virtually within your virtualization software of choice (such as Proxmox).

And suddenly you can start playing with distributed software, even though it's running on a single machine. For resiliency tests you can unplug one machine at a time with a single click. It will annihilate a Pi cluster in Perf/W as well, and you don't have to assemble a complex web of components to make it work. Just a single CPU, motherboard, m.2 SSD, and two sticks of RAM.

Naturally, using a high core count machine without virtualization will get you best overall Perf/W in most benchmarks. What's also important but often not highlighted in benchmarks in Idle W if you'd like to keep your cluster running, and only use it occasionally.

qmr 1 hours ago [-]

No need for so much CPU power, any old quad core would work.

bee_rider 2 hours ago [-]

Tangentially related: I really expected running old MPI programs on stuff like the AMD multi-chip workstation packages to become a bigger thing.

cyberpunk 2 hours ago [-]

Honestly why do you need so much cpu power? You can play with distributed systems just by installing Erlang and running a couple of nodes on whatever potato-level linux box you have laying around, including a single raspberry pi.

globular-toast 2 hours ago [-]

I've been saying this for years. When the last Raspberry Pi shortage happened people were scrambling to get them for building these toy clusters and it's such a shame. The Pi was made for paedogogy but I feel like most of them are wasted.

I run a K8s "cluster" on a single xcp-ng instance, but you don't even really have to go that far. Docker Machine could easily spin up docker hosts with a single command, but I see that project is dead now. Docker Swarm I think still lets you scale up/down services, no hypervisor required.

bunderbunder 6 hours ago [-]

Reminds me a bit of one of my favorite NormConf sessions, "Just use one big machine for model training and inference." https://youtu.be/9BXMWDXiugg?si=4MnGtOSwx45KQqoP

Or the oldie-but-goodie paper "Scalability! But at what COST?": https://www.usenix.org/system/files/conference/hotos15/hotos...

Long story short, performance considerations with parallelism go way beyond Amdahl's Law, because supporting scale-out also introduces a bunch of additional work that simply doesn't exist in a single node implementation. (And, for that matter, multithreading also introduces work that doesn't exist for a sequential implementation.) And the real deep down black art secret to computing performance is that the fastest operations are the ones you don't perform.

bee_rider 5 hours ago [-]

> The first benchmark I ran was my top500 High Performance Linpack cluster benchmark. This is my favorite cluster benchmark, because it's the traditional benchmark they'd run on massive supercomputers to get on the top500 supercomputer list. […]

> After fixing the thermals, the cluster did not throttle, and used around 130W. At full power, I got 325 Gflops

I was sort of surprised to find that the top500 list on their website only goes back to 1993. I was hoping to find some ancient 70’s version of the list where his ridiculous Pi cluster could sneak on. Oh well, might as well take a look… I’ll pull from the sub-lists of

https://www.top500.org/lists/top500/

They give the top 10 immediately.

First list (June 1993):

     placement  name            RPEAK (GFlop/s)
     1          CM-5/1024       131.00
     10         Y-MP C916/16256 15.24

Last list he wins, I think (June 1996):

     1          SR2201/1024     307.20  
     10         SX-4/32         64.00

First list he’s bumped out of the top 10 (November 1997):

     1          ASCI Red        1,830.40
     10         T3E             326.40

I think he gets bumped off the full top500 list around 2002-2003. Unfortunately I made the mistake of going by Rpeak here, but they sort by Rmax, and I don’t want to go through the whole list.

Apologies for any transcription errors.

Actually, pretty good showing for such a silly cluster. I think I’ve been primed by stuff like “your watch has more compute power than the Apollo guidance computer” or whatever to expect this sort of thing to go way, way back, instead of just to the 90’s.

Coffeewine 6 hours ago [-]

It's a pretty rough headline, clearly the author had fun performing the test and constructing the thing.

I would be pretty regretful of just the first sentence in the article, though:

> I ordered a set of 10 Compute Blades in April 2023 (two years ago), and they just arrived a few weeks ago.

That's rough.

geerlingguy 5 hours ago [-]

That's the biggest regret; but I've backed 6 Kickstarter projects over the years. Median time to deliver is 1 year.

Somehow I've actually gotten every item I backed shipped at some point (which is unexpected).

Hardware startups are _hard_, and after interacting with a number of them (usually one or two people with a neat idea in an underserved market), it seems like more than half fail before delivering their first retail product. Some at least make it through delivering prototypes/crowdfunded boards, but they're already in complete disarray by the end of the shipping/logistics nightmares.

maartin0 5 hours ago [-]

Not completely related, but do you know if hardware kickstarters typically have any IP protection? I'm surprised there haven't been any cases of large companies creating patents for ideas from kickstarter at least that I've seen

ssl-3 2 hours ago [-]

One cannot (or at least, one is not supposed to be able to) patent someone else's invention.

privatelypublic 50 minutes ago [-]

You theoretically have a year before you even have to apply- but patents are expressly "first to file."

4 hours ago [-]

ComputerGuru 4 hours ago [-]

Guys, don’t take the claim so literally. He’s a successful tech poster. He makes good money showing off his purchases the good money complaining about how expensive they were.

But certainly don’t imitate his choices, his economics aren’t your economics!

esskay 3 hours ago [-]

Thats pretty much a given, but what the real takeaway should be from most of the content is that whatever you are doing, these days the answer in all likelihood is not to buy a raspberry pi. It's specs to price just do not add up at all anymore, and it's looking like a pretty damn stagnant place these days.

Computer0 2 hours ago [-]

What if you need ARM? What is the best sub $100 sbc that I am missing? Orange Pi hardware always looks good but I hear a lot of negativity about the software that I don't really experience with Raspbian.

michaelt 2 hours ago [-]

Then you should take a good hard look at older, much cheaper Raspberry Pis.

Then look at Apple’s ARM offerings, and AWS Graviton if you need ARM with raw power.

If you need embedded/GPIO you should consider an Arduino, or clone. If you need GPIOs and Internet connectivity, look at an ESP32. GPIOs, ARM and wired ethernet? Consdier the the STM32H.

Robotics/machine vision applications, needing IO and lots of compute power? Consider a regular PC with an embedded processor on serial or USB. Or nvidia jetson if you want to run CUDA stuff.

And take a good hard look at your assumptions, as mini PCs using the Intel N100 CPU are very competitive with modern Pis.

privatelypublic 52 minutes ago [-]

I've heard nothing but horror stories on the Jetson & Tegra in general. I'd Avoid it unless the project MUST use a SoM w/ CUDA. which will basically only be Professional stuff. I've never heard of anything hobby level where a PCIe slot was a deal breaker- even with high vibration. (PCIe 4.0 isn't terrible difficult to get good flex cables for)

dzhiurgis 2 hours ago [-]

Yea but Jeff’s videos are refreshing.

A lot of others are stuck in a loop where they essentially review tech for making more youtube videos - render times, colour accuracy, camera resolution, audio fidelity.

brcmthrowaway 3 hours ago [-]

If only Dan Luu pivoted to this style of content

4 hours ago [-]

system2 3 hours ago [-]

I totally agree. A person with that kind of builder knowledge already knows a decent GPU could 10x that compute power.

fidotron 6 hours ago [-]

If Pi Clusters were actually cost competitive for performance there would be data centres full of them.

shermantanktop 6 hours ago [-]

Like the joke about the economists not picking up the $20 bill on the ground?

Faith in the perfect efficiency of the free market only works out over the long term. In the short term we have a lot of habits that serve as heuristics for doing a good job most of the time.

uncircle 4 hours ago [-]

> Like the joke about the economists not picking up the $20 bill on the ground?

For those like me that don't know the joke:

Two economists are walking down the street. One of them says “Look, there’s a twenty-dollar bill on the sidewalk!” The other economist says “No there’s not. If there was, someone would have picked it up already.”

shermantanktop 1 hours ago [-]

Presumably the non-economist following them picked up the twenty, unencumbered by theory.

ThrowawayR2 5 hours ago [-]

There's been so much investigation into alternative architectures for datacenters and cloud providers, including FAANG resorting to designing their own ARM processors and accelerator chips (e.g. AWS Graviton, Google TPUs) and having them fabbed, that that comes off not as warranted cynicism but silly cynicism.

themafia 3 hours ago [-]

It's quite the opposite when corruption becomes involved. There are definite financial incentives for middle men to deliver inefficient and wasteful experiences.

Competition is what creates efficiency. Without it you live in a lie.

infecto 6 hours ago [-]

Sure but for commodities, like server hardware, we can say it’s usually directionally correct. If there are no pi cloud offerings, there is probably a good economic reason for it.

IAmBroom 6 hours ago [-]

> Faith in the perfect efficiency of the free market only works out over the long term

... and even then it doesn't always prove true.

rozab 14 minutes ago [-]

People said the same about PlayStations, to be fair

pmarreck 3 hours ago [-]

Yes. And if women were actually paid 80 cents to the dollar for men, men would be unemployable.

phoronixrly 6 hours ago [-]

If they were cost competitive for ... anything at all really...

ssl-3 33 minutes ago [-]

It depends on the application.

If one just wants a cheap desktop box to do desktop things with, then they're a terrible option, price-wise, compared to things like used corpo mini-PCs.

But they're reasonably cost-competitive with other new (not used!) small computers that are tinkerer-friendly, and unlike many similar constructs there's a plethora of community-driven support for doing useful things with the unusual interfaces they expose.

jacobr1 6 hours ago [-]

They are competitive for hobbyist use cases. Limited home servers, or embedded applications that overlap with arduino.

rebolek 27 minutes ago [-]

They are cost competitive enough for Korg synthesizers which is pretty OK for me.

ACCount37 6 hours ago [-]

Prototyping and low volume.

They're good for long as the development costs dominate the total costs.

magicalhippo 2 hours ago [-]

I picked up several Rpi 4 2GB for $20 each just before covid-19. At that price point they've been quite competitive for small homelab workloads.

The current RPi 5 makes no sense to me in any configuration, given its pricing.

mayli 1 hours ago [-]

Yeah, it's only competitive as a toy for under $35, anything beyond that you can get a cheap x86 with much better performance, a much compatible architecture and much more IOs.

wltr 6 hours ago [-]

Well I have a Pi as a home server, and it’s very energy efficient, while doing what I want. Since I don’t need latest and greatest (I don’t see any difference with a modern PC for my use case), it’s very competitive for me. No need for any cooling is bonus.

Waraqa 5 hours ago [-]

>very energy efficient

If your server has a lot of idle time, ARM will always win.

miunau 2 hours ago [-]

Mythic Beasts rents rpi servers: https://www.mythic-beasts.com/order/rpi/ - there is a niche for it

mayli 1 hours ago [-]

This is also a proof where you can get cheaper and more perf/$ if you buy x86 based vps.

nromiun 6 hours ago [-]

There is a reason all the big supercomputers have started using GPUs in the last decade. They are much more efficient. If you want 32bit parallel performance just buy some consumer GPUs and hook them up. If you need 64bit buy some prosumer GPUs like the RTX 6000 Pro and you are done.

Nobody is really building CPU clusters these days.

anematode 1 hours ago [-]

Unfortunately even the RTX 6000 Pro has nerfed double-precision throughput at about 2 TFLOPS, 64x slower than single precision. For comparison an EPYC 9755 does ~10 TFLOPS, while drawing less power. An A100 -- if you can find one -- is in the same ballpark.

The best option for DP throughput for hobbyists interested in HPC might be old AMD cards from before they, too, realized that scientific folks would pay up the nose for higher precision.

ted_dunning 3 hours ago [-]

Well, El Capitan uses AMD CPUs (which have integrated GPU capabilities) and it is right on top of the rankings lately.

Frontier is right behind it with the same arrangement.

Having honest to god dedicated GPUs on their own data bus with their own memory isn't necessarily the fastest way to roll.

nromiun 3 hours ago [-]

They do not. The CPUs are only there to support and push data to the GPUs. Much like Nvidia GH200 systems. Nobody buys these APU chips for their CPU parts.

For comparison there are 9,988,224 GPU compute units in El Capitan and only 1,051,392 CPU cores. Roughly one CPU core to push data to 10 GPU CUs.

Aurornis 6 hours ago [-]

I thought the conclusion should have been obvious: A cluster of Raspberry Pi units is an expensive nerd indulgence for fun, not an actual pathway to high performance compute. I don’t know if anyone building a Pi cluster actually goes into it thinking it’s going to be a cost effective endeavor, do they? Maybe this is just YouTube-style headline writing spilling over to the blog for the clicks.

If your goal is to play with or learn on a cluster of Linux machines, the cost effective way to do it is to buy a desktop consumer CPU, install a hypervisor, and create a lot of VMs. It’s not as satisfying as plugging cables into different Raspberry Pi units and connecting them all together if that’s your thing, but once you’re in the terminal the desktop CPU, RAM, and flexibility of the system will be appreciated.

bunderbunder 6 hours ago [-]

The cost effective way to do it is in the cloud. Because there's a very good chance you'll learn everything you intended to learn and then get bored with it long before your cloud compute bill reaches the price of a desktop with even fairly modest specs for this purpose.

dukeyukey 5 hours ago [-]

It's good for the soul to have your cluster running in your home somewhere.

NordSteve 4 hours ago [-]

Bad for your power bill though.

platybubsy 3 hours ago [-]

I'm sure 5 rpis will devastate the power grid

duxup 3 hours ago [-]

I need to heat my house too so maybe it helps a little there.

trenchpilgrim 1 hours ago [-]

Still less than renting the same amount of compute. Somewhere between several months and a couple years you pull ahead on costs. Unless you only run your lab a few hours a day.

11101010001100 3 hours ago [-]

You still pay for power for the cloud.

Damogran6 3 hours ago [-]

I got past that back when I was paying for ISDN and had 5 Surplus Desktop PCs...write it off as 'Professional development'

throwaway894345 3 hours ago [-]

What does a few rpis cost on a monthly basis?

theodric 3 hours ago [-]

Depends. At full load? At Irish power prices? Just the Pi, no peripherals, no NVMe? 5 units? €13/mo.

Handy: https://700c.dk/?powercalc

My Pi CM4 NAS with a PCIe switch, SATA and USB3 controllers, 6 SATA SSDs, 2 VMs, 2 LXC containers, and a Nextcloud snap pretty much sits at 17 watts most of the time, hitting 20 when a lot is being asked of it, and 26-27W at absolute max with all I/O and CPU cores pegged. €3.85/mo if I pay ESB, but I like to think that it runs fully off the solar and batteries :)

throwaway894345 2 hours ago [-]

> Depends. At full load? At Irish power prices? Just the Pi, no peripherals, no NVMe? 5 units? €13/mo.

Pretty sure most of us aren't running anywhere close to full load 24/7, but whoa, Irish power is expensive. In the central US I pay $0.14/KWh.

ofrzeta 5 hours ago [-]

Maybe so, but even then a second-hand blade server is more cost-effective than a Raspi Cluster.

geerlingguy 3 hours ago [-]

Not if you run it idle a lot; most commercial blade servers suck down a lot of power. I think a niche where Pi blades can work is for a learning cluster, like in schools for HPC learning, network automation, etc.

It's definitely not suited for production, but there, you won't find old blade servers either (for the power to performance issue).

Almondsetat 6 hours ago [-]

I can get a Xeon E5-2690V4 with 28 threads and 64GB of RAM for about $150. If you need cores and memory to make a lot of VMs you can do it extremely cheaply

Aurornis 5 hours ago [-]

> I can get a Xeon E5-2690V4 with 28 threads and 64GB of RAM for about $150.

If the goal is a lot of RAM and you don’t care about noise, power, or heat then these can be an okay deal.

Don’t underestimate how far CPUs have come, though. That machine will be slower than AMD’s slowest entry-level CPU. Even an AMD 5800X will double its single core performance and even walk away from it on multithreaded tasks despite only having 8 cores. It will use less electricity and be quiet, too. More expensive, but if this is something you plan to leave running 24/7 the electricity costs over a few years might make the power hungry server more expensive over time.

semi-extrinsic 5 hours ago [-]

For $3000 you can get 3x used Epyc servers with a total of 144 cores and 384 GB memory, with dual-port 25Gbe networking so you can run them in a fully connected cluster without a switch. It will have >20x better perf/$ and ~3x better perf/W.

That combo gives you the better part of a gigabyte of L3 cache and an aggregate memory bandwidth of 600 GB/s, while still below 1000W total running at full speed. Plus your NICs are the fancy kind that let you play around with RoCEv2 and such nifty stuff.

It would also be relevant to then also learn how to do stuff properly with SLURM and Warewulf etc. instead of a poor mans solution with Ansible playbooks like in these blog posts.

p12tic 3 minutes ago [-]

Better build a single workstation - less noise, less power usage and the form factor is way more convenient. A budget of $3000 can buy 128 cores with 512GB of RAM on a single regular EATX motherboard, a case, a power supply and other accessories. Power usage is ~550W at maximum utilization which not much more than a gaming rig with a powerful GPU.

Almondsetat 3 hours ago [-]

You are taking my reply completely out of context. If you want to learn clustering, you need a lot of cores and ram to run many VMs. You don't need them to be individually very powerful.

mattbillenstein 5 hours ago [-]

Power and noise - old server hardware is not something you want in your home.

Commodity desktop cpus with 32 or 64GB RAM can do all of this in a low-power and quiet way without a lot more expense.

nine_k 6 hours ago [-]

It will probably consume $150 worth of electricity in less than a month, even sitting idle :-\

blobbers 6 hours ago [-]

The internet says 100W idle, so maybe more like $40-50 electricity, depending on where you live could be cheaper could be more expensive.

Makes me wonder if I should unplug more stuff when on vacation.

nine_k 6 hours ago [-]

I was surprised to find out that my apartment pulls 80-100W when everything is seemingly down during the night. A tiny light here and there, several displays in sleep mode, a desktop idling (mere 15W, but), a laptop charging, several phones charging, etc, the fridge switches on for a short moment. The many small amounts add up to something considerable.

amatecha 4 hours ago [-]

Yeah it kinda puts it all into perspective when you think of how every home used to use 60-watt light bulbs all throughout. Most people just leave lights on all over their home all day, probably using hundreds of watts of electricity. Makes me realize my 35-65w laptop is pretty damn efficient haha

ToucanLoucan 4 hours ago [-]

I got out of the homelab game as I finished my transition from DevOps to Engineering Lead, and it was simply massively overbuilt for what I actually needed. I replaced an ancient Dell R700 series, R500 series, and a couple supermicros with 3 old desktop PCs in rack enclosures and cut my electric bill nearly $90/month.

Fuckin nutty how much juice those things tear through.

rogerrogerr 5 hours ago [-]

100W over a month (rule of thumb 730 hours) is 73kWh. Which is $7.30 at my $0.10/kWh rate, or less than $25 at (what Google told me is) Cali’s average $0.30/kWh.

mercutio2 5 hours ago [-]

Your googling gave results that were likely accurate for California 4-5 years ago. My average cost per kWh is about 60 cents.

Rates have gone up enormously because the cost of wildfires is falling on ratepayers, not the utility owners.

Regulated monopolies are pretty great, aren’t they? Heads I win, tales you lose.

lukevp 5 hours ago [-]

60 cents per kWh? That’s shocking. Here in Oregon people complain about energy prices and my fully loaded cost (not the per kWh but including everything) is 19c. And I go over the limit for single family residential where I end up in a higher priced bracket. Thanks for making me feel better about my electricity rate. I’m sorry you have to deal with that. The utility companies should have to pay to cover those costs.

Damogran6 3 hours ago [-]

CORE energy in Colorado is charging $0.10819 per kWh _today_

https://core.coop/my-cooperative/rates-and-regulations/rate-...

cogman10 5 hours ago [-]

Depends entirely on the utilities board doing the regulation.

That said, I'm of the opinion that power/water/internet should all be state/county/city ran. I don't want my utilities companies to have profit motives.

My water company just got bought up by a huge water company conglomerate and, you guessed it, immediate rate increases.

SoftTalker 5 hours ago [-]

Most utilities, even if ostensibly privately-owned, are profit-limited and rates must be approved by a regulatory board. Some are organized as non-profits (rural water and electric co-ops, etc.) This is in exchange for the local monopoly.

If your local regulators approved the merger and higher rates, your complaint is with them as much as the utility company.

Not saying that some regulators are not basically rubber stamps or even corrupt.

cogman10 4 hours ago [-]

I agree. The issue really is that they are 3 layers removed from where I can make a change. They are all appointed and not elected which means I (and my neighbors) don't have any recourse beyond the general election. IIRC, they are appointed by the governor which makes it even harder to fix (might be the county commissioner, not 100% on how they got their position, just know it was an appointment).

I did (as did others), in fact, write in comments and complaints about the rate increases and buyout. That went unheard.

LTL_FTC 4 hours ago [-]

They have definitely increased but not all of California is like this. In the heart of Silicon Valley, Santa Clara, it's about $0.15/kWh. Having Data Centers nearby helps, I suppose.

chermi 3 hours ago [-]

I'm guessing the parent is talking about total bill (transmission, demand charges..) $.15/kwH is probably just the usage, and I am very skeptical that's accurate for residential.

favorited 2 hours ago [-]

Santa Clara's energy rates are an outlier among neighboring municipalities, and should not be used as an example of energy cost in the Bay Area. Santa Clara residents are served by city-owned Silicon Valley Power, which has lower rates than PG&E or SVCE, which service almost all of the South Bay.

yjftsjthsd-h 6 hours ago [-]

> Makes me wonder if I should unplug more stuff when on vacation.

What's the margin on unplugging vs just powering off?

dijit 5 hours ago [-]

By "off" you mean, functionally disabled but with whatever auto-update system in the background with all the radios on for "smart home" reasons - or, "off"?

titanomachy 5 hours ago [-]

100W continuous at 12¢/kWh (US average) is only ~$9 / month. Is your electricity 5x more expensive than the US average?

RussianCow 5 hours ago [-]

The US average hasn't been that low in a few years; according to [0] it's 17.47¢/kWh, and significantly higher in some parts of the country (40+ in Hawaii). And the US has low energy costs relative to most of the rest of the world, so a 3-5x multiplier over that for other countries isn't unreasonable. Plus, energy prices are currently rising and will likely continue to do so over the next few years.

$50/month for 100W continuous usage isn't totally mad, and that could climb even higher over the rest of the decade.

mercutio2 5 hours ago [-]

Not OP, but my California TOU rates are between a 40 and 70 cents per kWh.

Still only $50/month, not $150, but I very much care about 100W loads doing no work.

cjbgkagh 5 hours ago [-]

Those kWh prices are insane, that’ll make industry move out of there.

selkin 2 hours ago [-]

Industrial pays different rates than homes.

That said, I am not sure those numbers are true. I am in California (PG&E with East Bay community generation), and my TOU rates are much lower than those.

Almondsetat 5 hours ago [-]

Isn't your home lab supposed to make you learn stuff? Why would you leave it idle?

cjbgkagh 5 hours ago [-]

You wouldn’t, it’s given as a lower bound, it costs more than that when not idling

dijit 5 hours ago [-]

but then you’d turn it off, if you don’t then cloud is much more expensive too.

Also $150 for 100w is crazy, thats like $1.70 per kWh; it would cost about $150 a year at the (high) rates of southern Sweden.

cjbgkagh 5 hours ago [-]

Im not the OP, don’t know how they arrived at that cost.

Personally it’s cheaper to buy the hardware that does spend most of its time idling. Fast turnaround on very large private datasets being key.

swiftcoder 3 hours ago [-]

Obviously the solution is to pickup another hobby, and enter the DIY solar game at the same time as your home lab obsession :D

kjkjadksj 5 hours ago [-]

So shut it off when you don’t need it.

sebastiansm 6 hours ago [-]

On Aliexpress those Xeon+mobo+ram kits are really cheap.

datadrivenangel 5 hours ago [-]

1. Not in the US with tariffs now. 2. I would not trust complicated electronics from Aliexpress from a safety and security perspective.

6 hours ago [-]

kbenson 5 hours ago [-]

Source? That seems like something I would want to take advantage if at the moment...

kllrnohj 5 hours ago [-]

Note the E5-2690V4 is a 10 year old CPU, they are talking about used servers. You can find those on ebay or whatever as well as stores specializing in that. Depending on where you live, you might even find them free as they are often considered literal ewaste by the companies decommissioning them.

It also means it performs like a 10 year old server CPU, so those 28 threads are not exactly worth a lot. The geekbench results, for whatever value those are worth, are very mediocre in the context of anything remotely modern: https://browser.geekbench.com/processors/intel-xeon-e5-2690-...

Like a modern 12-thread 9600x runs absolute circles around it https://browser.geekbench.com/processors/amd-ryzen-5-9600x

mattbillenstein 5 hours ago [-]

This is the correct analysis - there's a reason you see this stuff cheap or free.

The homelab group on Reddit is full of people who don't understand any of this - they have full racks in their house that could be replaced with one high-end desktop.

kllrnohj 4 hours ago [-]

> The homelab group on Reddit is full of people who don't understand any of this - they have full racks in their house that could be replaced with one high-end desktop.

A lot of that group is making use of the IO capabilities of these systems to run lots of PCI-E devices & hard drives. There's not exactly a cost-effective modern equivalent for that. If there were cost-effective ways to do something like take a PCI-E 5.0 x2 and turn it into a PCI-E 3.0 x8 that'd be incredible, but there isn't really. So raw PCI-E lane count is significant if you want cheap networking gear or HBAs or whatever, and raw PCI-E lane count is $$$$ if you're buying new.

Also these old systems mean cheap RAM in large, large capacities. Like 128GB RAM to make ZFS or VMs purr is much cheaper to do on these used systems than anything modern.

mattbillenstein 4 hours ago [-]

Perhaps, but I don't really get the dozens of TB of storage in the home use case a lot of the time either.

Like if you have a large media library, you need to push maybe 10MB/s, you don't need 128GB of RAM to do that...

It's mostly just hardware porn - perhaps there are a few legit use cases for the old hardware, but they are exceedingly rare in my estimate.

kllrnohj 4 hours ago [-]

> Like if you have a large media library, you need to push maybe 10MB/s,

For just streaming a 4k bluray you need more than 10MB/s, Ultra HD bluray tops out at 144 Mbit/s. Not to mention if that system is being hit by something else at the same time (backup jobs, etc...).

Is the 128GB of RAM just hardware porn? Eh, maybe, probably. But if you want 8+ bays for a decent sized NAS then you're already quickly into price points at which point these used servers are significantly cheaper, and 128GB of RAM adds very little to the cost so why not.

Kubuxu 3 hours ago [-]

For 8+ bays you just need a SAS HBA card and one free PCI-E slot. Not to mention that many motherboards will have 6+ SATA ports already.

If anything, 2nd hand AMD gaming rigs make more sense than old servers. I say that as someone with always off r720xd at home due to noise and heat. It was fun when I bought it during winter years ago, until summer came.

kllrnohj 2 hours ago [-]

> For 8+ bays you just need a SAS HBA card and one free PCI-E slot. Not to mention that many motherboards will have 6+ SATA ports already.

And what case are you putting them into? What if you want it rack mounted? What about >1gig networking? What if I want a GPU in there to do whisper for home assistant?

Used gaming rigs are great. But used servers also still have loads of value, too. Compute just isn't one of them.

zer00eyz 2 hours ago [-]

Most of the workloads that people with homelabs run, could be run on a 5 year old i5.

A lot of business are paying obscene money to cloud providers when they could have a pair of racks and the staff to support it.

Unless you're paying attention to the bleeding edge of the server market, to its costs (better yet features and affordability) this sort of mistake is easy to make.

The article is by someone who does this sort of thing for fun, and views/attention, and im glad for it... it's fun to watch. But it's sad when this same sort of misunderstanding happens in professional settings, and it happens a lot.

montebicyclelo 6 hours ago [-]

Yeah... Looks like can get about $1/hr for 10 small VMs, ($0.10 per VM).

So for $3000, that's 3000 hours, or 125 days, (if just wastefully leave them on all the time, instead of turning them on when needed).

Say you wanted to play around for a couple of hours, that's like.. $3.

(That's assuming there's no bonus for joining / free tier, too.)

wongarsu 5 hours ago [-]

The VMs quickly get expensive if you leave them running though.

The desktop equivalent of your 10 T3 Micro instances is about $600 if you buy new. For example a Lenovo ThinkCentre M75q Gen 2 Tiny 11JN009QGE has 8x3.2GHz processor with hyperthreading. That's 16 virtual cores compared to the 20 vcpus of the T3 instances, but with much faster cores. And 16GB RAM allows you to match the 1GB per instance.

If you don't have anything and feel generous throw in another $200 for a good monitor and keyboard plus mouse. But you can get a used crap monitor for $20. I'd give you one for free just to be rid of it.

That's a total of $800, or 33 days of forgetting to shut down the 10 VMs. Maybe half that if you buy used.

Granted not everyone has $800 or even $400 to drop on hobby projects, renting VMs often does make sense

verdverm 6 hours ago [-]

You can rent a beefy vm with an H100 for $1.50 / hr

I regularly rent this for a few hours at a time for learning and prototyping

Y_Y 6 hours ago [-]

[flagged]

verdverm 5 hours ago [-]

I'll take the H1/200s over a vehicle any day of the week

pinkgolem 3 hours ago [-]

Are you comparing 10 VM with 1 shared core with a 144 core solution?

sam1r 4 hours ago [-]

A great way to do this is… is with a brand new Aws account, which will give you 1 year free across all services with reasonable limits.

jahsome 3 hours ago [-]

Oracle's free tier is pretty generous too.

aprdm 6 hours ago [-]

That really depends on what you want to learn and how deep. If you're automating things before the hypervisor comes online or there's an OS running (e.g: working on datacenter automation, bare metal as a service) you will have many gaps

leoc 5 hours ago [-]

If you want to run something like GNS3 network simulation on a hosting service's hardware you'll either have to deal with hiring a bare-metal server or deal with nested virtualisation on other people's VM setups. Network simulation absolutely drinks RAM, too, so just filling an old Xeon with RAM starts to look very attractive in comparison to cloud providers who treat it an expensive upsell.

motorest 2 hours ago [-]

> The cost effective way to do it is in the cloud.

This. Some cloud providers offer VMs with 4GB RAM and 2 virtual cores for less than $4/month. If your goal is to learn how to work with clusters, nothing beats firing up a dozen VMs when it suits your fancy, and shut them down when playtime is over. This is something you can pull off in a couple of minutes with something like an Ansible script.

cramcgrab 3 hours ago [-]

I don’t know, i keyed this into google Gemini and got pretty far: “ Simulate an AWS AI cluster, command line interface. For each command supply the appropriate AWS AI cluster response”

pinkgolem 4 hours ago [-]

For learning I feel much safer setting everything up locally, worst case I have to reinstall my system.

In the cloud, worst case I have a bill over 5-6 digits.

And I know my ADD, 2 is not super unlikely.

nsxwolf 6 hours ago [-]

That isn’t fun. I have a TI-99/4A in my office hooked up to a raspberry pi so it can use the internet. Why? Because it’s fun. I like to touch and see the things even though it’s all so silly.

bakugo 6 hours ago [-]

It heavily depends on the use case. For these AI setups, you're completely correct, because the people who talk about how amazing it is to run a <100B model at home almost never actually end up using it for anything real (mostly because these small models aren't actually very good) and are doing it purely for the novelty.

But if you're someone like me who intends to actively use the hardware for real-world purposes, the cloud often simply can't compete on price. At home, I have a mini PC with a 5600G, 32GB of RAM, and a few TBs of NVME storage. The entire thing cost less than $600 a few years ago, and consumes around 20W of power on average.

Even on the cheapest cloud providers available, an equivalent setup would exceed that price in less than half a year. SSD storage in particular is disproportionately expensive on the cloud. For small VMs that don't need much storage, it does make sense, but as soon as you scale up, cloud prices quickly start ballooning.

swiftcoder 3 hours ago [-]

Plus you still have access to the whole lot when your ISP goes down (maybe less of a problem than it used to be, but not unheard of)

mattbillenstein 5 hours ago [-]

LOL, no

newsclues 5 hours ago [-]

Text and reference books are free at the library.

You don’t need hardware to learn. Sure it helps but you can learn from a book and pen and paper exercises.

trenchpilgrim 5 hours ago [-]

I disagree. Most of what I've learned about systems comes from debugging the weird issues that only happen on real systems, especially real hardware. The book knowledge is like, 20-30% of it.

titanomachy 5 hours ago [-]

Agreed, I don't think I'd hire a datacenter engineer whose experience consisted of reading books and doing "pen and paper exercises".

glitchc 6 hours ago [-]

I did some calculations on this. Procuring a Mac Studio with the latest Mx Ultra processor and maxing out the memory seems to be the most cost effective way to break into 100b+ parameter model space.

teleforce 5 hours ago [-]

Not quite, as it stands now the most cost effective way is most likely framework desktop or similar system for example HP G1a laptop/PC [1],[2].

[1] The Framework Desktop is a beast:

https://news.ycombinator.com/item?id=44841262

[2] HP ZBook Ultra:

https://www.hp.com/us-en/workstations/zbook-ultra.html

GeekyBear 5 hours ago [-]

Now that we know that Apple has added tensor units to the GPU cores the M5 series of chips will be using, I might be asking myself if I couldn't wait a bit.

t1amat 3 hours ago [-]

This is the right take. You might be able to get decent (2-3x less than a GPU rig) token generation, which is adequate, but your prompt processing speeds are more like 50-100x slower. A hardware solution is needed to make long context actually usable on a Mac.

randomgermanguy 6 hours ago [-]

Depends on how heavy one wants to go with the quants (for Q6-Q4 the AMD Ryzen AI MAX chips seem better/cheaper way to get started).

Also the Mac Studio is a bit hampered by its low compute-power, meaning you really can't use a 100b+ dense model, only MoE feasibly without getting multi minute prompt-processing times (assuming 500+ tokens etc.)

GeekyBear 5 hours ago [-]

Given the RAM limitations of the first gen Ryzen AI MAX, you have no choice but to go heavy on the quantization of the larger LLMs on that hardware.

mercutio2 5 hours ago [-]

Huh? My maxed out Mac Studio gets 60-100 tokens per second on 120B models, with latency on the order of 2 seconds.

It was expensive, but slow it is not for small queries.

Now, if I want to bump the context window to something huge, it does take 10-20 seconds to respond for agent tasks, but it’s only 2-3x slower than paid cloud models, in my experience.

Still a little annoying, and the models aren’t as good, but the gap isn’t nearly as big as you imply, at least for me.

zargon 4 hours ago [-]

GPT OSS 120B only has 5B active parameters. GP specifically said dense models, not MoE.

3 hours ago [-]

EnPissant 3 hours ago [-]

I think the Mac Studio is a poor fit for gpt-oss-120b.

On my 96 GB DDR5-6000 + RTX 5090 box, I see ~20s prefill latency for a 65k prompt and ~40 tok/s decode, even with most experts on the CPU.

A Mac Studio will decode faster than that, but prefill will be 10s of times slower due to much lower raw compute vs a high-end GPU. For long prompts that can make it effectively unusable. That’s what the parent was getting at. You will hit this long before 65k context.

If you have time, could you share numbers for something like:

llama-bench -m <path-to-gpt-oss-120b.gguf> -ngl 999 -fa 1 --mmap 0 -p 65536 -b 4096 -ub 4096

Edit: The only Mac Studio pp65536 datapoint I’ve found is this Reddit thread:

https://old.reddit.com/r/LocalLLaMA/comments/1jq13ik/mac_stu ...

They report ~43.2 minutes prefill latency for a 65k prompt on a 2-bit DeepSeek quant. Gpt-oss-120b should be faster than that, but still very slow.

the8472 6 hours ago [-]

You could try getting a DGX Thor devkit with 128GB unified memory. Cheaper than the 96GB mac studio and more FLOPs.

glitchc 1 hours ago [-]

Yeah but slower memory compared to the M3 Ultra. There's a big difference in memory bandwidth, which seems to be a driving factor for inferencing. Training on the other hand, it's probably a lot faster.

llm_nerd 5 hours ago [-]

The next generation M5 should bring the matmul functionality seen on the A19 Pro to the desktop SoC's GPU -- "tensor" cores, in essence -- and will dramatically improve the running of most AI models on those machine.

Right now the Macs are viable purely because you can get massive amounts of unified memory. Be pretty great when they have the massive matrix FMA performance to complement it.

Palomides 6 hours ago [-]

even a single new mac mini will beat this cluster on any metric, including cost

eesmith 6 hours ago [-]

Geerling links to last month's essay on a Frameboard cluster, at https://www.jeffgeerling.com/blog/2025/i-clustered-four-fram... . In it he writes 'An M3 Ultra Mac Studio with 512 gigs of RAM will set you back just under $10,000, and it's way faster, at 16 tokens per second.' for 671B parameters, that is, that M3 is at least 3x the performance of the other three systems.

encom 3 hours ago [-]

>Mac

>cost effective

lmao

vlovich123 6 hours ago [-]

I’d say it’s inconclusive. For traditional compute it wins on power and cost (it’ll always lose on space). The inference is noted to not be able to use the GPU due to llama.cpp’s vulkan backend AND that clustering software in llama.cpp is bad. I’d say it’s probably still going to be worse for AI but it’s inconclusive where it could be due to the software immaturity (ie not worth it today but could be with better software)

tracker1 4 hours ago [-]

But will there be a CM6 while you're waiting for the software to improve?

randomNumber7 3 hours ago [-]

What I think is strange with stuff like this that you should be able to come to that conclusion without technical knowledge. Just the fact that everyone runs AI on GPUs and NVIDIAs stock skyrocketed since the AI boom should tell you s.th..

Did OP really think his fellow humans are that moronic that they just didn't find out you can plug in together a cuple of rasperri pis?

rustyminnow 2 hours ago [-]

Nobody thought an RPI cluster would ever be competitive, and Geerling never expected anybody would. But it's fun to play "what if" and then make the thing just to see how it stacks up and that's his job. Any implication or suggestion of this being a good idea is just part of the story telling.

Asraelite 3 hours ago [-]

> I don’t know if anyone building a Pi cluster actually goes into it thinking it’s going to be a cost effective endeavor, do they?

Some Raspberry Pi products are sold at a loss, so I could see how it's in the realm of possibility.

llm_nerd 6 hours ago [-]

If you assume that the author did this to have content for his blog and his YouTube channel, it makes much more sense. Going back to the well with a "I regret" entry allows for extra exploiting of a pretty dubious venture.

YouTube is absolute jam packed full of people pitching home "lab" sort of AI buildouts that are just catastrophically ill-advised, but it yields content that seems to be a big draw. For instance Alex Ziskind's content. I worry that people are actually dumping thousands to have poor performing ultra-quantized local AIs that will have zero comparative value.

philipwhiuk 6 hours ago [-]

I doubt anyone does this seriously.

nerdsniper 6 hours ago [-]

I sure hope no one does this seriously expecting to save some money. I enjoy the videos on "catastrophically ill-advised" build-outs. My primary curiosities that get satisfied by them are:

1) How much worse / more expensive are they than a conventional solution?

2) What kinds of weird esoteric issues pop up and how do they get solved (e.g. the resizable BAR issue for GPU's attached to RPi's PCIe slot)

TZubiri 5 hours ago [-]

Fun fact, a raspberry pi does not have a built in Real Time Clock with its own battery, so it relies on network clocks to keep the time.

Another fun fact, the network module of the pi is actually connected to the USB bus, so there's some overhead as well as a throughput limitation.

Fun fact, the Pi does not have a power button, relying on software to shut down cleanly. If you lose access to the machine, it's not possible to avoid corrupted states on the disk.

Despite all of this, if you want to self host some website, the raspberry pi is still an amazingly cost effective choice, from anywhere between 2 to 20000 monthly users, one pi will be overprovisioned. And you can even get an absolutely overkill redundant pi as a failover, but still a single pi can reach 365 days of uptime with no problem, and as long as you don't reboot or lose power or lose internet, you can achieve more than a couple of nines of reliability.

But if you are thinking of a third, much less a 10th raspberry pi, you are probably scaling the wrong way, way before you reach the point where a quantity matters ( a third machine), it becomes cost effective to upgrade the quality of your one or two machines.

On the embedded side it's the same story, these are great for prototyping, but you are not going to order 10k and sell them in production, maybe a small 100 test batch? But you will optimize and make your own PCB before a mass batch.

alias_neo 5 hours ago [-]

> the raspberry pi is still an amazingly cost effective choice

It's really not though. I've been a Pi user and fan since it was first announced, and I have dozens of them, so I'm not hating on RPi here; we did the maths some time back here on HN when something else Pi related came up.

If you go for a Pi5 with say 8GB RAM, by the time you factor in an SSD + HAT + PSU + Case + Cooler (+ maybe a uSD), you're actually already in mini-PC price territory and you can get something much more capable and feature complete for about the same price, or for a few £ more, something significantly more capable, better CPU, iGPU, you'll get an RTC, proper networking, faster storage, more RAM, better cooling, etc, etc, and you won't be using much more electricity either.

I went this route myself and have figuratively and literally shelved a bunch of Pis by replacing them with a MiniPC.

My conclusion, for my own use, after a decade of RPi use, is that a cheap mini PC is the better option these days for hosting/services/server duty and Pis are better for making/tinkering/GPIO related stuff, even size isn't a winner for the Pi any more with the size of some of the mini-PCs on the market.

barnas2 2 hours ago [-]

> SSD + HAT + PSU + Case + Cooler (+ maybe a uSD)

The only 100% required thing on there is some sort of power supply, and an SD card, and I suspect a lot of people have a spare USB-C cable and brick lying around. A cooler is only recommended if you're going to be putting it under sustained CPU load, and they're like $10 on Amazon.

sjsdaiuasgdia 1 hours ago [-]

> a spare USB-C cable and brick lying around

Particularly with Pi 5, any old brick that might be hanging around has a fair chance at not being able to supply sufficient power.

mrguyorama 2 hours ago [-]

>SSD + HAT + PSU + Case + Cooler

Zero of any of that is needed. The new Pi "works best" with a cooler sure but at standard room temps will be fine for serving web apps and custom projects and things. You do not need an SSD. You do not need a HAT for anything.

Apparently the Pi 5 8gb is $120 though WTF.

What personal web site or web app or project can't run just fine on a Pi Zero 2 though? It's a little RAM starved but performance wise it should be sufficient.

Other than second-hand mini PCs, old laptops also make great home servers. They have built in UPS!

TZubiri 4 hours ago [-]

What do you mean by Cooler? Raspberry pi doesn't need a fan.

Also the other peripherals you consider are irrelevant, since you would need them (or not), in other setups. You can use a pi without a PSU for example. And if you use an SSD, you have to consider that cost in whatever you compare it to.

>I went this route myself and have figuratively and literally shelved a bunch of Pis

>and I have dozens of them,

Reread my post? I meant specifically that Pis are great for the 1 to 2 range. with 3 pis you should change to something else. So I'm saying they are good at the 100$-200$ budget, but bad anywhere above that.

J_McQuade 3 hours ago [-]

> What do you mean by Cooler? Raspberry pi doesn't need a fan.

From the official website:

> Does Raspberry Pi 5 need active cooling?

> Raspberry Pi 5 is faster and more powerful than prior-generation Raspberry Pis, and like most general-purpose computers, it will perform best with active cooling.

TZubiri 2 hours ago [-]

Oh. I haven't used 5, i did 3 and 4

Sohcahtoa82 2 hours ago [-]

> What do you mean by Cooler? Raspberry pi doesn't need a fan.

Starting with the Pi 4, they started saying that a cooler isn't required, but that it may thermal throttle without one if you keep the CPU pegged.

geerlingguy 4 hours ago [-]

The Pi 5 / CM5 / Pi 500 series does have a built-in RTC now, though most models require you to buy a separate RTC battery to plug into the RTC battery jack.

stuxnet79 5 hours ago [-]

> Fun fact, a raspberry pi does not have a built in Real Time Clock with its own battery, so it relies on network clocks to keep the time.

> Another fun fact, the network module of the pi is actually connected to the USB bus, so there's some overhead as well as a throughput limitation.

> Fun fact, the Pi does not have a power button, relying on software to shut down cleanly. If you lose access to the machine, it's not possible to avoid corrupted states on the disk.

With all these caveats in mind, a raspberry pi seems to be an incredibly poor choice for distributed computing

CamperBob2 4 hours ago [-]

With all these caveats in mind, a raspberry pi seems to be an incredibly poor choice for distributed computing

Exactly. This build sounds like the proverbial "1024 chickens" in Seymour Cray's famous analogy. If nothing else, the communications overhead will eat you alive.

moduspol 6 hours ago [-]

Also cost effective is to buy used rack mount servers from Amazon. They may be out of warranty but you get a lot more horsepower for your buck, and now your VMs don’t have to be small.

Aurornis 6 hours ago [-]

Putting a retired datacenter rack mount server in your house is a great way to learn how unbearably loud a real rack mount datacenter server is.

Tsiklon 5 hours ago [-]

To quote @swiftonsecurity - https://x.com/swiftonsecurity/status/1650223598903382016 ;

> DO NOT TAKE HOME THE FREE 1U SERVER YOU DO NOT WANT THAT ANYWHERE A CLOSET DOOR WILL NOT STOP ITS BANSHEE WAIL TO THE DARK LORD AN UNHOLY CONDUIT TO THE DEPTHS OF INSOMNIA BINDING DARKNESS TO EVEN THE DAY

buildbot 4 hours ago [-]

This 1000%; and some 1us are extra 666. I had a sparc t2000 at one point, it was so much louder than a 1u Supermicro. Or whatever was in Microsoft HW labs, those you could hear from multiple hallways over… There were non optional earplugs at the doors.

moduspol 3 hours ago [-]

True! They aren't quiet. I keep mine in a well-ventilated room that doesn't typically have people in it.

J_Shelby_J 3 hours ago [-]

Buy a 3/4u case for $100 and put whatever board you want in it with standard PC fans and a decent cpu cooler. Dead silent.

tempest_ 6 hours ago [-]

ahah and pricey power wise.

Currently the cloud providers are dumping second gen xeon scalables and those things are pigs when it comes to power use.

Sound wise its like someone running a hair dryer at full speed all the time and it can be louder under load.

_boffin_ 4 hours ago [-]

Not true. Have one running in the closet and never hear it.

ComputerGuru 4 hours ago [-]

Only if it’s a 1U. 2U units idle at silent.

Y_Y 6 hours ago [-]

If you're following this path, make sure to use the finest traditional server rack that money can buy: https://www.ikea.com/ie/en/p/lack-side-table-white-30449908/

allanrbo 6 hours ago [-]

No, again, just run VMs on your desktop/laptop. The software doesn't know or care if it's a rack mounted machine.

3 hours ago [-]

wccrawford 5 hours ago [-]

Geerling's titles have been increasingly click-bait for a while now. It's pretty sad, because I like his content, but hate the click-bait BS.

jonathanlydall 3 hours ago [-]

If it makes an appreciable difference to how much money he makes on YouTube then I can’t begrudge him for doing it.

Don’t hate the player, hate the game.

mrguyorama 2 hours ago [-]

Blame Youtube. They are the ones that run a purposely zero sum and adversarial system for directing attention at your videos. If he doesn't have a high enough click rate on his videos, Youtube will literally stop showing them to people, even subscribers.

Youtube demonstrably wants clickbait titles and thumbnails. They built tooling to automatically A/B test titles and thumbnails for you.

Youtube could fix this and stop it if they want, but that might lose them 1% of business so they never will.

They love that you blame creators for this market dynamic instead of the people who literally create the market dynamic.

kolbe 5 hours ago [-]

The author, Jeff Geerling, is a very intelligent person. He has more experience with using niche hardware than almost anyone on earth. If he does something, there's usually a good a priori rationale for it.

buildbot 5 hours ago [-]

Jeff is a good person/blogger and does interesting projects but more experience with niche hardware than literally anyone is a stretch.

Like what about the people who maintain the alpha/sparc/parisc linux kernels? Or the designers behind idk tilera or tenstorrent hardware.

geerlingguy 4 hours ago [-]

I was just at VCF Midwest this past weekend, and I can assure you I am on some of the lower echelons of people who know about niche hardware.

I do get to see and play with a lot of interesting systems, but for most of them, I only get to go just under surface-level. It's a lot different seeing someone who's reverse engineered every aspect of an IBM PC110, or someone who's restored an entire old mainframe that was in storage for years... or the group of people who built an entire functional telephone exchange with equipment spread over 50 years (including a cell network, a billing system, etc.).

phatfish 4 hours ago [-]

Youtubers have armies of sycophants (check their video comments if you dare). Not saying they even court them, something to do with video building a stronger parasocial relationship than a text blog I think.

kolbe 4 hours ago [-]

> more experience with niche hardware than literally anyone is a stretch.

This is why I said "almost anyone." If I changed your words, I could disagree with you as well.

AceJohnny2 3 hours ago [-]

> If he does something, there's usually a good a priori rationale for it.

I greatly respect Jeff's work, but he's a professional YouTuber, so his projects will necessarily lean towards clickbait and riding trends (Jeff, I don't mean this as criticism!) He's been a great advocate for doing interesting things with RasPis, but "interesting" != "rational"

amelius 4 hours ago [-]

Is a Pi still considered "niche" hardware?

ww520 5 hours ago [-]

Now. Imagine a Beawulf of these...

cosarara 6 hours ago [-]

> Compared to the $8,000 Framework Cluster I benchmarked last month, this cluster is about 4 times faster:

Slower. 4 times slower.

teleforce 5 hours ago [-]

That's definitely a typo because I've to read the sentence 3 times from the article still cannot make a sense until I saw the figure.

TL;DR, just buy one framework desktop and it's better than the Pi AI cluster of the OP in every single performance metrics including cost, performance, efficiency, headache, etc.

geerlingguy 5 hours ago [-]

Oops, fixed the typo! Thanks.

And regarding efficiency, in CPU-bound tasks, the Pi cluster is slightly more efficient. (Even A76 cores on a 16nm node still do well there, depending on the code being run).

markx2 6 hours ago [-]

> "But if you're on the blog, you're probably not the type to sit through a video anyway. So moving on..."

Thank you!

zeristor 50 minutes ago [-]

If it worked out well he would be trumpeting the glory.

Not so good, and this is the sort of title. you need to bring the punters in for YouTube.

I don't mean to sound too cynical, I appreciate Jeff's videos, just wanted to point out that if you've spent money and time on content you can either ditch it or make a regret video.

Just so long as the thumbnails don't have an arrow on them I'm happy.

dragontamer 1 hours ago [-]

Why build a cluster?

I believe the Rasp Pi cluster is one of the cheapest multi node / MPI machines you can buy. That's useful even if it is t fast. You need to practice the programming interfaces, not necessarily make a fast computer.

However, NUMA is also a big deal. The various AMD Threadrippers with multi-die memory controllers are better on this regards. Maybe the aging Threadrippers 1950x, yes it's much slower than modern chips but the NUMA issues are exaggerated (especially poor) on this old architecture.

That exaggerates the effects of good NUMA and now you as a programmer can get more NUMA skills.

Of course, the best plan is to spend $20,000,000++ on your own custom NUMA nodes cluster out of EPYCs or something.

-------

But no. The best supercomputers are your local supercomputers that you should rent some time from. You need a local box to see various issues and learn to practice programming.

lumost 6 hours ago [-]

I don’t really get why anyone would be buying ai compute unless A) to your goal is to rent out the compute B) no vendor can rent you enough compute when you need it C) you have an exotic funding arrangement that makes compute capex cheap and opex expensive.

Unless you can keep your compute at 70% average utilization for 5 years - you will never save money purchasing your hardware compared to renting it.

horsawlarway 6 hours ago [-]

There are an absolutely stunning number of ways to lose a whole bunch of money very quickly if you're not careful renting compute.

$3,000 is well under many "oopsie billsies" from cloud providers.

And that's outside of the whole "I own it" side of the conversation, where things like latency, control, flexibility, & privacy are all compelling reasons to be willing to spend slightly more.

I still run quite a number of LLM services locally on hardware I bought mid-covid (right around 3k for a dual RTX3090 + 124gb system ram machine).

It's not that much more than you'd spend if you're building a gaming machine anyways, and the nifty thing about hardware I own is that it usually doesn't stop working at the 5 year mark. I have desktops from pre-2008 still running in my basement. 5 year amortization might have the cloud win, but the cloud stops winning long before most hardware dies. Just be careful about watts.

Personally - I don't think pi clusters really make much sense. I love them individually for certain things, and with a management plane like k8s, they're useful little devices to have around. But I definitely wouldn't plan to get good performance from 10 of them in a box. Much better off spending roughly the same money for a single large machine unless you're intentionally trying to learn.

0xbadcafebee 4 hours ago [-]

You could also spill a can of Mountain Dew over the $8,000 AI rig next to you. Oopsies can happen anywhere...

If it's for personal use, do whatever... there's nothing wrong with buying a $60,000 sports car if you get a lot of enjoyment out of driving it. (you could also lease if you want to trade up to the "faster model" next year) For business, renting (and managed hosting) makes more sense.

lumost 4 hours ago [-]

At the local/hobby scale, it’s very much a “do whatever” area. But I can rent a 4090 for a little under a dollar an hour, and I can rent a b200 for $6, it’s very hard to claim I’ll use 10k+ hours of gpu time on a b2000 I buy for myself.

horsawlarway 3 hours ago [-]

So 83 days for payback at the 2k sticker price for the 4090? Sounds like a good time to buy a 4090...

Like, if you buy that card it can still be processing things for you a decade from now.

Or you can get 3 months of rental time.

---

And yes, there is definitely a point where renting makes more sense because the capital outlay becomes prohibitive, and you're not reasonably capable of consuming the full output of the hardware.

But the cloud is a huge cash cow for a reason... You're paying exorbitant prices to rent compared to the cost of ownership.

HenryMulligan 6 hours ago [-]

Data privacy and security don't matter? My secondhand RTX 3060 would buy a lot of cloud credits, but I don't want tons of highly personal data sent to the cloud. I can't imagine how it would be for healthcare and finance, at least if they properly shepherded their data.

tern 6 hours ago [-]

For most people, no, privacy does not matter in this sense, and "security" would only be a relevant term if there was a pre-existing adversarial situation

a2128 6 hours ago [-]

Why do people buy gaming PCs when it's much cheaper to use streaming platforms? I think the two cases share practically the same parallels in terms of reliability, availability, restrictions, flexibility, sovereignty, privacy, etc.

But also when it comes to Vast/RunPod it can be annoying and genuinely become more expensive if you have to rent 2x the number of hours because you constantly have to upload and download data, checkpoints, continuous storage costs, transfer data to another server because the GPU is no longer available, etc. It's just less of a headache if you have an always available GPU with a hard drive plugged into the machine and that's it

ripdog 3 hours ago [-]

Because latency matters when gaming in a way which doesn't matter with AI inference?

Plus cloud gaming is always limited in range of games, there are restrictions on how you can use the PC (like no modding and no swapping savegames in or out).

cellis 4 hours ago [-]

And lest we forget! Forgetting to turn it off!

causal 6 hours ago [-]

1) Data proximity (if you have a lot of data, egress fees add up)

2) Hardware optimization (the exact GPU you want may not always be available for some providers)

3) Not subject to price changes

4) Not subject to sudden Terms of Use changes

5) Know exactly who is responsible if something isn't working.

6) Sense of pride and accomplishment + Heating in the winter

justinrubek 6 hours ago [-]

At some point, the work has to actually be done rather than shuffling the details off to someone else.

2OEH8eoCRo0 6 hours ago [-]

I don't get why anyone would hack on and have fun with unique hardware either /s

seanw444 6 hours ago [-]

It's also not always just about fun or cost effectiveness. Taking the infrastructure into your own hands is a nice way to know that you're not being taken advantage of, and you only have yourself to rely on to make the thing work. Freedom and self-reliance, in short.

MomsAVoxell 2 hours ago [-]

I have a SOPINE cluster board that I'd quite like to get booted up and pressed into some sort of useful service.

I think the biggest problem with cluster products is that they just don't work out of the box. Vendors haven't really done the "last 2%" of development required to make them viable - its left to us purchasers to get the final bits in place.

Still, it'll make a fun distributed computing experimental platform some day.

Just like the Inmos Transputer I've got somewhere, sitting in a box, waiting for a power supply ..

xnx 6 hours ago [-]

Fun project. Was the author hoping for cost effective performance?!

I assumed this was a novelty, like building a RAID array out of floppy drives.

LTL_FTC 6 hours ago [-]

The author is a YouTuber and projects like these pay for themselves with the views they garner. Even the title is designed for engagement.

leptons 6 hours ago [-]

A lot of people don't understand the performance limits of the Raspberry Pi. It's a great little platform for some things, but it isn't really fit for half the use cases I've seen.

Our_Benefactors 6 hours ago [-]

This was my impression as well, the bit about GPU incompatibility with llama.cpp made me think he was in over his head.

noelwelsh 6 hours ago [-]

I'd love to understand the economics here. $3000 purely for fun seems like a lot. $3000 for promotion of a channel? consulting? seems reasonable.

philipwhiuk 5 hours ago [-]

Jeff has a million YouTube subscribers, gets $2000 a month from Patreon and has 200 GitHub sponsors.

The economics of spending $3,000 on a video probably work out fine.

geerlingguy 5 hours ago [-]

It's definitely a stretch for my per-video budget, but I did want to have a 'maxed out' Pi cluster for future testing as well.

A lot of people (here, Reddit, elsewhere) speculate about how good/bad a certain platform or idea is. Since I have the means to actually test how good or bad something is, I try to justify the hardware costs for it.

Similar to testing various graphics cards on Pis, I've probably spent a good $10,000 on those projects over the past few years, but now I have a version of every major GPU from the past 3 generations to test on, not only on Pi, but other Arm platforms like Ampere and Snapdragon.

Which is fun, but also educational; I've learned a lot about inference, GPU memory access, cache coherency, the PCIe bus...

So a lot of intangibles, many of which never make it directly into a blog post or video. (Similar story with my time experiments).

djhworld 2 hours ago [-]

I watched the video and enjoyed it, I think the most interesting part to me was running the distributed Llama.cpp, Jeff mentioned it seems to work in a linear fashion where processing would hop between nodes.

Which got me thinking about how do these frontier AI models work when you (as a user) run a query. Does your query just go to one big box with lots of GPUs attached and it runs in a similar way, but much faster? Do these AI companies write about how their infra works?

numpad0 4 hours ago [-]

> goes round-robin style asking each node to perform its prompt processing, then token generation.

Yeah, this is a now-long-wide-known issue with LLM processing. This can be remediated so that all nodes split computation, but then you'll come back to classical supercomputing problem of node interconnect latency/bandwidth bottlenecks.

It looks to me that many such interconnect simulate Ethernet cards. I wonder if it can be recreated using the M.2 slot rather than using that slot for node-local data, and cost effectively so(like cheaper than bunch of 10GE cards and short DACs).

bityard 4 hours ago [-]

I can think of SOME decent--if suboptimal--reasons to build an RPi cluster, but it never would have occurred to me to try running an LLM on one. Even if you could cluster hundreds or thousands of Pis, it's still just entirely the wrong architecture for running an AI model as they are currently built.

leakycap 2 hours ago [-]

I dipped my toe into cluster computing when rendering a MAXON Cinema 4D project across a lab of ~30 dual G5 Power Macs.

Quickly learned that there is so much more to manage when you split a task up across systems, even when the system (like Cinema 4D) is designed for it.

wcchandler 4 hours ago [-]

I was just exploring Pi’s and AI hats, so this post is appreciatively timely.

I’m finally at the point where I can dedicate time for building an AI with a specific use case in mind. I play competitive paintball and would like to utilize AI for a handful of things. Specifically hit detections in video streams. Pi’s were my natural choice simply because of low cost of entry and wide range of supported products to get a PoV running. I even thought about reaching out to Jeff and asking his input.

This post didn’t change my direction too much, but it did help level set some realistic expectations. So thanks for sharing.

asimpleusecase 4 hours ago [-]

Great write up! This is nice nerd catnip. And sharing a failed project teaches so much. Please, more should share the projects that absorbed tons of time yet never really delivered.

5 hours ago [-]

AlfredBarnes 6 hours ago [-]

My pi's are just an easy onramp for me to have a functional NAS, PIHole, and webcam security.

Not at all the best, but they were cheap. If i WANTED the best or reliable, i'd actually buy real products.

aprdm 6 hours ago [-]

Love Jeff's ansible roles/playbooks and his cluster building ! Quite interesting, I should reserve some time to play with a Pi cluster and ansible, sounds fun

hamonrye 5 hours ago [-]

Strange quirks for language syntax. I read about a parsing system that was distilled towards being-towards this particular species of silver-tail that existed below Golden Gate bridge. Godwin's Law?

bravetraveler 6 hours ago [-]

Wow that's a lot of scratch for... scratch. Pays for itself, I'm sure: effective bait :)

'Worth it any more'? At this size, never. A Pi is a Pi is a Pi!

A few are fine for toying around; beyond that, hah. Price:perf is rough, does not improve with multiplication [of units, cost, or complexity].

deater 6 hours ago [-]

as someone who has built various raspberry pi clusters over the years (I even got an academic paper out of one) the big shame is that as far as I know it's still virtually impossible to use the fairly powerful GPUs they have for GPGPU work

elzbardico 6 hours ago [-]

Frankly, always thought about Pi Clusters as a nerd indulgence, something to play, not to do serious work.

NitpickLawyer 6 hours ago [-]

It reminds me of the Beowulf clusters of the 90s-2000s, that were all the rage at some point, then slowly lost ground... I remember many friends tinkering with some variant of those, we had one in Uni, and there were even some linux distros dedicated to the concept.

gary_0 6 hours ago [-]

Oh yeah, the "imagine a Beowulf cluster of these" Slashdot meme! I miss those days. At least the "can it run Doom?" meme is still alive and kicking.

unregistereddev 5 hours ago [-]

Ditto! It reminded me of the time in college when I built a Beowulf cluster from recently-retired Pentium II desktops.

Was it fast? No. But that wasn't the point. I was learning about distributed computing.

dekhn 3 hours ago [-]

Beowulf style clusters went on to dominate supercomputing.

randomgermanguy 6 hours ago [-]

I think the only exception is specifically for studying network/communciation-topologies. I've seen a couple clusters (ca. 10-50 Pi's) in universities for both research and teaching.

ofrzeta 4 hours ago [-]

There are so many network emulators you can use, such as Mininet or GNS3.

JambalayaJimbo 3 hours ago [-]

I'm sure pedagogically speaking it's better to use physical devices

devmor 6 hours ago [-]

After a few years of experience with them I agree for the most part. They are great for individual projects and even as individual servers for certain loads, but once you start clustering them you will probably get better results from a purpose built computer in the same price range as multiple pis.

drillsteps5 6 hours ago [-]

If he was building compute device for LLM inference specifically it would help to check in advance what that would entail. Like GPU requirement. Which putting bunch of RPis in the cluster doesn't help one bit.

Maybe I'm missing something.

hn_throw_250915 6 hours ago [-]

I read through it and it’s amusing but along with the title being something I’d receive in email from a newsletter mailing list I’ve never subscribed to (hoping it has an unsubscribe link at the bottom), there’s nothing really of hacker curiosity here to keep me hooked. It’s shallow and appeals to some LCD “I did the thing with the stuff and the results will shock you because of how obvious they are now click here” mentality. Vainposting at its most average. The Mac restoration video was somewhat easier to sit through if only because the picture quality beats out a handful of other YT videos doing the exactly same thing as I’m holding back a jaw grating wince of watching someone butchering a board with poor knowledge of soldering iron practice, so YMMV? Back to hackaday for me I think. I’m not here to read submarine resumes of people applying to work at Linus Tech Tips.

paxys 6 hours ago [-]

This is just the evolution of clickbait titles. The only thing missing is a thumbnail of an AI generated raspberry pi cluster with a massive arrow pointing to it and the words "not worth it!!"

Joker_vD 6 hours ago [-]

There also needs to be a Face Screaming in Fear Emoji plastered on the other side of it.

Havoc 5 hours ago [-]

Have a bunch of Pis too, but realized I can use them to create a high availability control plane for a k8s cluster. Pi4s are entirely adequate for that

owaislone 5 hours ago [-]

What is the ideal or realistic setup for a home lab to power big enough models running locally and service ~2-3 users at a time?

yalogin 4 hours ago [-]

Can the LLMs be deployed split across multiple Pis? I thought it was not possible, may be I am not caught up.

zamadatix 5 hours ago [-]

The article focuses on compute performance but I wonder if that was ever the bottleneck considering the memory bandwidth involved.

bearjaws 5 hours ago [-]

Am I the only one who looks at both the Pi Cluster and the Framework PC and wonders how they are both slower and less cost effective than a MacBook Pro M4 Max? 88 token/s on a 2.3b model is not exactly great, most likely you will want a 32 or 70b model.

amelius 5 hours ago [-]

Ok, what are the back-of-the-envelope computations that he should have done before starting to build this?

geerlingguy 5 hours ago [-]

Pi memory bandwidth is less than 10 GB/sec, so AI use cases will be extremely limited. Network I/O is maximum of 1 Gbps (or more if you do some unholy thing with M.2 NICs), so that also limits maximum networked performance.

But still can be decent for HPC learning, CI testing, or isolated multi-node smaller-app performance.

teaearlgraycold 3 hours ago [-]

Having seen someone else build and tinker with a 7 node Pi cluster it seems like an absolute waste of time. 1Gb/s networking, PCIe 3.0 x1, slow RAM, slow CPU. And with all the hats and accessories needed it’s not that good of a deal.

Getting some NUC-like machines makes a lot more sense to me. You’ll get 2.5Gb/s Ethernet at the least and way more FlOPS as well.

Drblessing 5 hours ago [-]

The bee-link AI max+ is the best value AI pc right now.

stirfish 5 hours ago [-]

I came here to ask about these. You like yours?

nodesocket 2 hours ago [-]

I feel like AI is not the right workload for this. I run a 4x node Kubernetes cluster (1x control and 3x workers) and it’s surprising good. While performance is certainly not state of the art by any stretch of the imagination it works great for HomeLab’in. Portainer, Glance, OpenSpeed Test, Semaphore UI, Uptime Kuma, and Home Assistant. Many small and lightweight containers is where a cluster such as this can shine.

system2 3 hours ago [-]

Look at it this way, you had fun playing with an expensive toy. A single $1000 GPU could 10x that, even when you were building those (not today's GPUs). You probably already knew it. But I gotta admit, the rig looks very nice.

iLoveOncall 4 hours ago [-]

> With 160 GB of total RAM, shared by the CPU and iGPU, this could be a small, efficient AI Cluster, right? Well, you'd think.

I don't know anyone who would think this actually.

geerlingguy 3 hours ago [-]

Most people who haven't actually self-hosted AI models would think this—for some reason people still think RAM is RAM, and they don't think about specs like memory speed, whether the full amount of RAM is shared with the GPU, etc.

You'd be surprised by the number of emails, Instagram DMs, YouTube comments, etc. I get—even after explicitly showing how bad a system is at a certain task—asking if a Pi would be good for X, or if they could run ChatGPT on their laptop...

imtringued 6 hours ago [-]

Oh come on Jeff, you forgot to buy GPUs for your AI cluster. Such a beginner mistake.

All you needed to do is buy 4x xtx 7900 used on ebay and build a four node raspberry pi cluster using the external GPU setup you've come up with in one of your previous blog posts [0].

[0] https://www.jeffgeerling.com/blog/2024/use-external-gpu-on-r...

geerlingguy 5 hours ago [-]

More on that soon... ;)

dbg31415 6 hours ago [-]

Was it cost effective? Meh.

Was it a learning experience?

More importantly, did you have some fun? Just a little? (=

phoronixrly 6 hours ago [-]

> Was it a learning experience?

Also no. The guy's a youtuber

On the other hand, will this make him 100+k views? Yes. It's bait - the perfect combo to attract both the AI crowd and the 'homelab' enthusiasts (of which the bulk are yet to find any use for their raspberry devices)...

aprdm 6 hours ago [-]

He is not a YouTuber. And even if he was - what's the problem ?

Jeff has many useful OSS software used by many companies around the world daily - including mine. What have you created ?

op00to 6 hours ago [-]

"He is not a YouTuber"... what?

https://www.youtube.com/c/JeffGeerling

"978K subscribers 527 videos"

Jeff's had a pattern of embellishing controversies, misrepresenting what people say, and using his platform to create narratives that benefit his content's engagement. This is yet another example of farming outrage to get clicks. I don't understand why people drool over his content so much.

aprdm 5 hours ago [-]

I guess I met Jeff's work on Ansible for DevOps: Server and configuration management for humans which is roughly 10 years old.

I then used many of his ansible playbooks on my day to day job, which paid my bills and made my career progress.

I don't check youtube so I didn't know that he was an "youtuber", I do know his other side and how mucH I have leveraged his content/code in my career

vel0city 6 hours ago [-]

He may also be a good OSS contributor and writer, but he is also a Youtuber. Over 500 videos posted, 175M views, nearly a million subscribers.

Not that its a problem, I don't see why it would inherently be a negative thing. Dude seems to make some good content across a lot of different mediums. Cheers to Jeff.

6 hours ago [-]

IAmBroom 6 hours ago [-]

He absolute is a YouTuber.

https://www.jeffgeerling.com/projects

And the inference is that he is doing this for clicks, i.e. clickbait. The very title is disingenuous.

Your attack on the poster above you is childish.

phoronixrly 6 hours ago [-]

> What have you created ?

Nothing that is not AGPL-licensed, so you and your company haven't taken advantage of it.

I am not sure how this relates to my comment though.

aprdm 6 hours ago [-]

[flagged]

wizzwizz4 5 hours ago [-]

GP is not Jeff Geerling.

m3kw9 4 hours ago [-]

when i saw your tok/s on a shtty model, i said yah.

hendersoon 6 hours ago [-]

I mean, obviously it isn't practical, he got a couple of videos out of it.

WhereIsTheTruth 3 hours ago [-]

brand fanboyism, and vendor locking, the bane of our society

the common denominator is always capital gain

capitalism is the reason why we haven't been able to go back to the moon and build bases there

pmarreck 3 hours ago [-]

and yet spaceX is for-profit and is pushing the boundaries again, after NASA (a not-for-profit government-created monopoly) stagnated.

blanket-blaming capitalism without good reasoning is becoming the new red-flag of "can't think critically"

WhereIsTheTruth 3 hours ago [-]

NASA didn't stagnate, NASA landed humans on the Moon with 1960s technology

private space companies, despite decades of hype and funding, have stagnated by comparison

the fact that SpaceX depends heavily on government contracts just to function is yet another proof: their "innovation" isn't self sustaining, it's underwritten by taxpayer money

are you denying that NASA landed on the Moon?

Elon psyop doesn't work on me, i know who is behind it all, they need a charismatic sales man for the masses, just like Ford, Disney, Reagan and all, masking structural power with a digestible story for the masses

> blanket-blaming capitalism without good reasoning is becoming the new red-flag of "can't think critically"

it's quite the opposite, people unable to take criticism of capitalism, talk about "critical thinking", how is China doing?

curtisszmania 4 hours ago [-]

[dead]

kirito1337 6 hours ago [-]

[dead]

pluto_modadic 5 hours ago [-]

[flagged]

IAmBroom 4 hours ago [-]

I'd support this idea if you provided some measure of evidence. As it is, pure ad hominem.

titaniumtown 5 hours ago [-]

Can you elaborate? I don't know what you are referring to.

stanac 5 hours ago [-]

Quick google search, Jeff has a blog post about abortion, I haven't read it, but it starts with a warning:

> This post is more than 10 years old, I do not delete posts...

https://www.jeffgeerling.com/articles/religion/abortion-case...

I guess GP is referring to that post.

traverseda 5 hours ago [-]

A specific women? Women in general? Is vague-posting back in vogue?

pstuart 5 hours ago [-]

Ewww. Religion is a hell of a drug.

deadbabe 6 hours ago [-]

I really don’t understand the hype over raspberry Pi.

It’s an overrated, overhyped little computer. Like ok it’s small I guess but why is it the default that everyone wants to build something new on? Because it’s cheap? Whatever happened to buy once, cry once? Why not just build an actual powerful rig? For your NAS? For your firewalls? For security cameras? For your local AI agents?

jonatron 5 hours ago [-]

In the category of SBC's, it's pretty much the only one that has good software support, not outdated images made with a bunch of kernel patches for a specific kernel version.

qhwudbebd 5 hours ago [-]

This is certainly the reputation but I'm not sure they deserve it. They've always had the horrible closed-source bootloader with threadx running on the gpu, without a free alternative. At least up to pi4 they weren't bad at linux mainlining, but progress on upstreaming pi5 support has been glacial.

Cf. the various Beagle boards which have mainline linux and u-boot support right from release, together with real open hardware right down to board layouts you can customise. And when you come to manufacture something more than just a dev board, you can actually get the SoC from your normal distributor and drop it on your board - unlike the strange Broadcom SoCs rpi use.

I'm quite a lot more positive about rp2040 and rp2350, where they've at least partially broken free of that Broadcom ball-and-chain.

jonatron 14 minutes ago [-]

This sort of comment is great - it's good to know if the tech situation changes, which you could call Hacker News

heresie-dabord 5 hours ago [-]

> an overrated, overhyped little computer.

No, you are dismissive because you don't care about the use-cases.

The R.Pi 4 , 400, and the 500 are great models. Consider all the advantages together:

i= support for current Debian

ii= stellar community

iii= ease of use (UX), especially for people new to Debian and/or coding and/or Linux

iv= quiet, efficient, low power and passively cooled

v= robust enough to be left running for a long time

There are cheaper, more performant x86 and ARM dev boards and SOCs. But nothing compares to the full set of advantages.

That said, building a $3K A.I. cluster is just a senseless, expensive lark. (^;

theultdev 6 hours ago [-]

I use mine for a plex server.

I don't need to transcode + I need something I can leave on that draws little power.

I have a powerful rig, but the one time I get to turn it off is when I'd need the media server lol.

There's a lot of scenarios where power usage comes into play.

These clusters don't make much sense to me though.

deadbabe 5 hours ago [-]

That’s insane, drawing very little power from an always on server is a solved problem.

geerlingguy 5 hours ago [-]

What's your idea of very little power, though?

I know for many who run SBCs (RK3588, Pi, etc.), very little is 1-2W idle, which is almost nothing (and doesn't even need a heatsink if you can stand some throttling from time to time).

Most of the Intel Mini PCs (which are about the same price, with a little more performance) idle at 4-6W, or more.

lostmsu 5 hours ago [-]

Not to mention there's more cost, effort, and energy effective compute from old laptops.

Unless you have a robot body for your potential RPi, don't buy one.

theultdev 5 hours ago [-]

how many can run on batteries?

it's nice to take it on road trips / into hotels.

can't really imagine hauling a server around.

we probably have different definitions of "very little power".

IAmBroom 4 hours ago [-]

> it's nice to take it on road trips / into hotels.

> can't really imagine hauling a server around.

These two sentences contradict each other.

theultdev 3 hours ago [-]

How?

I can fit a raspberry pi and external ssd in my pocket.

I cannot do that for a server.

I could use a laptop, but simply plugging in a firestick to the hotel tv or a projector when camping is nicer.

WhitneyLand 5 hours ago [-]

“with 160 GB of total RAM, shared by the CPU and iGPU, this could be a small, efficient AI Cluster, right? Well, you'd think.”

No, I wouldn’t think.

Loading comments...

densh 3 hours ago [-]

qmr 1 hours ago [-]

No need for so much CPU power, any old quad core would work.

bee_rider 2 hours ago [-]

Tangentially related: I really expected running old MPI programs on stuff like the AMD multi-chip workstation packages to become a bigger thing.

cyberpunk 2 hours ago [-]

globular-toast 2 hours ago [-]

bunderbunder 6 hours ago [-]

Reminds me a bit of one of my favorite NormConf sessions, "Just use one big machine for model training and inference." https://youtu.be/9BXMWDXiugg?si=4MnGtOSwx45KQqoP

Or the oldie-but-goodie paper "Scalability! But at what COST?": https://www.usenix.org/system/files/conference/hotos15/hotos...

bee_rider 5 hours ago [-]

> After fixing the thermals, the cluster did not throttle, and used around 130W. At full power, I got 325 Gflops

https://www.top500.org/lists/top500/

They give the top 10 immediately.

First list (June 1993):

     placement  name            RPEAK (GFlop/s)
     1          CM-5/1024       131.00
     10         Y-MP C916/16256 15.24

Last list he wins, I think (June 1996):

     1          SR2201/1024     307.20  
     10         SX-4/32         64.00

First list he’s bumped out of the top 10 (November 1997):

     1          ASCI Red        1,830.40
     10         T3E             326.40

I think he gets bumped off the full top500 list around 2002-2003. Unfortunately I made the mistake of going by Rpeak here, but they sort by Rmax, and I don’t want to go through the whole list.

Apologies for any transcription errors.

Coffeewine 6 hours ago [-]

It's a pretty rough headline, clearly the author had fun performing the test and constructing the thing.

I would be pretty regretful of just the first sentence in the article, though:

> I ordered a set of 10 Compute Blades in April 2023 (two years ago), and they just arrived a few weeks ago.

That's rough.

geerlingguy 5 hours ago [-]

That's the biggest regret; but I've backed 6 Kickstarter projects over the years. Median time to deliver is 1 year.

Somehow I've actually gotten every item I backed shipped at some point (which is unexpected).

maartin0 5 hours ago [-]

ssl-3 2 hours ago [-]

One cannot (or at least, one is not supposed to be able to) patent someone else's invention.

privatelypublic 50 minutes ago [-]

You theoretically have a year before you even have to apply- but patents are expressly "first to file."

4 hours ago [-]

ComputerGuru 4 hours ago [-]

Guys, don’t take the claim so literally. He’s a successful tech poster. He makes good money showing off his purchases the good money complaining about how expensive they were.

But certainly don’t imitate his choices, his economics aren’t your economics!

esskay 3 hours ago [-]

Computer0 2 hours ago [-]

michaelt 2 hours ago [-]

Then you should take a good hard look at older, much cheaper Raspberry Pis.

Then look at Apple’s ARM offerings, and AWS Graviton if you need ARM with raw power.

If you need embedded/GPIO you should consider an Arduino, or clone. If you need GPIOs and Internet connectivity, look at an ESP32. GPIOs, ARM and wired ethernet? Consdier the the STM32H.

Robotics/machine vision applications, needing IO and lots of compute power? Consider a regular PC with an embedded processor on serial or USB. Or nvidia jetson if you want to run CUDA stuff.

And take a good hard look at your assumptions, as mini PCs using the Intel N100 CPU are very competitive with modern Pis.

privatelypublic 52 minutes ago [-]

dzhiurgis 2 hours ago [-]

Yea but Jeff’s videos are refreshing.

A lot of others are stuck in a loop where they essentially review tech for making more youtube videos - render times, colour accuracy, camera resolution, audio fidelity.

brcmthrowaway 3 hours ago [-]

If only Dan Luu pivoted to this style of content

4 hours ago [-]

system2 3 hours ago [-]

I totally agree. A person with that kind of builder knowledge already knows a decent GPU could 10x that compute power.

fidotron 6 hours ago [-]

If Pi Clusters were actually cost competitive for performance there would be data centres full of them.

shermantanktop 6 hours ago [-]

Like the joke about the economists not picking up the $20 bill on the ground?

Faith in the perfect efficiency of the free market only works out over the long term. In the short term we have a lot of habits that serve as heuristics for doing a good job most of the time.

uncircle 4 hours ago [-]

> Like the joke about the economists not picking up the $20 bill on the ground?

For those like me that don't know the joke:

shermantanktop 1 hours ago [-]

Presumably the non-economist following them picked up the twenty, unencumbered by theory.

ThrowawayR2 5 hours ago [-]

themafia 3 hours ago [-]

It's quite the opposite when corruption becomes involved. There are definite financial incentives for middle men to deliver inefficient and wasteful experiences.

Competition is what creates efficiency. Without it you live in a lie.

infecto 6 hours ago [-]

Sure but for commodities, like server hardware, we can say it’s usually directionally correct. If there are no pi cloud offerings, there is probably a good economic reason for it.

IAmBroom 6 hours ago [-]

> Faith in the perfect efficiency of the free market only works out over the long term

... and even then it doesn't always prove true.

rozab 14 minutes ago [-]

People said the same about PlayStations, to be fair

pmarreck 3 hours ago [-]

Yes. And if women were actually paid 80 cents to the dollar for men, men would be unemployable.

phoronixrly 6 hours ago [-]

If they were cost competitive for ... anything at all really...

ssl-3 33 minutes ago [-]

It depends on the application.

If one just wants a cheap desktop box to do desktop things with, then they're a terrible option, price-wise, compared to things like used corpo mini-PCs.

jacobr1 6 hours ago [-]

They are competitive for hobbyist use cases. Limited home servers, or embedded applications that overlap with arduino.

rebolek 27 minutes ago [-]

They are cost competitive enough for Korg synthesizers which is pretty OK for me.

ACCount37 6 hours ago [-]

Prototyping and low volume.

They're good for long as the development costs dominate the total costs.

magicalhippo 2 hours ago [-]

I picked up several Rpi 4 2GB for $20 each just before covid-19. At that price point they've been quite competitive for small homelab workloads.

The current RPi 5 makes no sense to me in any configuration, given its pricing.

mayli 1 hours ago [-]

Yeah, it's only competitive as a toy for under $35, anything beyond that you can get a cheap x86 with much better performance, a much compatible architecture and much more IOs.

wltr 6 hours ago [-]

Waraqa 5 hours ago [-]

>very energy efficient

If your server has a lot of idle time, ARM will always win.

miunau 2 hours ago [-]

Mythic Beasts rents rpi servers: https://www.mythic-beasts.com/order/rpi/ - there is a niche for it

mayli 1 hours ago [-]

This is also a proof where you can get cheaper and more perf/$ if you buy x86 based vps.

nromiun 6 hours ago [-]

Nobody is really building CPU clusters these days.

anematode 1 hours ago [-]

The best option for DP throughput for hobbyists interested in HPC might be old AMD cards from before they, too, realized that scientific folks would pay up the nose for higher precision.

ted_dunning 3 hours ago [-]

Well, El Capitan uses AMD CPUs (which have integrated GPU capabilities) and it is right on top of the rankings lately.

Frontier is right behind it with the same arrangement.

Having honest to god dedicated GPUs on their own data bus with their own memory isn't necessarily the fastest way to roll.

nromiun 3 hours ago [-]

They do not. The CPUs are only there to support and push data to the GPUs. Much like Nvidia GH200 systems. Nobody buys these APU chips for their CPU parts.

For comparison there are 9,988,224 GPU compute units in El Capitan and only 1,051,392 CPU cores. Roughly one CPU core to push data to 10 GPU CUs.

Aurornis 6 hours ago [-]

bunderbunder 6 hours ago [-]

dukeyukey 5 hours ago [-]

It's good for the soul to have your cluster running in your home somewhere.

NordSteve 4 hours ago [-]

Bad for your power bill though.

platybubsy 3 hours ago [-]

I'm sure 5 rpis will devastate the power grid

duxup 3 hours ago [-]

I need to heat my house too so maybe it helps a little there.

trenchpilgrim 1 hours ago [-]

Still less than renting the same amount of compute. Somewhere between several months and a couple years you pull ahead on costs. Unless you only run your lab a few hours a day.

11101010001100 3 hours ago [-]

You still pay for power for the cloud.

Damogran6 3 hours ago [-]

I got past that back when I was paying for ISDN and had 5 Surplus Desktop PCs...write it off as 'Professional development'

throwaway894345 3 hours ago [-]

What does a few rpis cost on a monthly basis?

theodric 3 hours ago [-]

Depends. At full load? At Irish power prices? Just the Pi, no peripherals, no NVMe? 5 units? €13/mo.

Handy: https://700c.dk/?powercalc

throwaway894345 2 hours ago [-]

> Depends. At full load? At Irish power prices? Just the Pi, no peripherals, no NVMe? 5 units? €13/mo.

Pretty sure most of us aren't running anywhere close to full load 24/7, but whoa, Irish power is expensive. In the central US I pay $0.14/KWh.

ofrzeta 5 hours ago [-]

Maybe so, but even then a second-hand blade server is more cost-effective than a Raspi Cluster.

geerlingguy 3 hours ago [-]

It's definitely not suited for production, but there, you won't find old blade servers either (for the power to performance issue).

Almondsetat 6 hours ago [-]

I can get a Xeon E5-2690V4 with 28 threads and 64GB of RAM for about $150. If you need cores and memory to make a lot of VMs you can do it extremely cheaply

Aurornis 5 hours ago [-]

> I can get a Xeon E5-2690V4 with 28 threads and 64GB of RAM for about $150.

If the goal is a lot of RAM and you don’t care about noise, power, or heat then these can be an okay deal.

semi-extrinsic 5 hours ago [-]

It would also be relevant to then also learn how to do stuff properly with SLURM and Warewulf etc. instead of a poor mans solution with Ansible playbooks like in these blog posts.

p12tic 3 minutes ago [-]

Almondsetat 3 hours ago [-]

You are taking my reply completely out of context. If you want to learn clustering, you need a lot of cores and ram to run many VMs. You don't need them to be individually very powerful.

mattbillenstein 5 hours ago [-]

Power and noise - old server hardware is not something you want in your home.

Commodity desktop cpus with 32 or 64GB RAM can do all of this in a low-power and quiet way without a lot more expense.

nine_k 6 hours ago [-]

It will probably consume $150 worth of electricity in less than a month, even sitting idle :-\

blobbers 6 hours ago [-]

The internet says 100W idle, so maybe more like $40-50 electricity, depending on where you live could be cheaper could be more expensive.

Makes me wonder if I should unplug more stuff when on vacation.

nine_k 6 hours ago [-]

amatecha 4 hours ago [-]

ToucanLoucan 4 hours ago [-]

Fuckin nutty how much juice those things tear through.

rogerrogerr 5 hours ago [-]

100W over a month (rule of thumb 730 hours) is 73kWh. Which is $7.30 at my $0.10/kWh rate, or less than $25 at (what Google told me is) Cali’s average $0.30/kWh.

mercutio2 5 hours ago [-]

Your googling gave results that were likely accurate for California 4-5 years ago. My average cost per kWh is about 60 cents.

Rates have gone up enormously because the cost of wildfires is falling on ratepayers, not the utility owners.

Regulated monopolies are pretty great, aren’t they? Heads I win, tales you lose.

lukevp 5 hours ago [-]

Damogran6 3 hours ago [-]

CORE energy in Colorado is charging $0.10819 per kWh _today_

https://core.coop/my-cooperative/rates-and-regulations/rate-...

cogman10 5 hours ago [-]

Depends entirely on the utilities board doing the regulation.

That said, I'm of the opinion that power/water/internet should all be state/county/city ran. I don't want my utilities companies to have profit motives.

My water company just got bought up by a huge water company conglomerate and, you guessed it, immediate rate increases.

SoftTalker 5 hours ago [-]

If your local regulators approved the merger and higher rates, your complaint is with them as much as the utility company.

Not saying that some regulators are not basically rubber stamps or even corrupt.

cogman10 4 hours ago [-]

I did (as did others), in fact, write in comments and complaints about the rate increases and buyout. That went unheard.

LTL_FTC 4 hours ago [-]

They have definitely increased but not all of California is like this. In the heart of Silicon Valley, Santa Clara, it's about $0.15/kWh. Having Data Centers nearby helps, I suppose.

chermi 3 hours ago [-]

I'm guessing the parent is talking about total bill (transmission, demand charges..) $.15/kwH is probably just the usage, and I am very skeptical that's accurate for residential.

favorited 2 hours ago [-]

yjftsjthsd-h 6 hours ago [-]

> Makes me wonder if I should unplug more stuff when on vacation.

What's the margin on unplugging vs just powering off?

dijit 5 hours ago [-]

By "off" you mean, functionally disabled but with whatever auto-update system in the background with all the radios on for "smart home" reasons - or, "off"?

titanomachy 5 hours ago [-]

100W continuous at 12¢/kWh (US average) is only ~$9 / month. Is your electricity 5x more expensive than the US average?

RussianCow 5 hours ago [-]

$50/month for 100W continuous usage isn't totally mad, and that could climb even higher over the rest of the decade.

mercutio2 5 hours ago [-]

Not OP, but my California TOU rates are between a 40 and 70 cents per kWh.

Still only $50/month, not $150, but I very much care about 100W loads doing no work.

cjbgkagh 5 hours ago [-]

Those kWh prices are insane, that’ll make industry move out of there.

selkin 2 hours ago [-]

Industrial pays different rates than homes.

That said, I am not sure those numbers are true. I am in California (PG&E with East Bay community generation), and my TOU rates are much lower than those.

Almondsetat 5 hours ago [-]

Isn't your home lab supposed to make you learn stuff? Why would you leave it idle?

cjbgkagh 5 hours ago [-]

You wouldn’t, it’s given as a lower bound, it costs more than that when not idling

dijit 5 hours ago [-]

but then you’d turn it off, if you don’t then cloud is much more expensive too.

Also $150 for 100w is crazy, thats like $1.70 per kWh; it would cost about $150 a year at the (high) rates of southern Sweden.

cjbgkagh 5 hours ago [-]

Im not the OP, don’t know how they arrived at that cost.

Personally it’s cheaper to buy the hardware that does spend most of its time idling. Fast turnaround on very large private datasets being key.

swiftcoder 3 hours ago [-]

Obviously the solution is to pickup another hobby, and enter the DIY solar game at the same time as your home lab obsession :D

kjkjadksj 5 hours ago [-]

So shut it off when you don’t need it.

sebastiansm 6 hours ago [-]

On Aliexpress those Xeon+mobo+ram kits are really cheap.

datadrivenangel 5 hours ago [-]

1. Not in the US with tariffs now. 2. I would not trust complicated electronics from Aliexpress from a safety and security perspective.

6 hours ago [-]

kbenson 5 hours ago [-]

Source? That seems like something I would want to take advantage if at the moment...

kllrnohj 5 hours ago [-]

Like a modern 12-thread 9600x runs absolute circles around it https://browser.geekbench.com/processors/amd-ryzen-5-9600x

mattbillenstein 5 hours ago [-]

This is the correct analysis - there's a reason you see this stuff cheap or free.

The homelab group on Reddit is full of people who don't understand any of this - they have full racks in their house that could be replaced with one high-end desktop.

kllrnohj 4 hours ago [-]

> The homelab group on Reddit is full of people who don't understand any of this - they have full racks in their house that could be replaced with one high-end desktop.

Also these old systems mean cheap RAM in large, large capacities. Like 128GB RAM to make ZFS or VMs purr is much cheaper to do on these used systems than anything modern.

mattbillenstein 4 hours ago [-]

Perhaps, but I don't really get the dozens of TB of storage in the home use case a lot of the time either.

Like if you have a large media library, you need to push maybe 10MB/s, you don't need 128GB of RAM to do that...

It's mostly just hardware porn - perhaps there are a few legit use cases for the old hardware, but they are exceedingly rare in my estimate.

kllrnohj 4 hours ago [-]

> Like if you have a large media library, you need to push maybe 10MB/s,

For just streaming a 4k bluray you need more than 10MB/s, Ultra HD bluray tops out at 144 Mbit/s. Not to mention if that system is being hit by something else at the same time (backup jobs, etc...).

Kubuxu 3 hours ago [-]

For 8+ bays you just need a SAS HBA card and one free PCI-E slot. Not to mention that many motherboards will have 6+ SATA ports already.

kllrnohj 2 hours ago [-]

> For 8+ bays you just need a SAS HBA card and one free PCI-E slot. Not to mention that many motherboards will have 6+ SATA ports already.

And what case are you putting them into? What if you want it rack mounted? What about >1gig networking? What if I want a GPU in there to do whisper for home assistant?

Used gaming rigs are great. But used servers also still have loads of value, too. Compute just isn't one of them.

zer00eyz 2 hours ago [-]

Most of the workloads that people with homelabs run, could be run on a 5 year old i5.

A lot of business are paying obscene money to cloud providers when they could have a pair of racks and the staff to support it.

Unless you're paying attention to the bleeding edge of the server market, to its costs (better yet features and affordability) this sort of mistake is easy to make.

montebicyclelo 6 hours ago [-]

Yeah... Looks like can get about $1/hr for 10 small VMs, ($0.10 per VM).

So for $3000, that's 3000 hours, or 125 days, (if just wastefully leave them on all the time, instead of turning them on when needed).

Say you wanted to play around for a couple of hours, that's like.. $3.

(That's assuming there's no bonus for joining / free tier, too.)

wongarsu 5 hours ago [-]

The VMs quickly get expensive if you leave them running though.

That's a total of $800, or 33 days of forgetting to shut down the 10 VMs. Maybe half that if you buy used.

Granted not everyone has $800 or even $400 to drop on hobby projects, renting VMs often does make sense

verdverm 6 hours ago [-]

You can rent a beefy vm with an H100 for $1.50 / hr

I regularly rent this for a few hours at a time for learning and prototyping

Y_Y 6 hours ago [-]

[flagged]

verdverm 5 hours ago [-]

I'll take the H1/200s over a vehicle any day of the week

pinkgolem 3 hours ago [-]

Are you comparing 10 VM with 1 shared core with a 144 core solution?

sam1r 4 hours ago [-]

A great way to do this is… is with a brand new Aws account, which will give you 1 year free across all services with reasonable limits.

jahsome 3 hours ago [-]

Oracle's free tier is pretty generous too.

aprdm 6 hours ago [-]

leoc 5 hours ago [-]

motorest 2 hours ago [-]

> The cost effective way to do it is in the cloud.

cramcgrab 3 hours ago [-]

I don’t know, i keyed this into google Gemini and got pretty far: “ Simulate an AWS AI cluster, command line interface. For each command supply the appropriate AWS AI cluster response”

pinkgolem 4 hours ago [-]

For learning I feel much safer setting everything up locally, worst case I have to reinstall my system.

In the cloud, worst case I have a bill over 5-6 digits.

And I know my ADD, 2 is not super unlikely.

nsxwolf 6 hours ago [-]

That isn’t fun. I have a TI-99/4A in my office hooked up to a raspberry pi so it can use the internet. Why? Because it’s fun. I like to touch and see the things even though it’s all so silly.

bakugo 6 hours ago [-]

swiftcoder 3 hours ago [-]

Plus you still have access to the whole lot when your ISP goes down (maybe less of a problem than it used to be, but not unheard of)

mattbillenstein 5 hours ago [-]

LOL, no

newsclues 5 hours ago [-]

Text and reference books are free at the library.

You don’t need hardware to learn. Sure it helps but you can learn from a book and pen and paper exercises.

trenchpilgrim 5 hours ago [-]

I disagree. Most of what I've learned about systems comes from debugging the weird issues that only happen on real systems, especially real hardware. The book knowledge is like, 20-30% of it.

titanomachy 5 hours ago [-]

Agreed, I don't think I'd hire a datacenter engineer whose experience consisted of reading books and doing "pen and paper exercises".

glitchc 6 hours ago [-]

I did some calculations on this. Procuring a Mac Studio with the latest Mx Ultra processor and maxing out the memory seems to be the most cost effective way to break into 100b+ parameter model space.

teleforce 5 hours ago [-]

Not quite, as it stands now the most cost effective way is most likely framework desktop or similar system for example HP G1a laptop/PC [1],[2].

[1] The Framework Desktop is a beast:

https://news.ycombinator.com/item?id=44841262

[2] HP ZBook Ultra:

https://www.hp.com/us-en/workstations/zbook-ultra.html

GeekyBear 5 hours ago [-]

Now that we know that Apple has added tensor units to the GPU cores the M5 series of chips will be using, I might be asking myself if I couldn't wait a bit.

t1amat 3 hours ago [-]

randomgermanguy 6 hours ago [-]

Depends on how heavy one wants to go with the quants (for Q6-Q4 the AMD Ryzen AI MAX chips seem better/cheaper way to get started).

GeekyBear 5 hours ago [-]

Given the RAM limitations of the first gen Ryzen AI MAX, you have no choice but to go heavy on the quantization of the larger LLMs on that hardware.

mercutio2 5 hours ago [-]

Huh? My maxed out Mac Studio gets 60-100 tokens per second on 120B models, with latency on the order of 2 seconds.

It was expensive, but slow it is not for small queries.

Now, if I want to bump the context window to something huge, it does take 10-20 seconds to respond for agent tasks, but it’s only 2-3x slower than paid cloud models, in my experience.

Still a little annoying, and the models aren’t as good, but the gap isn’t nearly as big as you imply, at least for me.

zargon 4 hours ago [-]

GPT OSS 120B only has 5B active parameters. GP specifically said dense models, not MoE.

3 hours ago [-]

EnPissant 3 hours ago [-]

I think the Mac Studio is a poor fit for gpt-oss-120b.

On my 96 GB DDR5-6000 + RTX 5090 box, I see ~20s prefill latency for a 65k prompt and ~40 tok/s decode, even with most experts on the CPU.

If you have time, could you share numbers for something like:

llama-bench -m <path-to-gpt-oss-120b.gguf> -ngl 999 -fa 1 --mmap 0 -p 65536 -b 4096 -ub 4096

Edit: The only Mac Studio pp65536 datapoint I’ve found is this Reddit thread:

https://old.reddit.com/r/LocalLLaMA/comments/1jq13ik/mac_stu ...

They report ~43.2 minutes prefill latency for a 65k prompt on a 2-bit DeepSeek quant. Gpt-oss-120b should be faster than that, but still very slow.

the8472 6 hours ago [-]

You could try getting a DGX Thor devkit with 128GB unified memory. Cheaper than the 96GB mac studio and more FLOPs.

glitchc 1 hours ago [-]

llm_nerd 5 hours ago [-]

Right now the Macs are viable purely because you can get massive amounts of unified memory. Be pretty great when they have the massive matrix FMA performance to complement it.

Palomides 6 hours ago [-]

even a single new mac mini will beat this cluster on any metric, including cost

eesmith 6 hours ago [-]

encom 3 hours ago [-]

>Mac

>cost effective

lmao

vlovich123 6 hours ago [-]

tracker1 4 hours ago [-]

But will there be a CM6 while you're waiting for the software to improve?

randomNumber7 3 hours ago [-]

Did OP really think his fellow humans are that moronic that they just didn't find out you can plug in together a cuple of rasperri pis?

rustyminnow 2 hours ago [-]

Asraelite 3 hours ago [-]

> I don’t know if anyone building a Pi cluster actually goes into it thinking it’s going to be a cost effective endeavor, do they?

Some Raspberry Pi products are sold at a loss, so I could see how it's in the realm of possibility.

llm_nerd 6 hours ago [-]

philipwhiuk 6 hours ago [-]

I doubt anyone does this seriously.

nerdsniper 6 hours ago [-]

I sure hope no one does this seriously expecting to save some money. I enjoy the videos on "catastrophically ill-advised" build-outs. My primary curiosities that get satisfied by them are:

1) How much worse / more expensive are they than a conventional solution?

2) What kinds of weird esoteric issues pop up and how do they get solved (e.g. the resizable BAR issue for GPU's attached to RPi's PCIe slot)

TZubiri 5 hours ago [-]

Fun fact, a raspberry pi does not have a built in Real Time Clock with its own battery, so it relies on network clocks to keep the time.

Another fun fact, the network module of the pi is actually connected to the USB bus, so there's some overhead as well as a throughput limitation.

Fun fact, the Pi does not have a power button, relying on software to shut down cleanly. If you lose access to the machine, it's not possible to avoid corrupted states on the disk.

alias_neo 5 hours ago [-]

> the raspberry pi is still an amazingly cost effective choice

I went this route myself and have figuratively and literally shelved a bunch of Pis by replacing them with a MiniPC.

barnas2 2 hours ago [-]

> SSD + HAT + PSU + Case + Cooler (+ maybe a uSD)

sjsdaiuasgdia 1 hours ago [-]

> a spare USB-C cable and brick lying around

Particularly with Pi 5, any old brick that might be hanging around has a fair chance at not being able to supply sufficient power.

mrguyorama 2 hours ago [-]

>SSD + HAT + PSU + Case + Cooler

Apparently the Pi 5 8gb is $120 though WTF.

What personal web site or web app or project can't run just fine on a Pi Zero 2 though? It's a little RAM starved but performance wise it should be sufficient.

Other than second-hand mini PCs, old laptops also make great home servers. They have built in UPS!

TZubiri 4 hours ago [-]

What do you mean by Cooler? Raspberry pi doesn't need a fan.

>I went this route myself and have figuratively and literally shelved a bunch of Pis

>and I have dozens of them,

J_McQuade 3 hours ago [-]

> What do you mean by Cooler? Raspberry pi doesn't need a fan.

From the official website:

> Does Raspberry Pi 5 need active cooling?

> Raspberry Pi 5 is faster and more powerful than prior-generation Raspberry Pis, and like most general-purpose computers, it will perform best with active cooling.

TZubiri 2 hours ago [-]

Oh. I haven't used 5, i did 3 and 4

Sohcahtoa82 2 hours ago [-]

> What do you mean by Cooler? Raspberry pi doesn't need a fan.

Starting with the Pi 4, they started saying that a cooler isn't required, but that it may thermal throttle without one if you keep the CPU pegged.

geerlingguy 4 hours ago [-]

The Pi 5 / CM5 / Pi 500 series does have a built-in RTC now, though most models require you to buy a separate RTC battery to plug into the RTC battery jack.

stuxnet79 5 hours ago [-]

> Fun fact, a raspberry pi does not have a built in Real Time Clock with its own battery, so it relies on network clocks to keep the time.

> Another fun fact, the network module of the pi is actually connected to the USB bus, so there's some overhead as well as a throughput limitation.

> Fun fact, the Pi does not have a power button, relying on software to shut down cleanly. If you lose access to the machine, it's not possible to avoid corrupted states on the disk.

With all these caveats in mind, a raspberry pi seems to be an incredibly poor choice for distributed computing

CamperBob2 4 hours ago [-]

With all these caveats in mind, a raspberry pi seems to be an incredibly poor choice for distributed computing

Exactly. This build sounds like the proverbial "1024 chickens" in Seymour Cray's famous analogy. If nothing else, the communications overhead will eat you alive.

moduspol 6 hours ago [-]

Also cost effective is to buy used rack mount servers from Amazon. They may be out of warranty but you get a lot more horsepower for your buck, and now your VMs don’t have to be small.

Aurornis 6 hours ago [-]

Putting a retired datacenter rack mount server in your house is a great way to learn how unbearably loud a real rack mount datacenter server is.

Tsiklon 5 hours ago [-]

To quote @swiftonsecurity - https://x.com/swiftonsecurity/status/1650223598903382016 ;

buildbot 4 hours ago [-]

moduspol 3 hours ago [-]

True! They aren't quiet. I keep mine in a well-ventilated room that doesn't typically have people in it.

J_Shelby_J 3 hours ago [-]

Buy a 3/4u case for $100 and put whatever board you want in it with standard PC fans and a decent cpu cooler. Dead silent.

tempest_ 6 hours ago [-]

ahah and pricey power wise.

Currently the cloud providers are dumping second gen xeon scalables and those things are pigs when it comes to power use.

Sound wise its like someone running a hair dryer at full speed all the time and it can be louder under load.

_boffin_ 4 hours ago [-]

Not true. Have one running in the closet and never hear it.

ComputerGuru 4 hours ago [-]

Only if it’s a 1U. 2U units idle at silent.

Y_Y 6 hours ago [-]

If you're following this path, make sure to use the finest traditional server rack that money can buy: https://www.ikea.com/ie/en/p/lack-side-table-white-30449908/

allanrbo 6 hours ago [-]

No, again, just run VMs on your desktop/laptop. The software doesn't know or care if it's a rack mounted machine.

3 hours ago [-]

wccrawford 5 hours ago [-]

Geerling's titles have been increasingly click-bait for a while now. It's pretty sad, because I like his content, but hate the click-bait BS.

jonathanlydall 3 hours ago [-]

If it makes an appreciable difference to how much money he makes on YouTube then I can’t begrudge him for doing it.

Don’t hate the player, hate the game.

mrguyorama 2 hours ago [-]

Youtube demonstrably wants clickbait titles and thumbnails. They built tooling to automatically A/B test titles and thumbnails for you.

Youtube could fix this and stop it if they want, but that might lose them 1% of business so they never will.

They love that you blame creators for this market dynamic instead of the people who literally create the market dynamic.

kolbe 5 hours ago [-]

buildbot 5 hours ago [-]

Jeff is a good person/blogger and does interesting projects but more experience with niche hardware than literally anyone is a stretch.

Like what about the people who maintain the alpha/sparc/parisc linux kernels? Or the designers behind idk tilera or tenstorrent hardware.

geerlingguy 4 hours ago [-]

I was just at VCF Midwest this past weekend, and I can assure you I am on some of the lower echelons of people who know about niche hardware.

phatfish 4 hours ago [-]

kolbe 4 hours ago [-]

> more experience with niche hardware than literally anyone is a stretch.

This is why I said "almost anyone." If I changed your words, I could disagree with you as well.

AceJohnny2 3 hours ago [-]

> If he does something, there's usually a good a priori rationale for it.

amelius 4 hours ago [-]

Is a Pi still considered "niche" hardware?

ww520 5 hours ago [-]

Now. Imagine a Beawulf of these...

cosarara 6 hours ago [-]

> Compared to the $8,000 Framework Cluster I benchmarked last month, this cluster is about 4 times faster:

Slower. 4 times slower.

teleforce 5 hours ago [-]

That's definitely a typo because I've to read the sentence 3 times from the article still cannot make a sense until I saw the figure.

TL;DR, just buy one framework desktop and it's better than the Pi AI cluster of the OP in every single performance metrics including cost, performance, efficiency, headache, etc.

geerlingguy 5 hours ago [-]

Oops, fixed the typo! Thanks.

And regarding efficiency, in CPU-bound tasks, the Pi cluster is slightly more efficient. (Even A76 cores on a 16nm node still do well there, depending on the code being run).

markx2 6 hours ago [-]

> "But if you're on the blog, you're probably not the type to sit through a video anyway. So moving on..."

Thank you!

zeristor 50 minutes ago [-]

If it worked out well he would be trumpeting the glory.

Not so good, and this is the sort of title. you need to bring the punters in for YouTube.

I don't mean to sound too cynical, I appreciate Jeff's videos, just wanted to point out that if you've spent money and time on content you can either ditch it or make a regret video.

Just so long as the thumbnails don't have an arrow on them I'm happy.

dragontamer 1 hours ago [-]

Why build a cluster?

That exaggerates the effects of good NUMA and now you as a programmer can get more NUMA skills.

Of course, the best plan is to spend $20,000,000++ on your own custom NUMA nodes cluster out of EPYCs or something.

-------

But no. The best supercomputers are your local supercomputers that you should rent some time from. You need a local box to see various issues and learn to practice programming.

lumost 6 hours ago [-]

Unless you can keep your compute at 70% average utilization for 5 years - you will never save money purchasing your hardware compared to renting it.

horsawlarway 6 hours ago [-]

There are an absolutely stunning number of ways to lose a whole bunch of money very quickly if you're not careful renting compute.

$3,000 is well under many "oopsie billsies" from cloud providers.

And that's outside of the whole "I own it" side of the conversation, where things like latency, control, flexibility, & privacy are all compelling reasons to be willing to spend slightly more.

I still run quite a number of LLM services locally on hardware I bought mid-covid (right around 3k for a dual RTX3090 + 124gb system ram machine).

0xbadcafebee 4 hours ago [-]

You could also spill a can of Mountain Dew over the $8,000 AI rig next to you. Oopsies can happen anywhere...

lumost 4 hours ago [-]

horsawlarway 3 hours ago [-]

So 83 days for payback at the 2k sticker price for the 4090? Sounds like a good time to buy a 4090...

Like, if you buy that card it can still be processing things for you a decade from now.

Or you can get 3 months of rental time.

---

And yes, there is definitely a point where renting makes more sense because the capital outlay becomes prohibitive, and you're not reasonably capable of consuming the full output of the hardware.

But the cloud is a huge cash cow for a reason... You're paying exorbitant prices to rent compared to the cost of ownership.

HenryMulligan 6 hours ago [-]

tern 6 hours ago [-]

For most people, no, privacy does not matter in this sense, and "security" would only be a relevant term if there was a pre-existing adversarial situation

a2128 6 hours ago [-]

ripdog 3 hours ago [-]

Because latency matters when gaming in a way which doesn't matter with AI inference?

Plus cloud gaming is always limited in range of games, there are restrictions on how you can use the PC (like no modding and no swapping savegames in or out).

cellis 4 hours ago [-]

And lest we forget! Forgetting to turn it off!

causal 6 hours ago [-]

1) Data proximity (if you have a lot of data, egress fees add up)

2) Hardware optimization (the exact GPU you want may not always be available for some providers)

3) Not subject to price changes

4) Not subject to sudden Terms of Use changes

5) Know exactly who is responsible if something isn't working.

6) Sense of pride and accomplishment + Heating in the winter

justinrubek 6 hours ago [-]

At some point, the work has to actually be done rather than shuffling the details off to someone else.

2OEH8eoCRo0 6 hours ago [-]

I don't get why anyone would hack on and have fun with unique hardware either /s

seanw444 6 hours ago [-]

MomsAVoxell 2 hours ago [-]

I have a SOPINE cluster board that I'd quite like to get booted up and pressed into some sort of useful service.

Still, it'll make a fun distributed computing experimental platform some day.

Just like the Inmos Transputer I've got somewhere, sitting in a box, waiting for a power supply ..

xnx 6 hours ago [-]

Fun project. Was the author hoping for cost effective performance?!

I assumed this was a novelty, like building a RAID array out of floppy drives.

LTL_FTC 6 hours ago [-]

The author is a YouTuber and projects like these pay for themselves with the views they garner. Even the title is designed for engagement.

leptons 6 hours ago [-]

A lot of people don't understand the performance limits of the Raspberry Pi. It's a great little platform for some things, but it isn't really fit for half the use cases I've seen.

Our_Benefactors 6 hours ago [-]

This was my impression as well, the bit about GPU incompatibility with llama.cpp made me think he was in over his head.

noelwelsh 6 hours ago [-]

I'd love to understand the economics here. $3000 purely for fun seems like a lot. $3000 for promotion of a channel? consulting? seems reasonable.

philipwhiuk 5 hours ago [-]

Jeff has a million YouTube subscribers, gets $2000 a month from Patreon and has 200 GitHub sponsors.

The economics of spending $3,000 on a video probably work out fine.

geerlingguy 5 hours ago [-]

It's definitely a stretch for my per-video budget, but I did want to have a 'maxed out' Pi cluster for future testing as well.

Which is fun, but also educational; I've learned a lot about inference, GPU memory access, cache coherency, the PCIe bus...

So a lot of intangibles, many of which never make it directly into a blog post or video. (Similar story with my time experiments).

djhworld 2 hours ago [-]

numpad0 4 hours ago [-]

> goes round-robin style asking each node to perform its prompt processing, then token generation.

bityard 4 hours ago [-]

leakycap 2 hours ago [-]

I dipped my toe into cluster computing when rendering a MAXON Cinema 4D project across a lab of ~30 dual G5 Power Macs.

Quickly learned that there is so much more to manage when you split a task up across systems, even when the system (like Cinema 4D) is designed for it.

wcchandler 4 hours ago [-]

I was just exploring Pi’s and AI hats, so this post is appreciatively timely.

This post didn’t change my direction too much, but it did help level set some realistic expectations. So thanks for sharing.

asimpleusecase 4 hours ago [-]

Great write up! This is nice nerd catnip. And sharing a failed project teaches so much. Please, more should share the projects that absorbed tons of time yet never really delivered.

5 hours ago [-]

AlfredBarnes 6 hours ago [-]

My pi's are just an easy onramp for me to have a functional NAS, PIHole, and webcam security.

Not at all the best, but they were cheap. If i WANTED the best or reliable, i'd actually buy real products.

aprdm 6 hours ago [-]

Love Jeff's ansible roles/playbooks and his cluster building ! Quite interesting, I should reserve some time to play with a Pi cluster and ansible, sounds fun

hamonrye 5 hours ago [-]

Strange quirks for language syntax. I read about a parsing system that was distilled towards being-towards this particular species of silver-tail that existed below Golden Gate bridge. Godwin's Law?

bravetraveler 6 hours ago [-]

Wow that's a lot of scratch for... scratch. Pays for itself, I'm sure: effective bait :)

'Worth it any more'? At this size, never. A Pi is a Pi is a Pi!

A few are fine for toying around; beyond that, hah. Price:perf is rough, does not improve with multiplication [of units, cost, or complexity].

deater 6 hours ago [-]

elzbardico 6 hours ago [-]

Frankly, always thought about Pi Clusters as a nerd indulgence, something to play, not to do serious work.

NitpickLawyer 6 hours ago [-]

gary_0 6 hours ago [-]

Oh yeah, the "imagine a Beowulf cluster of these" Slashdot meme! I miss those days. At least the "can it run Doom?" meme is still alive and kicking.

unregistereddev 5 hours ago [-]

Ditto! It reminded me of the time in college when I built a Beowulf cluster from recently-retired Pentium II desktops.

Was it fast? No. But that wasn't the point. I was learning about distributed computing.

dekhn 3 hours ago [-]

Beowulf style clusters went on to dominate supercomputing.

randomgermanguy 6 hours ago [-]

I think the only exception is specifically for studying network/communciation-topologies. I've seen a couple clusters (ca. 10-50 Pi's) in universities for both research and teaching.

ofrzeta 4 hours ago [-]

There are so many network emulators you can use, such as Mininet or GNS3.

JambalayaJimbo 3 hours ago [-]

I'm sure pedagogically speaking it's better to use physical devices

devmor 6 hours ago [-]

drillsteps5 6 hours ago [-]

Maybe I'm missing something.

hn_throw_250915 6 hours ago [-]

paxys 6 hours ago [-]

This is just the evolution of clickbait titles. The only thing missing is a thumbnail of an AI generated raspberry pi cluster with a massive arrow pointing to it and the words "not worth it!!"

Joker_vD 6 hours ago [-]

There also needs to be a Face Screaming in Fear Emoji plastered on the other side of it.

Havoc 5 hours ago [-]

Have a bunch of Pis too, but realized I can use them to create a high availability control plane for a k8s cluster. Pi4s are entirely adequate for that

owaislone 5 hours ago [-]

What is the ideal or realistic setup for a home lab to power big enough models running locally and service ~2-3 users at a time?

yalogin 4 hours ago [-]

Can the LLMs be deployed split across multiple Pis? I thought it was not possible, may be I am not caught up.

zamadatix 5 hours ago [-]

The article focuses on compute performance but I wonder if that was ever the bottleneck considering the memory bandwidth involved.

bearjaws 5 hours ago [-]

amelius 5 hours ago [-]

Ok, what are the back-of-the-envelope computations that he should have done before starting to build this?

geerlingguy 5 hours ago [-]

But still can be decent for HPC learning, CI testing, or isolated multi-node smaller-app performance.

teaearlgraycold 3 hours ago [-]

Getting some NUC-like machines makes a lot more sense to me. You’ll get 2.5Gb/s Ethernet at the least and way more FlOPS as well.

Drblessing 5 hours ago [-]

The bee-link AI max+ is the best value AI pc right now.

stirfish 5 hours ago [-]

I came here to ask about these. You like yours?

nodesocket 2 hours ago [-]

system2 3 hours ago [-]

iLoveOncall 4 hours ago [-]

> With 160 GB of total RAM, shared by the CPU and iGPU, this could be a small, efficient AI Cluster, right? Well, you'd think.

I don't know anyone who would think this actually.

geerlingguy 3 hours ago [-]

imtringued 6 hours ago [-]

Oh come on Jeff, you forgot to buy GPUs for your AI cluster. Such a beginner mistake.

All you needed to do is buy 4x xtx 7900 used on ebay and build a four node raspberry pi cluster using the external GPU setup you've come up with in one of your previous blog posts [0].

[0] https://www.jeffgeerling.com/blog/2024/use-external-gpu-on-r...

geerlingguy 5 hours ago [-]