This is how a Mac Studio runs large models
by Poster
May 19, 2025
In short: toy level.
Mac Studio M3 Ultra, 512 GB memory/video memory. Running 671B at Q4_K_M, the GPU and memory are both maxed out, at a bit over 10 tokens/s.
Running 32B, memory use is low, about 8%, but the GPU is still pegged the whole time, at a bit over 20 tokens/s.
If you add embedding and rerank models (standard for a knowledge base) to the same machine, it basically grinds to a halt.
Running the knowledge base for Obsidian and Dify, the speed is about the same as my AMD machine with 64 GB RAM and a 4060 Ti 16 GB running a 14B model.
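For a rough sense of why the 671B run fills the machine while a 14B model fits on a 16 GB card, here is a back-of-the-envelope sketch. It assumes about 4.8 bits per weight as an average for Q4_K_M (an approximation, not a measured figure) and counts weights only, so KV cache and runtime overhead are left out. The function name `weight_size_gb` is made up for illustration.

```python
def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (10^9 bytes)."""
    # params (in billions) * bits per weight / 8 bits per byte = GB
    return params_billion * bits_per_weight / 8


# Assumption: rough average bits per weight for a Q4_K_M quantization.
ASSUMED_BPW = 4.8
TOTAL_MEMORY_GB = 512  # the Mac Studio configuration from the post

for name, params in [("671B", 671), ("32B", 32), ("14B", 14)]:
    size = weight_size_gb(params, ASSUMED_BPW)
    share = size / TOTAL_MEMORY_GB
    print(f"{name}: ~{size:.0f} GB of weights, {share:.0%} of {TOTAL_MEMORY_GB} GB")
```

Under that assumption the 671B weights alone come to roughly 400 GB. The post does not say which quantization the 32B run used, so its 8% figure cannot be checked against this estimate.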
Replies
-
Anonymous10417 May 18, 2025 The Mac's advantage is its memory/video memory capacity. The compute just isn't good, not as good as NVIDIA's flagship graphics cards. PS: I saw this post next door too.
-
Anonymous4493 May 18, 2025 If it beat NVIDIA across the board, Apple would be willing to sell it, but you wouldn't be able to buy it.
-
Anonymous5607 May 19, 2025 Apple's current products are just for entertainment. Don't treat them as real productivity tools. Be kind to yourself.
-
Anonymous1839 May 19, 2025 What people have been saying is that at this price you can run a 671B model, while graphics cards at the same price don't have enough video memory to run it. Nobody ever said the Mac runs large models faster. After all, there is no CUDA acceleration, and Apple's Metal ecosystem is not as good as CUDA's.
-
Anonymous11774 May 19, 2025 The speed isn't great, but at least it runs.
-
Anonymous472 May 19, 2025 It holds up fine against consumer-grade graphics cards. At least it's affordable and you can actually buy one.