zitterbewegung 2 days ago

I'm trying to run this, but fo.cpp doesn't exist in the repository. I made an issue; see https://github.com/Foreseerr/TScale/issues/1

  • mdaniel a day ago

    I suspect this was prematurely published to HN and was in fact just someone's weekend project

    https://github.com/Foreseerr/TScale/blob/aa2638c53c74dd33280...

    https://github.com/Foreseerr/TScale/blob/aa2638c53c74dd33280...

    and I struggle to think of what would lead one to the urge to implement a key=value config file parser in 2025 https://github.com/Foreseerr/TScale/blob/aa2638c53c74dd33280...

    On top of that, people who do $(git add . && git commit -myolo) drive me crazy https://github.com/Foreseerr/TScale/blob/main/logs/125m_1T_f...

    • spaceywilly a day ago

      > and I struggle to think of what would lead one to the urge to implement a key=value config file parser in 2025

      That could be a symptom of LLM coding. I have found that at times they will go down a rabbit hole of coding up a complicated solution to something when I know a library already exists that it could've used. I'm sure part of the problem is that it isn't able to search for libraries to solve problems, so if its training data didn't use a particular library, it won't be able to use it.

    • comex a day ago

      > and I struggle to think of what would lead one to the urge to implement a key=value config file parser in 2025

      C/C++ culture never changes.

      As many new build tools and package managers as people come up with, the ‘default’ environment is still one where adding dependencies is hard, so people roll their own utilities instead.

      • fc417fc802 a day ago

        I can only speak for myself, but I don't think you've got the cause and effect right. Dependencies tend to have their own dependencies (which have ...). It's not so much the difficulty as the awareness of it that leads me to keep my dependencies to the bare minimum.

        All my dependencies are locally cloned; I build them from source in a network-isolated environment. And yeah, that makes it more expensive to bring new ones in, so I tend to shy away from it. I see that as a good thing.

        That said, if you're willing to give CMake access to the network, things largely just work as long as you don't attempt anything too exotic compared to what the original authors did. For that matter, Boost already has a decent solution for pretty much anything and is available from your distro's repos. Rolling your own is very much a cultural pastime as opposed to a technical necessity.

      • _zoltan_ a day ago

        CMake makes it a lot easier. Couple that with conda and it's pretty good.

        I'm coming from a Java/Python background originally, and compared to that it's more finicky, but not bad at all.

fizx a day ago

What is this 1T index technique they seem so hyped about?

  • emorning3 a day ago

    >> In this case we build a model with 1T index which we lookup for every token to make prediction with much smaller model. <<

    This index seems to be used to minimize the size of models.

    I'm familiar with term indexing as described in The Handbook of Automated Reasoning and I imagine that this index helps them recognize 'generalizations'.

    In the way that a rewrite rule can be used to reduce an infinite number of expressions, not just a single expression, a generalization can be used to minimize models.

    Generally, such an index would be some kind of prefix-tree.

    Just a guess; guessing is fun.
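    To make the guess concrete, here's a minimal sketch of that kind of prefix-tree lookup over token sequences: the longest stored prefix of the current context returns an associated value (say, a candidate prediction). All names here (`PrefixIndex`, token-id keys, the stored `int` value) are hypothetical illustration, not anything from TScale's actual index.

    ```cpp
    #include <map>
    #include <memory>
    #include <vector>

    // Hypothetical prefix tree over token-id sequences. Each node may carry
    // a stored value; -1 means "no entry terminates at this node".
    struct TrieNode {
        std::map<int, std::unique_ptr<TrieNode>> children;  // keyed by token id
        int value = -1;
    };

    struct PrefixIndex {
        TrieNode root;

        // Store `value` under the exact token sequence `tokens`.
        void Insert(const std::vector<int>& tokens, int value) {
            TrieNode* node = &root;
            for (int t : tokens) {
                auto& child = node->children[t];
                if (!child) child = std::make_unique<TrieNode>();
                node = child.get();
            }
            node->value = value;
        }

        // Walk the trie along `context`, remembering the deepest stored value;
        // this is the "longest matching generalization" lookup described above.
        int LongestPrefixLookup(const std::vector<int>& context) const {
            const TrieNode* node = &root;
            int best = -1;
            for (int t : context) {
                auto it = node->children.find(t);
                if (it == node->children.end()) break;
                node = it->second.get();
                if (node->value != -1) best = node->value;
            }
            return best;
        }
    };
    ```

    With entries for {1,2} and {1,2,3}, a lookup on context {1,2,3,4} would return the value stored at the deeper prefix {1,2,3}; a context sharing only {1,2} falls back to the shorter entry.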

TYMorningCoffee 2 days ago

Can the inference piece be partitioned over multiple hosts?

Edit: partitioned, or otherwise structured algorithmically, in a way that overcomes the network bottleneck

  • Maxious 2 days ago

    > prima.cpp is a distributed implementation of llama.cpp that lets you run 70B-level LLMs on your everyday devices— laptops, desktops, phones, and tablets (GPU or no GPU, it’s all good). With it, you can run QwQ-32B, Qwen 2.5-72B, Llama 3-70B, or DeepSeek R1 70B right from your local home cluster!

    https://github.com/Lizonghang/prima.cpp

  • happyPersonR 2 days ago

    Pretty sure llama.cpp can already do that

    • TYMorningCoffee 2 days ago

      I forgot to clarify dealing with the network bottleneck

      • moralestapia a day ago

        Just my two cents from experience, any sufficiently advanced LLM training or inference pipeline eventually figures out that the real bottleneck is the network!

gitroom a day ago

tbh i never get why people keep reinventing config parsers, but i guess old habits die slow

  • bheadmaster a day ago

    Sometimes, every config parser that is popular enough to be considered trusted is just chock full of features you don't need, and it increases both your build time and binary size without bringing much value beyond doing the config parsing you could've written yourself in a few minutes.

    Sometimes, what you want is just a simple key=value config parser.

    A little re-inventing the wheel is better than a little dependency.
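    For the sake of argument, a "few minutes" key=value parser really is about this size. This is a generic sketch (hypothetical `ParseConfig` and `trim` names), not TScale's actual parser: it skips blank lines and `#` comments, splits on the first `=`, and trims whitespace around keys and values.

    ```cpp
    #include <istream>
    #include <map>
    #include <sstream>
    #include <string>

    // Strip leading/trailing whitespace from a string.
    static std::string trim(const std::string& s) {
        const auto b = s.find_first_not_of(" \t\r\n");
        if (b == std::string::npos) return "";
        const auto e = s.find_last_not_of(" \t\r\n");
        return s.substr(b, e - b + 1);
    }

    // Parse key=value lines into a map; comments ('#') and blank or
    // malformed lines (no '=') are silently skipped.
    std::map<std::string, std::string> ParseConfig(std::istream& in) {
        std::map<std::string, std::string> cfg;
        std::string line;
        while (std::getline(in, line)) {
            line = trim(line);
            if (line.empty() || line[0] == '#') continue;
            const auto eq = line.find('=');
            if (eq == std::string::npos) continue;
            cfg[trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
        }
        return cfg;
    }
    ```

    Feeding it `"lr = 0.01\nbatch=32\n"` via an `std::istringstream` yields `{"batch": "32", "lr": "0.01"}`. Whether that beats pulling in a TOML or INI library is exactly the trade-off being argued here.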

revskill 2 days ago

Interesting that you put the code in a code folder, not src.

ArtTimeInvestor 2 days ago

Even with consumer GPUs, the AI stack is completely dependent on ASML, isn't it?

Thought experiment: What would happen if the Dutch government decided that AI is bad for mankind and shuts down ASML? Would the world be stuck in terms of AI? For how long?

  • bgnn 2 days ago

    That's a silly thought. ASML isn't controlled by the Dutch government.

    Also, everything in computing is dependent on semiconductors. ASML is just one player. There are tens of thousands companies involved in the industry and some of them are single suppliers of critical materials, machines or software. It's wrong to single out ASML.

    • mschuster91 2 days ago

      > ASML isn't controlled by the Dutch government.

      Of course they are. The Dutch government is who ordered ASML to not export their brand new stuff to China.

      • wokkel a day ago

        Actually, it was the USA pressuring the Dutch government.

        • coredog64 a day ago

          ASML licensed technologies from US companies during the development of EUV. That's what gives the US the leverage to do things like block sales to China.

  • TechDebtDevin 2 days ago

    ASML publishes most of the research, and there's not much stopping people from building their own EUV lithography machines. It's just very, very hard, basically the equivalent of doing magic. China is making incredible progress on this front.

    • airstrike 2 days ago

      The problem with these things is that there are always trade secrets that aren't published anywhere. So you'd need to actually hire people with specific knowledge to be able to replicate it.

      The world (and the West specifically) definitely needs to build redundancy ASAP here.

      • TechDebtDevin a day ago

        The new machines are 2-3 stories tall, require an Airbus to transport, and have complexity on par with the world's largest particle accelerators, if not greater. Because of this, the supply chains are highly intertwined; no one country can isolate that supply chain. The Dutch can't build it without our contributions, and neither could we without theirs. Lots of moving parts here, literally and figuratively.

        • airstrike a day ago

          That's a separate concern and doesn't change the fact that parts of that supply chain are irreplaceable.

          The Dutch don't have to willfully sabotage ASML for it to be an issue.

  • SecretDreams a day ago

    Like all novel things, once you prove it can be done, someone else will do it. If you shut ASML down, some other country that is already working on it will catch up. ASML existing is better, because at least the player ahead can keep trying to remain ahead.