null_investor
Yes, there are a few companies in that space (I don't remember the names), and it's possible, but it would likely take a few years of development and a real use case that makes it worth it.

But it might never catch on: they would need to reduce the cost a lot, and the final hardware probably wouldn't be usable to train a different model. Even though H100s are expensive, once you buy a few clusters of them you can use them for a long time, train many models on them, and run other machine learning workloads. That flexibility wouldn't exist if you just burn one model into silicon.

The complexity is mostly that nobody was really looking into this until a few years ago, and since it involves semiconductor manufacturing, product development cycles can take some time.

Even though the NVIDIA CEO wants to pretend this is rocket science so people believe there's some kind of moat, it isn't. It's actually simple, and there are plenty of semiconductor companies that could do this if they invested the time and money needed.

This might actually become necessary if we find we need to train much larger models with much longer training times (AFAIK LLaMA took about 90 days), or if energy costs start to get really crazy. Burning the model into silicon would likely reduce those expenses, at the cost of lower flexibility.
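The energy side of that tradeoff can be sketched with a toy calculation. Every figure below (cluster size, per-chip power, electricity price, and especially the 10x efficiency ratio for the fixed-function chip) is a made-up assumption for illustration, not a measured number:

```python
# Back-of-envelope: yearly electricity cost of serving one model on a GPU
# cluster vs. a hypothetical fixed-function chip. All numbers are
# illustrative assumptions.

HOURS_PER_YEAR = 24 * 365

def yearly_energy_cost(num_chips, watts_per_chip, usd_per_kwh):
    """Electricity cost (USD) of running the chips flat-out for a year."""
    kwh = num_chips * watts_per_chip * HOURS_PER_YEAR / 1000
    return kwh * usd_per_kwh

# Assumed GPU deployment: 1,000 accelerators at ~700 W each (H100-class TDP),
# $0.10/kWh industrial electricity.
gpu_cost = yearly_energy_cost(1000, 700, 0.10)

# Assumed fixed-function chip: same throughput at 10x better energy
# efficiency (an invented ratio, purely for illustration).
asic_cost = yearly_energy_cost(1000, 70, 0.10)

print(f"GPU cluster: ${gpu_cost:,.0f}/year")
print(f"Fixed chip:  ${asic_cost:,.0f}/year")
print(f"Savings:     ${gpu_cost - asic_cost:,.0f}/year")
```

The point isn't the exact numbers; it's that the savings scale linearly with deployment size, so the non-recurring cost of a silicon tape-out only pays off at very large, very stable scale.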

I feel we're still in the experimental phase for LLMs, and once we have multiple business applications, different models will need different, much more specialized training approaches. It's worth noting that not much time has passed since GPT-2 launched. Better solutions will come as needed, and competition in AI is high right now.

There's a lot of money & ideas being tried out at this very moment.