What would be the benefit of using ZML instead of relying on StableHLO/PJRT directly? The cost of porting models is certainly high.
Hi ya! Want to say this looks awesome :) really interested in the sharded inference demo!!! You said it was experimental, is it in the examples folder at all?? (On phone atm, so apologies for not investigating further)
First of all, great job! I think inference will become more and more important.

That being said, I have a question regarding ease of use. How difficult is it for someone with a Python/C++ background to get used to Zig and (re)write a model to use with ZML?

Given that the focus is performance, do you have any benchmarks to compare against the likes of TensorRT-LLM?
my dreams have come true. hardware-agnostic ml primitives in a typed, compiled language.

my only question is: is zig stable enough to base such a project on?