@bluedevil "Some coverage of this project has overstated its implications. To be clear:
Training works, but utilization is low (~2-3% of peak) with significant engineering challenges remaining. Many element-wise operations still fall back to CPU. This does not replace GPU training for anything beyond small research models today."