LLM inference engine from scratch in C++ – why output tokens cost 5x

(anirudhsathiya.com)

9 points | by ani17 2 days ago

5 comments