The Second Half of Model Architecture
tldr: We spent a decade scaling computation inside layers. We forgot to scale communication between them. That’s about to change.
朱良辉 | Ph.D. at HUST
tldr: We spent a decade scaling computation inside layers. We forgot to scale communication between them. That’s about to change.