Language Models

Early results in natural language processing show that a single-layer TPN model (8M parameters, ~O(N) complexity) outpaced the BabyLM Challenge baselines GPT-2 (124M parameters, 12 layers, O(N^2)) and GPT-BERT (30M parameters, 12 layers, O(N^2)). See: https://aclanthology.org/2025.babylm-main.24/
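
To put the complexity figures in perspective, here is a minimal sketch (not the TPN implementation itself, just an illustrative cost model under assumed operation counts) of how per-layer token-mixing cost grows with sequence length for an O(N) sequence mixer versus O(N^2) self-attention:

```python
# Illustrative cost model only: compares how O(N) and O(N^2) token-mixing
# costs scale with sequence length N. This is NOT the TPN architecture,
# just a back-of-the-envelope view of the complexity classes cited above.

def linear_mixing_ops(n_tokens: int, d_model: int) -> int:
    """Rough op count for an O(N) sequence mixer: each token is processed once."""
    return n_tokens * d_model

def quadratic_attention_ops(n_tokens: int, d_model: int) -> int:
    """Rough op count for O(N^2) self-attention: every token interacts with every token."""
    return n_tokens * n_tokens * d_model

if __name__ == "__main__":
    d_model = 768  # hidden size assumed for illustration (GPT-2 small uses 768)
    for n in (128, 512, 2048, 8192):
        lin = linear_mixing_ops(n, d_model)
        quad = quadratic_attention_ops(n, d_model)
        print(f"N={n:5d}  O(N): {lin:>12,} ops   O(N^2): {quad:>15,} ops   gap: {quad // lin:,}x")
```

The gap between the two grows linearly with N, which is why the complexity difference matters more as context length increases.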