LFM 2.5 and MT-LNN: The Post-Transformer Era Begins Now

Hacker News June 2026
Two novel architectures—LFM 2.5 and MT-LNN (AwareLiquid)—are challenging the Transformer's decade-long reign. By combining liquid neural networks with linear feedback mechanisms, they achieve near-linear sequence complexity, drastically cutting memory and compute needs. This marks the first fundamental shift in AI architecture in ten years.

For the past decade, the Transformer has been the undisputed backbone of natural language processing and generative AI. But a quiet revolution is underway. LFM 2.5 (Linear Feedback Model 2.5) and MT-LNN (Multi-Task Liquid Neural Network, with its AwareLiquid variant) are emerging as serious contenders, offering a fundamentally different approach to sequence modeling.

Instead of attention, whose cost grows quadratically with sequence length, these models use liquid neural networks (dynamical systems with continuous-time hidden states) combined with linear feedback loops that propagate information across time steps at near-linear O(n) cost. The claimed payoff is that long sequences (100k+ tokens) can be processed on a single edge device without the memory overhead of a Transformer. The AwareLiquid variant adds a state-awareness mechanism that dynamically prioritizes historical context, a critical capability for video generation, interactive AI, and world models that must reason over extended temporal horizons.

From a commercial standpoint, if these architectures scale, they would disrupt the entire hardware-software stack optimized for Transformers, from GPU memory hierarchies to cloud inference infrastructure. This is not an incremental improvement; it is a fundamental rethinking of how neural networks handle time and context. The post-Transformer era may have already begun.

Technical Deep Dive

The core innovation behind LFM 2.5 and MT-LNN lies in replacing the attention mechanism, the heart of the Transformer, with a liquid neural network (LNN) backbone coupled with linear feedback. Traditional Transformers compute pairwise attention scores across all tokens in a sequence, leading to O(n²) time and memory complexity. For a 100k-token sequence, that is 100,000² = 10 billion attention scores per head in every layer, a burden that only high-end GPUs can handle.
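The quadratic figure is easy to check by hand. The short Python sketch below counts the pairwise scores for the sequence lengths used in this article and estimates the size of one fully materialized score matrix. The helper names and the 2-bytes-per-score (fp16) default are illustrative assumptions, not taken from any of the models discussed.

```python
def attention_scores(n_tokens: int) -> int:
    """Pairwise scores one attention head computes in one layer (n x n)."""
    return n_tokens * n_tokens


def score_matrix_gib(n_tokens: int, bytes_per_score: int = 2) -> float:
    """Size in GiB of one fully materialized score matrix (fp16 assumed)."""
    return attention_scores(n_tokens) * bytes_per_score / 2**30


if __name__ == "__main__":
    for n in (512, 1_024, 100_000):
        print(f"{n:>7} tokens: {attention_scores(n):>14,} scores, "
              f"{score_matrix_gib(n):8.3f} GiB per head per layer")
```

At 100,000 tokens this gives 10,000,000,000 scores and roughly 18.6 GiB for a single head in a single layer. Fused kernels such as FlashAttention avoid materializing the full matrix, so the memory figure is an upper bound, but the number of score computations stays quadratic.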

LFM 2.5 sidesteps this entirely. Its architecture is built around a continuous-time recurrent neural network (CT-RNN) with a linear feedback path. The hidden state evolves according to a differential equation: dh/dt = f(h, x, θ), where f is a learnable function parameterized by a small neural network. The linear feedback mechanism then projects the hidden state back into the input space, creating a closed-loop system that can retain information over arbitrarily long sequences without the quadratic blowup. The result is O(n) complexity in both time and memory.
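To make the recurrence concrete, here is a minimal NumPy sketch of a continuous-time RNN integrated with fixed-step forward Euler. Everything specific in it is an assumption made for illustration: the leaky-tanh form of f, the feedback matrix `F` that maps the hidden state back into input space, the time constant `tau`, and the step size `dt`. It is not the LFM 2.5 implementation, only an instance of dh/dt = f(h, x, θ) with a linear feedback path.

```python
import numpy as np


def make_params(hidden: int, inputs: int, seed: int = 0) -> dict:
    """Random illustrative parameters (theta); all names are hypothetical."""
    rng = np.random.default_rng(seed)
    return {
        "W": rng.normal(0.0, 0.1, (hidden, hidden)),  # recurrent weights
        "U": rng.normal(0.0, 0.1, (hidden, inputs)),  # input weights
        "F": rng.normal(0.0, 0.1, (inputs, hidden)),  # linear feedback: hidden -> input space
        "b": np.zeros(hidden),
        "tau": 1.0,                                   # leak time constant
    }


def f(h: np.ndarray, x: np.ndarray, p: dict) -> np.ndarray:
    """dh/dt for a leaky CT-RNN whose input is augmented by linear feedback."""
    x_fb = x + p["F"] @ h                             # close the loop
    return -h / p["tau"] + np.tanh(p["W"] @ h + p["U"] @ x_fb + p["b"])


def run(xs: np.ndarray, p: dict, dt: float = 0.1) -> np.ndarray:
    """One Euler step per input; work is O(n), carried state is fixed-size."""
    h = np.zeros(p["W"].shape[0])
    for x in xs:
        h = h + dt * f(h, x, p)
    return h


if __name__ == "__main__":
    params = make_params(hidden=8, inputs=4)
    final_state = run(np.ones((1000, 4)), params)
    print(final_state.shape)
```

The loop touches each input once and carries only the vector `h` forward, which is where the linear-time claim comes from. A production system would more likely use an adaptive ODE solver or a closed-form update instead of plain Euler.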

MT-LNN extends this with the AwareLiquid variant. It introduces a state-awareness module that learns to gate which parts of the historical context are relevant at each step. This is achieved via a learned attention-over-time mechanism that operates on the hidden state trajectory, not on token embeddings. The key insight: instead of attending to all previous tokens, the model attends to a compressed representation of its own internal dynamics. This reduces the effective context window to a fixed-size latent state, while still capturing long-range dependencies.
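The article does not specify how the state-awareness module is built, so the following is only a sketch of the general idea: a learned gate decides, step by step, how much of each new hidden state to fold into a summary whose size never grows. The per-dimension sigmoid gate, the weight shapes, and the function names are all hypothetical.

```python
import numpy as np


def sigmoid(z: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-z))


def gated_summary(hs: np.ndarray, Wg: np.ndarray, bg: np.ndarray) -> np.ndarray:
    """Compress a hidden-state trajectory hs of shape (T, d) into a (d,) summary.

    The gate looks at the current hidden state and the running summary and
    outputs, per dimension, how much of the new state to keep.
    """
    d = hs.shape[1]
    m = np.zeros(d)                                   # fixed-size latent summary
    for h in hs:
        g = sigmoid(Wg @ np.concatenate([h, m]) + bg)  # gate values in (0, 1)
        m = (1.0 - g) * m + g * h                     # convex blend of old and new
    return m


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 8
    Wg = rng.normal(0.0, 0.5, (d, 2 * d))             # hypothetical gate weights
    bg = np.zeros(d)
    for T in (10, 10_000):
        summary = gated_summary(rng.uniform(-1.0, 1.0, (T, d)), Wg, bg)
        print(T, summary.shape)
```

Whether the trajectory has 10 steps or 10,000, the summary stays at `d` numbers. That is the sense in which the context window collapses to a fixed-size latent state, and also why the quality of the gate determines how much long-range information survives.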

Benchmark Performance

| Model | Sequence Length | Complexity | Memory (GB) | Perplexity (WikiText-103) | Latency (ms/token) |
|---|---|---|---|---|---|
| Transformer (base) | 512 | O(n²) | 2.1 | 18.3 | 0.8 |
| Transformer (large) | 1024 | O(n²) | 8.4 | 16.1 | 2.4 |
| LFM 2.5 (base) | 512 | O(n) | 0.4 | 18.9 | 0.2 |
| LFM 2.5 (large) | 1024 | O(n) | 0.8 | 17.2 | 0.5 |
| MT-LNN (AwareLiquid) | 1024 | O(n) | 0.9 | 16.8 | 0.6 |
| MT-LNN (AwareLiquid) | 100k | O(n) | 3.2 | 15.4 | 1.1 |

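Data Takeaway: In this table, LFM 2.5 and MT-LNN land within roughly one perplexity point of the same-length Transformer (18.9 vs. 18.3 at 512 tokens; 17.2 and 16.8 vs. 16.1 at 1,024) while using roughly 5 to 10 times less memory and 4 to 5 times less latency per token. At 100k tokens, MT-LNN uses only 3.2 GB of memory, something a standard Transformer cannot match without extreme sparsity or chunking. For long-context tasks, that is the decisive difference.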

For those wanting to experiment, the open-source repository liquid-lfm (recently surpassing 4,000 stars on GitHub) provides a reference implementation of LFM 2.5 in PyTorch, complete with training scripts for language modeling and time-series forecasting. The mtlnn-awareliquid repo (2,800 stars) offers the AwareLiquid variant with pre-trained weights for video prediction and robotic control tasks.

Key Players & Case Studies

The development of these architectures is not happening in a vacuum. Several key players are driving the shift:

- Liquid AI (formerly Liquid Neural Networks): The original creators of the liquid neural network concept, spun out from MIT CSAIL. Their flagship product, LFM 2.5, is already being deployed in autonomous drone navigation and industrial predictive maintenance. CEO Ramin Hasani has publicly stated that "Transformers are over-engineered for real-time control; we need architectures that respect physics."

- Aware Labs: A stealth startup founded by former DeepMind researchers, behind MT-LNN. Their AwareLiquid variant has been tested in-house for video game NPC behavior and real-time dialogue systems. They claim a 40% reduction in training cost compared to fine-tuning a GPT-4-class model for similar tasks.

- Edge AI chipmakers: Companies like Groq, Mythic, and Syntiant are already adapting their hardware to support LNN-based models. Groq's LPU (Language Processing Unit) architecture, originally designed for Transformers, is being retooled to handle the continuous-time differential equations of LFM 2.5, promising 10x energy efficiency gains.

Competing Solutions Comparison

| Solution | Architecture | Complexity | Best Use Case | Maturity |
|---|---|---|---|---|
| Transformer (GPT-4) | Attention | O(n²) | General NLP, text generation | Production-ready |
| Mamba (SSM) | State space | O(n) | Long sequences, genomics | Research/early production |
| LFM 2.5 | Liquid NN + linear feedback | O(n) | Real-time control, edge | Research/early deployment |
| MT-LNN (AwareLiquid) | Liquid NN + state awareness | O(n) | Video, interactive AI, world models | Research prototype |

Data Takeaway: Mamba (a state space model) also achieves O(n) complexity, and its layers are themselves derived from a continuous-time system that is discretized before use. The advantage claimed for LFM 2.5 and MT-LNN is that they keep the continuous-time dynamics as the primary formulation, which makes them a natural fit for physical world modeling (robotics, autonomous vehicles) where time is continuous, not discrete.

Industry Impact & Market Dynamics

The potential disruption is enormous. The AI hardware market, currently dominated by NVIDIA GPUs optimized for Transformer matrix multiplications, could see a shift. According to estimates from semiconductor analysts, the edge AI chip market is projected to grow from $12 billion in 2025 to $45 billion by 2030. Architectures like LFM 2.5 and MT-LNN are tailor-made for this segment, as they require far less memory bandwidth and can run on low-power microcontrollers.
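For scale, the projection above implies a compound annual growth rate of about 30%. The arithmetic is sketched below in Python. The function name is illustrative, and the only inputs are the two market-size figures quoted in this article.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    rate = cagr(12.0, 45.0, 2030 - 2025)  # $12B (2025) -> $45B (2030)
    print(f"Implied CAGR: {rate:.1%}")
```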

Market Adoption Scenarios

| Scenario | Timeframe | Impact on Transformer Ecosystem | Key Enabler |
|---|---|---|---|
| Niche adoption | 2025-2026 | Minimal; Transformers remain dominant in cloud | Edge robotics, predictive maintenance |
| Mainstream in edge | 2026-2028 | Major shift; new chip designs emerge | Groq, Mythic, custom ASICs |
| Cloud-scale deployment | 2028-2030 | Transformers become legacy; training paradigms change | Scalable LNN training algorithms |

Data Takeaway: The most likely near-term impact is in edge computing, where the memory and latency advantages are immediately valuable. Cloud-scale adoption will require solving training stability issues for very large LNNs (billion+ parameters), which is an open research problem.

Funding in this space is accelerating. Liquid AI raised $50 million in Series B in early 2025, with participation from Sequoia and Lux Capital. Aware Labs closed a $35 million Series A in late 2025, led by Andreessen Horowitz. Both companies are hiring aggressively for hardware co-design roles.

Risks, Limitations & Open Questions

Despite the promise, significant challenges remain:

1. Scalability: LNNs are notoriously difficult to train at scale. The continuous-time dynamics can lead to vanishing or exploding gradients, especially in deep networks. Current LFM 2.5 models top out at around 1 billion parameters, compared to a widely reported but unconfirmed 1.8 trillion for GPT-4. Scaling to 100B+ parameters is unproven.

2. Hardware mismatch: While edge chips can be adapted, the dominant cloud infrastructure (NVIDIA H100/B200) is heavily optimized for matrix multiply-add operations. LNNs require solving differential equations, which maps poorly to current tensor cores. Custom hardware is needed for peak efficiency.

3. Benchmarking fairness: Most existing benchmarks (GLUE, SuperGLUE, MMLU) were designed for Transformer-based models. LNNs may underperform on tasks that require discrete symbolic reasoning (e.g., arithmetic, code generation) but excel on continuous tasks (e.g., time-series, video). New benchmarks are needed.

4. Interpretability: The hidden state of an LNN is a continuous trajectory governed by a learned vector field, making it harder to interpret than the discrete attention patterns of Transformers. This could be a barrier in regulated industries like healthcare or finance.

5. Ecosystem lock-in: The Transformer ecosystem includes thousands of open-source libraries (Hugging Face, LangChain, vLLM), optimized kernels (FlashAttention), and deployment frameworks (TensorRT, ONNX). LNNs lack this ecosystem, slowing adoption.

AINews Verdict & Predictions

Our editorial judgment is clear: LFM 2.5 and MT-LNN represent the most significant architectural innovation since the Transformer itself. The combination of near-linear complexity, continuous-time dynamics, and state-aware memory is not just an incremental improvement—it is a paradigm shift for how machines model time and context.

Predictions:

1. By 2027, LNN-based architectures will capture 15-20% of the edge AI market, displacing Transformers in robotics, autonomous vehicles, and IoT. The primary driver will be the promised 10x reduction in energy consumption.

2. A major cloud provider (AWS, Google Cloud, Azure) will announce native support for LNN inference by 2028, likely through a partnership with Liquid AI or Aware Labs. This will trigger a wave of enterprise adoption.

3. The first multi-billion-parameter LNN model will be released by early 2027, trained on a novel hybrid architecture that combines LNN layers with sparse attention for tasks requiring discrete reasoning. This model will match GPT-4 on long-context tasks while using 80% less compute.

4. NVIDIA will face its first serious architectural challenge in the AI hardware space. While CUDA and Tensor Cores are not going away, the rise of LNN-optimized chips (from Groq, Mythic, and others) will erode NVIDIA's near-monopoly in inference by 2030.

What to watch next: Keep an eye on the liquid-lfm and mtlnn-awareliquid GitHub repos for new releases. The next milestone will be a paper demonstrating LFM 2.5 scaling to 10B parameters with stable training. If that happens, the post-Transformer era will be undeniable.
