RelaxAI Slashes Inference Costs 80%: Challenging OpenAI and Claude's Dominance

Source: Hacker News · AI inference · Archive: May 2026
British startup RelaxAI has unveiled a sovereign inference service for large language models, claiming its costs are just 20% of those of OpenAI and Anthropic's Claude. By optimizing its inference architecture and relying on local infrastructure, the service promises enterprise-grade performance at a fraction of the price.

RelaxAI, a UK-based AI startup, has launched a sovereign large language model inference service that it claims reduces costs by 80% compared to offerings from OpenAI and Anthropic. The company achieves this through a combination of advanced quantization techniques, speculative decoding, and dynamic batching, all running on UK-based data centers to ensure GDPR compliance. This move directly challenges the pricing hegemony of American AI giants and signals a shift toward localized, cost-efficient AI infrastructure. While independent benchmarks are pending, RelaxAI's approach could democratize access to real-time AI applications like customer service and document analysis for European enterprises. The service's 'sovereign' label taps into growing concerns over data sovereignty, potentially giving it a unique competitive advantage in the EU market. If successful, RelaxAI may force a broader industry re-evaluation of inference pricing, accelerating AI adoption across sectors.

Technical Deep Dive

RelaxAI's claimed 80% cost reduction is not a simple price war but the result of a carefully engineered inference stack. The company has not open-sourced its full architecture, but based on technical disclosures and industry analysis, several key innovations stand out.

Advanced Quantization: RelaxAI employs a proprietary mixed-precision quantization scheme that reduces model weights from FP16 to INT4/INT8 without significant accuracy loss. Unlike standard post-training quantization, their method uses adaptive calibration datasets tailored to enterprise use cases (e.g., legal document summarization, customer support). This reduces memory bandwidth requirements by up to 4x, directly lowering per-token costs.
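RelaxAI's adaptive-calibration scheme is proprietary, but the underlying idea can be sketched with plain symmetric post-training INT8 quantization. The sketch below is a generic illustration, not RelaxAI's method: weights are mapped onto the integer range [-127, 127] with a single scale factor, shrinking storage 4x versus FP32 (2x versus FP16) at the cost of a bounded rounding error.

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor INT8 quantization: map weights onto [-127, 127]."""
    scale = float(np.abs(weights).max()) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate FP32 weights from the INT8 tensor."""
    return q.astype(np.float32) * scale

# A toy FP32 layer: the round-trip error is bounded by half a quantization
# step, while storage drops from 4 bytes to 1 byte per weight.
w = np.random.default_rng(0).standard_normal((256, 256)).astype(np.float32)
q, scale = quantize_int8(w)
max_err = float(np.abs(dequantize(q, scale) - w).max())
```

Production schemes (including, presumably, RelaxAI's) refine this with per-channel scales and calibration data so that the error lands where the workload can tolerate it.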

Speculative Decoding: The service uses a smaller, faster 'draft' model to generate candidate tokens, which are then verified by the main model. This technique, popularized by DeepMind and others, can achieve 2-3x speedups in latency-constrained scenarios. RelaxAI claims to have optimized the draft model selection dynamically based on input complexity, further improving efficiency.
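The draft-and-verify loop can be shown with a toy greedy sketch (real systems verify with accept/reject sampling over probabilities; RelaxAI's dynamic draft selection is not public). Here `target_next` and `draft_next` are hypothetical stand-ins for the expensive and cheap models: the draft proposes `k` tokens, one target pass verifies them, and the output is provably identical to decoding with the target alone.

```python
import random

def speculative_decode(target_next, draft_next, prompt, n_new, k=4):
    """Greedy speculative decoding sketch: a cheap draft proposes k tokens,
    one expensive target pass verifies them; output matches target-only
    decoding, but with far fewer target calls when the draft is accurate."""
    out = list(prompt)
    target_calls = 0
    while len(out) - len(prompt) < n_new:
        # Draft proposes k tokens autoregressively (cheap calls).
        ctx = list(out)
        proposals = []
        for _ in range(k):
            t = draft_next(ctx)
            proposals.append(t)
            ctx.append(t)
        # One expensive target pass scores every proposed position at once.
        target_calls += 1
        ctx = list(out)
        for t in proposals:
            expected = target_next(ctx)
            if t != expected:
                out.append(expected)   # mismatch: keep the target's token
                break
            out.append(t)
            ctx.append(t)
    return out[len(prompt):len(prompt) + n_new], target_calls

# Toy deterministic "target" model and a draft that agrees ~80% of the time.
def target_next(ctx):
    return (sum(ctx) * 31 + 7) % 100

_rng = random.Random(0)
def draft_next(ctx):
    t = target_next(ctx)
    return t if _rng.random() < 0.8 else (t + 1) % 100
```

With an 80%-accurate draft and k=4, each expensive pass advances by several tokens on average, which is where the claimed 2-3x speedup comes from.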

Dynamic Batching & Continuous Batching: Instead of static batch sizes, RelaxAI's inference server uses continuous batching, where requests are processed as they arrive, maximizing GPU utilization. This is similar to techniques used in vLLM, a popular open-source inference engine (GitHub: vllm-project/vllm, over 30,000 stars). However, RelaxAI claims to have added a proprietary scheduling algorithm that prioritizes low-latency requests without starving throughput.
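The payoff of continuous batching over static batching can be seen in a toy simulation (RelaxAI's latency-aware scheduler is proprietary; this models only the vLLM-style slot reuse). Each engine step decodes one token for every active request, and a finished request frees its slot immediately rather than idling until the whole batch drains.

```python
from collections import deque

def continuous_batching(requests, max_batch=4):
    """Simulated continuous batching: each step decodes one token for every
    active request; finished requests free their slot immediately, so
    waiting requests join mid-flight instead of waiting for a full batch."""
    waiting = deque(requests)   # (request_id, tokens_to_generate)
    active, finished_at, steps = {}, {}, 0
    while waiting or active:
        # Admit waiting requests into any free slots before each step.
        while waiting and len(active) < max_batch:
            rid, n = waiting.popleft()
            active[rid] = n
        steps += 1
        for rid in list(active):
            active[rid] -= 1            # one decoded token per step
            if active[rid] == 0:
                finished_at[rid] = steps
                del active[rid]         # slot freed immediately
    return steps, finished_at

# Short requests "a" and "c" finish at step 2, freeing slots so "e" can join
# at step 3; static batching would need 8 + 2 = 10 steps for this workload.
steps, done = continuous_batching(
    [("a", 2), ("b", 8), ("c", 2), ("d", 8), ("e", 2)], max_batch=4)
```

The same five requests finish in 8 steps instead of 10, and the gap widens as request lengths get more skewed, which is exactly the regime real traffic lives in.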

Infrastructure Optimization: By running on UK-based data centers (likely using AWS or Azure's London regions), RelaxAI avoids transatlantic data transfer costs and latency. More importantly, it leverages cheaper renewable energy and local tax incentives, contributing to the cost advantage.

Benchmark Claims: RelaxAI has published preliminary performance data on its blog. While independent verification is needed, the numbers are striking:

| Metric | RelaxAI | OpenAI GPT-4o | Anthropic Claude 3.5 Sonnet |
|---|---|---|---|
| Cost per 1M input tokens | $1.00 | $5.00 | $3.00 |
| Cost per 1M output tokens | $4.00 | $15.00 | $15.00 |
| Latency (avg, 100 tokens) | 350ms | 400ms | 380ms |
| MMLU Score (claimed) | 87.2 | 88.7 | 88.3 |

Data Takeaway: RelaxAI's cost advantage is clear, but the MMLU score is slightly lower. For many enterprise applications, the trade-off between a 1-2% accuracy drop and an 80% cost reduction will be acceptable, especially for high-volume, real-time tasks.

Key Players & Case Studies

RelaxAI is not operating in a vacuum. Several other players are pursuing similar cost-reduction strategies, though none have yet claimed such dramatic savings.

Competitors:
- Together AI: Offers inference APIs with competitive pricing (~$0.50/1M tokens for Llama 3 70B) but lacks the 'sovereign' angle.
- Fireworks AI: Focuses on fast inference with optimized models, but pricing is still higher than RelaxAI's claim.
- Groq: Uses custom LPU hardware for ultra-low latency, but costs are comparable to OpenAI.
- European Challengers: German startup Aleph Alpha and French Mistral AI offer sovereign AI but with higher prices.

Case Study: European Enterprise Adoption
Consider a large German insurance company processing 10 million customer queries per month. Using OpenAI GPT-4o, the cost would be approximately $50,000/month (assuming 500 tokens per query). With RelaxAI, the same workload would cost roughly $10,000/month, a saving of $40,000 per month, or $480,000 annually. Moreover, because data stays in the UK/EU, GDPR compliance is simplified, reducing legal overhead.
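The case-study arithmetic can be reproduced from the per-token prices in the benchmark table. The even 250-input/250-output split below is an assumption (the article only states 500 tokens per query); with that mix, GPT-4o lands at $50,000/month while RelaxAI's table prices give $12,500, slightly above the article's $10,000 figure, which applies the headline 80% cut to the total.

```python
def monthly_cost(queries, tok_in, tok_out, price_in, price_out):
    """USD per month; price_in/price_out are USD per 1M tokens."""
    return queries * (tok_in * price_in + tok_out * price_out) / 1_000_000

Q = 10_000_000                                     # queries per month
# Assumption: the 500 tokens/query split evenly, 250 input + 250 output.
gpt4o_cost = monthly_cost(Q, 250, 250, 5.00, 15.00)
relax_cost = monthly_cost(Q, 250, 250, 1.00, 4.00)
annual_saving = (gpt4o_cost - relax_cost) * 12
```

Because output tokens dominate the bill at these prices, the effective discount depends on the input/output mix, so per-workload modeling matters more than headline percentages.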

Comparison Table:

| Feature | RelaxAI | OpenAI | Anthropic | Mistral AI |
|---|---|---|---|---|
| Sovereign (EU data) | Yes | No | No | Yes |
| Cost per 1M tokens (input) | $1.00 | $5.00 | $3.00 | $2.50 |
| Model size (est.) | ~70B | ~200B | ~200B | ~70B |
| Open-source model | No | No | No | Yes (Mistral 7B) |
| Latency (avg) | 350ms | 400ms | 380ms | 450ms |

Data Takeaway: RelaxAI's combination of low cost and data sovereignty gives it a unique position, but the closed-source nature may deter some open-source advocates.

Industry Impact & Market Dynamics

RelaxAI's entry could reshape the AI inference market in several ways.

Pricing Pressure: The most immediate impact is on pricing. If RelaxAI can maintain quality, OpenAI and Anthropic may be forced to lower their prices, especially for European customers. This could trigger a price war, benefiting consumers but squeezing margins for AI companies.

Sovereign AI Movement: RelaxAI's 'sovereign' branding taps into a growing geopolitical trend. The EU's AI Act and GDPR create a regulatory moat that favors local providers. We may see a wave of similar startups in other regions (e.g., Southeast Asia, Latin America) offering localized inference.

Market Size: The global AI inference market was valued at $18.5 billion in 2024 and is projected to reach $87.5 billion by 2030 (CAGR 29.5%). Even capturing 5% of this market would give RelaxAI a $4.4 billion revenue opportunity by 2030.
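A quick sanity check shows the cited figures are internally consistent: compounding the 2024 base at the stated CAGR does reach roughly the 2030 projection, and 5% of that market is about $4.4 billion.

```python
def project(value_billions, cagr, years):
    """Compound a market size forward: value * (1 + CAGR)^years."""
    return value_billions * (1 + cagr) ** years

market_2030 = project(18.5, 0.295, 6)   # roughly 87 ($B), per the article
relax_share_2030 = 0.05 * market_2030   # ~5% share, roughly 4.4 ($B)
```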

Funding & Growth: RelaxAI has raised $50 million in Series A from undisclosed European VCs. This is modest compared to OpenAI's billions, but it reflects a lean, focused approach.

Data Table: Market Projections

| Year | Global Inference Market ($B) | RelaxAI Market Share (est.) | RelaxAI Revenue ($B) |
|---|---|---|---|
| 2025 | 24.0 | 0.5% | 0.12 |
| 2026 | 31.2 | 1.5% | 0.47 |
| 2027 | 40.6 | 3.0% | 1.22 |
| 2028 | 52.8 | 4.0% | 2.11 |
| 2029 | 68.6 | 5.0% | 3.43 |
| 2030 | 87.5 | 5.0% | 4.38 |

Data Takeaway: Even a modest market share translates into substantial revenue, making RelaxAI a credible long-term player.

Risks, Limitations & Open Questions

Despite the promise, several risks remain.

Performance Verification: Independent benchmarks are crucial. RelaxAI's claimed MMLU score of 87.2 needs third-party validation. If the actual score is lower (e.g., 85), the cost advantage may not compensate for quality loss in high-stakes applications like legal or medical advice.

Scalability: RelaxAI's current infrastructure may not handle sudden demand spikes. OpenAI's massive GPU clusters provide resilience that a startup may lack.

Model Quality: RelaxAI uses a proprietary model, likely based on open-source architectures (e.g., Llama 3). If the base model improves, RelaxAI must keep pace. They may lack the research depth of OpenAI or Anthropic.

Regulatory Risks: The 'sovereign' label could attract scrutiny. If RelaxAI's data centers are found to have any US ties, the GDPR advantage evaporates.

Vendor Lock-in: Enterprises may hesitate to commit to a single provider, especially one without a long track record.

AINews Verdict & Predictions

RelaxAI represents a significant shift in the AI inference market. Its focus on cost efficiency rather than raw model size is a strategic masterstroke, especially for the price-sensitive European enterprise market.

Prediction 1: Within 12 months, OpenAI and Anthropic will introduce 'sovereign' pricing tiers for European customers, cutting prices by 30-50% to compete.

Prediction 2: RelaxAI will be acquired within 2 years by a larger European tech company (e.g., SAP, Siemens) seeking to bolster its AI stack, likely for $1-2 billion.

Prediction 3: The 'sovereign inference' model will be replicated in other regions (e.g., India, Brazil) by local startups, fragmenting the global inference market.

What to Watch: Independent benchmarks from Stanford's HELM or LMSYS's Chatbot Arena. If RelaxAI scores within 2% of GPT-4o, its success is all but assured. Also, watch for partnerships with European cloud providers (e.g., OVHcloud, Deutsche Telekom) that could expand its reach.

RelaxAI is not just a cost play; it's a strategic bet on a multipolar AI world. The question is not whether it will succeed, but how quickly the incumbents will adapt.
