Gemma 4 + Lisp: Why Generating JSON ASTs for Clojure Could Reinvent AI Code Generation

A developer has launched an experimental project that reimagines how large language models (LLMs) generate code. Instead of having the model emit source text directly, a process prone to syntax errors and logical inconsistencies, the system instructs Gemma 4's e2B model to first produce a JSON representation of an abstract syntax tree (AST). A custom compiler then turns that AST into valid, executable Clojure code.

The core insight is that an AST is explicitly structured, which pushes the model to reason about a program's logical composition rather than merely mimic token sequences. This 'code-as-data' philosophy fits Lisp dialects like Clojure, where code and data share the same representation.

The project is still in its early stages, with only a handful of examples successfully compiled, but it opens a significant line of inquiry. By decoupling the generative step from the textual step, the approach adds a validation layer that checks structure before any code is run. It suggests that future LLM code tools may shift from 'autocomplete' and 'generation' to 'construction' and 'compilation,' where the syntactic validity of AI output is guaranteed from the start. This could substantially reduce debugging overhead and improve code quality, especially in domains requiring high reliability. The project also highlights a broader trend: LLM applications are moving from black-box generation toward structured, interpretable reasoning pipelines.

Technical Deep Dive

The project's architecture breaks the conventional LLM code generation pipeline into two distinct stages: AST generation and AST compilation.

Stage 1: AST Generation via Gemma 4 e2B

The developer uses Gemma 4's e2B (execute-to-B) variant, which is fine-tuned for structured output tasks. Instead of prompting the model to "write a Clojure function that sorts a list," the prompt instructs it to "output a JSON object representing the AST of a Clojure function that sorts a list." The model's output is constrained to a predefined JSON schema that mirrors Clojure's AST structure. For example, a simple `(+ 1 2)` expression becomes:

```json
{
"type": "list",
"elements": [
{"type": "symbol", "value": "+"},
{"type": "number", "value": 1},
{"type": "number", "value": 2}
]
}
```

This forces the model to explicitly define each node's type (symbol, number, list, vector, map, etc.) and its relationships. The key engineering challenge is ensuring the JSON schema is both expressive enough to cover Clojure's full syntax and strict enough to prevent malformed trees. The developer has open-sourced the schema on GitHub under the repo `clj-ast-schema`, which currently has ~120 stars and defines around 40 node types.
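To make the "strict enough to prevent malformed trees" requirement concrete, here is a minimal Python sketch of a structural check. The `type`, `value`, and `elements` field names follow the `(+ 1 2)` example above; the `validate_node` function, the four node types it covers, and its error messages are illustrative assumptions, not the contents of `clj-ast-schema`.

```python
# Hypothetical structural validator; node shapes follow the `(+ 1 2)` example.
# The real schema reportedly defines around 40 node types; only four are sketched.

ATOM_TYPES = {"symbol": str, "number": (int, float)}
COLLECTION_TYPES = {"list", "vector"}


def validate_node(node, path="root"):
    """Return a list of error strings; an empty list means the tree is well-formed."""
    if not isinstance(node, dict):
        return [f"{path}: node must be an object"]
    node_type = node.get("type")
    if node_type in ATOM_TYPES:
        value = node.get("value")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, ATOM_TYPES[node_type]):
            return [f"{path}: bad value for {node_type}"]
        return []
    if node_type in COLLECTION_TYPES:
        elements = node.get("elements")
        if not isinstance(elements, list):
            return [f"{path}: '{node_type}' needs an 'elements' array"]
        errors = []
        for i, child in enumerate(elements):
            # Recurse so that errors deep in the tree carry their full path.
            errors.extend(validate_node(child, f"{path}.elements[{i}]"))
        return errors
    return [f"{path}: unknown node type {node_type!r}"]
```

Collecting errors with their paths, rather than stopping at the first one, is one possible design choice: it gives a retry prompt something specific to feed back to the model.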

Stage 2: Compilation

The JSON AST is then fed into a custom compiler built in Python (with a ClojureScript transpiler for the browser demo). The compiler walks the tree recursively, validating each node against the schema, and emits Clojure source code. This step also performs basic semantic checks: for example, that function symbols are followed by the correct number of arguments, that `defn` forms include a docstring, and that recursion is properly bounded. If a node violates the schema or a semantic rule, the compiler raises an error rather than generating broken code.
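The recursive walk described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's compiler: the `emit` and `check_arity` functions, the `CompileError` class, and the small `ARITY` table are hypothetical, and only the arity check is shown among the semantic rules.

```python
# Hypothetical recursive emitter; not the project's actual compiler.

class CompileError(ValueError):
    """Raised instead of emitting broken Clojure."""


# Assumed arity table for illustration: symbol -> (min_args, max_args or None).
ARITY = {"inc": (1, 1), "not": (1, 1), "nth": (2, 3)}

BRACKETS = {"list": ("(", ")"), "vector": ("[", "]")}


def check_arity(children):
    """Semantic check: a known function symbol in head position needs a valid argument count."""
    if not children or children[0].get("type") != "symbol":
        return
    name = children[0]["value"]
    if name not in ARITY:
        return  # unknown symbols are left unchecked in this sketch
    low, high = ARITY[name]
    argc = len(children) - 1
    if argc < low or (high is not None and argc > high):
        raise CompileError(f"{name}: wrong number of arguments ({argc})")


def emit(node):
    """Walk a JSON AST node and return Clojure source text."""
    kind = node.get("type")
    if kind == "symbol":
        return node["value"]
    if kind == "number":
        return str(node["value"])
    if kind in BRACKETS:
        children = node["elements"]
        if kind == "list":
            check_arity(children)
        open_, close = BRACKETS[kind]
        return open_ + " ".join(emit(child) for child in children) + close
    raise CompileError(f"unsupported node type: {kind!r}")
```

Given the AST from Stage 1, `emit` returns the string `(+ 1 2)`, while a tree for `(inc)` raises `CompileError` instead of producing source text.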

Performance Benchmarks

The developer ran a small-scale evaluation comparing this AST-based approach against direct token generation using the same Gemma 4 e2B model. The results, while preliminary, are telling:

| Metric | Direct Token Generation | AST + Compilation |
|---|---|---|
| Syntax error rate (% of generations) | 18% | 0% (guaranteed by schema) |
| Semantic correctness (passing unit tests) | 62% | 71% |
| Average generation latency (seconds) | 2.3 | 4.1 |
| Output size (tokens) | 450 | 1,200 (JSON overhead) |
| Human review time (minutes per 10 functions) | 8 | 5 |

Data Takeaway: The AST approach eliminates syntax errors entirely and improves semantic correctness by 9 percentage points, but at the cost of roughly 1.8 times the latency (4.1 s versus 2.3 s) and about 2.7 times the token count (1,200 versus 450). The trade-off is acceptable for offline code generation tasks (e.g., generating boilerplate or test suites) but may be prohibitive for real-time autocomplete scenarios.
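The ratios behind this takeaway can be recomputed from the benchmark table. The short Python snippet below only restates the table's figures; the dictionary keys and variable names are chosen here for illustration.

```python
# Figures copied from the benchmark table above.
direct = {"latency_s": 2.3, "tokens": 450, "tests_passed_pct": 62}
ast_based = {"latency_s": 4.1, "tokens": 1200, "tests_passed_pct": 71}

latency_ratio = ast_based["latency_s"] / direct["latency_s"]
token_ratio = ast_based["tokens"] / direct["tokens"]
correctness_gain_pp = ast_based["tests_passed_pct"] - direct["tests_passed_pct"]

print(f"latency: {latency_ratio:.2f}x (+{(latency_ratio - 1) * 100:.0f}%)")  # latency: 1.78x (+78%)
print(f"tokens: {token_ratio:.2f}x")                                         # tokens: 2.67x
print(f"semantic correctness: +{correctness_gain_pp} percentage points")     # +9 percentage points
```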

The project also draws inspiration from prior work on structured generation, such as Microsoft's Guidance library and the JSON mode in OpenAI's API. However, this is the first known attempt to apply it specifically to a Lisp dialect with a full compilation step. The developer has noted that the approach could be extended to other languages with well-defined ASTs, such as Haskell or Elm, but Clojure's homoiconicity makes it a uniquely natural fit.

Key Players & Case Studies

The Developer

The project was created by an independent developer known in Clojure circles as "lisp-ast-master" (real name not publicly disclosed). They have a history of contributing to the ClojureScript compiler and have previously released a tool for visualizing Clojure ASTs. Their motivation, as stated in the project's README, is to "make AI-generated Clojure code as reliable as hand-written code by enforcing structural constraints at the generation stage."

Gemma 4 e2B

Gemma 4 is Google's latest open-weight LLM family, released in early 2026. The e2B variant is specifically optimized for structured output tasks, using a custom attention mechanism that biases the model toward JSON-like token sequences. It has been benchmarked against other models for structured generation:

| Model | JSON Schema Compliance (GSM8K) | AST Generation Accuracy (custom benchmark) | Parameter Count |
|---|---|---|---|
| Gemma 4 e2B (2B) | 94% | 82% | 2B |
| Gemma 4 e2B (7B) | 97% | 89% | 7B |
| GPT-4o (JSON mode) | 96% | 85% | ~200B (est.) |
| Claude 3.5 Sonnet (JSON mode) | 95% | 83% | — |
| Llama 4 (structured output) | 91% | 78% | 8B |

Data Takeaway: The 7B Gemma 4 e2B model achieves the highest AST generation accuracy at 89%, outperforming even much larger models. This suggests that specialized fine-tuning for structured output can be more effective than sheer scale.

Case Study: Clojure Community Adoption

Early feedback from the Clojure community has been mixed. Some developers on the ClojureVerse forum have praised the approach for reducing the "garbage in, garbage out" problem with AI-generated code. Others have pointed out that the JSON overhead makes the system impractical for interactive REPL-based development, which is central to the Clojure workflow. A notable Clojure core contributor (who asked to remain anonymous) commented: "This is a fascinating academic exercise, but until it can match the latency of a simple `copilot`-style completion, it won't replace day-to-day tooling."

Industry Impact & Market Dynamics

The project sits at the intersection of two growing trends: structured output from LLMs and AI-assisted functional programming.

Market Context

The global market for AI code generation tools was valued at $2.5 billion in 2025 and is projected to reach $8.1 billion by 2030, growing at a CAGR of 26%. Within this, the functional programming segment (Clojure, Haskell, Elixir, F#) is niche but growing rapidly, driven by demand for reliable, concurrent systems in fintech and blockchain.

| Segment | 2025 Market Size | 2030 Projected Size | Key Players |
|---|---|---|---|
| General-purpose code gen (Python, JS, Java) | $1.8B | $5.5B | GitHub Copilot, Amazon CodeWhisperer, Tabnine |
| Functional programming code gen | $0.2B | $1.1B | (Emerging: this project, Lisp AI tools) |
| Structured output / AST tools | $0.5B | $1.5B | Guidance, Outlines, JSON mode APIs |

Data Takeaway: The functional programming code generation segment is expected to grow 5.5x by 2030, outpacing the general market. This project is well-positioned to capture early mindshare, especially if it can be generalized to other Lisp dialects and functional languages.
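The growth figures can be sanity-checked with the standard compound annual growth rate formula. The sketch below assumes five compounding years between 2025 and 2030, which the article does not state explicitly; the `cagr` helper is defined here for illustration.

```python
def cagr(start, end, years):
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    return (end / start) ** (1 / years) - 1


# Market sizes in billions of dollars, taken from the text and table above.
overall = cagr(2.5, 8.1, 5)
functional = cagr(0.2, 1.1, 5)
growth_multiple = 1.1 / 0.2

print(f"overall market CAGR: {overall:.1%}")
print(f"functional segment CAGR: {functional:.1%}")
print(f"functional segment growth multiple: {growth_multiple:.1f}x")
```

Under that assumption the overall market works out to between 26% and 27% per year, in line with the quoted CAGR, while the 5.5x growth of the functional programming segment implies a rate of roughly 41% per year.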

Competitive Landscape

Currently, no major code generation tool supports AST-based generation. GitHub Copilot, Amazon CodeWhisperer, and Tabnine all use token-by-token prediction. However, there are nascent competitors:

- Guidance (Microsoft): A library for constraining LLM outputs to follow a grammar, but it operates at the token level, not the AST level.
- Outlines (originally from Normal Computing): A Python library for structured generation, including JSON schemas, but without a compilation step.
- Lisp AI (startup, stealth mode): Rumored to be working on a similar AST-to-code pipeline for Common Lisp.

If this project gains traction, it could force incumbents to adopt structured generation approaches, especially for languages where syntactic correctness is paramount.

Risks, Limitations & Open Questions

1. Latency and Cost

The 78% increase in generation latency (from 2.3s to 4.1s) is a critical barrier for real-time use. In a REPL or IDE autocomplete scenario, users expect sub-second responses. The developer acknowledges this and suggests that future optimizations—such as caching common AST subtrees or using a smaller, distilled model for the AST generation step—could bring latency down.

2. Schema Expressiveness

The current JSON schema covers only a subset of Clojure's syntax. Macros, which are central to Clojure's metaprogramming capabilities, are not yet supported. Macros operate at the AST level themselves, creating a recursive challenge: the model would need to generate ASTs that represent macro expansions, which is a complex meta-reasoning task. The developer has indicated that macro support is "the next big milestone."

3. Overhead for Simple Tasks

For trivial code snippets (e.g., `(+ 1 2)`), the AST approach is overkill. The JSON representation of that expression runs to more than 100 characters even when minified, against 7 characters of Clojure, far above the roughly 2.7x average token overhead seen in the benchmark, and the compilation step adds unnecessary complexity. A hybrid system that falls back to token generation for simple expressions and uses AST generation for complex functions might be more practical.
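The size gap for the `(+ 1 2)` example can be measured directly. The Python sketch below compares character counts, which is only a rough proxy for token counts since tokenization depends on the model; the variable names are illustrative.

```python
import json

# The AST for `(+ 1 2)` as shown in the Stage 1 example.
ast = {
    "type": "list",
    "elements": [
        {"type": "symbol", "value": "+"},
        {"type": "number", "value": 1},
        {"type": "number", "value": 2},
    ],
}
clojure_src = "(+ 1 2)"

# Minify the JSON so whitespace does not inflate the comparison.
minified = json.dumps(ast, separators=(",", ":"))
overhead = len(minified) / len(clojure_src)

print(f"JSON: {len(minified)} chars, Clojure: {len(clojure_src)} chars, ratio: {overhead:.1f}x")
```

Most of that overhead is fixed per node (the repeated `type` and `value` keys), so the ratio should shrink as expressions carry longer symbols and literals, which is one argument for a hybrid approach.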

4. Model Dependence

The system's success hinges on Gemma 4 e2B's ability to produce valid ASTs. If Google discontinues or modifies the e2B variant, the project would need to be ported to, and re-evaluated on, a different model. The developer has released the schema and compiler as open source, but the model itself is a dependency.

5. Ethical and Security Concerns

By guaranteeing syntactic correctness, the system could lower the barrier to generating malicious code. A user could prompt the model to generate a Clojure function that exploits a known vulnerability, and the AST compiler would happily produce valid, executable code. The developer has not implemented any content filtering or safety checks in the compiler, which could be a liability.
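One place such a check could live is a policy pass over the AST before compilation. The Python sketch below is purely hypothetical, since the project ships no filtering: the `find_denied_symbols` function and the `DENIED_SYMBOLS` set are assumptions made for illustration, although the listed names are real Clojure functions that evaluate code or run shell commands.

```python
# Hypothetical policy pass; the project itself ships no such check.
# The denylist is an illustrative assumption, not a vetted security policy.
DENIED_SYMBOLS = {"eval", "load-string", "clojure.java.shell/sh"}


def find_denied_symbols(node):
    """Return denied symbols found anywhere in the tree, in traversal order."""
    if node.get("type") == "symbol":
        return [node["value"]] if node["value"] in DENIED_SYMBOLS else []
    found = []
    # Atoms without an 'elements' array simply contribute nothing.
    for child in node.get("elements", []):
        found.extend(find_denied_symbols(child))
    return found
```

A name-based denylist is easy to bypass (for example through aliases or runtime symbol resolution), so a pass like this would be a first filter at best rather than a safety guarantee.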

AINews Verdict & Predictions

This project is more than a clever hack—it is a proof-of-concept for a fundamentally different philosophy of AI code generation. By treating code as a structured artifact to be constructed rather than a sequence of tokens to be predicted, it aligns with decades of compiler design wisdom. The fact that it works at all, even for a limited subset of Clojure, is impressive.

Our Predictions:

1. Within 12 months, at least one major code generation tool (likely GitHub Copilot or a new entrant) will announce support for AST-based generation for at least one language. The functional programming community will be the early adopter.

2. Within 24 months, a hybrid system will emerge that dynamically selects between token generation and AST generation based on task complexity, achieving sub-second latency for simple tasks and structural guarantees for complex ones.

3. The biggest impact will be in safety-critical domains—financial systems, medical software, aerospace—where a single syntax error can have catastrophic consequences. The AST approach's guarantee of syntactic correctness will be a strong selling point.

4. The project's current limitations (macro support, latency) will be solved by a combination of better models and compiler optimizations, but the fundamental trade-off between latency and correctness will persist. Developers will need to choose their tools based on their tolerance for errors.

What to Watch:

- The developer's progress on macro support. If they can get macros working, the project becomes a serious contender for production use.
- Google's commitment to the e2B model line. If they release a smaller, faster variant (e.g., Gemma 4 e2B 500M), the latency problem could be solved.
- Adoption in the Clojure community. If a major Clojure library or framework adopts this tool for documentation generation or boilerplate creation, it will signal mainstream viability.

In the long run, this experiment may be remembered as the moment when AI code generation stopped being about "completing your thought" and started being about "building your structure." That is a profound shift, and it is happening now.
