Technical Deep Dive
The philipbrown/handwritten-digits project implements a fully connected feedforward neural network with one hidden layer, using sigmoid hidden activations, a softmax output layer, and cross-entropy loss. Its defining characteristic is the pure Elixir implementation: every matrix multiplication, activation function, and gradient-descent update is written in Elixir, relying on the language's built-in data structures (lists, tuples, maps) and pattern matching for control flow.
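To make the "pure Elixir, nested lists" approach concrete, here is a minimal sketch of what a matrix product looks like under those constraints. The module and function names are illustrative, not the repo's actual API:

```elixir
# Sketch: matrix multiply over nested lists (lists of rows), in the
# spirit of the project's dependency-free approach. Requires Elixir
# 1.12+ for Enum.zip_with/2.
defmodule PureMatrix do
  # Multiply an m x n matrix by an n x p matrix, both given as lists of rows.
  def matmul(a, b) do
    bt = transpose(b)

    for row <- a do
      for col <- bt do
        # Dot product of one row of `a` with one column of `b`.
        row |> Enum.zip(col) |> Enum.map(fn {x, y} -> x * y end) |> Enum.sum()
      end
    end
  end

  # Transposing a list of rows yields the columns as rows.
  def transpose(m), do: Enum.zip_with(m, & &1)
end
```

Every element is a boxed term and every row a linked list, which is exactly why this style is so much slower than a contiguous-memory tensor library.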
Architecture:
- Input layer: 784 neurons (28x28 pixel images flattened)
- Hidden layer: 100 neurons with sigmoid activation
- Output layer: 10 neurons (one per digit 0-9) with softmax
- Training: Stochastic gradient descent (SGD) with learning rate 0.5, no momentum or adaptive methods
- Batch size: 1 (truly online learning)
- Epochs: 10 (default)
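The forward pass implied by this architecture (sigmoid hidden layer, softmax output) can be sketched as follows. Weight shapes, function names, and the `{w1, b1, w2, b2}` tuple are assumptions for illustration, not the project's actual code:

```elixir
# Sketch of the 784 -> 100 -> 10 forward pass described above.
defmodule Forward do
  def sigmoid(x), do: 1.0 / (1.0 + :math.exp(-x))

  def softmax(zs) do
    m = Enum.max(zs)                          # subtract max for numerical stability
    exps = Enum.map(zs, &:math.exp(&1 - m))
    total = Enum.sum(exps)
    Enum.map(exps, &(&1 / total))
  end

  # weights: list of rows (one row per neuron); biases: flat list.
  def layer(input, weights, biases, activation) do
    weights
    |> Enum.zip(biases)
    |> Enum.map(fn {row, b} ->
      z = row |> Enum.zip(input) |> Enum.map(fn {w, x} -> w * x end) |> Enum.sum()
      activation.(z + b)
    end)
  end

  def predict(x, {w1, b1, w2, b2}) do
    hidden = layer(x, w1, b1, &sigmoid/1)
    # Output layer: linear pre-activations, then softmax across the 10 logits.
    zs = layer(hidden, w2, b2, & &1)
    softmax(zs)
  end
end
```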
The network uses a manual backpropagation implementation. The gradient for each weight is computed by iterating over training examples one at a time, updating weights immediately. This is computationally expensive and prone to noisy gradients, contributing to the lower accuracy.
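A single online update of this kind can be sketched for the output layer, using the fact that the softmax + cross-entropy combination gives an output-layer error of simply `prediction - target`. Names and shapes here are illustrative assumptions, not the repo's code:

```elixir
# Sketch of one batch-size-1 SGD step on the output-layer weights,
# applied immediately after each example (learning rate 0.5, as above).
defmodule SgdStep do
  @learning_rate 0.5

  # hidden: activations feeding the output layer; probs/target: output vectors.
  # weights: list of rows, one row per output neuron.
  def update_output_layer(weights, hidden, probs, target) do
    # Output-layer error for softmax + cross-entropy: delta_i = p_i - y_i.
    deltas = Enum.zip_with([probs, target], fn [p, y] -> p - y end)

    weights
    |> Enum.zip(deltas)
    |> Enum.map(fn {row, d} ->
      # w_ij <- w_ij - lr * delta_i * h_j, rebuilding the whole row.
      Enum.zip_with([row, hidden], fn [w, h] -> w - @learning_rate * d * h end)
    end)
  end
end
```

Because the weight lists are rebuilt on every single example, each step pays allocation and traversal costs that a vectorized mini-batch update would amortize.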
Performance Benchmarks:
| Metric | philipbrown/handwritten-digits | Python (NumPy) Reference | TensorFlow/Keras (CPU) |
|---|---|---|---|
| Test Accuracy | ~82% | ~92% | ~99.2% |
| Training Time (10 epochs) | ~45 minutes | ~2 minutes | ~30 seconds |
| Lines of Code | ~500 | ~100 (with NumPy) | ~20 (high-level API) |
| GPU Support | No | No (NumPy) | Yes |
| Automatic Differentiation | No | No | Yes |
Data Takeaway: The Elixir implementation is 22x slower than a Python NumPy baseline and 90x slower than TensorFlow on CPU, with 17 percentage points lower accuracy. This starkly illustrates the cost of avoiding optimized linear algebra libraries and automatic differentiation.
The project uses Elixir's `Enum` and `Stream` modules for data processing, and `Agent` for managing model state during training. The lack of a tensor library means all operations are done on nested lists, which carry substantial per-element memory overhead (boxed floats and cons cells) and poor cache locality for matrix operations. For comparison, the `Nx` library (Elixir's native numerical computing library) would provide tensor operations with GPU support, but the project deliberately avoids this dependency to remain "pure Elixir."
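The `Agent`-based state management mentioned above can be sketched minimally; the state shape and function names are assumptions, not the project's actual module:

```elixir
# Sketch: holding model weights in an Agent so that training code can
# read and replace them without threading state through every function.
defmodule ModelState do
  def start(initial_weights) do
    Agent.start_link(fn -> initial_weights end)
  end

  def get(pid), do: Agent.get(pid, & &1)

  # update_fun receives the current weights and returns the new ones.
  def apply_update(pid, update_fun) do
    Agent.update(pid, update_fun)
  end
end
```

Note that the Agent only relocates the immutability cost: each `apply_update` still constructs a fresh weight structure rather than mutating in place.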
Key Engineering Trade-offs:
- Concurrency: Elixir's lightweight processes could theoretically parallelize forward passes across examples, but the current implementation is sequential. A concurrent version using `Task.async` could speed up inference but would complicate gradient accumulation.
- Pattern Matching: Used elegantly for activation function selection and error calculation, but leads to verbose code compared to vectorized operations.
- Immutability: Ensures no side effects during training, but forces copying of entire weight matrices on each update, increasing memory pressure.
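The concurrent-inference idea from the trade-offs above can be sketched with `Task.async_stream`, which fans forward passes out across BEAM processes. The `predict/2` placeholder stands in for the network's forward pass and is assumed, not the repo's API:

```elixir
# Sketch: parallelizing independent forward passes across examples.
# Inference parallelizes cleanly because each example is independent;
# training would additionally need gradient accumulation, as noted above.
defmodule ParallelInfer do
  # Placeholder forward pass so the sketch is self-contained.
  def predict(x, _model), do: Enum.sum(x)

  def predict_all(examples, model) do
    examples
    |> Task.async_stream(fn x -> predict(x, model) end, ordered: true)
    |> Enum.map(fn {:ok, y} -> y end)
  end
end
```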
Key Players & Case Studies
The project's sole contributor is Philip Brown, an Elixir developer and author of the book "Functional Web Development with Elixir, OTP, and Phoenix." His background is in web development, not machine learning research, which explains the project's educational focus. The repository serves as a companion to his blog posts and conference talks about functional programming concepts.
Comparison with Other Functional ML Projects:
| Project | Language | Approach | Accuracy (MNIST) | Production Ready? |
|---|---|---|---|---|
| philipbrown/handwritten-digits | Elixir | Pure functional, no deps | ~82% | No |
| `Nx` + `Axon` (Elixir) | Elixir | Tensor library + neural network framework | ~99% | Partial (GPU support, but ecosystem young) |
| `Flux.jl` (Julia) | Julia | Differentiable programming | ~99% | Yes (Julia's ML ecosystem) |
| `clojure.core.matrix` | Clojure | Matrix library with GPU backends | ~97% | Partial |
| `HLearn` (Haskell) | Haskell | Algebraic structures for ML | ~95% | No (abandoned) |
Data Takeaway: The Elixir ecosystem has more capable ML tools (Nx/Axon) that achieve near-state-of-the-art accuracy, but philipbrown/handwritten-digits deliberately ignores them to demonstrate fundamentals. This positions it as a teaching tool, not a competitor.
Case Study: Educational Value
The project has been used in Elixir meetups and online courses to teach neural network basics. For example, the Elixir School community has referenced it in tutorials about recursion and list processing. However, its adoption is limited: the GitHub repo has only 44 stars and 3 forks, indicating minimal traction even within the Elixir community. By contrast, the `Nx` library has over 2,500 stars and active development from the Dashbit team.
Industry Impact & Market Dynamics
The broader trend is the expansion of machine learning beyond Python. Languages like Julia, Rust, and Mojo are vying for ML workloads, each claiming better performance or safety. Elixir's pitch is its concurrency model and fault-tolerance, which could be valuable for ML inference in distributed systems (e.g., real-time recommendation engines on a BEAM cluster).
Market Context:
| Language | ML Ecosystem Maturity | Primary Use Case | Key Advantage |
|---|---|---|---|
| Python | Mature (TensorFlow, PyTorch, JAX) | Research & production | Largest community, libraries |
| Julia | Growing (Flux, DiffEq) | Scientific computing | Speed, differentiable programming |
| Rust | Emerging (burn, candle) | Embedded, safety-critical | Memory safety, no GC |
| Elixir | Nascent (Nx, Axon, Bumblebee) | Web, distributed systems | Concurrency, fault-tolerance |
Data Takeaway: Elixir's ML ecosystem is the least mature, with no production-grade training frameworks. Its niche is likely inference serving in BEAM-based systems, not model training.
The project's existence reflects a growing curiosity among Elixir developers about ML, but its shortfalls in accuracy and performance mean it will not influence industry adoption. The real impact is indirect: it may inspire contributions to Nx/Axon or spark interest in functional ML research.
Risks, Limitations & Open Questions
Key Limitations:
1. Accuracy ceiling: At ~82%, test accuracy falls well short of the ~90% that even basic digit-recognition tasks require. Misclassifications are common for ambiguous digits (e.g., 4 vs. 9).
2. No generalization: The single hidden layer architecture cannot learn complex features like stroke order or curvature, which modern CNNs capture.
3. Training instability: SGD with batch size 1 leads to high variance in gradient estimates, causing the loss to oscillate rather than converge smoothly.
4. Scalability: Training on the full MNIST dataset (60,000 examples) takes nearly an hour on a modern laptop. Scaling to larger datasets (e.g., CIFAR-10 with 32x32 color images) would be impractical.
Open Questions:
- Can Elixir's concurrency be effectively leveraged for ML training? The project doesn't explore this, but a distributed SGD implementation using Elixir's `GenServer` could be an interesting research direction.
- Will the Elixir community embrace ML, or will it remain a niche curiosity? The growth of Nx and Livebook suggests momentum, but adoption is slow.
- Is there a use case for "pure" functional ML in production? Immutability could help with reproducibility and debugging, but the performance cost is currently prohibitive.
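The distributed-SGD question above can be sketched as a tiny parameter server: a `GenServer` owns the weights and serializes gradient updates pushed by concurrent workers. This is entirely hypothetical, not something the project implements:

```elixir
# Sketch: a parameter-server GenServer. Workers compute gradients
# concurrently and cast them here; the server applies them one at a
# time, so updates never race. Weights are a flat list for simplicity.
defmodule ParamServer do
  use GenServer

  def start_link(weights), do: GenServer.start_link(__MODULE__, weights)
  def get(pid), do: GenServer.call(pid, :get)
  def push_gradient(pid, grad, lr), do: GenServer.cast(pid, {:grad, grad, lr})

  @impl true
  def init(weights), do: {:ok, weights}

  @impl true
  def handle_call(:get, _from, w), do: {:reply, w, w}

  @impl true
  def handle_cast({:grad, grad, lr}, w) do
    # w_i <- w_i - lr * g_i, applied atomically within the server.
    new_w = Enum.zip_with([w, grad], fn [wi, gi] -> wi - lr * gi end)
    {:noreply, new_w}
  end
end
```

This is asynchronous (Hogwild-style) SGD: workers may push gradients computed from slightly stale weights, which trades some convergence quality for throughput.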
Ethical Considerations:
The project itself raises no direct ethical concerns, but its low accuracy could mislead learners into thinking this is a viable approach. Without context, beginners might assume 80% accuracy is acceptable, when in practice even 99% is insufficient for critical applications like bank check processing.
AINews Verdict & Predictions
Verdict: philipbrown/handwritten-digits is a commendable educational exercise but a non-starter for any practical application. It succeeds in demonstrating how functional programming concepts map to neural network computations, making it a useful resource for Elixir developers learning ML fundamentals. However, its low accuracy, poor performance, and lack of optimization mean it will never be used in production. The project is best viewed as a historical artifact—a snapshot of what Elixir ML looked like before proper libraries emerged.
Predictions:
1. Within 12 months: The repository will see fewer than 100 total stars, as the Elixir community shifts focus to Nx/Axon for serious ML work. Philip Brown may update the project to use Nx for performance, but the "pure Elixir" constraint will be abandoned.
2. Within 3 years: Elixir will have a production-grade ML inference framework (likely built on Nx + Rustler for GPU acceleration), but training will remain in Python. Projects like this will be remembered as early experiments.
3. The real opportunity: Elixir's sweet spot is not training but serving—running pre-trained models in a BEAM cluster for low-latency inference. The `Bumblebee` library (Hugging Face models in Elixir) is already pursuing this. philipbrown/handwritten-digits is a stepping stone toward that vision.
What to Watch: The next version of Nx (0.7+) with improved automatic differentiation, and whether Elixir's ML community produces a benchmark that achieves >99% accuracy on MNIST using idiomatic Elixir code. If that happens, the language could carve a niche in edge inference for IoT devices, where its low resource footprint and concurrency are advantages.