137 Particles Labs Initiative

Resurrecting Prose

Prose, the standard library for pure Go NLP, was abandoned by its maintainers. We are bringing it back with modern Go idioms, embedded assets, and enterprise-grade performance optimization.

The Legacy Debt

  • Relying on deprecated go-bindata for assets.
  • Massive binary bloat due to uncompressed models.
  • Slow NER startup (parsing 150k lines on init).
  • "All or Nothing" model loading (high RAM usage).

The 137 Refactor

  • Native Go //go:embed support.
  • Stream-based parsing for near-instant startup.
  • Smart LRU caching for model management.
  • Disk-based loading for custom enterprise models.

Refactoring Roadmap

Our phased approach to modernization.

  • Modernization
    Replacing legacy go-bindata with native embed. This reduces build complexity and allows for cleaner binary distributions. We are also enabling "Disk Mode" to allow loading models from the filesystem instead of compiled binaries.


  • Optimization
    The NER engine currently parses 150k lines of training data on startup. We are refactoring this to use pre-compiled binary formats or streaming parsers to reduce initialization time from seconds to milliseconds.


  • Intelligence
    Implementing a "Smart Loader" with LRU (Least Recently Used) logic. The system will unload unused models to free up RAM for other processes, critical for running on constrained edge devices.


  • Documentation
    Building a comprehensive `examples/` directory. No more guessing. We will provide copy-pasteable examples for Tokenization, NER, and custom model implementation.