LIVE
Tiny OCR
A specialized OCR service that converts images of math equations, chemical structures, handwriting, and documents into structured text (LaTeX, SMILES, Markdown). Targets researchers, chemists, educators, and developers with sub-50ms latency and 99.7% accuracy on STEM content.
COMPLETE
Tiny Tessarachnid
A simplified modern implementation of Tesseract OCR. The core model powering the tiny-ocr service, glyph-daemon, and related OCR projects.
DEPLOYED
Glyph Daemon
Encrypted inference backend for the tiny-tessarachnid OCR models. Serves three model versions (RetinaOCRNet, RetinaOCRGPT, ContourOCRNet) behind a FastAPI endpoint with AES-256 encrypted weights that decrypt into memory at boot.
DEPLOYED
Winnie Interface
Web interface for rating OCR bounding-box annotations for the tiny-tessarachnid pipeline. Shows document images alongside interactive SVG bounding-box overlays at four hierarchical levels with Good/Bad rating buttons for human feedback.
RESEARCH
Nissl
Extends the tiny-tessarachnid OCR architecture with a masked infill head that proves the model understands detected content by reconstructing it from surrounding context. Uses ResNet encoder with autoregressive detection and U-Net decoder for pixel reconstruction.
IN DEVELOPMENT
Macrodose
A macOS behaviour logger and hierarchical action-prediction model. Captures mouse clicks, keyboard events, screenshots, and the accessibility tree at every event, then trains a model that predicts the next user action by auto-regressively zooming through the UI hierarchy.
RESEARCH
Mangaka
Converts manga pages into structured JSON representations and back into images through four stages: cloud annotation with vision LLMs (Claude, GPT-4o, Gemini), local model training (GPT-2 + Stable Diffusion), distillation loop, and dataset generation for RAG/finetuning.
RESEARCH
Musepix
A retina-based hierarchical detector for sheet music images. Detects and recognizes musical elements with audio OCR, MIDI decoding, symbol similarity models, and OpenAI-based annotation.
DEPLOYED
Handwriting Orca Math
Handwriting recognition system for mathematical content.
DEPLOYED
Room Detection
Room detection and classification from photos. Full AWS infrastructure with EC2 instance, Lambda functions, ECS cluster, and S3 storage. Deployed via CloudFormation and SAM CLI.
IN DEVELOPMENT
Every Noise
Scrapes genre-to-artist mappings from Every Noise at Once and Spotify, downloads music, maps artists to YouTube, and trains a music genre classifier.
FUNCTIONAL
Selective Speaker
Voice/speaker-related application with a registered Firebase Android app.
COMPLETE
Browser-Use Travel Agent Benchmark
A benchmark suite for evaluating AI agents that use browsers to complete travel-related tasks. Includes categorized tasks, a runner, and report generation.
LIVE
Spindle
An AI research agent that visualizes web traversal as a real-time 3D graph. Every page the agent visits becomes an explorable node. Supports GPT-4o, Claude Opus/Sonnet/Haiku, o3 and other frontier models with mid-conversation switching. Integrated Tavily web search with live result cards, full-page content extraction, and chat history. Free tier or pay-as-you-go token billing via Stripe.
LIVE
BranchGPT
An AI chat interface that visualizes conversations as a branching tree graph using ReactFlow. Supports multi-provider LLMs, branch/fork any message to explore alternate responses, collapse and hide branches, multi-select nodes, and compose from context. Features Stripe pay-as-you-go billing and NextAuth login.