Open source projects and technical explorations in distributed systems, backend engineering, and developer tools.
Distributed Consensus File System
Byzantine-fault-tolerant distributed file system with miner-client architecture, gossip-based membership, and proof-of-work consensus.
Go, Consensus, P2P, Gossip, PoW
Demonstrates:
- Byzantine-fault-tolerant consensus
- Miner-client architecture
- Gossip-based membership
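As a rough illustration of the proof-of-work idea mentioned above, here is a minimal sketch. It is not taken from the project: the function names `mine` and `verify`, the use of SHA-256, and the "leading zero hex digits" difficulty rule are all assumptions chosen for brevity.

```python
import hashlib


def _digest(block_data: bytes, nonce: int) -> str:
    # Hash the block payload together with the candidate nonce.
    return hashlib.sha256(block_data + str(nonce).encode()).hexdigest()


def mine(block_data: bytes, difficulty: int) -> int:
    """Return the smallest nonce whose digest starts with `difficulty` zero hex digits.

    Hypothetical difficulty rule; real systems usually compare against a numeric target.
    """
    prefix = "0" * difficulty
    nonce = 0
    while not _digest(block_data, nonce).startswith(prefix):
        nonce += 1
    return nonce


def verify(block_data: bytes, nonce: int, difficulty: int) -> bool:
    """Checking a proof costs one hash, while finding it costs many."""
    return _digest(block_data, nonce).startswith("0" * difficulty)
```

The asymmetry between `mine` and `verify` is the point: any peer can cheaply check a miner's claimed work before accepting a block.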
Vector xLite
Fast, lightweight vector search engine with HNSW indexing, SQL-based filtering, and payload storage. Three modes: embedded (in-process), standalone (server), and clustered (Raft-replicated across nodes).
Rust, Go, Vector Search, HNSW, Raft, SQL
Demonstrates:
- Approximate nearest neighbor search via HNSW
- SQL-based metadata filtering and payload storage
- Three deployment modes: embedded / standalone / Raft-clustered
KV Storage Engine
High-performance storage engine based on log-structured merge trees with crash-safe WAL recovery and tunable compaction.
Go, LSM Tree, WAL, Key-Value Storage
Demonstrates:
- High write throughput and read efficiency
- Crash safety through WAL-based recovery
- Customizable compaction + caching strategies
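The WAL-based recovery idea can be sketched in a few lines. This is only an illustrative toy under stated assumptions, not the engine's code: the class name `TinyKV`, the JSON-lines record format, and the in-memory dict standing in for a memtable are all hypothetical, and compaction and SSTables are left out entirely.

```python
import json
import os


class TinyKV:
    """Toy key-value store: every write is logged durably before it is applied."""

    def __init__(self, wal_path: str):
        self.wal_path = wal_path
        self.memtable = {}
        self._replay()
        self.wal = open(wal_path, "a", encoding="utf-8")

    def _replay(self) -> None:
        # Rebuild the memtable from the log and drop any torn trailing record.
        if not os.path.exists(self.wal_path):
            return
        good = 0
        with open(self.wal_path, "rb") as f:
            for line in f:
                if not line.endswith(b"\n"):
                    break  # record was cut off mid-write by a crash
                try:
                    rec = json.loads(line)
                except ValueError:
                    break  # corrupt record: stop at the last good offset
                self.memtable[rec["k"]] = rec["v"]
                good += len(line)
        with open(self.wal_path, "r+b") as f:
            f.truncate(good)

    def put(self, key, value) -> None:
        # Log first, force it to disk, then apply to memory.
        self.wal.write(json.dumps({"k": key, "v": value}) + "\n")
        self.wal.flush()
        os.fsync(self.wal.fileno())
        self.memtable[key] = value

    def get(self, key):
        return self.memtable.get(key)

    def close(self) -> None:
        self.wal.close()
```

Because the log entry is fsynced before the in-memory update, reopening the same path after a crash replays every acknowledged write; a partially written final record is discarded rather than misread.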
LSM Db Learned Index
Learned LSM-tree index that exploits linear key relationships to improve YCSB read performance by ~33% over RadixSpline. Research collaboration with Dr. Song Jiang.
C++, LSM Tree, Learned Index, OpenTBB, Optimistic Locking
Demonstrates:
- ~33% YCSB read improvement over RadixSpline
- Non-blocking compaction via optimistic locking
- Research collaboration (Dr. Song Jiang)
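To make "learned index" concrete, the sketch below fits a single straight line from key to position and corrects the prediction with a bounded binary search. It is a generic textbook-style illustration, not the project's design and not RadixSpline; the class name `LinearLearnedIndex` and the single-segment model are assumptions, and it expects a non-empty list of numeric keys.

```python
from bisect import bisect_left


class LinearLearnedIndex:
    """One linear model over sorted keys plus a worst-case error bound."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        lo, hi = self.keys[0], self.keys[-1]
        # Line through the first and last key: position ~ slope * key + intercept.
        self.slope = (n - 1) / (hi - lo) if hi != lo else 0.0
        self.intercept = -self.slope * lo
        # Largest gap between predicted and true position over all stored keys.
        self.max_err = max(abs(self._predict(k) - i) for i, k in enumerate(self.keys))

    def _predict(self, key) -> int:
        pos = int(round(self.slope * key + self.intercept))
        return min(max(pos, 0), len(self.keys) - 1)  # clamp into the array

    def lookup(self, key):
        """Return the index of `key`, or None if it is absent."""
        pos = self._predict(key)
        lo = max(0, pos - self.max_err)
        hi = min(len(self.keys), pos + self.max_err + 1)
        # Only the window [lo, hi) around the prediction is searched.
        i = bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None
```

The closer the keys are to a linear distribution, the smaller `max_err` becomes and the fewer comparisons each lookup needs, which is the intuition behind exploiting linear key relationships.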
Distributed Transaction Patterns
Microservice distributed transactions using SAGA orchestration over RabbitMQ with MassTransit routing.
C#, SAGA, RabbitMQ, MassTransit, Microservices
Demonstrates:
- SAGA orchestration pattern
- Message-driven architecture
- Eventual consistency across services
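The orchestration pattern listed above boils down to running steps in order and undoing the completed ones when a later step fails. The following is a minimal in-process sketch of that control flow only; it does not use RabbitMQ or MassTransit, and `run_saga`, the `Step` tuple shape, and the `ctx["log"]` field are hypothetical names introduced for illustration.

```python
from typing import Callable, List, Tuple

# (step name, forward action, compensating action); all hypothetical.
Step = Tuple[str, Callable[[dict], None], Callable[[dict], None]]


def run_saga(steps: List[Step], ctx: dict) -> bool:
    """Run steps in order; on failure, compensate finished steps in reverse."""
    done = []
    for name, action, compensate in steps:
        try:
            action(ctx)
            done.append((name, compensate))
        except Exception as exc:
            ctx.setdefault("log", []).append(f"{name} failed: {exc}")
            for prev_name, undo in reversed(done):
                undo(ctx)
                ctx["log"].append(f"compensated {prev_name}")
            return False
    return True
```

In a message-driven deployment each action and compensation would be a command sent to another service, and the orchestrator would persist its progress between messages; the reverse-order compensation logic stays the same.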
Data Migration: DynamoDB → Aurora
Production data-migration system from a Cefalo client engagement: zero-downtime DynamoDB → Aurora PostgreSQL migration of 40M+ records using Spring Batch with segmented scans, validation, and incremental sync.
Java, Spring Boot, Spring Batch, AWS DynamoDB, Aurora PostgreSQL
Demonstrates:
- 40M+ records, zero-downtime migration
- Segmented parallel scans with Spring Batch
- Validation + incremental sync
Crudify ORM
Rust ORM that auto-generates CRUD methods, DTOs, and pagination helpers via Entity derive macros.
Rust, ORM, Procedural Macros, Code Generation
Demonstrates:
- Auto-generated CRUD operations
- DTO generation from entities
- Built-in pagination support
Mathematical Expression Parser
LR parser implementation with parse tables and parse trees for mathematical expressions.
C#, LR Parser, Compilers, Parse Tables
Demonstrates:
- LR parser implementation from scratch
- Parse table generation
- Parse tree construction
Document Summarizer
Document summarization tool combining extractive and abstractive techniques for long-form text.
Python, NLP, Summarization
Demonstrates:
- Extractive summarization pipeline
- Abstractive summarization layer
- Configurable input size + output length
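For readers unfamiliar with the extractive half of the pipeline, one common baseline is to score sentences by average word frequency and keep the top few in their original order. The sketch below shows that baseline only; it is an assumption about what "extractive" could look like, not the tool's actual method, and `summarize`, the regex sentence splitter, and the "ignore words of three letters or fewer" rule are illustrative choices.

```python
import re
from collections import Counter


def summarize(text: str, max_sentences: int = 2) -> str:
    """Keep the highest-scoring sentences, preserving their original order."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Document-wide frequencies; very short words are ignored as a crude stop list.
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(w for w in words if len(w) > 3)

    def score(sentence: str) -> float:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)
```

The `max_sentences` parameter plays the role of a configurable output length; an abstractive layer would then rewrite the selected sentences rather than copy them verbatim.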
QPDF Transformer
Content-preserving PDF document transformer for manipulating PDF files while maintaining document integrity.
C++, PDF, Document Processing
Demonstrates:
- Content-preserving transformations
- PDF structure manipulation
- Encryption/decryption support