High-Performance Log Search for Low-Cost Storage
Search terabytes of logs on cheap HDDs at SSD speeds
Problem: Fast storage (SSD) is expensive. Cheap storage (HDD/object storage) is slow. Solution: makigami makes cheap storage fast enough for interactive search.
Traditional log search tools like grep and zcat | grep read files sequentially from start to finish. On large log files stored on HDDs, this means minutes of waiting.
makigami creates a search index (approx. 1/3 of original file size) that enables intelligent block-skipping, dramatically reducing read time.
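The core idea can be sketched in a few lines. This is a minimal illustration only — the block size, the tokenized per-block index, and all names here are assumptions, not makigami's actual on-disk format:

```python
# Sketch of block-skipping search: split a log into blocks and record,
# per block, which tokens appear in it. A query then reads only the
# blocks that can possibly match. (Illustrative; not makigami's format.)

BLOCK_LINES = 2  # tiny block size for the demo


def build_index(lines, block_lines=BLOCK_LINES):
    """Map block number -> set of whitespace-separated tokens in it."""
    index = {}
    for start in range(0, len(lines), block_lines):
        tokens = set()
        for line in lines[start:start + block_lines]:
            tokens.update(line.split())
        index[start // block_lines] = tokens
    return index


def candidate_blocks(index, token):
    """Blocks that may contain the token; everything else is skipped."""
    return sorted(b for b, toks in index.items() if token in toks)


log = [
    "GET /index.html 200",
    "GET /favicon.ico 200",
    "POST /login 200",
    "GET /missing 404",
]
index = build_index(log)
print(candidate_blocks(index, "404"))  # [1] -- only 1 of 2 blocks is read
```

The payoff is on the read side: the fewer blocks a query touches, the less data has to come off the slow disk.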
| Tool | Storage | Log Size | Search Time | Speedup |
|---|---|---|---|---|
| makigami + grep | HDD (500MB/s seq.) | 200GB | 5.5 seconds | Up to 40x faster* |
| zstd -d -c | grep | HDD (500MB/s seq.) | 200GB | 3m 37s | baseline |
*40x speedup achieved in cold-cache scenarios where the target data is near the end of the file.
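The footnoted figure follows directly from the two timings in the table:

```python
# Speedup implied by the table: baseline 3m 37s vs. indexed 5.5 s.
baseline_s = 3 * 60 + 37      # 217 s for zstd -d -c | grep
indexed_s = 5.5               # makigami + grep
print(round(baseline_s / indexed_s))  # 39, i.e. "up to 40x"
```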
Named after the Japanese word for "scroll paper" (巻紙), makigami processes logs sequentially like unrolling a scroll—but intelligently skips irrelevant sections.
macOS (Apple Silicon):

```sh
curl -LO https://github.com/nt-riken/makigami/releases/download/v0.1.0/mg-macos-arm64
chmod +x mg-macos-arm64
sudo mv mg-macos-arm64 /usr/local/bin/mg
```

Linux / Other (Build from Source):
```sh
# Requires Rust installed (https://rustup.rs/)
git clone https://github.com/nt-riken/makigami.git
cd makigami
cargo build --release
sudo mv target/release/mg /usr/local/bin/
```

Step 1: Build index — Creates a compressed .zstd file and a tiny .mg index
```sh
mg build access.log
# Output: access.log.zstd (compressed) + access.log.mg (index, ~33% of original size)
```

Step 2: Search — Lightning-fast search using the index
```sh
mg search -z access.log.zstd "404 NOT FOUND" | grep "404 NOT FOUND"
```

Step 3: Pipe to your tools — Full UNIX philosophy compatibility
```sh
mg search -z access.log.zstd "ERROR" | grep "database" | awk '{print $1, $2}'
```

Sequential-read optimization designed specifically for HDD performance characteristics: no random seeks means maximum throughput.
Index files are approx. 1/3 of the original log size. A 200GB log produces a ~66GB index.
Works seamlessly with grep, awk, sed, sort, and any other command-line tool. No lock-in.
Uses Zstandard compression for storage efficiency while maintaining search speed.
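Independent per-block compression is what makes skipping possible: a block can be decompressed in isolation, so a search only touches the blocks the index points at. The sketch below uses zlib as a stand-in for Zstandard (for a self-contained demo); makigami's actual on-disk layout is an assumption, not shown:

```python
# Compress data in independent blocks and decompress just one of them,
# skipping the rest. zlib stands in for Zstandard here.
import zlib


def compress_blocks(data, block_size):
    """Compress data block-by-block; return blob plus (offset, size) table."""
    blob, table, pos = b"", [], 0
    for i in range(0, len(data), block_size):
        comp = zlib.compress(data[i:i + block_size])
        table.append((pos, len(comp)))  # where this block lives in the blob
        blob += comp
        pos += len(comp)
    return blob, table


data = b"A" * 100 + b"ERROR database down\n" + b"B" * 100
blob, table = compress_blocks(data, 64)

# Decompress only block 1 (original bytes 64..127) -- no other block is read.
off, size = table[1]
block = zlib.decompress(blob[off:off + size])
print(b"ERROR" in block)  # True
```

On an HDD this matters twice over: less data is read, and the reads that do happen stay sequential within each block.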
- Historical log analysis — Search years of archived logs without expensive storage
- Cost-effective log management — Use cheap HDDs instead of SSDs for cold log storage
- Compliance & audit — Quick searches across massive audit log archives
- Personal to enterprise — Scales from single-machine to distributed storage
┌─────────────┐ mg build ┌─────────────────┐
│ access.log │ ───────────────► │ access.log.zstd │ (compressed data)
│ (200GB) │ │ access.log.mg │ (index, ~66GB)
└─────────────┘ └─────────────────┘
│
│ mg search "pattern"
▼
┌─────────────────┐
│ Skip irrelevant │
│ blocks using │──► Only read matching blocks
│ index │ (sequential, fast on HDD)
└─────────────────┘
Q: Does makigami work on SSDs?
A: Yes, but the performance advantage is smaller. makigami's sequential read optimization is designed to maximize HDD throughput where random access is slow.
Q: What about Windows?
A: Currently tested on Linux and macOS. Windows support is untested but may work.
Q: How does it compare to Elasticsearch/Splunk?
A: makigami is a lightweight CLI tool for searching compressed log files, not a full SIEM. It's ideal for cold storage search where you don't need real-time indexing or complex queries.
Q: Can I search without building an index first?
A: No, the index is required for the performance benefits. Without it, use standard zstd -d -c | grep.
- Optimize index size (target: significantly below 33%)
- Windows support
- macOS ARM64 binary
- x86 Linux binary
- Parallel search across multiple files
- crates.io publication
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
