nt-riken/makigami

巻紙 makigami

High-Performance Log Search for Low-Cost Storage

Search terabytes of logs on cheap HDDs at SSD speeds



Why makigami?

Problem: Fast storage (SSD) is expensive. Cheap storage (HDD/Object Storage) is slow.

Solution: Makigami makes cheap storage fast enough for interactive search.

Traditional log search tools like grep and zcat | grep read files sequentially from start to finish. On large log files stored on HDDs, this means minutes of waiting.

makigami creates a search index (approx. 1/3 of original file size) that enables intelligent block-skipping, dramatically reducing read time.

Tool                 Storage              Log Size   Search Time   Speedup
makigami + grep      HDD (500MB/s seq.)   200GB      5.5 seconds   Up to 40x faster*
zstd -d -c | grep    HDD (500MB/s seq.)   200GB      3m 37s        baseline

*40x speedup achieved in cold-cache scenarios where the target data is near the end of the file.
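
The "up to 40x" figure can be checked from the two timings in the table. The helper below is a small sketch for that arithmetic; the function name `speedup` is made up for this example and is not part of makigami.

```shell
# speedup BASELINE_SECONDS INDEXED_SECONDS
# Prints how many times faster the indexed search is, to one decimal place.
speedup() {
  LC_ALL=C awk -v base="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", base / fast }'
}

# Table values: baseline 3m 37s = 217 s, makigami + grep = 5.5 s.
baseline_s=$(( 3 * 60 + 37 ))
speedup "$baseline_s" 5.5   # prints 39.5
```

217 s divided by 5.5 s is about 39.5, which the table rounds to 40x.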

Named after the Japanese word for "scroll paper" (巻紙), makigami processes logs sequentially like unrolling a scroll—but intelligently skips irrelevant sections.


Quick Start

Installation

macOS (Apple Silicon):

curl -LO https://github.com/nt-riken/makigami/releases/download/v0.1.0/mg-macos-arm64
chmod +x mg-macos-arm64
sudo mv mg-macos-arm64 /usr/local/bin/mg

Linux / Other (Build from Source):

# Requires Rust installed (https://rustup.rs/)
git clone https://github.com/nt-riken/makigami.git
cd makigami
cargo build --release
sudo mv target/release/mg /usr/local/bin/

Basic Usage

Step 1: Build index - Creates a compressed .zstd file and a .mg index (about one third of the original log size)

mg build access.log
# Output: access.log.zstd (compressed) + access.log.mg (index, ~33% size)

Step 2: Search — Lightning-fast search using the index

mg search -z access.log.zstd "404 NOT FOUND" | grep "404 NOT FOUND"

Step 3: Pipe to your tools — Full UNIX philosophy compatibility

mg search -z access.log.zstd "ERROR" | grep "database" | awk '{print $1, $2}'
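
Parallel search across multiple files is still on the TODO list, so several archives have to be searched one at a time. The wrapper below is a minimal sketch that reuses only the `mg search -z FILE PATTERN | grep PATTERN` form shown above. The name `search_all` is hypothetical, and the sketch assumes each archive was produced by `mg build` and that its index is where `mg` expects to find it.

```shell
# search_all PATTERN ARCHIVE...
# Runs the documented mg + grep pipeline on each archive in turn.
search_all() {
  pattern=$1
  shift
  for archive in "$@"; do
    # Same two-stage pipeline as in Step 2: mg first, then grep on its output.
    mg search -z "$archive" "$pattern" | grep -- "$pattern"
  done
}

# Example call (file names are placeholders):
#   search_all "ERROR" app-2023.log.zstd app-2024.log.zstd
```

Because the function writes plain lines to standard output, it can be piped into awk, sort, or other tools like any other command.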

Features

⚡ Optimized for Slow Storage

Sequential read optimization designed specifically for HDD performance characteristics. No random seeks means maximum throughput.

📦 Search Index

Index files are approx. 1/3 of the original log size. A 200GB log produces a ~66GB index.
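
For capacity planning, the one-third rule of thumb can be turned into a quick estimate, and the real ratio can be measured after a build. Both helpers below are sketches with made-up names (`estimate_index_gb`, `index_ratio_percent`). They use only integer shell arithmetic and `wc -c`, and assume nothing about makigami beyond the file names shown in Basic Usage.

```shell
# estimate_index_gb LOG_SIZE_GB
# Rough index size using the "about 1/3 of the original" rule (integer GB).
estimate_index_gb() {
  echo $(( $1 / 3 ))
}

# index_ratio_percent ORIGINAL_FILE INDEX_FILE
# Measured index size as a whole-number percentage of the original file.
index_ratio_percent() {
  log_bytes=$(wc -c < "$1")
  idx_bytes=$(wc -c < "$2")
  echo $(( $idx_bytes * 100 / $log_bytes ))
}

estimate_index_gb 200   # prints 66
# After a build: index_ratio_percent access.log access.log.mg
```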

🔧 UNIX Philosophy

Works seamlessly with grep, awk, sed, sort, and any other command-line tool. No lock-in.

🗜️ Built-in Compression

Uses Zstandard compression for storage efficiency while maintaining search speed.


Use Cases

  • Historical log analysis — Search years of archived logs without expensive storage
  • Cost-effective log management — Use cheap HDDs instead of SSDs for cold log storage
  • Compliance & audit — Quick searches across massive audit log archives
  • Personal to enterprise — Scales from single-machine to distributed storage

How It Works

┌─────────────┐     mg build      ┌─────────────────┐
│  access.log │ ───────────────►  │ access.log.zstd │  (compressed data)
│   (200GB)   │                   │ access.log.mg   │  (index, ~66GB)
└─────────────┘                   └─────────────────┘
                                           │
                                           │ mg search "pattern"
                                           ▼
                                  ┌─────────────────┐
                                  │ Skip irrelevant │
                                  │ blocks using    │──► Only read matching blocks
                                  │ index           │    (sequential, fast on HDD)
                                  └─────────────────┘
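
The block-skipping idea in the diagram can be illustrated with a toy model. This is not makigami's index format or algorithm, which this README does not describe. It is a sketch that assumes fixed-size blocks of lines and a per-block list of distinct words, and all three function names are hypothetical.

```shell
# build_toy_index FILE LINES_PER_BLOCK
# Prints "block<TAB>word" once for each distinct word in each block.
build_toy_index() {
  awk -v n="$2" '{
    b = int((NR - 1) / n)
    for (i = 1; i <= NF; i++)
      if (!((b, $i) in seen)) { seen[b, $i] = 1; print b "\t" $i }
  }' "$1"
}

# candidate_blocks INDEX WORD
# Prints the numbers of the blocks whose word list contains WORD.
candidate_blocks() {
  awk -F '\t' -v w="$2" '$2 == w { print $1 }' "$1"
}

# read_block FILE LINES_PER_BLOCK BLOCK
# Prints only the lines that belong to the given block.
read_block() {
  start=$(( $3 * $2 + 1 ))
  end=$(( ($3 + 1) * $2 ))
  sed -n "${start},${end}p" "$1"
}
```

A search consults only the index, gets back a short list of block numbers, and reads just those blocks. In this toy, `sed` still scans the file from the beginning; a real implementation would jump to a stored offset for each block instead.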

FAQ

Q: Does makigami work on SSDs?

A: Yes, but the performance advantage is smaller. makigami's sequential read optimization is designed to maximize HDD throughput where random access is slow.

Q: What about Windows?

A: Currently tested on Linux and macOS. Windows support is untested but may work.

Q: How does it compare to Elasticsearch/Splunk?

A: makigami is a lightweight CLI tool for searching compressed log files, not a full SIEM. It's ideal for cold storage search where you don't need real-time indexing or complex queries.

Q: Can I search without building an index first?

A: No. The index is what enables block-skipping, so there is no speedup without it. For unindexed files, use a standard sequential scan such as zstd -d -c | grep (or zcat | grep for gzip files).


TODO

  • Optimize index size (target: significantly below 33%)
  • Windows support
  • macOS ARM64 binary (done: see Installation)
  • x86 Linux binary
  • Parallel search across multiple files
  • crates.io publication

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.


License

Apache-2.0

