
archmage CI crates.io lib.rs docs.rs license

Browse 12,000+ SIMD Intrinsics → · Docs · Magetypes · API Docs

Archmage lets you write SIMD code in Rust without unsafe. It works on x86-64, AArch64, and WASM.

[dependencies]
archmage = "0.9"

The problem

Raw SIMD in Rust requires unsafe for every intrinsic call:

use std::arch::x86_64::*;

// Every. Single. Call.
unsafe {
    let a = _mm256_loadu_ps(data.as_ptr());      // unsafe: raw pointer
    let b = _mm256_set1_ps(2.0);                  // unsafe: needs target_feature
    let c = _mm256_mul_ps(a, b);                  // unsafe: needs target_feature
    _mm256_storeu_ps(out.as_mut_ptr(), c);         // unsafe: raw pointer
}

Miss a feature check and you get undefined behavior on older CPUs. Wrap everything in unsafe and hope for the best.

The solution

use archmage::prelude::*;  // tokens, traits, macros, intrinsics, safe memory ops

// X64V3Token = AVX2 + FMA (Haswell 2013+, Zen 1+)
#[arcane]
fn multiply(_token: X64V3Token, data: &[f32; 8]) -> [f32; 8] {
    let a = _mm256_loadu_ps(data);          // safe: takes &[f32; 8], not *const f32
    let b = _mm256_set1_ps(2.0);            // safe: inside #[target_feature]
    let c = _mm256_mul_ps(a, b);            // safe: value-based (Rust 1.87+)
    let mut out = [0.0f32; 8];
    _mm256_storeu_ps(&mut out, c);          // safe: takes &mut [f32; 8]
    out
}

fn main() {
    // Runtime CPU check — returns None if AVX2+FMA unavailable
    if let Some(token) = X64V3Token::summon() {
        let result = multiply(token, &[1.0; 8]);
        println!("{:?}", result);
    }
}

No unsafe anywhere. Your crate can use #![forbid(unsafe_code)].

How Rust enforces SIMD safety

Rust 1.86 (Apr 2025) and 1.87 (May 2025) changed the rules for #[target_feature] functions:

  Rust's #[target_feature] call rules (1.86+)

  ┌─────────────────────────┐         ┌──────────────────────────────┐
  │  fn normal_code()       │ unsafe  │ #[target_feature(avx2, fma)] │
  │                         │────────▶│ fn simd_work()               │
  │  (no target features)   │         │                              │
  └─────────────────────────┘         └──────────────┬───────────────┘
                                                     │
          Calling simd_work() from                   │ safe
          normal_code() requires                     │ (subset of
          unsafe { }. The caller                     ▼ caller's features)
          has fewer features.          ┌──────────────────────────────┐
                                       │ #[target_feature(avx2)]      │
                                       │ fn simd_helper()             │
                                       │                              │
                                       └──────────────────────────────┘

  Caller has same or superset features? Safe call. No unsafe needed.
  Caller has fewer features?            Rust requires an unsafe block.

Inside a #[target_feature] function, value-based intrinsics are safe: _mm256_add_ps, shuffles, compares, anything that doesn't touch a pointer. Only pointer-based memory ops (_mm256_loadu_ps(ptr)) remain unsafe.

Two gaps remain:

  1. The boundary crossing. The first call from normal code into a #[target_feature] function requires unsafe. Someone has to verify the CPU actually has those features.

  2. Memory operations. _mm256_loadu_ps takes *const f32. Raw pointers need unsafe.
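
The call rules can be seen with plain std::arch, no archmage involved. A minimal x86-64-only sketch (requires Rust 1.87+; all names here are illustrative):

```rust
use core::arch::x86_64::*; // x86-64 only sketch (Rust 1.87+)

#[target_feature(enable = "avx2")]
fn simd_helper(v: __m256) -> __m256 {
    _mm256_add_ps(v, v) // value-based intrinsic: safe inside #[target_feature]
}

#[target_feature(enable = "avx2,fma")]
fn simd_work(x: f32) -> f32 {
    let v = _mm256_set1_ps(x);
    let doubled = simd_helper(v); // safe call: avx2 is a subset of avx2+fma
    _mm_cvtss_f32(_mm256_castps256_ps128(doubled)) // extract lane 0
}

fn normal_code(x: f32) -> Option<f32> {
    if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
        // The boundary crossing: this caller has fewer features, so Rust
        // still demands unsafe; this is the gap archmage's tokens close.
        Some(unsafe { simd_work(x) })
    } else {
        None
    }
}
```

Note that only the outermost call needs unsafe; between the two #[target_feature] functions, the subset-feature call compiles with no unsafe at all.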

How archmage closes both gaps

Archmage makes the boundary crossing sound by tying it to runtime CPU detection. You can't cross without proof.

  Your crate: #![forbid(unsafe_code)]

  ┌────────────────────────────────────────────────────────────┐
  │                                                            │
  │  1. summon()          Checks CPUID. Returns Some(token)    │
  │     X64V3Token        only if AVX2+FMA are present.        │
  │                       Token is zero-sized proof.           │
  │          │                                                 │
  │          ▼                                                 │
  │  2. #[arcane] fn      Reads token type from signature.     │
  │     my_func(token,..) Generates #[target_feature] sibling. │
  │                       Wraps the unsafe call internally —   │
  │          │            your code never writes unsafe.        │
  │          ▼                                                 │
  │  3. More #[arcane]    Matching features? Safe call.        │
  │     + plain fns       LLVM inlines — no boundary.          │
  │          │                                                 │
  │          ▼                                                 │
  │  4. Intrinsics        Value ops: safe (Rust 1.87+)         │
  │     in scope          Memory ops: safe_unaligned_simd      │
  │                       takes &[f32; 8], not *const f32      │
  │                                                            │
  │  Result: all intrinsics are safe. No unsafe in your code.  │
  └────────────────────────────────────────────────────────────┘

Tokens are grouped by common CPU tiers. X64V3Token covers AVX2+FMA+BMI2 — the set that Haswell (2013) and Zen 1+ share. NeonToken covers AArch64 NEON. Arm64V2Token covers CRC+RDM+DotProd+FP16+AES+SHA2 — the set that Apple M1, Cortex-A55+, and Graviton 2+ share. You pick a tier, not individual features. summon() checks all features in the tier atomically; it either succeeds (every feature present) or returns None. The token is zero-sized — passing it costs nothing. Detection is cached (~1.3 ns), or compiles away entirely with -Ctarget-cpu=haswell.

#[arcane] is the trampoline. It generates a sibling function with #[target_feature(enable = "avx2,fma,...")] and an #[inline(always)] wrapper that calls it through unsafe. The macro generates the unsafe block, not you. Since the token's existence proves the features are present, the call is sound. From inside the #[arcane] function, you can use intrinsics directly (value ops are safe) and call other #[arcane] functions with matching features (safe under Rust 1.86+, and LLVM inlines the wrapper away — zero overhead). #[arcane] handles #[cfg(target_arch)] gating automatically.
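
Conceptually, the trampoline looks roughly like this hand-written equivalent. This is a sketch, not the macro's actual output; the token type here is a local stand-in for archmage's, and real summon() caches its result:

```rust
use core::arch::x86_64::*; // x86-64 only sketch

#[derive(Clone, Copy)]
pub struct X64V3Token(()); // stand-in for archmage's zero-sized token

impl X64V3Token {
    pub fn summon() -> Option<Self> {
        // archmage caches this check; the sketch re-detects on each call.
        if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
            Some(X64V3Token(()))
        } else {
            None
        }
    }
}

// The generated #[target_feature] sibling does the real work.
#[target_feature(enable = "avx2,fma")]
unsafe fn multiply_inner(data: &[f32; 8]) -> [f32; 8] {
    unsafe {
        let c = _mm256_mul_ps(_mm256_loadu_ps(data.as_ptr()), _mm256_set1_ps(2.0));
        let mut out = [0.0f32; 8];
        _mm256_storeu_ps(out.as_mut_ptr(), c);
        out
    }
}

// The wrapper keeps your original signature. The macro writes the one
// unsafe block; the token's existence is what makes it sound.
#[inline(always)]
fn multiply(_token: X64V3Token, data: &[f32; 8]) -> [f32; 8] {
    unsafe { multiply_inner(data) }
}
```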

safe_unaligned_simd (by okaneco) closes the memory gap. It shadows core::arch's pointer-based load/store functions with reference-based versions — _mm256_loadu_ps takes &[f32; 8] instead of *const f32. Same names, safe signatures. Archmage re-exports these through import_intrinsics, so the safe versions are in scope automatically.
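
The shadowing idea in miniature, hand-rolled rather than copied from safe_unaligned_simd (x86-64 only, Rust 1.87+ for the safe intrinsic calls):

```rust
use core::arch::x86_64 as arch; // x86-64 only sketch (Rust 1.87+)

// A reference-based shadow of the pointer intrinsic. &[f32; 8]
// guarantees 8 readable floats, so the unsafe stays inside.
#[target_feature(enable = "avx2")]
fn _mm256_loadu_ps(data: &[f32; 8]) -> arch::__m256 {
    unsafe { arch::_mm256_loadu_ps(data.as_ptr()) }
}

// Callers with matching features use the shadow with no unsafe at all.
#[target_feature(enable = "avx2")]
fn first_lane(data: &[f32; 8]) -> f32 {
    let v = _mm256_loadu_ps(data); // safe call: same features
    arch::_mm_cvtss_f32(arch::_mm256_castps256_ps128(v))
}
```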

All tokens compile on all platforms. On the wrong architecture, summon() returns None. You rarely need #[cfg(target_arch)] in your code.

Why #![forbid(unsafe_code)] matters for AI-written SIMD

AI is a patient compiler. It can write SIMD intrinsics, run benchmarks, iterate on hot loops, and try instruction sequences that no human would bother testing. Constraining AI to #![forbid(unsafe_code)] means it can't introduce undefined behavior — the type system catches unsound calls at compile time. The result is hand-tuned SIMD that's both fast and provably safe.

One import, zero #[cfg]

use archmage::prelude::* gives you everything — tokens, traits, macros, all platform intrinsics, and safe memory ops. All tokens compile on all platforms; summon() returns None on the wrong architecture. You rarely need #[cfg(target_arch)] in your code.

use archmage::prelude::*;

#[arcane]  // intrinsics already in scope from prelude
fn example(_: X64V3Token, data: &[f32; 8]) -> __m256 {
    _mm256_loadu_ps(data)
}

#[arcane] wraps its output in #[cfg(target_arch = "x86_64")] automatically — the function doesn't exist on ARM, and that's fine. For dispatch across architectures, incant! handles the cfg gating for you:

use archmage::prelude::*;

#[arcane]
fn process_v3(_: X64V3Token, data: &mut [f32]) { /* AVX2 */ }
#[arcane]
fn process_neon(_: NeonToken, data: &mut [f32]) { /* NEON */ }
fn process_scalar(_: ScalarToken, data: &mut [f32]) { /* fallback */ }

pub fn process(data: &mut [f32]) {
    incant!(process(data), [v3, neon, scalar])  // no #[cfg] needed
}

If you prefer scoped imports over the prelude, use import_intrinsics to inject intrinsics into the function body only:

use archmage::{X64V3Token, SimdToken, arcane};

#[arcane(import_intrinsics)]  // injects intrinsics inside this function only
fn example(_: X64V3Token, data: &[f32; 8]) -> core::arch::x86_64::__m256 {
    _mm256_loadu_ps(data)  // in scope from import_intrinsics
}

Both import the same combined intrinsics module — the prelude imports them at module scope, import_intrinsics scopes them to the function body.

Which macro do I use?

         Writing a SIMD function?
         ┌───────────┴───────────┐
  Hand-written intrinsics    Scalar code, let compiler
         │                   auto-vectorize
         │                        │
  ┌──────▼──────┐          ┌──────▼───────────┐
  │  #[arcane]  │          │  #[autoversion]  │
  │             │          │                  │
  │ Use for all │          │ Generates per-   │
  │ SIMD fns —  │          │ tier variants +  │
  │ entry points│          │ dispatcher from  │
  │ AND helpers │          │ one scalar body  │
  └──────┬──────┘          └──────────────────┘
         │
  Need cross-arch dispatch?
         │
  ┌──────▼──────────────────┐
  │  incant!()              │
  │                         │
  │ Routes to _v3, _neon,   │
  │ _scalar, etc.           │
  │ Handles all #[cfg] for  │
  │ you — zero boilerplate  │
  └─────────────────────────┘

Plain #[inline(always)] functions with no macro also work — if they inline into an #[arcane] caller, LLVM compiles them with the caller's features for free.

#[arcane] everywhere

Use #[arcane] for every SIMD function — entry points and internal helpers alike. When one #[arcane] function calls another with matching features, LLVM inlines the wrapper away. Zero overhead.

use archmage::prelude::*;

#[arcane]
fn dot_product(token: X64V3Token, a: &[f32; 8], b: &[f32; 8]) -> f32 {
    let products = mul_vectors(token, a, b);  // #[arcane] — inlines, zero cost
    horizontal_sum(token, products)            // #[arcane] — same
}

#[arcane]
fn mul_vectors(_: X64V3Token, a: &[f32; 8], b: &[f32; 8]) -> __m256 {
    _mm256_mul_ps(_mm256_loadu_ps(a), _mm256_loadu_ps(b))
}

#[arcane]
fn horizontal_sum(_: X64V3Token, v: __m256) -> f32 {
    let sum = _mm256_hadd_ps(v, v);
    let sum = _mm256_hadd_ps(sum, sum);
    let low = _mm256_castps256_ps128(sum);
    let high = _mm256_extractf128_ps::<1>(sum);
    _mm_cvtss_f32(_mm_add_ss(low, high))
}

fn main() {
    if let Some(token) = X64V3Token::summon() {
        let result = dot_product(token, &[1.0; 8], &[2.0; 8]);
        println!("{result}");
    }
}

The target-feature boundary

The one thing that costs performance is calling an #[arcane] function from code without matching features — each call crosses LLVM's #[target_feature] optimization boundary. Put loops inside the #[arcane] function, not around it.

Processing 1000 8-float vector additions (full benchmark details):

  Pattern                                       Time           Why
  #[arcane] calling #[arcane] (same features)   547 ns         Features match — LLVM inlines
  #[arcane] per iteration from non-SIMD code    2209 ns (4x)   Target-feature boundary per call
  Bare #[target_feature] (no archmage)          2222 ns (4x)   Same boundary — archmage adds nothing

The 4x penalty is LLVM's #[target_feature] boundary, not archmage overhead. Bare #[target_feature] without archmage has the same cost. With real workloads (DCT-8), the boundary costs up to 6.2x.

The rule: enter #[arcane] once from non-SIMD code, put the loop inside. From within #[arcane], call other #[arcane] functions freely — same or subset features means LLVM inlines.
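The shape of the rule, sketched with bare #[target_feature] since the boundary belongs to LLVM rather than archmage (illustrative x86-64-only code; with archmage you'd write #[arcane] functions taking a token instead):

```rust
use core::arch::x86_64::*; // x86-64 only sketch

// Good: cross the target-feature boundary once; the loop lives inside.
#[target_feature(enable = "avx2,fma")]
unsafe fn scale_all(chunks: &mut [[f32; 8]]) {
    unsafe {
        let two = _mm256_set1_ps(2.0);
        for chunk in chunks {
            let v = _mm256_loadu_ps(chunk.as_ptr());
            _mm256_storeu_ps(chunk.as_mut_ptr(), _mm256_mul_ps(v, two));
        }
    }
}

#[target_feature(enable = "avx2,fma")]
unsafe fn scale_one(chunk: &mut [f32; 8]) {
    unsafe {
        let v = _mm256_loadu_ps(chunk.as_ptr());
        _mm256_storeu_ps(chunk.as_mut_ptr(), _mm256_mul_ps(v, _mm256_set1_ps(2.0)));
    }
}

// Bad: one boundary crossing per iteration, so nothing inlines across
// it — the ~4x penalty in the benchmark above. (This sketch assumes the
// caller verified avx2+fma before calling.)
fn slow_caller(chunks: &mut [[f32; 8]]) {
    for chunk in chunks.iter_mut() {
        unsafe { scale_one(chunk) }
    }
}
```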

For trait impls, use #[arcane(_self = Type)] — a nested inner-function approach (since sibling would add methods not in the trait definition).

Auto-vectorization with #[autoversion]

Don't want to write intrinsics? Write plain scalar code and let the compiler vectorize it:

use archmage::prelude::*;

#[autoversion]
fn sum_of_squares(data: &[f32]) -> f32 {
    let mut sum = 0.0f32;
    for &x in data {
        sum += x * x;
    }
    sum
}

// Call directly — no token needed, no unsafe:
let result = sum_of_squares(&my_data);

#[autoversion] generates a separate copy of your function for each architecture tier — each compiled with #[target_feature] to unlock the auto-vectorizer — plus a runtime dispatcher that picks the best one. On x86-64 with AVX2+FMA, that loop compiles to vfmadd231ps (8 floats per cycle). On ARM, you get fmla. The _scalar fallback compiles without SIMD features as a safety net.

The macro auto-injects a hidden SimdToken parameter internally — you don't need to add one. The generated dispatcher has your original signature. You can optionally write _token: SimdToken if you prefer the explicit style, but it's not required.

What gets generated (default tiers):

  • sum_of_squares_v4(token: X64V4Token, ...) — AVX-512 (with avx512 feature)
  • sum_of_squares_v3(token: X64V3Token, ...) — AVX2+FMA
  • sum_of_squares_neon(token: NeonToken, ...) — AArch64 NEON
  • sum_of_squares_wasm128(token: Wasm128Token, ...) — WASM SIMD
  • sum_of_squares_scalar(token: ScalarToken, ...) — no SIMD
  • sum_of_squares(data: &[f32]) -> f32 — dispatcher (token param removed)

Explicit tiers: #[autoversion(v3, neon)]. scalar is always implicit. Use modifiers to tweak defaults: #[autoversion(+arm_v2)] adds a tier, #[autoversion(-wasm128)] removes one. Gate a tier on a Cargo feature: #[autoversion(v4(cfg(avx512)), v3, neon)].

For inherent methods, self works naturally — no special parameters needed. For trait method delegation, use #[autoversion(_self = MyType)] and _self in the body. See the full parameter reference or the API docs.

When to use which:

                  #[autoversion]             #[arcane] + incant!
  You write       Scalar loops               SIMD intrinsics
  Vectorization   Compiler auto-vectorizes   You choose the instructions
  Lines of code   1 attribute                Per-tier variants + dispatch
  Best for        Simple numeric loops       Hand-tuned SIMD kernels

SIMD types with magetypes

magetypes provides ergonomic SIMD vector types (f32x8, i32x4, etc.) with natural Rust operators. It's an exploratory companion crate — the API may change between releases.

[dependencies]
archmage = "0.9"
magetypes = "0.9"

use archmage::prelude::*;
use magetypes::simd::f32x8;

fn dot_product(a: &[f32], b: &[f32]) -> f32 {
    if let Some(token) = X64V3Token::summon() {
        dot_product_simd(token, a, b)
    } else {
        a.iter().zip(b).map(|(x, y)| x * y).sum()
    }
}

#[arcane]
fn dot_product_simd(token: X64V3Token, a: &[f32], b: &[f32]) -> f32 {
    let mut sum = f32x8::zero(token);
    for (a_chunk, b_chunk) in a.chunks_exact(8).zip(b.chunks_exact(8)) {
        let va = f32x8::load(token, a_chunk.try_into().unwrap());
        let vb = f32x8::load(token, b_chunk.try_into().unwrap());
        sum = va.mul_add(vb, sum);
    }
    sum.reduce_add()
}

f32x8 wraps __m256 on x86 with AVX2. On ARM/WASM, it's polyfilled with two f32x4 operations — same API, automatic fallback. The #[arcane] wrapper lets LLVM optimize the entire loop as a single SIMD region.
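The wrapper-type idea in miniature: a newtype over __m256 with Rust operator overloads. This is a hand-rolled sketch, not magetypes' actual code, and it carries the unsafe itself instead of taking a token:

```rust
use core::arch::x86_64::*; // x86-64 only sketch
use core::ops::Add;

#[derive(Clone, Copy)]
struct F32x8(__m256); // sketch of the f32x8 idea: a newtype over __m256

impl Add for F32x8 {
    type Output = F32x8;
    #[inline(always)]
    fn add(self, rhs: F32x8) -> F32x8 {
        // Value-based intrinsic. Inside an #[arcane]/#[target_feature]
        // caller this is safe; this sketch assumes AVX2 was verified
        // upstream and wraps the call itself.
        F32x8(unsafe { _mm256_add_ps(self.0, rhs.0) })
    }
}
```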

Tier naming conventions

incant! and #[autoversion] dispatch to suffixed functions. incant!(sum(data)) calls sum_v3, sum_neon, etc. These suffixes correspond to tokens:

  Suffix     Token          Arch     Key features
  _v1        X64V1Token     x86_64   SSE2 (baseline)
  _v2        X64V2Token     x86_64   + SSE4.2, POPCNT
  _v3        X64V3Token     x86_64   + AVX2, FMA, BMI2
  _v4        X64V4Token     x86_64   + AVX-512
  _neon      NeonToken      aarch64  NEON
  _arm_v2    Arm64V2Token   aarch64  + CRC, RDM, DotProd, AES, SHA2
  _arm_v3    Arm64V3Token   aarch64  + SHA3, I8MM, BF16
  _wasm128   Wasm128Token   wasm32   SIMD128
  _scalar    ScalarToken    any      No SIMD (always available)

By default, incant! tries _v4 (if the avx512 feature is enabled), _v3, _neon, _wasm128, then _scalar. You can restrict to specific tiers: incant!(sum(data), [v3, neon, scalar]). Tier names accept the _ prefix — _v3 is identical to v3, matching the suffix on generated function names.

Instead of restating the entire default list, use modifiers: [+arm_v2] adds a tier, [-wasm128] removes one, [+v4] makes v4 unconditional. Gate a tier on a Cargo feature with v4(cfg(avx512)) (shorthand: v4(avx512)). All entries must be modifiers or all plain — mixing is a compile error.

Always include scalar (or default) in explicit tier lists — it documents the fallback path. (Currently auto-appended if omitted; will become a compile error in v1.0.)

Runtime dispatch with incant! (alias: dispatch_variant!)

Write platform-specific variants with concrete types, then dispatch at runtime:

use archmage::prelude::*;
use magetypes::simd::f32x8;

// No #[cfg] needed — #[arcane] handles it
#[arcane]
fn sum_squares_v3(token: X64V3Token, data: &[f32]) -> f32 {
    let chunks = data.chunks_exact(8);
    let mut acc = f32x8::zero(token);
    for chunk in chunks {
        let v = f32x8::from_array(token, chunk.try_into().unwrap());
        acc = v.mul_add(v, acc);
    }
    acc.reduce_add() + chunks.remainder().iter().map(|x| x * x).sum::<f32>()
}

fn sum_squares_scalar(_token: ScalarToken, data: &[f32]) -> f32 {
    data.iter().map(|x| x * x).sum()
}

/// Dispatches to the best available at runtime — no #[cfg] needed.
fn sum_squares(data: &[f32]) -> f32 {
    incant!(sum_squares(data), [v3, scalar])
}

Each variant's first parameter is the matching token type — _v3 takes X64V3Token, _neon takes NeonToken, etc.

_scalar is mandatory. incant! always emits an unconditional call to fn_scalar(ScalarToken, ...) as the final fallback. If the _scalar function doesn't exist, you get a compile error — not a runtime failure. Always include scalar in explicit tier lists (e.g., [v3, neon, scalar]) to document this dependency. Currently scalar is auto-appended if omitted; this will become a compile error in v1.0.

incant! wraps each tier's call in #[cfg(target_arch)] and #[cfg(feature)] guards, so you only define variants for architectures you target. With no explicit tier list, incant! dispatches to v3, neon, wasm128, and scalar by default (plus v4 if the avx512 feature is enabled).

Gate a tier on a Cargo feature with the tier(cfg(feature)) syntax: incant!(sum(data), [v4(cfg(avx512)), v3, neon, scalar]). The shorthand v4(avx512) also works.

Use modifiers to tweak the default tier list without restating it: [+arm_v2] adds a tier, [-wasm128] removes one. All entries must be modifiers or all plain.

Known tiers: v1, v2, x64_crypto, v3, v3_crypto, v4, v4x, arm_v2, arm_v3, neon, neon_aes, neon_sha3, neon_crc, wasm128, scalar.

If you already have a token, use with to dispatch on its concrete type: incant!(func(data) with token, [v3, neon, scalar]). This uses IntoConcreteToken for compile-time monomorphized dispatch — no runtime summon.

Token position

Use Token to mark where the summoned token is placed: incant!(process(data, Token), [v3, scalar]) puts the token last. Without Token, the token is prepended.

Zero-overhead nesting

Inside #[arcane] or #[autoversion] bodies, incant! is automatically rewritten to a direct call — no runtime dispatch. The rewriter handles downcasting, upgrade attempts, and feature-gated tiers. See the dispatch docs for details.

Tokens

  Token              Alias       Features                               Hardware
  X64V1Token         Sse2Token   SSE, SSE2                              x86_64 baseline (always available)
  X64V2Token                     + SSE4.2, POPCNT                       Nehalem 2008+
  X64CryptoToken                 V2 + PCLMULQDQ, AES-NI                 Westmere 2010+
  X64V3Token                     + AVX2, FMA, BMI2                      Haswell 2013+, Zen 1+
  X64V3CryptoToken               V3 + VPCLMULQDQ, VAES                  Zen 3+ 2020, Alder Lake 2021+
  X64V4Token         Server64    + AVX-512 (requires avx512 feature)    Skylake-X 2017+, Zen 4+
  NeonToken          Arm64       NEON                                   All 64-bit ARM
  Arm64V2Token                   + CRC, RDM, DotProd, FP16, AES, SHA2   A55+, M1+, Graviton 2+
  Arm64V3Token                   + FHM, FCMA, SHA3, I8MM, BF16          A510+, M2+, Snapdragon X
  Wasm128Token                   WASM SIMD                              Compile-time only
  ScalarToken                    (none)                                 Always available

Higher tokens subsume lower ones: X64V4Token → X64V3Token → X64V2Token → X64V1Token. Downcasting is zero-cost. incant! handles #[cfg(target_arch)] gating automatically — you don't need to write cfg guards for cross-arch dispatch.

See token-registry.toml for the complete mapping of tokens to CPU features.

Testing SIMD dispatch paths

for_each_token_permutation tests every incant! dispatch path on your native hardware — no cross-compilation needed. It disables tokens one at a time, running your closure at each combination from "all SIMD enabled" down to "scalar only":

use archmage::testing::{for_each_token_permutation, CompileTimePolicy};

#[test]
fn sum_squares_matches_across_tiers() {
    let data: Vec<f32> = (0..1024).map(|i| i as f32).collect();
    let expected: f32 = data.iter().map(|x| x * x).sum();

    let report = for_each_token_permutation(CompileTimePolicy::Warn, |perm| {
        let result = sum_squares(&data);
        assert!(
            (result - expected).abs() < 1e-1,
            "mismatch at tier: {perm}"
        );
    });

    assert!(report.permutations_run >= 2, "expected multiple tiers");
}

On an AVX-512 machine this runs 5–7 permutations; on Haswell, 3. Tokens the CPU doesn't have are skipped automatically.

The testable_dispatch feature

When you compile with -Ctarget-cpu=native (or -Ctarget-cpu=haswell, etc.), Rust bakes the feature checks into the binary at compile time. X64V3Token::summon() compiles to a constant Some(token) — it can't be disabled at runtime, and for_each_token_permutation can't test fallback paths.

The testable_dispatch feature forces runtime detection even when compile-time detection would succeed. Enable it in dev-dependencies for full permutation coverage:

[dev-dependencies]
archmage = { version = "0.9", features = ["testable_dispatch"] }

CompileTimePolicy controls what happens when a token can't be disabled:

Policy Behavior
Warn Skip the token, collect a warning in the report
WarnStderr Same, plus print each warning to stderr
Fail Panic — use in CI with testable_dispatch where full coverage is expected

Serializing parallel tests with lock_token_testing()

Token disabling is process-wide — if cargo test runs tests in parallel, one test disabling X64V3Token would break another test that expects it to be available. for_each_token_permutation acquires an internal mutex automatically. If you need to disable tokens manually (via dangerously_disable_token_process_wide), wrap your test in lock_token_testing() to serialize against other permutation tests:

use archmage::testing::lock_token_testing;
use archmage::{X64V3Token, SimdToken};

#[test]
fn manual_disable_test() {
    let _lock = lock_token_testing();
    let baseline = my_function(&data);
    X64V3Token::dangerously_disable_token_process_wide(true).unwrap();
    let fallback = my_function(&data);
    X64V3Token::dangerously_disable_token_process_wide(false).unwrap();
    assert_eq!(baseline, fallback);
}

The lock is reentrant — for_each_token_permutation called from within a lock_token_testing scope works without deadlock.

For the full testing API, see the testing docs.

Feature flags

  Feature            Default  Description
  std                yes      Standard library (required for runtime detection)
  macros             yes      No-op (macros are always available); kept for backwards compatibility
  avx512             no       AVX-512 tokens (X64V4Token, X64V4xToken, Avx512Fp16Token)
  testable_dispatch  no       Makes token disabling work with -Ctarget-cpu=native

Acknowledgments

  • safe_unaligned_simd by okaneco — Reference-based wrappers for every SIMD load/store intrinsic across x86, ARM, and WASM. This crate closed the last unsafe gap: _mm256_loadu_ps taking *const f32 was the one thing you couldn't make safe without a wrapper. Archmage depends on it and re-exports its functions through import_intrinsics, shadowing core::arch's pointer-based versions automatically.

Image tech I maintain

State-of-the-art codecs* zenjpeg · zenpng · zenwebp · zengif · zenavif (rav1d-safe · zenrav1e · zenavif-parse · zenavif-serialize) · zenjxl (jxl-encoder · zenjxl-decoder) · zentiff · zenbitmaps · heic · zenraw · zenpdf · ultrahdr · mozjpeg-rs · webpx
Compression zenflate · zenzop
Processing zenresize · zenfilters · zenquant · zenblend
Metrics zensim · fast-ssim2 · butteraugli · resamplescope-rs · codec-eval · codec-corpus
Pixel types & color zenpixels · zenpixels-convert · linear-srgb · garb
Pipeline zenpipe · zencodec · zencodecs · zenlayout · zennode
ImageResizer ImageResizer (C#) — 24M+ NuGet downloads across all packages
Imageflow Image optimization engine (Rust) — .NET · node · go — 9M+ NuGet downloads across all packages
Imageflow Server The fast, safe image server (Rust+C#) — 552K+ NuGet downloads, deployed by Fortune 500s and major brands

* as of 2026

General Rust awesomeness

archmage · magetypes · enough · whereat · zenbench · cargo-copter

And other projects · GitHub @imazen · GitHub @lilith · lib.rs/~lilith · NuGet (over 30 million downloads / 87 packages)

License

MIT OR Apache-2.0
