fix: RepeatedScan::execute is CPU-work not I/O #7580
I also renamed `Handle::spawn_blocking` to `Handle::spawn_blocking_io`, which better reflects what it does.
Signed-off-by: Daniel King <dan@spiraldb.com>
robert3005
left a comment
This is effectively the same, you're just going through a different interface
In Spiral, it is not: we use a Rayon thread pool for CPU-heavy work.
But honestly, the overhead of actually spawning this (very tiny, very quick) work doesn't seem worth it. It makes my example lose about 2/5 of its throughput.
It can be pretty expensive, depending on how complex your projection expression and layout tree are.
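To make the distinction in this thread concrete, here is a minimal, std-only sketch of routing CPU-heavy work to its own fixed-size pool instead of a blocking-I/O pool. It is not the code from this PR: `CpuPool`, `spawn_cpu`, and `project` are hypothetical names invented for illustration, and a real setup (such as the Rayon pool mentioned above) would look different.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// A boxed unit of CPU work.
type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical fixed-size pool reserved for CPU-heavy work, so that
/// it does not compete with threads meant for blocking I/O.
pub struct CpuPool {
    sender: Option<Sender<Job>>,
    workers: Vec<JoinHandle<()>>,
}

impl CpuPool {
    pub fn new(threads: usize) -> Self {
        let (sender, receiver) = channel::<Job>();
        let receiver = Arc::new(Mutex::new(receiver));
        let workers = (0..threads)
            .map(|_| {
                let receiver = Arc::clone(&receiver);
                thread::spawn(move || loop {
                    // The lock is held only while dequeuing, not while running the job.
                    let job = receiver.lock().unwrap().recv();
                    match job {
                        Ok(job) => job(),
                        Err(_) => break, // sender dropped: shut the worker down
                    }
                })
            })
            .collect();
        CpuPool { sender: Some(sender), workers }
    }

    /// Queue `f` on the pool and return a receiver for its result.
    pub fn spawn_cpu<F, T>(&self, f: F) -> Receiver<T>
    where
        F: FnOnce() -> T + Send + 'static,
        T: Send + 'static,
    {
        let (tx, rx) = channel();
        let job: Job = Box::new(move || {
            // The caller may have dropped the receiver; ignore that case.
            let _ = tx.send(f());
        });
        self.sender
            .as_ref()
            .expect("pool is running")
            .send(job)
            .expect("workers are alive");
        rx
    }
}

impl Drop for CpuPool {
    fn drop(&mut self) {
        // Closing the channel makes every worker's `recv` fail, ending its loop.
        drop(self.sender.take());
        for worker in self.workers.drain(..) {
            let _ = worker.join();
        }
    }
}

/// Stand-in for a (very cheap) projection over one batch.
pub fn project(batch: &[u64]) -> u64 {
    batch.iter().map(|x| x * 2).sum()
}

fn main() {
    let pool = CpuPool::new(2);
    let batch: Vec<u64> = (1..=4).collect();

    // Running inline avoids the hand-off entirely.
    let inline = project(&batch);

    // Dispatching pays for boxing, a channel send, and a thread wake-up.
    let pooled = pool.spawn_cpu(move || project(&batch)).recv().unwrap();

    assert_eq!(inline, pooled);
    println!("{pooled}");
}
```

For work as small as `project` here, the hand-off can plausibly cost more than the computation itself, which is the trade-off raised above. Whether dispatching pays off depends on how expensive the real projection is.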
Benchmarks: PolarSignals Profiling
Vortex (geomean): 1.054x ➖
datafusion / vortex-file-compressed (1.054x ➖, 0↑ 1↓)
File Sizes: PolarSignals Profiling
No file size changes detected.
Benchmarks: FineWeb NVMe
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.962x ➖, 1↑ 0↓)
datafusion / vortex-compact (0.963x ➖, 1↑ 0↓)
datafusion / parquet (0.970x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.969x ➖, 1↑ 0↓)
duckdb / vortex-compact (0.969x ➖, 1↑ 0↓)
duckdb / parquet (0.962x ➖, 0↑ 0↓)
Full attributed analysis
File Sizes: FineWeb NVMe
No file size changes detected.
Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.107x ❌, 0↑ 14↓)
datafusion / vortex-compact (1.107x ❌, 0↑ 13↓)
datafusion / parquet (1.086x ➖, 0↑ 12↓)
datafusion / arrow (1.192x ❌, 0↑ 18↓)
duckdb / vortex-file-compressed (1.095x ➖, 0↑ 10↓)
duckdb / vortex-compact (1.093x ➖, 0↑ 13↓)
duckdb / parquet (1.064x ➖, 1↑ 7↓)
duckdb / duckdb (1.081x ➖, 0↑ 11↓)
Full attributed analysis
File Sizes: TPC-H SF=1 on NVME
No file size changes detected.
Benchmarks: TPC-DS SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.898x ✅, 53↑ 0↓)
datafusion / vortex-compact (0.903x ➖, 46↑ 0↓)
datafusion / parquet (0.909x ➖, 39↑ 1↓)
duckdb / vortex-file-compressed (0.900x ✅, 57↑ 0↓)
duckdb / vortex-compact (0.926x ➖, 32↑ 1↓)
duckdb / parquet (0.942x ➖, 9↑ 0↓)
duckdb / duckdb (0.910x ➖, 33↑ 0↓)
Full attributed analysis
File Sizes: TPC-DS SF=1 on NVME
No file size changes detected.
Benchmarks: FineWeb S3
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.378x ❌, 0↑ 5↓)
datafusion / vortex-compact (1.062x ➖, 0↑ 1↓)
datafusion / parquet (1.113x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.059x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.026x ➖, 0↑ 0↓)
duckdb / parquet (1.070x ➖, 0↑ 0↓)
Full attributed analysis
Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (1.000x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.992x ➖, 0↑ 0↓)
duckdb / parquet (1.004x ➖, 0↑ 0↓)
Full attributed analysis
File Sizes: Statistical and Population Genetics
No file size changes detected.
Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.990x ➖, 0↑ 0↓)
datafusion / vortex-compact (0.989x ➖, 0↑ 0↓)
datafusion / parquet (1.003x ➖, 0↑ 0↓)
datafusion / arrow (1.013x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (1.000x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.992x ➖, 0↑ 0↓)
duckdb / parquet (1.004x ➖, 0↑ 0↓)
duckdb / duckdb (0.991x ➖, 0↑ 0↓)
Full attributed analysis
File Sizes: TPC-H SF=10 on NVME
No file size changes detected.
Benchmarks: Clickbench on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.012x ➖, 1↑ 1↓)
datafusion / parquet (1.014x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.008x ➖, 0↑ 3↓)
duckdb / parquet (1.000x ➖, 0↑ 1↓)
duckdb / duckdb (1.013x ➖, 0↑ 1↓)
Full attributed analysis
File Sizes: Clickbench on NVME
File Size Changes (1 files changed, -0.0% overall, 0↑ 1↓)
Totals:
Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.314x ❌, 0↑ 10↓)
datafusion / vortex-compact (1.225x ➖, 0↑ 8↓)
datafusion / parquet (0.997x ➖, 1↑ 2↓)
duckdb / vortex-file-compressed (1.114x ➖, 0↑ 1↓)
duckdb / vortex-compact (1.118x ➖, 0↑ 0↓)
duckdb / parquet (1.187x ➖, 0↑ 3↓)
Full attributed analysis
Benchmarks: TPC-H SF=10 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.128x ➖, 0↑ 2↓)
datafusion / vortex-compact (1.075x ➖, 0↑ 1↓)
datafusion / parquet (1.128x ➖, 0↑ 4↓)
duckdb / vortex-file-compressed (1.087x ➖, 0↑ 2↓)
duckdb / vortex-compact (1.014x ➖, 0↑ 0↓)
duckdb / parquet (1.097x ➖, 0↑ 0↓)
Full attributed analysis
@robert3005 the benchmarks wouldn't show a change unless y'all used my CPUSegregatedExecutor which puts
I was curious whether `spawn_blocking` vs. `spawn` made a difference.