Add ability to override function behaviour via registry in VortexSession #7588
robert3005 wants to merge 12 commits into develop
joseph-isaacs left a comment:
let's run benchmarks.
Benchmarks: PolarSignals Profiling
Vortex (geomean): 0.929x ➖
datafusion / vortex-file-compressed (0.929x ➖, 4↑ 0↓)
File Sizes: PolarSignals Profiling
No file size changes detected.
Benchmarks: FineWeb NVMe
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.947x ➖, 1↑ 0↓)
datafusion / vortex-compact (0.973x ➖, 0↑ 0↓)
datafusion / parquet (0.974x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.991x ➖, 2↑ 1↓)
duckdb / vortex-compact (0.990x ➖, 0↑ 0↓)
duckdb / parquet (0.946x ➖, 2↑ 0↓)
File Sizes: FineWeb NVMe
No file size changes detected.
Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.053x ➖, 0↑ 0↓)
datafusion / vortex-compact (1.049x ➖, 0↑ 0↓)
datafusion / parquet (1.064x ➖, 0↑ 3↓)
datafusion / arrow (1.083x ➖, 0↑ 7↓)
duckdb / vortex-file-compressed (1.063x ➖, 0↑ 2↓)
duckdb / vortex-compact (1.048x ➖, 0↑ 1↓)
duckdb / parquet (1.008x ➖, 3↑ 2↓)
duckdb / duckdb (1.053x ➖, 0↑ 2↓)
File Sizes: TPC-H SF=1 on NVME
No file size changes detected.
Benchmarks: TPC-DS SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.015x ➖, 1↑ 2↓)
datafusion / vortex-compact (1.012x ➖, 0↑ 1↓)
datafusion / parquet (1.013x ➖, 1↑ 2↓)
duckdb / vortex-file-compressed (1.001x ➖, 1↑ 1↓)
duckdb / vortex-compact (1.000x ➖, 2↑ 2↓)
duckdb / parquet (1.005x ➖, 0↑ 0↓)
duckdb / duckdb (1.015x ➖, 0↑ 3↓)
File Sizes: TPC-DS SF=1 on NVME
No file size changes detected.
Benchmarks: FineWeb S3
Verdict: No clear signal (environment too noisy)
datafusion / vortex-file-compressed (0.827x ➖, 2↑ 0↓)
datafusion / vortex-compact (1.024x ➖, 0↑ 1↓)
datafusion / parquet (0.977x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.841x ➖, 1↑ 0↓)
duckdb / vortex-compact (0.898x ➖, 0↑ 0↓)
duckdb / parquet (0.949x ➖, 0↑ 0↓)
Benchmarks: Random Access
Vortex (geomean): 0.949x ➖
unknown / unknown (0.959x ➖, 4↑ 0↓)
Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (1.007x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.010x ➖, 0↑ 0↓)
duckdb / parquet (0.992x ➖, 0↑ 0↓)
Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.049x ➖, 0↑ 0↓)
datafusion / vortex-compact (1.055x ➖, 0↑ 1↓)
datafusion / parquet (1.039x ➖, 0↑ 0↓)
datafusion / arrow (1.039x ➖, 0↑ 2↓)
duckdb / vortex-file-compressed (1.069x ➖, 0↑ 3↓)
duckdb / vortex-compact (1.068x ➖, 0↑ 2↓)
duckdb / parquet (1.036x ➖, 0↑ 0↓)
duckdb / duckdb (1.038x ➖, 0↑ 0↓)
File Sizes: Statistical and Population Genetics
No file size changes detected.
File Sizes: TPC-H SF=10 on NVME
No file size changes detected.
Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy)
datafusion / vortex-file-compressed (0.911x ➖, 2↑ 1↓)
datafusion / vortex-compact (0.999x ➖, 3↑ 3↓)
datafusion / parquet (1.153x ➖, 0↑ 4↓)
duckdb / vortex-file-compressed (0.975x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.007x ➖, 0↑ 0↓)
duckdb / parquet (0.966x ➖, 0↑ 0↓)
Benchmarks: Clickbench on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.938x ➖, 4↑ 0↓)
datafusion / parquet (0.955x ➖, 1↑ 0↓)
duckdb / vortex-file-compressed (0.982x ➖, 3↑ 3↓)
duckdb / parquet (0.991x ➖, 0↑ 0↓)
duckdb / duckdb (0.984x ➖, 2↑ 0↓)
File Sizes: Clickbench on NVME
File Size Changes (1 files changed, -0.0% overall, 0↑ 1↓)
Benchmarks: Compression
Vortex (geomean): 1.013x ➖
unknown / unknown (1.027x ➖, 0↑ 12↓)
Benchmarks: TPC-H SF=10 on S3
Verdict: No clear signal (environment too noisy)
datafusion / vortex-file-compressed (0.767x ➖, 7↑ 1↓)
datafusion / vortex-compact (0.942x ➖, 1↑ 1↓)
datafusion / parquet (0.939x ➖, 1↑ 1↓)
duckdb / vortex-file-compressed (0.968x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.922x ➖, 0↑ 0↓)
duckdb / parquet (0.954x ➖, 0↑ 0↓)
Signed-off-by: Robert Kruszewski <github@robertk.io>
I really think we want one lookup for either of the kernel types (execute/reduce)?
@joseph-isaacs I think I've applied the changes you were thinking about.
This logic isn't used yet, but it will let us customise the behaviour of
functions depending on the integration point, i.e. DataFusion can have its own
casting logic that differs from the Arrow casting logic, while everyone using
Vortex can continue calling `cast` without specializing for the engine
they're using.
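To make the idea concrete, here is a minimal sketch of a session that resolves a function through a registry and falls back to a default. It is not the code in this PR: `Session`, `FunctionRegistry`, `CastFn`, `default_cast` and `datafusion_cast` are hypothetical names made up for illustration, and the toy "cast" just turns an `i64` into a `String`.

```rust
use std::collections::HashMap;

/// Hypothetical signature for an overridable function (a toy i64 -> String cast).
pub type CastFn = fn(i64) -> String;

/// Default behaviour, used when the session has no override registered.
pub fn default_cast(value: i64) -> String {
    value.to_string()
}

/// Illustrative engine-specific behaviour an integration might register.
pub fn datafusion_cast(value: i64) -> String {
    format!("df:{}", value)
}

/// Hypothetical registry mapping a function name to its override.
#[derive(Default)]
pub struct FunctionRegistry {
    overrides: HashMap<String, CastFn>,
}

impl FunctionRegistry {
    /// Register (or replace) the override for `name`.
    pub fn register(&mut self, name: &str, f: CastFn) {
        self.overrides.insert(name.to_string(), f);
    }

    /// Return the registered override for `name`, or `default` if there is none.
    pub fn resolve(&self, name: &str, default: CastFn) -> CastFn {
        self.overrides.get(name).copied().unwrap_or(default)
    }
}

/// Hypothetical session that owns the registry.
#[derive(Default)]
pub struct Session {
    registry: FunctionRegistry,
}

impl Session {
    /// Builder-style helper: return the session with an override installed.
    pub fn with_override(mut self, name: &str, f: CastFn) -> Self {
        self.registry.register(name, f);
        self
    }

    /// Callers always call `cast`; the session decides which implementation runs.
    pub fn cast(&self, value: i64) -> String {
        let f = self.registry.resolve("cast", default_cast);
        f(value)
    }
}
```

With this shape, `Session::default().cast(42)` goes through `default_cast`, while a session built with `.with_override("cast", datafusion_cast)` runs the engine-specific version, and the call site stays the same in both cases.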
One thing to consider is whether we want to require passing a session to optimise, or whether we should remove the implicit optimise calls and defer them to the execute loop.
The next PR will replace the struct casting logic with Arrow- and
DataFusion-specific behaviour.
Signed-off-by: Robert Kruszewski <github@robertk.io>