Progression of DuckDB Performance

Parviz Deyhim
Dec 4, 2023

In my previous blog post, I showed that DuckDB is incredibly fast when it has enough memory. For this new benchmark, I compared DuckDB across three versions, 0.7.0, 0.8.1, and 0.9.1, to understand how its performance has evolved. The benchmark used the 5 GB TPC-H dataset stored as partitioned Parquet files and ran on Google Cloud on a large VM type to minimize external performance impacts.

I ran the benchmark on Benchops.io, a platform I’ve developed that automates various aspects of running benchmarks. I believe benchmarks should be easy to run, analyze, and replicate, and that is what I’m hoping to achieve with this platform. It is still early days, so if anyone is interested in collaborating on it, please get in touch with me.
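For readers who want to try a similar version comparison by hand, here is a minimal sketch of a timing harness. It is not how Benchops.io works internally; the function names, the Parquet path, and the query are all assumptions made for illustration. The idea is to install one DuckDB version per Python environment, run the same script in each, and compare the summaries.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_runs(run: Callable[[], object], repeats: int = 5) -> List[float]:
    """Call `run` `repeats` times and return each wall-clock duration in seconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Reduce raw timings to a few figures that are easy to compare across versions."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


def duckdb_query_runner(parquet_glob: str, sql_template: str) -> Callable[[], object]:
    """Build a zero-argument callable that runs one query over Parquet files.

    duckdb is imported lazily so the timing helpers above work without it.
    `sql_template` must contain a `{src}` placeholder for the table source.
    """
    import duckdb  # one version per environment, e.g. pip install duckdb==0.9.1

    con = duckdb.connect()  # in-memory database
    sql = sql_template.format(src=f"read_parquet('{parquet_glob}')")
    return lambda: con.execute(sql).fetchall()


# Example usage (hypothetical path and query, not taken from the report):
#   run = duckdb_query_runner(
#       "tpch/lineitem/*.parquet",
#       "SELECT l_returnflag, count(*) FROM {src} GROUP BY l_returnflag",
#   )
#   print(summarize(time_runs(run, repeats=5)))
```

Reporting the median alongside the minimum and maximum makes it easier to notice when a single slow run, rather than the DuckDB version itself, is responsible for a difference.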

The full report, Progression of DuckDB Performance, is available here:

https://www.benchops.io/report?id=79b0fab9-68f4-46a8-82ec-7a46adcc1506

Future steps include investigating the potential impact of noisy-neighbor CPU contention on the results and testing DuckDB with larger datasets to gain further insight into its scalability and efficiency.

Parviz Deyhim

Data lover and cloud architect @databricks (ex-google, ex-aws)