DuckDB is a high-performance, in-process analytical database that lets you run SQL queries directly on your files.
It has some similarities with ClickHouse, especially with clickhouse-local.
Example snippets
Parquet to CSV
duckdb -cmd '.mode csv' -c "select * from './validation.parquet'" > validation.csv
Extensions
- https://duckdb.org/docs/extensions/overview
Python API
Segfault when importing
Sometimes when you import duckdb you get a segfault. If this happens, moving the duckdb import before your other imports can help.
Example
import duckdb
conn = duckdb.connect("test.duckdb")