Use Azurite with DuckDB for local development

My favourite tool for ad-hoc data work is now DuckDB.

Here I will show you how to use DuckDB together with Azurite, the Azure Storage emulator. First, start Azurite in Docker:

docker run -d --name azurite -p 10000:10000 -p 10001:10001 -p 10002:10002 mcr.microsoft.com/azure-storage/azurite

Once the emulator is running, create a container (the query below expects one named testing) and add some files (*.csv or *.parquet) to it.
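
One way to script this step is the azure-storage-blob Python package. That choice is an assumption; Azure Storage Explorer or the Azure CLI would work just as well. The sketch below creates a container and uploads every CSV and Parquet file from a local folder. The container name testing matches the query later in this post, while the folder ./data and the helper names are made up for illustration.

```python
from pathlib import Path

# Well-known development credentials that Azurite ships with.
AZURITE_CONNECTION_STRING = (
    "DefaultEndpointsProtocol=http;"
    "AccountName=devstoreaccount1;"
    "AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;"
    "BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;"
)


def find_data_files(folder):
    """Return the CSV and Parquet files directly inside folder, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in {".csv", ".parquet"}
    )


def upload_folder(folder, container_name="testing"):
    """Create the container if needed and upload the data files into it."""
    # Imported here so find_data_files works without the SDK installed.
    # Requires: pip install azure-storage-blob
    from azure.core.exceptions import ResourceExistsError
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(AZURITE_CONNECTION_STRING)
    container = service.get_container_client(container_name)
    try:
        container.create_container()
    except ResourceExistsError:
        pass  # the container is already there from an earlier run

    for path in find_data_files(folder):
        with path.open("rb") as data:
            # overwrite=True makes the script safe to run repeatedly
            container.upload_blob(name=path.name, data=data, overwrite=True)


# Example call (hypothetical folder name), with Azurite running:
# upload_folder("./data", "testing")
```

Blobs are uploaded under their plain file names, so a file ./data/sales.csv ends up as az://testing/sales.csv.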

Then start the DuckDB CLI. Install the Azure extension and create a secret that holds a connection string for the local emulator:

INSTALL azure;
LOAD azure;

CREATE SECRET secret (
    TYPE AZURE,
    CONNECTION_STRING 'DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;'
);

NB: Don’t be alarmed by the account key in the example above; it is the well-known key that the local emulator uses by default!

Once the above installation and configuration are done, you can start using DuckDB to query the files in the Azurite container:

SELECT COUNT(*) FROM 'az://testing/*.csv';
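
The same steps can also be run from a script instead of the CLI. Below is a minimal sketch using the duckdb Python package. That package is an assumption, since this post itself only uses the CLI, and the function names are made up for illustration. The container testing is expected to already hold some CSV files.

```python
# Well-known development credentials that Azurite ships with.
AZURITE_CONNECTION_STRING = (
    "DefaultEndpointsProtocol=http;"
    "AccountName=devstoreaccount1;"
    "AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;"
    "BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;"
)


def setup_statements(connection_string=AZURITE_CONNECTION_STRING):
    """Return the SQL statements that load the extension and create the secret."""
    return [
        "INSTALL azure;",
        "LOAD azure;",
        f"CREATE SECRET secret (TYPE AZURE, CONNECTION_STRING '{connection_string}');",
    ]


def count_rows(pattern="az://testing/*.csv"):
    """Run the setup, then count the rows in all files matching pattern."""
    import duckdb  # requires: pip install duckdb

    con = duckdb.connect()  # in-memory database, like a fresh CLI session
    for statement in setup_statements():
        con.execute(statement)
    return con.execute(f"SELECT COUNT(*) FROM '{pattern}'").fetchone()[0]


# Example call, with Azurite running and files uploaded:
# print(count_rows())
```

Keeping the setup statements in one function means the extension and the secret are configured the same way every time the script starts a new in-memory session.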