
Running dbt workflows locally with LocalStack for Snowflake

LocalStack enables you to fully emulate and test dbt workflows locally, without consuming Snowflake credits. This tutorial shows how to set up and run a full data pipeline, from ingestion to transformation and visualization, using dbt and the LocalStack for Snowflake emulator.


Introduction

Running and testing dbt workflows usually requires an active connection to the Snowflake Cloud. This setup is slow and costly: development consumes compute credits, large queries add latency, and feedback loops are long. This “testing tax” slows developers down and drives up cloud costs, making local development difficult for data teams.

LocalStack for Snowflake changes this. It emulates the Snowflake data cloud locally in a Docker container, allowing you to run Snowflake SQL queries and more. By mimicking the Snowflake API surface, you can connect tools like dbt to your local instance for fast development and testing, without needing to use the Snowflake Cloud until deployment.

This tutorial shows how to set up and run a full data pipeline locally using dbt and the LocalStack for Snowflake emulator. It uses a sample Smart Factory Monitoring App to demonstrate local development, testing, and orchestration of dbt transformations.

How LocalStack works with dbt

LocalStack’s Snowflake emulator imitates the Snowflake Cloud in your local environment, so dbt interacts with it the same way as with the real cloud. When you run dbt run or dbt test, the dbt-snowflake adapter sends API requests to LocalStack instead of Snowflake.

To make this seamless, LocalStack uses the DNS name snowflake.localhost.localstack.cloud, which resolves to 127.0.0.1. This keeps all traffic local and also handles SSL certificates, enabling HTTPS without manual setup.
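
Because localhost.localstack.cloud is a public DNS record that points to 127.0.0.1, you can verify the resolution yourself (assuming dig or a similar DNS tool is installed):

Terminal window
dig +short snowflake.localhost.localstack.cloud
# expected output: 127.0.0.1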

To connect dbt to LocalStack, you need to configure a dbt profile with the host pointing to the LocalStack DNS name and mock credentials.

Here is an example profiles.yml configuration:

localstack:
  outputs:
    dev:
      account: localstack
      host: snowflake.localhost.localstack.cloud
      port: 4566
      database: test
      password: test
      role: test
      schema: public
      threads: 1
      type: snowflake
      user: test
      warehouse: test
  target: dev

With the --profile localstack flag, all dbt commands are routed to the local Snowflake emulator. No changes are needed to your dbt models, tests, or project structure, so you can develop and test data transformations entirely locally.
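
Once the project is set up and the emulator is running (covered in the steps below), you can sanity-check the connection with dbt debug, which validates the profile and attempts a connection:

Terminal window
dbt debug --profile localstack --target dev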

Prerequisites

Before you begin, ensure you have the following tools installed and configured:

  • A LocalStack Auth Token with access to LocalStack for Snowflake
  • The localstack CLI and a running Docker installation
  • Python 3 with virtualenv and pip
  • The Snowflake CLI (snow)
  • The AWS CLI (optional, for inspecting local resources)

Step 1: Set up the project

This tutorial builds and runs a Smart Factory Monitoring App. It ingests sensor data (in the form of CSV files) from an S3 bucket, loads it using Snowpipe, transforms it with dbt, and visualizes it in a Streamlit dashboard. The full workflow runs locally with the Snowflake emulator.

Smart Factory Monitoring App architecture

1.1: Clone the Repository

First, clone the sample application repository and go to the project directory:

Terminal window
git clone https://github.com/localstack-samples/snowflake-smart-factory-app.git
cd snowflake-smart-factory-app

1.2: Install Dependencies

Next, install the required Python dependencies for the project by running:

Terminal window
virtualenv env
source env/bin/activate
pip install -r requirements.txt
dbt deps

These commands set up a virtual environment and install all necessary packages, including dbt-snowflake, streamlit, and the dbt project dependencies.
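
As an optional sanity check, you can confirm that the main tools are available inside the virtual environment:

Terminal window
dbt --version
streamlit version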

Step 2: Configure dbt for the Snowflake Emulator

To connect dbt to the LocalStack Snowflake emulator, set up a dbt profile. The repository includes a sample profiles.yml that you can use as-is or replace with your own.

profiles.yml
localstack:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: test
      user: test
      password: test
      role: test
      database: FACTORY_PIPELINE_DEMO
      schema: PUBLIC
      threads: 1
      # For LocalStack, you need to specify host and port
      host: snowflake.localhost.localstack.cloud
      port: 4566

The key configuration details are:

  • host: snowflake.localhost.localstack.cloud points to the local emulator.
  • port: 4566 is the default LocalStack port.
  • Credentials: Set user, password, account, etc., to any test value.

For this tutorial, the sample app uses the database FACTORY_PIPELINE_DEMO and the PUBLIC schema.

Step 3: Create Snowflake & AWS resources

Now, let’s create the Snowflake and AWS resources needed for the application.

3.1: Start LocalStack for Snowflake

First, start the LocalStack container with the Snowflake emulator. Make sure you set your LOCALSTACK_AUTH_TOKEN using the LocalStack CLI.

Terminal window
localstack auth set-token YOUR_AUTH_TOKEN
localstack start --stack snowflake

Replace YOUR_AUTH_TOKEN with your actual LocalStack Auth Token. This command will start LocalStack, including the AWS services (S3) and the Snowflake emulator needed for the application.
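
To confirm the container is ready before moving on, you can query LocalStack's health endpoint or use the CLI:

Terminal window
curl -s http://localhost:4566/_localstack/health
localstack status services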

3.2: Create Snowflake resources

To create your local Snowflake resources, you can run the SQL script (setup/01_setup_snowflake.sql) provided in the repository.

To execute this script, you can use the Snowflake CLI (snow) with a LocalStack connection configured:

Terminal window
snow sql --file setup/01_setup_snowflake.sql --connection localstack
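
If you don't have a localstack connection configured for the Snowflake CLI yet, you can create one pointing at the emulator. A minimal sketch, assuming your snow version accepts these options as flags (the same values can also be added to ~/.snowflake/config.toml):

Terminal window
snow connection add --connection-name localstack \
  --account test --user test --password test \
  --host snowflake.localhost.localstack.cloud --port 4566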

The script:

  • Creates a database and table for sensor readings
  • Sets up an external stage connected to a local S3 bucket
  • Configures a Snowpipe with AUTO_INGEST enabled

Any CSV file uploaded to the S3 bucket is then automatically ingested into the table for transformation.
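
You can spot-check the objects the script created from the command line; for example (assuming the emulator supports these standard Snowflake SHOW commands, and using the database name from the dbt profile):

Terminal window
snow sql -q "SHOW TABLES IN DATABASE FACTORY_PIPELINE_DEMO" --connection localstack
snow sql -q "SHOW PIPES IN DATABASE FACTORY_PIPELINE_DEMO" --connection localstack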

3.3: Create AWS resources

Next, create a local S3 bucket to store raw sensor data. You also need to set up bucket notifications to trigger the Snowpipe on new file uploads.

Run the provided script, which uses the AWS SDK for Python to configure this:

Terminal window
python setup/02_configure_s3_bucket.py

Use the AWS CLI or the LocalStack Web Application to list local buckets and verify the new bucket.
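
For example, with LocalStack's awslocal wrapper (or the plain AWS CLI pointed at the LocalStack endpoint):

Terminal window
awslocal s3 ls
# or, without the wrapper:
aws --endpoint-url=http://localhost:4566 s3 ls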

LocalStack S3 Bucket List

Step 4: Run the End-to-End Pipeline Locally

Now, let’s bring the entire data pipeline to life on your local machine.

4.1: Upload the initial data

To kick off the data pipeline, upload the first batch of sensor data (data/sensor_data_batch_1.csv) to the S3 bucket.

Uploading the file triggers the Snowpipe, which ingests the data into the raw table in the local Snowflake instance. Run the upload script:

Terminal window
python setup/03_upload_file.py
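
To confirm Snowpipe picked up the file, you can count rows in the landing table. The table name below is a placeholder; check setup/01_setup_snowflake.sql for the name the sample app actually uses:

Terminal window
# sensor_data here is illustrative; substitute the table created by the setup script
snow sql -q "SELECT COUNT(*) FROM FACTORY_PIPELINE_DEMO.PUBLIC.sensor_data" --connection localstack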

4.2: Run dbt Transformations

With the raw data loaded, you can now run your dbt models to transform it. Run the following command:

Terminal window
dbt run --profile localstack

This command executes the dbt models defined in the project. The output should look similar to this:

Terminal window
01:15:21 Found 3 models, 39 data tests, 1 source, 573 macros
01:15:21
01:15:24 Concurrency: 1 threads (target='dev')
01:15:24
01:15:24 1 of 3 START sql incremental model PUBLIC_raw.sensor_data ...................... [RUN]
01:15:27 1 of 3 OK created sql incremental model PUBLIC_raw.sensor_data ................. [SUCCESS 1 in 2.32s]
01:15:27 2 of 3 START sql view model PUBLIC_staging.sensor_readings_view ................ [RUN]
01:15:27 2 of 3 OK created sql view model PUBLIC_staging.sensor_readings_view ........... [SUCCESS 1 in 0.61s]
01:15:27 3 of 3 START sql table model PUBLIC_marts.machine_health_metrics ............... [RUN]
01:15:32 3 of 3 OK created sql table model PUBLIC_marts.machine_health_metrics .......... [SUCCESS 1 in 5.07s]
01:15:32
01:15:32 Finished running 1 incremental model, 1 view model, 1 table model in 0 hours 0 minutes and 11.47 seconds (11.47s).
01:15:32
01:15:32 Completed successfully

Because LocalStack offers high parity with the real Snowflake API, you don’t need to change your models or SQL code. Your project just works locally.
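
You can also inspect the transformed mart directly. The fully qualified name below is inferred from the dbt output above (database FACTORY_PIPELINE_DEMO, schema PUBLIC_marts); adjust it if your project generates different schema names:

Terminal window
snow sql -q "SELECT * FROM FACTORY_PIPELINE_DEMO.PUBLIC_marts.machine_health_metrics LIMIT 5" --connection localstack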

4.3: Run dbt tests

After running dbt run to transform the data, use dbt test to check data quality and integrity. Tests can assert conditions like no nulls or unique primary keys. Running these tests locally provides an immediate feedback loop.

You can execute all the tests defined in the dbt project:

Terminal window
dbt test --profile localstack

The sample project includes several data quality tests defined in the schema.yml files.

The output will look similar to this, indicating that all tests have passed:

Terminal window
01:16:28 37 of 39 START test table_not_empty_processed_machine_health_ .................. [RUN]
01:16:28 37 of 39 PASS table_not_empty_processed_machine_health_ ........................ [PASS in 0.77s]
01:16:28 38 of 39 START test table_not_empty_sensor_readings_view_ ...................... [RUN]
01:16:29 38 of 39 PASS table_not_empty_sensor_readings_view_ ............................ [PASS in 0.79s]
01:16:29 39 of 39 START test unique_processed_machine_health_machine_id ................. [RUN]
01:16:30 39 of 39 PASS unique_processed_machine_health_machine_id ....................... [PASS in 0.80s]
01:16:30
01:16:30 Finished running 39 data tests in 0 hours 0 minutes and 37.63 seconds (37.63s).
01:16:30
01:16:30 Completed successfully
01:16:30
01:16:30 Done. PASS=39 WARN=0 ERROR=0 SKIP=0 TOTAL=39
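
While iterating, you rarely need the full suite; dbt's node selection lets you re-run only the tests attached to a single model:

Terminal window
dbt test --profile localstack --select machine_health_metrics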

This local validation helps catch issues early. For example, if the FILE_FORMAT in setup/01_setup_snowflake.sql used the wrong delimiter, Snowpipe would load the data incorrectly.

A not_null test would fail immediately, showing the problem before it affects production. This fast feedback loop is a key benefit of using LocalStack.

4.4: Visualize the Data

Finally, you can use a Streamlit web app to query and visualize the transformed data.

Terminal window
cd app/src/module-ui/src && streamlit run ui.py

This starts the Streamlit dashboard, which connects to the local Snowflake emulator, queries the machine_health_metrics model, and displays a real-time monitoring view for the smart factory.

LocalStack Streamlit Dashboard

Step 5: Automate with an Orchestration Pipeline

Beyond manual runs, you can also test and run automated data workflows. The sample project includes a Dagster pipeline for orchestrating dbt transformations.

Start the Dagster pipeline:

Terminal window
dagster dev -f pipeline/assets.py

This launches the Dagster UI, where you can see your dbt assets. The pipeline includes an S3 sensor that automatically triggers your dbt models whenever a new file is uploaded.

To test this, upload a new batch of sensor data:

Terminal window
python setup/03_upload_file.py --latest

This command uploads sensor_data_batch_2.csv to S3. In the Dagster UI, a new run starts, triggering the dbt workflow to process the data.

LocalStack Dagster Pipeline

This shows that you can build and test fully automated data pipelines locally, not just individual dbt models.

Conclusion

LocalStack for Snowflake enables data engineers to shift left by running the data cloud locally. This tutorial showed how to build, test, and run a full dbt data pipeline without cloud costs or remote infrastructure.

By using a local emulator, you can:

  • Prototype quickly with safe tests and fast feedback
  • Avoid the “testing tax” of using the real cloud for development
  • Test dbt models and full data workflows before production
  • Share reproducible development environments using Cloud Pods

This local-first development model leads to higher-quality data products, faster development cycles, and more efficient data teams.


Harsh Mishra
Engineer at LocalStack
Harsh Mishra is an Engineer at LocalStack and an AWS Community Builder. He has previously worked at HackerRank, Red Hat, and Quansight, and specializes in DevOps, Platform Engineering, and CI/CD pipelines.