1000X Faster SQL Linting
We introduce SDF lint, a high-performance, dialect-specific SQL linter written in Rust.
Hygiene, readability, consistency, and correctness of SQL code are some of the many barriers to moving forward with self-serve data. Whether it be convoluted tables with a growing number of columns, or a 1,000+ line unformatted chain of SQL CTEs conspicuously named “dim_person”, many data teams struggle to scale while keeping engineering efficiency high. If a smart analyst who writes SQL for a living can’t understand the SQL behind their data pipelines, you’ve got a problem.
Code is read more often than it is written. In software development, linters and formatters have been standardizing code quality, development expectations, and readability for decades. They help engineers focus on logic rather than layout. They automate consistency and streamline collaboration.
Today we’re excited to announce high-performance SQL linting and formatting with SDF, providing 100X to 1000X performance improvements over the existing standard. The result: linting and formatting become virtually zero-cost operations in daily development and in CI/CD. Both the Linter and Formatter have common-sense defaults, so configuration is minimal and the out-of-the-box behavior should work for most engineering teams. SDF lint and format are supported for all SDF projects (including macros!) and non-templated general SQL. Support for dbt projects and dbt templating is coming soon.
As of SDF release v0.10.0-p, we introduce SDF-native SQL linting and formatting, with up to 1000X performance increases over SQLFluff in large SQL projects.
In SQL development, the current de facto standard for linting is SQLFluff, which provides sensible rules and a high degree of configurability. Unfortunately, SQLFluff has severe performance limitations (due to its Python runtime) and only provides high-level syntax reviews rather than feedback driven by a deep semantic understanding of SQL.
SDF’s SQL Linter and Formatter have a high degree of compatibility with SQLFluff but are based on SDF’s own dialect-specific, all-Rust SQL parsers and highly parallelized visitor algorithms. This results in incredible performance and out-of-the-box compatibility with every SDF workspace.
In fact, SDF lint is so fast that it is primarily limited by the time needed to read files from disk!
Benefits of SDF Lint
Linters catch common mistakes (like syntax errors) before code is executed. They enforce coding standards and help maintain a consistent style across an organization. By increasing uniformity, they make code reviews and collaboration between engineers easier and more fluid.
SDF Lint offers major improvements to the developer experience, and to organizations keen to standardize and unify their SQL code.
SDF lint is remarkably fast and accurate, with massive performance improvements over the current standard, SQLFluff. SDF’s linter is written in Rust, highly parallelized, and underpinned by proper ANTLR grammar definitions for supported SQL dialects.
SDF lint is easy to use and integrated into every release of SDF. There are no package dependencies, Python virtual environments, or integrations to manage.
SDF Lint supports all Jinja macros and configuration (variables) within an SDF workspace.
SDF Lint strives to be compatible with SQLFluff for syntax rules. SQLFluff aliases are provided for SDF rules where possible. Check out the linter rules as they compare to SQLFluff here.
SDF Lint can be used independently of SDF’s transformation layer on raw SQL. All that’s needed is an SDF workspace specifying SQL file include paths or directories, and optionally a linting configuration.
Easy Configuration & Sensible Defaults
Every SDF workspace now implicitly includes the linter configuration below.
workspace:
  name: my_workspace
  ...
  defaults:
    dialect: snowflake
---
# This is the default lint configuration
sdf-args:
  lint: >
    -w capitalization-keywords=consistent
    -w capitalization-literals=consistent
    -w capitalization-types=consistent
    -w capitalization-functions=consistent
    -w references-quoting
    -w structure-else-null
    -w structure-unused-cte
    -w structure-distinct
    -w convention-terminator
To modify the linter defaults, add an sdf-args block to your SDF YML configuration, or just specify the rules you’d like on the command line when running sdf lint.
Run sdf lint --help to learn more about SDF’s lint rules and configuration options.
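To give a feel for what a rule such as capitalization-keywords=consistent is after, here is a toy sketch in Python. It is not SDF’s implementation, which works on a full dialect-specific parse in Rust. The keyword list, the function names, and the reading of “consistent” as “match the case of the first keyword in the file” are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny keyword list. A real linter takes its
# keywords from the dialect grammar rather than a hard-coded set.
KEYWORDS = {"select", "from", "where", "group", "by", "order", "as", "join", "on", "and", "or"}

# Bare words only; this toy tokenizer knows nothing about SQL syntax.
WORD = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


def case_style(word):
    """Classify a word as 'upper', 'lower', or 'mixed'."""
    if word.isupper():
        return "upper"
    if word.islower():
        return "lower"
    return "mixed"


def keyword_case_violations(sql):
    """Return (line, column, keyword) for every keyword whose case differs
    from the case of the first keyword in the text.

    Assumption: 'consistent' means 'same style as the first keyword seen'.
    String literals, comments, and quoted identifiers are NOT skipped here.
    """
    expected = None
    violations = []
    for line_no, line in enumerate(sql.splitlines(), start=1):
        for match in WORD.finditer(line):
            word = match.group()
            if word.lower() not in KEYWORDS:
                continue
            style = case_style(word)
            if expected is None:
                expected = style  # the first keyword sets the expected style
            elif style != expected:
                violations.append((line_no, match.start() + 1, word))
    return violations


sample = "SELECT id, name\nfrom users\nWHERE id > 10"
for line_no, col, word in keyword_case_violations(sample):
    print(f"{line_no}:{col} keyword '{word}' does not match the file's keyword case")
```

On the sample above, only the lowercase `from` on line 2 is reported, because the first keyword (`SELECT`) set the expected style to uppercase. A word-level scan like this misfires on keywords inside strings or comments, which is exactly the kind of case a parser-based linter can handle properly.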
Getting Started
SDF Lint is now available in preview build v0.10.0-p.
For more, see the linter documentation, or join SDF’s community slack!
To get started:
Download SDF Preview. SDF has two release channels: stable and preview. The linter and formatter are in preview today and will be promoted to stable in a future release.
To install SDF Preview, see the installation documentation.
If you already have SDF installed, run:
sudo sdf system update -p
to join the preview channel.
Create or `cd` into your SDF workspace.
That’s it - start linting!
Common commands:
sdf lint
→ Runs the linter on your whole project
sdf lint path/to/file.sql
→ Runs the linter on a specific file
sdf lint --fix
→ Auto-fixes rules where possible
sdf format
→ Formats all SQL files captured in the workspace
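Because these are ordinary command-line invocations, they are easy to script, for example as a pre-commit or CI step. The sketch below is a thin Python wrapper that uses only the commands listed above. The function names are hypothetical, and it assumes `sdf` is on the PATH and is run from the workspace directory. How `sdf lint` reports findings through its exit status is not covered in this post, so the wrapper simply returns whatever exit code it receives.

```python
import shutil
import subprocess


def lint_command(path=None, fix=False):
    """Build one of the `sdf lint` invocations listed above.

    Only forms shown in the post are produced: `sdf lint`,
    `sdf lint <path>`, or `sdf lint --fix`.
    """
    if path is not None and fix:
        # Combining a path with --fix is not shown in the post,
        # so this sketch refuses to guess whether it is supported.
        raise ValueError("pass either a path or fix=True, not both")
    command = ["sdf", "lint"]
    if path is not None:
        command.append(path)
    if fix:
        command.append("--fix")
    return command


def run(command, cwd="."):
    """Run a command inside the workspace directory and return its exit code."""
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} is not installed or not on PATH")
    return subprocess.run(command, cwd=cwd).returncode
```

For example, `run(lint_command())` lints the whole workspace, `run(lint_command(path="path/to/file.sql"))` lints a single file, and `run(["sdf", "format"])` formats the workspace.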
Limitations
As of this initial release, SDF supports a limited set of dialects.
🟢 Snowflake
🟢 BigQuery
🟡 Redshift
🔴 Trino
🔴 Other SQL Dialects (not immediately planned)
We look forward to supporting more dialects, and reaching higher parity with SQLFluff soon.
Summary
SDF is focused on building first-class, high-performance tooling for data development. Underpinned by best-in-class semantic understanding of many SQL dialects, our mission is to provide a next-generation transformation layer and dialect-agnostic database engine.