What is Dagster?
Dagster is a data orchestration platform for building, running, and observing ETL/ELT and AI/ML pipelines.
It orchestrates transformations across dbt, Databricks, and Python, and moves data between SaaS sources and warehouses such as Snowflake and BigQuery.
Features include pipeline scheduling, experiment tracking, model training workflows, and real-time health metrics for freshness, performance, cost, and reliability.
Built-in observability provides dataset lineage, auto-generated documentation, alerting with Slack integration, and tools for impact analysis and debugging.
The data catalog and lineage tools help teams discover datasets, assign ownership, and maintain up-to-date metadata.
Compass surfaces context-aware answers from warehouse data for business users, while governance is enforced through GitOps and team-level controls.
Dagster pricing
Free trial. Verify on the official pricing page.
Dagster's key features
- Unified control plane for building, scaling, and observing AI and data pipelines
- Orchestration of ETL/ELT pipelines and data transformations (dbt, Databricks, Python) with integrations to warehouses like Snowflake and BigQuery
- Native support for AI/ML workflows including data preparation, model training, and experiment tracking
- Integrated observability with built-in lineage, real-time health metrics, alerting, and auto-generated dataset documentation
- Enterprise-grade security and deployment: SSO, RBAC, SCIM, SAML, SOC 2/HIPAA compliance, multi-tenant deployments, audit logs, and flexible cloud/region deployment options
Dagster use cases
- Orchestrate end-to-end ETL/ELT workflows with Dagster, integrating dbt, Databricks, Python and SaaS sources to load and transform data into your warehouse, schedule jobs, monitor runs and dataset lineage, and enforce governance and enterprise security
- Automate AI/ML pipeline orchestration using Dagster to coordinate data ingestion, feature engineering, distributed model training on Databricks, model lineage and versioning, and continuous retraining with real-time observability and alerts
- Centralize dataset cataloging and compliance by using Dagster to capture metadata and lineage across pipelines, provide searchable data catalogs for analysts, apply access controls and governance policies, and monitor data quality and provenance in production
Who is it for?
- Data engineers
- Data scientists
- ML engineers
- Analytics engineers
- Data platform teams
- Enterprise data teams