
Getting Started

Welcome to clgraph! Get up and running in minutes.


What You'll Learn

This guide will help you:

  1. Install clgraph - Set up in your environment
  2. Parse SQL - Build your first lineage graph
  3. Explore lineage - Understand table and column dependencies
  4. Execute pipelines - Run SQL or generate Airflow DAGs
  5. Use advanced features - Metadata propagation, LLM documentation, pipeline splitting

Installation

Install from PyPI:

pip install clgraph

Takes 2 minutes. Full guide →


Quick Start

5-minute tutorial to build your first pipeline:

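from clgraph import Pipeline

# Parse every SQL file in the directory using the BigQuery dialect
pipeline = Pipeline.from_sql_files("examples/sql_files/", dialect="bigquery")

# Explore lineage ("table" and "column" are placeholders for names in your SQL)
tables = pipeline.table_graph.tables
sources = pipeline.trace_column_backward("table", "column")

# Execute (my_executor is a placeholder for an executor you define beforehand)
results = pipeline.run(executor=my_executor)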

Full tutorial →
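
To make the idea of tracing a column backward concrete, here is a minimal, self-contained sketch in plain Python. It does not use clgraph and is not a description of how clgraph works internally. The COLUMN_PARENTS mapping, the table and column names, and the trace_backward helper are all hypothetical, chosen only to illustrate walking column dependencies back to their original sources.

```python
from collections import deque

# Hypothetical column-level lineage: each (table, column) maps to the
# (table, column) pairs it is computed from. Not produced by clgraph.
COLUMN_PARENTS = {
    ("reports.revenue", "total_usd"): [("staging.orders", "amount_usd")],
    ("staging.orders", "amount_usd"): [
        ("raw.orders", "amount"),
        ("raw.fx_rates", "usd_rate"),
    ],
}


def trace_backward(table, column):
    """Return the source columns (those with no parents) that feed a column."""
    start = (table, column)
    seen = {start}
    queue = deque([start])
    sources = []
    while queue:
        node = queue.popleft()
        parents = COLUMN_PARENTS.get(node, [])
        # A column with no recorded parents is treated as an original source.
        if not parents and node != start:
            sources.append(node)
        for parent in parents:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(sources)


sources = trace_backward("reports.revenue", "total_usd")
print(sources)
```

In this toy graph the report column depends on one staging column, which in turn is derived from two raw columns, so the walk ends at those two raw columns. The breadth-first traversal with a seen set keeps the walk finite even if the mapping contains shared ancestors.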


Examples

Real-world use cases:

  • PII Compliance Audit - Find all sensitive data
  • Impact Analysis - Know what breaks before making changes
  • Multi-Schedule Pipelines - Different frequencies for different tables
  • LLM Documentation - Auto-generate descriptions
  • Root Cause Analysis - Trace data issues back to source

See all examples →
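
As a rough illustration of the impact analysis use case listed above, the sketch below finds every table downstream of a table you plan to change. It is plain Python on a hand-written dependency mapping, not clgraph's API. TABLE_DEPENDS_ON, the table names, and downstream_of are hypothetical names introduced only for this example.

```python
# Hypothetical table-level dependencies: table -> tables it reads from.
# Not produced by clgraph; written by hand for illustration.
TABLE_DEPENDS_ON = {
    "staging.orders": ["raw.orders"],
    "staging.customers": ["raw.customers"],
    "marts.revenue": ["staging.orders", "staging.customers"],
    "marts.churn": ["staging.customers"],
}


def downstream_of(changed_table):
    """Return every table that directly or transitively reads from changed_table."""
    # Invert the mapping: table -> tables that read from it.
    readers = {}
    for table, upstreams in TABLE_DEPENDS_ON.items():
        for upstream in upstreams:
            readers.setdefault(upstream, []).append(table)

    # Depth-first walk over the inverted edges, collecting each reader once.
    affected = set()
    stack = [changed_table]
    while stack:
        current = stack.pop()
        for reader in readers.get(current, []):
            if reader not in affected:
                affected.add(reader)
                stack.append(reader)
    return sorted(affected)


affected = downstream_of("raw.customers")
print(affected)
```

Here a change to raw.customers reaches staging.customers and, through it, both marts tables, while staging.orders is unaffected. Inverting the "reads from" mapping first turns the question "what breaks?" into a simple forward graph walk.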


Learning Path

For New Users

  1. Install - Get clgraph set up
  2. Quick Start - Build your first pipeline
  3. Concepts: From SQL to Lineage Graph - Understand how it works

For Existing Projects

  1. Install - Add to your project
  2. Examples - Find your use case
  3. Concepts: Table Lineage & Orchestration - Learn execution patterns

For Production Deployment

  1. Quick Start: Generate Airflow DAG - Create production DAG
  2. Examples: Multi-Schedule Pipeline - Split by frequency
  3. Examples: Team-Based Split - Organize by ownership

