Getting Started
Welcome to clgraph! Get up and running in minutes.
What You'll Learn
This guide will help you:
- Install clgraph - Set up in your environment
- Parse SQL - Build your first lineage graph
- Explore lineage - Understand table and column dependencies
- Execute pipelines - Run SQL or generate Airflow DAGs
- Use advanced features - Metadata propagation, LLM documentation, pipeline splitting
Quick Links
Installation
Install from PyPI: pip install clgraph
Takes 2 minutes. Full guide →
Quick Start
5-minute tutorial to build your first pipeline:
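from clgraph import Pipeline

# Parse the SQL files in a directory into a pipeline (BigQuery dialect)
pipeline = Pipeline.from_sql_files("examples/sql_files/", dialect="bigquery")

# Explore lineage: list the tables, then trace one column back to its sources
# ("table" and "column" are placeholders for names from your own SQL)
tables = pipeline.table_graph.tables
sources = pipeline.trace_column_backward("table", "column")

# Execute the pipeline (my_executor is a placeholder for an executor you supply)
results = pipeline.run(executor=my_executor)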
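To make the idea behind the trace_column_backward call concrete, here is a toy sketch of backward column tracing in plain Python. It does not use clgraph and is not how clgraph is implemented; the UPSTREAM mapping, the table and column names, and the trace_backward helper are all hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical column-level lineage: each (table, column) key maps to the
# columns it is derived from. Columns that never appear as a key are sources.
UPSTREAM = {
    ("reports.daily_revenue", "revenue"): [("staging.orders", "amount")],
    ("staging.orders", "amount"): [("raw.orders", "amount_cents")],
    ("staging.orders", "customer_id"): [("raw.orders", "customer_id")],
}


def trace_backward(table, column):
    """Walk upstream breadth-first and return the source columns reached.

    A source column is one with no recorded upstream; tracing a source
    column returns that column itself.
    """
    start = (table, column)
    seen = {start}
    queue = deque([start])
    sources = []
    while queue:
        node = queue.popleft()
        parents = UPSTREAM.get(node, [])
        if not parents:
            sources.append(node)
        for parent in parents:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sources


print(trace_backward("reports.daily_revenue", "revenue"))
```

In this toy graph the revenue column traces back through staging.orders.amount to a single source, raw.orders.amount_cents.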
Examples
Real-world use cases:
- PII Compliance Audit - Find all sensitive data
- Impact Analysis - Know what breaks before making changes
- Multi-Schedule Pipelines - Different frequencies for different tables
- LLM Documentation - Auto-generate descriptions
- Root Cause Analysis - Trace data issues back to source
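As a rough illustration of what impact analysis means, the sketch below walks a table-level dependency graph forward to find everything downstream of a table you plan to change. It is a self-contained toy, not clgraph code: the DEPENDS_ON mapping, the table names, and the downstream_of helper are hypothetical.

```python
# Hypothetical table-level dependencies: each table maps to the tables it reads.
DEPENDS_ON = {
    "staging.orders": ["raw.orders"],
    "staging.customers": ["raw.customers"],
    "marts.revenue": ["staging.orders", "staging.customers"],
    "marts.churn": ["staging.customers"],
}


def downstream_of(table):
    """Return every table that directly or indirectly reads from `table`."""
    affected = set()
    frontier = [table]
    while frontier:
        current = frontier.pop()
        for candidate, inputs in DEPENDS_ON.items():
            # A candidate is affected if it reads from the current table.
            if current in inputs and candidate not in affected:
                affected.add(candidate)
                frontier.append(candidate)
    return sorted(affected)


print(downstream_of("raw.customers"))
```

Here a change to raw.customers would reach staging.customers and, through it, both marts tables, while a change to raw.orders would reach only staging.orders and marts.revenue.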
Learning Path
For New Users
- Install - Get clgraph set up
- Quick Start - Build your first pipeline
- Concepts: From SQL to Lineage Graph - Understand how it works
For Existing Projects
- Install - Add to your project
- Examples - Find your use case
- Concepts: Table Lineage & Orchestration - Learn execution patterns
For Production Deployment
- Quick Start: Generate Airflow DAG - Create production DAG
- Examples: Multi-Schedule Pipeline - Split by frequency
- Examples: Team-Based Split - Organize by ownership
Need Help?
- Common questions? Check the FAQ
- Bug reports? File an issue
- Feature requests? Start a discussion
Next Steps
Ready to dive in?
- Installation - Get set up (2 minutes)
- Quick Start - Build your first pipeline (5 minutes)
- Examples - See real-world use cases
Or explore the fundamentals:
- Concepts - How clgraph works
- API Documentation - Full reference