Sql-Server-Table-Reconciliation
Sql-Server-Table-Reconciliation是一款data方向的AI技能,核心价值是Use when: comparing SQL Server tables across instances, data migration validation, ETL verification, row mismatch detection, schema drift, reconciliation report, production vs staging comparison,可用于解决开发者在data领域的实际问题,帮助用户提升效率、自动化重复任务或优化工作流。
Use when: comparing SQL Server tables across instances, data migration validation, ETL verification, row mismatch detection, schema drift, reconciliation report, production vs staging comparison. Uses
mkdir -p ./skills/sql-server-table-reconciliation && curl -sfL https://raw.githubusercontent.com/github/awesome-copilot/main/skills/sql-server-table-reconciliation/SKILL.md -o ./skills/sql-server-table-reconciliation/SKILL.md Run in terminal / PowerShell. Requires curl (Unix) or PowerShell 5+ (Windows).
Skill Content
# SQL Server Table Reconciliation
Compare identical tables across two SQL Server instances using Python with `mssql-python` driver and Apache Arrow. Detect missing rows, column mismatches, schema drift, and produce a reconciliation report.
Workflow
1. Collect connection details for source and target
2. Identify primary key / composite key
3. Detect schema differences
4. Extract data via Arrow for efficient columnar transfer
5. Compare rows and columns
6. Generate reconciliation report
Collect Inputs
| Parameter | Required | Description |
|-----------|----------|-------------|
| Source server | Yes | Source SQL Server (e.g. `prod-server.database.windows.net`) |
| Source database | Yes | Source database name |
| Target server | Yes | Target SQL Server (e.g. `staging-server.database.windows.net`) |
| Target database | Yes | Target database name |
| Tables | Yes | Comma-separated `schema.table` names, or `schema.*` wildcard (e.g. `dbo.Orders,dbo.Items` or `dbo.*`) |
| Auth mode | Yes | `sql` (user/password) or `entra` (Azure AD/token) |
| Primary key | Auto-detect | Column(s) forming the row identity. Auto-detect from metadata if not provided. |
| Columns to compare | All | Subset of columns, or all non-PK columns |
| Chunk size | `100000` | Rows per batch for large tables |
| Output format | `console` | `console`, `csv`, `parquet`, or `json` |
Bundled Script
The reconciliation logic is provided as a standalone script at `scripts/reconcile.py`. Invoke it with the appropriate arguments based on user inputs:
python scripts/reconcile.py \
--source-server <source_server> \
--source-database <source_database> \
--target-server <target_server> \
--target-database <target_database> \
--tables "<table_spec>" \
--auth <sql|entra> \
--chunk-size <chunk_size> \
--output <console|csv|json>Optional arguments
| Argument | Description |
|----------|-------------|
| `--primary-key` | Comma-separated PK column(s). Omit to auto-detect. |
| `--columns` | Comma-separated columns to compare. Omit to compare all non-PK columns. |
Example invocations
Single table with SQL auth:
python scripts/reconcile.py \
--source-server prod-server.database.windows.net \
--source-database ProdDB \
--target-server staging-server.database.windows.net \
--target-database StagingDB \
--tables "dbo.Orders" \
--auth sql \
--output consoleWildcard with Entra auth and CSV output:
python scripts/reconcile.py \
--source-server prod-server.database.windows.net \
--source-database ProdDB \
--target-server staging-server.database.windows.net \
--target-database StagingDB \
--tables "dbo.*" \
--auth entra \
--output csvPrerequisites
Install required packages before running:
pip install mssql-python pyarrow pandasComparison Rules
- **Normalize types before comparing**: cast decimals to same precision, trim strings, normalize datetime to UTC
- **NULL handling**: `NULL == NULL` is considered a match (both sides missing = no diff)
- **Ignore row order**: always compare by PK join, never positional
- **Large tables**: chunk extraction with `OFFSET/FETCH` or `ROW_NUMBER()` partitioning
Hash-Based Optimization (for large tables)
When table has >1M rows, generate a hash pre-check:
SELECT {pk_cols},
HASHBYTES('SHA2_256', CONCAT_WS('|', col1, col2, ...)) AS row_hash
FROM {table}Compare hashes first; only fetch full rows for mismatched hashes. This reduces data transfer significantly.
Report Format
Reconciling dbo.EMPLOYEES...
Reconciling dbo.DEPARTMENTS...
Reconciling dbo.JOBS...
--- dbo.EMPLOYEES ---
Source: 107 Target: 107
Missing: 0 Extra: 0 Mismatches: 0
Result: ✓ IDENTICAL
--- dbo.DEPARTMENTS ---
Source: 27 Target: 27
Missing: 0 Extra: 0 Mismatches: 3
Result: ✗ DIFFERENCES FOUND
--- dbo.JOBS ---
Source: 19 Target: 19
Missing: 0 Extra: 0 Mismat🎯 Best For
- Claude users
- GitHub Copilot users
- Data professionals
- Analytics teams
- Researchers
💡 Use Cases
- Data pipeline auditing
- Query optimization
📖 How to Use This Skill
- 1
Install the Skill
Copy the install command from the Terminal tab and run it. The SKILL.md file downloads to your local skills directory.
- 2
Load into Your AI Assistant
Open Claude or GitHub Copilot and reference the skill. Paste the SKILL.md content or use the system prompt tab.
- 3
Apply Sql-Server-Table-Reconciliation to Your Work
Provide context for your task — paste source material, describe your audience, or share existing work to guide the AI.
- 4
Review and Refine
Edit the AI output for accuracy, tone, and completeness. Add human insight where the AI lacks context.
❓ Frequently Asked Questions
How do I install Sql-Server-Table-Reconciliation?
Copy the install command from the Terminal tab and run it. The skill downloads to ./skills/sql-server-table-reconciliation/SKILL.md, ready to use.
Can I customize this skill for my team?
Absolutely. Edit the SKILL.md file to add team-specific instructions, examples, or workflows.
⚠️ Common Mistakes to Avoid
Ignoring data quality
AI analysis inherits all data quality issues — profile your data first.