Mayur Rathi
@mayurrathi
⭐ 5 GitHub stars

AI Data Engineer

AI Data Engineer is an AI skill for data work. It helps developers solve practical problems in the data domain, improve efficiency, automate repetitive tasks, and optimize workflows.

Expert guidance on the complete modern data stack from ingestion to analytics: batch vs streaming, storage (Snowflake, BigQuery, Delta Lake), orchestration (Airflow/Prefect), and data quality governance.

Last verified on: 2026-05-27
mkdir -p ./skills/ai-data-engineer && curl -sfL https://raw.githubusercontent.com/mayurrathi/awesome-agent-skills/main/skills/ai-data-engineer/SKILL.md -o ./skills/ai-data-engineer/SKILL.md

Run in terminal / PowerShell. Requires curl (Unix) or PowerShell 5+ (Windows).

Skill Content

# AI Data Engineer


Purpose

Design scalable data pipelines and modern data architecture for production environments.


Architecture Design Process


Step 1: Define Requirements

- Scale: GB/TB/PB per day

- Latency: Batch vs near-real-time vs real-time

- Data Sources: DBs, APIs, files, streams

- Consumers: Analysts (SQL), Data Scientists, Apps
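The four requirement dimensions above can be captured in a small structure before any tooling is chosen. The following is a minimal sketch, not part of the original skill: the class names, the field names, and the `suggest_pattern` rule of thumb are all hypothetical, and treating near-real-time as streaming is a simplification (micro-batching is another option).

```python
from dataclasses import dataclass
from enum import Enum


class Latency(Enum):
    BATCH = "batch"
    NEAR_REAL_TIME = "near-real-time"
    REAL_TIME = "real-time"


@dataclass(frozen=True)
class PipelineRequirements:
    """Hypothetical container for the Step 1 answers."""
    daily_volume_gb: float       # scale, normalised to GB per day
    latency: Latency             # batch vs near-real-time vs real-time
    sources: tuple[str, ...]     # e.g. databases, APIs, files, streams
    consumers: tuple[str, ...]   # e.g. analysts, data scientists, apps


def suggest_pattern(req: PipelineRequirements) -> str:
    """Assumed rule of thumb: only pure batch latency maps to a batch stack."""
    return "batch" if req.latency is Latency.BATCH else "streaming"


reqs = PipelineRequirements(
    daily_volume_gb=500.0,
    latency=Latency.NEAR_REAL_TIME,
    sources=("postgres", "rest-api"),
    consumers=("analysts", "apps"),
)
print(suggest_pattern(reqs))
```

Writing the requirements down as data makes it easier to review them with stakeholders and to revisit the architecture choice when one of them changes.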


Step 2: Choose Architecture

**Batch:** Fivetran/Airbyte → dbt → Snowflake/BigQuery → BI

**Streaming:** Kafka/Confluent → Flink/Kafka Streams

**Storage:** Delta Lake, Iceberg, Hudi on S3/ADLS/GCS

**Layers:** Bronze (raw) → Silver (cleaned) → Gold (aggregated)
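The Bronze → Silver → Gold layering can be illustrated without any particular engine. This is a plain-Python sketch under assumed data: the `order_id`, `country`, and `amount` columns and the function names are hypothetical, and a real pipeline would run equivalent logic in dbt, Spark, or the warehouse itself.

```python
from collections import defaultdict


def to_silver(bronze_rows):
    """Clean raw rows: drop records missing keys, normalise casing and types."""
    silver = []
    for row in bronze_rows:
        if not row.get("order_id") or row.get("amount") in (None, ""):
            continue  # rejected rows stay in bronze for later inspection
        silver.append({
            "order_id": str(row["order_id"]).strip(),
            "country": str(row.get("country", "")).strip().upper(),
            "amount": float(row["amount"]),
        })
    return silver


def to_gold(silver_rows):
    """Aggregate cleaned rows into a reporting-ready total per country."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["country"]] += row["amount"]
    return dict(totals)


# Bronze keeps the data exactly as ingested, including bad records.
bronze = [
    {"order_id": "A1", "country": " us ", "amount": "10.50"},
    {"order_id": "A2", "country": "US", "amount": "4.50"},
    {"order_id": None, "country": "DE", "amount": "7.00"},
    {"order_id": "A3", "country": "de", "amount": "2.00"},
]

gold = to_gold(to_silver(bronze))
print(gold)
```

Keeping bronze untouched means the silver and gold layers can be rebuilt from scratch whenever the cleaning rules change.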


Step 3: Data Modeling

- Star schema for business reporting

- Data Vault 2.0 for enterprise warehousing

- Slowly Changing Dimensions (SCD 1, 2, 3)
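Of the SCD variants listed, Type 2 is the one that preserves full history by closing the current row and appending a new version. The sketch below shows that mechanic on an in-memory list; the row layout (`valid_from`, `valid_to`, `is_current`) and the `apply_scd2` helper are assumptions for illustration, not taken from the skill. For contrast, Type 1 would overwrite the attributes in place, and Type 3 would keep the previous value in an extra column.

```python
from datetime import date


def apply_scd2(dimension, key, new_attrs, effective):
    """Close the current row for `key` if its attributes changed, then append a new version."""
    current = next(
        (r for r in dimension if r["key"] == key and r["is_current"]), None
    )
    if current is not None:
        if current["attrs"] == new_attrs:
            return dimension  # nothing changed, history stays as-is
        current["valid_to"] = effective
        current["is_current"] = False
    dimension.append({
        "key": key,
        "attrs": dict(new_attrs),
        "valid_from": effective,
        "valid_to": None,       # open-ended until the next change
        "is_current": True,
    })
    return dimension


dim = []
apply_scd2(dim, "cust-1", {"city": "Pune"}, date(2024, 1, 1))
apply_scd2(dim, "cust-1", {"city": "Mumbai"}, date(2024, 6, 1))
apply_scd2(dim, "cust-1", {"city": "Mumbai"}, date(2024, 7, 1))  # no change, no new row

for row in dim:
    print(row["attrs"], row["valid_from"], row["valid_to"], row["is_current"])
```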


Step 4: Data Quality

- Great Expectations framework

- Lineage tracking (DataHub, Atlan)

- Row-level security for PII

- Encryption at rest and in transit
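Data quality checks boil down to declarative assertions about columns. The following is a plain-Python stand-in for the kind of expectation a framework such as Great Expectations expresses; it deliberately does not use that library's API, and the function names, result shape, and sample columns are hypothetical.

```python
def check_not_null(rows, column):
    """Report the indexes of rows where `column` is missing or None."""
    failed = [i for i, r in enumerate(rows) if r.get(column) is None]
    return {"check": f"not_null({column})", "passed": not failed, "failed_rows": failed}


def check_between(rows, column, low, high):
    """Report the indexes of rows whose non-null value falls outside [low, high]."""
    failed = [
        i for i, r in enumerate(rows)
        if r.get(column) is not None and not (low <= r[column] <= high)
    ]
    return {
        "check": f"between({column}, {low}, {high})",
        "passed": not failed,
        "failed_rows": failed,
    }


rows = [
    {"email": "a@example.com", "age": 34},
    {"email": None, "age": 29},
    {"email": "c@example.com", "age": 212},
]

results = [
    check_not_null(rows, "email"),
    check_between(rows, "age", 0, 120),
]
for result in results:
    print(result)
```

Returning the failing row indexes, rather than only a pass/fail flag, makes it possible to quarantine bad records instead of failing the whole load.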


Step 5: Operations

- Infrastructure as Code (Terraform)

- CI/CD for dbt and Spark jobs

- Cost optimization: partitioning, clustering, lifecycle
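Partitioning and lifecycle rules usually work together: data is laid out by date so that queries prune unneeded partitions, and old partitions are expired or moved to cheaper storage. The sketch below assumes a Hive-style `dt=YYYY-MM-DD` layout and a simple retention window; the bucket name, table name, and helper functions are hypothetical.

```python
from datetime import date, timedelta


def partition_path(base, table, day):
    """Build a Hive-style daily partition path (assumed layout: dt=YYYY-MM-DD)."""
    return f"{base}/{table}/dt={day.isoformat()}"


def expired_partitions(days, today, retention_days):
    """Return the partition dates that are older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    return [d for d in days if d < cutoff]


existing = [date(2024, 5, 1), date(2024, 5, 3), date(2024, 5, 9)]
to_expire = expired_partitions(existing, today=date(2024, 5, 10), retention_days=7)

for day in to_expire:
    # A real lifecycle job would delete or archive this prefix.
    print(partition_path("s3://example-bucket/silver", "orders", day))
```

In practice the retention rule is more often declared in infrastructure code, for example as an object-store lifecycle policy managed through Terraform, than run as an ad hoc script.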

🎯 Best For

  • Claude users
  • ChatGPT users
  • Gemini users

💡 Use Cases

  • Data pipeline auditing
  • Query optimization

📖 How to Use This Skill

  1. Install the Skill

    Copy the install command from the Terminal tab and run it. The SKILL.md file downloads to your local skills directory.

  2. Load into Your AI Assistant

    Open Claude or ChatGPT and reference the skill. Paste the SKILL.md content or use the system prompt tab.

  3. Apply AI Data Engineer to Your Work

    Provide context for your task: paste source material, describe your audience, or share existing work to guide the AI.

  4. Review and Refine

    Edit the AI output for accuracy, tone, and completeness. Add human insight where the AI lacks context.

❓ Frequently Asked Questions

How do I install AI Data Engineer?

Copy the install command from the Terminal tab and run it. The skill downloads to ./skills/ai-data-engineer/SKILL.md, ready to use.

Can I customize this skill for my team?

Absolutely. Edit the SKILL.md file to add team-specific instructions, examples, or workflows.

⚠️ Common Mistakes to Avoid

Ignoring data quality

AI analysis inherits all data quality issues — profile your data first.
