Mayur Rathi
@github
⭐ 34.1k GitHub stars

Arize-Experiment

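Arize-Experiment is an AI skill for data work. It creates, runs, and analyzes Arize experiments for evaluating and comparing model performance. It addresses practical problems developers face in the data domain and helps users improve efficiency, automate repetitive tasks, or optimize workflows.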

Creates, runs, and analyzes Arize experiments for evaluating and comparing model performance. Covers experiment CRUD, exporting runs, comparing results, and evaluation workflows using the `ax` CLI.

Last verified on: 2026-05-30
mkdir -p ./skills/arize-experiment && curl -sfL https://raw.githubusercontent.com/github/awesome-copilot/main/skills/arize-experiment/SKILL.md -o ./skills/arize-experiment/SKILL.md

Run in terminal / PowerShell. Requires curl (Unix) or PowerShell 5+ (Windows).

Skill Content

# Arize Experiment Skill


> **`SPACE`** — All `--space` flags and the `ARIZE_SPACE` env var accept a space **name** (e.g., `my-workspace`) or a base64 space **ID** (e.g., `U3BhY2U6...`). Find yours with `ax spaces list`.


## Concepts


- **Experiment** = a named evaluation run against a specific dataset version, containing one run per example

- **Experiment Run** = the result of processing one dataset example -- includes the model output, optional evaluations, and optional metadata

- **Dataset** = a versioned collection of examples; every experiment is tied to a dataset and a specific dataset version

- **Evaluation** = a named metric attached to a run (e.g., `correctness`, `relevance`), with optional label, score, and explanation


The typical flow: export a dataset → process each example → collect outputs and evaluations → create an experiment with the runs.
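To make the concepts above concrete, here is a minimal sketch of what one experiment run could carry: a model output, one named evaluation with a label, score, and explanation, and some metadata. Every field name and placeholder value below is an assumption made for illustration; this is not the schema that `ax` expects, and the placeholders must be replaced with values obtained from real model calls.

```shell
# Hypothetical shape of a single experiment run. All field names are
# illustrative assumptions, not the schema used by the ax CLI.
# Placeholder values (upper-case strings, 0.0) stand in for real results.
run_record='{
  "example_id": "EXAMPLE_ID",
  "output": "MODEL_OUTPUT_FROM_REAL_API_CALL",
  "evaluations": {
    "correctness": {
      "label": "LABEL",
      "score": 0.0,
      "explanation": "EXPLANATION"
    }
  },
  "metadata": {"model": "MODEL_NAME"}
}'

# Print the record so it can be inspected or redirected to a file.
printf '%s\n' "$run_record"
```

An experiment would then hold one such record per dataset example, all tied to the same dataset version.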


## Prerequisites


Proceed directly with the task — run the `ax` command you need. Do NOT check versions, env vars, or profiles upfront.


If an `ax` command fails, troubleshoot based on the error:

- `command not found` or version error → see references/ax-setup.md

- `401 Unauthorized` / missing API key → run `ax profiles show` to inspect the current profile. If the profile is missing or the API key is wrong, follow references/ax-profiles.md to create/update it. If the user doesn't have their key, direct them to https://app.arize.com/admin > API Keys

- Space unknown → run `ax spaces list` to pick by name, or ask the user

- Project unclear → ask the user, or run `ax projects list -o json --limit 100` and present as selectable options

- **Security:** Never read `.env` files or search the filesystem for credentials. Use `ax profiles` for Arize credentials and `ax ai-integrations` for LLM provider keys. If credentials are not available through these channels, ask the user.

- **CRITICAL — Never fabricate outputs:** When running an experiment, you MUST call the real model API specified by the user for every dataset example. Never fabricate, simulate, or hardcode model outputs, latencies, or evaluation scores. If you cannot call the API (missing SDK, missing credentials, network error), stop and tell the user what is needed before proceeding.
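The error-driven troubleshooting above can be sketched as a small lookup. `suggest_fix` is a hypothetical helper, not part of the `ax` CLI, and it matches only the two error strings quoted in the list; real error messages may be worded differently.

```shell
# suggest_fix: hypothetical helper that maps an error message to the
# follow-up step described in the troubleshooting list.
# $1: the error text printed by a failed ax command.
suggest_fix() {
  case "$1" in
    *"command not found"*) echo "see references/ax-setup.md" ;;
    *"401 Unauthorized"*)  echo "run: ax profiles show" ;;
    *)                     echo "read the error and ask the user if unclear" ;;
  esac
}

suggest_fix "ax: command not found"
```

The point of the sketch is the ordering: run the command first, and consult setup or profile references only after a failure identifies which one applies.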


## List Experiments: `ax experiments list`


Browse experiments, optionally filtered by dataset. Output goes to stdout.


bash
ax experiments list
ax experiments list --dataset DATASET_NAME --space SPACE --limit 20   # DATASET_NAME: name or ID (name preferred)
ax experiments list --cursor CURSOR_TOKEN
ax experiments list -o json

### Flags


| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `--dataset` | string | none | Filter by dataset |
| `--limit, -l` | int | 15 | Max results (1-100) |
| `--cursor` | string | none | Pagination cursor from previous response |
| `-o, --output` | string | table | Output format: table, json, csv, parquet, or file path |
| `-p, --profile` | string | default | Configuration profile |
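As a sketch of how the paging flags fit together, the wrapper below requests one page of up to 100 experiments as JSON and adds `--cursor` only when a token is supplied. `list_experiments_page` is a hypothetical helper built from the flags in the table. Combining these flags in a single call is an assumption, and where the cursor token appears in the response is not specified here, so extracting it is left to the caller.

```shell
# list_experiments_page: hypothetical wrapper around the documented flags.
# $1: dataset name or ID
# $2: space name or ID
# $3: optional cursor token taken from the previous response
list_experiments_page() {
  if [ -n "${3:-}" ]; then
    ax experiments list --dataset "$1" --space "$2" --limit 100 -o json --cursor "$3"
  else
    ax experiments list --dataset "$1" --space "$2" --limit 100 -o json
  fi
}
```

Because output goes to stdout, a page can be saved with a redirect, for example `list_experiments_page DATASET_NAME SPACE > page1.json`.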


## Get Experiment: `ax experiments get`


Quick metadata lookup -- returns experiment name, linked dataset/version, and timestamps.


bash
ax experiments get NAME_OR_ID
ax experiments get NAME_OR_ID -o json
ax experiments get NAME_OR_ID --dataset DATASET_NAME --space SPACE   # required when using experiment name instead of ID

### Flags


| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `NAME_OR_ID` | string | required | Experiment name or ID (positional) |
| `--dataset` | string | none | Dataset name or ID (required if using experiment name instead of ID) |
| `--space` | string | none | Space name or ID (required if using dataset name instead of ID) |
| `-o, --output` | string | table | Output format |
| `-p, --profile` | string | default | Configuration profile |
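The name-versus-ID rule in the table can be sketched as follows. `get_experiment` is a hypothetical wrapper, not an `ax` subcommand: called with one argument it treats the value as an experiment ID, and called with three it treats the first as an experiment name and passes the dataset and space needed to resolve it. Adding `-o json` to both forms is an assumption based on the flags listed above.

```shell
# get_experiment: hypothetical wrapper around `ax experiments get`.
# One argument:    $1 is an experiment ID.
# Three arguments: $1 is an experiment name, $2 the dataset name or ID,
#                  $3 the space name or ID.
get_experiment() {
  if [ "$#" -ge 3 ]; then
    ax experiments get "$1" --dataset "$2" --space "$3" -o json
  else
    ax experiments get "$1" -o json
  fi
}
```

For example, `get_experiment EXPERIMENT_NAME DATASET_NAME SPACE` resolves by name, while `get_experiment EXPERIMENT_ID` needs nothing else.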


### Response fields


| Field | Type | Description |
|-------|------|-------------|
| `id` | string | Experiment ID |
| `name` | string | Experiment name |
| `dataset_id` | string | Linked dataset

🎯 Best For

  • Data analysts
  • Business intelligence teams
  • Claude users
  • GitHub Copilot users
  • Data professionals

💡 Use Cases

  • Finding patterns in customer data
  • Creating automated dashboards
  • Data pipeline auditing
  • Query optimization

📖 How to Use This Skill

  1. Install the Skill

    Copy the install command from the Terminal tab and run it. The SKILL.md file downloads to your local skills directory.

  2. Load into Your AI Assistant

    Open Claude or GitHub Copilot and reference the skill. Paste the SKILL.md content or use the system prompt tab.

  3. Apply Arize-Experiment to Your Work

    Provide context for your task: paste source material, describe your audience, or share existing work to guide the AI.

  4. Review and Refine

    Edit the AI output for accuracy, tone, and completeness. Add human insight where the AI lacks context.

❓ Frequently Asked Questions

Can this connect to my database directly?

Most data skills accept CSV or JSON input. Database connectors are listed in the Works With section.

How do I install Arize-Experiment?

Copy the install command from the Terminal tab and run it. The skill downloads to ./skills/arize-experiment/SKILL.md, ready to use.

Can I customize this skill for my team?

Absolutely. Edit the SKILL.md file to add team-specific instructions, examples, or workflows.

⚠️ Common Mistakes to Avoid

Not validating data quality

AI analysis is only as good as your input data and inherits all of its quality issues. Profile and clean your data before analysis.
