Mayur Rathi

⭐ 34.1k GitHub stars

Arize-Experiment

Arize-Experiment is a data AI skill that creates, runs, and analyzes Arize experiments for evaluating and comparing model performance. It covers experiment CRUD, exporting runs, comparing results, and evaluation workflows using the ax CLI.

Last verified on: 2026-07-14

Quick Facts

Category: data

Works With: Claude, GitHub Copilot

Source: github/awesome-copilot

Stars: ⭐ 34.1k

Last Verified: 2026-07-14

Risk Level: Low

mkdir -p ./skills/arize-experiment && curl -sfL https://raw.githubusercontent.com/github/awesome-copilot/main/skills/arize-experiment/SKILL.md -o ./skills/arize-experiment/SKILL.md

Run in terminal / PowerShell. Requires curl (Unix) or PowerShell 5+ (Windows).

Skill Content

# Arize Experiment Skill

> **`SPACE`** — All `--space` flags and the `ARIZE_SPACE` env var accept a space **name** (e.g., `my-workspace`) or a base64 space **ID** (e.g., `U3BhY2U6...`). Find yours with `ax spaces list`.

Concepts

- **Experiment** = a named evaluation run against a specific dataset version, containing one run per example

- **Experiment Run** = the result of processing one dataset example -- includes the model output, optional evaluations, and optional metadata

- **Dataset** = a versioned collection of examples; every experiment is tied to a dataset and a specific dataset version

- **Evaluation** = a named metric attached to a run (e.g., `correctness`, `relevance`), with optional label, score, and explanation

The typical flow: export a dataset → process each example → collect outputs and evaluations → create an experiment with the runs.
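The concepts and flow above can be sketched as plain data structures. This is a minimal illustration, not the Arize SDK or the `ax` payload schema: the class names, field names, the `build_runs` helper, and the assumption that each example is a dict with an `"id"` key are all hypothetical. `task` stands for a caller-supplied function that calls the real model API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Sequence


@dataclass
class Evaluation:
    """A named metric attached to a run; label, score, and explanation are optional."""
    name: str
    label: Optional[str] = None
    score: Optional[float] = None
    explanation: Optional[str] = None


@dataclass
class ExperimentRun:
    """The result of processing one dataset example."""
    example_id: str
    output: str
    evaluations: List[Evaluation] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)


def build_runs(
    examples: Sequence[dict],
    task: Callable[[dict], str],
    evaluators: Sequence[Callable[[dict, str], Evaluation]] = (),
) -> List[ExperimentRun]:
    """Process each example once and collect exactly one run per example."""
    runs = []
    for example in examples:
        output = task(example)  # must be a real model call, never a stub
        evals = [evaluate(example, output) for evaluate in evaluators]
        # Assumption: every example carries an "id" key.
        runs.append(ExperimentRun(example_id=example["id"], output=output, evaluations=evals))
    return runs
```

The returned list has one entry per dataset example, which matches the definition of an experiment given above.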

Prerequisites

Proceed directly with the task — run the `ax` command you need. Do NOT check versions, env vars, or profiles upfront.

If an `ax` command fails, troubleshoot based on the error:

- `command not found` or version error → see references/ax-setup.md

- `401 Unauthorized` / missing API key → run `ax profiles show` to inspect the current profile. If the profile is missing or the API key is wrong, follow references/ax-profiles.md to create/update it. If the user doesn't have their key, direct them to https://app.arize.com/admin > API Keys

- Space unknown → run `ax spaces list` to pick by name, or ask the user

- Project unclear → ask the user, or run `ax projects list -o json --limit 100` and present as selectable options

- **Security:** Never read `.env` files or search the filesystem for credentials. Use `ax profiles` for Arize credentials and `ax ai-integrations` for LLM provider keys. If credentials are not available through these channels, ask the user.

- **CRITICAL — Never fabricate outputs:** When running an experiment, you MUST call the real model API specified by the user for every dataset example. Never fabricate, simulate, or hardcode model outputs, latencies, or evaluation scores. If you cannot call the API (missing SDK, missing credentials, network error), stop and tell the user what is needed before proceeding.
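The troubleshooting rules above amount to a small lookup from error text to next step. The sketch below is one way to express that. The function name is hypothetical, and the substring matching is an assumption: the actual wording of `ax` error messages may differ, so treat the patterns as placeholders to adjust.

```python
def next_step_for_ax_failure(stderr: str) -> str:
    """Map the error text of a failed `ax` command to the suggested next step."""
    text = stderr.lower()
    # Assumption: these substrings appear in the real error output.
    if "command not found" in text or "version" in text:
        return "see references/ax-setup.md"
    if "401" in text or "unauthorized" in text or "api key" in text:
        return "run `ax profiles show`, then follow references/ax-profiles.md"
    if "space" in text:
        return "run `ax spaces list` or ask the user"
    if "project" in text:
        return "ask the user or run `ax projects list -o json --limit 100`"
    return "report the error to the user"
```

The checks run in the same order as the list above, and anything unrecognized falls through to asking the user instead of guessing.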

List Experiments: `ax experiments list`

Browse experiments, optionally filtered by dataset. Output goes to stdout.

```bash
ax experiments list
ax experiments list --dataset DATASET_NAME --space SPACE --limit 20   # DATASET_NAME: name or ID (name preferred)
ax experiments list --cursor CURSOR_TOKEN
ax experiments list -o json
```

Flags

| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `--limit, -l` | int | 15 | Max results (1-100) |
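When scripting the list command, it can help to assemble the argument list in one place and validate `--limit` against the documented 1-100 range before calling the CLI. The helper below is a sketch: the function name is hypothetical, it only uses flags shown above, and it assumes those flags can be freely combined in one invocation, which the examples do not confirm.

```python
from typing import List, Optional


def experiments_list_argv(
    dataset: Optional[str] = None,
    space: Optional[str] = None,
    limit: Optional[int] = None,
    cursor: Optional[str] = None,
    as_json: bool = False,
) -> List[str]:
    """Build the argv for `ax experiments list` from optional filters."""
    argv = ["ax", "experiments", "list"]
    if dataset is not None:
        argv += ["--dataset", dataset]
    if space is not None:
        argv += ["--space", space]
    if limit is not None:
        # The flags table documents a valid range of 1-100.
        if not 1 <= limit <= 100:
            raise ValueError("--limit must be between 1 and 100")
        argv += ["--limit", str(limit)]
    if cursor is not None:
        argv += ["--cursor", cursor]
    if as_json:
        argv += ["-o", "json"]
    return argv
```

The result can be passed to `subprocess.run(argv, capture_output=True, text=True)`. Passing a list instead of a shell string avoids quoting problems with dataset or space names that contain spaces.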

Get Experiment: `ax experiments get`

Quick metadata lookup -- returns experiment name, linked dataset/version, and timestamps.

```bash
ax experiments get NAME_OR_ID
ax experiments get NAME_OR_ID -o json
ax experiments get NAME_OR_ID --dataset DATASET_NAME --space SPACE   # required when using experiment name instead of ID
```
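The name-versus-ID rule in the last command can be enforced before the CLI is invoked. The sketch below uses a hypothetical function and a hypothetical `by_name` parameter. It reads the comment above as requiring both `--dataset` and `--space` for a name lookup, and it assumes `-o json` can be combined with those flags. Both readings are assumptions.

```python
from typing import List, Optional


def experiments_get_argv(
    name_or_id: str,
    by_name: bool = False,
    dataset: Optional[str] = None,
    space: Optional[str] = None,
    as_json: bool = False,
) -> List[str]:
    """Build the argv for `ax experiments get`, checking the name-lookup rule."""
    # Assumption: a lookup by name needs both --dataset and --space.
    if by_name and (dataset is None or space is None):
        raise ValueError("looking up by name requires both dataset and space")
    argv = ["ax", "experiments", "get", name_or_id]
    if dataset is not None:
        argv += ["--dataset", dataset]
    if space is not None:
        argv += ["--space", space]
    if as_json:
        argv += ["-o", "json"]
    return argv
```

The caller states whether the identifier is a name, since the helper has no reliable way to tell a name from an ID.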

Flags

| Flag | Type | Default | Description |
|------|------|---------|-------------|

Response fields

| Field | Type | Description |
|-------|------|-------------|
| `id` | string | Experiment ID |
| `name` | string | Experiment name |
| `dataset_id` | string | Linked dataset
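As an illustration of consuming these fields, the sketch below parses the output of `ax experiments get NAME_OR_ID -o json` into a small record. The class and function names are hypothetical, and the code assumes the JSON is a flat object whose keys match the field names in the table. The real payload may be nested or carry additional fields, which this parser ignores.

```python
import json
from dataclasses import dataclass


@dataclass
class ExperimentSummary:
    """The three response fields listed in the table."""
    id: str
    name: str
    dataset_id: str


def parse_experiment(stdout: str) -> ExperimentSummary:
    """Parse JSON text into an ExperimentSummary, ignoring unknown keys."""
    payload = json.loads(stdout)
    # Assumption: a flat object with "id", "name", and "dataset_id" keys.
    return ExperimentSummary(
        id=payload["id"],
        name=payload["name"],
        dataset_id=payload["dataset_id"],
    )
```

A missing key raises `KeyError`, which surfaces a schema mismatch immediately instead of silently producing an incomplete record.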

🎯 Best For

Data analysts
Business intelligence teams
Claude users
GitHub Copilot users
Data professionals

💡 Use Cases

Finding patterns in customer data
Creating automated dashboards
Data pipeline auditing
Query optimization

📖 How to Use This Skill

1. Install the Skill

Copy the install command from the Terminal tab and run it. The SKILL.md file downloads to your local skills directory.

2. Load into Your AI Assistant

Open Claude or GitHub Copilot and reference the skill. Paste the SKILL.md content or use the system prompt tab.

3. Apply Arize-Experiment to Your Work

Provide context for your task: paste source material, describe your audience, or share existing work to guide the AI.

4. Review and Refine

Edit the AI output for accuracy, tone, and completeness. Add human insight where the AI lacks context.

❓ Frequently Asked Questions

Can this connect to my database directly?

Most data skills accept CSV or JSON input. Database connectors are listed in the Works With section.

How do I install Arize-Experiment?

Copy the install command from the Terminal tab and run it. The skill downloads to ./skills/arize-experiment/SKILL.md, ready to use.

Can I customize this skill for my team?

Absolutely. Edit the SKILL.md file to add team-specific instructions, examples, or workflows.

⚠️ Common Mistakes to Avoid

Not validating data quality

AI analysis is only as good as your input data and inherits every data quality issue. Profile and clean data before analysis.

🔗 Related Skills

Acreadiness-Assess
Acreadiness-Policy
ADR Generator
Ai-Prompt-Engineering-Safety-Best-Practices
Ai-Readiness-Reporter
Ai-Ready
