
I Built a Starter Repo That Turns AI Coding Tools Into Senior Engineers

· 12 min read · Humza Tareen
AI Coding Tools CI/CD Developer Experience GitHub Actions Open Source

Every new project starts the same way: create a repo, write some code, realize you have no CI, no linting, no test enforcement, no issue templates. Three weeks in, you're debugging a production issue with no runbook, your AI coding assistant is generating code that doesn't follow your conventions, and half the team is committing directly to main.

I've set up development practices across multiple production projects—from a multi-cluster AI evaluation platform on GCP to Python evaluation pipelines. The same patterns kept working: Ruff for linting with auto-fix in CI, pre-commit hooks that catch problems before they reach the remote, test coverage enforcement that posts PR comments telling you exactly which test files to create, and incident runbooks that turn 2-hour debugging sessions into 15-minute triage flows.

So I extracted all of it into a single, clone-and-go starter repo. But I added something the original projects didn't have: instruction files for every major AI coding tool—Cursor, Claude Code, Codex, and GitHub Copilot—configured to follow the same practices the CI enforces.

The repo: github.com/humzakt/dev-starter-kit

The Problem: AI Tools Without Context Are Junior Developers

AI coding tools are powerful, but out of the box they don't know your project's conventions. They'll use single quotes when your linter expects double quotes. They'll skip tests. They'll commit to main. They'll generate 200-character lines when your config says 120. They'll catch generic exceptions when your style guide says to be specific.

The fix isn't to stop using AI tools—it's to give them the same onboarding document you'd give a new team member. That's what AGENTS.md, CLAUDE.md, and .cursor/rules/ files are for. They turn your AI assistant from a context-free code generator into something that understands your project's architecture, follows your conventions, and runs the right commands after every change.

What's in the Repo

The starter kit covers three categories: CI/CD and quality gates, operational readiness, and AI tool configuration.

CI/CD Workflows

Three GitHub Actions workflows that work together:

PR Checks (pr-checks.yml) is the main quality gate. When you open a PR, it runs Ruff with --fix, commits any auto-fixes back to your branch, then verifies the code is clean. After linting passes, it runs pytest. The interesting part is the test coverage enforcement job: it detects which Python source files you changed, checks whether you also added or updated the corresponding test files, and if you didn't, it fails the check and posts a PR comment with the exact test file names, class names, and method signatures you need to write.

# The CI posts comments like this on your PR:
#
# Test Coverage Required
#
# Source: src/services/auth.py
# Test file to create/update: tests/test_auth.py
# Test class: TestAuth
# Required test methods:
#     def test_validate_token(self):  ...
#     def test_refresh_session(self):  ...

This isn't a coverage percentage check—it's a structural check that ensures you're at least thinking about tests for the code you changed. You can bypass it with a skip-test-check label for config-only PRs.
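The core of the structural check can be sketched in a few lines of Python. This is an illustration of the idea, not the repo's actual workflow script; the `changed_files` input and the src/ → tests/ naming convention are assumptions here:

```python
from pathlib import Path


def expected_test_file(source: str) -> str:
    # Assumed convention: src/services/auth.py -> tests/test_auth.py
    return f"tests/test_{Path(source).stem}.py"


def missing_tests(changed_files: list[str]) -> list[tuple[str, str]]:
    """Return (source, expected_test) pairs for changed source files
    whose corresponding test file was not also changed."""
    changed = set(changed_files)
    missing = []
    for f in changed_files:
        if f.startswith("src/") and f.endswith(".py"):
            test = expected_test_file(f)
            if test not in changed:
                missing.append((f, test))
    return missing
```

In CI, a non-empty result fails the job and feeds the PR comment shown above.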

Lint & Format (lint.yml) runs on both pushes and PRs, but only checks files that actually changed. It validates YAML files too (excluding workflow files, which have their own validation).

Merge Readiness (merge-ready.yml) is the final gate. It runs strict lint (no auto-fix), Python syntax compilation, YAML validation, and tests. All checks must pass before the merge button lights up. Skipped checks (because no relevant files changed) count as passing.

Pre-commit Hooks

The .pre-commit-config.yaml catches problems before they even reach CI:

  • Ruff lint and format on every commit
  • Syntax checks—Python AST verification, YAML, JSON, TOML validation
  • Hygiene—trailing whitespace, line endings, end-of-file fixers
  • Security—private key detection, large file prevention
  • Branch protection—blocks direct commits to main
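An abbreviated sketch of what such a config can look like (the hook revisions below are placeholders, and the repo's actual file may differ):

```yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.0  # placeholder; pin the version your project uses
    hooks:
      - id: ruff
        args: [--fix]
      - id: ruff-format
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0  # placeholder
    hooks:
      - id: check-ast              # Python syntax
      - id: check-yaml
      - id: check-json
      - id: check-toml
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: detect-private-key
      - id: check-added-large-files
      - id: no-commit-to-branch    # blocks direct commits to main
        args: [--branch, main]
```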

Issue Templates

Three structured templates that guide reporters through providing the right information:

  • Bug Report—component dropdown, priority, reproduction steps, error output, environment info
  • Incident Report—SEV-1 through SEV-4 severity, affected components checklist, triage checklist, timeline table, root cause, action items
  • Infrastructure Issue—CI/CD failures, dependency issues, environment config, investigation checklist

Incident Runbook

The docs/INCIDENT_RUNBOOK.md is a structured playbook for when things go wrong:

  • Severity classification table with response time targets
  • 3-phase triage checklist (Identify Scope, Gather Evidence, Communicate)
  • Component-specific diagnostic commands you can copy-paste
  • CI/CD troubleshooting section
  • 4 common failure scenarios with diagnosis and fixes
  • Post-incident review template

The AI Tool Configuration Layer

This is the part that makes the starter kit different from other project templates. Every major AI coding tool reads a different file format for project instructions. The starter kit includes all of them:

File                             Tool(s)                                Format
AGENTS.md                        Codex, Cursor, Claude Code, Windsurf   Markdown
CLAUDE.md                        Claude Code                            Markdown
.cursor/rules/*.mdc              Cursor IDE                             MDC with YAML frontmatter
.claude/rules/*.md               Claude Code                            Markdown with YAML frontmatter
.github/copilot-instructions.md  GitHub Copilot                         Markdown

AGENTS.md: The Universal Standard

AGENTS.md is the broadest-reaching file. It's read by Codex, Cursor, Claude Code, and a growing list of other agents, which makes it the closest thing to a cross-tool standard for project instructions. It includes:

  • Build and run commands
  • Code style conventions
  • Testing rules and patterns
  • Git and PR workflow
  • Project architecture
  • The development loop AI agents should follow (understand → edit → lint → test → verify → commit)

CLAUDE.md: Session-Persistent Instructions

Claude Code reads CLAUDE.md at the start of every session. It's kept intentionally concise—under 100 lines—because Claude Code already has a large system prompt, and every line competes for attention. It focuses on the commands to run and the rules to follow, without the explanatory context that AGENTS.md provides.

Cursor Rules: Modular and Scoped

Cursor uses .cursor/rules/*.mdc files with YAML frontmatter that controls when each rule applies:

  • general.mdc (alwaysApply: true)—project conventions that apply everywhere
  • python.mdc (glob: **/*.py)—Python-specific rules: import ordering, type hints, error handling patterns
  • testing.mdc (glob: tests/**/*.py)—test structure, what to test vs. what not to test, running commands

This modular approach means the AI only gets the relevant rules for the file it's editing, keeping the context window focused.
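For illustration, a scoped rule file might look like this (the frontmatter fields are Cursor's; the rule text itself is a made-up example, not the repo's actual python.mdc):

```
---
description: Python conventions
globs: **/*.py
alwaysApply: false
---

- Use double quotes; Ruff enforces this in CI.
- Prefer specific exception types over bare except clauses.
- Keep lines under 120 characters.
```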

What the Instructions Actually Teach

All instruction files converge on the same core behaviors:

  1. Read before editing—understand the existing code and its tests before making changes
  2. Run lint after every edit: ruff check --fix . && ruff format .
  3. Run tests after every edit: pytest tests/ -v
  4. Fix failures before moving on—don't leave broken tests or lint errors
  5. Write tests for new code—the CI will enforce this anyway
  6. Use conventional commits: feat:, fix:, chore:, etc.
  7. Never commit secrets—use environment variables

The result: your AI coding tool follows the same development loop a senior engineer would. It doesn't just generate code—it generates code that passes your CI.

How to Use It

Clone, delete the git history, and start fresh:

git clone https://github.com/humzakt/dev-starter-kit.git my-project
cd my-project
rm -rf .git
git init && git checkout -b main

# Set up dev environment
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
pre-commit install

# Start building
# Write code in src/, tests in tests/
# Push and open PRs -- CI handles the rest

Customize by updating:

  • pyproject.toml—project name, Python version, lint rules
  • Issue templates—component dropdowns for your specific project
  • AI tool files—architecture section, project-specific commands
  • requirements.txt—your actual dependencies

Design Decisions

Why Multiple AI Tool Files Instead of Just AGENTS.md?

AGENTS.md is the most broadly compatible, but each tool has its own file format with unique capabilities. Cursor's .mdc files support glob-based scoping, so the AI only loads Python rules when editing Python files. Claude Code's hierarchical system supports user-level, project-level, and file-level overrides. GitHub Copilot reads from .github/copilot-instructions.md specifically. By including all formats, the starter kit works regardless of which tool your team uses.

Why Test Coverage Enforcement Instead of Coverage Percentage?

Coverage percentage (e.g., "maintain 80% coverage") creates perverse incentives. Developers write trivial tests to hit the number, or they argue about what counts. The structural check is simpler: if you changed source files, did you also change test files? It doesn't check that the tests are good—that's what code review is for. It checks that you at least thought about testing.

Why Auto-fix in CI?

The alternative is rejecting PRs for formatting issues, which wastes everyone's time. The CI fixes formatting and commits it back. The developer pulls and continues working. The tradeoff is extra commits in the PR history, but squash merging eliminates that.
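As a sketch, the commit-back steps in a workflow might look like this (the step names and bot identity are illustrative, not the repo's exact pr-checks.yml):

```yaml
- name: Run Ruff with auto-fix
  run: |
    ruff check --fix .
    ruff format .
- name: Commit fixes back to the branch
  run: |
    if ! git diff --quiet; then
      git config user.name "github-actions[bot]"
      git config user.email "github-actions[bot]@users.noreply.github.com"
      git commit -am "style: auto-fix lint and formatting"
      git push
    fi
```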

Why Pre-commit AND CI?

Pre-commit catches issues locally before they're pushed. CI catches issues for developers who haven't installed pre-commit, or when pre-commit is bypassed. Belt and suspenders.

What I Learned Building This Across Multiple Projects

After implementing these practices in a multi-service GCP platform (Cloud Run, GKE, AlloyDB, Pub/Sub) and a Python evaluation pipeline, a few patterns emerged:

Auto-fix is worth the complexity. The Ruff auto-fix + commit-back pattern in CI eliminated 90% of "fix formatting" follow-up commits. Developers focus on logic, CI handles style.

Test coverage comments are more effective than failed checks alone. A red X on a check tells you something failed. A PR comment that says "create tests/test_auth.py with class TestAuth and method test_validate_token" tells you exactly what to do. The Python script in the workflow parses the git diff to extract changed function names and generates these suggestions automatically.

Incident runbooks save more time than you think. The first time someone uses the triage checklist during an actual incident, it pays back the time spent writing it. The structured format prevents the "random debugging" pattern where people try things without a system.

AI tool instructions compound. Every session where the AI follows your conventions instead of its defaults saves a few minutes of correction. Over hundreds of interactions, that's hours of saved time and fewer lint failures in CI.

The Full File Tree

.
├── .github/
│   ├── ISSUE_TEMPLATE/
│   │   ├── bug-report.yml
│   │   ├── incident-report.yml
│   │   ├── infrastructure-issue.yml
│   │   └── config.yml
│   ├── workflows/
│   │   ├── pr-checks.yml
│   │   ├── lint.yml
│   │   ├── merge-ready.yml
│   │   └── README.md
│   └── copilot-instructions.md
├── .cursor/rules/
│   ├── general.mdc
│   ├── python.mdc
│   └── testing.mdc
├── .claude/rules/
│   └── development.md
├── docs/
│   └── INCIDENT_RUNBOOK.md
├── src/
├── tests/
├── AGENTS.md
├── CLAUDE.md
├── .pre-commit-config.yaml
├── .gitignore
├── pyproject.toml
├── requirements.txt
├── LICENSE
└── README.md

Repo: github.com/humzakt/dev-starter-kit. MIT licensed. Clone it, customize it, ship it.