Engineering Standards on a Legacy Codebase
You join an active team on the workflow platform’s admin panel: features ship every week, stakeholders expect velocity, and the codebase has already absorbed years of pragmatic decisions. What you do not find is a crisp engineering envelope—no pull request template, no issue templates, no explicit testing bar, and no shared configuration telling AI coding assistants how this repository actually works. Meanwhile, quality signals are uneven: CSV exports show raw UTC ISO strings like 2026-03-31T16:29:00.000Z while tables in the UI show friendly localized dates like “Mar 31, 2026.” A multi-tab payments area exists in the admin UI—Overview, pay rates, user adjustments, payment records—but the backend wiring is only half complete, which is a liability if anyone mistakes it for production-ready. CI is green, yet several GitHub Actions are pinned to versions that trail the current Node.js support window, meaning deprecation warnings are not theoretical; they are calendar events.
The wrong move is to declare a “standards sprint,” freeze feature work, and attempt a big-bang refactor. The team is not broken; it is busy. The right move is to layer standards in thin, reviewable slices—each change small enough to merge without drama, each change reinforcing the next. This post walks through how we did that: AI tool rules, PR and issue templates with a testing mandate, shared date formatting utilities, a feature lockdown instead of a destructive rollback, and a proactive GitHub Actions refresh for Node.js 24 compatibility.
The starting point: shipping without guardrails
When standards are implicit, they drift by person and by week. One engineer runs Vitest locally out of habit; another relies on CI only. One patch formats dates with toLocaleDateString; another serializes Date straight into CSV cells. AI assistants, left unconstrained, invent folder patterns and testing commands that do not match the stack. None of that is malice; it is the predictable outcome of an active codebase without scaffolding.
The constraint is social as much as technical. You cannot halt the roadmap to “fix engineering.” You can, however, make the easy path the correct path: templates that ask for evidence, rules files that teach tools the repo’s conventions, utilities that remove one-off fixes, and guardrails that make half-built features obviously inert. The theme is incremental leverage—each PR should leave the system slightly harder to misuse than before.
AI tool rules: teach assistants the same loop as a senior engineer
We added project-specific rules in three places: Cursor (.cursor/rules/), Claude Code (.claude/rules/), and Copilot instructions updated to match. The content is not novel philosophy; it is operational context—Next.js and TypeScript for the web app, Prisma for persistence, Vitest for unit tests, Cloud Run and background jobs where relevant—so generated edits land in the right directories with the right import style.
The highest-leverage section is the testing mandate. Tools default to “edit and hope.” We inverted that default: read surrounding code before changing it, run the linter after edits, run targeted tests after substantive changes, and add or update tests whenever behavior changes. That mirrors what a careful human reviewer expects, but it is encoded where the assistant reads it every session.
## Testing mandate
- Before editing: open the nearest existing tests and note coverage for the module.
- After editing: run `pnpm lint` (or the repo’s equivalent) on touched packages.
- After behavioral changes: run `pnpm vitest run <path-to-nearest-spec>`.
- For new features or bug fixes: add focused unit tests; do not merge without a failing test that proves the fix.
- Prefer small, deterministic tests over broad snapshots unless the UI contract truly requires them.
Rules are not magic; they are guardrails on autopilot. The win is fewer drive-by edits that skip tests because “it was a small change,” and fewer invented scripts that do not exist in package.json.
PR template, issue templates, and operational clarity
Culture follows structure. We introduced a pull request template with an explicit testing section: what unit tests were added or updated, paste or summarize local Vitest output for the relevant package, and a short manual validation checklist for anything that touches the admin panel or async workers. The goal is not bureaucracy; it is to make “I verified this” legible to reviewers who cannot inhabit every feature area.
Alongside that, we added GitHub issue templates tailored to the stack—bug report, incident report, and infrastructure issue—so triage captures service boundaries (Cloud Run revisions, Prisma migrations, Cloud Tasks queues) without free-form essays. A lightweight incident response runbook links from the incident template so on-call steps are consistent even when sleep-deprived. These documents do not replace judgment, but they reduce the activation energy of doing the right thing when production misbehaves.
## Testing
- [ ] Unit tests added/updated for changed behavior (Vitest).
- [ ] `pnpm vitest run <relevant-path>` passes locally (paste summary).
- [ ] Manual checks (if UI/workers): …
## Risk & rollout
- Data migrations: none / forward-only / requires backfill
- Feature flags: …
Within days of merging the template, pull requests started carrying test notes without nagging. That is the quiet power of defaults: when the empty box is visible, people fill it.
Reviewers benefited too. Instead of guessing whether a change was exercised, they could scan a predictable section for commands and outcomes. When a regression slipped through, the postmortem could compare what the template asked for against what was merged—useful without turning code review into a courtroom. The templates did not replace conversation; they anchored it.
Date formatting consistency: one bug class, one utility layer
QA filed a classic inconsistency: CSV exports exposed raw ISO timestamps in UTC while on-screen tables showed localized short dates. Users reconciling spreadsheets against the admin panel assumed the product was wrong somewhere; in truth, two presentation paths diverged. The fix is not to patch the one export that triggered the ticket—it is to centralize formatting and apply it everywhere dates cross a boundary (tables, detail drawers, CSV, and any PDFs if you add them later).
We introduced three shared functions: formatDateShort for dense tables, formatDateFull for detail views where context matters, and formatDateCsv for exports where we want a stable, human-readable string without surprising timezone shifts. Each function documents its timezone policy in one place so future contributors do not reintroduce drift.
const LOCALE = "en-US";
const TZ = "America/New_York"; // example: pick the org’s reporting timezone
export function formatDateShort(d: Date): string {
return new Intl.DateTimeFormat(LOCALE, {
timeZone: TZ, month: "short", day: "numeric", year: "numeric",
}).format(d);
}
export function formatDateFull(d: Date): string {
return new Intl.DateTimeFormat(LOCALE, {
timeZone: TZ, dateStyle: "medium", timeStyle: "short",
}).format(d);
}
/** CSV-safe, fixed-width style; avoids raw ISO Z unless explicitly desired */
export function formatDateCsv(d: Date): string {
return new Intl.DateTimeFormat("en-CA", {
timeZone: TZ, year: "numeric", month: "2-digit", day: "2-digit",
hour: "2-digit", minute: "2-digit", second: "2-digit", hour12: false,
}).format(d);
}
After wiring these through tables and export builders, the product speaks one temporal language. The meta-lesson: inconsistencies are often symptoms of missing shared primitives, not of “one bad line.” Extract the primitive, then sweep call sites.
Feature lockdown: preserve work, remove ambiguity
The payments feature shipped as a polished four-tab interface—overview, pay rates, user adjustments, payment records—but server routes and configuration were not ready for production traffic. Deleting the UI would have wasted design and front-end effort and confused stakeholders who knew the capability was “almost there.” Instead, we implemented a lockdown pattern: the backend guard forces safe defaults on every read/write of configuration, the settings page shows an amber “Feature under development” banner with controls disabled, and the payments page is replaced by a static explanation while navigation keeps the entry visible so admins know the surface exists but is intentionally inactive.
This is strictly better than a silent half-feature: operators see honesty instead of fragile controls. We added seven focused unit tests on the guard so refactors cannot accidentally re-enable risky flags.
type PaymentsConfig = {
enabled: boolean;
autoCalculate: boolean;
// …other fields preserved for forward compatibility
};
/** Ensures incomplete payments cannot be toggled on via API/UI drift */
export function buildPaymentsValue(input: PaymentsConfig): PaymentsConfig {
return {
...input,
enabled: false,
autoCalculate: false,
};
}
When the backend and product are ready, you flip the guard behind a feature flag or remove it in a single, auditable change—rather than resurrecting deleted routes from memory.
GitHub Actions and Node.js 24: proactive maintenance
Deprecation notices for Actions runtimes are easy to ignore until they become hard failures. We audited five workflow files and bumped first-party actions to current majors—actions/checkout v4 to v5, actions/setup-node v4 to v5, and refreshed Google Cloud auth actions to their latest compatible major—then validated builds, tests, and deploy hooks on a branch. No semantic surprises surfaced because the upgrades were mechanical; the value is runway: CI stays compatible with the Node.js 24 toolchain the application targets, and we avoid emergency upgrades during a release week.
Treating CI like product code—periodic hygiene PRs, changelog scanning, pinned majors you understand—pays dividends that rarely make flashy blog titles but absolutely prevent weekend pages.
We also smoke-tested the workflows on a forked branch with representative secrets stubbed, so we were not learning about YAML edge cases during a deploy freeze. That extra hour of verification is cheaper than a broken pipeline when someone urgently needs a hotfix.
The incremental approach: many small wins beat one freeze
None of the changes above required a migration weekend. Each landed as its own pull request: rules files, templates, date utilities, payments lockdown with tests, Actions bumps. Reviewers could reason about them independently, and rollbacks were trivial if something surprised us. The testing section in the PR template turned out to be the highest-impact nudge; once the path of least resistance included “show your Vitest run,” tests became part of the normal conversation instead of a special-occasion chore.
If you are staring at a legacy-adjacent codebase with an active team, start where friction is loudest—dates, half-features, CI warnings—and encode standards as tooling and templates rather than lectures. Standards that ship quietly, week after week, compound into a culture that looks inevitable in hindsight.
The sequence matters less than the cadence: pick one credible improvement, merge it, measure whether review comments and incidents get easier, then stack the next change. Momentum is the standard.