
How I Find Bugs: Filing 33 Issues in One Week

8 min read · Humza Tareen
Code Audit · Debugging · Security · Python

The Art of Systematic Bug Hunting

I filed 33 issues in one week across four production services. Not because the code was terrible — it wasn't. But production codebases accumulate subtle bugs that only surface under specific conditions: race conditions under load, timezone mismatches in cron jobs, unbounded queries on large datasets, security assumptions that don't hold.

Here's the systematic approach I use to find them.

The Methodology

I audit in layers, starting from the outside and working inward:

Layer 1: API Surface. Every public endpoint gets scrutinized. Does it validate input? Does it check authorization? Can I pass unexpected values (negative pagination limits, empty arrays, duplicate entries)? I found that one API accepted limit=0 and limit=-1, both returning the entire dataset — an OOM risk on large tables.
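
The fix is easy to sketch: validate the limit before it ever reaches the query. This is a minimal standalone helper, assuming nothing about the audited service; the cap of 500 is illustrative.

```python
def clamp_limit(limit: int, max_limit: int = 500) -> int:
    """Bound a pagination limit: reject non-positive values, cap large ones.

    limit=0 or limit=-1 must never mean "return everything".
    (The cap of 500 is illustrative, not the audited service's value.)
    """
    if limit <= 0:
        raise ValueError(f"limit must be positive, got {limit}")
    return min(limit, max_limit)
```

Whether to reject or silently clamp is a policy choice; what matters is that a non-positive limit never falls through to an unbounded query.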

Layer 2: Authentication & Authorization. Are admin endpoints actually protected? Can an authenticated user access another user's resources? I found task endpoints with no ownership checks — any authenticated user could read, update, or cancel any task in the system.
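
The missing guard is small, which is exactly why it gets skipped. A sketch of the ownership check (`Task`, `Forbidden`, and the field names are hypothetical stand-ins, not the audited service's API):

```python
from dataclasses import dataclass

@dataclass
class Task:
    id: int
    owner_id: int

class Forbidden(Exception):
    pass

def authorize_task_access(task: Task, current_user_id: int) -> Task:
    # The missing check: being authenticated is not enough --
    # the task must also belong to the requester.
    if task.owner_id != current_user_id:
        raise Forbidden(f"user {current_user_id} does not own task {task.id}")
    return task
```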

Layer 3: Data Integrity. Are there unique constraints where there should be? Can the same input create duplicate records? I found that the same PR URL could create unlimited concurrent evaluation tasks with no deduplication.
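
One way to enforce deduplication is at the database layer, sketched here with an in-memory SQLite table and a unique constraint on `pr_url`. The schema and names are illustrative; a real fix might instead use a partial unique index scoped to active task statuses so completed tasks don't block re-runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE evaluation_tasks (
        id INTEGER PRIMARY KEY,
        pr_url TEXT NOT NULL UNIQUE,
        status TEXT NOT NULL DEFAULT 'pending'
    )
""")

def enqueue(pr_url: str) -> bool:
    """Create a task for pr_url; return False if one already exists."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO evaluation_tasks (pr_url) VALUES (?)", (pr_url,)
            )
        return True
    except sqlite3.IntegrityError:
        # UNIQUE constraint hit: a task for this PR already exists.
        return False
```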

Layer 4: Error Handling. What happens when dependencies fail? When the database is slow? When an external API returns unexpected responses? I found analytics endpoints that loaded unbounded result sets with no pagination — fine with 100 records, catastrophic with 100,000.
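
A bounded alternative to loading the full result set is to page through it. A generic sketch, where `fetch_page` is a hypothetical callable wrapping the real query:

```python
def paginate(fetch_page, page_size: int = 200):
    """Yield all rows by fetching bounded pages instead of one giant query.

    fetch_page(offset, size) is a hypothetical callable wrapping the
    real database query (OFFSET/LIMIT or, better, keyset pagination).
    """
    offset = 0
    while True:
        rows = fetch_page(offset, page_size)
        yield from rows
        if len(rows) < page_size:
            return
        offset += page_size
```

Memory stays proportional to `page_size` no matter how large the table grows, which is the property the analytics endpoints were missing.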

Layer 5: Security. SSRF vectors, path traversal, CORS misconfigurations, secrets in logs. I found a path traversal vulnerability in a file content endpoint — the file_path parameter wasn't validated against the repository root, allowing ../../etc/passwd style attacks.
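
The standard defense is to resolve the requested path and verify it still falls under the repository root. A sketch using `pathlib` (assumes Python 3.9+ for `Path.is_relative_to`; names are illustrative):

```python
from pathlib import Path

def safe_resolve(repo_root: str, file_path: str) -> Path:
    """Resolve file_path inside repo_root, rejecting traversal escapes.

    Catches both ../../etc/passwd-style relative escapes and absolute
    paths, because resolve() normalizes them before the containment check.
    """
    root = Path(repo_root).resolve()
    candidate = (root / file_path).resolve()
    if not candidate.is_relative_to(root):
        raise PermissionError(f"path escapes repository root: {file_path}")
    return candidate
```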

The Bugs That Surprised Me

145 uses of deprecated datetime.utcnow(). Every single one was a potential timezone bug. Python's datetime.utcnow() returns a naive datetime (no timezone info). When compared with timezone-aware datetimes from the database, it either crashes or produces wrong results. The orphan cleanup job was failing silently because of exactly this mismatch.
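
The aware replacement is `datetime.now(timezone.utc)`; the difference is visible in `tzinfo`, and mixing the two fails loudly on ordering comparisons:

```python
from datetime import datetime, timezone

naive = datetime.utcnow()           # deprecated since 3.12; tzinfo is None
aware = datetime.now(timezone.utc)  # the timezone-aware replacement

assert naive.tzinfo is None
assert aware.tzinfo is not None

# Ordering a naive datetime against an aware one raises:
try:
    _ = naive < aware
except TypeError:
    pass  # "can't compare offset-naive and offset-aware datetimes"
```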

OpenAPI docs exposing 128 API paths publicly. Including internal endpoints, admin endpoints, and cron job triggers. An attacker with the docs URL could see the entire API surface and target internal-only endpoints.
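
Frameworks like FastAPI let you disable the docs and schema routes by passing `None` for their URLs. A sketch of gating them on environment; the helper and env name are hypothetical, not taken from the audited service:

```python
def openapi_settings(env: str) -> dict:
    """FastAPI-style keyword arguments that expose the schema and docs
    only in development.

    (FastAPI accepts None for docs_url / redoc_url / openapi_url to
    disable those routes; the env handling here is hypothetical.)
    """
    is_dev = env == "development"
    return {
        "docs_url": "/docs" if is_dev else None,
        "redoc_url": None,
        "openapi_url": "/openapi.json" if is_dev else None,
    }

# Usage sketch:
# app = FastAPI(**openapi_settings(os.environ.get("ENV", "production")))
```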

Webhook secrets not enforced in development. The secret validation was behind an environment check, meaning anyone who discovered the webhook URL in a non-production environment could trigger arbitrary trainer and reviewer workflows.
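
Signature verification should run unconditionally, in every environment. A minimal HMAC-SHA256 check, using the `sha256=<hex>` header convention GitHub webhooks use (the actual provider's format may differ):

```python
import hashlib
import hmac

def verify_webhook(secret: bytes, body: bytes, signature: str) -> bool:
    """Verify an HMAC-SHA256 webhook signature.

    No environment check around this -- the guard runs everywhere.
    compare_digest avoids leaking information via timing differences.
    """
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```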

The Prioritization Framework

Not all bugs are equal. I classify by blast radius and exploitability:

Critical: Actively exploitable security vulnerabilities (SSRF, path traversal, missing auth checks). These get fixed first, often same-day.

High: Data integrity risks and reliability issues (race conditions, missing constraints, unbounded queries). These get fixed within the week.

Medium: Performance issues and code quality gaps (N+1 queries, deprecated APIs, missing indexes). These get scheduled into the next sprint.

Low: Hardening improvements (stricter CORS, security headers, input validation edge cases). These get batched into hardening PRs.

The Output

Of the 33 issues filed, 11 were closed within 3 days, either fixed by me or resolved as a side effect of related fixes. The rest are tracked with clear reproduction steps, expected behavior, and severity classification. Every issue links to the specific code path, making the fix straightforward for whoever picks it up.

The pattern works because it's repeatable. I've used the same layered approach on every service I've touched — and it consistently surfaces bugs that were hiding in plain sight.