fix: pipeline robustness hardening #1

Closed
ronit wants to merge 0 commits from fix/pipeline-robustness-hardening into master

Summary

  • Atomic webhook dedup: replace the TOCTOU DashMap check-then-insert with the entry() API, preventing duplicate pipeline runs from concurrent webhooks
  • Dedup eviction: a background task clears stale entries every 30s (60s TTL), preventing unbounded memory growth
  • NATS flush: completion events now call flush() after publish() so buffered messages are sent to the NATS server instead of sitting in the client buffer
  • RPC timeout: switch to request_rpc_with_timeout (300s) instead of the unbounded request_rpc
  • Reviewer: fetch the default branch from the Forgejo API instead of hardcoding main/master
  • ORG_ACCOUNT_ID: replace the duplicated per-file inline UUID functions with a shared constant
  • Silent failure logging: 5 previously silent failure paths now log warnings (missing PR, malformed repo name, missing or non-ASCII event header)
  • Pre-initial migration: 202603229999_create_pipeline_schema.sql ensures CREATE SCHEMA IF NOT EXISTS pipeline runs before the initial migration on fresh databases
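
The atomic dedup item above can be illustrated with a minimal sketch. It is not the code in this PR: it uses a std `Mutex<HashMap>` as a stand-in for DashMap so it compiles without external crates, and `DedupCache`, `try_claim`, `concurrent_winners`, and the `delivery-123` key are hypothetical names. The point is that the existence check and the insert happen under one lock acquisition, which is what an entry-style API gives you; a separate `contains_key` followed by `insert` would leave a window in which two concurrent webhooks both see "not present".

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Instant;

/// Hypothetical dedup cache keyed by a webhook delivery identifier.
pub struct DedupCache {
    seen: Mutex<HashMap<String, Instant>>,
}

impl DedupCache {
    pub fn new() -> Self {
        Self { seen: Mutex::new(HashMap::new()) }
    }

    /// Returns true only for the first caller that claims `key`.
    /// The check and the insert share a single lock acquisition,
    /// so there is no gap between "is it there?" and "put it there".
    pub fn try_claim(&self, key: &str) -> bool {
        let mut seen = self.seen.lock().expect("dedup lock poisoned");
        match seen.entry(key.to_owned()) {
            Entry::Occupied(_) => false,
            Entry::Vacant(slot) => {
                slot.insert(Instant::now());
                true
            }
        }
    }
}

/// Fires `workers` concurrent claims for the same key and counts the winners.
pub fn concurrent_winners(workers: usize) -> usize {
    let cache = Arc::new(DedupCache::new());
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let cache = Arc::clone(&cache);
            thread::spawn(move || cache.try_claim("delivery-123"))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .filter(|won| *won)
        .count()
}
```

However many workers race on the same key, exactly one `try_claim` call returns true, so only one pipeline run would be started for that delivery.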

Test plan

  • cargo check — compiles with pre-existing warnings only
  • Deployed to prod — pod healthy, webhook processing confirmed via logs
  • New migration applied successfully alongside existing migrations

🤖 Generated with Claude Code

fix: harden pipeline-service against silent failures and race conditions
All checks were successful
pipeline-service No pipeline action (save)
6f23a97c1f
- webhook dedup: replace TOCTOU DashMap check-then-insert with atomic entry() API
- webhook dedup: add background eviction task (30s interval, 60s TTL) to prevent unbounded growth
- notify: add nats.client.flush() after publish to ensure completion events reach NATS
- deploy RPC: switch to request_rpc_with_timeout (300s) instead of unbounded request_rpc
- deploy: use shared ORG_ACCOUNT_ID constant instead of per-file inline functions
- reviewer: fetch default branch from Forgejo API instead of hardcoding main/master
- executor: log warnings for missing PR (result was silently dropped) and malformed repo names
- webhook: log warnings for missing/non-ASCII x-forgejo-event header and malformed repo names
- migration: add compensating CREATE SCHEMA IF NOT EXISTS so that a bare `sqlx migrate run` is safe
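
A rough sketch of the eviction sweep described above (30s interval, 60s TTL), assuming the dedup map stores the time each key was first seen. `evict_stale` is a hypothetical helper over a std `HashMap`, not the service's actual function; DashMap offers a comparable `retain` method. The timer that would call it every 30s is omitted.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Drops every entry first seen at least `ttl` before `now`.
/// Returns the number of entries removed.
pub fn evict_stale(
    seen: &mut HashMap<String, Instant>,
    now: Instant,
    ttl: Duration,
) -> usize {
    let before = seen.len();
    // saturating_duration_since yields zero if `first_seen` is after `now`,
    // so clock oddities never cause a panic or an early eviction.
    seen.retain(|_, first_seen| now.saturating_duration_since(*first_seen) < ttl);
    before - seen.len()
}
```

Passing `now` in, rather than calling `Instant::now()` inside the function, keeps the sweep deterministic under test. With a 60s TTL and a 30s sweep interval, an entry lives for at most roughly 90s before it is removed.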

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fix: replace compensating migration with pre-initial schema creation
e81796a370
The previous compensating migration (202603240001) ran after the initial
migration and could not protect bare `sqlx migrate run` on a fresh database.
Replace with 202603229999 which runs before the initial migration, creating
the pipeline schema before SET search_path TO pipeline executes.
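
Why the lower version number matters can be sketched as follows, assuming migrations are applied in ascending order of the numeric prefix in `<version>_<description>.sql`. The file names `202603230000_initial.sql` and `202603240001_compensating.sql` used in the usage note are hypothetical; only the version numbers 202603229999 and 202603240001 and the name `202603229999_create_pipeline_schema.sql` come from this PR.

```rust
/// Parses the numeric version prefix of a `<version>_<description>.sql` name.
pub fn version_of(file_name: &str) -> Option<i64> {
    file_name.split_once('_')?.0.parse().ok()
}

/// Returns the file names in the order a version-sorted migrator would apply them.
/// Names without a parseable version prefix are skipped.
pub fn apply_order<'a>(files: &[&'a str]) -> Vec<&'a str> {
    let mut versioned: Vec<(i64, &'a str)> = files
        .iter()
        .filter_map(|&name| version_of(name).map(|v| (v, name)))
        .collect();
    versioned.sort_by_key(|(version, _)| *version);
    versioned.into_iter().map(|(_, name)| name).collect()
}
```

Given the three files above, `apply_order` puts `202603229999_create_pipeline_schema.sql` first, so the schema exists before the initial migration runs, whereas 202603240001 sorts last and runs too late to help.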

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fix: restore applied migration file — sqlx requires it in the resolved set
1bd1800025
sqlx validates that all previously-applied migrations still exist as files.
Deleting 202603240001 caused CrashLoopBackOff. Restore the file while
keeping 202603229999 as the pre-initial schema creation for fresh DBs.
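
The failure mode described here can be modelled with a simplified check, not sqlx's actual implementation: every version recorded as applied in the database must still be present among the migration files. `missing_versions` is a hypothetical helper, and 202603230000 in the usage note is a made-up version for the initial migration.

```rust
use std::collections::BTreeSet;

/// Returns the applied versions that have no matching migration file,
/// in ascending order. An empty result means startup validation would pass.
pub fn missing_versions(applied: &[i64], on_disk: &[i64]) -> Vec<i64> {
    let files: BTreeSet<i64> = on_disk.iter().copied().collect();
    let mut missing: Vec<i64> = applied
        .iter()
        .copied()
        .filter(|version| !files.contains(version))
        .collect();
    missing.sort_unstable();
    missing
}
```

With 202603240001 applied but deleted from disk, the function reports that version as missing; once the file is restored alongside 202603229999, the result is empty.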

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fix: restore exact original migration content to match sqlx checksum
da8bd60a7f
sqlx checksums the full file content including comments. The restored
file had different comments, causing "migration has been modified" error.
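
To illustrate why a comment-only difference is enough to trip the check, here is a small sketch that fingerprints the full file content. It uses std's `DefaultHasher` purely as a stand-in for whatever hash sqlx computes, and `content_fingerprint` and `is_unmodified` are hypothetical names; the only point is that the fingerprint covers every byte, comments included.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Fingerprints the entire migration text, comments and whitespace included.
pub fn content_fingerprint(sql: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    sql.as_bytes().hash(&mut hasher);
    hasher.finish()
}

/// True when the file on disk matches the content that was originally applied.
pub fn is_unmodified(applied_sql: &str, file_sql: &str) -> bool {
    content_fingerprint(applied_sql) == content_fingerprint(file_sql)
}
```

Two files containing the same `CREATE SCHEMA IF NOT EXISTS pipeline;` statement but different leading comments produce different fingerprints, which is why the restored file had to match the original content exactly.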

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ronit closed this pull request 2026-03-25 04:57:31 +00:00

Reference
isoastra/pipeline-service!1