Sergey Orsik.dev

2026-05-16

Synchronous Upload Gate — Validation and Moderation Before Async Work

Why probe + policy checks run on completeUpload, retry semantics, object lifecycle, and separation from the transcode worker.

Problem statement

If the API enqueued Kafka messages before knowing an object is playable and policy-compliant:

  • Workers waste CPU on corrupt files and malware.
  • CDN keys for rejected content may already be cached or shared.
  • Compliance teams cannot enforce "no persist" guarantees for prohibited imagery.

The complete-multipart handler (completeUpload) is therefore a deliberate synchronous gate: not an anti-pattern, but a bounded latency trade-off.


Step 1 — Technical validation (FFmpeg / ffprobe class)

Inputs: Presigned GET URL or internal S3 URL, object key, declared size.

Checks (illustrative):

| Check | Failure class |
|---|---|
| Container readable | MediaValidationException |
| Video: codec/duration bounds | reject if > max duration |
| Image: decodable dimensions | reject if extreme aspect or 0×0 |
| MIME vs extension mismatch | optional hard fail |

On failure: call DeleteObject immediately; the code comment typically gives the reason as "prevent orphan billing." The client receives a 4xx and no DB row is written.

Latency: Usually sub-second for images; 1–3s for short video probes.
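
The checks in the table can be sketched as a pure function over an already-parsed probe result. This is a minimal sketch under stated assumptions, not the project's code: `ProbeResult`, `validate`, and both limits are hypothetical names and values, and only `MediaValidationException` comes from the table. Running ffprobe and parsing its output is assumed to happen elsewhere, and the codec and MIME checks are left out for brevity.

```java
import java.time.Duration;

final class ProbeGate {
    // Hypothetical limits; real values would come from configuration.
    static final Duration MAX_VIDEO_DURATION = Duration.ofSeconds(60);
    static final double MAX_IMAGE_ASPECT_RATIO = 4.0;

    /** Named in the checks table; the constructor shape is an assumption. */
    static final class MediaValidationException extends RuntimeException {
        MediaValidationException(String message) { super(message); }
    }

    /** Hypothetical, already-parsed probe output. duration may be null for images. */
    record ProbeResult(boolean containerReadable, boolean video,
                       Duration duration, int width, int height) {}

    /** Throws MediaValidationException on the first failed check; returns normally otherwise. */
    static void validate(ProbeResult p) {
        if (!p.containerReadable()) {
            throw new MediaValidationException("container not readable");
        }
        if (p.video()) {
            // Video: reject if longer than the configured maximum.
            if (p.duration().compareTo(MAX_VIDEO_DURATION) > 0) {
                throw new MediaValidationException("video exceeds max duration");
            }
            return;
        }
        // Image: reject 0x0 (or negative) dimensions before dividing.
        if (p.width() <= 0 || p.height() <= 0) {
            throw new MediaValidationException("image has no decodable dimensions");
        }
        double aspect = (double) Math.max(p.width(), p.height())
                / Math.min(p.width(), p.height());
        if (aspect > MAX_IMAGE_ASPECT_RATIO) {
            throw new MediaValidationException("extreme aspect ratio: " + aspect);
        }
    }
}
```

The caller would catch `MediaValidationException`, delete the object, and map the failure to a 4xx response.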


Step 2 — Content moderation (third-party vision API class)

Inputs: The same reachable URL (HTTPS). The filename may hint at context but should not be the sole signal.

Exception taxonomy (critical for retries)

| Exception | Object in bucket | HTTP to client | Retry |
|---|---|---|---|
| InappropriateContentException | Deleted | 4xx policy | No |
| ContentModerationUnavailableException | Retained | 5xx/503 after retries exhausted | Yes (@Retryable) |
| Timeout / rate limit | Retained | Retry | Yes |

Why retain on vendor outage: deleting the object would turn a transient vendor blip into a permanent upload failure for legitimate content; retaining it lets ops re-drive moderation later. Trade-off: there is a brief window in which an unmoderated object exists in the private bucket. Mitigate with a bucket policy that forbids public ACLs and by not publishing to the CDN until the gate passes (indirection URLs still require auth at the edge).
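
The taxonomy above can be made concrete with a plain retry loop. This is a sketch only: the source expresses the retry with Spring Retry's `@Retryable`, whereas the explicit loop here just makes the control flow visible and has no backoff. The `Moderator` and `ObjectStore` interfaces, the `moderate` method, and the choice of 422 for the "4xx policy" response are all assumptions; only the two exception names come from the table.

```java
final class ModerationGate {
    /** Exception names from the taxonomy table; bodies are assumptions. */
    static final class InappropriateContentException extends RuntimeException {}
    static final class ContentModerationUnavailableException extends RuntimeException {}

    /** Hypothetical seam over the object store. */
    interface ObjectStore { void delete(String key); }

    /** Hypothetical vendor client: returns normally when content passes. */
    interface Moderator { void check(String url); }

    /** Returns the HTTP status the client would see; 200 means the gate passed. */
    static int moderate(Moderator moderator, ObjectStore store,
                        String key, String url, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                moderator.check(url);
                return 200;
            } catch (InappropriateContentException e) {
                // Policy rejection: delete the object and never retry.
                store.delete(key);
                return 422; // assumed status for "4xx policy"
            } catch (ContentModerationUnavailableException e) {
                // Transient vendor failure: keep the object and try again.
            }
        }
        // Retries exhausted: object is retained so moderation can be re-driven.
        return 503;
    }
}
```

Note that only the policy branch touches the bucket; both the retry path and the exhausted path leave the object in place.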

Duplicate moderation calls

Spring Retry may invoke the vendor several times for the same bytes when errors are transient. That extra cost is acceptable compared with blocking uploads; consider sending an idempotency token if the vendor supports one.
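
One way to derive such a token is to hash values that stay fixed across retries of the same upload. The sketch below assumes the object key and the object's ETag are available and that the vendor accepts an opaque string token; neither is confirmed by the source, and `ModerationIdempotency` and `tokenFor` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

final class ModerationIdempotency {
    /** Same key and ETag always yield the same 64-character hex token. */
    static String tokenFor(String objectKey, String etag) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            sha256.update(objectKey.getBytes(StandardCharsets.UTF_8));
            // Separator byte so ("ab", "c") and ("a", "bc") hash differently.
            sha256.update((byte) 0);
            sha256.update(etag.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(sha256.digest());
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }
}
```

Because the token depends only on the stored object, every retry attempt for the same bytes presents the same value, which is what lets a vendor deduplicate the calls.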


Step 3 — Metadata insert

Only after steps 1–2 succeed:

  • Insert a profile_media row with media_url set to the API indirection URL (which points at the original key path).
  • Assign serial_number / gallery position under row locks when uploads can be concurrent.

Race: two completes for the same slot. A DB unique constraint plus shift logic prevents duplicate slot numbers.


Step 4 — Kafka publish

Fire-and-forget async send with whenComplete logging:

on failure: log.error(mediaId) — row already committed
on success: log partition/offset

Gap: there is no transactional outbox, so if the broker is down the row is committed with no corresponding Kafka message (an orphan row). Mitigations are listed in the project-architecture failure table.
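
The publish step can be sketched with a `CompletableFuture` callback. This is an illustration under assumptions: `Publisher`, `Ack`, the `media.uploaded` topic, and the payload shape are hypothetical stand-ins so the example runs without a broker. In Spring Kafka 3.x, `KafkaTemplate.send` also returns a `CompletableFuture`, so the same `whenComplete` shape should carry over.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

final class PublishAfterCommit {
    /** Hypothetical broker acknowledgement. */
    record Ack(int partition, long offset) {}

    /** Hypothetical seam over the Kafka producer. */
    interface Publisher {
        CompletableFuture<Ack> send(String topic, String key, String payload);
    }

    /** Sends without blocking the request thread; the DB row is already committed. */
    static CompletableFuture<Ack> publish(Publisher publisher, long mediaId,
                                          Consumer<String> log) {
        String payload = "{\"mediaId\":" + mediaId + "}";
        return publisher.send("media.uploaded", Long.toString(mediaId), payload)
            .whenComplete((ack, error) -> {
                if (error != null) {
                    // Nothing is rolled back here: this is the orphan-row gap.
                    log.accept("ERROR publish failed mediaId=" + mediaId
                            + " cause=" + error);
                } else {
                    log.accept("INFO published mediaId=" + mediaId
                            + " partition=" + ack.partition()
                            + " offset=" + ack.offset());
                }
            });
    }
}
```

Logging the mediaId on failure is what makes a later manual or scheduled re-publish possible in the absence of an outbox.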


Redis auxiliary: upload description

Pattern:

SET media:init:desc:{uploadId} "{user text}" EX 3600

On complete, read the key and then delete it. If Redis is lost, the upload still succeeds, just without a description; this is acceptable degradation.
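
A minimal sketch of that read-then-delete step with graceful degradation follows. The `KeyValueStore` interface is a hypothetical seam over whichever Redis client the project uses, and `consume` is an illustrative name; only the key pattern comes from the text above.

```java
import java.util.Optional;

final class UploadDescription {
    /** Hypothetical seam over a Redis client. */
    interface KeyValueStore {
        String get(String key);   // returns null when the key is absent
        void delete(String key);
    }

    static String keyFor(String uploadId) {
        return "media:init:desc:" + uploadId;
    }

    /** Returns the stored description, or empty if it is missing or Redis fails. */
    static Optional<String> consume(KeyValueStore redis, String uploadId) {
        String key = keyFor(uploadId);
        try {
            String text = redis.get(key);
            if (text != null) {
                redis.delete(key); // one-shot value: remove after reading
            }
            return Optional.ofNullable(text);
        } catch (RuntimeException e) {
            // Redis unavailable: degrade to "no description" rather than fail the upload.
            return Optional.empty();
        }
    }
}
```

On Redis 6.2 or later, the GETDEL command performs the read and delete atomically and could replace the two calls.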


Second line: antivirus (async HTTP)

ClamAV or a cloud AV service, run on the worker side via POST /scan:

| Aspect | Hot path gate | AV scan |
|---|---|---|
| When | Before DB insert | After upload, optional/on-demand |
| Blocks user | Yes | Only if exposed API waits (usually no) |
| Cleanup | API deletes on policy | Worker deletes + virus Kafka event |

Do not assume AV replaces moderation — different threat models (malware vs CSAM/policy).


Latency and capacity planning

| Factor | Impact on complete p99 |
|---|---|
| Moderation API region | RTT to vendor |
| Image vs video | Video frames cost more |
| Retry storms during outage | Multiplies vendor QPS |
| Sync FFmpeg on large file | Keep probe lightweight (headers only) |

Scaling lever: the gate stays on the API tier, so scale API replicas horizontally for gate load; transcode load belongs to the worker tier and should not drive API scaling. If gate p99 is unacceptable, move moderation to an async flow with a "pending_visibility" state, at the cost of more product complexity.


Comparison table: sync gate vs async moderation

| Approach | Pros | Cons |
|---|---|---|
| Sync gate (this pattern) | Simple mental model; no public bad content | Upload latency tied to vendor |
| Async moderation + quarantine | Fast upload ACK | Complex UI states; leak risk if URLs guessable |
| Client-side only | Cheap | Not defensible for UGC platforms |