Webhook vs polling: the real trade-offs and the hybrid that actually works

Arguing webhook versus polling is asking the wrong question. When is each the right fit, and when do you need to run both together?

The first question whenever you wire up data sync between two systems: webhook or polling? The honest answer is usually both, depending on the case.

In the past few years I’ve answered this question on three projects: an e-commerce integration (Shopify plus ERP), a fintech notification system (bank webhook with a polling fallback), and an analytics platform (event ingestion). Same question every time, slightly different answer.

Here’s the framework I use.

Polling: plain, dumb, works

Polling: the client hits an endpoint on a schedule and checks for changes.

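last_sync_timestamp = 0  # first run: start from the beginning
while true:
    # ask only for orders changed since the last successful sync
    response = GET /api/orders?since=last_sync_timestamp
    for order in response.orders:
        process(order)
    # advance the cursor with the server's clock, not the client's
    last_sync_timestamp = response.server_time
    sleep(60)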

Upsides:
– The client pulls, so firewall and NAT aren’t your problem
– Auth is client-side, a standard Bearer token is enough
– Failure recovery is trivial: one call fails, the next catches up
– Debugging is easy: curl to test, every call shows up in logs

Downsides:
– Latency is bounded by your poll interval. With 60-second polling, a change waits 30 seconds on average and up to 60 seconds in the worst case
– Waste: most polls return “no changes”, burning bandwidth and CPU
– Server-side load: N clients polling every 60 seconds is a steady baseline load

Webhook: reactive, efficient, tricky

Webhook: the server POSTs to a URL the client provides whenever something happens. The client needs a publicly reachable endpoint.

Upsides:
– Latency measured in milliseconds. The notification arrives the moment the event does
– Efficient: no wasted polls, only real changes
– Scalable: millions of clients can each receive only their own events

Downsides:
– The client has to be publicly reachable (dev needs a tunnel, ngrok or Cloudflare Tunnel)
– Auth is harder: signature verification, secret rotation
– Delivery guarantees are fuzzy: the webhook can drop on the server side, the client can return 5xx, retry semantics vary
– Ordering isn’t guaranteed: events can arrive out of sequence
– Debugging is harder: did the POST arrive or not? Hard to inspect after the fact

Which one, when?

My decision matrix:

| Situation | Preference |
|---|---|
| Low-frequency data, sub-second latency required | Webhook |
| High-frequency data, seconds are fine | Polling |
| Client is firewalled or local dev | Polling |
| Event order is critical | Polling (with pagination) |
| Backend signal, unknown event frequency | Webhook |
| Both ends under your control | Webhook is ideal |
| Third-party consumer | Polling fallback is a must |

The mistake I see most: webhook-only

Webhooks are easy to love. Low latency, efficient, modern. I’ve met engineers who call polling “2010s technology”.

But a webhook-only system is fragile. If a webhook drops (server outage, network hiccup, endpoint returning 500), the event is just gone. There’s no mechanism for the consumer to notice an event never arrived.

A real incident: Shopify’s webhook delivery was delayed by 15 minutes. Shopify then retried, but the retries hit the rate limit on our endpoint. About 99% of the orders landed and 1% were lost: 47 orders were missing from the ERP. A support crisis followed.

The hybrid: webhook plus reconciliation

The healthiest pattern:

  1. Webhook: primary for real-time events
  2. Polling-based reconciliation: every 1 to 6 hours, fetch “last 24 hours of events” and catch anything the webhook missed

Reconciliation job:

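from datetime import datetime, timedelta, timezone

def reconcile_last_24h():
    since = datetime.now(timezone.utc) - timedelta(hours=24)
    remote_events = api.fetch_events(since=since)
    # Compare IDs against IDs; a set keeps the membership test cheap
    local_ids = {e.id for e in db.fetch_events(since=since)}

    missing = [e for e in remote_events if e.id not in local_ids]
    for event in missing:
        log.warning(f"Missed webhook: {event.id}")
        # process_event must be idempotent: a delayed webhook may still arrive
        process_event(event)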

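Reconciliation is a single job that runs infrequently (hourly is plenty). It catches anything the webhook dropped inside the lookback window, and because the 24-hour window is much longer than the gap between runs, a failed run is covered by the next one.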

Webhook delivery guarantees

If you’re designing the webhook side, guarantee these:

At-least-once delivery. Retry policy is mandatory. If the consumer returns 5xx or times out, retry with exponential backoff.
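A retry schedule with exponential backoff can be sketched as below. The base delay, cap, attempt count, and jitter are illustrative assumptions, not values from any particular provider.

```python
import random

def backoff_delays(max_attempts=6, base=1.0, cap=300.0, jitter=0.0, rng=random.random):
    """Return the wait in seconds before each retry: base * 2**n, capped.

    All defaults are illustrative assumptions. `jitter` adds up to that
    fraction on top of each delay so failed deliveries don't retry in lockstep.
    """
    delays = []
    for attempt in range(max_attempts):
        delay = min(cap, base * 2 ** attempt)
        delay += delay * jitter * rng()  # 0 when jitter is 0
        delays.append(delay)
    return delays
```

With the defaults this gives waits of 1, 2, 4, 8, 16, and 32 seconds. After the last attempt fails, the event should land somewhere visible (a dead-letter list or the delivery dashboard) rather than vanish.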

Signature verification. Every webhook should be HMAC-signed. The consumer verifies with the secret and blocks forgery.
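As a minimal sketch of both sides of that check, assuming HMAC-SHA256 over the raw request body with a hex-encoded signature (providers differ on the header name and on hex versus base64, so check their docs):

```python
import hashlib
import hmac

def sign(secret: bytes, body: bytes) -> str:
    """Producer side: hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Consumer side: recompute the signature and compare in constant time."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode(), received_signature.encode())
```

Verify against the raw bytes as received, before any JSON parsing; re-serializing the payload can change the bytes and break the signature.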

Idempotency key. Every event gets a unique ID. If the consumer sees the same ID twice, it skips.
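On the consumer, the skip logic can be as small as the sketch below. The class name is hypothetical, and the in-memory set is an assumption for brevity; a real consumer would want a durable store (for example a unique index on the event ID) so dedupe survives restarts.

```python
class EventDeduper:
    """In-memory sketch of dedupe by event ID."""

    def __init__(self):
        self._seen = set()

    def handle(self, event_id, handler):
        """Run handler once per event_id; return False for duplicates."""
        if event_id in self._seen:
            return False
        handler()
        # Mark as seen only after success, so a failed event can be retried
        self._seen.add(event_id)
        return True
```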

Delivery log. The consumer logs “this event ID arrived at this time”. Non-negotiable for support debugging.

Delivery dashboard. On the producer side, show users “in the last hour, N webhooks sent, M failed, these are retrying”. Cuts support tickets.

Consumer side: endpoint design

When you build the receiving endpoint:

Fast ack. Return 200 OK as soon as the request is accepted and hand the processing work to an async queue. Webhook producers time out quickly (typically 10 to 30 seconds, and some allow only a few seconds); anything long-running needs to happen off the request thread.
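The shape of a fast-ack endpoint, independent of web framework, is roughly the sketch below. The function names are hypothetical, and the in-process queue is an assumption to keep the example short; a durable queue is safer, since anything held in memory is lost if the process dies.

```python
import queue
import threading

inbox = queue.Queue()
processed = []

def receive_webhook(raw_body: bytes) -> int:
    """Called from the HTTP handler: enqueue and acknowledge, nothing else."""
    inbox.put(raw_body)
    return 200

def worker():
    """Runs off the request thread and does the slow work."""
    while True:
        body = inbox.get()
        if body is None:  # sentinel used to stop the worker
            break
        processed.append(body)  # stand-in for the real processing
        inbox.task_done()

threading.Thread(target=worker, daemon=True).start()
```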

Idempotency built-in. The consumer dedupes on event ID, and uses the event timestamp to discard stale updates that arrive out of order.

Rate-limit tolerant. Burst traffic is a thing (catch-up after downtime). Queue to buffer, apply backpressure.
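One way to express backpressure is a bounded buffer that sheds load once it is full, relying on the producer's retries to redeliver. This is a sketch: `make_receiver` and the `max_pending` limit are assumptions for illustration.

```python
import queue

def make_receiver(max_pending=1000):
    """Return (receive, buffer). max_pending is an illustrative limit."""
    buffer = queue.Queue(maxsize=max_pending)

    def receive(raw_body: bytes) -> int:
        try:
            buffer.put_nowait(raw_body)
        except queue.Full:
            # Shed load; an at-least-once producer will retry later
            return 503
        return 200

    return receive, buffer
```

Refusing with a 5xx is only safe when the producer actually retries, which is one more reason to keep a reconciliation job behind it.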

Logging. Log every webhook with its raw body. Otherwise debugging lost events is hopeless.

Polling: efficient if you design it right

Polling isn’t always wasteful. Good patterns:

Incremental pull with a since cursor. Don’t refetch the whole dataset every time. Fetch records newer than the last timestamp.

Conditional requests. Send the stored ETag back in If-None-Match, or the stored Last-Modified date in If-Modified-Since. If nothing changed, you get a 304 Not Modified with no body and save bandwidth.
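A client-side sketch of the ETag flow follows. `ConditionalPoller` is a hypothetical name, and `fetch` is an assumed callable `(url, headers) -> (status, headers, body)` wrapping whatever HTTP client you use.

```python
class ConditionalPoller:
    """Remembers the last ETag and body for the resource it polls."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._etag = None
        self._body = None

    def poll(self, url):
        """Return (body, changed)."""
        headers = {"If-None-Match": self._etag} if self._etag else {}
        status, resp_headers, body = self._fetch(url, headers)
        if status == 304:
            return self._body, False  # unchanged: reuse the cached body
        self._etag = resp_headers.get("ETag")
        self._body = body
        return body, True
```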

Adaptive polling interval. Poll often when there’s activity, slowly when there isn’t. Active session: every 10 seconds. Idle: every 10 minutes.
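One simple policy, as a sketch: snap back to the fast rate when a poll finds changes, otherwise double the wait up to the slow rate. The 10-second and 10-minute bounds come from the figures above; the doubling step is my assumption.

```python
def next_interval(current, had_changes, fastest=10, slowest=600):
    """Seconds to wait before the next poll."""
    if had_changes:
        return fastest
    return min(slowest, current * 2)
```

Starting from 10 seconds with no activity, the interval goes 20, 40, 80, 160, 320, then stays at 600 until something changes.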

Long polling. The client holds the request open and the server responds when something changes. You get the feel of real-time with polling’s simplicity.

Long polling is a near-alternative to webhooks and, because it’s client-initiated, it bypasses NAT and firewall issues.
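On the server, a long-poll handler boils down to "block until there is an event or a timeout passes". A minimal sketch, assuming events arrive on an in-process queue and that an empty result is signalled with 204 (both are illustrative choices):

```python
import queue

def long_poll(events: queue.Queue, timeout=25.0):
    """Block until an event arrives or the timeout passes.

    Returns (status, payload).
    """
    try:
        return 200, events.get(timeout=timeout)
    except queue.Empty:
        # Nothing happened; the client simply re-issues the request
        return 204, None
```

Keep the timeout below whatever idle limit your proxies and load balancers enforce, or they will cut the connection first.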

GraphQL subscriptions, SSE, WebSocket

Other transports:

Server-Sent Events (SSE). One-way push from server to client over a long-lived HTTP connection. Native in browsers, no polyfill needed.
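The wire format is plain text, which is part of the appeal. A sketch of a formatter for one message in a `text/event-stream` response (the function name is hypothetical):

```python
from typing import Optional

def sse_message(data: str, event: Optional[str] = None, event_id: Optional[str] = None) -> str:
    """Format one message for a text/event-stream response."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Each line of a multi-line payload needs its own "data:" field
    lines.extend(f"data: {part}" for part in data.split("\n"))
    return "\n".join(lines) + "\n\n"  # a blank line terminates the message
```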

WebSocket. Bidirectional, low-latency. Chat, games, collaborative editors.

GraphQL subscriptions. Event streams defined by the subscription operation in the GraphQL spec, typically carried over WebSocket.

These can beat webhook and polling when:
– The client can hold an open connection indefinitely
– You need bidirectional messaging
– The browser talks directly to the server

For server-to-server backend integrations, webhooks usually fit better.

Final advice

Don’t ask “webhook or polling?” Ask “how do I combine them?”

  • Webhook for real-time
  • Polling reconciliation for reliability
  • Logging on both for debugging
  • A delivery dashboard for monitoring

Running two systems in parallel seems complicated at first. During an outage it’s exactly what keeps you afloat.
