Documentation

Webhooks Best Practices

Design resilient webhook receivers that handle retries, timeouts, and duplicate deliveries without breaking your application.


Best Practices

Core Principles for Reliable Webhook Receivers

Every webhook integration you build will face network hiccups, server restarts, and payload duplicates. Following these principles ensures your system stays resilient under real-world conditions.

⚡

Always Acknowledge Within 3 Seconds

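Respond with a 2xx status code as fast as possible and defer heavy processing to a background queue. Many providers treat a slow response as a failed delivery and retry it until their retry schedule is exhausted. Exact timeouts vary by provider, so treat 3 seconds as a conservative budget rather than a universal limit.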
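A minimal sketch of the acknowledge-then-defer pattern, assuming an in-memory array as a stand-in for a durable queue. The names jobs, acknowledge, and drainQueue are illustrative and not part of any provider's API.

```javascript
// Hypothetical in-memory job queue. A production receiver would use a durable
// queue (database table, Redis, SQS, ...) so jobs survive a restart.
const jobs = [];

// Request path: store the event and return the status code right away.
// No database queries, emails, or external API calls happen here.
function acknowledge(event) {
  jobs.push(event);
  return 200;
}

// Worker path: runs outside the request/response cycle.
// Returns the number of jobs handled.
function drainQueue(handler) {
  let handled = 0;
  while (jobs.length > 0) {
    handler(jobs.shift());
    handled += 1;
  }
  return handled;
}
```

The request handler only ever calls acknowledge; a separate worker loop or process calls drainQueue, so slow work never delays the response.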

🔐

Verify Signatures on Every Request

Never trust incoming webhook payloads blindly. Validate the signature header (e.g., X-WebhookWatch-Signature) using your shared secret before processing any data. This prevents spoofed events from corrupting your system.
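As a sketch of what that check can look like, the helper below assumes the signature is a hex-encoded HMAC-SHA256 of the raw request body, which is the scheme used in the reference handler later on this page. The function name verifySignature is illustrative.

```javascript
const crypto = require('crypto');

// Returns true only when signatureHeader is the hex HMAC-SHA256 of rawBody.
// rawBody must be the exact bytes received (Buffer or string), not a
// re-serialized copy of the parsed JSON.
function verifySignature(rawBody, signatureHeader, secret) {
  if (typeof signatureHeader !== 'string') return false;
  const expected = crypto
    .createHmac('sha256', secret)
    .update(rawBody)
    .digest('hex');
  const received = Buffer.from(signatureHeader, 'utf8');
  const wanted = Buffer.from(expected, 'utf8');
  // timingSafeEqual throws if the lengths differ, so compare lengths first.
  return received.length === wanted.length &&
    crypto.timingSafeEqual(received, wanted);
}
```

The constant-time comparison avoids leaking how many leading characters of a forged signature were correct.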

🔄

Implement Idempotency Keys

Store the event ID or idempotency key from each payload in a database. Before processing, check if you've already handled that event. This prevents duplicate charges, double emails, or corrupted state when retries arrive.
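A minimal sketch of the check, using an in-memory Map as a stand-in for the database. The name claimEvent and the three-day retention window are assumptions for illustration; pick a window at least as long as your provider's retry schedule.

```javascript
// Hypothetical in-memory store: event ID -> expiry time (ms since epoch).
// In production, a database table with a unique constraint on the event ID
// plays this role and is shared across instances.
const seenEvents = new Map();

const THREE_DAYS_MS = 3 * 24 * 60 * 60 * 1000;

// Returns true the first time an ID is claimed within ttlMs, false on repeats.
function claimEvent(eventId, now = Date.now(), ttlMs = THREE_DAYS_MS) {
  const expiresAt = seenEvents.get(eventId);
  if (expiresAt !== undefined && expiresAt > now) {
    return false; // duplicate delivery: skip processing, still respond 2xx
  }
  seenEvents.set(eventId, now + ttlMs);
  return true;
}
```

A handler would call claimEvent(req.body.id) and only process the event when it returns true.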

📊

Log Everything, Alert on Failures

Record every incoming webhook payload, response code, and processing outcome. Set up alerts for repeated 5xx errors. WebhookWatch captures all failed deliveries automatically so you can replay them from the dashboard.

🧪

Test with Simulated Failures

Use WebhookWatch to simulate network partitions, delayed responses, and duplicate deliveries. Verify your receiver handles each scenario gracefully before going to production. Real failures always expose gaps you missed in staging.

📋

Validate Payload Structure Early

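Check required fields and data types at the top of your handler. Return a 400 Bad Request for malformed payloads so the provider does not keep resending invalid data. Many providers stop retrying on a 4xx response, though some retry on any non-2xx status, so check your provider's documentation. Log the raw payload for debugging before rejecting it.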
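One way to structure the early check is a function that returns a list of problems. The id and type fields match the reference handler on this page; the created timestamp field is an assumption added for illustration.

```javascript
// Returns a list of problems; an empty list means the payload is usable.
function validatePayload(body) {
  if (body === null || typeof body !== 'object' || Array.isArray(body)) {
    return ['payload must be a JSON object'];
  }
  const errors = [];
  if (typeof body.id !== 'string' || body.id.length === 0) {
    errors.push('id must be a non-empty string');
  }
  if (typeof body.type !== 'string' || body.type.length === 0) {
    errors.push('type must be a non-empty string');
  }
  // Assumed field: a numeric creation timestamp.
  if (!Number.isFinite(body.created)) {
    errors.push('created must be a numeric timestamp');
  }
  return errors;
}
```

In a handler, a non-empty list maps to a 400 response, with the list itself written to the log next to the raw payload.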

Common Pitfalls

Mistakes That Break Webhook Integrations

We've analyzed over 12,000 webhook endpoints through WebhookWatch. These are the most frequent failure patterns we see in production systems.

💥

Processing Before Responding

Running database queries, sending emails, or calling external APIs before sending a 200 OK response. When the handler exceeds the provider's timeout (3 seconds in the guideline above), the provider retries while the first attempt may still be running. Result: each retry repeats the work, multiplying load and cost and risking corrupted state.

🕳️

Silently Swallowing Errors

Catching exceptions and returning 200 OK even when processing failed. The provider thinks delivery succeeded, but your system missed the event. You'll discover missing orders or unprocessed payments days later with no audit trail.
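One way to avoid this is to tie the status code to whether the event was actually stored. The sketch below assumes a synchronous enqueue function that throws on failure; the names statusAfterEnqueue and enqueue are illustrative.

```javascript
// enqueue is any function that stores the event durably and throws on failure.
// log is injectable so the example can run without writing to the console.
function statusAfterEnqueue(event, enqueue, log = console.error) {
  try {
    enqueue(event);
    return 200; // stored safely: the provider can stop retrying
  } catch (err) {
    log(`Failed to store ${event.id}: ${err.message}`);
    return 500; // not stored: let the provider redeliver
  }
}
```

The error is still caught and logged, but the provider sees a failure and redelivers instead of assuming success.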

🔓

Skipping Signature Verification

Running webhook endpoints without verifying the HMAC signature. Any malicious actor can send fake events to your public URL. This has led to account takeovers and fraudulent transactions in at least 3 public postmortems we've reviewed.

📦

Ignoring Event Ordering

Assuming events arrive in chronological order. When retries mix with new events, you might process a "payment.cancelled" event before "payment.completed". Idempotency keys only catch duplicates, so also compare each event's timestamp (or sequence number, if the provider sends one) against the state you have stored and discard stale updates.
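A small sketch of a timestamp guard against stale events. It assumes each event carries a resourceId, a status, and a numeric created timestamp; those field names and the in-memory Map are illustrative, not a provider's schema.

```javascript
// Hypothetical state: resource ID -> { status, updatedAt }.
const payments = new Map();

// Applies the event only if it is newer than what is stored.
// Returns true if applied, false if discarded as stale or tied.
function applyIfNewer(event) {
  const current = payments.get(event.resourceId);
  if (current && current.updatedAt >= event.created) {
    return false;
  }
  payments.set(event.resourceId, {
    status: event.status,
    updatedAt: event.created,
  });
  return true;
}
```

If a newer event is applied first and an older one arrives later through a retry, the older one is dropped and the stored state stays correct.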

67% of endpoints fail first retry

2.4s average response time

1 in 8 skip signature checks

89% recover with proper retries

Retry Logic

Designing Your Retry Strategy

Most webhook providers retry failed deliveries with exponential backoff, often with added jitter. Your receiver should anticipate these retries and handle them correctly. Here's a reference implementation for a robust webhook handler.

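const crypto = require('crypto');
const express = require('express');
const { processEvent } = require('./queue');
// Assumed local helper: the handler needs some alerting function in scope.
const { sendAlert } = require('./alerts');

const app = express();
const WEBHOOK_SECRET = process.env.WEBHOOK_SECRET;
// In-memory for illustration only: lost on restart, not shared across
// instances. Use a database or Redis in production.
const processedEvents = new Set();

// Keep the raw bytes. The signature covers the body exactly as sent, and
// JSON.stringify(req.body) is not guaranteed to reproduce those bytes.
app.use(express.json({
  verify: (req, _res, buf) => { req.rawBody = buf; },
}));

app.post('/webhooks/receive', async (req, res) => {
  // 1. Verify signature immediately
  const signature = req.headers['x-webhookwatch-signature'];
  if (typeof signature !== 'string' || !req.rawBody) {
    return res.status(401).send('Invalid signature');
  }
  const expected = crypto
    .createHmac('sha256', WEBHOOK_SECRET)
    .update(req.rawBody)
    .digest('hex');

  // Constant-time comparison. timingSafeEqual throws on unequal lengths,
  // so compare lengths first.
  const received = Buffer.from(signature, 'utf8');
  const wanted = Buffer.from(expected, 'utf8');
  if (received.length !== wanted.length ||
      !crypto.timingSafeEqual(received, wanted)) {
    return res.status(401).send('Invalid signature');
  }

  // 2. Check idempotency before any processing
  const eventId = req.body.id;
  const eventType = req.body.type;

  if (processedEvents.has(eventId)) {
    console.log(`Event ${eventId} already processed, returning 200`);
    return res.status(200).send('Already processed');
  }

  // 3. Acknowledge immediately, defer work
  processedEvents.add(eventId);
  res.status(200).send('Received');

  // 4. Process asynchronously. The 200 is already sent, so failures here
  //    will not trigger provider retries.
  try {
    await processEvent(eventId, eventType, req.body);
  } catch (err) {
    console.error(`Failed to process ${eventId}:`, err.message);
    // Do NOT remove from processedEvents; recover via alerting and replay
    await sendAlert({ eventId, error: err.message });
  }
});

app.listen(process.env.PORT || 3000);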

⏱️

Exponential Backoff Schedule

Many providers retry on a schedule like 1m, 5m, 15m, 1h, 4h, 12h, 24h, then stop. If each interval is measured from the previous attempt, those 7 retries span roughly 41 hours (about 1.7 days). Design your system to handle events arriving hours or days after the original trigger.
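The window covered by a schedule like this can be checked with a few lines of arithmetic. The sketch sums the example delays, assuming each one is measured from the previous attempt rather than from the original delivery; that assumption is not stated in the schedule itself and varies by provider.

```javascript
// Example delays in minutes: 1m, 5m, 15m, 1h, 4h, 12h, 24h.
const retryDelaysMinutes = [1, 5, 15, 60, 240, 720, 1440];

// Minutes elapsed since the original attempt at each retry, assuming every
// delay is counted from the previous attempt.
function cumulativeOffsets(delays) {
  let elapsed = 0;
  return delays.map((delay) => {
    elapsed += delay;
    return elapsed;
  });
}

const retryOffsets = cumulativeOffsets(retryDelaysMinutes);
const retryWindowHours = retryOffsets[retryOffsets.length - 1] / 60;
```

Under that assumption the last retry lands 2,481 minutes after the first attempt, so deduplication records need to be kept for at least that long.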

🎲

Jitter Prevents Thundering Herds

Providers add random jitter to retry intervals so thousands of failed webhooks don't hit your server simultaneously. Because jitter shifts delivery times, a retried event can arrive after a newer one, so your receiver must tolerate out-of-order arrivals as well as duplicates.
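To make the idea concrete, here is a sketch of one common variant, often called full jitter, where the delay is drawn uniformly between zero and a capped exponential value. It illustrates the sender's side and is not necessarily what any given provider does; the base and cap values are assumptions.

```javascript
// Picks a delay uniformly in [0, ceiling), where ceiling is the exponential
// backoff for this attempt, limited by capMs.
// random is injectable so the behavior can be checked deterministically.
function backoffWithJitter(attempt, baseMs = 1000, capMs = 60000, random = Math.random) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

Two deliveries that failed at the same moment will usually get different delays, which spreads the retries out instead of stacking them on the same second.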

🚨

Fail Fast on Permanent Errors

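If a payload is malformed or your schema has changed, return a 400 or 422 status code. Many providers treat a 4xx response as permanent and stop retrying, reserving retries for 5xx responses and timeouts; others retry on any non-2xx status, so confirm the behavior in your provider's documentation. Don't waste provider resources on unfixable errors.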