
nginx Bot & Crawler Detection

nginx doesn't call out to an API per request without extra modules, so the reliable pattern is: log the signals Kitbase needs as structured JSON, then ship them to the ingestion endpoint in batches.

Privacy — we only keep the bots

Forwarding every request doesn't mean every request is stored. Human visitors' signals are used only to classify the request in memory and are then discarded — only bot and crawler requests are persisted. For those, the raw IP is stored only when IP logging is enabled for the environment; otherwise it's used to derive geolocation (country, region, city) and then dropped.

Prerequisites

  • KITBASE_API_KEY — your project's secret API key (sk_kitbase_…).
  • KITBASE_ENVIRONMENT — the target environment name, e.g. Production.

1. Log the signals as JSON

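Add a JSON log_format in the http block and write it to a dedicated access log; access_log can sit at the http, server, or location level. The escape=json parameter requires nginx 1.11.8 or later. ($uri is the normalized path; the client timestamp is omitted so Kitbase stamps each event at receipt time.)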

nginx
log_format kitbase escape=json '{'
    '"user_agent":"$http_user_agent",'
    '"ip_address":"$remote_addr",'
    '"method":"$request_method",'
    '"host":"$host",'
    '"path":"$uri",'
    '"referrer":"$http_referer"'
'}';

access_log /var/log/nginx/kitbase.log kitbase;
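
Before shipping anything, it can help to parse one line from the new log and confirm that the six fields are present. The sketch below is only an illustration: `parseLogLine` and `REQUIRED_FIELDS` are hypothetical names rather than part of Kitbase, and the sample line uses made-up values in the shape the log_format above emits.

```javascript
// Hypothetical helper: parse one line of kitbase.log and check its shape.
const REQUIRED_FIELDS = ["user_agent", "ip_address", "method", "host", "path", "referrer"];

function parseLogLine(line) {
  let event;
  try {
    event = JSON.parse(line);
  } catch {
    return null; // not valid JSON (for example, a truncated line)
  }
  if (event === null || typeof event !== "object") return null;
  // Every field named in the log_format should be present as a string.
  const ok = REQUIRED_FIELDS.every((key) => typeof event[key] === "string");
  return ok ? event : null;
}

// Sample line with made-up values, in the shape the log_format emits.
const sample =
  '{"user_agent":"ExampleBot/1.0","ip_address":"203.0.113.7","method":"GET",' +
  '"host":"example.com","path":"/pricing","referrer":""}';

console.log(parseLogLine(sample));          // the parsed event object
console.log(parseLogLine('{"path":"/x"'));  // null
```

A line that fails this check usually means the log_format was edited or the file is being read mid-write.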

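If nginx sits behind another proxy or load balancer, $remote_addr is that proxy's address rather than the client's. In that case take ip_address from the X-Forwarded-For header instead. Note that $http_x_forwarded_for holds the whole header, which can be a comma-separated list; only its first entry is the client IP. Trust the header only when it is set by a proxy you control, since clients can send their own value. Alternatively, nginx's realip module (set_real_ip_from, real_ip_header) can rewrite $remote_addr to the client address so the log_format above works unchanged.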
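
X-Forwarded-For can carry several addresses ("client, proxy1, proxy2"), and $http_x_forwarded_for logs the header as-is. One option is to reduce it to the first entry in the shipper before sending. The sketch below assumes the logged ip_address field holds the raw header value; `firstForwardedIp` and `normalizeIp` are hypothetical helper names.

```javascript
// Hypothetical helper: reduce an X-Forwarded-For value to its first address.
function firstForwardedIp(headerValue, fallback = "") {
  if (typeof headerValue !== "string") return fallback;
  const first = headerValue.split(",")[0].trim();
  return first || fallback;
}

// Return a copy of the event with ip_address reduced to a single address.
function normalizeIp(event) {
  return { ...event, ip_address: firstForwardedIp(event.ip_address) };
}

console.log(firstForwardedIp("203.0.113.7, 10.0.0.2, 10.0.0.1")); // 203.0.113.7
console.log(firstForwardedIp("203.0.113.7"));                      // 203.0.113.7
console.log(firstForwardedIp("", "198.51.100.4"));                 // 198.51.100.4
```

This does no validation of the address itself; it only picks the leftmost entry, which is meaningful only when the upstream proxy sets or sanitizes the header.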

2. Ship the log to Kitbase

Use a log shipper to read new lines and POST them to https://ingest.kitbase.dev/ingest/v1/server in batches (up to 500 per call). A production-grade shipper like Vector or Fluent Bit with an HTTP sink handles tailing, rotation, and batching for you — point it at the endpoint with the Authorization: Bearer header and wrap the lines as { "environment": "...", "events": [ ... ] }.
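
As a rough sketch of the request shape described above, the snippet below splits queued events into calls of at most 500 and builds the fetch() options for one call. `toBatches`, `buildRequest`, and `MAX_EVENTS_PER_CALL` are hypothetical names; the body layout and header follow the description in this section, and the API reference remains the authority on the full schema.

```javascript
// Assumption: 500 is the per-call limit stated in this guide.
const MAX_EVENTS_PER_CALL = 500;

// Split an array of events into chunks of at most `size` items.
function toBatches(events, size = MAX_EVENTS_PER_CALL) {
  const batches = [];
  for (let i = 0; i < events.length; i += size) {
    batches.push(events.slice(i, i + size));
  }
  return batches;
}

// Build the fetch() options for one batch. `apiKey` and `environment` would
// normally come from KITBASE_API_KEY and KITBASE_ENVIRONMENT.
function buildRequest(apiKey, environment, events) {
  return {
    method: "POST",
    headers: {
      authorization: `Bearer ${apiKey}`,
      "content-type": "application/json",
    },
    body: JSON.stringify({ environment, events }),
  };
}

// Example: 1,200 queued events become three calls (500 + 500 + 200).
const queued = Array.from({ length: 1200 }, (_, i) => ({ path: `/page/${i}` }));
console.log(toBatches(queued).map((b) => b.length)); // [ 500, 500, 200 ]
```

Each batch would then be sent with `fetch("https://ingest.kitbase.dev/ingest/v1/server", buildRequest(...))`.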

For a minimal setup, a tiny tailer works too:

js
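// kitbase-shipper.js: tail the nginx JSON log and POST batches.
// Run as an ES module (.mjs or "type": "module"); needs Node 18+ for global fetch.
import { spawn } from "node:child_process";
import { createInterface } from "node:readline";

const tail = spawn("tail", ["-n0", "-F", "/var/log/nginx/kitbase.log"]);
// readline yields complete lines even when a stdout chunk ends mid-line.
const lines = createInterface({ input: tail.stdout });
let buffer = [];

async function flush() {
  if (buffer.length === 0) return;
  const events = buffer.splice(0, 500); // at most 500 events per call
  try {
    const res = await fetch("https://ingest.kitbase.dev/ingest/v1/server", {
      method: "POST",
      headers: { authorization: `Bearer ${process.env.KITBASE_API_KEY}`, "content-type": "application/json" },
      body: JSON.stringify({ environment: process.env.KITBASE_ENVIRONMENT, events }),
    });
    // fetch does not reject on HTTP errors, so check the status explicitly.
    if (!res.ok) console.error(`kitbase: HTTP ${res.status}, dropped ${events.length} events`);
  } catch (err) {
    buffer.unshift(...events); // network error: put the batch back for the next flush
    console.error(`kitbase: request failed: ${err.message}`);
  }
}

lines.on("line", (line) => {
  if (!line.trim()) return;
  try { buffer.push(JSON.parse(line)); } catch { /* skip malformed lines */ }
});

setInterval(flush, 5000); // flush every 5s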

Human requests in the batch are ignored by Kitbase — only bot/crawler requests are stored.
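
The tailer above keeps pending events in a plain array with no upper bound, and each flush sends at most 500 of them every 5 seconds, so sustained traffic above roughly 100 events per second would let it grow. One possible guard is a bounded queue that drops the oldest events once a limit is reached. This is a sketch only: `createBoundedBuffer` is a hypothetical helper, and the limit is an arbitrary choice to tune for your own memory budget.

```javascript
// Hypothetical bounded queue: keeps at most `limit` events, dropping the oldest.
function createBoundedBuffer(limit) {
  const items = [];
  let dropped = 0;
  return {
    push(event) {
      items.push(event);
      if (items.length > limit) {
        items.shift(); // discard the oldest event to make room
        dropped += 1;
      }
    },
    // Remove and return up to `count` events from the front of the queue.
    take(count) {
      return items.splice(0, count);
    },
    get size() {
      return items.length;
    },
    get dropped() {
      return dropped;
    },
  };
}

const queue = createBoundedBuffer(3);
for (const path of ["/a", "/b", "/c", "/d"]) queue.push({ path });
console.log(queue.size, queue.dropped);        // 3 1
console.log(queue.take(2).map((e) => e.path)); // [ '/b', '/c' ]
```

In the tailer, `queue.push(...)` would replace `buffer.push(...)` and `queue.take(500)` would replace `buffer.splice(0, 500)`. Dropping the oldest events trades completeness for a fixed memory ceiling; a production shipper such as Vector or Fluent Bit is the better fit when that trade-off is not acceptable.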

Next steps

  • API reference — full request schema, response, and attribution fields.
  • All platforms — setup guides for other frameworks and hosts.

Released under the MIT License.