  • AI
  • Hosting
  • Security
  • Tech Stack
26.03.2026

When Bot Traffic Suddenly Multiplies by a Thousand


[Figure: Slack notifications from Gatus. First an alert: the search check failed three times in a row (HTTP status not 200, certificate expired, "Zeige Ergebnisse" and "Typo3" not found in the body). Then a resolution: the alert was resolved after two successful checks (HTTP status 200, certificate valid, "Zeige Ergebnisse" and "Typo3" present in the body).]
Monitoring with recurring errors in the search

This morning, one of our customers experienced intermittent outages. Not a full breakdown, more like this: everything works… until it suddenly doesn't.

We didn't hear about it from users first. Instead, our monitoring flagged repeated issues in Slack.

We use Gatus to continuously check key website functions, including search. The monitoring runs externally, deliberately independent from the customer's infrastructure. That way, we catch issues that might not be immediately visible from inside the system.
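To illustrate what such an external check evaluates, the following sketch mirrors the kind of conditions visible in the alert: HTTP status, certificate validity and expected strings in the response body. It is plain Python, not Gatus configuration. The class and function names, the 48-hour certificate threshold and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    status: int             # HTTP status code of the response
    body: str               # response body as text
    cert_hours_left: float  # hours until the TLS certificate expires


def failed_conditions(result, required_strings, min_cert_hours=48.0):
    """Return a list of human-readable conditions that did not hold."""
    failures = []
    if result.status != 200:
        failures.append(f"status {result.status} != 200")
    if result.cert_hours_left <= min_cert_hours:
        failures.append("certificate expires too soon")
    for text in required_strings:
        if text not in result.body:
            failures.append(f"body does not contain {text!r}")
    return failures


# Hypothetical sample: the search page answered with an error page.
broken = CheckResult(status=503, body="Service Unavailable", cert_hours_left=700)
print(failed_conditions(broken, ["Zeige Ergebnisse", "Typo3"]))
```

An empty list means the check passed. Checking for expected text in the body, not only the status code, is what catches a search that answers but returns no usable result.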

[Figure: Number of log entries over time, showing a massive increase after 10:00.]

The Analysis: A Surge in Requests, but No Obvious Culprit

After a quick system check, we turned to the logs. All logs are aggregated in a central Graylog instance, where we can analyze them efficiently.

In this case, we focused on request logs from Amazon CloudFront, which sits in front of the application as a CDN.

The initial finding was clear: a significant spike in incoming requests.

Usually, the root cause becomes obvious quickly. Often it is a small number of IP addresses generating a large share of traffic. This time, it was different.

  • No obvious clustering of individual IPs
  • But a strong concentration on a specific URL
  • And a noticeable pattern in the user agent: a slightly outdated Chrome version

That gave us a much clearer picture.
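The kind of aggregation behind these findings can be sketched in a few lines. We did this analysis in Graylog; the Python below only illustrates the idea of counting requests per field and looking at the share of the top values. The field names, the `/configurator` path and the sample records are hypothetical, and real CloudFront logs carry many more fields.

```python
from collections import Counter


def top_share(records, field, n=3):
    """Return the n most common values of `field` with count and share of all records."""
    counts = Counter(r[field] for r in records)
    total = sum(counts.values())
    return [(value, count, count / total) for value, count in counts.most_common(n)]


# Hypothetical, hand-made records for illustration only.
records = [
    {"ip": "198.51.100.1", "uri": "/configurator", "agent": "Chrome/old"},
    {"ip": "198.51.100.2", "uri": "/configurator", "agent": "Chrome/old"},
    {"ip": "198.51.100.3", "uri": "/configurator", "agent": "Chrome/old"},
    {"ip": "203.0.113.7", "uri": "/", "agent": "Firefox"},
]

for field in ("ip", "uri", "agent"):
    print(field, top_share(records, field, n=1))
```

In this toy data set no single IP stands out (each has a 25% share), while one URL and one user agent each account for 75% of requests, which is the shape of the pattern described above.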

The URL in question belongs to the configurator. Instead of the usual 1 to 2 requests per minute, it suddenly received close to 2,000 requests per minute.

That was too much for the system, especially since these responses are generated dynamically and are not cached.

[Figure: Table aggregating IP addresses with their frequency and percentage share.]
[Figure: Table of the most frequent user agents by browser and operating system. Most requests come from a Mozilla/5.0 browser on Apple macOS (92.7%); other operating systems and browsers are also listed.]

The Real Pattern: Distributed, but Coordinated

Digging deeper revealed a familiar pattern:

  • 3 to 6 requests per IP address
  • Then a switch to a new IP
  • Same user agent across requests

This is where traditional protection mechanisms start to break down. Per-IP rate limiting, for example, becomes far less effective, because no single address ever sends enough requests to cross a threshold.
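A small simulation makes the effect on rate limiting concrete. It compares a single noisy client with a client that switches IP every few requests, both measured against a simple per-IP limit within one time window. The numbers (2,000 requests, 5 requests per IP, a limit of 30 per IP) are assumptions chosen to resemble the pattern above, not values from our setup.

```python
from collections import Counter


def blocked_by_per_ip_limit(requests, limit_per_ip):
    """Count the requests a simple per-IP limit would reject within one window."""
    seen = Counter()
    blocked = 0
    for ip in requests:
        seen[ip] += 1
        if seen[ip] > limit_per_ip:
            blocked += 1
    return blocked


def rotating_bot(total_requests, requests_per_ip):
    """Yield one source IP label per request, switching IP every few requests."""
    for i in range(total_requests):
        yield f"ip-{i // requests_per_ip}"


rotating = list(rotating_bot(2000, 5))  # 400 different IPs, 5 requests each
single = ["ip-0"] * 2000                # one IP sending everything

print(len(set(rotating)), blocked_by_per_ip_limit(rotating, 30))  # 400 0
print(len(set(single)), blocked_by_per_ip_limit(single, 30))      # 1 1970
```

The same total load is almost fully rejected when it comes from one address and passes untouched when it is spread across many.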

Side note: a setup like this does not care about robots.txt. That file is only a suggestion, not a protection mechanism.

Whether this was a targeted attack or simply a poorly configured bot is hard to say. In practice, it doesn't matter. The impact on the system is the same.

The Key Difference: Combinations Instead of Targeted Access

What we observed here reflects a broader shift in how automated systems interact with websites. This wasn't a typical search scenario. The affected area was a faceted interface, where results can be filtered and refined using multiple parameters.

The bot wasn't querying specific content. It was systematically exploring combinations: parameters on, parameters off, values changing, all of it happening at high speed.

For the system, this means:

  • Every request is potentially unique
  • Very little reuse of results
  • Caching becomes ineffective due to the sheer number of variations

Traditional crawling tends to be linear. This pattern is combinatorial. And combinatorics scale fast.

What takes a human a few clicks turns into thousands of variations per minute when automated.
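How quickly the combinations grow can be estimated with a short calculation. The sketch assumes that each facet is either unset or set to exactly one of its values; the facet sizes are invented for illustration and do not describe the customer's configurator.

```python
from math import prod


def facet_combinations(values_per_facet):
    """Number of distinct filter states if each facet is unset or set to one value."""
    return prod(n + 1 for n in values_per_facet)


# Assumed example: five facets with 4, 6, 3, 8 and 5 selectable values.
facets = [4, 6, 3, 8, 5]
total = facet_combinations(facets)
print(total)  # 7560

# Minutes a bot could run at 2,000 requests per minute
# without requesting the same filter state twice.
print(total / 2000)  # 3.78
```

Even this small example yields thousands of distinct URLs, and every additional facet multiplies the total. With multi-select facets or sort and paging parameters, the number grows further.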

Why This Becomes Critical

At first glance, these requests look completely harmless:

  • They are valid requests
  • They use realistic parameters
  • They don't trigger obvious errors
  • Each individual client generates only a small number of requests

But in aggregate, they create exactly the kind of load that pushes systems to their limits. Not because individual requests are expensive, but because there are too many different requests hitting at the same time.

Or put differently:

The problem is not the single request. It is the combination of variety and speed.

[Figure: Number of log messages between 05:00 and 13:00. From about 08:00 the count rises rapidly, with a clear peak around 08:30. A second, higher rise begins at 09:30 and reaches further peaks between 10:00 and 11:00, after which the count gradually declines.]

The Solution: Challenges Instead of Blocking

Simply blocking traffic was not a viable option. With constantly rotating IP addresses, we would either react too late or risk blocking legitimate users.

Instead, we applied controls where they are most effective: using AWS WAF, we introduced a challenge mechanism specifically for the configurator.

The idea is simple: real users pass without friction, automated systems typically fail the challenge. This allowed us to reduce the load significantly without affecting legitimate usage.

The effect was immediate. Request volume dropped, and the system stabilized.
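For orientation, a rule of this kind might look roughly like the sketch below. It builds a Python dictionary in the general shape of an AWS WAF (WAFv2) rule that applies the Challenge action to requests whose path starts with a given prefix. This is not the rule we deployed: the rule name, the metric name and the `/configurator` path are hypothetical, and the field names should be checked against the current AWS WAF documentation before use.

```python
import json


def challenge_rule(path_prefix, priority=0):
    """Build a rule dict that challenges requests whose URI path starts with path_prefix."""
    return {
        "Name": "challenge-configurator",  # hypothetical rule name
        "Priority": priority,
        "Statement": {
            "ByteMatchStatement": {
                "SearchString": path_prefix,
                "FieldToMatch": {"UriPath": {}},
                "TextTransformations": [{"Priority": 0, "Type": "NONE"}],
                "PositionalConstraint": "STARTS_WITH",
            }
        },
        # Challenge instead of Block: clients that pass the check proceed.
        "Action": {"Challenge": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": "challenge-configurator",  # hypothetical metric name
        },
    }


rule = challenge_rule("/configurator")  # hypothetical path
print(json.dumps(rule, indent=2))
```

Scoping the rule to one path keeps the challenge away from the rest of the site, so only the expensive, uncached endpoint is affected.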

Takeaway

Traffic patterns from bots and AI systems are no longer theoretical. They are part of everyday operations.

It is no longer a single IP generating high load. It is many IPs generating small amounts of traffic, but in a coordinated way.

To the system, this looks like normal usage. In reality, it is the opposite.

Ingo Schmitt

Fluent in TypoScript, PHP and SQL; knows Perl and Bash and has very basic knowledge of Java. Joined in 1996 and is now, as managing director, responsible for development, operation and hosting of our products. His articles in this blog cover technical and sustainability topics. Involved with TYPO3 as chairman of the Business Control Committee (BCC) and organizer of the annual TYPO3Camp RheinRuhr.


© Marketing Factory Digital GmbH

Picture Credits
  1. "Slack notification": © Ingo Schmitt / Marketing Factory Digital GmbH
  2. "Massive increase after 10:00": © Ingo Schmitt / Marketing Factory Digital GmbH
  3. "Screenshot: IP distribution": © Ingo Schmitt / Marketing Factory Digital GmbH
  4. "Browser distribution": © Ingo Schmitt / Marketing Factory Digital GmbH
  5. "Log requests after challenge": © Ingo Schmitt / Marketing Factory Digital GmbH