Cloudflare’s bot management is designed to solve problems such as bots scraping content to train generative AI. The company also recently announced AI Labyrinth, “a new mitigation approach that uses AI-generated content to slow down, confuse, and waste the resources of AI Crawlers and other bots that don’t respect ‘no crawl’ directives.”
But Cloudflare CEO Matthew Prince argues that this outage stemmed from a change to database permissions, not from generative AI, not from DNS, and not from what Cloudflare initially suspected: a cyberattack or malicious activity such as a “large-scale DDoS attack.”
According to Prince, the Bot Management system relies on a machine learning model that generates bot scores for requests sent over Cloudflare’s network. That model depends on a frequently updated configuration file that helps it automatically identify requests; however, “a change in our underlying ClickHouse query behaviour that generates this file caused it to have a large number of duplicate ‘feature’ rows.”
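To make the failure mode concrete, here is a minimal sketch, not Cloudflare’s actual code, of how a metadata query that filters only on table name can start emitting duplicate feature rows the moment a permissions change makes an additional database copy visible. The table, column, and database names below are invented for illustration.

```python
# Hedged sketch: a ClickHouse-style metadata lookup that forgets to pin
# the database name, so a newly visible replica database doubles the output.

def list_feature_columns(metadata_rows, visible_databases):
    """Return feature column names for a table, across all visible databases."""
    return [
        row["column"]
        for row in metadata_rows
        if row["table"] == "http_requests_features"          # filtered
        and row["database"] in visible_databases             # but not deduplicated
    ]

metadata = [
    {"database": "default", "table": "http_requests_features", "column": "feat_ua_entropy"},
    {"database": "default", "table": "http_requests_features", "column": "feat_req_rate"},
    # The same columns also exist in an underlying replica database:
    {"database": "r0", "table": "http_requests_features", "column": "feat_ua_entropy"},
    {"database": "r0", "table": "http_requests_features", "column": "feat_req_rate"},
]

before = list_feature_columns(metadata, visible_databases={"default"})
after = list_feature_columns(metadata, visible_databases={"default", "r0"})

print(len(before))  # 2 distinct feature rows
print(len(after))   # 4 rows: every feature now appears twice
```

The fix for a bug of this shape is to filter on the database name (or deduplicate the result) so that widened permissions cannot change the file’s contents.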
The post has more details about what happened next, but in short, the query change caused the ClickHouse database to output duplicate entries. The configuration file quickly grew beyond an established memory limit, which broke “the underlying proxy system that handles traffic processing for our customers, for any traffic that depended on the bots module.”
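The described crash pattern can be sketched as a consumer that preallocates room for a fixed number of features and treats an oversized configuration file as a fatal error. This is an illustrative assumption, not Cloudflare’s implementation, and the limit of 200 is invented for the example.

```python
# Hedged sketch: a loader with a hard preallocated feature limit.
# A file whose rows are duplicated blows past the limit, and the
# module that depends on it refuses to run at all.

MAX_FEATURES = 200  # illustrative limit, not Cloudflare's actual value

def load_feature_file(lines):
    features = []
    for line in lines:
        if len(features) >= MAX_FEATURES:
            # Instead of truncating, the loader treats overflow as fatal,
            # taking down everything that depends on the bot module.
            raise RuntimeError("feature file exceeds preallocated limit")
        features.append(line.strip())
    return features

good_file = [f"feat_{i}\n" for i in range(60)]
bad_file = good_file * 4  # duplicated rows push the file past the limit

print(len(load_feature_file(good_file)))  # 60

try:
    load_feature_file(bad_file)
except RuntimeError as e:
    print(e)  # feature file exceeds preallocated limit
```

Whether overflow should be fatal or merely degrade the feature set is exactly the kind of design choice such an incident puts under scrutiny.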
As a result, companies whose Cloudflare rules used bot scores to block certain bots got false positives that cut off real traffic, while Cloudflare customers whose rules did not rely on the generated bot score stayed online.
