A.I. Cybersecurity Tool Marketing: Insanity vs Reality
Cybersecurity and tech marketing have always loved acronyms and buzzwords. (I still don't know what a "Network Fabric" is or if I've ever actually used one.) The moment a new technology gains any traction, VCs scramble to figure out how to package it into a "Detection and Response" service. When major breaches, exploits, or threat actors hit the news, every company rushes to put out marketing about it and how they could detect it. While the community often criticizes these marketing tactics (like my Network Fabric joke), the marketing itself is usually grounded in good faith: typically it's just a vendor trying to stand out and demonstrate their product's capabilities.
But this time, it feels different.
The recent flood of bogus, hypothetical "research" claims is bordering on snake oil. What's worse is that the people writing it, in some cases vendors trying to sell you a tool, don't seem to understand detection, threat hunting, or A.I.
Because of this, I thought it was important to write something that highlights the real risks of A.I. so you can avoid the dubious marketing fluff.
A Threat Hunting and Detection Primer
Detection in cybersecurity has improved dramatically over the years, but it’s still widely misunderstood, especially by people outside the security world, even if they’re brilliant technologists. From the outside it can look like “magic,” but it isn’t. You need a basic grasp of detection and threat hunting to see why the claims we’ll walk through in the next section are so absurd.
When it comes to spotting malicious activity, everything an adversary does is a potential detection opportunity, starting as early as the weaponization phase of the kill chain. Just because malware “executes” does not mean it “wins.”
• When an attacker registers a look-alike domain, that’s a detection opportunity.
• When they send a maldoc to a victim organization, that’s a detection opportunity.
• If a user opens the maldoc and code executes, that’s another detection opportunity.
• When the attacker dumps credentials, moves laterally, or establishes persistence, each of those actions is yet another detection opportunity.
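To make this concrete, here's a rough, purely illustrative Python sketch pairing each of those actions with an example data source a defender might watch. The mapping is my own shorthand, not an exhaustive model or any vendor's taxonomy:

```python
# Illustrative only: each attacker action above maps to telemetry a defender
# could reasonably have, which is what makes it a detection opportunity.
detection_opportunities = {
    "look-alike domain registered":  "newly registered / typosquat domain feeds",
    "maldoc delivered to a mailbox": "email gateway and attachment detonation logs",
    "maldoc opened, code executes":  "endpoint process-creation telemetry (EDR/Sysmon)",
    "credential dumping":            "LSASS access events, credential-access analytics",
    "lateral movement":              "authentication logs, SMB/WMI/WinRM telemetry",
    "persistence established":       "registry run keys, scheduled task and service creation logs",
}

for action, source in detection_opportunities.items():
    print(f"{action:32} -> {source}")
```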
Modern detection strategy is built around this idea: attackers have to do stuff to achieve their goals. This is ultimately what led people to stop claiming there is a “defender’s dilemma.” It’s not that defenders get only one shot and attackers get infinite tries; it’s that every step the attacker takes is a chance for us to catch them. In the real world there are plenty of reasons attackers may still evade all of our detections, but even when a threat actor slips past your security controls, the activity is still there, assuming you have the visibility and data to see it.
Threat hunting, much like detection, has only recently started to become well understood. Just a couple of years ago, people were still arguing about what a threat hunt even was. Since then, frameworks such as Splunk's excellent P.E.A.K. have started to see wider adoption.[1]
The core idea is that threat hunting is the human-driven process of finding malicious behaviors that your automated systems missed. The primary purpose is to use each hunt to drive improvements to your security posture, fix visibility gaps, and build new detections, ensuring that every hunt has a measurable outcome.
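As a minimal sketch of what "measurable outcome" can mean in practice (the field names are my own shorthand, not part of PEAK itself), a hunt record might look something like this:

```python
from dataclasses import dataclass, field

@dataclass
class HuntOutcome:
    hypothesis: str
    data_sources: list[str]
    findings: list[str] = field(default_factory=list)         # confirmed malicious or benign-but-interesting
    visibility_gaps: list[str] = field(default_factory=list)  # logging you wished you had
    new_detections: list[str] = field(default_factory=list)   # rules created or tuned as a result

# Example record: even a "no findings" hunt produces measurable output.
outcome = HuntOutcome(
    hypothesis="Attackers abuse WMI for lateral movement in the server VLAN",
    data_sources=["Sysmon EID 1", "Windows Security 4624"],
    visibility_gaps=["No command-line logging on legacy hosts"],
    new_detections=["wmiprvse.exe spawning cmd.exe/powershell.exe"],
)
print(outcome.new_detections)
```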
The Insanity
Now that we understand how detection works and the basics of threat hunting, let's look at some of the recent content being put out and identify what makes it bogus so you can better avoid it in the future.
The A.I. "Theory Crafters"
One of the biggest issues with the current wave of "A.I. driven threats" is that the claims are not limited to just media hype. They are also coming from major academic institutions and security vendors, muddying the waters for defenders. This content often makes bold, unsupported claims that don't align with the realities of detection engineering or threat hunting.
A prime example is a working paper published by MIT Sloan in collaboration with the vendor Safe Security. The paper, "Rethinking the Cybersecurity Arms Race When 80% of Ransomware Attacks are AI-Driven," made the specific claim that "In 2024, 80.83 percent of recorded ransomware events were attributed to threat actors utilizing AI".[2]
This paper was quickly and widely criticized by security professionals. As reported by The Register and other outlets, security researcher Kevin Beaumont called the paper "absolutely ridiculous," noting that it describes almost every major ransomware group as using AI without providing evidence and stating that this does not match what he observes tracking those groups. Fellow researcher Marcus Hutchins reportedly said the paper was "so absurd I burst out laughing at the title." Following the criticism, MIT Sloan removed references to the working paper from its public-facing article, stating it was "being updated based on some recent reviews." Beaumont later coined the term "cyberslop" to describe this trend, defining it as a situation where trusted institutions make baseless generative-AI threat claims to profit from their perceived expertise.[3]
The "Skynet Theory Crafters"
The next group of bogus claims comes from what I call the "Skynet Theory Crafters." These are the people who seem to misunderstand A.I., detection, and threat hunting, treating A.I. as "magic." Here's a particularly egregious example:
So how do you hunt something that thinks at machine speed and leaves no reliable fingerprints? The same way you’ve always hunted: by understanding intent rather than chasing artifacts. But faster, smarter, and with different assumptions about what normal looks like.
Traditional hunting assumes human operators with human limitations. They take breaks, make mistakes, follow familiar patterns, leave gaps between operational phases. AI operators don’t have those constraints. They execute with mechanical consistency, perfect timing, and inhuman precision.
But they’re not invisible. They’re just operating at a frequency we haven’t been tuned to detect.
Let's break down why this is so ridiculous.
"So how do you hunt something that thinks at machine speed"
When it comes to attacks, speed can actually be a problem for attackers. For example, the malware sandbox VMRay has documentation describing "Timebombs," which are common evasion techniques.[4]
Delaying execution is a common technique since sandboxes tend to analyze samples for only a few minutes. Malware employs time bombs, introducing sleep intervals of varying complexity, making detection an ongoing challenge. Time bombs are like ticking time capsules, strategically activated after the initial analysis phase.
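The flip side is that this deliberate slowness is itself observable. Here's a minimal, vendor-agnostic sketch, using a hypothetical sandbox report structure rather than VMRay's actual schema, of flagging a sample that spends most of its analysis window asleep:

```python
# Hypothetical report format: a list of API calls with a sleep duration argument.
def looks_like_time_bomb(api_calls, analysis_seconds=180, threshold=0.5):
    """Return True if cumulative sleep time dominates the analysis window."""
    slept = sum(
        call["argument_ms"] / 1000
        for call in api_calls
        if call["api"].lower() in {"sleep", "sleepex", "ntdelayexecution"}
    )
    return slept / analysis_seconds >= threshold

# Toy trace: the sample slept for 2 of its 3 minutes in the sandbox.
trace = [
    {"api": "Sleep", "argument_ms": 120_000},
    {"api": "CreateFileW", "argument_ms": 0},
]
print(looks_like_time_bomb(trace))  # True
```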
"leaves no reliable fingerprints"
The "assume breach" mindset has been standard in the industry for years.[5] It assumes preventative controls will fail, making deep visibility essential for detection. The "reliable fingerprints" this quote mentions sound like old-school signatures or atomic indicators (IOCs), which the industry largely moved away from. The whole point of modern detection is to find "unknown" and anomalous activity.
Focusing on behavior means that, for example, there are only so many ways to execute code from a malicious document. It doesn't matter if it's a new exploit or malware family. It's still going to do things like spawn child processes or make an abnormal network connection from an Office application. That is the behavior we detect, and that's what makes this approach so powerful.
As a detection engineer, I might analyze a specific malware family, but the detections I write are for the behaviors it uses, not just the family itself. That detection then works for any future tool or malware family that tries the same thing.
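To make that concrete, here's a minimal sketch of what such a behavioral detection can look like. The event field names are illustrative and would map to your own EDR or Sysmon schema rather than any particular product:

```python
# Behavioral detection sketch: an Office application spawning a shell or
# scripting child process, regardless of which maldoc or malware family did it.
OFFICE_PARENTS = {"winword.exe", "excel.exe", "powerpnt.exe"}
SUSPICIOUS_CHILDREN = {"cmd.exe", "powershell.exe", "wscript.exe", "mshta.exe"}

def is_suspicious_office_child(event: dict) -> bool:
    parent = event.get("parent_image", "").rsplit("\\", 1)[-1].lower()
    child = event.get("image", "").rsplit("\\", 1)[-1].lower()
    return parent in OFFICE_PARENTS and child in SUSPICIOUS_CHILDREN

# Fires on the behavior, not a signature, so it keeps working against new
# tooling that abuses the same technique.
event = {"parent_image": r"C:\Program Files\Microsoft Office\WINWORD.EXE",
         "image": r"C:\Windows\System32\cmd.exe"}
print(is_suspicious_office_child(event))  # True
```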
"AI operators don’t have those constraints. They execute with mechanical consistency, perfect timing, and inhuman precision."
This claim is perhaps the most ridiculous. LLM-based A.I. famously does not act with "mechanical consistency" or "inhuman precision."
This is a core, well-documented limitation, identified in numerous academic papers covering poor confidence calibration and overconfidence, weak logical reasoning, and a tendency to hallucinate.[6]
Back to Reality
So what are the real threats when it comes to A.I.? The recent report released by Anthropic gives us some real clues.[7]
In mid-September 2025, Anthropic's Threat Intelligence team detected a sophisticated cyber espionage campaign from a group it calls GTG-1002, a Chinese state-sponsored actor. This group targeted approximately 30 entities, including major tech corporations and government agencies, and achieved a handful of successful intrusions. What made this campaign unique was its use of Anthropic's own "Claude Code." The report states this is the first documented case of a cyberattack executed largely without human intervention at scale. The A.I. was used as an agent to autonomously discover vulnerabilities, exploit them, and then perform a wide range of post-exploitation activities like lateral movement and data exfiltration.[7:1]
However, the report also highlights a critical limitation that prevented the attack from being fully autonomous. The A.I. frequently "hallucinated" during the operation, overstating its findings, claiming to have credentials that didn't work, and fabricating data. This "A.I. hallucination" was a major obstacle for the attackers, requiring human operators to validate all the A.I.'s results. While the A.I. handled 80-90% of the tactical work, humans were still required to make all the critical strategic decisions, such as authorizing the move from reconnaissance to active exploitation.[7:2]
The Real Risks of A.I.
So, if the "Skynet" stuff is nonsense, what are the real risks? They are far more practical and, honestly, much more boring. As the Anthropic report shows, the real threat is scale and increase of capability of less skilled actors.[7:3]
- Scale: This is the big one. One skilled operator can now do the work of ten, running multiple, simultaneous intrusions with fewer resources. This doesn't change the kind of attack, but it dramatically increases the number of potential victims.
- Raising the Floor: Less-skilled attackers ("script kiddies") can now use A.I. to pull off more complicated attacks. It's essentially "competence-in-a-box," helping them write tools or exploits that were previously beyond their skill level.
- Polished Social Engineering: This is a huge risk. We're talking about perfectly-written, convincing phishing lures, realistic voice cloning, and even real-time video editing to assist in complex scams (like the hiring scams infamously used by North Korean actors).[8]
- Internal Data Exposure: This is an "own goal" we're inflicting on ourselves. The risk isn't just from attackers; it's from our own teams plugging sensitive data and internal tools into LLMs with zero security review, massively expanding the attack surface (see the sketch after this list).
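On that last point, here's a minimal sketch of the kind of pre-submission check a team could run before internal text ever reaches an external LLM API. The patterns are illustrative only; real DLP tooling covers far more:

```python
import re

# Illustrative secret patterns; a real deployment would use a proper DLP/secret scanner.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                           # AWS access key ID format
    re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),   # PEM private keys
    re.compile(r"(?i)password\s*[:=]\s*\S+"),                  # inline credentials
]

def safe_to_send(prompt: str) -> bool:
    """Return True only if no obvious secret appears in the prompt."""
    return not any(p.search(prompt) for p in SECRET_PATTERNS)

print(safe_to_send("Summarize this meeting transcript"))          # True
print(safe_to_send("password: hunter2, please debug my script"))  # False
```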
Here's the critical takeaway: None of these risks require "A.I." to detect. They are all just scaled or polished versions of problems we already know how to solve. An A.I.-generated phishing email is still a phishing email. An A.I.-written implant is still caught by behavioral detections. Organizations that have skated by on the minimum will have the most issues, while the adversary "playbook" for mature organizations is likely to remain the same.
Takeaway
Many researchers are critical of LLMs in security, and for good reason.[6:1] I'm skeptical of their current value over mature tools like a SOAR platform, but I do think people will find smart ways to use them.
They will likely help automate specific processes within a human-led threat hunt, not replace it. I can also see them finding a role in the detection pipeline, doing things like enforcing a consistent alert voice, ensuring a description is technically accurate, or finding regex optimizations. I'm genuinely interested to see what kind of smart things people come up with and manage to get working.
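As one example of the "boring" uses I mean, here's a minimal sketch of a CI-style step that asks a model to review an alert description before a detection rule is merged. `call_llm()` is a hypothetical placeholder for whatever LLM client your organization uses, and the prompt is purely illustrative:

```python
# Sketch of an LLM-assisted review gate in a detection-as-code pipeline.
# A human still decides whether to merge; the model only drafts feedback.
REVIEW_PROMPT = """You are reviewing the analyst-facing description of a detection rule.
Check that it: (1) uses our standard alert voice, (2) accurately describes the logic,
(3) tells the analyst what to triage first. Reply with PASS or a list of fixes.

Rule logic:
{logic}

Description:
{description}
"""

def review_alert_description(logic: str, description: str, call_llm) -> str:
    """Return the model's review of a rule description (hypothetical client passed in)."""
    return call_llm(REVIEW_PROMPT.format(logic=logic, description=description))
```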
The takeaway from this shouldn't be that AI tools are useless or have no place in a security program. The real goal is to help you see past the marketing insanity. When you understand how detection and threat hunting actually work, you can easily navigate the hype and identify what's real versus what's just "cyberslop."
Splunk, PEAK Threat Hunting Framework (Splunk, 2023), whitepaper and practitioner guide describing the PEAK (Prepare, Execute, Act, and Know) methodology for structuring threat hunts and measuring their outcomes. ↩︎
Michael Siegel et al., Ransomware Goes AI: Rethinking the Cybersecurity Arms Race When 80% of Ransomware Attacks Are AI-Driven (Cybersecurity at MIT Sloan and Safe Security working paper, 2025), 7, claiming that analysis of roughly 2,800 ransomware incidents showed “In 2024, 80.83% of recorded ransomware events were attributed to threat actors utilizing AI,” PDF, https://idcontrol.com/wp-content/uploads/2025/05/Ransomware-goes-AI.pdf. ↩︎
Kevin Beaumont, commentary on MIT–Safe Security ransomware AI working paper, quoted in Matt Warman, “Research Says 80% of Ransomware Attacks Use AI — Experts Disagree,” TechRadar Pro, April 18, 2025; and in coverage at ThreatsHub and similar outlets, where Beaumont calls the work “absolutely ridiculous” and notes it labels almost every major ransomware group as AI-using without evidence. See also MIT Sloan, “AI Cyberattacks and Three Pillars for Defense,” MIT Sloan Ideas Made to Matter, editor’s note stating that references to a working paper “being updated following recent reviews” were removed, April 2025, https://mitsloan.mit.edu/ideas-made-to-matter/ai-cyberattacks-three-pillars-defense. For Beaumont’s term cyberslop and his definition of “trusted institutions using baseless claims about cyber threats from generative AI to profit,” see Kevin Beaumont, “Cyberslop: Why You Should Be Skeptical About AI Ransomware Statistics,” 2025. ↩︎
VMRay, Time-Based Evasion Techniques: Time Bombs (VMRay sandbox documentation, 2024), describing how malware introduces delays so that “delaying execution is a common technique since sandboxes tend to analyze samples for only a few minutes” and how time bombs trigger after initial analysis to evade dynamic detection. ↩︎
See, for example, Microsoft, “Zero Trust Deployment Plan with Microsoft 365,” Microsoft Learn, which defines Zero Trust as a model that “assumes breach and verifies each request as though it originated from an uncontrolled network,” https://learn.microsoft.com/en-us/security/zero-trust/microsoft-365-zero-trust; and Lumifi Cybersecurity, “The Assume Breach Paradigm,” 2023, which describes assume breach as a mindset that treats applications, services, identities, and networks as “not secure and probably already compromised” and emphasizes detection and response over the illusion of perfect prevention. ↩︎
Prateek Chhikara, “Mind the Confidence Gap: Overconfidence, Calibration, and Distractor Effects in Large Language Models,” arXiv preprint 2502.11028 (2025), which empirically shows that large language models exhibit significant miscalibration—often expressing high confidence in incorrect answers—and analyzes when overconfidence is worst, https://arxiv.org/abs/2502.11028. See also Mihir Parmar et al., “LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models,” in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024), which finds that LLMs struggle on many formal logical reasoning patterns despite strong performance on other tasks, https://arxiv.org/abs/2404.15522; and Matthew Dahl et al., “Large Legal Fictions: Profiling Legal Hallucinations in Large Language Models,” Journal of Legal Analysis 16, no. 1 (2024): 64–93, which documents systematic hallucinations in legal tasks and warns that LLMs frequently generate text “not consistent with legal facts,” https://academic.oup.com/jla/article/16/1/64/7699227. ↩︎ ↩︎
Anthropic, GTG-1002: A Case Study in AI-Assisted Cyber Espionage (Anthropic Threat Intelligence report, 2025). The report describes the mid-September 2025 detection of a Chinese state-linked group dubbed GTG-1002, targeting around 30 organizations, using Anthropic’s Claude-based “Claude Code” agent to automate large portions of reconnaissance, exploitation, and post-exploitation, while noting that agent hallucinations (e.g., claiming credentials that did not work) remained a key obstacle to full autonomy and that human operators still made strategic decisions and validated results. ↩︎ ↩︎ ↩︎ ↩︎
MIT Sloan School of Management, “AI Cyberattacks and Three Pillars for Defense,” MIT Sloan Ideas Made to Matter, which highlights AI-enabled phishing, social engineering, and deepfake-driven scams such as fake customer service calls as emerging threats and discusses how AI has “transformed cyberattacks” through deepfakes, phishing, and social engineering, https://mitsloan.mit.edu/ideas-made-to-matter/ai-cyberattacks-three-pillars-defense. For concrete examples of state-linked actors using AI-driven deepfakes in hiring and job-interview scams, see John Leyden, “North Korean Fake IT Workers Up the Ante in Targeting Tech Firms,” CSO Online, November 21, 2024, which reports that North Korean groups are leveraging deepfake technologies to scam companies into hiring fake IT workers, https://www.csoonline.com/article/3609972/north-korean-fake-it-workers-up-the-ante-in-targeting-tech-firms.html. ↩︎