What Framing Security Alerts as a Binary True or False Positive Is Costing You
Ask anyone who’s worked in a SOC long enough and they’ll tell you: debates over “true positive” versus “false positive” happen constantly. The conversation usually goes in circles: one person insists an alert was a false positive, another argues it was technically a true positive, and by the end everyone walks away with a slightly different definition.
The real issue? We’re trying to use a binary system to describe something that’s more nuanced. And in doing so, our metrics stop being useful for measuring rule quality because they no longer isolate the performance of the rule itself from the context in which it fires.
A Quick Detour Back to the 1950s
The concepts of true positive and false positive aren’t new. Signal Detection Theory (SDT) is a framework first formalized in the 1950s to model how humans and systems make decisions in noisy environments: think radar operators trying to spot enemy aircraft.
In SDT terms:
- True Positive (Hit) – The signal was present, and you correctly detected it.
- False Positive (False Alarm) – The signal wasn’t present, but you thought you detected it.
- True Negative (Correct Rejection) – The signal wasn’t present, and you correctly ignored it.
- False Negative (Miss) – The signal was present, but you failed to detect it.
In its original form, SDT wasn’t concerned with whether the signal mattered, only whether it was detected correctly according to the defined criteria. That’s an important distinction for us in security: a correct detection doesn’t necessarily mean it was operationally valuable. Operational value is a separate, though overlapping, measurement, and security operations faces exactly the same gap.
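As a minimal illustration (the function below is a hypothetical sketch, not from any standard library), the four SDT outcomes fall out of just two booleans: whether the signal was present and whether you detected it.

```python
def sdt_outcome(signal_present: bool, detected: bool) -> str:
    """Classify one observation under Signal Detection Theory."""
    if signal_present and detected:
        return "true positive (hit)"
    if detected:
        return "false positive (false alarm)"
    if signal_present:
        return "false negative (miss)"
    return "true negative (correct rejection)"
```

Notice there is no notion of “mattered” anywhere in this model; that gap is exactly what the rest of this post addresses.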
When “True Positive” Doesn’t Equal “Impactful”
Consider this scenario:
You’re an MSSP monitoring multiple customers. You’ve built a rule to detect the use of remote monitoring and management (RMM) tools. These tools are a genuine threat vector when abused by attackers.
- Customer A uses RMM Tool AnyDesk for normal IT operations.
- Customer B uses RMM Tool ConnectWise for normal IT operations.
- Your rule triggers whenever AnyDesk is detected.
For Customer A, the alert is technically correct: your detection fired on exactly what it was built to detect. Under SDT, that’s a true positive. But operationally, it has no security impact because it’s part of daily business activity.
For Customer B, the same detection could be suspicious or outright malicious. Same rule. Same accuracy. Very different impact.
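Here is a minimal sketch of how environment drives the disposition, assuming a hypothetical per-customer baseline of sanctioned RMM tools (the names and data structure are illustrative, not from any specific product):

```python
# Hypothetical per-customer baseline of RMM tools sanctioned for IT use.
SANCTIONED_RMM = {
    "customer_a": {"anydesk"},
    "customer_b": {"connectwise"},
}

def rmm_disposition(customer: str, tool: str) -> str:
    """Same detection logic, different disposition depending on environment."""
    if tool in SANCTIONED_RMM.get(customer, set()):
        # Accurate detection, but expected activity in this environment.
        return "no business impact"
    # Accurate detection of a tool this customer does not normally use.
    return "suspicious"

print(rmm_disposition("customer_a", "anydesk"))  # no business impact
print(rmm_disposition("customer_b", "anydesk"))  # suspicious
```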
If you only measure true/false positives in the SDT sense, these two outcomes look identical. That’s when your metrics stop telling you whether a rule is valuable: they’re capturing a mix of rule performance and environmental context, which muddies your ability to measure or improve the detection itself. An MSSP that relies on categorizing detection outcomes as true or false positives alone ends up with a distorted picture of how valuable or reliable a rule is.
Why You Should Use Intent or Disposition
This is where adding a layer of intent or disposition becomes useful. Not as an afterthought, but as part of the detection outcome model from day one.
By capturing both whether the detection worked and whether the activity mattered, you can separate two very different questions:
- Did the rule work as designed? (Accuracy)
- Did the alert have security value in this environment? (Impact)
A refined classification could look like this:
- True Positive – Malicious: Correct detection of confirmed malicious activity requiring a response.
- True Positive – Suspicious: Correct detection of unusual or potentially risky activity that warrants investigation.
- True Positive – No Business Impact: Correct detection of expected activity with no security concern in this context.
- False Positive: Detection fired on something it wasn’t designed to detect.
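One way to encode this four-way disposition, sketched here as a hypothetical Python enum rather than a prescribed schema, is shown below; the two helper functions map each disposition back to the two questions above.

```python
from enum import Enum

class Disposition(Enum):
    """Detection outcome model: accuracy plus environmental context."""
    TP_MALICIOUS = "true positive - malicious"           # confirmed malicious, respond
    TP_SUSPICIOUS = "true positive - suspicious"         # warrants investigation
    TP_NO_IMPACT = "true positive - no business impact"  # expected activity here
    FALSE_POSITIVE = "false positive"                    # fired on the wrong thing

def is_accurate(d: Disposition) -> bool:
    """Did the rule work as designed? (Accuracy)"""
    return d is not Disposition.FALSE_POSITIVE

def has_impact(d: Disposition) -> bool:
    """Did the alert have security value in this environment? (Impact)"""
    return d in (Disposition.TP_MALICIOUS, Disposition.TP_SUSPICIOUS)
```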
With this structure, you get an accurate depiction of a detector’s fidelity while also being able to determine how frequently it leads to a security outcome.
The Metric SOCs Don’t Know They Need
SOC leaders often say they want to “reduce false positives.” That sounds good, but it’s incomplete. The metrics they actually need are:
- How often does this rule lead to a meaningful investigation?
- How much analyst time does it consume when it does?
By adding an intent or disposition, you can tell whether a rule actually triggered a meaningful investigation. And when the case is closed, you get a clearer picture of how much time analysts typically spend when that specific rule fires. This kind of insight is far more valuable, and it’s usually exactly what SOC managers are after.
Why This Matters for Detection Engineering
Using intent/disposition alongside true/false positives lets you track:
- Precision – How often the rule fires correctly.
- Impact Rate – How often correct alerts have security value.
- Investigation Cost – Average analyst time per impactful alert.
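As a sketch of how these three numbers could be derived per rule from closed cases, building on the hypothetical Disposition enum above and assuming each case records its disposition and the analyst minutes spent (field names are illustrative):

```python
from dataclasses import dataclass

# Reuses Disposition, is_accurate, and has_impact from the earlier sketch.

@dataclass
class ClosedCase:
    rule_id: str
    disposition: Disposition
    analyst_minutes: float

def rule_metrics(cases: list[ClosedCase]) -> dict[str, float]:
    """Per-rule quality metrics derived from closed, dispositioned cases."""
    accurate = [c for c in cases if is_accurate(c.disposition)]
    impactful = [c for c in cases if has_impact(c.disposition)]
    return {
        # Precision: how often the rule fires correctly.
        "precision": len(accurate) / len(cases) if cases else 0.0,
        # Impact rate: how often correct alerts have security value.
        "impact_rate": len(impactful) / len(accurate) if accurate else 0.0,
        # Investigation cost: average analyst time per impactful alert.
        "avg_minutes_per_impactful_alert": (
            sum(c.analyst_minutes for c in impactful) / len(impactful)
            if impactful else 0.0
        ),
    }
```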
That’s a much clearer picture of rule quality. It helps you prioritize which rules to tune and identify the rules you may want to target first for SOAR/automation responses.
Final Takeaway
True positive and false positive as defined in signal detection theory have served decision-making disciplines since the 1950s. But in SOC operations, accuracy alone isn’t enough: you need to track the intent behind each detection to measure its real-world value.
A rule can be perfectly accurate and still add little to your security posture if its alerts rarely require action. Adding intent/disposition tagging keeps your metrics tied to both rule quality and business impact, allowing detection engineers and SOC analysts to make better, faster decisions about what to tune, keep, or retire.