MEASURE YOUR SECURITY

May 29, 2019

Why measurement is at the core of Analogue Network Security, and why Cybersecurity without metrics is a fool’s errand

by Winn Schwartau, Founder & CVO, The Security Awareness Company

The following is an excerpt from Winn Schwartau’s latest book, Analogue Network Security.

If everyone in the whole world can easily measure their computers’ performance and their internet traffic performance, then fifty years and trillions of dollars-plus later, the professional information security industry should damned well be able to measure the performance of our security tools.

Hint: The only mathematical certainty in infosec is the forward flow of time. If there is forward time, there is risk, by definition. Anything else is a WAG, which is why we are losing the CyberWars. A fundamental shift is required.

Measure Your Security

At the very core of Analogue Network Security is measurement. As history has shown us, cybersecurity without metrics is a fool’s errand.

I sincerely hope this fundamental tenet resonates with readers, because today, right now, you all have the ability to measure much of your cybersecurity posture. Quantitatively. With ANS, it will hopefully only get better.

Keep in mind, however, that every security metric is both dynamic and probabilistic. Fortunately, with Analogue Network Security, you should have learned how to visualize your various security stances in the time-domain.

Breaches are endemic. Gazillions of dollars are spent after conversations like the one you just read in the cartoon on the prior page. Now, I am not picking on any one vendor; I believe the fault is in the binary approach we take to security. So, let’s see if there is a way we can hold vendors accountable.

I use “BB” to reference any vendor’s Black Box that performs any abstract security service. The internal process mechanism is immaterial to measurement; signature-based A/V, rule-based binary decision making, heuristics, deep learning or any possible hybrid – it’s still a Black Box. In many cases, the BB is designed to detect some condition, anomaly or other specific or profiled traffic behaviors. Great. I get it. But vendors tend to make promises and I’d like to know how well their BB performs. To that end, let’s ask a few Analogue Network Security-style questions of the Black Box Vendor.

▶ How long does it take to detect a malicious act/thing/file?

▶ Can you show me how that time value changes with specific hardware/network/software/load configurations?

▶ Is the time value the same for every possible test? If not, what is the Min-Max range for performance over time? (Set boundaries.)

▶ Can you show me the raw test data?

▶ For D(t-min), what is the guaranteed accuracy (Trust Factor or Risk) of the detection for: – False Positives – False Negatives

▶ Same question, but for D(t-avg) and D(t-max).

How good is your BB? I would expect testing labs to build pretty management porn versions with colorful graphs and charts. Then, we can compare products and services with much more substantial metrics than the “trust-me” approach.

0 < D(t-min) < D(t-avg) < D(

The general tool description (independent of the BB’s specific security function) is:

Inject “hostile” test code that should be detectable by the BB. Vary the difficulty of detection, especially with ‘smart’ systems. Identifying which kinds of code will trigger the BB’s detection mechanism will help inject appropriately easy and hard conditions, presumably based upon near-normal distribution. Ask your vendor. This approach can be used consistently across many product ranges to gauge comparative performance vs. efficacy.
When the specific code is injected, start a clock. To make it easy, start at 0:00:00.00, or T(0).
At the detection point in the BB, ideally there is a logging/notification trigger that says, “Hey, we got something!” Now, I know this may not exist everywhere, but the goal is to grab hold of that same trigger and use it to Stop the Clock. T(1). If APIs are available… awesome. If not, ask your vendor to put in such a trigger point for your use.
The time difference between those two numbers is your current, accurately measured Detection Time, or T(1) – T(0) = D(t) for those input criteria and that specific test platform (SW/HW, etc.).
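The steps above can be sketched as a small stopwatch harness. This is only an illustration under stated assumptions: `inject` and `wait_for_trigger` are hypothetical callables standing in for whatever injection tool and logging/notification hook your own BB exposes, and the "Black Box" here is faked with a timer so the sketch can run on its own.

```python
import threading
import time
from typing import Callable, Optional


def measure_detection_time(
    inject: Callable[[], None],
    wait_for_trigger: Callable[[float], bool],
    timeout_s: float = 30.0,
) -> Optional[float]:
    """Return D(t) = T(1) - T(0) in seconds, or None if no trigger fired in time."""
    t0 = time.perf_counter()                # T(0): start the clock at injection
    inject()
    detected = wait_for_trigger(timeout_s)  # blocks until the trigger or the timeout
    t1 = time.perf_counter()                # T(1): stop the clock
    return (t1 - t0) if detected else None


# Hypothetical stand-in for a BB: it "detects" roughly 50 ms after injection.
detection_event = threading.Event()


def fake_inject() -> None:
    threading.Timer(0.05, detection_event.set).start()


def fake_wait(timeout_s: float) -> bool:
    return detection_event.wait(timeout_s)


d_t = measure_detection_time(fake_inject, fake_wait)
print(f"D(t) = {d_t * 1000:.1f} ms")
```

A monotonic clock (`time.perf_counter`) is used so that wall-clock adjustments during a test cannot distort the interval.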

In “normal” operations, a BB can be “in line” from a traffic standpoint or as a side-chain. Again, it doesn’t matter; this is a measurement testbed. The point is to inject at T(0) and measure the detection trigger at T(1) to begin understanding your network with an analogue view.

This view of product efficacy becomes exceedingly important as we begin to mix all of the Analogue Network Security concepts to build our security architectures. Here, Trust Factor is the amount of statistical confidence that the vendor has in his own product(s). For now we will ignore the impact of hardware and other environmental performance considerations in our evaluation, but from an implementation standpoint, this range (spectrum) can be very illuminating when evaluating a vendor’s security solution. First, we can examine error rate and how that translates to Trust Factor.

Just as in our earlier Trust Factor discussions where the TF is the inverse of Risk…

(Risk = 1/TF and TF = 1/Risk)

…with large datasets, we find that for a detection system to work with enough efficacy so as not to interfere with operations, several orders of magnitude of increased Trust Factor are required. We are looking at organizations of tens of thousands of people, each touching thousands of events every year (10^4 * 10^3). We are looking for efficacy that exceeds the Six Sigma Measure, Analyze, Design, Verify, OODA approach used so successfully by ICS/SCADA vendors such as GE, sometimes by a factor of 10^6, or more.
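To see why the orders of magnitude matter, here is a rough back-of-the-envelope sketch. It uses the Risk = 1/TF relationship and the 10^4 people and 10^3 events per person from the text; the three Trust Factor values are illustrative assumptions, not vendor figures.

```python
PEOPLE = 10**4              # "tens of thousands of people"
EVENTS_PER_PERSON = 10**3   # "thousands of events every year"
EVENTS_PER_YEAR = PEOPLE * EVENTS_PER_PERSON


def expected_errors(events: int, trust_factor: float) -> float:
    """Expected erroneous decisions per year, using Risk = 1/TF."""
    risk = 1.0 / trust_factor
    return events * risk


# Illustrative Trust Factors only (assumed for this sketch).
for tf in (1e3, 1e6, 1e9):
    errors = expected_errors(EVENTS_PER_YEAR, tf)
    print(f"TF = {tf:.0e}: about {errors:g} expected errors per year")
```

Each additional factor of 1,000 in Trust Factor cuts the expected yearly error count by the same factor, which is the difference between an operations team drowning in errors and one that rarely sees any.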

Our testbed, then, is pretty simple and akin to the measurement approach above. In this case, though, the values of x and y are clearly defined. We know what the answer should be, based upon the inputs. We also know – exactly – how long the Black Box takes to achieve a specific Trust Factor due to T(1) – T(0) = D(t).

If T(0) = 0 and T(1) = 3,000ms, then D(t) is clearly 3,000ms.

Nothing difficult here.

The next question for the Black Box vendor should be, given a Trust Factor of 1-(y/x) for D(t), will my Trust Factor increase as a function of D(t) increasing, and can you show me the plot on a curve? Does the Black Box become more accurate if it is given more time to complete its determination? If the Black Box accuracy increase is linear over time, TF should rise linearly as D(t) is increased. Or is there some other performance variable? Perhaps doubling the D(t) only provides a 10% increase in accuracy? Or, does the Black Box function optimize operation with an upper limit and the vendor has chosen this D(t) intentionally?
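A minimal sketch of that question, assuming x is the number of test samples injected with a known answer and y is the number the Black Box got wrong (my reading of the excerpt, which does not restate the definitions). The run data below are placeholders invented purely to show the shape of the calculation; real numbers have to come from your own testbed.

```python
from typing import Dict, List, Tuple


def trust_factor(x: int, y: int) -> float:
    """TF = 1 - (y/x), with x samples injected and y of them judged wrongly (assumed meaning)."""
    if x <= 0:
        raise ValueError("x must be a positive number of injected samples")
    return 1.0 - (y / x)


# PLACEHOLDER data: allowed D(t) in ms -> (x injected, y wrong). Not real measurements.
runs: Dict[int, Tuple[int, int]] = {
    1000: (1000, 100),
    2000: (1000, 50),
    3000: (1000, 20),
}

# One (D(t), TF) point per run, ordered by D(t): the curve to ask the vendor for.
curve: List[Tuple[int, float]] = sorted(
    (d_t, trust_factor(x, y)) for d_t, (x, y) in runs.items()
)

# TF gained at each step up in D(t); shrinking gains suggest diminishing returns.
gains = [(d2, tf2 - tf1) for (_, tf1), (d2, tf2) in zip(curve, curve[1:])]

for d_t, tf in curve:
    print(f"D(t) = {d_t} ms -> TF = {tf:.3f}")
```

If the gains are roughly equal from point to point, accuracy is rising linearly with D(t); if they shrink, the vendor has probably chosen a D(t) near the knee of the curve.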

My argument is we should know these things before deploying new technologies into any network. We can make these measurements. We can compare Black Box performance in varying conditions by controlling the known specifics of the inputs and measuring the D(t). We then have the opportunity and customer-luxury of comparing competing products with time-based metrics. Technical reviewers should have a ball – and vendors will necessarily be required to “Up Their Game.”
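As a sketch of what such a comparison could look like, the snippet below boils repeated D(t) measurements down to the min, avg, and max values asked about earlier. The product names and sample values are hypothetical placeholders, not test results.

```python
from statistics import mean
from typing import Dict, List


def summarize(samples_ms: List[float]) -> Dict[str, float]:
    """Reduce repeated D(t) measurements to D(t-min), D(t-avg), D(t-max)."""
    if not samples_ms:
        raise ValueError("need at least one D(t) sample")
    return {"min": min(samples_ms), "avg": mean(samples_ms), "max": max(samples_ms)}


# PLACEHOLDER samples in milliseconds for two hypothetical products.
measurements: Dict[str, List[float]] = {
    "BB-1": [2900.0, 3000.0, 3100.0],
    "BB-2": [1500.0, 3000.0, 6000.0],
}

report = {name: summarize(samples) for name, samples in measurements.items()}

for name, stats in report.items():
    print(f"{name}: min={stats['min']:.0f} ms avg={stats['avg']:.0f} ms max={stats['max']:.0f} ms")
```

In this made-up data, the second product is sometimes faster but has a far wider spread, which is the kind of difference a single advertised number hides.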

Let’s take this one step further and say after our initial measurements, “Wow… that’s really good! But I’m a security person and want to verify my results.” How can we do that? When we apply Verification, doesn’t this look somewhat familiar? A little feedback! Security experts often say that, perhaps in the case of malware detection, “Use two different products. One for Fee… one for Free.” Black Boxes #1 and #2 can be competing products, using the same or differing approaches to detection. What this gives us is a pretty simple mechanism by which to validate the performance claims of vendors in the detection universe.

If we do see a Detection Trigger in the Verification process, what does that mean? Is #2 better than #1? What happens if we reverse #1 & #2 in our testbed? Is their relationship transitive or not? This is the kind of analysis that tells us, on our terms, a lot more than just a handshake belief in the “Invisible Patented Secret Sauce We Can’t Show You.”

Think Dark Matter. We don’t know what it is. But we know it’s there because we can measure its effects on less exotic particles. Just because we, in the security world, are often not privy to the inner workings of the products and solutions we employ, there is no reason not to measure them.

In Fig. 8A above, two similar detection products are running in parallel (assume in sync for now). One, or ideally both, of the detection products should detect the same errant event or object. In the top right, we combine the two events with an AND gate, so that a reaction is initiated only if both detections occur. In the bottom, with an OR gate, we permit a reaction to take place if either of the detection mechanisms is triggered.

First, with the OR gate, note that the overall TF, as before, goes down. With the AND gate, the TF rises significantly.
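One way to see the direction of that effect is a small probability sketch. It rests on assumptions that are mine, not the text's: Risk is read as the probability of a false reaction (a false positive), the two detectors err independently, and each has the same illustrative 1% false-positive rate. Under the same independence assumption the picture reverses for missed detections, where the OR gate is the stronger arrangement.

```python
def and_gate_false_positive(p1: float, p2: float) -> float:
    """Both detectors must fire wrongly for the AND gate to react wrongly."""
    return p1 * p2


def or_gate_false_positive(p1: float, p2: float) -> float:
    """Either detector firing wrongly makes the OR gate react wrongly."""
    return 1.0 - (1.0 - p1) * (1.0 - p2)


def trust_factor(risk: float) -> float:
    """TF = 1/Risk, as defined in the text."""
    return 1.0 / risk


# Illustrative false-positive rate for each detector (assumed, not measured).
p = 0.01

tf_single = trust_factor(p)
tf_and = trust_factor(and_gate_false_positive(p, p))
tf_or = trust_factor(or_gate_false_positive(p, p))

print(f"single: TF = {tf_single:.1f}")
print(f"AND:    TF = {tf_and:.1f}")
print(f"OR:     TF = {tf_or:.1f}")
```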

Since the two products are not in 100% time-sync, we could, of course, employ a TB-FF for a secondary verification according to the two-man-rule approach. This will only work, however, if the reaction is revocable.

Now, let’s add a second component to Detection in Depth, that of Reaction, and see what we can measure.

In this case, we employ the same technique. Insert known good traffic and known bad samples into the Black Box detection. The difference is, we add the Reaction Matrix Process and Remediation, R(t), to the equation. Every tiny process takes some amount of time, no matter how small. (See earlier section, Chapter 3, page 12, “How Fast is a Computer?”)

Tying the API of the remediative technology to the R(t) clock gives us a measurable, and hopefully repeatable and consistent, way to time this one event chain as E(t) = (T(1) – T(0)) + (T(3) – T(2)). How long does the remediation take to complete? The answers are there … if you know what the right questions are. Successful Remediation is up to the enterprise to define. Is it an automatic remediation, such as a port block, user_event_denial, a change in security controls? Or, is there a human in the process?

The measurements are actually quite simple.

▶ The Detection Trigger stops the first clock and begins the reaction process, up to and including remediation.

▶ It also starts another clock process as the reaction process begins.
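The two clocks can be sketched as follows. The callables `detect` and `remediate` are hypothetical placeholders for your own detection wait and reaction process (here they are simulated with short sleeps), and where the reaction "ends" is whatever you have defined it to be.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class EventChain:
    t0: float  # T(0): test sample injected
    t1: float  # T(1): Detection Trigger fires, first clock stops
    t2: float  # T(2): reaction clock starts
    t3: float  # T(3): reaction/remediation declared complete

    @property
    def d_t(self) -> float:
        return self.t1 - self.t0

    @property
    def r_t(self) -> float:
        return self.t3 - self.t2

    @property
    def e_t(self) -> float:
        # E(t) = (T(1) - T(0)) + (T(3) - T(2))
        return self.d_t + self.r_t


def time_event_chain(detect: Callable[[], None], remediate: Callable[[], None]) -> EventChain:
    """Time one inject -> detect -> remediate chain with two back-to-back clocks."""
    t0 = time.perf_counter()
    detect()                   # returns when the Detection Trigger fires
    t1 = time.perf_counter()
    t2 = time.perf_counter()   # the trigger also starts the reaction clock
    remediate()                # returns when remediation is complete
    t3 = time.perf_counter()
    return EventChain(t0, t1, t2, t3)


# Simulated stages: 20 ms to detect, 30 ms to remediate (placeholder values).
chain = time_event_chain(lambda: time.sleep(0.02), lambda: time.sleep(0.03))
print(f"D(t)={chain.d_t:.3f}s R(t)={chain.r_t:.3f}s E(t)={chain.e_t:.3f}s")
```

Keeping the four timestamps, rather than only the total, lets you see whether detection or reaction is the stage holding E(t) away from zero.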

Only you can decide when the reaction process ends. Is it with notification of the Admin? Is it a corrected setting? Does it include forensics? Or only the immediate risk? Define this carefully, because, for there to be meaning as you successfully engage in analogue network security, repeatable relative (versus absolute) reference points are necessary.

At this point, though, all you need to do is come up with some rough reaction times R(t). You really want to know at what scale your current reaction matrix and channel are working. Does it take 1 hour? 20 minutes? Or is your process down to seconds? Which ones work better? Which means getting E(t) → 0, of course.

Measure the entire process. The whole thing, not just the tech. Add the human component wherever you can. Test different scenarios. Finally, I would like to offer the following suggestions to security vendors:

Be more transparent about how your products work. If your engineers know, so should your customers. If your engineers don’t, they should.
Enable your customers to test and measure your products in situ, within an operationally live production environment, so they understand the performance differences from idealized testbeds.
Give them the tools to do this, or someone else will.

These points are really critical, especially for this old analogue Winn. Just ’cause it works on the bench does not mean it will work at a live concert, or in your live network. From an analogue view, what problems, time delays, or other hindrances in your current network traffic and architectures can interfere with the proper time-based operation of your Detection-Reaction Matrix? I don’t know that answer – that’s why we have to measure it.

Now that you have a hard value, you have taken the first step in Analogue Network Security.

Let’s apply what we have discussed for the last several chapters and put them into real-life use.

The Horror of It All: Time-Wise

A couple of years back, my team tried to deploy a simple multi-media broadcast (video streaming) on a customer’s network. Interactive learning with tens of thousands of users around the world. Talk about complete failure! Yes, Epic Fail!

WTF? No way! But this smelled familiar to me. The neurons smoldered.

It took a lot of convincing, but I did get one of the friendly engineers, who ‘got’ that something else was afoot, to agree to do some network analysis. In seconds after our test measurement, the answers appeared, and were, to me, obvious.

Being a security-conscious financial firm, they had purchased and installed all sorts of security products; server-based, client-centric; from the perimeter to the soft gooey insides, they had it all.

But they had never tested their systems to look at the network performance hit from continuous synchronous communications on all of these security tools. They had no idea how much workstation capability was wasting away in the ether of security-ville. Worse yet, due to network collisions (oh, it was a mess…) some security products weren’t even working properly. (Bruce Schneier’s Security Theater comes to mind…).

Unplug a couple. Retest. Interactive multimedia performs correctly. Turn ’em back on… and, Fail!

Please heed this as an obvious lesson.

Look at your networks (code…). Know how to Test. Fairly. Recognize the difference between Live-Production, Dev, and Idealized Testbeds.

Then, test again. And again. Yes, do it again, too. Regularly. Please! Apply OODA. Oh, and test again.

End Rant #42.

About the Author

Winn Schwartau’s latest book, Analogue Network Security, is available on Amazon. Winn is one of the world’s top experts on security, privacy, infowar, cyber-terrorism, and related topics. He is also the founder and CVO (chief visionary officer) of The Security Awareness Company. He can be reached on Twitter @WinnSchwartau and at the official SAC website https://www.thesecurityawarenesscompany.com.