By Amol Bhagwat, VP, Solutions and Field Engineering at Gurucul
As the threat landscape grows more complex, security analytics are becoming essential for identifying, preventing and responding to threats. As a result, recent research suggests that the security analytics market will grow by more than 16% (to more than $25B) by 2026. Today, security products offer a variety of analytics modules, either as separate parts of a platform like a SIEM or as individual products. These often include analytics for network traffic, user and entity behavior (UEBA), identity, IoT devices, cloud, logs, endpoints and more.
All these analytics are important for detecting various threat actor tactics, techniques, and procedures (TTPs), such as account compromise, privileged access misuse, data theft, malware, lateral movement, device discovery, covert channel exfiltration and more. Analytics modules typically are powered by some form of machine learning and sit on top of a data lake. How much value an organization gets out of these analytics depends on two factors: 1) whether those analytics modules are unified or separate, and 2) whether they use a rules-based engine or true adaptive machine learning (ML).
In this article, we’re going to explore the value of unifying multiple analytics streams and explain how it helps organizations determine their overall security posture and risk. First, what’s the value of unified analytics?
While each analytics module provides useful information on its own, the value increases substantially when the modules are unified. If models are separate, security analysts need to put together the results manually to produce context (much like pieces of a puzzle). For example, a slightly higher than normal number of login attempts to a particular system via a mobile device may not be a serious risk on its own. But if that system connected to a known malware site on the last successful attempt, the sequence of events presents a huge risk. Knowing these two facts requires two completely different sets of analytics and data that must be connected to show the full picture.
Furthermore, having separate analytics is a resource burden. Too many modules produce too much data, which can overwhelm small teams. And individual pieces of data don’t tell the whole story. For example, one module might detect someone logging in from a new IP address. But are they working remotely or has their account been compromised? Limited data like this can send analysts on a wild goose chase, which takes up time and resources. The organization winds up spending more for subpar protection.
Unified analytics connects outputs from each system to establish context and identify relationships between them. For example, detecting a new IP address login along with port scanning or unusual lateral movement would strongly indicate that an account has been compromised. Another example: accessing a clinical patient record kept in a US data center remotely from an approved laptop is likely acceptable, but accessing it from a Linux server in Guatemala should raise red flags. By unifying this different telemetry and applying the corresponding analytics, teams can assess risk more accurately, better target a response, be more transparent about the process (and have more confidence in the results), understand the entire attack more quickly (through a unified console), reduce threat hunting costs, and improve overall security. But not all solutions make this easy; in a survey conducted at RSA 2023, 42% of respondents said it took them weeks or longer to add new data sources to their SIEM, and nearly half said they chain together only endpoint and network analytics.
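To make the idea of connecting module outputs concrete, here is a minimal Python sketch of the new-IP-login example above. It is an illustration only, not how any particular product works: the `Alert` fields, the signal names, the `CORROBORATING` combinations and the 0.5 boost are all hypothetical values chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    entity: str   # user or host the alert is about
    module: str   # hypothetical module name, e.g. "identity" or "network"
    signal: str   # hypothetical signal name, e.g. "new_ip_login"
    score: float  # the module's own risk score, assumed to be in [0, 1]


# Hypothetical signal combinations that, seen together for one entity,
# suggest an account compromise rather than an isolated oddity.
CORROBORATING = [
    frozenset({"new_ip_login", "port_scan"}),
    frozenset({"new_ip_login", "lateral_movement"}),
]

BOOST = 0.5  # illustrative bump applied when signals corroborate each other


def unified_risk(alerts):
    """Group alerts by entity and raise the risk when signals corroborate."""
    by_entity = defaultdict(list)
    for alert in alerts:
        by_entity[alert.entity].append(alert)

    risk = {}
    for entity, items in by_entity.items():
        signals = {a.signal for a in items}
        base = max(a.score for a in items)  # strongest single-module score
        corroborated = any(combo <= signals for combo in CORROBORATING)
        risk[entity] = min(1.0, base + BOOST) if corroborated else base
    return risk


alerts = [
    Alert("alice", "identity", "new_ip_login", 0.3),
    Alert("alice", "network", "port_scan", 0.4),
    Alert("bob", "identity", "new_ip_login", 0.3),
]
for entity, score in sorted(unified_risk(alerts).items()):
    print(entity, round(score, 2))
```

In this sketch, the same low-scoring new-IP login is ranked very differently for the two users because only one of them also has a corroborating network signal, which is the context an analyst would otherwise have to assemble by hand.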
But unifying analytics modules is only part of the equation. The type of machine learning applied to these data sources is also crucial to streamlining detection and response. Most of today’s solutions (such as XDR and SIEM) still use rules-based ML with a predefined set of rules and instructions to look for specific inputs and produce specific outputs. For example, they look for malware signatures, where a file hash either matches a signature or does not, or they analyze logs while throwing out additional endpoint telemetry gathered from an EDR solution. This can slow a security analyst down in identifying a threat. For example, if a user has uncommon access to a specific application, but this is an accepted outlier condition, it’s important not to throw a false positive. That requires trained ML rather than the automatic triggering of a simple rule.
It’s rarer to find solutions using adaptive ML. These models train on actual data, which allows the system to learn new rules on its own, discard ones that aren’t working anymore, and ingest unfamiliar or unstructured data. Adaptive ML also makes it easier to scale as a network grows and can ingest more types of data, such as badge systems or data from HR software to show who is on vacation, has put in their two weeks’ notice, or is on a performance improvement plan. It may also save the organization money depending on how the vendor charges for that data (on the other hand, products that charge by data volume can quickly run up huge bills if not monitored closely). It also adapts to new or changing attacks without requiring vendor updates and lets teams verify or customize new models (if the vendor allows it). This is an important capability; the same RSA survey cited above found that just 20% of respondents are very confident that their SIEM can detect unknown attacks, and 17% are not confident it can do so.
Finally, adaptive ML does a better job overall of finding relationships between data because it’s not restricted to preset inputs. For example, the system can learn not to flag logins from unfamiliar IP addresses when that user is working remotely. Because it has this context, the analytics throw far fewer false positives. This reduces the workload for security teams, lets them focus on the true positives, and makes the organization safer overall.
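The contrast between a fixed rule and a baseline that learns from observed behavior can be sketched in a few lines of Python. This is a deliberately simple frequency heuristic standing in for adaptive ML, not a real model and not any vendor’s method; the class name, the `min_history` and `remote_ratio` thresholds, and the example addresses (taken from reserved documentation ranges) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def static_rule(known_ips, ip):
    """Rules-based check: flag any login from an IP not on a fixed list."""
    return ip not in known_ips


class LoginBaseline:
    """Toy per-user baseline that is updated with every observed login."""

    def __init__(self, min_history=5, remote_ratio=0.3):
        self.ips = defaultdict(Counter)   # user -> count of logins per IP
        self.min_history = min_history    # hypothetical warm-up threshold
        self.remote_ratio = remote_ratio  # hypothetical "roaming user" cutoff

    def observe(self, user, ip):
        self.ips[user][ip] += 1

    def is_anomalous(self, user, ip):
        seen = self.ips[user]
        total = sum(seen.values())
        if total < self.min_history:
            return False  # too little history to judge in this sketch
        if ip in seen:
            return False  # address already part of the user's baseline
        # Users whose past logins are spread over many addresses (for
        # example, remote workers) routinely appear from new ones, so an
        # unfamiliar IP is only flagged for users with a narrow history.
        diversity = len(seen) / total
        return diversity < self.remote_ratio


baseline = LoginBaseline()
for _ in range(10):
    baseline.observe("office_user", "192.0.2.10")
for i in range(10):
    baseline.observe("remote_user", f"198.51.100.{i % 6}")

new_ip = "203.0.113.9"
print(static_rule({"192.0.2.10"}, new_ip))            # fixed rule
print(baseline.is_anomalous("office_user", new_ip))   # narrow history
print(baseline.is_anomalous("remote_user", new_ip))   # roaming history
```

The fixed rule treats every unfamiliar address the same way, while the learned baseline flags the new address only for the user whose history makes it unusual. A production system would draw on far richer context than IP diversity, such as the HR and badge data mentioned earlier.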
Unified analytics based on true, adaptive ML offers many advantages over separate, rules-based analytics, including reduced time-to-discover and time-to-remediation. But with more solutions entering this space, it’s becoming even more difficult to evaluate analytics. To help, consider asking these three questions:
- Can I correlate data from any source, no matter what it is, and if so, what is this costing me?
- Can this system detect new and emerging threats and if so, how?
- Does this system calculate a risk or priority level for alerts, and do these calculations just use public sources or are they customized to my specific network environment?
About the Author
Amol Bhagwat is the VP of Solutions and Field Engineering at Gurucul. Amol is a distinguished security professional with over 15 years of experience in delivering security and risk management solutions for Fortune 500 customers across the globe. He drives product strategy, marketing campaigns, solutions development, APAC technical sales and the global customer success program. Prior to Gurucul, he played an important role in building the security practice for a major global system integrator. He achieved exponential business growth as a practice lead with a focus on innovative solutions and delivery excellence. Amol graduated from the University of Mumbai with a B.E. in Electronics.