Is Your Security Log ‘Bathtub’ About to Overflow?

October 2, 2022

By Ozan Unlu, CEO and Founder, Edge Delta

Security Log Data – More Data Doesn’t Always Mean Better Protection

A major issue that security operations teams face is the speed at which vulnerabilities are being exploited, coupled with the rapidly growing volume of security event data being generated across current infrastructures.

Security logs can be extremely useful for helping identify or investigate suspicious activity, and are a cornerstone of every traditional SIEM platform. But the fact is that current infrastructures are generating security logs faster than humans or even machines can analyze them.

Consider this: it would take a person about one 8-hour work day to read 1 megabyte of raw logs and events. Reading a gigabyte in the same day would take a thousand people, a terabyte a million people, and a petabyte a billion people. Some of the organizations we work with create close to 100 petabytes of data per day. Security operations teams are drowning in data and the tide is only going to get higher. These teams desperately need a better way to manage, analyze and make sense of it all. But how?
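The scaling above can be checked with a few lines of arithmetic. This is only a sketch: it takes the article's figure of roughly 1 megabyte per person per work day as given, and assumes decimal units (1 gigabyte = 1,000 megabytes), which is what the thousand, million, billion progression implies. The function and constant names are illustrative.

```python
# Assumption taken from the text: one person reads about 1 MB of raw logs
# in one 8-hour work day.
MB_PER_PERSON_DAY = 1

# Decimal (SI) units, expressed in megabytes.
UNITS_IN_MB = {
    "megabyte": 1,
    "gigabyte": 10**3,
    "terabyte": 10**6,
    "petabyte": 10**9,
}


def readers_needed(megabytes: int, mb_per_person_day: int = MB_PER_PERSON_DAY) -> int:
    """People needed to read `megabytes` of logs in a single work day (rounded up)."""
    return -(-megabytes // mb_per_person_day)  # ceiling division on integers


for unit, size_mb in UNITS_IN_MB.items():
    print(f"1 {unit}: {readers_needed(size_mb):,} reader(s)")

# The daily volume mentioned for some organizations.
daily_mb = 100 * UNITS_IN_MB["petabyte"]
print(f"100 petabytes: {readers_needed(daily_mb):,} readers")
```

Under these assumptions, 100 petabytes per day works out to 100 billion readers, more than twelve times the world's population.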

The Limitations of SIEM Systems

Today’s SIEM systems – where security logs are traditionally routed, indexed and prepared for analysis – are quite advanced, but they do have their limitations. Certain systems, particularly older, on-premises ones, can be painfully slow when it comes to querying data and delivering the required information, especially when maxed out on events per second (EPS). This is far from ideal given that attackers need only seconds to exploit a vulnerability. Visibility into threats – both emerging and existing – in as close to real time as possible is essential.

Additionally, SIEM pricing models can be problematic, as price often inflates massively as data volumes increase, while security budgets are only increasing incrementally. Here, it’s important to remember that not all security log data is created equal. Certain logs are the most likely to contain meaningful information, while other logs may contain information helpful for event correlation.

An example of this would be an intrusion detection system that records malicious commands issued to a server from an external host; this would be a primary source of attack information. A firewall log could then be reviewed to identify other connection attempts from the same source IP address, reinforcing that the IP address in question likely belongs to a malicious actor.
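To make the correlation step concrete, here is a minimal sketch that joins intrusion detection alerts to firewall entries on source IP address. The record layouts, field names and IP addresses are hypothetical (the addresses come from ranges reserved for documentation), and a real SIEM correlates across far more fields and time windows than this.

```python
from collections import defaultdict

# Hypothetical, simplified records; real IDS and firewall logs carry many more fields.
ids_alerts = [
    {"src_ip": "203.0.113.7", "dst_host": "web-01", "signature": "malicious command"},
]
firewall_log = [
    {"src_ip": "203.0.113.7", "dst_port": 22, "action": "deny"},
    {"src_ip": "198.51.100.4", "dst_port": 443, "action": "allow"},
    {"src_ip": "203.0.113.7", "dst_port": 3389, "action": "deny"},
]


def correlate(alerts, fw_entries):
    """Map each alerting source IP to the firewall entries seen from that same IP."""
    by_ip = defaultdict(list)
    for entry in fw_entries:
        by_ip[entry["src_ip"]].append(entry)
    # .get() avoids creating empty entries for IPs with no firewall activity.
    return {alert["src_ip"]: by_ip.get(alert["src_ip"], []) for alert in alerts}


related = correlate(ids_alerts, firewall_log)
for ip, entries in related.items():
    ports = sorted(e["dst_port"] for e in entries)
    print(f"{ip}: {len(entries)} firewall entries, destination ports {ports}")
```

The point of the sketch is that the firewall entries are of little interest on their own; they become valuable only once the IDS alert identifies which source address to look for.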

Intelligent event correlation is one of the most powerful features of SIEM systems, and the richer and more comprehensive the data, the better the results. Security operations teams therefore find themselves facing a dilemma. They can include the majority (or all) of their log data – including a high volume of logs with little to no value – which often leads to an overstuffed SIEM that eats through their budget. Or, they can make predictions about which logs they really need while neglecting others, which may keep the team within budget but creates significant blind spots and vulnerabilities. Such an approach may be deemed too risky since threats can be lurking anywhere.

Finding A Balance

The “centralize and analyze” approach to SIEM evolved at a time when organizations prized one true copy of logs in one highly secure location, often totally separated from production environments and completely inaccessible to hackers, malicious insiders and other employees. Given the significant rise in the number and variety of cybersecurity threats, combined with the volume of security logs being generated, such an approach is no longer optimal from a speed or cost perspective.

A new approach is needed that entails analyzing all data at its source – separating where data is analyzed from where it is stored. Some call this approach “Small Data” – processing smaller amounts of data in parallel. Once logs are analyzed at their source, they can be classified as higher-value (and routed to a higher-cost, lower-volume SIEM repository) or lower-value (and routed to a lower-cost storage option). Additionally, when analytics are pushed upstream, security operations teams can sidestep indexing for the moment and identify anomalies and areas of interest even faster than with an SIEM alone, which is critical in the constant race with adversaries.
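The routing idea can be sketched as follows. This is an illustration only, not a description of any particular product: the severity-based rule, the destination names "siem" and "archive", and the event fields are all assumptions made for the example. In practice, the analysis at the source would be considerably richer than a single field check.

```python
from typing import Iterable

# Hypothetical rule: events with these severities count as higher-value.
HIGH_VALUE_SEVERITIES = {"critical", "high"}


def classify(event: dict) -> str:
    """Return the destination name for a single log event."""
    severity = str(event.get("severity") or "").lower()
    return "siem" if severity in HIGH_VALUE_SEVERITIES else "archive"


def route(events: Iterable[dict]) -> dict:
    """Split events into per-destination batches; every event is kept somewhere."""
    batches = {"siem": [], "archive": []}
    for event in events:
        batches[classify(event)].append(event)
    return batches


# Hypothetical events as they might look at the source, before any indexing.
events = [
    {"severity": "critical", "message": "malicious command detected"},
    {"severity": "info", "message": "health check ok"},
    {"severity": "HIGH", "message": "repeated login failures"},
    {"message": "debug trace"},  # no severity field at all
]

batches = route(events)
print(f"to SIEM: {len(batches['siem'])}, to low-cost storage: {len(batches['archive'])}")
```

Note that nothing is discarded: lower-value events still land in the cheaper store, so they remain available for later correlation without counting against SIEM ingestion.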

Today, this can be achieved in a way that maintains maximum security, availability and confidentiality of logs. Enterprises can therefore afford to have eyes on all their data while not compromising the security benefits of an SIEM.

Conclusion

In the coming years, cybersecurity threats will only escalate, and security teams must be able to expertly harness and leverage their log data in order to stay one step ahead. Older methods of data ingestion and analysis, which may have worked well as recently as a few years ago, are in need of modification to meet this new reality. There are ways that security operations teams can have it all – visibility into complete datasets combined with budget preservation – but it may require some new approaches.

About the Author

Ozan Unlu, CEO and Founder, Edge Delta. He can be reached online at [email protected] and at https://www.edgedelta.com/