Cyber threats are an ongoing challenge for businesses. They manifest themselves in several forms, making it difficult for organizations to keep pace. Businesses are increasingly using machine learning (ML) to unearth patterns from data and devise models to fight digital attacks.
ML plays a key role in anomaly detection: the process of identifying data points that deviate from the patterns or behaviors considered normal. These deviations may be suspicious, unusual, or rare. ML enables businesses to automatically update detection rules based on newly identified attack patterns, protecting across use cases and different lines of business.
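As a minimal illustration of anomaly detection, the sketch below flags points in a traffic series that deviate sharply from the rest, using a robust z-score built from the median and median absolute deviation. The function name, threshold, and data are illustrative only, not the method any particular product uses.

```python
from statistics import median

def flag_anomalies(values, threshold=3.5):
    """Flag points whose robust z-score (median/MAD based) exceeds threshold.

    The median/MAD variant is used instead of mean/stdev so that a single
    extreme point cannot inflate the spread and mask itself.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # Degenerate case: most points are identical; anything else is anomalous.
        return [v != med for v in values]
    return [0.6745 * d / mad > threshold for d in abs_dev]

# Logins per minute; the spike at the end is the kind of deviation
# an anomaly detector would surface for review.
traffic = [12, 10, 11, 13, 12, 11, 10, 12, 11, 300]
print([i for i, flagged in enumerate(flag_anomalies(traffic)) if flagged])  # → [9]
```

In a real pipeline the flagged points would feed a rule-update step rather than being acted on directly.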
However, every time a business experiences an attack that is novel in some way, a few questions arise. Should we expand the existing machine learning model? Or is this incident different enough to be its own problem, meriting its own solution?
Expand a model or split it into multiple models?
Expanding a model is always an interesting challenge, because the answer to every problem seems to be better training data. Each time there is a new threat or new information, you simply add it to the training data and adjust the model to optimize for the latest threat. Over time, the model grows into a one-size-fits-all solution – just like those pants that promise one size fits all! However, the more situations a model can fit, the less well it fits any specific situation, forcing a tradeoff between accuracy and coverage.
Another drawback of a single, overarching model is that each time you retrain it for a new issue, you introduce recency bias into the solution. Today's advanced attack is tomorrow's script kiddie tool. While the leading edge of cyber attacks keeps advancing and growing more sophisticated, old methods still pop up whenever someone finds a tool built for a previously successful attack.
Splitting a model into multiple models can help resolve the accuracy-vs-coverage dilemma. That said, maintaining and retraining multiple models costs significant effort, time, and resources. In addition, you must determine which model takes precedence for edge cases where the models return conflicting results.
Speed to response can create a backlog of quick fix solutions
Speed to response is becoming increasingly crucial, with attackers launching instantaneous, high-volume attacks whenever they find a chink in the defenses. To respond instantly to a threat, security teams are often tempted to put a quick fix in place. While this buys them some time to adjust and adapt their defenses, it also places them in a catch-22.
Given the rapid pace of attackers' moves and countermoves, attack-prevention teams are often left with no time to revisit and review these stopgap arrangements. Without such review, the quick fixes soon build into a backlog of work that needs to be addressed. This backlog means the quick fixes remain in place far longer than is healthy for the organization.
Further, with attackers manipulating identities at scale and intelligent bots mimicking human behavior with fair accuracy, not all incoming traffic signals can be cleanly classified as 'trust' or 'mistrust'. As more signals fall into a 'gray' zone, analyzing them is one of the constant challenges of fraud prevention. And as detection methods and strategies improve, attackers become more adept at adjusting to them. There is, therefore, no absolute determination on any specific transaction – only a level of certainty. We have discussed this challenge of handling gray signals in earlier blogs, and it is why features like risk scores are an industry standard.
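To illustrate how a risk score turns gray signals into graded decisions rather than a binary trust/mistrust verdict, here is a minimal sketch. The `triage` function and its thresholds are hypothetical, chosen only to show the three-band pattern.

```python
def triage(risk_score, low=0.3, high=0.8):
    """Map a risk score in [0, 1] to an action.

    Thresholds are illustrative, not any product's actual values.
    Scores between `low` and `high` are the 'gray' zone: instead of a
    binary allow/block verdict, the user is challenged.
    """
    if risk_score < low:
        return "allow"
    if risk_score < high:
        return "challenge"  # gray signal: let a challenge resolve the uncertainty
    return "block"

print(triage(0.12), triage(0.55), triage(0.91))
```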
The Solution: Mitigate attacks and develop models in parallel
The best possible way to avoid a tradeoff between accuracy and coverage is to have speedy attack mitigation and long-term development activities running in parallel. The Data Science team can leverage new detection methods and product features to focus on improving the current models and building new ones. Simultaneously, the Customer Success team can focus on smaller, situational applications for faster attack responses and to constantly monitor and mitigate the old attack vectors.
Arkose Labs understands the predicament digital businesses are facing and has worked out a unique supervised and unsupervised machine learning approach, where two different teams approach the same issue from either side. While the Data Science group takes care of the development activities, the Customer Success group has created a system comprising multiple smaller apps that specialize in their respective tasks. These include: detecting large volumes of new traffic from a suspicious IP, flagging a fingerprint that solves challenges at a non-human rate, checking that previously detected signatures are still active, and so forth.
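As a sketch of what one of these small, specialized apps might look like, the function below flags IPs sending unusually large volumes of traffic within a window. The function name, event format, and threshold are assumptions for illustration, not the actual implementation.

```python
from collections import Counter

def high_volume_ips(events, threshold=100):
    """Return the set of IPs whose request count in this window meets
    or exceeds `threshold` (an illustrative stand-in for one of the
    small monitoring apps described above)."""
    counts = Counter(event["ip"] for event in events)
    return {ip for ip, n in counts.items() if n >= threshold}

# One noisy IP amid normal traffic.
window = [{"ip": "198.51.100.7"}] * 120 + [{"ip": "203.0.113.9"}] * 3
print(high_volume_ips(window))  # → {'198.51.100.7'}
```

Keeping each app this narrow is what makes its output easy to interpret when it feeds the risk engine.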
Allowing each of these applications to perform a single, specific task makes it easier to understand what attackers are doing. The outputs of these applications then feed into either our risk engine or the validation mechanism, which alerts the team for review and adjustment before our customers can feel the impact.
This dual-team approach gives us the benefit of oversight from two teams and powers multi-faceted, smarter detection. With individual teams detecting and mitigating attacks, we continue to grow our defenses without the constant back and forth of active attack mitigation interrupting the development work.
Stop attacks faster, improve consumer experience
Our smart detection engine performs real-time risk assessment of every incoming user and informs the challenge-response mechanism, which presents a challenge appropriate to the user's risk level. This symbiotic relationship gives us the unique ability to test decisions, rather than making a binary decision or passing that decision on to our customers. We can segment traffic based on signals from the alerting system and test how each segment reacts to the challenge the auditing system presents.
The continual feedback between the alerting and auditing systems allows pressure to escalate automatically, preventing attacks from succeeding. This feedback loop stops attacks much faster than manual intervention techniques, is more fine-tuned, and helps improve the consumer experience. Further, depending on results observed earlier from the enforcement challenges, we can adjust the pressure of the challenge. This adjustment is made by feeding the information – whether this segment of traffic is more or less suspicious – back to the detection engine.
While this ability to automatically increase pressure on attacks is a major benefit to our customers, the reverse often provides even more value. It allows us to automatically detect false positives, reduce the pressure the enforcement challenges exert, and minimize the impact on real users. This is particularly important when the attacker is skilled at imitating legitimate consumers. It allows the system to self-heal once the attack has stopped and the increased pressure from the challenges would only impact genuine consumers.
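The escalate/de-escalate behavior described in the last two paragraphs can be sketched as a simple feedback rule: raise challenge pressure when a traffic segment solves at a suspiciously automated rate, and ease it back down when the segment behaves like humans. The function, its solve-rate thresholds, and the 0-10 pressure scale are all hypothetical.

```python
def adjust_pressure(level, solve_rate, human_rate=0.85, bot_rate=0.99):
    """Return the next challenge-pressure level (0-10) for a traffic segment.

    Illustrative feedback rule: near-perfect solve rates suggest automation,
    so pressure escalates; human-like solve rates (some natural failures)
    let the system self-heal by stepping pressure back down.
    """
    if solve_rate >= bot_rate:
        return min(level + 1, 10)  # likely automated: escalate
    if solve_rate <= human_rate:
        return max(level - 1, 0)   # looks human: de-escalate, reduce friction
    return level                   # gray zone: hold steady and keep observing

print(adjust_pressure(5, 0.995))  # → 6 (escalate against a likely bot)
print(adjust_pressure(5, 0.70))   # → 4 (self-heal: ease off real users)
```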