Until now, the shift towards transparency in the cybersecurity industry has been slow-moving, and rightfully so. When it comes to artificial intelligence (AI) developments, however, organisations' pervasive unwillingness to share new methodologies is noticeably stalling the progress of AI and, in turn, its potential to mitigate cyber threats.
The lack of leadership in this area over the years has also created a muddled understanding of how AI truly provides protection against cyber threats, further impeding AI adoption by IT leaders.
Now is not the time to underestimate the threat of human intelligence. According to the Sophos 2021 Threat Report, ransomware threat actors continue to innovate both their technology and their techniques at an accelerating pace. There is no telling when the next ransomware strategy or highly contextualised business email compromise (BEC) forgery campaign will launch, and organisations need to address this with urgency before it is too late.
At Sophos, we’ve recognised the criticality of this moment, even if it means challenging industry norms. Last month, our team of SophosAI data scientists announced four new open AI developments to strengthen the industry’s defences against cyberattacks. Organisations are free to dive into our latest datasets, tools and methodologies to advance their innovation and cybersecurity agendas. Making AI in security more transparent is the first step to protecting organisations with fewer resources from cybercrime and bolstering industry collaboration.
Even so, making datasets and methodologies widely available carries a danger that extends beyond cybersecurity. Developing products and strategies based on poorly gathered data risks making the entire effort redundant and potentially harmful to the business. As David DeLallo, executive editor at McKinsey, and Jeni Tennison, vice president and chief strategy adviser at the Open Data Institute (ODI), point out, productive sharing of AI assets requires high-quality, trustworthy data. This is exactly the standard we have applied in our open AI developments.
The four areas in which Sophos is providing datasets, tools and methodologies are:
- SOREL-20M dataset for accelerating malware detection research: SOREL-20M is a joint project between SophosAI and ReversingLabs containing metadata, labels and features for 20 million Windows Portable Executable files. With 10 million disarmed malware samples available, it is the first production-scale malware research dataset available to the general public.
- AI-powered impersonation protection method: Impersonation protection is designed to flag potential email spearphishing attacks by comparing the display names of inbound emails against the names and titles of an organisation's own senior executives, who are the most likely to be spoofed in such an attack.
- Digital epidemiology to determine undetected malware: A statistical method pioneered by Sophos to estimate the overall prevalence of malware infections, in order to identify malicious “dark matter” (malware that may have gone undetected) and “future malware” being developed by attackers.
- YaraML automatic signature generation tools: A new, AI-driven automatic signature generation method that directly “compiles” full-fledged, industrial-strength machine learning models into signature languages. It essentially allows the AI to “write” the signature, an approach that can prove far more effective than traditional hand-written signatures.
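The core idea behind impersonation protection can be sketched with a simple string-similarity check. This is a minimal illustration only, not Sophos's production model; the executive names, trusted domain, threshold and helper function below are all hypothetical:

```python
from difflib import SequenceMatcher

# Hypothetical list of protected executives for one organisation,
# and the organisation's own mail domain.
EXECUTIVES = ["Jane Smith", "Rahul Patel"]
TRUSTED_DOMAIN = "example.com"

def normalise(name: str) -> str:
    """Lowercase and collapse whitespace so comparisons are stable."""
    return " ".join(name.lower().split())

def is_suspicious(display_name: str, sender_address: str,
                  threshold: float = 0.85) -> bool:
    """Flag mail whose display name closely matches a protected
    executive but whose address is outside the trusted domain."""
    if sender_address.lower().endswith("@" + TRUSTED_DOMAIN):
        return False  # internal mail is not treated as spoofing here
    candidate = normalise(display_name)
    return any(
        SequenceMatcher(None, candidate, normalise(name)).ratio() >= threshold
        for name in EXECUTIVES
    )

print(is_suspicious("Jane Smith", "jane.smith@gmail.com"))  # → True (exact-name spoof)
print(is_suspicious("Jane Smlth", "ceo@mailer.net"))        # → True (look-alike spoof)
print(is_suspicious("Bob Jones", "bob@partner.org"))        # → False (unrelated sender)
```

A production system would combine fuzzier matching with many more signals, but the check above captures the bullet's essential comparison: inbound display name versus the organisation's own executive names.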
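Sophos's full statistical machinery for digital epidemiology is beyond a short snippet, but the underlying idea of estimating a total from partial sightings can be illustrated with the Lincoln-Petersen capture-recapture estimator, a classic epidemiological tool. This is an illustrative analogy, not necessarily the method Sophos uses, and all the numbers are hypothetical:

```python
def lincoln_petersen(caught_a: int, caught_b: int, caught_both: int) -> float:
    """Capture-recapture estimate of a total population:
    N ≈ (caught_a * caught_b) / caught_both."""
    if caught_both == 0:
        raise ValueError("samples must overlap for the estimate to exist")
    return caught_a * caught_b / caught_both

# Hypothetical telemetry: detector A flags 1,000 malicious files, an
# independent detector B flags 800, and 400 files are flagged by both.
total = lincoln_petersen(1000, 800, 400)  # → 2000.0 estimated total
observed = 1000 + 800 - 400               # 1,400 files seen by either detector
dark_matter = total - observed            # ≈ 600 files of undetected "dark matter"
print(total, observed, dark_matter)
```

The gap between the estimated total and what either detector has actually seen is the “dark matter” the bullet describes: malware that exists in the wild but has so far escaped detection.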
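The “compile a model into a signature” idea behind YaraML can be sketched schematically: the snippet below turns a tiny, hand-specified linear model into a YARA rule whose condition reproduces the model's weighted-sum decision. The feature strings, weights and threshold are hypothetical, and the real YaraML tool handles far richer models than this:

```python
# Hypothetical weights from a trained linear model over substring
# features; scaling to integers keeps the YARA condition float-free.
WEIGHTS = {
    "CreateRemoteThread": 2.0,
    "VirtualAllocEx": 1.5,
    "IsDebuggerPresent": 0.5,
}
THRESHOLD = 2.75  # flag a file when the weighted sum exceeds this

def compile_to_yara(rule_name: str) -> str:
    """Emit a YARA rule whose condition mirrors the linear model:
    each string's match count is multiplied by its scaled weight."""
    strings = "\n".join(
        f'        $s{i} = "{feat}"' for i, feat in enumerate(WEIGHTS)
    )
    terms = " + ".join(
        f"(#s{i} * {int(w * 100)})" for i, w in enumerate(WEIGHTS.values())
    )
    return (
        f"rule {rule_name}\n"
        "{\n"
        "    strings:\n"
        f"{strings}\n"
        "    condition:\n"
        f"        {terms} > {int(THRESHOLD * 100)}\n"
        "}\n"
    )

print(compile_to_yara("ml_compiled_demo"))
```

Using raw match counts (`#s0`) as features is a simplification; the point is that the finished rule runs in any stock YARA engine with no ML runtime attached, which is what makes compiling models into signature languages attractive.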
With the launch of these four new methodologies, Sophos is well-positioned to help transform the industry's understanding of AI-driven cybersecurity solutions. In a market riddled with buyer scepticism, changes to industry standards or regulations are not enough to build the confidence in AI necessary for mitigating modern cyberthreats at scale. Openness in AI is what will open the doors to a new age of cybersecurity.