Closing the net on bad actors, to make phishing as insignificant as spam email.
Cybersecurity companies are in an arms race with bad actors. The more technologies you put in place to detect and block phishing content, the more technologies bad actors implement to evade detection. VISUA’s Visual-AI doesn’t look at code or use fingerprints – it looks at emails and web pages with ‘human’ eyes, but at machine speed, to flag and score high-risk elements it finds, allowing phishing detection systems to prioritise further and deeper analysis.
Our phishing detection Visual-AI was developed to integrate and work in harmony with a platform’s existing AI-based detection methods, providing an early warning system that detects high-risk brands and other visual factors in emails and websites. It is built on a dedicated and proprietary technology stack that can provide instant analysis and detection. No buzzwords or impossible promises, only results that are trusted by some of the leading anti-phishing and cybersecurity platforms in the world.
Think of a visual signal as the detection of any visual element that is significant to a given use case. For example, in phishing detection, the presence of a specific logo, such as that of a bank, in an email or web page could be very significant and highlight that communication for deeper analysis. Equally, the presence of a specific word in an email that was not detected during programmatic analysis would indicate the use of some form of obfuscation technique, which in itself would indicate some form of scam activity.
Therefore, visual signals are an additional layer of data, extracted from visual media, that can subsequently be layered on top of other signals, derived from other sources, to drive more accurate business intelligence and decision-making.
We have all heard the term ‘arms race’ in relation to cyber security many times: bad actors constantly evolve, and cyber security companies have to keep up while trying to predict future fraud activities and attack vectors. This has never been more true, thanks to AI/machine learning. This technology has revolutionised white hat techniques for detecting spam, phishing and malware, but it has also been weaponised by criminals to make their attacks ever more innovative and harder to detect.
A cyber security company’s goal is to make decisions as to what is genuine and what is a threat. That requires the analysis of signals. But some of these threats are now driven by weaponising graphics in brand spoofing and impersonation attacks, and by using graphics to evade detection. Analysis of these messages and pages using computer vision therefore provides an additional layer of visual signals that can be added to signals derived programmatically, allowing better decision making and prioritisation of certain messages for deeper analysis.
The key difference is that we do not implement computer vision to simply analyze visual elements within the code of an email or web page. That approach is simply too easy to evade, as well as missing out on the power of Visual-AI to detect other evasion techniques that are not visual in nature.
Instead, VISUA’s approach exploits a key flaw in every attack. At some point all the smoke and mirrors must stop and the scammer must show the intended victim what they want them to see. So at this point, a flattened image of the email or web page is captured and sent for visual processing. Visual signals are identified based on the rules and requirements of each platform. Brands can be identified, trigger words flagged, high-risk objects highlighted (forms and buttons, etc.) and entire emails/pages compared to known good or known bad versions that they match.
When combined with traditional programmatic analysis, this is a powerful addition to any phishing detection workflow.
A bad actor’s goal in a brand spoofing attack is to create sufficient trust in, and/or confuse, the recipient of an email or visitor to a web page, such that they take the desired action of giving up sensitive information (login credentials, etc.) or clicking on a poisoned link that then downloads malware. In traditional spoofing attacks a scammer masquerades as another entity. That may be an individual that the intended victim knows or a URL, such as googIe.com (the L in this example is actually a capital i, which looks like a lowercase L), or they may use IP, ARP, and DNS spoofing, among others. However, brand spoofing goes further by using trusted visual cues, such as a company logo and other graphics associated with the company being spoofed. In extreme cases, they will go to great lengths to create ‘pixel-perfect’ copies of emails and web pages.
Brand spoofing can occur across many channels, including email, web, social media, print and even telephone, but it is typically associated with a visual form of impersonation. It can also be used early in a campaign, during the social engineering phase, or late in the campaign, once a specific victim is targeted with a precise attack.
Brand impersonation attacks are a specific type of attack where a bad actor targets a victim using another company’s brand. For instance, they may target many people, in a ‘spray and pray’ attack that uses a courier’s logo (like FedEx) or a streaming service’s branding (like Netflix), or it may be a very targeted attack where they know the victim uses Microsoft or Google services and send them a very believable fake login request email linked to an equally believable but fake login web page.
Brand spoofing can occur across many channels, including email, web, social media, print and even telephone. It can be something as simple as spoofing an email address or URL or as complex as developing a pixel-perfect copy of a web page or email. However, it is typically associated with a visual form of impersonation.
Attack vectors, in the general sense, encompass any technique used to penetrate a network or system, or to deceive an individual. Visual or graphical attack vectors are a specific form of attack where graphics are weaponised in order to confuse victims and/or deceive detection systems. There are many diverse ways graphics can be weaponised, as highlighted in the answers to the next set of questions.
The most obvious method is to include a company logo in your email, webpage or social post. In some cases scammers will use a modified version of the logo or an older version, which may be enough to fool less capable computer vision solutions.
Another way is to include authoritative marks (whether genuine or fake ones). For instance, a padlock icon, or an ‘SSL Secure’ or ‘Certified Partner’ logo or icon, is often enough to convince a victim that the communication is safe and genuine. In fact, a high percentage of respondents in research surveys have indicated that they consider icons used within the body of content more genuine than a padlock icon in the address bar!
Finally, the simple use of a favicon in the tab of a web page is often enough to convince a victim of a site’s legitimacy.
Combining several of these elements, and even creating a ‘pixel-perfect’ copy of a legitimate email or web page, has been shown to be highly effective in building trust among recipients of emails and visitors to web pages.
Scammers are very knowledgeable about the techniques cyber security companies use to detect fraudulent communications. They know how the various systems work and the types of programmatic scanning techniques used to detect their communications. They therefore identify various aspects that would be prone to detection and use graphics as the alternative vehicle for the attack. Examples are:
Converting keywords to graphics:
They will convert key ‘trigger’ words in the content to a graphic. For instance, words like ‘Username’, ‘Password’, ‘Login’, and ‘Credit Card Number’ will be converted from readable text to a JPG or PNG, but in such a way to be indistinguishable from the normal text to the user.
Sections Converted To Images:
Rather than converting just a word at a time, they may convert an entire form to a graphic, overlaying the input fields above the graphic. In some cases, they will convert the entire email or site into a single graphic.
URLs Converted To Graphics:
Similar to key ‘trigger’ words, a genuine URL is converted to a graphic. However, a link is then attached to the graphic that points to the fake site. In this way a user may see www.paypal.com, but the link behind the image will point to www.paypa1.com.
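As a rough illustration of how this mismatch could be surfaced once the visible URL has been read from the rendered image, here is a minimal Python sketch. The function names are hypothetical and it assumes the displayed text has already been extracted (for example by text detection on the flattened image); it is not a description of VISUA's implementation.

```python
from urllib.parse import urlparse


def hostname(url: str) -> str:
    """Return the lower-cased hostname of a URL, tolerating a missing scheme."""
    if "://" not in url:
        # urlparse only populates the host when a scheme is present.
        url = "http://" + url
    return (urlparse(url).hostname or "").lower()


def link_mismatch(displayed_text: str, href: str) -> bool:
    """True when the URL the user sees differs from the host the link targets."""
    return hostname(displayed_text) != hostname(href)


# The user sees the genuine address, but the link points elsewhere.
print(link_mismatch("www.paypal.com", "http://www.paypa1.com/login"))
# The visible address and the link target agree.
print(link_mismatch("www.paypal.com", "https://www.paypal.com/signin"))
```

A mismatch on its own is only a signal; it would normally be passed on for scoring alongside other signals rather than treated as a verdict.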
Add Visual Noise:
There are far too many techniques to list in full here, and unfortunately the list is growing longer, and the techniques more elaborate and sophisticated, thanks to the use of AI by scammers.
However, a few specific techniques are:
If you want to hide the word ‘Login’ from detection, add random characters between each letter, which are then removed by a script at runtime. So the code shows ‘L8dgfhoSt5s3gsktfhilpq3dn’, where the letters L, o, g, i and n are each separated by five random characters.
They use botnets to create thousands of text and header variants that are extremely difficult to distinguish as fake.
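To show why this defeats a keyword search on the source but not on the rendered result, here is a small Python sketch. In the example string above the real letters happen to sit at every sixth position, so `rendered_text` simply keeps those; it is a stand-in for whatever script an attacker would actually use, and both function names are hypothetical.

```python
padded = "L8dgfhoSt5s3gsktfhilpq3dn"


def rendered_text(source: str, step: int = 6) -> str:
    """Stand-in for the attacker's runtime script: keep every `step`-th character."""
    return source[::step]


def contains_trigger(text: str, triggers=("login", "password")) -> bool:
    """Case-insensitive check for any trigger word in the given text."""
    lowered = text.lower()
    return any(trigger in lowered for trigger in triggers)


# Searching the raw source finds nothing suspicious.
print(contains_trigger(padded))
# Searching what the user actually sees finds the hidden word.
print(contains_trigger(rendered_text(padded)))
```

The same idea applies regardless of how the padding is removed: analysing what is finally displayed sidesteps the obfuscation entirely.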
Delivering a high volume of sophisticated and legitimate-looking emails can overwhelm a detection system, or more accurately, the humans who make the ultimate decisions. That gives your attack a better chance of being allowed through.
Short-life / Single-Use URLs
Blacklists were the standard approach for deciding the legitimacy of a web page/site. Bad actors therefore adapted technologies to allow the creation of short-life and even single-use URLs that exist for such a short time that they never make it onto any blacklists.
Frequency & IP-Based Substitution
Programmatic checks take time and resources, so typically an email or webpage will be checked once, or a limited number of times. Bad actors therefore use methods such as serving the genuine page the first time a URL in an email is visited, but substituting the spoofed page thereafter. Alternatively, they will track the IP address ranges of specific target companies and only serve spoofed content to requests from those ranges; all other requests are served genuine content (on the basis that they are likely to come from a detection system checking the emails and web pages).
Add a number of these techniques together and combine it with an overloaded and alert-saturated system, and gaps will appear that allow these attacks to get into victim inboxes.
Phishing and fraud detection systems rely on signals and triggers derived from the data they process using a combination of technologies. Once the data has been examined, a decision engine makes a determination based on the volume, types, and combination of signals.
But the ability to make the right decisions and eliminate as many false positives and false negatives as possible relies on maximising the number of signals available for analysis. Visual-AI as a component in any phishing detection workflow delivers key visual signals that can be layered on top of other signals, derived from other sources, to drive more accurate decision-making. This is illustrated in the simple workflow image below, where Visual-AI sits alongside rules-based systems and supervised/unsupervised ML systems.
This approach has multiple benefits in terms of how to exploit these signals:
1) Prioritise Analysis Based On Visual Content:
For example, if you see an email for a high importance user (e.g. CFO) and you detect the logo of a bank, you can prioritise that email for deeper analysis.
2) Parallel Process
Simply use Visual-AI alongside the other technologies and Mux the signals to determine a threat score.
3) Post Analysis Visual Processing
Use your standard approaches and only use visual processing for emails and sites where the risk cannot be accurately determined programmatically.
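The first two approaches above can be sketched in a few lines of Python. Everything here is an illustrative assumption: the `Signal` structure, the weights, the role and logo names, and the choice of a ‘noisy-OR’ formula for combining signals. A real decision engine would use its own scoring logic.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    name: str
    source: str    # e.g. "visual", "rules" or "ml"
    weight: float  # assumed risk contribution between 0 and 1


HIGH_VALUE_ROLES = {"CFO"}    # hypothetical list of high-importance users
BANK_LOGOS = {"ExampleBank"}  # hypothetical logo labels


def prioritise(recipient_role: str, detected_logos: list) -> bool:
    """Approach 1: fast-track emails for key users that contain a bank logo."""
    return recipient_role in HIGH_VALUE_ROLES and bool(BANK_LOGOS & set(detected_logos))


def threat_score(signals: list) -> float:
    """Approach 2: combine signals from all engines as 1 - product(1 - weight)."""
    remaining = 1.0
    for signal in signals:
        remaining *= 1.0 - signal.weight
    return 1.0 - remaining


signals = [
    Signal("bank_logo_detected", "visual", 0.5),
    Signal("recently_registered_domain", "rules", 0.5),
]
print(prioritise("CFO", ["ExampleBank"]))
print(threat_score(signals))
```

With this formula each additional signal can only raise the score, which reflects the idea that visual signals add to, rather than replace, the signals from other engines.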
We combine 4 key technologies with our visual phishing detection workflow as highlighted in the image below:
Logo / Mark Detection
It is the combination of all these technologies and the way they are applied that makes VISUA such a powerful ally in phishing and fraud detection.
Brand spoofing has grown rapidly in recent times, accelerating especially during the COVID-19 era, with scammers using the branding of many organisations for the first time, such as the CDC and WHO, along with staples like the logos of tax offices, banks, social media and entertainment services.
A fit for purpose logo detection API can highlight every time a brand is detected, allowing key decisions to be made. For example, if you see an email for a high importance user (e.g. CFO) and you detect the logo of a bank, you can prioritise that email for deeper analysis. Or every email and web page with a CDC logo can be quarantined immediately.
But Marks and Icons are also used by scammers and these can also be tracked.
This highlights that not only do you need an effective logo detection technology that can be accurate at scale, but it must also be able to:
VISUA provides instant logo learning and has an unlimited library size, with over 100K logos and marks already available.
Bad actors will use many code-based and graphical techniques to obfuscate objects, such as converting buttons and forms to graphics or using scripts to only render forms when a user’s browser is detected.
Using object detection after rendering the email or web page can detect these elements and highlight them so that any anomalies can be tracked and investigated.
Many forms of text obfuscation can be used by bad actors, from simple misspellings (such as ‘Logln’, which uses a lowercase L instead of an I), to padding words with random characters that are removed at render, and even converting words to graphics.
By capturing the fully rendered email/web page as a graphic and running text analysis, any strange anomalies can be detected and highlighted. This analysis can capture all the text or simply look for trigger words that could indicate risk, such as ‘password’, ‘login’, ‘payment’, etc.
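As a simple sketch of the comparison this enables, the Python below flags trigger words that appear in the rendered image's text but not in the text extracted programmatically from the code. It assumes both texts are already available as strings; the function names and the trigger list are illustrative only.

```python
import re

TRIGGERS = {"password", "login", "payment"}


def words(text: str) -> set:
    """Lower-cased alphabetic words found in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def hidden_triggers(source_text: str, rendered_text: str) -> list:
    """Trigger words a user can see that programmatic analysis did not find."""
    return sorted((words(rendered_text) & TRIGGERS) - words(source_text))


source_text = "Please enter your details below"
rendered_text = "Please enter your Password and Login details below"
print(hidden_triggers(source_text, rendered_text))
```

Any word returned here was visible to the user yet hidden from programmatic analysis, which is itself a sign that some form of obfuscation is in use.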
Visual Search can detect visually similar images to reference images in the library. If you want to protect a login or payment page, or indeed a key email that’s sent to customers/users, simply render and save the page or email as a graphic at all common resolutions, and save it to the library. The system can then look for and flag any images that match those in the library. This gives an early warning system the moment any spoofed copies of emails or pages are created and begin disseminating. This is what we call ‘KGI’ (Known Good Images).
Importantly, this can also be used in reverse to track ‘KBI’ (Known Bad Images) to track the spread of common spoofed pages and phishing emails. Simply add a proven phishing email/page to the library and Visual Search will flag every time it appears in processed images.
Additionally, ‘Image Matching’ controls can be applied, allowing the system to look for only exact matches or also detect close matches. By reducing the percentage match, the system can highlight variants and new adaptations of known good and bad images.
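To make the idea of a percentage match concrete, here is a toy Python sketch that uses average hashing, a generic perceptual-hash technique chosen purely for illustration and not VISUA's matching method. Images are represented as tiny grids of greyscale values; in practice a rendered page would first be downscaled to a small fixed size. All function names are hypothetical.

```python
def average_hash(pixels: list) -> list:
    """One bit per pixel: 1 if brighter than the image's mean, otherwise 0."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def match_percent(hash_a: list, hash_b: list) -> float:
    """Percentage of hash bits that agree between two same-sized images."""
    same = sum(a == b for a, b in zip(hash_a, hash_b))
    return 100.0 * same / len(hash_a)


def is_match(candidate: list, reference: list, min_percent: float = 100.0) -> bool:
    """Exact matches only by default; lower min_percent to catch close variants."""
    return match_percent(average_hash(candidate), average_hash(reference)) >= min_percent


reference = [[10, 200], [200, 10]]       # stored known good (or known bad) image
recompressed = [[12, 198], [201, 9]]     # same layout with slight pixel noise
variant = [[10, 200], [200, 200]]        # one region has been changed

print(is_match(recompressed, reference))             # survives minor noise
print(is_match(variant, reference))                  # rejected as an exact match
print(is_match(variant, reference, min_percent=70))  # caught as a close match
```

Lowering the threshold trades precision for coverage: more adapted copies are flagged, at the cost of more candidates to review.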
Blacklists used to be the main weapon against spoofed, phishing and malicious web pages. But with the advent of short-life and single-use URLs, blacklists have become far less effective in the detection and blocking of these sites.
The key challenge is that identifying fraudulent and spoofing sites takes time, which can sometimes be longer than the lifespan of the URL! The use of computer vision (Visual-AI) allows critical visual signals to be highlighted, so that pages can be quarantined immediately if specific logos, text or objects (such as login forms) are discovered, or if the page matches a known good or bad page.
Absolutely not. In fact the term ‘visual phishing detection’, although descriptive of what it is helping to do, is nonetheless inaccurate. Computer Vision provides additional signals, which in specific circumstances can be vital in the detection and blocking of phishing attacks. However, the Visual-AI engine does not make the determination as to what is, or is not, a threat. Instead these signals are fed into the decision engine, along with other signals derived from Signature Analysis and other forms of supervised and unsupervised ML.
As such, computer vision can be thought of as an additional engine, combining a set of specific features, that sits alongside other modules/engines to enhance threat scoring.
Not at all. The entire process, from rendering to visual processing, takes a fraction of a second and runs in parallel with traditional programmatic analysis. Users will not be impacted in any way by this additional analysis.
There are many reasons why some computer vision providers are better than others. It would be outside the scope of this answer to list them all, but the key reasons are as follows:
Yes, you may well be able to build this yourself. But the key question is: what are the pros and cons of doing so? You could build a CRM system for your business, but in virtually every case you’ll instead use HubSpot, Salesforce or similar. Building your own CRM, although technically possible, provides no major advantage to your business.
Similarly, building your own computer vision solution, specifically to detect visual signals for phishing detection, also doesn’t make sense for the following reasons:
Great question. This is another quite unique offering from VISUA. Deployment can be implemented in the cloud, on-premise or even a combination of cloud and on-premise if required.
We like to think that our Visual-AI (computer vision) API is very easy to implement as part of any workflow. In fact, in most cases implementation takes as little as two hours. We also have very clear API documentation. But we are not simply an API provider, so do not hesitate to get in touch with any questions you may have. We also implement a very thorough onboarding process and as a client you will have direct access to our team for any ongoing support questions.
Yes, you can find very clear API documentation for our Logo Detection endpoint, or indeed any of our other technologies. You can find all relevant documentation here.
Absolutely! Unlike other solutions on the market that charge significant fees for support, or force you to reach out to third-party consultants, VISUA is proud to be much more than simply an API provider. You can get in touch with any questions you may have during your research and feasibility stage. We also implement a very thorough onboarding process and as a client you will have direct access to our team for any ongoing support questions.
Unlike other providers, VISUA is not simply an API company. As such, we don’t provide ‘support packages’. Support and guidance, both pre- and post-implementation, are part of our DNA. In other words, if you need help with our tech or have questions, we’re here to provide the answers.
Our comprehensive API makes integrating our technology easy and fast, and our solution can be provided as-a-service or on-premise to maximise speed and security. Unique use cases are also no problem thanks to the flexibility of our technology and team. This is a unique and effective solution to a growing problem and our AI is like no other you’ll have used or seen – so, reach out to discover what Visual-AI can do for you.
Render the web page or email, save it as a flattened image and send it to our engine for processing.
The page is analyzed and high-risk elements or attributes are identified and flagged.
A risk score is calculated and passed, with the identified anomalies, back to the master phishing detection system for final actions.
All this happens in a second or less!
“VISUA is a delight to work with. They approached our use case with technology that worked, a proactive approach to deep-diving all issues, and a willingness to be nimble and adjust to our needs on the fly.”
Seamlessly integrating our API is quick and easy, and if you have questions, there are real people here to help. So start today; complete the contact form and our team will get straight back to you.