Names of many things, as this article is going to explore, are fraught. What Arachne Digital refers to as a CTA, or cyber threat actor (before we even get into the multiple aliases a CTA can have), can be referred to as different things. Groups, threat actors, activity groups, campaigns, intrusion sets, and more are all terms that some use, and all these terms have subtly different meanings. But the thing they all have in common is that they are the actions of people or things created by people, using computers for malicious purposes. Ultimately, Arachne Digital cares about these people, reflected in the use of the term CTAs, rather than a more data-focused term, such as intrusion set. The reason for this focus will be outlined below.
To start with, why do we need names for CTAs at all? Names get us one step closer to attribution, and that is messy. Can’t we just share indicators of compromise (IoCs) to defend ourselves?
You can and should consume and share IoCs, but names have utility that can be an asset to defenders.
There is a great deal of literature on the power of names, from personal names to the names of groups, and the names of other things such as objects and events. For the purposes of cyber threat intelligence (CTI), names give us agency over something. When naming a CTA, we can define what that name entails, be it specifically the activity you or your team have seen or information in the public domain. We can track the CTA because we have now defined what it is and what it is not, and we can now communicate about the CTA.
Dissemination is one of the most important aspects of the intelligence cycle, so the value of communication cannot be overstated. Collectively, tracking and communicating about CTAs enables defenders to take actions against CTAs, so names are arguably one of the cornerstones of defence.
Singular names might be useful, but then why do we need multiple, often confusing names like APT28 and Fancy Bear?
One organisation may have visibility of some malicious activity. They might be able to see the IP addresses that malicious traffic came from, the commands the CTA ran within the target environment to download malware from the command and control (C2) server, and the malware binary that was downloaded. They assign a permanent or liminal name to the CTA that might be revised once more information comes to light. Regardless, this name enables the tracking of this CTA.
However, another group has a completely different tool set and different visibility. They see the same C2 server, but they cannot see what the C2 server delivered or what the CTA did in the victim environment. However, they can see other connections to the C2 server and connections that the C2 server has made to other victim environments. They give a different name to the CTA and start tracking it. These two groups are tracking the same people, sitting at computers, performing malicious activity, but they do not know it. Different variations of this example can happen tens of times, leading to one CTA having several aliases.
But if both groups are talking about the same CTA, why don’t they just share information to figure that out and settle on a standard name? There are two parts to this question, the aspect of sharing and the aspect of coming to consensus.
Maybe information cannot be shared as the group does not want to reveal their detection capability, which would let CTAs tailor their attacks to avoid detection. Perhaps information cannot be shared as the group does not want to disclose publicly that they have visibility of a CTA. If this were public knowledge, it would allow the CTA to change their behaviour to avoid being tracked. Maybe the information the group has learned is sensitive, as it relates to a customer and commercial agreements mean the group cannot reveal the information. The information might simply be classified and therefore legally cannot be shared.
Different groups will have different methodologies for how they collect data, process information, and produce intelligence. Consequently, two groups might not agree that their intelligence reflects the same thing. Two different groups might also have different requirements. One group may care about tactical intelligence, granular IoCs, as their remit is the day-to-day defence of an organisation. Another group may care about strategic intelligence, as they advise on government policy around setting international norms. These two groups may not have the skills or capability to validate the other’s work to confirm they are indeed talking about the same group.
This is commonly seen with ransomware groups. One CTA may gain initial access to a victim organisation, then another may take that access and start moving laterally through the environment before deploying ransomware. A third might handle the negotiations with the victim organisation and manage the infrastructure that the communication and storage of potentially stolen data relies on.
This can also be seen among government-backed advanced persistent threats (APTs). These may have multiple CTAs with different tasking from their parent organisation, performing different malicious actions, but they may share malware and infrastructure. The group that gained access to an environment might not be the one that performed actions on a target. The same malware or C2 connection seen in two different places might not mean that one CTA was responsible for both attacks. These may be things that groups attempting to share intelligence are not able to validate for various reasons, or simply cannot agree on.
Where this leaves us is that multiple different groups with different visibility and different methodologies need to track CTAs. They name what they can see, understand, and validate, which is different from what other groups can see, understand, and validate. While we may not be able to come to consensus on a single, all-encompassing naming convention, groups can individually build a strong understanding of the activities they can see. One person’s Nobelium is not another’s Blue Kitsune.
Note: Since this article was written, Nobelium has been renamed by Microsoft to Midnight Blizzard.
Nobelium (Midnight Blizzard) and Blue Kitsune have very distinct lists of IoCs, and Tactics, Techniques and Procedures (TTPs), that constitute all we know about each one. The two bodies of information might overlap, but they are not one for one for all the reasons above, and therefore go by different names.
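The partial overlap described above can be made concrete with set operations. The following is a minimal sketch in Python, assuming each name is backed by a flat set of indicator strings. The indicator values are invented placeholders taken from reserved documentation ranges, not real IoCs for Nobelium or Blue Kitsune, and `compare_datasets` is a hypothetical helper rather than part of any Arachne Digital tooling.

```python
from __future__ import annotations

# Placeholder indicators only: reserved documentation addresses and domains,
# NOT real IoCs associated with either name.
nobelium_iocs = {"192.0.2.10", "192.0.2.44", "c2.example.com", "loader.example.net"}
blue_kitsune_iocs = {"192.0.2.44", "c2.example.com", "198.51.100.7"}


def compare_datasets(a: set[str], b: set[str]) -> dict:
    """Summarise how two indicator sets overlap (hypothetical helper)."""
    shared = a & b
    union = a | b
    return {
        "shared": shared,          # indicators both groups have seen
        "only_a": a - b,           # visible to the first group only
        "only_b": b - a,           # visible to the second group only
        # Jaccard similarity: size of intersection over size of union.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


result = compare_datasets(nobelium_iocs, blue_kitsune_iocs)
print(sorted(result["shared"]))
print(result["jaccard"])
```

With these placeholder values the two datasets share two of five distinct indicators. A score between 0 and 1 like this is what "overlapping but not one for one" looks like in data, and it says nothing by itself about whether the same people are behind both names.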
That is not to say that this is a good situation. The merits of various points, such as why intelligence might not be shared or consensus might not be reachable around attribution to specific CTAs, can be argued. Not all the points above in all circumstances are good reasons, but the reasons still exist no matter what your opinion of the reason is.
Now that the necessities have been outlined, the problems this causes will be explored.
It is very hard as a defender, when reading a new blog post, to even be sure the CTA you are reading about is the CTA you think you are reading about. This is because every CTA can have multiple aliases. Arachne Digital collects information on upwards of 900 CTAs at time of publication. Many groups have only been seen once or twice and might only have one name, but some have ten or more aliases. That is potentially a thousand or more aliases that defenders need to keep straight. This impedes tracking, communicating about and acting against CTAs.
In talking about the intelligence cycle, The Grugq notes, “If you have poor collection but excellent analysis and dissemination you will beat the snot out of someone who has great collection but has no analysis or dissemination”. This is because dissemination, getting intelligence into the hands of people who can use it, and communicating it in a way that they can make use of it, is what drives action. Collection does not drive actions by itself.
Even with poor collection, a clear and timely message about what little was collected will still generate an action. Conversely, a confusing and/or late message about a large volume of information might not generate an action at all. Even worse, it might generate a detrimental action because the intelligence could not be understood. A group’s visibility, which largely dictates what they collect, drives different CTA aliases. The situation of multiple aliases is considered necessary, but also confusing. Like it or not, this confusion is at the expense of analysis and dissemination, that is, being able to understand and communicate clearly about CTAs, which enables defenders to act.
That is not to say understanding the nuances of different aliases and communicating about them is impossible. But the complication of multiple aliases often does not help. The exception might be the perfect scenario where the analyst, the person communicating the intelligence, and the consumer who wants to use this information, all have very specialised knowledge and have the space to have a nuanced discussion. Multiple aliases defining overlapping but slightly different datasets shine here. But in the middle of incident response, someone might not have time for that. Even in a normal day job someone might not have time for that.
This can be particularly problematic for journalists. Journalists play a role in the intelligence process. They often try to find a common thread between multiple events to provide insight (which is analysis). Then, they provide the information to their readership (which is dissemination). Multiple competing names really impede their work. Often you will see a news piece talk about a specific CTA by a specific name. The article may even list some aliases cited in the report the news story is based on. Then, the same outlet may put out another piece about the exact same CTA by another name. Again, it will list some completely different aliases, and the outlet will have no idea the two stories are about the same group. That is relevant information that is lost.
An outsider could completely reasonably assume APT28 (aliases Sofacy, Strontium (now Forest Blizzard) and Sednit), and Fancy Bear (aliases PawnStorm, Iron Twilight and Blue Athena) do not both refer to the same CTA. They have different names; how could they be the same? A journalist often is not going to compare IoCs and other pieces of information to start stitching the underlying patterns together. That is far, far too time consuming for most news cycles. Why should an outsider bother going to all that effort when the CTI community itself often has not bothered to do so?
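The stitching work described above can be sketched in a few lines. This is a minimal illustration in Python, assuming alias lists are available as plain lists of names; the alias lists are the ones quoted above, while `merge_alias_groups` and the variable names are hypothetical. The point it shows is that the two reports only collapse into one group once somebody supplies a link between them.

```python
def merge_alias_groups(groups):
    """Merge alias lists that share at least one name (case-insensitive)."""
    merged = []
    for group in groups:
        current = {name.casefold() for name in group}
        # Find every existing group that shares a name with this one.
        overlapping = [m for m in merged if m & current]
        for m in overlapping:
            current |= m
            merged.remove(m)
        merged.append(current)
    return merged


# Alias lists as they might appear in two separate reports.
report_one = ["APT28", "Sofacy", "Strontium", "Forest Blizzard", "Sednit"]
report_two = ["Fancy Bear", "PawnStorm", "Iron Twilight", "Blue Athena"]

# With no shared name, the two reports look like two different CTAs.
unlinked = merge_alias_groups([report_one, report_two])

# One extra statement linking a name from each report joins them.
linked = merge_alias_groups([report_one, report_two, ["APT28", "Fancy Bear"]])

print(len(unlinked), len(linked))
```

The merge itself is trivial. The hard part, as the article argues, is producing and publishing the linking statement in the first place.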
CTI is not a static field. Especially around cybercrime, new groups are appearing almost weekly. Another complicating factor is liminal names. Many groups track CTAs by their own naming convention, but they also have different levels of naming conventions that they use to classify specific intrusions and campaigns. Later, these will be merged when there is enough evidence to suggest different names all point to one CTA, and the overall name can change, potentially multiple times. Liminal names can be great for specificity around datasets, particularly internally to a group where they have the entire context of what the liminal name refers to. However, to external groups the complexity compounds the existing issues explored in this article.
Liminal names are necessary. If analysts are not careful at early stages of tracking, misattribution can occur, which leads to intelligence being flat wrong rather than just hard to decipher. Liminal names should be shared where possible. If groups do not share otherwise shareable CTI, it would be a step backwards for our industry and those we serve. But make no mistake, an ever-shifting list of multiple aliases with varying tiers of naming differing from group to group is incomprehensible to most.
If one group puts forth a naming convention, another group cannot use that same naming convention as the second group does not have oversight of what the first group is tracking. This could lead to the same names being given to different things. This problem then becomes even more complex with liminal names, making sure there are no naming collisions and then agreeing on their graduation between different stages of naming conventions. You cannot open source naming conventions as they stand.
Without a global body that serves the entire world-wide community, we will never have interoperable names. Such a body, something akin to the Internet Corporation for Assigned Names and Numbers (ICANN), would also have to serve adversary nations. The body would oversee the open sourcing of datasets and the naming of datasets.
The concept of open source is powerful. By giving over things such as the HTTP protocol to the public domain humanity received the internet, the world’s largest shared resource. Building on what has come before and continuing to break new ground yields incredible progress.
But as outlined, there are intrinsic aspects about how CTI is currently performed that run against the practice of open source.
This is not surprising given the close connection of CTI to military and national security circles. Animals are a common name for CTAs. The vast majority of these are either powerful or predatory animals, real or mythical, and often an animal that has national significance to the country a CTA is attributed to. CTI uses frameworks like kill chains and MITRE ATT&CK. Names can be designed to invoke fear, and a cynical mind might suggest this is a deliberate marketing ploy to drum up concern and therefore demand for security products.
Some naming conventions use colours which might seem innocuous, but again often have national significance, and the vast majority are vivid primary colours. And there is a flipside of this situation that is often not considered. What do the CTAs themselves think of these nationalist monikers and the associated scary imagery?
There is no evidence of this, but it is possible, potentially even probable, that CTAs adore these names. Names are powerful, and we should not embolden our adversaries or reinforce nationalistic imagery.
In short, multiple names and aliases for CTAs must exist, but this makes the CTI field wildly complex and nuanced. Perfection is striven for in understanding datasets and CTI would often rather understand a problem perfectly than communicate a problem simply. CTI understands data, but can forget the human adversary and human audience.
Understanding the realities of CTI, along with understanding the complication those realities impose, Arachne Digital uses the below principles to guide naming CTAs. This is not to say Arachne Digital is breaking new ground, but it is important to state intentions around how this contentious issue is handled.
Arachne Digital uses its own naming convention. While more names make things more complex, for the above-mentioned reasons a naming convention is necessary.
Arachne Digital has been deliberate in its choice of names, avoiding common conventions but also avoiding predatory and nationalistic imagery. Names have been chosen to convey a softer, and potentially humorous tone.
The Arachne Digital naming conventions consist of a random word followed by a listed suffix:
A new name will not be given to every single group and Arachne Digital names are used as a last resort. A great deal of the data Arachne Digital works with is data in the public domain; therefore, existing names can be used. Where aliases exist, and it is generally agreed that names refer to a specific group of real-world adversaries with no overlap (for example, NOT Winnti or Lazarus Group), Arachne Digital will try to align on the name with the greatest ‘brand recognition’. This will support some modicum of industry standardisation.
Arachne Digital will give names to intrusion sets, or CTAs if there is a general understanding that several intrusion sets are converging on a CTA, where names do not already exist. Arachne Digital will also give names where CTA names become synonymous with malware or tools, as this is a trend Arachne Digital wants to push back against.
For example, for ransomware attacks attributed to the ‘Conti group’, or just ‘Conti’, are we talking about the group itself or the malware that performed the encryption? Did Conti use Conti, or did Conti use other ransomware in the attack? Could the attack be any one of possible multiple initial access brokers or affiliates? In this situation, a name will be designated for specificity.
Arachne Digital names should also be treated as liminal. As understanding about CTAs evolves, groups may be merged or the Arachne Digital name may be dropped in favour of a name that has broad industry recognition.
Arachne Digital believes in the ideals of open source and will make every attempt to open source code, naming conventions, and anything else where it is appropriate to do so.
To this end, and to assist in navigating Arachne Digital’s naming conventions, we are open sourcing our internal CTA tracker, called Spindle. This document, capturing different names, aliases, and connections, is not a perfect solution to all the above-mentioned problems, but Arachne Digital believes some information is better than none. Contribution and usage guidelines for Spindle have been published and will be updated as Arachne Digital continues to build out.
Yes, we said usage guidelines. Our naming convention is open source.
Arachne Digital understands why many would not want to use the naming convention of a for-profit company. However, if you do want to use our naming convention, we make every attempt to ensure the community can use it. If you have any questions about Arachne Digital’s approach to naming CTAs, please reach out to us on contact at arachne dot digital. If Arachne Digital has questions about naming CTAs, we will make best efforts to reach out to analysts to discuss.
Arachne Digital believes in standardisation where possible, and as such supports widely accepted frameworks like MITRE ATT&CK. If an ICANN-like body were to emerge to drive wider standardisation, Arachne Digital would support this.
Where multiple names do not actually serve to explain nuance in different datasets, please drop extraneous names. This is a call that some in the CTI community have made, and one that Arachne Digital supports. There are many instances of CTA names, tools, and malware, where organisations will openly state, “We call this CTA/tool/malware X, but it is known by the community as Y.” If X and Y are the same, please just drop X.
You may lose some prestige and market awareness by not having your name in the mix every time Y CTA is reported on. However, you are making the lives of every single person that touches CTI globally that much easier. There is a direct financial value for you in being able to communicate clear and actionable intelligence.
On the flip side, do make the differences explicit if X and Y are different. A group might not be able to share the ins and outs of how they came to that conclusion, but they can say X and Y are different. If a group can share CTI, please make it clear how the dataset with its own name differs from other datasets it may overlap with.
Many thanks to Katie Nickels, Florian Roth and The Grugq for their contributions to the field of CTI, much of which was referred to when writing this article.