Digital Forensics and the Importance of Thinking Like a Malicious Adversary

Introduction

Cybercrime continues to grow in both volume and sophistication, and forensic investigations face a multitude of challenges and limitations. Publicly available encryption and Artificial Intelligence (AI) libraries, together with the vast quantity of malicious code in circulation, make it increasingly difficult to protect networks and systems against attack and impede subsequent attempts at Reverse Engineering (RE). The TCP/IP protocol suite, which is the underlying fabric of most networks, is inherently insecure, with security flaws in numerous protocols regardless of implementation (Albandari, Bedour, Samina & Zahida, 2017). The protocols were developed to allow internetworking with little concern for security and can be manipulated to obfuscate malicious activity and hinder attempts at attribution. Typically, not all network traffic is logged, which further complicates network analysis. Cyber criminals leverage anti-forensic techniques to distort investigations by attacking forensic tools or by deleting or encrypting the digital evidence (Yaacoub, Noura, Salman & Chehab, 2021). Retrieving digital evidence from diverse devices and systems is increasingly challenging, not only because of the array of hardware, firmware, Operating Systems (OS) and software but also because of the huge volumes of data that can be generated (Caviglione, Wendzel & Mazurczyk, 2017). Cloud infrastructure adds more layers of complexity, with different technologies used for different services and jurisdictional issues where the hardware is physically located in another country (Manral, Somani, Choo, Conti & Gaur, 2020).

For all these reasons, sophisticated and, to a lesser extent, even inexperienced malicious adversaries have an advantage over forensic investigators. The aim of this report is to illustrate that, in this contest, the forensic investigator can reduce that edge by thinking like the attacker.

This report begins with a review of the challenges faced by digital forensics, followed by detailed case studies that introduce an approach allowing the investigator to think like the attacker and focus on particular aspects of the attack. This increases efficiency and eliminates low-probability lines of inquiry during an investigation. The Adversarial Diamond Threat Model describes how an adversary can use a capability and infrastructure against a victim, resulting in an intrusion event. Further, the Cyber Threat Kill Chain (CTKC) provides a mechanism to define the actions an adversary might take at each step of an exploitation or intrusion.

Argument

With the increasing complexity and sophistication of Information and Communications Technology, the ability to think like a malicious adversary when performing digital forensics is advantageous. Powerful open-source AI frameworks and encryption libraries, along with readily available malicious code and easy access to anti-forensic techniques, have created a situation where even less sophisticated adversaries can use a complex set of Tactics, Techniques and Procedures (TTP). The proliferation of connectedness enabled by mobile devices, networking technologies, IoT, cyber physical systems and cloud services, all of which use different types of hardware, firmware, OS and software, can generate huge volumes of disparate data that is difficult to handle and can also allow malicious activity to remain hidden (Caviglione, Wendzel & Mazurczyk, 2017).

Modern forensic techniques include Stored Data and File System analysis, Reverse Engineering (RE) and Network Analysis but each has limitations.

The analysis of stored data and file systems requires the creation of forensic images to allow the analysis of files, installed applications and the inspection of remnants of deleted content. This process can be time consuming not only because of the size of volumes and data but also because of the different types of devices and implementations used with various OS, file systems and mixed technologies (Caviglione, Wendzel & Mazurczyk 2017).

RE is typically used in an investigation and can be attempted on malware binaries, network traffic, seemingly benign applications, and log files from the OS (Caviglione, Wendzel & Mazurczyk, 2017). Its effectiveness, however, is limited by the anti-forensic techniques the adversary has used. These may include code obfuscation, multistage exploit loading, encrypted payloads and SSH tunnels, as well as covert C2 and other sophisticated infrastructure (Blackberry Research & Intelligence Team, 2020).

The TCP/IP protocol suite was developed to enable internetworking with little regard to aspects of security such as authentication, confidentiality, and integrity (Alotaibi et al., 2017). Numerous protocols can be manipulated to both facilitate and obfuscate malicious activity and hinder attempts at attribution. Network traffic analysis is also affected by size and diversity: logs can be captured by various network appliances such as firewalls, intrusion detection/prevention systems, servers and security information and event management (SIEM) systems. These devices and systems are normally set to capture only particular events rather than all network traffic, which would overwhelm commodity hardware. Even capturing only certain predefined events can still produce huge volumes of data containing very different types of network traffic. Attempts to reverse engineer network traffic can be further complicated by technologies such as proxies, Tor and/or VPNs (Comer, 2014).

While each of these forensic techniques has its limitations, they are routinely used in typical computing and mobile device forensics. Cloud forensics, however, presents a new set of challenges (Liu, Singhal & Wijesekera, 2020; Manral et al., 2020). More layers of complexity are introduced because of the diverse technologies on offer, and the evidence can reside on the client side, the network side and/or the server side. Virtualization and distributed systems used by commercial cloud service providers make it challenging to determine the actual physical location of the hardware, and because the hardware may be in another country there can be issues with jurisdiction. Because of these constraints, research has mainly focussed on the client side and network side (Manral et al., 2020).

Adversaries use anti-forensic techniques and tools to delete or compromise the integrity of digital evidence, create a layer of misdirection and evade detection. A hex editor can be used to view and change a file signature, which can mislead investigators looking for a particular type of file. Timestamps are critical for establishing a timeline of events, and the touch utility or PowerShell can be used to modify them and so mislead an investigation. Various secure deletion tools such as Eraser are freely available; if the digital evidence is completely overwritten rather than just flagged for deletion, data recovery tools will not work. Malware authors routinely use encryption, obfuscation, polymorphism, anti-debugging and anti-VM techniques, which are all designed to resist attempts at RE (Kim, Yeom, Oh, Shin & Shin, 2021).
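The file signature manipulation described above can be checked for mechanically. The following is a minimal Python sketch, not a substitute for a forensic suite: it compares the leading bytes of a file with the signature expected for its extension. The signature table is a deliberately small, illustrative subset, and the function names are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Illustrative subset of well-known file signatures (extension -> leading bytes).
# A real investigation would rely on a far larger, maintained signature database.
MAGIC_BYTES = {
    ".png": b"\x89PNG\r\n\x1a\n",
    ".pdf": b"%PDF-",
    ".zip": b"PK\x03\x04",
    ".jpg": b"\xff\xd8\xff",
    ".exe": b"MZ",
}


def identify_signature(header: bytes) -> Optional[str]:
    """Return the extension whose signature matches the header, or None."""
    for ext, magic in MAGIC_BYTES.items():
        if header.startswith(magic):
            return ext
    return None


def signature_mismatch(path: Path) -> bool:
    """True when the extension is in the table but the header does not match it."""
    claimed = path.suffix.lower()
    if claimed not in MAGIC_BYTES:
        return False  # unknown extension: nothing to compare against
    with path.open("rb") as fh:
        header = fh.read(16)
    return not header.startswith(MAGIC_BYTES[claimed])
```

A file named report.pdf whose first bytes are MZ would be flagged, and identify_signature would suggest that it is really a Windows executable. A mismatch is only a lead for further examination, since an adversary with a hex editor can also forge the header itself.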

Cyber-attacks are constantly evolving. Given the sophisticated tool set available to malicious adversaries and the complexity of modern technology, it is very challenging to protect systems and networks against malicious attacks, and even more challenging to determine exactly what happened, and how, after an attack. Thinking like an attacker requires a technical understanding not only of the target systems but also of the TTP used by the attacker. The investigator must also understand the attacker's motivation, purpose, and intent, as these inform the decisions and steps taken during an exploitation. The ability to think like a malicious adversary increases efficiency by creating a focus that helps manage the huge quantities of disparate forensic data presented by investigations.

The Adversarial Diamond Threat Model can be used to help organise the various aspects of malicious activity. The model describes how an adversary can use a capability and infrastructure against a victim, resulting in an intrusion event (Caltagirone, Pendergast & Betz, 2013). The core features occupy the vertices of the diamond and are linked by edges that illustrate the relationships between them. The model helps define, and therefore understand, how attacks occur. Once the motivation and the technologies used become clear, they can inform defence as well as which forensic paths to follow, which helps manage the complexities outlined above. Figure 1 defines a range of Adversaries, Infrastructure, Capabilities and Victims. An as-a-Service adversary typically has a sophisticated infrastructure at its disposal, which could include Domain Generating Algorithms (DGA), complex SSH tunnelling capabilities and use of the Tor network for C2 management. Its capabilities could include custom malware using known payloads and specific post-infection multistage tools, as well as sophisticated original malware and post-exploitation tools. Its victims are likely to be Governments, Companies, and Infrastructure with no geographical relationship (Blackberry Research & Intelligence Team, 2020).
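One way to make the model operational is to record each intrusion event as the four linked core features and then pivot on a shared feature. The following Python sketch is an illustration under stated assumptions: the class and function names are hypothetical, and the victim labels are placeholders rather than details from any real case.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DiamondEvent:
    """One intrusion event described by the four core features of the model."""
    adversary: str
    capability: str
    infrastructure: str
    victim: str


def pivot(events: List[DiamondEvent], feature: str, value: str) -> List[DiamondEvent]:
    """Return the events that share the given value for one core feature."""
    return [event for event in events if getattr(event, feature) == value]


# Placeholder events; the victim names are hypothetical.
events = [
    DiamondEvent("espionage-as-a-service group", "SombRAT backdoor",
                 "Tor-managed C2", "company A"),
    DiamondEvent("espionage-as-a-service group", "CostaBricks loader",
                 "Tor-managed C2", "company B"),
    DiamondEvent("inexperienced individual", "ICA with known payload",
                 "single VPN endpoint", "home user"),
]

# Pivot on infrastructure to find events that may belong to the same campaign.
related = pivot(events, "infrastructure", "Tor-managed C2")
```

Pivoting in this way mirrors how an investigator moves from one known feature, such as a C2 server, to the capabilities and victims connected to it.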

Figure 1. Adversarial Diamond Threat Model

The adversarial threat model can start to define an adversary and provide some insight into their TTP. Further, the Cyber Threat Kill Chain (CTKC) can help define the actions they might take at each step of an exploitation or intrusion. The CTKC uses the Cyber Kill Chain at its core and outlines the phases a malicious adversary must progress through before achieving an outcome, which could include the breach of a trust boundary to violate the confidentiality, integrity, or availability of a system (Hutchins, Cloppert & Amin, 2011). The CTKC defines three phases, Acquisition, Attack and Advantage, that encompass the life cycle of an attack (Hartley, 2014). The Acquisition phase includes the set of activities the adversary must move through before launching an attack. The Attack phase closely follows the Cyber Kill Chain from reconnaissance to attack, and the Advantage phase details what the adversary might do after the attack. The model is used to examine malicious attacks and can help the investigator understand how the adversary was thinking and what steps and decisions they may have taken.
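The phase structure lends itself to a simple lookup that sorts forensic findings by where they fall in the life cycle of an attack. The Python sketch below is illustrative only. The seven stage names are those of the Cyber Kill Chain, but the way they are grouped under the three CTKC phases, and the extra "planning" stage, are assumptions made for this example and are not taken from Hartley (2014).

```python
from collections import defaultdict
from enum import Enum


class Phase(Enum):
    ACQUISITION = "Acquisition"
    ATTACK = "Attack"
    ADVANTAGE = "Advantage"


# ASSUMPTION: illustrative grouping of stages under the three CTKC phases.
STAGE_TO_PHASE = {
    "planning": Phase.ACQUISITION,
    "reconnaissance": Phase.ATTACK,
    "weaponization": Phase.ATTACK,
    "delivery": Phase.ATTACK,
    "exploitation": Phase.ATTACK,
    "installation": Phase.ATTACK,
    "command and control": Phase.ATTACK,
    "actions on objectives": Phase.ADVANTAGE,
}


def group_by_phase(findings):
    """Group (stage, description) pairs by the phase their stage maps to."""
    grouped = defaultdict(list)
    for stage, description in findings:
        grouped[STAGE_TO_PHASE[stage.lower()]].append(description)
    return dict(grouped)


findings = [
    ("installation", "CostaBricks loader downloaded"),
    ("command and control", "DNS tunnel to C2"),
    ("actions on objectives", "files uploaded to C2"),
]
by_phase = group_by_phase(findings)
```

Grouping findings this way shows at a glance which phases are well evidenced and which still have gaps, which is where the investigator may choose to focus next.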

Two attacks, a novel AI-powered malware and the CostaRicto campaign, are used to illustrate how the adversarial threat model and the CTKC can help an investigator think like the attacker and how that can be leveraged to create an advantage.

The novel malware is a Windows Image Classification Application (ICA) that combines a TensorFlow Convolutional Neural Network (CNN) with a known malicious payload and evades Windows Defender and Firewall (FW), Norton LifeLock Anti-Virus (AV), and its Intrusion Prevention System (IPS). It is built from three major components: a benign carrier application, a neural network, and a known malicious payload for which the AV engine has a signature. The CNN is used to detect a predefined target in an image; when that target is detected, the application decrypts, reassembles, and executes a Metasploit reverse HTTPS payload. Because the known malicious payload is encrypted and only decrypted if the target is identified, it evades the FW, the IPS, and both the static and dynamic detection of the AV. It will execute and establish a shell on an up-to-date Windows 10 machine with all of those defences in place. The payload is interchangeable, so other types such as ransomware could be used (Gaber, 2020).

The Blackberry Research and Intelligence Team (2020) reported on the CostaRicto campaign, which they classified as an Advanced Persistent Threat because of the sophistication of the malware tools and the supporting infrastructure, including complex VPN and SSH tunnelling capabilities. Importantly, this attack appears to be outsourced; that is, it is an example of espionage as a Service. Once access is gained, a complex network of SSH tunnels is established from the victim’s machine. CostaBricks, a custom VM-based payload loader, is then downloaded. A custom backdoor, interestingly named SombRAT, is then delivered. It can load other malicious payloads in the form of plugins or standalone binaries and can also carry out a number of actions such as collecting system information, listing and killing processes, and uploading files to the C2. The C2 servers are managed through Tor and a layer of proxies, and communication can use either DNS tunnelling or TCP sockets (Blackberry Research & Intelligence Team, 2020).

Tables 1 and 2 use the adversarial threat model and the CTKC to further define these types of attacks. Very specific technical details for both the ICA and CostaRicto are available and used in the models below. This helps highlight the various capabilities, motivation, and the technical details of the campaigns. The details the models define would not be used when investigating something completely new or otherwise unrelated but rather could be used by an investigator analysing a similar attack or exploitation. They provide a framework to examine the various aspects of an attack.

Table 1. Applied Adversarial Diamond Threat Model

Table 2. Applied Cyber Threat Kill Chain

Dismantling an attack allows the investigator to examine each of the components with greater focus. Consider the planning, weaponization and exploitation phases of the ICA. The motivation of the adversary is monetary gain, and an important aspect of the exploit is the ability of the malware to evade typical network and host defences. While the CNN contained in the executable is a black box and may be impossible to reverse engineer (Nugent & Cunningham, 2005; Stoecklin, 2018), there are other components in the executable that would allow a signature to be defined for use in defences such as AV. It is also conceivable that less experienced adversaries would make mistakes when assembling the executable from a known malicious payload, and these may leak information such as an IP address if a multistage payload is used. This IP address may allow attribution where a VPN or cloud server provider can be subpoenaed for the relevant records. The adversary would likely at least be cognizant of these risks but may not have the skill set to completely mitigate them. Armed with that knowledge, the investigator could begin by focussing on disk forensics and the forensics of configuration settings.
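A leaked address of the kind described above can sometimes be recovered with a plain string search over the executable or a disk image. The following Python sketch is a minimal illustration, assuming the address is stored as unencrypted ASCII text. The function name is hypothetical, and the sample address comes from the 203.0.113.0/24 block reserved for documentation.

```python
import ipaddress
import re
from typing import List

# Dotted-quad candidates that are not part of a longer run of digits and dots.
IPV4_PATTERN = re.compile(rb"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?![\d.])")


def extract_ipv4(blob: bytes) -> List[str]:
    """Return unique, syntactically valid IPv4 addresses found in raw bytes."""
    found: List[str] = []
    for match in IPV4_PATTERN.finditer(blob):
        candidate = match.group().decode("ascii")
        try:
            ipaddress.IPv4Address(candidate)  # rejects values such as 999.1.1.1
        except ValueError:
            continue
        if candidate not in found:
            found.append(candidate)
    return found


# Hypothetical fragment of a binary containing a callback address.
sample = b"\x00\x01LHOST=203.0.113.7\x00junk999.1.1.1\x00v1.2.3.4.5\x00"
addresses = extract_ipv4(sample)
```

Version strings and other dotted numbers will produce false positives, and an encrypted or encoded address will not be found at all, so the results are leads to verify rather than evidence of attribution.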

The TTPs employed by CostaRicto resemble those of highly sophisticated state-sponsored campaigns. The advanced tools and techniques, long time frames, large quantities of noisy data and the expert domain knowledge required to understand the entire attack make it very challenging to achieve an outcome with credible results (Liu, Singhal & Wijesekera, 2020). The models help deal with layers of complexity by allowing focus on constituent components; consider, for example, the C2 phase. The C2 traffic was encrypted with RSA-2048, and communication used either TCP sockets or a DNS tunnel with a hardcoded, XOR-obfuscated domain name and DGA subdomains. Further, a custom AES encryption was used for harvested data. The C2 servers were managed through Tor and a layer of proxies (Blackberry Research & Intelligence Team, 2020). Just that one component of the attack, the C2 infrastructure, is very complex, and RE would require expert knowledge across multiple domains. This is not insurmountable, and the models allow the investigative team to pivot into a particular phase of the attack and step into it in increasing detail without getting lost in the complexity.
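Two of the C2 techniques mentioned above, XOR obfuscation of a hardcoded domain and generated subdomains, can be approached with very simple tooling once the investigator knows to look for them. The Python sketch below is an illustration under assumptions: the XOR key and domain are hypothetical (the actual values are not given here), and the entropy threshold is an arbitrary heuristic rather than a validated detector.

```python
import math
from collections import Counter


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key; applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def shannon_entropy(label: str) -> float:
    """Shannon entropy of the characters in a label, in bits per character."""
    total = len(label)
    counts = Counter(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_generated(hostname: str, threshold: float = 3.5, min_len: int = 10) -> bool:
    """Heuristic: a long, high-entropy leftmost label may be machine generated."""
    label = hostname.split(".")[0]
    return len(label) >= min_len and shannon_entropy(label) >= threshold


# Hypothetical key and domain, used only to show the round trip.
KEY = b"\x5a"
obfuscated = xor_bytes(b"c2.example.test", KEY)
recovered = xor_bytes(obfuscated, KEY)
```

In practice the key would have to be recovered from the binary or by brute force over short key lengths, and entropy alone misclassifies some legitimate hostnames, so both functions serve to narrow the search and do not settle it.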

Conclusion

The scope of forensic investigations continues to grow as the attack surface presented by a wide range of devices expands both in area and complexity with the diversity of hardware, firmware, OS, and software. The high level of connectedness enabled by mobile devices, networking technologies, IoT, cyber physical systems and cloud services can generate huge volumes of disparate data that is difficult to manage. Achieving an outcome and acquiring credible forensic results is challenging.

Powerful open-source AI frameworks and encryption libraries, along with readily available malicious code and easy access to anti-forensic techniques, have made a complex set of TTPs available to less experienced adversaries and have enabled very sophisticated attacks that are offered as a service.

The ability to think like a malicious adversary allows the investigator to focus on certain aspects of an attack, which increases efficiency and eliminates low-probability lines of inquiry during an investigation. However, this requires not only a technical understanding of the target systems and the TTPs used by the attacker but also an understanding of the attacker's motivation, purpose, and intent. This understanding helps crystallise the decisions the adversary takes during an exploitation.

This report has illustrated that, in the contest between the adversary and forensics, the investigator can reduce the adversary's advantage by thinking like the attacker. The Adversarial Diamond Threat Model and the Cyber Threat Kill Chain were used to decompose and define an AI-powered evasive malware and the mercenary CostaRicto campaign. The models provide a framework that highlights the various capabilities, motivation, and technical details of the attacks and that could be applied to similar attacks or exploitations. They demonstrate that the ability to think like an adversary leads to more effective and efficient investigations. Understanding the components and stages of an attack allows a greater focus by highlighting which threads to follow in a sea of disparate data. As the sophistication of attacks continues to increase, the ability to think like a malicious adversary will result in more successful investigations with credible forensic evidence.

References

Albandari, M. A., Bedour, F. A., Samina, N. and Zahida P. (2017). Security issues in Protocols of TCP/IP Model at Layers Level. International Journal of Computer Networks and Communications Security.

Blackberry Research and Intelligence Team. (2020). The CostaRicto Campaign: Cyber-Espionage Outsourced. Retrieved from https://blogs.blackberry.com/en/2020/11/the-costaricto-campaign-cyber-espionage-outsourced

Caltagirone, S., Pendergast, A. and Betz, C. (2013). The Diamond Model of Intrusion Analysis. Retrieved from https://apps.dtic.mil/sti/pdfs/ADA586960.pdf

Caviglione, L., Wendzel, S. and Mazurczyk, W. (2017). The Future of Digital Forensics: Challenges and the Road Ahead. IEEE Security & Privacy, 15(6), 12–17. Retrieved from https://doi.org/10.1109/MSP.2017.4251117

Comer, D. E. (2014). Internetworking with TCP/IP Principles, Protocols and Architecture, 6th Edition. Pearson.

Gaber, M. (2020). How Artificial Intelligence and Encryption Power Evasive and Targeted Malware on a Windows 10 Machine Running the Latest Security. Retrieved from How Artificial Intelligence Can Power Evasive and Targeted Malware Part 3

Hartley, M. (2014). Think Like Your Adversary: Leveraging the Cyber Threat Kill Chain. ISSA Journal. Retrieved from https://cdn.ymaws.com/www.members.issa.org/resource/resmgr/journalpdfs/feature1114.pdf

Hutchins, E. M., Cloppert, M. J. and Amin, R. M. (2011). Intelligence-driven computer network defense informed by analysis of adversary campaigns and intrusion kill chains. Leading Issues in Information Warfare & Security Research, 1(1), p. 80. Retrieved from https://www.lockheedmartin.com/content/dam/lockheed-martin/rms/documents/cyber/LM-White-Paper-Intel-Driven-Defense.pdf

Kim, S., Yeom, S., Oh, H., Shin, D., and Shin, D. (2021). Automatic Malicious Code Classification System through Static Analysis Using Machine Learning. Symmetry (Basel), 13(1), 35.

Liu, C., Singhal, A. and Wijesekera, D. (2020). Forensic Analysis of Advanced Persistent Threat Attacks in Cloud Environments. Advances in Digital Forensics XVI (pp. 161–180). Cham: Springer International Publishing.

Manral, B., Somani, G., Choo, K.-K., Conti, M. and Gaur, M. (2020). A Systematic Survey on Cloud Forensics Challenges, Solutions, and Future Directions. ACM Computing Surveys, 52(6), 1–38.

Nugent, C. and Cunningham, P. (2005). A Case-Based Explanation System for Black-Box Systems. Artificial Intelligence Review, 24(2), 163–178.

Stoecklin, M. (2018). DeepLocker: How AI Can Power a Stealthy New Breed of Malware. Retrieved from https://securityintelligence.com/deeplocker-how-ai-can-power-a-stealthy-new-breed-of-malware/

Yaacoub, J.-P. A., Noura, H. N., Salman, O., and Chehab, A. (2021). Digital Forensics vs. Anti-Digital Forensics: Techniques, Limitations and Recommendations. Retrieved from https://arxiv.org/abs/2103.17028