diff --git a/_labs/software_and_malware_analysis/10_anti_sre.md b/_labs/software_and_malware_analysis/10_anti_sre.md new file mode 100644 index 0000000..51d39c1 --- /dev/null +++ b/_labs/software_and_malware_analysis/10_anti_sre.md @@ -0,0 +1,270 @@ +--- +title: "Anti-Reverse-Engineering" +author: ["Thalita Vergilio", "Tom Shaw", "Z. Cliffe Schreuders"] +license: "CC BY-SA 4.0" +description: "Advanced anti-reverse-engineering techniques including anti-debugging, anti-disassembly, and code obfuscation methods used by malware to thwart analysis." +overview: | + In this lab on anti-reverse-engineering techniques, you will explore the various tactics employed by both malicious actors and legitimate software developers to thwart the efforts of reverse engineers. This lab provides an in-depth understanding of how malware samples can identify their operating environment, detect the presence of debuggers, and employ anti-disassembly techniques to obfuscate their code. + + Throughout this lab, you will learn how malware samples identify virtual machine and sandbox environments, detect debuggers, and modify their runtime behavior. You'll also explore anti-disassembly techniques and code obfuscation methods, gaining hands-on experience with a set of practical challenges. Tasks include analyzing code in disassemblers like Ghidra, bypassing anti-debugging techniques, and deciphering hidden passwords within obfuscated code. By the end of this lab, you will have honed your skills in dynamic analysis and developed a deep understanding of the cat-and-mouse game between malware creators and reverse engineers. Get ready to unlock the secrets of anti-reverse-engineering and enhance your cybersecurity expertise through a series of engaging challenges. + + In your home directory you will find some binaries that you need to reverse engineer in order to determine the password that the program expects. 
Once you have found the password, run the program and enter the password to receive the flag. +tags: ["anti-reverse-engineering", "anti-debugging", "anti-disassembly", "code-obfuscation", "malware-analysis", "ctf"] +categories: ["software_and_malware_analysis"] +lab_sheet_url: "https://docs.google.com/document/d/1Qtljd6dpunp5P_IZmnlIs_Wd1bG1Qb_ONju5sp2Bhx8/edit?usp=sharing" +type: ["ctf-lab", "lab-sheet"] +difficulty: "intermediate" +cybok: + - ka: "MAT" + topic: "Malware Analysis" + keywords: ["analysis techniques", "analysis environments", "anti-analysis and evasion techniques"] +--- + +## Introduction {#introduction} + +This lab covers the tricks and techniques used by malware authors to make it difficult for analysts to understand what a given malware sample is actually doing. + +At this point in the module, you have learned about and applied both static and live analysis techniques to gain understanding of compiled binary programs, including live malware samples. The anti-reversing techniques covered in this lab can make both live and static analysis significantly more difficult. + +This lab will explore three main categories of anti-reverse-engineering techniques: + +- **Anti-debugging** - Techniques to detect and evade debuggers +- **Anti-disassembly** - Methods to mislead disassemblers and decompilers +- **Code obfuscation** - Approaches to make code logic harder to understand + +It is common for malware samples to check whether a debugger is attached or if the program is running in a virtual machine. When detected, the program may either immediately exit or change its behavior to something more innocuous. In this lab, you will learn how malware that changes behavior based on its environment functions, and how it can be analyzed. + +Disassemblers are another category of tool in the analyst's toolkit that can be directly targeted. 
This lab covers some of the ways in which programs can be written to intentionally mislead the disassembler, leading to incorrect high-level C code interpretations of the assembly instructions contained within the executable. + +Finally, you will explore code obfuscation techniques, including polymorphic and metamorphic malware design as well as the use of executable packers. + +The techniques described in this lab are not only used for malicious purposes. Legitimate software products may also use anti-reversing techniques to protect their intellectual property or mitigate against piracy. + +This lab contains **twelve challenges** where reverse-engineering has been obfuscated through the use of anti-disassembly, anti-debugging, and packing techniques. These challenges are more complex than those you completed in previous weeks and will advance your dynamic analysis skills to the next level. Hints and tips can be found at the end of the lab sheet. + +Have fun! + +## Environment Identification {#environment-identification} + +To impede live analysis, malware samples often follow a two-step process: + +1. **Environment Detection** - Identify whether the program is running within an analysis environment: + - Is the program running in a VM or sandbox? + - Is a debugger attached? + +2. **Behavior Modification** - Modify program runtime behavior: + - Perform unsuspicious activity, OR + - Exit immediately + +There are numerous ways in which a program can inspect its environment to determine whether the process is under analysis. + +### Am I within a Virtual Machine or Sandbox? {#am-i-within-a-virtual-machine-or-sandbox} + +Virtual Machines (VMs) are commonly used by malware analysts to run malware samples in a sandboxed environment. 
Virtual machines often leave footprints in the form of artifacts on the system that can be inspected by a running program, including: + +- **Network Interface Cards (NICs)**: + - Default MAC addresses associated with virtualization providers + - Virtualized network adapters +- **Hardware characteristics**: + - Single CPU core (sometimes) + - Limited memory configurations +- **System artifacts**: + - Presence of registry keys specific to virtualization + - Use of virtual devices + - Manual inspection of memory contents + +For specific artifact examples for commonly used virtualization providers, see [this comprehensive guide](https://subscription.packtpub.com/book/security/9781788838849/11/ch11lvl1sec79/anti-vm-tricks#:~:text=The%20most%20typical%20way%20to,registry%20or%20a%20running%20service.). + +The Malware Behaviour Catalog (MBC) project provides a framework for understanding malware behavior. Review the entry on various methods of [Virtual Machine Detection](https://github.com/MBCProject/mbc-markdown/blob/main/anti-behavioral-analysis/virtual-machine-detection.md). + +One interesting approach found in the [WebCobra](https://www.mcafee.com/blogs/other-blogs/mcafee-labs/webcobra-malware-uses-victims-computers-to-mine-cryptocurrency/) malware sample involves checking the titles of open windows and comparing them to a list of title strings used within popular analysis tools. + +### Is there a debugger attached? 
{#is-there-a-debugger-attached} + +There are several techniques that a running program can use to identify whether it is currently being debugged: + +### Windows API Functions +- **[IsDebuggerPresent()](https://learn.microsoft.com/en-us/windows/win32/api/debugapi/nf-debugapi-isdebuggerpresent)** - Checks if a debugger is attached to the current process +- **CheckRemoteDebuggerPresent()** - Checks whether a specified process (which may be the calling process itself) is being debugged + +### Linux Process Inspection +- **Parent process inspection** using Linux's [procfs](https://en.wikipedia.org/wiki/Procfs): + - Check `/proc//status` for process information + - Read the PPid (parent process ID) value, then compare the name of that parent process against known debugger names (e.g., 'gdb') + +### Memory Structure Analysis +- **Manual checking of data structures in memory**: + - Windows: Use the [TEB](https://learn.microsoft.com/en-us/windows/win32/api/winternl/ns-winternl-teb) to find the [PEB](https://learn.microsoft.com/en-us/windows/win32/api/winternl/ns-winternl-peb), then look up the BeingDebugged value + +### Breakpoint Detection +- **Scan executable code for `0xCC`** (the INT3 breakpoint instruction) +- **Run a checksum on the executable code**: + - This detects patched code as well as `0xCC` breakpoint bytes inserted by a debugger + +### Timing-Based Detection +- **[Timing-based checks](https://anti-debug.checkpoint.com/techniques/timing.html)**: + - Is the program running slower than it should? + - Perhaps a debugger is halting execution... + +For a comprehensive overview of debugger detection methods, review the [Debugger Detection](https://github.com/MBCProject/mbc-markdown/blob/main/anti-behavioral-analysis/debugger-detection.md) entry in the Malware Behaviour Catalog. + +## More Anti-Debugging Techniques {#more-anti-debugging-techniques} + +Debuggers in Linux use the [ptrace() system call](https://man7.org/linux/man-pages/man2/ptrace.2.html) to "trace" (observe and control) another process. A running program can only have one tracer at a time. 
If you attempt to attach a debugger to a process that is already being traced, you will get an 'Operation not permitted' error. + +Malware authors can leverage this limitation by having the program call ptrace() on itself (typically with the `PTRACE_TRACEME` request), so that it is already being traced and other debuggers using ptrace() cannot attach. If that call fails, the program can also infer that a debugger is already attached. + +This approach can be bypassed using several methods: + +- **LD_PRELOAD technique**: Use a modified version of the ptrace() function loaded via the `LD_PRELOAD` environment variable +- **Code patching**: Set a breakpoint before the ptrace() call and jump past it during execution + +The following CTF challenges are related to the environment identification and anti-debugging techniques described above, with hints provided at the end of the lab sheet: + +- ### AntiDbg\_BypassPtrace + +- ### AntiDbg\_Int3Scan + +- ### AntiDbg\_SigtrapCheck + +- ### AntiDbg\_SigtrapEntangle + +- ### AntiDbg\_SigtrapHijack + +- ### AntiDbg\_TimeCheck + +## Anti-Disassembly Techniques {#anti-disassembly-techniques} + +Another approach to thwarting analysis is to write programs in a way that attempts to mislead the disassembler, leading to an incorrect reconstruction of assembly code from the program's binary machine instructions. This is known as **disassembly desynchronization**. + +The following technique, described by [Kargen et al. (2022)](https://ieeexplore.ieee.org/abstract/document/9825860), impacts disassemblers that parse conditional branches from the fall-through branch (i.e., the else condition) *before* parsing the taken branch (i.e., the if condition). This manipulation of the disassembler's parsing mechanism results in valid but incorrect instructions being produced. + +These techniques can be used to hide real instructions, such as function calls. This can be achieved by injecting specially crafted data bytes into the program and using "fake" branches that always resolve in one direction. + +An example of this mechanism can be seen in Figure 1 below. 
+ +![][image-1] +*Figure 1: Example of anti-disassembly technique showing fake branch and hidden function call* + +### How the Technique Works + +1. **Create a "fake" branch**: A condition is created that always resolves in one direction - a jump to the 'hidden' function call. + +2. **Use XOR for zero result**: XOR-ing a register with itself always yields 0 and sets the zero flag. In this example, the eax register is XOR'd with itself, immediately followed by a JZ instruction (jump if zero), which therefore always jumps to the secret() function call. + +3. **Insert crafted junk data**: A specially crafted data byte (`0xEB`) is inserted into the program. Since the program always jumps at line 2, this data byte will never be reached during program execution. The significance of using the value `0xEB` will become clear shortly. + +4. **Include hidden function call**: The function call to secret() on line 5 is included, which will be hidden in the disassembly. + +Recursive disassemblers that resolve the fall-through path (which is never actually taken) first will misinterpret the above example. Can you see what has happened in Figure 1? + +==hint: Think about the problem, then scroll to the next section for an explanation.== + +### What happened? {#what-happened} + +The disassembler has interpreted the `0xEB` data byte and the `0xE8` byte (the opcode of the call instruction) together as `0xEBE8` - a jmp instruction! This works because `0xEB` is the opcode of a short jmp, which consumes the following byte as its operand. + +The incorrectly parsed instructions that follow are the remaining bytes of the call to secret() (`0xE874563412`). Together with the junk byte, they have been misinterpreted as a series of unrelated assembly instructions (`0xEBE8`, `0x7456`, `0x3412`). + +Similar techniques can be applied at the decompiler level to trick it into producing incorrect high-level code representations of assembly instructions. 
+ +The following CTF challenges are related to the anti-disassembly techniques described in this section, with hints at the end of the labsheet: + +- ### AntiDis\_FakeCond + +- ### AntiDis\_FakeMetaConds + +- ### AntiDis\_InJmp + +## Packing and Code Obfuscation Techniques {#packing-and-code-obfuscation-techniques} + +Another approach to making analysis more difficult is to use code obfuscation techniques to make the program logic harder to understand. + +### Common Obfuscation Approaches + +- **Renaming variables and functions** - Replace meaningful names with cryptic ones +- **Obfuscating data** - Encrypt or encode data to hide its purpose +- **Obfuscating data access** - Make data access patterns more complex + +### Hands-On Example + +[Here is an example of a web tool](https://obfuscator.io/) that obfuscates JavaScript using these techniques. + +==action: Open the web page.== + +==action: Experiment with different code inputs.== + +==action: View the obfuscated code in the 'output' tab.== + +### Legitimate vs. Malicious Use + +Legitimate tools such as packers and code minifiers have the intended purpose of reducing the size of programs for transmission across the network. However, these tools were historically adopted by malware creators to obfuscate both manual analysis and automated malware detection. + +The [UPX packer](https://upx.github.io/) has historically been one of the more popular packing tools used by malware for obfuscation. As a result, the inclusion of UPX headers in a program is often detected by antivirus software. [Tigress](https://tigress.wtf/) is another popular tool used to obfuscate C code. + +### Advanced Techniques + +These ideas are taken further with polymorphic and metamorphic malware design. These concepts are discussed in detail in the [Mechanisms of Polymorphic and Metamorphic Viruses paper by Li et al. (2011)](https://ieeexplore.ieee.org/abstract/document/6061171), accessible with your university account. 
+ +## CTF Challenges Hints and Tips {#ctf-hints-and-tips} + +The focus of this week is to bypass techniques that malware creators have used to prevent reverse-engineering of their code. Feel free to use Ghidra alongside GDB to combine both static and dynamic analyses. + +> Tip: Remember there is often more than one way to solve each challenge, so don't be too focused on doing it "the right way" this week. Anything is valid, as long as you get the flag. + +Here are some tips to help you find the flags: + +### AntiDis\_FakeCond {#antidis_fakecond} + +> Hint: Disassemble and analyse the code in Ghidra. + +### AntiDis\_FakeMetaConds {#antidis_fakemetaconds} + +> Hint: This challenge was designed for IDA Pro, but can be done with Ghidra. Work through the main() function to find a hardcoded hex value. There are a couple of arithmetic operations performed on it. Take them into consideration when working out the password. + +### AntiDis\_InJmp {#antidis_injmp} + +> Hint: The password can be found when you analyse the code in Ghidra. + +### AntiDbg\_BypassPtrace {#antidbg_bypassptrace} + +> Hint: As with the challenge above, use 'jump' to bypass the bad code that stops you from debugging. + +### AntiDbg\_Int3Scan {#antidbg_int3scan} + +> Hint: Disassemble and analyse in Ghidra. Work backwards through the code to find the password. + +### AntiDbg\_SigtrapCheck {#antidbg_sigtrapcheck} + +> Hint: This one is pretty straightforward. You can use 'jump' to skip function calls. The rest is standard. + +### AntiDbg\_SigtrapHijack {#antidbg_sigtraphijack} + +> Hint: The handler() function does not get executed, so you need to force it to run. + +> Hint: In GDB, set a breakpoint in main() and one in handler(). + +> Hint: Run the program. When it stops in main(), jump to handler(). + +### ParamsRegs {#paramsregs} + +> Hint: Run the program in GDB, put a breakpoint just before the function of interest is called. Check each of the parameters. 
+ +### ParamsStack {#paramsstack} + +> Hint: Same as above, but check what is pushed to the stack instead. + +## Conclusion {#conclusion} + +At this point you have: + +* Used self-directed learning to understand different techniques used by malware creators to thwart reverse-engineering +* Gained practical knowledge on how to bypass anti-disassembly, anti-debugging, and packing techniques +* Solved practical CTF challenges and found 12 more flags! + +> Tip: Well done! + +Some of these challenges were quite tricky and required using a combination of tools and techniques learned in previous weeks. Fantastic work! + + +[image-1]: {{ site.baseurl }}/assets/images/software_and_malware_analysis/10_anti_sre/image-1.png diff --git a/_labs/software_and_malware_analysis/9_malware_behaviour.md b/_labs/software_and_malware_analysis/9_malware_behaviour.md new file mode 100644 index 0000000..fa3703f --- /dev/null +++ b/_labs/software_and_malware_analysis/9_malware_behaviour.md @@ -0,0 +1,127 @@ +--- +title: "Malware Behaviour: Flag Hints" +author: ["Thalita Vergilio", "Tom Shaw", "Z. Cliffe Schreuders"] +license: "CC BY-SA 4.0" +description: "Advanced malware behavior analysis using dynamic reverse engineering techniques including process forking, network communication, library preloading, and binary unpacking." +overview: | + A CTF lab focusing on advanced malware behavior analysis. In your home directory you will find some binaries that you need to reverse engineer in order to determine the password that the program expects. Once you have found the password, run the program and enter the password to receive the flag. + + This lab covers advanced dynamic analysis techniques including process forking, network communication, library preloading, and binary unpacking. You will work with various malware behaviors and learn how to analyze them using GDB and other reverse engineering tools. 
+tags: ["malware-analysis", "dynamic-analysis", "process-forking", "network-analysis", "library-preloading", "binary-unpacking", "ctf"] +categories: ["software_and_malware_analysis"] +lab_sheet_url: "https://docs.google.com/document/d/1NmcQ3fZ7EXZYzYV-p1F_Snhu0-XbpeSCOwjDT59-yZY/edit?usp=sharing" +type: ["ctf-lab", "lab-sheet"] +difficulty: "intermediate" +cybok: + - ka: "MAT" + topic: "Malware Taxonomy" + keywords: ["dimensions", "kinds"] + - ka: "MAT" + topic: "Malware Analysis" + keywords: ["analysis techniques", "analysis environments"] +--- + +## Advanced Analysis Techniques {#advanced-analysis-techniques} + +Before attempting the CTF challenges, you'll need to understand several advanced techniques used in malware analysis. + +## GDB Fork Mode {#gdb-fork-mode} + +When analyzing programs that create child processes, you need to configure GDB to follow the child process: + +```bash +set follow-fork-mode child +``` + +This tells GDB to debug the child process instead of the parent when a fork occurs. + +## Library Preloading (LD_PRELOAD) {#library-preloading-ld-preload} + +LD_PRELOAD allows you to override system functions by loading your own shared library first. To create a shared library: + +```bash +gcc -shared -fPIC -o libname.so source.c +``` + +To use it: + +```bash +LD_PRELOAD=./libname.so ./program +``` + +## Network Analysis {#network-analysis} + +For network-based challenges, you can use netcat to listen for connections: + +```bash +nc -l 8080 +``` + +This listens on port 8080 for incoming connections. + + +## CTF Challenges {#ctf-challenges} + +> Tip: Here are some tips to help you find the flags: + +## Ch12Covert_ForkFollow {#ch12covert-forkfollow} + +> Hint: Remember to set the follow-fork mode to 'child' in GDB. + +> Hint: Put a break on the cmp that decides whether to print the password or not. + +> Hint: When it stops, check what is being compared. 
+ +> Hint: Watch the size of the data you are examining (this is randomly assigned, but it could be a word, a double word, etc). + +## Ch12Covert_ForkPipe {#ch12covert-forkpipe} + +> Hint: You need to set the follow-fork mode to 'child' again. + +> Hint: You also need to enter a really long password (you will see why when you start debugging the program). + +> Hint: Examine the try_command() function. + +> Hint: Break at the line that compares dl and al. + +> Hint: Now you can either work with these and the 'set' command, or look further up in the code for values of interest. + +## Ch11MalBeh_NetcatShovel {#ch11malbeh-netcatshovel} + +> Hint: This one is easy. Open a new tab and run a netcat command to listen on port 8080. + +> Hint: Run the challenge. + +> Hint: Check the other tab for the password. + +## Ch18PackUnp_UnpackEasy {#ch18packunp-unpackeasy} + +> Hint: Copy the file to the user's home directory to remove the setuid. + +> Hint: Use UPX to unpack it. + +> Hint: Run GDB at that location. + +> Hint: Find the function that compares the string entered to the password. Note that there is no function name, only a memory address, but you can guess by the arguments to the function and the instructions afterwards that it is probably strcmp(). + +> Hint: You know what to do next 🙂 + +> Hint: Remember to run the program again from the challenges directory to get the password. + +## Ch11MalBeh_LdPreloadGetUID {#ch11malbeh-ldpreloadgetuid} + +> Hint: Watch the LD_PRELOAD Demo lecture first! + +> Hint: Copy the challenge executable to your home directory. + +> Hint: In your home directory, create a file that implements getuid(). + +> Hint: Compile as a 32-bit dynamic library. + +> Hint: If you try to run ldd, it will probably fail saying your dynamic library has the wrong ELF class. Ignore that. + +> Hint: Run the challenge program from the home directory using your preloaded library. The password will be printed on the screen. 
Run it again from the challenges directory and enter the correct password. + +## Ch11MalBeh_LdPreloadRand {#ch11malbeh-ldpreloadrand} + +> Hint: Follow the same procedure as the previous one. diff --git a/assets/images/software_and_malware_analysis/10_anti_sre/image-1.png b/assets/images/software_and_malware_analysis/10_anti_sre/image-1.png new file mode 100644 index 0000000..cee752a Binary files /dev/null and b/assets/images/software_and_malware_analysis/10_anti_sre/image-1.png differ