Scan of the Month 32, September 2004
************************************

Entry by Lisa Thalheim

Table of contents:

0. Introduction
1. Tools used
2. My setup
3. Answers to the SotM questions
   3.1 Identify and provide an overview of the binary, including the fundamental pieces of information that would help in identifying the same specimen.
   3.2 Identify and explain the purpose of the binary.
   3.3 Identify and explain the different features of the binary. What are its capabilities?
   3.4 Identify and explain the binary communication methods. Develop a Snort signature to detect this type of malware being as generic as possible, so other similar specimens could be detected, but avoiding at the same time a high false positives rate signature.
   3.5 Identify and explain any techniques in the binary that protect it from being analyzed or reverse engineered.
   3.6 Categorize this type of malware (virus, worm...) and justify your reasoning.
   3.7 Identify another tool that has demonstrated similar functionality in the past.
   3.8 Suggest detection and protection methods to fight against the threats introduced by this binary.
   3.*.1 Is it possible to interrogate the binary about the person(s) who developed this tool? In what circumstances and under which conditions?
   3.*.2 What advancements in tools with similar purposes can we expect in the near future?
4. Overall methodology
5. Thanks & Greetings

0. Introduction
===============

I will first give an overview of my setup and the tools that I used. Next, I will answer all the questions, where appropriate already giving some hints on how I obtained these answers. Last, I will explain my overall methodology, hopefully giving the requested information on how I approached the problem of analysing the given binary.

1. Tools used
=============

* PE Explorer
  Description: Tool to explore PE files, view the headers, etc.
  It also features a disassembler - plus automatic decompression of binaries which were compressed with UPX
  Location: http://www.heaventools.com/download.htm
  Platform: Windows Various

* hexcurse
  Description: Text-mode hex viewer and editor
  Location: http://jewfish.net/download.php?file=hexcurse.tar.gz
  Platform: Linux for me

* IDA Pro
  Description: You know it.
  Location: http://www.datarescue.com/
  Platform: Windows Various and since recently Linux

* OllyDbg
  Description: Debugger.
  Location: http://home.t-online.de/home/Ollydbg/
  Platform: Windows Various

* TcpView
  Description: Something like netstat for Windows.
  Location: http://www.sysinternals.com/ntw2k/source/tcpview.shtml
  Platform: Windows Various

* Lord PE
  Description: Another viewer/editor for PE headers/files.
  Location: http://scifi.pages.at/yoda9k/LordPE/info.htm
  Platform: Windows Various

* Ethereal
  Description: The sniffing and protocol analyzing thingy.
  Location: http://www.ethereal.com/
  Platform: Various

* UPX
  Description: The Ultimate Packer for EXecutables
  Location: http://upx.sourceforge.net/
  Platform: Linux

* Common stuff like strings, perl etc.

2. My setup
===========

I used a playground Windows 2000 box which can be flattened at any time without loss of critical data. The Windows 2000 box was in a local network which contained only one other host, my Linux production system, which was firewalled and (hopefully) secured. Apart from that, I borrowed a colleague's notebook which runs Windows 2000 and has an IDA license installed.

The playground Win2k was mainly used to inspect the SotM-binary and do runtime analysis on it, and for everything else that required Windows but was not IDA. The borrowed Win2k system served only as an IDA loader. My home system (the Linux box) sniffed in the local network and was used for convenience tasks such as inspecting the binary and any resulting data with scripts, writing the documentation, etc.

3. Answers to SotM-questions
============================

3.1 Identify and provide an overview of the binary, including the fundamental pieces of information that would help in identifying the same specimen.

One thing to notice is that the PE header sections have non-standard names: JDR0 and JDR1 (plus rsrc, but that is nothing special). Next, there is a block of data starting around offset 0x400 that looks too random for code. This is suspicious. Apart from that, there are some weird strings in the binary, like "RaDa" and "Malware".

---------------------------------------------------------------------------------------------------

3.2 Identify and explain the purpose of the binary.

The binary is supposed to be a tool to launch DDoS smurf attacks. At least it states so in its GUI. However, it does not launch any DDoS attacks against anyone, but silently goes to the background and polls commands from a server. It then executes these commands. The wannabe-attacker does not get to DDoS his victim but is instead 0wned by the 1337 hacker tool he just tried.

---------------------------------------------------------------------------------------------------

3.3 Identify and explain the different features of the binary. What are its capabilities?

Let's start off with a list of options to the program, what arguments they take and what they do.

--period
    takes exactly one argument, which must be numeric (else: runtime error 13)
    seems to be the time in seconds that the program should wait before polling the command file again

--gui
    takes no arguments
    starts a GUI

--tmpdir
    takes exactly one argument, the path to a non-existing directory
    that directory is then created

--verbose
    takes no arguments
    did not do anything useful up to now

--visible
    takes no arguments
    shows the command file it retrieved

--server
    uses the given argument as server instead of 10.10.10.10
    the format of the argument is http://ip-address:port/path/to/your/commands/, or, in short, the argument is a URI

--commands
    takes exactly one argument
    This argument states the name of the commands-file.
    By default, the program sends a request for the given file to port 80 on 10.10.10.10: GET /RaDa/FILE HTTP/1.1, where FILE stands for the given file name.
    For host, port and path other than default, see --server

--cgipath
    takes exactly one argument: the path to the CGI scripts for put and get

--cgiput
    takes exactly one argument: the name of the CGI script to use for uploading files from the local host to the remote server.

--cgiget
    takes exactly one argument: the name of the CGI script to use for downloading files from the remote server to the local host.

--cycles
    takes exactly one argument (numeric only; runtime error 13 if alphabetic)
    How many times should the program loop polling its arguments?

--help
    either doesn't check for number of arguments or has a very large number of arguments...

--installdir
    accepts exactly one argument, which is a path that does not yet exist.
    The binary will be copied there.

--noinstall
    no arguments
    do not create a RaDa-directory

--uninstall
    no arguments
    uninstalls RaDa - seems to at least delete a registry key

--authors
    no arguments
    gives the names of the authors

Source: The list of available options was extracted from the binary by dumping all strings and then grepping through the output for strings starting with a double-dash.
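The string-dumping step can be sketched in a few lines of Python. This is only an illustration of the idea, not the exact commands used for this analysis: the function names `extract_strings` and `find_options` are made up for the example, and the sketch assumes the strings are stored as plain single-byte ASCII in an already decompressed image (wide-character strings would need a separate pass).

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII of at least min_len bytes,
    roughly what the strings utility prints by default."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def find_options(data: bytes) -> list[str]:
    """Keep only the strings that start with a double-dash."""
    return [s for s in extract_strings(data) if s.startswith("--")]


# Hypothetical in-memory sample; a real run would read the dumped binary
# with open(path, "rb").read() instead.
sample = b"\x00\x01--period\x00\x02--gui\x00ab\x00RaDa\x00--server\xff"
print(find_options(sample))
```

Every string found this way is only a candidate; whether it is really an option still has to be confirmed by trying it.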
The function and number/type of arguments for each option were retrieved by manually trying them once I had the feeling there wouldn't be anything really bad happening when I mess around with this binary.

The program would pop up a message box saying "Unknown argument: " whenever I tried an option that didn't exist or an option that did exist but had too many arguments. It would pop up a message box saying "Runtime error 13: Type mismatch" whenever I used a valid option with an invalid type of argument (such as a string where a number was expected).

Some of the functionality only became apparent once I had figured out how to use the RaDa_commands.html-file properly, because then different HTTP requests were generated which showed the effects of different arguments.

The obvious capability of the program is to launch a GUI. This GUI has some buttons, few of which are meaningful. On pressing the "Go!"-button, the program retrieves its commands-file and executes the commands within.

The program can up- and download files from the local host to the remote server and vice versa. It can apparently take screenshots (?), sleep and execute arbitrary programs on the local host.

Source: This information was retrieved by statically analysing the binary. Again, it was the strings that attracted my attention, since there are the strings "exe", "put", "get", "screenshot", and "sleep" in the binary, and all are used in close succession.

The next thing was to figure out how one can trigger these commands. It was obvious that they would be contained in the RaDa_commands.html-file obtained from the server. What was unclear was the format of the RaDa_commands.html-file. I did not manage to figure out that format by statically or dynamically analyzing the binary. The problem was that I could not even reach the part of the code where the strings ("get", "put", etc.) were matched against something.
It seems like some pre-parsing took place in another part of the code further away, or even in some library function. What finally got me on the right path were the strings that appear a few instructions before all the command strings: "Forms", "Document", "elements", "Name", "Value". Doesn't that sound like an HTML form? In fact, it does.
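A commands file of this kind can be generated with a short Python helper. This is a minimal sketch under stated assumptions: the only things taken from the analysis are that commands arrive as HTML form elements and that the element name carries the command (such as "exe"). The exact markup, the helper name `build_commands_page` and the value "calc.exe" are assumptions made for illustration.

```python
from html import escape


def build_commands_page(commands: list[tuple[str, str]]) -> str:
    """Render (command, argument) pairs as input fields of one HTML form.

    Assumed layout: the field name is the command, the field value its argument.
    """
    fields = "\n".join(
        '<input type="text" name="{}" value="{}">'.format(
            escape(name, quote=True), escape(value, quote=True)
        )
        for name, value in commands
    )
    return "<html>\n<body>\n<form>\n" + fields + "\n</form>\n</body>\n</html>\n"


# Hypothetical test page with a single "exe" command.
print(build_commands_page([("exe", "calc.exe")]))
```

Serving the generated page from a web server under one's own control keeps such experiments confined to the isolated test network.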
Setting an HTML form with an input element named "exe", whose value is the program to run (here the Windows calculator), as the contents of RaDa_commands.html will launch a calculator on the host running RaDa. The other commands are used similarly.

---------------------------------------------------------------------------------------------------

3.4 Identify and explain the binary communication methods. Develop a Snort signature to detect this type of malware being as generic as possible, so other similar specimens could be detected, but avoiding at the same time a high false positives rate signature.

RaDa by default tries to connect to the host 10.10.10.10 on port 80 and, if successful, issues the following HTTP request: GET /RaDa/RaDa_commands.html HTTP/1.1. Since the string '/RaDa/RaDa_commands.html' should be fairly unique, it can be used quite safely for a Snort signature without triggering a lot of false positive alarms.

alert tcp any any -> any any (msg:"Detected RaDa trojan horse"; content:"GET /RaDa/RaDa_commands.html";)

Since the request itself is probably unique enough, we do not need to rely on the host 10.10.10.10 or port 80 or the HTTP version. This comes in handy, because one can change the host and port using the --server option, and it is very easy to modify the default host in the program. This rule will also detect any slightly modified version of RaDa which just uses another default host or port. On the other hand, one can modify pretty much anything about the request; see the documentation of the command line options for further information. So this rule might not be of much use.

Another thing that seems to be unique is what RaDa sends when it receives a "get"-command. The GET-request then contains the header 'Content-Type: multipart/form-data; boundary=---------------------------0123456789012'. At first glance, this also seems to be fairly unique, but it isn't. It looks like it is used generically by some VB stuff when up/downloading files by automating IE. So this possibly won't make such a good rule.
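The two indicators discussed so far can also be tried against captured payloads offline, for instance on data exported from a sniffer. The following Python sketch only mirrors the checks described above and is no replacement for a Snort rule; the function name `rada_indicators` and the wording of the returned labels are made up for this example.

```python
# Default request path and the fixed multipart boundary quoted in the text.
DEFAULT_PATH = b"/RaDa/RaDa_commands.html"
BOUNDARY = b"boundary=---------------------------0123456789012"


def rada_indicators(payload: bytes) -> list[str]:
    """Return the names of the indicators found in one TCP payload."""
    hits = []
    # The request line is everything up to the first CRLF.
    request_line = payload.split(b"\r\n", 1)[0]
    parts = request_line.split(b" ")
    if len(parts) == 3 and parts[0] == b"GET" and parts[1] == DEFAULT_PATH:
        hits.append("default commands-file request")
    # Weak indicator: other software may send the same boundary.
    if BOUNDARY in payload:
        hits.append("fixed multipart boundary")
    return hits


request = b"GET /RaDa/RaDa_commands.html HTTP/1.1\r\nHost: 10.10.10.10\r\n\r\n"
print(rada_indicators(request))
```

As with the rule itself, a specimen started with a different --server or --commands value would slip past the first check, so an empty result proves nothing.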

Another thing that one might want to use for a rule is the content of the commands-file.

alert tcp any any -> any any (msg:"Possibly detected RaDa trojan horse."; content:";)
alert tcp any any -> any any (msg:"Possibly detected RaDa trojan horse."; content:";)
alert tcp any any -> any any (msg:"Possibly detected RaDa trojan horse."; content:";)
alert tcp any any -> any any (msg:"Possibly detected RaDa trojan horse."; content:";)
alert tcp any any -> any any (msg:"Possibly detected RaDa trojan horse."; content:";)

The uniqueness of these last rules may be limited, since form fields which are called 'put' or 'get' or 'exe' may be quite common.

RaDa does not seem to cause any network traffic besides HTTP connections to the command server.

---------------------------------------------------------------------------------------------------

3.5 Identify and explain any techniques in the binary that protect it from being analyzed or reverse engineered.

* UPX binary compression and modification
  The greater part of the binary was compressed using UPX. After compression, the file has been modified slightly such that UPX won't decompress it anymore. You have to modify it again to uncompress it. To be more exact, what you have to do is:
  1) Change the name of the first section (as in PE header) to 'UPX' (0x55505800). In the original file it is 'JDR0' and is located at offset 0x000001B8 in the file.
  2) Change that name again at offset 0x000003E0 to 'UPX', where it is 'JDR'.
  After these modifications, upx (v 1.92) should be able to decompress the binary.
  Source: Find the error message upx gives you on first try ('CantUnpackException') in the upx source code. Understand the section of code that leads to that exception being thrown and deduce the necessary modifications.

* VMware detection
  I have not been able to try this, as I had no VMware ready to use. But the static analysis of the binary suggests that the program checks whether it is running inside VMware.
  At least, there is the string 'HKLM\Software\VMware, Inc.\VMware Tools\InstallPath' in the binary. Apart from that there are the strings '00:0C:29', '00:50:56' and '00:05:69'. Asking Google for these strings yields a lot of pages stating that these are MAC address prefixes used by VMware. It is not clear to me what happens when the program is run inside VMware, but I suspect the program might not work at all or not work properly there. This would make sense, as it is likely that people analyzing a possibly harmful binary will throw it into a sandbox to do the dynamic analysis.

* VB
  I don't know whether this was intentional, but that whole VB business made reverse engineering the binary a lot less enjoyable. This is particularly true for the dynamic analysis, because a lot of interesting stuff seems to happen inside weird VB library functions.

* Messed up PE header
  The actual data that later becomes the code is 'hidden' by the PE header.

---------------------------------------------------------------------------------------------------

3.6 Categorize this type of malware (virus, worm...) and justify your reasoning.

This binary falls into the category of Trojan Horses. It claims to be a tool for launching DDoS attacks. But in fact, it does not do any such thing; it pretends to terminate and then silently continues to run in the background, polling a given host for commands to execute locally.

---------------------------------------------------------------------------------------------------

3.7 Identify another tool that has demonstrated similar functionality in the past.

A backdoor in OpenSSH: http://www.openssh.org/txt/trojan.adv
It connects to a specific IP address on an IRC port (6667?) and accepts commands that are sent over that connection. If it fails to establish the connection, it tries again after one hour. It understands three commands: kill the backdoor, execute a command, sleep.

---------------------------------------------------------------------------------------------------

3.8 Suggest detection and protection methods to fight against the threats introduced by this binary.

* Detect the registry key HKLM\Software\Microsoft\Windows\CurrentVersion\Run\RaDa
  Source: When you repeatedly select "Uninstall" in the GUI, a message box pops up saying it can't find that registry key to delete it. The string 'HKLM\Software\Microsoft\Windows\CurrentVersion\Run' can also be found in clear text in the binary.
  Methodology: Try

* Detect the trojan via the network traffic it causes. See section 3.4 for the Snort rules.

---------------------------------------------------------------------------------------------------

3.*.1 Is it possible to interrogate the binary about the person(s) who developed this tool? In what circumstances and under which conditions?

There are several possibilities to do this. Actually, the names of the authors are all over the place.

* Run the program RaDa.exe with the option --authors. A message box then pops up which states the names of the authors, Raul Siles and David Perez.

* Run RaDa.exe with the option --gui. The GUI that pops up also states the names of the authors.

* Run RaDa.exe with the option --help. An instance of IE pops up showing an HTML page which also states the authors' names.

* Run RaDa.exe with any invalid option, or with a valid option but wrong number/type of arguments to that option. The effect is the same as when using the --help option.

* Look into the binary. Obviously, the strings for the possibilities given above need to be stored somewhere, so it is no surprise (though not self-evident, because the names might be mangled or encoded and only converted to clear text at runtime) that you can find the names of the authors in clear text when dumping strings from the binary.
  This method of retrieving the authors' names has the advantage that you can find them using only static analysis, without accepting the risk that is posed by launching an unknown and probably harmful program. This is also my preferred variant and the first one I tried.

Methodology: a) Static analysis - dumping strings b) systematically trying the options. A list of the available options was extracted from the binary by dumping all the strings and then grepping for strings that start with a double-dash.

---------------------------------------------------------------------------------------------------

3.*.2 What advancements in tools with similar purposes can we expect in the near future?

There are a couple of improvements to this trojan which come to mind immediately. One could carry the VMware check a bit further by using a decoding/compression method where the MAC address is used in the process of decoding, such that the data gets decoded to nonsense if the program is run inside VMware and hence has one of the well-known MAC addresses VMware uses. Next, apart from the fact that reverse engineering VB code is no fun, I have seen no anti-debugging measures in the binary. One could do with a bit of those, too.

---------------------------------------------------------------------------------------------------

4. Overall Methodology
======================

I started off by throwing the binary into IDA. There were some unusual things going on right away, since the PE header stated that the actual data (which turned out to be compressed code) should not be loaded, so I had to go over it again and load it manually. The next thing I noticed was that there was not much code in the binary, but a lot of data which did not seem to make sense. The suspicion arose that this data would probably be compressed or encrypted or otherwise mangled code, and that the code which was recognized by IDA would be the decoder.
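A suspicion of this kind can be backed with a number: compressed or encrypted data has a byte entropy close to the maximum of 8 bits per byte, while ordinary machine code and text usually stay noticeably below that. The sketch below is an optional aid and was not part of the analysis described here; the function names `shannon_entropy` and `entropy_by_block` and the block size of 256 bytes are choices made for this example.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return abs(-sum((c / total) * math.log2(c / total) for c in counts.values()))


def entropy_by_block(data: bytes, block: int = 256) -> list[float]:
    """Entropy of each consecutive block, to locate high-entropy regions."""
    return [shannon_entropy(data[i:i + block]) for i in range(0, len(data), block)]


# Hypothetical sample: a constant run followed by all 256 byte values once.
sample = b"\x00" * 256 + bytes(range(256))
print(entropy_by_block(sample))
```

Applied block by block to a file, a long stretch of values near 8 would support the guess that the region holds packed or encrypted data rather than code.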
I decided to throw it into OllyDbg and see whether I could validate this suspicion. Indeed, the code did some kind of transformation, using the kludge of data as input and writing the output to some location in memory (0x401000 and upwards, to be exact). I started by single-stepping the decoder, but that got boring after a short while, so I looked for places which looked like the decoder might have finished and the program was ready to jump into the decoded code. I noticed that there were a lot of jump instructions in the code, all of which used fixed address offsets. I also noticed there were three function calls which used register-addressed arguments, and they were located at what seemed to be the end of the code that was initially in the file. I breakpointed the first one of these calls and found the full decoded binary in memory at the already mentioned location. I did a memory dump and now had a decoded binary to be going on with.

After loading the original binary in PE Explorer I learned that this software would automagically decode my kludge of data. I looked over its feature list again and found a default plugin which decompresses UPX-compressed binaries. Thus I suspected that the code was compressed by UPX. This turned out to be the case, though the binary had obviously been fiddled with after compressing it and UPX could not decompress it right away. See section 3.5 on counter-analysis measures for details.

I finally let the program run and started playing around with it, using the list of options I extracted from the binary.

The rest of the work was basically skimming over the disassembly of the binary to guide me in playing with the program (where to look and what to try), trying out options and commands, trying different contents for the RaDa_commands.html-file and watching Ethereal to see whether anything I did had any effect on RaDa's communication behaviour.

5. Thanks & Greetings
=====================

Thanks to Felix von Leitner for motivation.
Greetings to the attendees and lecturers of the Summerschool Applied IT Security 2004 in Aachen, Germany.