Scan of the Month 32, September 2004
************************************

Entry by Lisa Thalheim <thalheim@informatik.hu-berlin.de>

Table of contents:

0. Introduction
1. Tools used
2. My setup
3. Answers to the SotM questions
   3.1 Identify and provide an overview of the binary, including the fundamental pieces of information that would help in
       identifying the same specimen.
   3.2 Identify and explain the purpose of the binary.
   3.3 Identify and explain the different features of the binary. What are its capabilities?
   3.4 Identify and explain the binary communication methods. Develop a Snort signature to detect this type of malware being as generic
       as possible, so other similar specimens could be detected, but avoiding at the same time a high false positives rate signature.
   3.5 Identify and explain any techniques in the binary that protect it from being analyzed or reverse engineered.
   3.6 Categorize this type of malware (virus, worm...) and justify your reasoning.
   3.7 Identify another tool that has demonstrated similar functionality in the past.
   3.8 Suggest detection and protection methods to fight against the threats introduced by this binary.
   3.*.1 Is it possible to interrogate the binary about the person(s) who developed this tool? In what circumstances and under which conditions?
   3.*.2 What advancements in tools with similar purposes can we expect in the near future?
4. Overall methodology
5. Thanks & Greetings

0. Introduction
===============

I will first give an overview of my setup and the tools that I used.
Next, I will answer all the questions, where appropriate already giving some hints on
how I obtained these answers.
Last, I will explain my overall methodology, hopefully giving the requested information
on how I approached the problem of analysing the given binary.

1. Tools used
=============

* PE Explorer
	Description: Tool to explore PE files, view the headers, etc.
		     It also features a disassembler - plus automatic decompression of binaries
		     which were compressed with UPX
	Location:    http://www.heaventools.com/download.htm
	Platform:    Windows Various

* hexcurse
	Description: Text-mode hex viewer and editor
	Location:    http://jewfish.net/download.php?file=hexcurse.tar.gz
	Platform:    Linux for me

* IDA Pro
	Description: You know it.
	Location:    http://www.datarescue.com/
	Platform:    Windows Various and since recently Linux

* OllyDbg
	Description: Debugger.
	Location:    http://home.t-online.de/home/Ollydbg/
	Platform:    Windows Various

* TcpView
	Description: Something like netstat for Windows.
	Location:    http://www.sysinternals.com/ntw2k/source/tcpview.shtml
	Platform:    Windows Various

* Lord PE
	Description: Another viewer/editor for PE headers/files.
	Location:    http://scifi.pages.at/yoda9k/LordPE/info.htm
	Platform:    Windows Various

* Ethereal
	Description: The sniffing and protocol analyzing thingy.
	Location:    http://www.ethereal.com/
	Platform:    Various

* UPX
	Description: The Ultimate Packer for EXecutables
	Location:    http://upx.sourceforge.net/
	Platform:    Linux

* Common stuff like strings, perl etc.

2. My setup
===========

I used a playground Windows 2000 box which can be
flattened at any time without loss of critical data.
The Windows 2000 box was in a local network which contained only
one other host, my Linux production system, which was firewalled and
(hopefully) secured. Apart from that, I borrowed a colleague's notebook
which runs a Windows 2000 and has an IDA license installed.
The playground Win2k was mainly used to inspect the SotM binary and do runtime analysis
on it, and beyond that for everything else that required Windows but not IDA.
The borrowed Win2k system served only as an IDA loader.
My home system (the Linux box) sniffed in the local network and was used for convenience
tasks such as inspecting the binary and any resulting data with scripts, writing the
documentation, etc.

3. Answers to SotM-questions
============================

3.1 Identify and provide an overview of the binary, including the fundamental pieces of
    information that would help in identifying the same specimen.

One thing to notice is that the PE header sections have non-standard names: JDR0 and JDR1
(plus rsrc, but that is nothing special). Next, there is a block of data starting around offset
0x400 that looks too random for code. This is suspicious.
Apart from that, there are some weird strings in the binary, like "RaDa" and "Malware".
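
The identifying features above can be collected with a few lines of script. This is a minimal
sketch, not the procedure actually used for this entry: it reads the section names from the PE
section table and adds a SHA-256 hash and the file size as generic identifiers. The function
names are made up for illustration.

```python
import hashlib
import struct


def pe_section_names(data):
    """Return the section names listed in a PE file's section table."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The 20-byte COFF header follows the signature: NumberOfSections is
    # at +2 and SizeOfOptionalHeader at +16 (both 16-bit little endian).
    coff = pe_offset + 4
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    table = coff + 20 + opt_size
    names = []
    for i in range(num_sections):
        # Each section header is 40 bytes; the first 8 hold the name.
        raw = data[table + 40 * i:table + 40 * i + 8]
        names.append(raw.rstrip(b"\0").decode("ascii", "replace"))
    return names


def fingerprint(data):
    """Collect simple identifiers for a specimen."""
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "size": len(data),
        "sections": pe_section_names(data),
    }
```

Calling fingerprint() on the bytes of RaDa.exe should list JDR0 and JDR1 among the sections;
a hash only identifies this exact file, while the section names may also match repacked variants.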

---------------------------------------------------------------------------------------------------

3.2 Identify and explain the purpose of the binary.

The binary is supposed to be a tool to launch DDoS smurf attacks. At least it states so in its
GUI. However, it does not launch any DDoS-attacks against anyone, but silently goes to the
background and polls commands from a server. It then executes these commands. The wannabe-attacker
does not get to DDoS his victim but is instead 0wned by the 1337 hacker tool he just tried.

---------------------------------------------------------------------------------------------------

3.3 Identify and explain the different features of the binary.
    What are its capabilities?

Let's start off with a list of options to the program, what arguments they take and what they do.

--period
	takes exactly one argument, which must be numeric
        (else: runtime error 13)
	seems to be the time in seconds that the program should wait before
	polling the command file again

--gui
	takes no arguments
        starts a gui

--tmpdir
	takes exactly one argument, the path to a non-existing directory
	that directory is then created

--verbose
	takes no arguments
	did not do anything useful up to now

--visible
	takes no arguments
	shows the command file it retrieved

--server <srv>
	will use <srv> as server instead of 10.10.10.10
	format of <srv> is http://ip-address:port/path/to/your/commands/,
        or, in short, <srv> is a URI

--commands <file>
	takes exactly one argument
	The program sends a request for the given file to port 80 on 10.10.10.10
	GET /RaDa/<file> HTTP/1.1 by default. For host, port and path other
	than default, see --server
	This argument states the name of the commands-file.

--cgipath
	takes exactly one argument: the path to the cgi-scripts for put and
        get

--cgiput
        takes exactly one argument: the name of the CGI-script to use for
        uploading files from the local host to the remote server.

--cgiget
	takes exactly one argument: the name of the CGI-script to use
        for downloading files from the remote server to the local host.

--cycles
	takes exactly one argument (numeric only)
	Runtime error 13 if alpha
	How many times should the program loop polling its arguments?

--help
	either doesn't check for number of arguments or has a very large
        number of arguments...

--installdir
	accepts exactly one argument, which is a path that does not yet exist.
	The binary will be copied there.
	
--noinstall
	no arguments
        do not create a RaDa-directory

--uninstall
	no arguments
        uninstalls RaDa - seems to at least delete a registry key

--authors
	no arguments
        gives the names of the authors

Source: The list of available options was extracted from the binary by dumping all strings
        and then grepping through the output for strings starting with a double-dash.
        The function and number/type of arguments for each option were retrieved by manually
        trying them once I had the feeling there wouldn't be anything really bad happening
        when I messed around with this binary. The program would pop up a message box saying
        "Unknown argument: <given>" whenever I tried an option that didn't exist or
        an option that did exist but had too many arguments. It would pop up a message box
        saying "Runtime error 13: Type mismatch" whenever I used a valid option with an invalid
        type of argument (such as a string where a number was expected).
        Some of the functionality only became apparent once I had figured out how to use the
        RaDa_commands.html-file properly, because then different HTTP-requests were generated
        which showed the effects of different arguments.
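
The strings-and-grep step described above can be sketched in Python as follows. The sketch
makes one assumption that is not confirmed here: since VB programs often store strings as
UTF-16, it looks for both plain ASCII and UTF-16LE strings. The function names are illustrative.

```python
import re


def extract_strings(data, min_len=4):
    """Return printable ASCII and UTF-16LE strings found in data."""
    ascii_pat = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    wide_pat = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_pat.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_pat.finditer(data)]
    return found


def candidate_options(data):
    """Return the unique strings that start with a double dash, sorted."""
    return sorted({s for s in extract_strings(data) if s.startswith("--")})
```

Like grepping for lines that begin with "--", this only finds options stored as separate
strings, and it has to be run on the decompressed binary to see anything useful.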

The obvious capability of the program is to launch a GUI. This GUI has some buttons, few of which
are meaningful. On pressing the "Go!"-button, the program retrieves its commands-file and
executes the commands within.
The program can up- and download files from the local host to the remote server and vice versa.
It can apparently take screenshots (?), sleep and execute arbitrary programs on the local host.

Source: This information was retrieved by statically analysing the binary. Again, it was the
strings that attracted my attention, since there are the strings "exe", "put", "get", "screenshot",
and "sleep" in the binary, and all are used in close succession. The next thing was to figure out
how one can trigger these commands. It was obvious that they would be contained in the
RaDa_commands.html-file obtained from the server. What was unclear was the format of the
RaDa_commands.html-file. I did not manage to figure out that format by statically or dynamically
analyzing the binary. The problem was that I could not even reach the part of the code where
the strings ("get", "put", etc.) were matched against something. It seems like some
pre-parsing took place in another part of the code further away, or even in some library function.
What got me on the right path after all were the strings some instructions before all the command-strings,
"Forms", "Document", "elements", "Name", "Value". Doesn't that sound like an HTML form? In fact, it does.

<html><head><title></title></head><body>
<form>
   <input type="text" value="C:\WINNT\system32\calc.exe" name="exe">
</form>
</body></html>

Setting this as the contents of RaDa_commands.html will launch a calculator on the host running RaDa.
The other commands are used similarly.
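
To inspect a captured commands file, the form fields can be pulled out with Python's standard
HTML parser. This is a sketch under assumptions: it treats only the five command names found in
the binary as relevant and matches them exactly; how strictly RaDa itself parses the form is not
known. The class and function names are made up for illustration.

```python
from html.parser import HTMLParser

# Command names taken from the strings found in the binary.
KNOWN_COMMANDS = {"exe", "put", "get", "screenshot", "sleep"}


class CommandFormParser(HTMLParser):
    """Collect (name, value) pairs of <input> fields inside <form> tags."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.commands = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.in_form = True
        elif tag == "input" and self.in_form:
            fields = dict(attrs)
            name = fields.get("name")
            if name in KNOWN_COMMANDS:
                self.commands.append((name, fields.get("value")))

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def parse_commands(html_text):
    """Return the list of commands contained in a commands file."""
    parser = CommandFormParser()
    parser.feed(html_text)
    parser.close()
    return parser.commands
```

For the example file above, parse_commands() yields a single "exe" entry with the path to
calc.exe as its value.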

---------------------------------------------------------------------------------------------------

3.4 Identify and explain the binary communication methods. Develop a Snort signature to detect this type of malware
    being as generic as possible, so other similar specimens could be detected, but avoiding at the same time a high
    false positives rate signature.

RaDa by default tries to connect to the host 10.10.10.10 on port 80 and if successful issues
the following HTTP request: GET /RaDa/RaDa_commands.html HTTP/1.1.
Since the string '/RaDa/RaDa_commands.html' should be fairly unique, it can be used quite safely
for a Snort signature, without triggering a lot of false positive alarms.

alert tcp any any -> any any (msg:"Detected RaDa trojan horse"; content:"GET /RaDa/RaDa_commands.html";)

Since the request itself is probably unique enough, we do not need to rely on the host 10.10.10.10
or port 80 or HTTP version. This comes in handy, because one can change the host and port used using
the --server option, and it is very easy to modify the default host in the program. This rule will
also detect any slightly modified version of RaDa which just uses another default host or port.

On the other hand, one can modify pretty much anything about the request; see the documentation
of the command line options for further information. So this rule might not be of much use.
Another thing that seems to be unique is what RaDa sends when it has received a "get"-command.
The GET-request then contains the header
'Content-Type: multipart/form-data; boundary=---------------------------0123456789012'
At first glance, this also seems to be fairly unique, but it isn't. It looks like it is used by 
some VB stuff generically when up/downloading files by automating IE. So this possibly won't make
such a good rule.

Another thing that one might want to use for a rule is the content of the commands-file.

alert tcp any any -> any any (msg:"Possibly detected RaDa trojan horse."; pcre:"/<input.*name=\x22screenshot\x22>/i";)
alert tcp any any -> any any (msg:"Possibly detected RaDa trojan horse."; pcre:"/<input.*name=\x22sleep\x22>/i";)
alert tcp any any -> any any (msg:"Possibly detected RaDa trojan horse."; pcre:"/<input.*name=\x22put\x22>/i";)
alert tcp any any -> any any (msg:"Possibly detected RaDa trojan horse."; pcre:"/<input.*name=\x22get\x22>/i";)
alert tcp any any -> any any (msg:"Possibly detected RaDa trojan horse."; pcre:"/<input.*name=\x22exe\x22>/i";)

The uniqueness of these last rules may be limited, since form fields which are called 'put' or 'get' or
'exe' may be quite common.
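
Before deploying such signatures it helps to try them against sample payloads. The following
sketch mimics the two kinds of rules in Python; it is not a replacement for testing in Snort
itself. The regular expression is a slightly stricter variant of the one in the rules (it does
not match across tag boundaries), and the names are made up for illustration.

```python
import re

# Literal taken from the default request RaDa sends.
REQUEST_SIG = b"GET /RaDa/RaDa_commands.html"
# Form fields named after one of the five commands.
COMMAND_SIG = re.compile(
    rb'<input[^>]*name="(screenshot|sleep|put|get|exe)"', re.IGNORECASE
)


def match_signatures(payload):
    """Return the list of signature hits for one payload."""
    hits = []
    if REQUEST_SIG in payload:
        hits.append("request")
    for m in COMMAND_SIG.finditer(payload):
        hits.append("command:" + m.group(1).decode("ascii").lower())
    return hits
```

Running ordinary web pages through match_signatures() gives a rough idea of how often the
command-based rules would fire on harmless forms.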

RaDa does not seem to cause any network traffic besides HTTP connections to the command server.

---------------------------------------------------------------------------------------------------

3.5 Identify and explain any techniques in the binary that protect it from being analyzed or reverse engineered.

* UPX binary compression and modification

The greater part of the binary was compressed using UPX. After compression, the file has been modified
slightly such that UPX won't decompress it anymore. You have to modify it again to uncompress it.
To be more exact, what you have to do is:

1) Change the name of the first section (as in PE header) to 'UPX' (0x55505800). In the original file it is 'JDR0'
   and is located at offset 0x000001B8 in the file.
2) Change that name again at offset 0x000003E0 to 'UPX', where it is 'JDR'.

After these modifications, upx (v 1.92) should be able to decompress the binary.

Source: Find the error message upx gives you on first try ('CantUnpackException') in the upx source code.
        Understand the section of code that leads to that exception being thrown and deduce the
        necessary modifications. 
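
The two modifications can also be applied with a short script instead of a hex editor. This
is a minimal sketch that uses the offsets and values given above and refuses to patch a file
that does not contain the expected bytes. The function name is made up for illustration.

```python
SECTION_NAME_OFFSET = 0x1B8  # first section name in the PE header ('JDR0')
SECOND_NAME_OFFSET = 0x3E0   # second occurrence of the name ('JDR')


def restore_upx_markers(data):
    """Return a copy of data with the 'JDR' markers changed back to 'UPX'."""
    patched = bytearray(data)
    if patched[SECTION_NAME_OFFSET:SECTION_NAME_OFFSET + 4] != b"JDR0":
        raise ValueError("expected 'JDR0' at offset 0x1B8")
    if patched[SECOND_NAME_OFFSET:SECOND_NAME_OFFSET + 3] != b"JDR":
        raise ValueError("expected 'JDR' at offset 0x3E0")
    # Step 1: 'JDR0' becomes 'UPX' followed by a zero byte (0x55505800).
    patched[SECTION_NAME_OFFSET:SECTION_NAME_OFFSET + 4] = b"UPX\x00"
    # Step 2: 'JDR' becomes 'UPX'.
    patched[SECOND_NAME_OFFSET:SECOND_NAME_OFFSET + 3] = b"UPX"
    return bytes(patched)


# Usage (file names are examples): read RaDa.exe in binary mode, pass its
# contents to restore_upx_markers() and write the result to a new file
# before handing that file to upx.
```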

* VMware detection

I have not been able to try this, as I had no VMware installation ready to use.
But the static analysis of the binary suggests that the program checks whether it is running inside
VMware. At least, there is the string 'HKLM\Software\VMware, Inc.\VMware Tools\InstallPath' in the
binary. Apart from that there are the strings '00:0C:29', '00:50:56' and '00:05:69'. Asking Google
for these strings yields a lot of pages stating that these are MAC address prefixes used by VMware.
It is not clear to me what happens when the program is run inside VMware, but I suspect the program
might not work at all or not properly when run there. This would make sense, as it is likely
that people analyzing a possibly harmful binary will throw it into a sandbox to do the dynamic analysis.
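
What such a check could look like is sketched below. How RaDa actually uses these strings has
not been verified here, so this is only an illustration of a prefix comparison; it can also be
used to see whether the MAC address of an analysis machine would stand out. The function name
is made up.

```python
# MAC address prefixes found as strings in the binary.
VMWARE_MAC_PREFIXES = ("00:0C:29", "00:50:56", "00:05:69")


def looks_like_vmware_mac(mac):
    """Return True if the MAC address starts with one of the prefixes."""
    # Accept both '00-0c-29-...' and '00:0C:29:...' notations.
    normalized = mac.strip().upper().replace("-", ":")
    return normalized.startswith(VMWARE_MAC_PREFIXES)
```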

* VB

I don't know whether this was intentional but that whole VB-business made reverse engineering
the binary a lot less enjoyable. This is particularly true for the dynamic analysis. This is because
a lot of interesting stuff seems to happen inside weird VB library functions.

* Messed up PE header

The actual data that later becomes the code is 'hidden' by the PE header.

---------------------------------------------------------------------------------------------------

3.6 Categorize this type of malware (virus, worm...) and justify your reasoning.

This binary falls into the category of Trojan Horses. It claims to be a tool
for launching DDoS-attacks. But in fact, it does not do any such thing but pretends to terminate and
then silently continues to run in the background, polling commands for it to execute locally
from a given host.

---------------------------------------------------------------------------------------------------

3.7 Identify another tool that has demonstrated similar functionality in the past.

A backdoor in OpenSSH: http://www.openssh.org/txt/trojan.adv

It connects to a specific IP address to an IRC port (6667?) and accepts commands that are
sent over that connection. If it fails to establish the connection, it tries again
after one hour. It understands three commands: kill the backdoor, execute a command, sleep.

---------------------------------------------------------------------------------------------------

3.8 Suggest detection and protection methods to fight against the threats introduced by this binary.

* Detect the registry key HKLM\Software\Microsoft\Windows\CurrentVersion\Run\RaDa

Source: When you repeatedly select "Uninstall" in the GUI, a message box pops up saying it can't find that
        registry key to delete it.
        The string 'HKLM\Software\Microsoft\Windows\CurrentVersion\Run' can also be found in clear text in
        the binary.
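
A check for this autostart entry could be scripted as follows. This is a sketch under one
assumption: "RaDa" is treated as a value name under the Run key, which is how autostart entries
are normally stored. It uses Python's standard winreg module and simply reports nothing on
systems without a registry. The function name is made up.

```python
import sys

RUN_KEY = r"Software\Microsoft\Windows\CurrentVersion\Run"
VALUE_NAME = "RaDa"  # assumption: a value under the Run key


def rada_run_entry():
    """Return the data of the RaDa autostart value, or None if absent."""
    if sys.platform != "win32":
        return None  # the registry only exists on Windows
    import winreg
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, RUN_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
            return value
    except FileNotFoundError:
        # Either the key or the value does not exist.
        return None
```

A return value other than None would be the command line registered for autostart, which also
reveals where the binary was installed.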
Methodology: Try

* Detect the trojan via the network traffic it causes. See the section 3.4 for the snort-rules.


---------------------------------------------------------------------------------------------------

3.*.1 Is it possible to interrogate the binary about the person(s) who developed this tool?
      In what circumstances and under which conditions?

There are several possibilities to do this. Actually, the names of the authors are all over the
place.

* Run the program RaDa.exe with the option --authors. A messagebox then
pops up which states the names of the authors, Raul Siles and David Perez.
* Run RaDa.exe with the option --gui. The GUI that pops up also states the names of the authors.
* Run RaDa.exe with the option --help. An instance of IE pops up showing an HTML page which also
states the authors' names.
* Run RaDa.exe with any invalid option, or with a valid option but wrong number/type of arguments
to that option. The effect is the same as when using the --help option.
* Look into the binary. Obviously, the strings for the possibilities given above need to be stored
somewhere, so it is no surprise (though not self-evident, because the names might be mangled or
encoded and only converted to clear text at runtime) that you can find the names of the authors in
clear text when dumping strings from the binary. This method of retrieving the authors' names has
the advantage that you can find them using only static analysis, and without accepting the risk that
is posed by launching an unknown and probably harmful program.
This is also my preferred variant and the first one I tried.

Methodology: a) Static analysis - dumping strings b) systematically trying the options. A list of the
	     available options was extracted from the binary by dumping all the strings and then
             grepping for strings that start with a double-dash.

---------------------------------------------------------------------------------------------------

3.*.2 What advancements in tools with similar purposes can we expect in the near future?

There are a couple of improvements of this trojan which come to mind immediately.
One could carry the VMware check a bit further by using a decoding/compression method
where the MAC address is used in the process of decoding, such that the data gets
decoded to nonsense if the program is run in a VMware, hence has one of the well-known 
MAC addresses VMware uses.
Next, apart from the fact that reverse engineering VB-code is no fun, I have seen no
anti-debugging measures in the binary. One could do with a bit of those, too.

---------------------------------------------------------------------------------------------------

4. Overall Methodology
======================

I started off by throwing the binary into IDA. There were some unusual things going on right away,
since the PE header stated that the actual data (which turned out to be compressed code) should
not be loaded, so I had to go over it again and load it manually.
The next thing I noticed was that there was not much code in the binary, but a lot of data which did
not seem to make sense. The suspicion arose that this data would probably be compressed or encrypted
or otherwise mangled code, and that the code which was recognized by IDA would be the decoder.
I decided to throw it into OllyDbg and see whether I can validate this suspicion. Indeed, the code
did some kind of transformation, using the kludge of data as input and writing the output to
some location in memory (0x401000 and upwards to be exact).
I started by single-stepping the decoder, but that got boring after a short while, so I looked for
places which looked like the decoder might have finished and the program was ready to jump into the
decoded code. I noticed that there were a lot of jump instructions in the code, all of which used
fixed address offsets. I also noticed there were three function calls which used register-addressed
arguments, and they were located at what seemed to be the end of the code that was initially in the file.
I breakpointed the first one of these calls and found the full decoded binary in the memory at
the already mentioned location. I did a memory dump and had now a decoded binary to be going on with.
After loading the original binary in PE Explorer I learned that this software would automagically
decode my kludge of data. I looked over its feature list again and found a default plugin which
decompresses UPX-compressed binaries. Thus I suspected that the code was compressed by UPX.
This turned out to be the case, though the binary had obviously been fiddled with after compressing it
and UPX could not decompress it right away. See section 3.5 on counter-analyzing measures for details.
I finally let the program run and started playing around with it, using the list of options I extracted
from the binary. The rest of the work was basically skimming over the disassembly of the binary to guide
me in playing with the program, where to look and what to try, trying out options and commands,
trying different contents for the RaDa_commands.html-file and watching ethereal to see whether anything
I did had any effect on RaDa's communication behaviour.
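
The suspicion that the block of data was compressed or encrypted can be backed up with a byte
entropy estimate. This was not part of the analysis described above; it is a minimal sketch of
the standard Shannon entropy calculation, with a made-up function name.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Compressed or encrypted data usually scores close to 8 bits per byte, while ordinary x86 code
and text score noticeably lower, so computing the value per section or per block points to the
packed part of a file.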

5. Thanks & Greetings
======================

Thanks to Felix von Leitner for motivation.
Greetings to the attendees and lecturers of the Summerschool Applied IT Security 2004
in Aachen, Germany.