
Buttercup: On Network-based Detection of Polymorphic Buffer Overflow Vulnerabilities

Archana Pasupulati, Jason Coit, Karl Levitt, S. Felix Wu

Department of Computer Science University of California, Davis

{pasupula, coit,levitt, wu}@cs.ucdavis.edu

S.H. Li, R.C. Kuo, Kuo-Pao Fan Computer Communication Laboratory

Industry Technology Research Institute {shli, rckuo}@itri.org.tw

Abstract: Attack polymorphism is a powerful tool that lets attackers on the Internet evade signature-based intrusion detection/prevention systems. At the same time, new and faster Internet worms can be coded and launched easily, even by high school students, at any moment against critical infrastructure such as DNS or update servers. We believe that polymorphic Internet worms will be developed in the future, and that many current solutions will have very little chance of withstanding them. In this paper, we propose a simple solution called “Buttercup” to counter attacks based on buffer-overflow exploits (such as CodeRed, Nimda, Slammer, and Blaster). We have implemented our idea in SNORT and included 19 return address ranges of buffer-overflow exploits. In a suite of tests against 13 TCPdump traces, the false positive rate for our best algorithm is as low as 0.01%. This indicates that, potentially, Buttercup can drop 100% of worm attack packets on the wire while sacrificing only 0.01% of the good packets.

I. Introduction

Because a signature-based Network Intrusion Detection System (NIDS) identifies an attack instance by exactly matching attack signatures against incoming and outgoing data packets, it may fail when a well-known attack is modified or transformed so that it no longer matches any entry in the signature database. We call such transformed attacks, all derived from one original attack for the purpose of IDS evasion, “polymorphic attacks”.


In this paper, we propose a new solution to accurately identify one particular type of polymorphic

attacks, known as polymorphic shellcode. Due to the space limitation, solutions for dealing with

other types of polymorphic attacks are discussed in [1].

In a polymorphic shellcode attack, the attacker chooses an unknown encryption algorithm to encrypt the attack code and includes the decryption code as part of the attack packet. The trick that makes this work is to use an existing buffer-overflow exploit and set the “return” memory address on the overflowed stack to the entry point of the decryption code. The attacker can transform every other bit in the packet payload to avoid detection by a signature-based IDS, but a critical constraint limits the range over which the “return” memory address can be varied. Our solution, Buttercup, simply identifies the ranges of possible return memory addresses for existing buffer-overflow exploits and raises a red/yellow flag if a packet contains such an address. To evaluate false positives, we modified SNORT and selected 19 exploits to run against 13 different TCPdump traffic files. For one of our range matching algorithms, the false positive rate is as low as 0.01%, while the other, simpler algorithms are all below 1.13%.

One significant motivation for the Buttercup project is to identify and drop Internet worm attacks at the edge of the Internet. Today, most existing solutions for worms require a human to analyze the worm binary code first and then develop signatures for catching the worm. Unfortunately, we believe that worm code analysis takes a significant amount of time, and by the time the puzzle is solved, the damage has spread widely and may be uncontrollable. If the attacker develops a polymorphic version of a worm, the analysis is much harder because we first need to understand an unknown encryption algorithm. However, since all of these worms (CodeRed, Nimda, Slammer, Blaster) use existing “buffer-overflow” exploits, if we can recognize the potential return memory address ranges, we can catch a worm from its birth. Furthermore, when a worm is identified by the Buttercup module, it should be dropped immediately. Based on our experience, we have to drop every single worm packet; otherwise, the worm will spread at very high speed. With Buttercup, we can drop all worms based on known buffer-overflow vulnerabilities while, according to our evaluation, only 0.01% of the good packets in the Internet are mistakenly dropped.

II. SNORT, a Signature-Based IDS

SNORT [2,3] is an open-source, lightweight, signature-based IDS, and it is representative of signature-based IDSs in general. Snort rules are simple to write, yet powerful enough to detect a wide variety of hostile or merely suspicious network traffic. The example rule below contains protocol, direction, port, and other attack-related information:

alert tcp any any -> 10.1.1.0/24 80 (content: "/cgi-bin/phf"; msg: "PHF probe!";).

There are, however, some weaknesses in a signature-based NIDS like SNORT, and these

weaknesses can be exploited by an attacker to evade the NIDS and to successfully attack his/her

target. SNORT has a preprocessor, spp_fnord, for detecting polymorphic shellcode, by searching

for a certain length pattern of no-op like characters, but it is port and length dependent. Please

note that a really skillful attacker can avoid or transform the no-op operations as well.
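The weakness of exact matching can be illustrated with a minimal Python sketch. It is not Snort code: `matches_signature` is a hypothetical helper that mimics only the `content` test of the PHF rule above, and the two request strings are made up for illustration. Real Snort also ships preprocessors that undo some encodings, so this shows the general evasion idea rather than a working bypass.

```python
def matches_signature(payload: bytes, content: bytes) -> bool:
    """Return True if the rule's content string occurs verbatim in the payload."""
    return content in payload


SIGNATURE = b"/cgi-bin/phf"

# A request that contains the signature verbatim.
original = b"GET /cgi-bin/phf?Qalias=x HTTP/1.0\r\n\r\n"
# The same request with one character percent-encoded ("p" -> "%70").
transformed = b"GET /cgi-bin/%70hf?Qalias=x HTTP/1.0\r\n\r\n"

print(matches_signature(original, SIGNATURE))     # True
print(matches_signature(transformed, SIGNATURE))  # False
```

A single changed byte is enough to defeat a purely verbatim match, which is the property polymorphic attacks exploit.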

III. Some Background about Buffer Overflow

On many C implementations, writing past the end of an array declared auto in a routine causes the

execution stack to get corrupted. This code is said to smash the stack [4] and can cause return

from the routine to jump to a random address. By placing our own code at a particular memory

location, and causing the return address variable on stack to point to that location, it is possible to

take over control of a system and obtain root privileges on it. Over the last few years, there has

been a great increase in the number of buffer overflow vulnerabilities being discovered and

exploited. Some of the examples of attacks exploiting buffer overflow vulnerabilities are Code

Red I, Nimda, SQL/Sapphire/Slammer, and Blaster worms.

Processes are divided into three regions: Text, Data and Stack. The text region is fixed by the

program and includes code (instructions) and read-only data. This region corresponds to the text


section of the executable file. This region is normally marked as read-only and any attempt to

write to it will result in a segmentation violation. The data region contains initialized and

uninitialized data. Static variables are allocated at load time on the data segment and dynamic

variables are allocated at run time on the stack.

A stack is an abstract data type, which has the LIFO (last in, first out) property i.e., the object that

has been placed last on the stack will be the first object removed. The stack is used to

dynamically allocate the local variables used in functions, to pass parameters to functions, and to

return values from functions as shown in Fig. 1. It also stores the return addresses for function

calls i.e. the address of the instruction to be executed after the return from the function call. This

is what makes it vulnerable.

A buffer is a contiguous block of computer memory that holds multiple instances of the same data

type. A buffer overflow [5,6,7] is the result of stuffing more data into a buffer than it can handle.

A typical example is when a function copies a supplied string into an allocated buffer space

without bounds checking by using a strcpy() instead of strncpy(). The contents of the supplied

string that do not fit into the allocated buffer space overwrite the bytes after the allocated buffer

space in the stack, including the return address. When the stored return address on the stack gets

replaced by some arbitrary value due to a buffer overflow, the function returns and tries to read

the next instruction from that address. This results in a segmentation violation.

bottom of memory /                                top of memory /
top of stack                                      bottom of stack

             buffer        sfp      ret      *str
<------   [          ]  [      ]  [      ]  [      ]

Fig. 1: Structure of a stack
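The effect of an unchecked copy on the frame in Fig. 1 can be simulated in Python. This is only a sketch under stated assumptions: the frame is modelled as a byte array laid out as buffer, sfp, ret; the buffer size of 8 and both stored addresses are made-up values; and `unchecked_copy` and `bounded_copy` are hypothetical stand-ins for the behaviour of strcpy() and a bounds-checked copy.

```python
BUF_SIZE = 8  # assumed buffer size, for illustration only


def make_frame() -> bytearray:
    """Simulated frame, low to high addresses: buffer[8] | sfp[4] | ret[4]."""
    frame = bytearray(BUF_SIZE + 8)
    frame[BUF_SIZE:BUF_SIZE + 4] = (0xBFFFF7A0).to_bytes(4, "little")      # made-up sfp
    frame[BUF_SIZE + 4:BUF_SIZE + 8] = (0x08048456).to_bytes(4, "little")  # made-up ret
    return frame


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copies every byte regardless of the buffer size (strcpy-like)."""
    n = min(len(data), len(frame))  # the simulation stops at the end of the frame
    frame[:n] = data[:n]


def bounded_copy(frame: bytearray, data: bytes) -> None:
    """Never writes past the end of the buffer."""
    n = min(len(data), BUF_SIZE)
    frame[:n] = data[:n]


def return_address(frame: bytearray) -> int:
    return int.from_bytes(frame[BUF_SIZE + 4:BUF_SIZE + 8], "little")


overflowed = make_frame()
unchecked_copy(overflowed, b"A" * 16)
print(hex(return_address(overflowed)))  # 0x41414141

intact = make_frame()
bounded_copy(intact, b"A" * 16)
print(hex(return_address(intact)))      # 0x8048456
```

After the unchecked copy the stored return address consists entirely of input bytes, which is the value the function would jump to on return.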

By sending a string that overflows a buffer and overwrites the return address on the stack with an address where the attacker has placed arbitrary code, an attacker can use the buffer overflow vulnerability to execute his/her own code. This kind of attack is mostly used by a malicious user to gain root access on a machine and to execute code on it. In most cases, a buffer overflow attack is simply used to spawn a shell, from which other commands can be issued. The machine-language instructions that spawn the shell, written in hexadecimal, are sent as part of the string used to overflow the buffer. This string is therefore called the shellcode.

Importance of return address: When a buffer overflow vulnerability is discovered, the most

important requirement for an exploit to work is to get the return address right. A buffer overflow

exploit involves loading shellcode onto the buffer we are overflowing and overwriting the return

address variable of the stack frame (which contains parameters to a function, its local variables,

and the data necessary to recover the previous stack frame, including the value of the instruction

pointer at the time of the function call) so it points back into the buffer. Hence, the address placed

in the return address variable would be a value within the address space allocated for the process

i.e., the shellcode is executed off the stack. If the shellcode occupies a portion of memory other

than the memory space of the program we are trying to exploit, a segmentation violation occurs.

The problem faced when trying to overflow the buffer of another program is figuring out the address at which the buffer (and thus the exploit code) will be. The answer is that, for every program, the stack starts at the same address. Most programs do not push more than a few hundred or a few thousand bytes onto the stack at any one time. Therefore, by knowing where the stack starts, one can try to guess where the buffer to be overflowed will be. The exploit program can take as parameters the buffer size and an offset from its own stack pointer (where we believe the buffer we want to overflow may live). This method of guessing the offset applies only to local buffer overflow exploits and to exploits that are run on the same operating system as the target machine.

However, guessing the offset, even when the beginning of the stack is known, is nearly impossible. The problem is that we need to guess exactly where our code starts. If we are off by one byte in either direction, we just get a segmentation violation or an invalid instruction. One possible solution is to pad the front of the overflow buffer with NOP instructions, which perform no operation; about half of the overflow buffer is filled with them. The shellcode is placed at the center, followed by the return addresses. If the return address points anywhere in the string of NOPs, they just get executed until execution reaches the shellcode. Assuming the stack starts at 0xFF, that S stands for shellcode, and that N stands for a NOP instruction, the new stack would look like this:

bottom of  EEEEEEEEEEE  EEEE  FFFF  FFFF  FFFF  FFFF     top of
memory     123456789AB  CDEF  0123  4567  89AB  CDEF     memory
           buffer       sfp   ret   a     b     c

<-----    [NNNNSSSSSSS][0xE2][0xE2][0xE2][0xE2][0xE2]
             ^                  |
             |__________________|

Fig. 2: Structure (graphical representation) of a buffer overflow exploit
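The toy layout of Fig. 2 can be checked with a few lines of Python. The sketch assumes the one-byte addresses of the figure (buffer starting at 0xE1, four NOPs followed by the shellcode); `valid_return_addresses` is a hypothetical helper, not part of any tool discussed here.

```python
def valid_return_addresses(buffer_start: int, nop_count: int) -> range:
    """Addresses that lead to the shellcode: any NOP, or the first shellcode byte."""
    return range(buffer_start, buffer_start + nop_count + 1)


fig2 = valid_return_addresses(0xE1, 4)
print([hex(a) for a in fig2])  # ['0xe1', '0xe2', '0xe3', '0xe4', '0xe5']
print(0xE2 in fig2)            # True: the return address used in Fig. 2
print(0xE6 in fig2)            # False: lands in the middle of the shellcode
```

Adding NOPs widens the set of workable return addresses, but the set always stays a contiguous range anchored at the buffer.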

IV. Polymorphic shellcode

Polymorphic shellcode [8] is basically a functionally equivalent form of a buffer overflow exploit

with a different signature on the network. The attack code is subtly transformed such that it looks

different from the known signature. As it hits the target machine, it reassembles, having eluded

the IDS [9].

A well-known tool that generates polymorphic shellcode is the polymorphic buffer-overflow engine ADMutate [10]. An attacker feeds ADMutate a buffer overflow exploit to generate hundreds or thousands of functionally equivalent exploits [11]. This is accomplished by using simple encryption techniques, along with the substitution of functionally equivalent machine-language instructions. This confuses many IDS tools (including Snort) that search for the familiar NOP sled or the known machine-language exploit included in buffer overflows, as ADMutate dynamically modifies these elements.

A buffer overflow attack script consists of three parts, a set of NOPs, the shellcode, and the return

address in the form [NNNN][SSSS][RRRR]. In polymorphic shellcode, the NOPs are replaced by

a random mix of no-effect instructions and the shellcode is encrypted differently each time, thus

making signature-based detection by an NIDS, that looks for NOPs or certain strings within the

shellcode, impossible. Having generated encoded shellcode and substituted NOPs, ADMutate


then places the decoder in the code. The shellcode is then of the form

[NNNN][DDDD][SSSS][RRRR], where “D” represents the decoder. It is not possible to detect

the decoder either since techniques such as multiple code paths, non-operational pad instructions,

out-of-order decoder generation and randomly generated instructions make it look different each

time. The use of sliding keys effectively eliminates the ability to recover the plaintext shellcode

by means of the reversible nature of xor. The only part of the script that remains constant through

each instance of a buffer overflow attack is the return address. In fact, even the return address is

modified by modulating its least significant bit, but when this is done, sometimes, the address

may no longer be valid when it hits the target. Hence, we intend to use this part of a buffer

overflow attack script in enabling an IDS to detect polymorphic shellcode.
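The observation that only the return address stays constant can be illustrated with a small Python sketch. It is deliberately simplified and is not how ADMutate works internally: a single-byte XOR stands in for the sliding-key encoding, the NOP substitutes and the decoder are omitted, the body is an inert placeholder string, and the return address 0xbffff05c is used only as a sample value. `xor_encode` and `make_variant` are hypothetical names.

```python
RET = (0xBFFFF05C).to_bytes(4, "little")      # sample return address
PLACEHOLDER_BODY = b"SHELLCODE-PLACEHOLDER"   # inert stand-in, not executable code


def xor_encode(data: bytes, key: int) -> bytes:
    """Single-byte XOR; a simplification of sliding-key encoding."""
    return bytes(b ^ key for b in data)


def make_variant(key: int) -> bytes:
    """[encoded body][return address x 4]; sled and decoder omitted."""
    return xor_encode(PLACEHOLDER_BODY, key) + RET * 4


a = make_variant(0x21)
b = make_variant(0x5A)
n = len(PLACEHOLDER_BODY)

print(a[:n] != b[:n])         # True: the encoded bodies differ
print(PLACEHOLDER_BODY in a)  # False: a body signature no longer matches
print(a[-16:] == b[-16:])     # True: the return address part is identical
```

Every variant looks different to a content signature, yet the trailing return addresses are byte-for-byte the same.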

V. Buttercup: an IDS architecture against Attack Polymorphism

As we saw above, one solution to the problem of determining the return address to exploit a buffer overflow vulnerability is to pad the front of the shellcode with NOP instructions. If the return address points anywhere within the NOPs, they just get executed until the exploit code is reached. With this method, the exploit can work for a certain range of offset values, since the return address can point anywhere within the string of NOPs.

Hence, for every buffer overflow vulnerability, the return address is overwritten with a value,

which can only lie within a certain range of values (the process’ address space). By determining

the address range for a particular buffer overflow exploit and looking for values that lie within

this range, in incoming packets, we hope to detect the exploit.

Determination of address range values: Determining a lower limit and an upper limit within which the return address can fall further reduces the range of values that need to be checked. The lower limit is the address at which the buffer starts, since the string sent to overflow the buffer begins at the start of the buffer and cannot be placed in memory at an address lower than that of the buffer.


Let’s take a look at the example we saw above (fig. 2). In this example, since the buffer starts at

address 0xE1, the lower limit of our address range would be 0xE1 and not any value lower than

that. Since the string in the example can be changed by increasing or decreasing the number of

NOPs, we try to determine a suitable range that would help us detect the attacks even if the

number of NOPs is changed.

In addition to having the form [NNNN][SSSS][RRRR], the attack script can also be of the form

[RRRR] [NNNN][SSSS], especially in cases where the buffer is small. In this case, the buffer and

the return address field are filled up with the address where the shellcode is to be found. The

attack in this case looks like this:

bottom of  DDDDDEEEEEEEEEEEE  EEEE  FFFF  FFFF  FFFF  FFFF     top of
memory     BCDEF0123456789AB  CDEF  0123  4567  89AB  CDEF     memory
           buffer             sfp   ret   a     b     c

<------   [0xF80xF80xF80xF8][0xF8][0xF8][NNNN][SSSS][SSSS]
                                    |           ^
                                    |___________|

Fig. 3: Structure of a buffer overflow exploit demonstrating the range of address values

In the case where the attack is of the form [NNNN][SSSS][RRRR], the upper limit would be the

(address of the return address field - length of the shellcode). In the case where the attack is of

the form [RRRR] [NNNN][SSSS], the upper limit would be (bottom of stack – length of the

shellcode). The higher of these two values is obviously the one in the second case.

We have thus determined that there is definitely an upper limit and a lower limit within which the

return address of the shellcode of a buffer overflow exploit should fall. The only task left now is

determining the address range. This range of values can aid in the detection of a particular buffer

overflow exploit.

An example: We modified the values, for the offset from the stack pointer and the number of

NOPs, in an exploit code that exploited a local buffer overflow vulnerability and found that there

definitely was a range of values within which the return address value had to fall. If the values of

the offset and number of NOPs were changed such that the return address value fell outside this


range, there was a segmentation fault. The lower value of the range was found to be 0xbffff62c,

which was the point where the buffer started, and the higher value was 0xbffff9c4.

By analyzing the exploit codes, one can determine the range of the return address values. The

solution we provide is to enable Snort to analyze packets and check for 32-bit values, which lie

within the range of addresses for a particular buffer overflow vulnerability.

Implementation of proposed solution: We implemented the solution by adding a new keyword called “range” to Snort-2.0.0. We call this implementation of our solution in Snort Buttercup. In Buttercup, a new detection plugin file named sp_range_check was included. It takes 32 bits at a time from the payload of the incoming packet, starting from the first byte, and compares the value against the two values provided for the “range” keyword. If the value lies within the range, the buffer overflow alert corresponding to those return address values is generated. Otherwise, the 32 bits starting from the next byte are compared with the two values. In this way, the entire packet is analyzed. The range values are obtained from the return address used for a particular buffer overflow exploit: initially, the lower limit is taken to be the return address value minus 200 and the upper limit the return address value plus 200. An example of a rule to detect a buffer overflow exploit using the range keyword is as follows:

alert tcp $EXTERNAL_NET any -> $HTTP_SERVERS $HTTP_PORTS (msg:"Intel PXE buffer overflow"; range:"|bfffef94-bffff124|";)

From the exploit code for the Intel PXE Buffer overflow, the return address of the shellcode was

determined to be 0xbffff05c. The lower limit was obtained by subtracting 200 from this value

(0xbffff05c – 200 = 0xbfffef94) and the upper limit was obtained by adding 200 to this value

(0xbffff05c + 200 = 0xbffff124). A rules file named my.rules was included in the rules directory

of snort and 22 rules were included for 22 different buffer overflow attacks, and the code was

tested for false positives. Among the 22 rules, 3 were later commented out since they generated a

lot of false positives.
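The range check and the +-200 limits described above can be sketched in Python as follows. This is an illustration, not the sp_range_check plugin itself: the function names are hypothetical, and the little-endian byte order is an assumption, since the text does not say how the 32 bits are interpreted.

```python
def range_limits(ret_addr, delta=200):
    """Lower and upper limits: the known return address minus/plus delta."""
    return ret_addr - delta, ret_addr + delta


def range_check(payload, low, high, byteorder="little"):
    """Slide a 4-byte window over the payload one byte at a time.

    Returns the offset of the first 32-bit value in [low, high], or -1.
    The byte order is an assumption; the text does not state it.
    """
    for i in range(len(payload) - 3):
        value = int.from_bytes(payload[i:i + 4], byteorder)
        if low <= value <= high:
            return i
    return -1


low, high = range_limits(0xBFFFF05C)
print(hex(low), hex(high))  # 0xbfffef94 0xbffff124

# Filler bytes followed by an address 4 bytes above the one in the exploit.
suspicious = b"\x90" * 10 + (0xBFFFF060).to_bytes(4, "little")
print(range_check(suspicious, low, high))                   # 10
print(range_check(b"GET /index.html HTTP/1.0", low, high))  # -1
```

Because the window advances one byte at a time, the address is found regardless of its alignment within the payload, at the cost of one comparison per payload byte.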


Among these, the Sendmail’s prescan buffer overflow, the XFree86 XLOCALEDIR buffer

overflow, the Linux ATM buffer overflow and the KON buffer overflow vulnerabilities are local

vulnerabilities i.e., they can be exploited by a local user to gain root privileges.

We also obtained a range for the Microsoft Windows RPC Buffer Overflow vulnerability, which

was exploited by the very recent Blaster worm that caused a lot of damage worldwide. We

obtained this range by studying some exploit codes for this vulnerability. The lower range and

higher range values were found to be 0x77d73713 and 0x77f92b63 respectively. The return

address values are different for different versions and service packs of the Windows operating

system, which the code exploits, and hence, these values were derived by subtracting 200 from

the lowest of the return address values and adding 200 to the highest of the return address values.

However, a rule for detecting this attack wasn’t added to our rules file before we performed all

the tests, since this vulnerability was exploited only recently.

Steps proposed to reduce false positives: To reduce the number of false positives further, two other keywords, ‘rangeoffset’ and ‘rangedepth’, were introduced. The value provided with ‘rangeoffset’ indicates the starting point in the packet payload from which the 32-bit values are checked. ‘rangedepth’ sets the maximum depth, measured from the beginning of the search region, that the range check function searches. The ‘rangeoffset’ and ‘rangedepth’ options are used as modifiers to rules using the ‘range’ option keyword. By carefully studying the buffer overflow

exploit code, we can determine the part of the shellcode in which the return address is placed and

thus provide values for the above two option keywords. We also used the ‘dsize’ option keyword,

already implemented in Snort, in order to flag alerts only for those packets that have payloads

whose length falls within a given range in addition to containing the particular return address

values. Using these three additional keywords, the number of false positives was brought down

considerably. An example of a Snort rule containing the ‘dsize’, ‘rangeoffset’ and ‘rangedepth’

keywords, in addition to the ‘range’ keyword, is as follows:


alert tcp $EXTERNAL_NET any -> $HTTP_SERVERS any (msg:"MSSQL2000 remote UDP exploit"; range:"|42ae1000-42b0caa4|"; dsize:475<>550; rangeoffset:97; rangedepth:20; )

The above rule is used for detecting attacks that exploit the MS SQL 2000 buffer overflow

vulnerability. We detect these attacks by looking for values lying between 42ae1000 and

42b0caa4, only in packets whose size falls in the range 475-550. Also, we look for these values

starting from the 97th character in the packet payload and only within 20 characters from the

starting point. As we can see from this example, this greatly reduces the amount of processing

that the IDS needs to do since it looks for the address values only in certain packets and only in

certain portions of those packets instead of searching all the packet payloads from start to finish.
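How the three modifiers narrow the search can be sketched in Python. The sketch rests on several assumptions that the text leaves open: the ‘dsize’ bounds are treated as inclusive, ‘rangeoffset’ is treated as a zero-based byte offset, the search region is the ‘rangedepth’ bytes starting at that offset, and values are read little-endian. `range_check_rule` and `make_payload` are hypothetical helpers, and the test packets are zero-filled dummies.

```python
def range_check_rule(payload, low, high, dsize=None,
                     rangeoffset=0, rangedepth=None, byteorder="little"):
    """Return True if a 32-bit value in [low, high] lies in the search region."""
    if dsize is not None:
        dmin, dmax = dsize
        if not (dmin <= len(payload) <= dmax):  # assumed inclusive bounds
            return False
    end = len(payload)
    if rangedepth is not None:
        end = min(end, rangeoffset + rangedepth)
    for i in range(rangeoffset, end - 3):
        value = int.from_bytes(payload[i:i + 4], byteorder)
        if low <= value <= high:
            return True
    return False


def make_payload(size, addr, offset):
    """Zero-filled dummy payload with one 32-bit address placed at `offset`."""
    data = bytearray(size)
    data[offset:offset + 4] = addr.to_bytes(4, "little")
    return bytes(data)


LOW, HIGH = 0x42AE1000, 0x42B0CAA4
opts = dict(dsize=(475, 550), rangeoffset=97, rangedepth=20)

print(range_check_rule(make_payload(500, 0x42B0C9DC, 97), LOW, HIGH, **opts))   # True
print(range_check_rule(make_payload(400, 0x42B0C9DC, 97), LOW, HIGH, **opts))   # False
print(range_check_rule(make_payload(500, 0x42B0C9DC, 300), LOW, HIGH, **opts))  # False
print(range_check_rule(make_payload(500, 0x42B0C9DC, 300), LOW, HIGH))          # True
```

The last two calls show the trade-off: the modifiers suppress a match outside the expected region, which reduces false positives but also means an exploit that moves the address elsewhere in the packet is missed.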

The values for the ‘dsize’ keyword were obtained by studying the exploit codes for each of the buffer overflow vulnerabilities mentioned above and determining the size of the shellcode. Similarly, the values for ‘rangeoffset’ and ‘rangedepth’ were obtained by studying the shellcode and determining exactly which parts of it contain the return address values. However, due to the complex nature of the shellcodes, we could not obtain values for the ‘dsize’, ‘rangeoffset’ and ‘rangedepth’ keywords for all of the exploits.

Buffer overflow vulnerability | OS/application | Low address value | High address value | Dsize<> | Range Offset | Range Depth
------------------------------|----------------|-------------------|--------------------|---------|--------------|------------
ATFTPd buffer overflow | Linux-Debian 3.0 | 0x08055544 | 0x080556d4 | 502<>570 | 248 | 16
Snort TCP Stream Reassembly Integer Overflow | Snort 1.9.1 | 0x0819fdfa | 0x0819ff8a | 3830<>3900 | 646 | 4
IIS WebDAV buffer overflow | Windows | 0x4142427c | 0x4142440c | 1014<>1050 | 922 | 102
MSSQL2000 Remote UDP exploit | Windows | 0x42ae1000 | 0x42b0caa4 | 475<>550 | 97 | 20
SQL/Sapphire/Slammer worm | Windows | 0x42b0c914 | 0x42b0caa4 | - | - | -
IIS5.0 .idq overrun | Windows | 0x77e51616 | 0x77e517a6 | 1048<>1100 | - | -
Code Red Worms | Windows (IIS) | 0x7801cb0b | 0x7801cc9b | 285<>350 | 245 | -
Kerio Personal Firewall buffer overflow | Windows | 0x780705c8 | 0x78070758 | 5267<>6050 | 5268 | 4
Sendmail’s prescan buffer overflow | FreeBSD | 0xbffe1e6e | 0xbffe1ffe | 3062<>3100 | - | -
File buffer overflow | Linux | 0xbfffbc78 | 0xbfffbe08 | 6180<>6250 | 0 | 20
PGP4Pine buffer overflow | Linux | 0xbfffdb08 | 0xbfffdc98 | 291<>350 | 256 | 45
ISC DHCPD buffer overflow | Linux | 0xbfffdd70 | 0xbfffdf00 | 240<>300 | - | -
XFree86 XLOCALEDIR buffer overflow | Xfree86 v 4.2.x | 0xbfffe7a5 | 0xbfffe935 | 5090<>6050 | 58 | -
Intel PXE buffer overflow | Linux-Red Hat | 0xbfffef94 | 0xbffff124 | 1018<>1050 | 1024 | 4
PoPToP PPTP Server buffer overflow | Linux | 0xbffff478 | 0xbffff648 | 490<>550 | 320 | 4
GKrellM buffer overflow | Linux-Debian 3.0 | 0xbffff703 | 0xbffff893 | 490<>550 | 156 | 4
Linux ATM buffer overflow | Linux | 0xbffff798 | 0xbffff928 | 242<>300 | 227 | 25
KON buffer overflow | Linux | 0xbffffef9 | 0xc0000089 | 790<>850 | - | -
WebAdmin.exe buffer overflow | Windows 2000 | 0xd6bf523 | 0xd6bf53cf | - | - | -

Table 1: Buffer overflow vulnerabilities included in the rules file, their address ranges and the values of ‘dsize’, ‘rangeoffset’ and ‘rangedepth’ keywords.

We hope that through deeper evaluation, these values can be obtained for all the exploits. Table 1 lists the buffer overflow vulnerabilities for which rules have been included in our version of Snort, along with the address ranges (in hex) we look for (when the range is +-200), and the values for the ‘dsize’, ‘rangeoffset’ and ‘rangedepth’ keywords for all of our rules.

The symbol ‘<>’ denotes that the address range is checked only in packets whose size falls within a certain range. The lower value of this range was determined by subtracting 10 from the size of the shellcode obtained from the buffer overflow exploit, and the higher value was obtained by rounding the shellcode size to the nearest 50. We also performed tests checking for address ranges only in packets whose size exceeds a certain value; in this case, the symbol ‘>’ is used with the ‘dsize’ keyword.

VI. Simulation and Analysis

In this section, we describe the various tests that were performed on Buttercup in order to

compare its performance with the original version of Snort. In order to determine the performance

of our IDS architecture against polymorphic shellcode, various parameters, such as ‘range’ and

‘dsize’ values, were changed in our implementation and the performance of Snort observed in

terms of processing time and percentage of alerts generated.

Simulation: For our simulation, approximately 50 real tcpdump files of network traffic were

obtained from the MIT Lincoln Laboratory IDS evaluation Data Sets. These tcpdump files were

provided as input to Buttercup, which included the ‘range’, ‘dsize’, ‘rangeoffset’ and

‘rangedepth’ keywords and 19 new rules. Buttercup was then tested for false positives on each of

these files.


In Table 2, we look at the total number of packets that each of the tcpdump files has, and we then

compare the number of alerts generated by the unmodified Snort against the number of alerts

generated by Buttercup. The version of Buttercup used in this case has rules that have the range

values of +-200 and do not include the ‘dsize’, ‘rangeoffset’ and ‘rangedepth’ keywords.

Table 3 depicts the results in the form of the percentage of alerts generated, i.e., (no. of alerts / no. of packets) expressed as a percentage, when several tcpdump files were taken as input by Buttercup. To observe how the number of alerts changes when the range values change, Table 3 presents the percentage of alerts for range values of +-50, +-100, +-200, +-250, +-300, +-400 and +-500.
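As a quick arithmetic check of how the percentages relate to the raw counts, the following Python sketch recomputes two of the +-200 entries from the packet and Buttercup alert counts reported for the same tcpdump files. `alert_percentage` is a hypothetical helper name.

```python
def alert_percentage(alerts: int, packets: int) -> float:
    """Alerts as a percentage of all packets in the trace."""
    return 100.0 * alerts / packets


# (Buttercup alerts, total packets) for two of the tcpdump files
print(round(alert_percentage(1064, 159658), 4))  # 0.6664  (inside.tcpdump-00)
print(round(alert_percentage(4242, 583050), 4))  # 0.7276  (outside.tcpdump-00)
```

Both values agree with the +-200 column, which confirms that the reported figures are percentages rather than fractions.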

Table 4 again depicts the change in the percentage of alerts, but this time the comparison is made between the following cases: the rules have the ‘range’ keyword alone; the rules have the ‘dsize’ keyword with the symbol ‘<>’ in addition to the ‘range’ keyword; the rules have the ‘range’, ‘dsize’, ‘rangeoffset’ and ‘rangedepth’ keywords; and the symbol ‘>’ is used with the ‘dsize’ keyword in place of ‘<>’.

Since, in the above two cases, we only want to concentrate on how many alerts Buttercup

generates due to the buffer overflow rules we have added, we only include our rules file my.rules

in the configuration file, snort.conf.

Finally, Table 5 depicts the change in the processing times of original Snort and Buttercup. In this

case, since we are concerned about how our modified Snort compares with the unmodified Snort,

we include all the rules files in the configuration file, snort.conf.

Fig. 5 and fig. 6 are graphical representations of the results presented in Table 3. Fig.7 is a bar

graph representing the results presented in Table 4.

Tcpdump files | Total no. of packets | No. of Snort alerts | No. of Buttercup alerts
--------------|----------------------|---------------------|------------------------
inside.tcpdump-00 | 159658 | 87 | 1064
outside.tcpdump-00 | 583050 | 132 | 4242
sampledata01-dump | 14523 | 38 | 32
tcpdins-00 | 649787 | 34056 | 2023
tcpdwk1mon-98 | 634595 | 174 | 7131
tcpdwk1tue-98 | 598569 | 165 | 6417
tcpdwk2wed-98 | 811678 | 169 | 6402
tcpdwk2thu-98 | 966468 | 273 | 9536
tcpdwk2fri-98 | 475060 | 37725 | 1423
tcpdinswk1mon-99 | 1492331 | 20394 | 7533
tcpdinswk1tue-99 | 1237119 | 3435 | 7161
tcpdinswk1wed-99 | 1726319 | 37316 | 8994

Table 2: Total no. of packets and no. of alerts generated by Snort and Buttercup for various tcpdump files

Table 3: Percentage of alerts generated by Buttercup for various address ranges and tcpdump files

Table 4: Percentage of alerts generated for various versions of Snort for a range value of +-200.

where

RANGE Tcpdump files +-50 +-100 +-200 +-250 +-300 +-400 +-500 inside.tcpdump-00 0.3488 0.6213 0.6664 0.6746 0.6927 0.7165 0.7973 outside.tcpdump-00 0.3967 0.6727 0.7276 0.7356 0.7541 0.7797 0.8174 sampledata01-dump 0.1928 0.2066 0.2203 0.2203 0.2203 0.2203 0.2617 tcpdins-00 0.1617 0.2846 0.3113 0.3176 0.3296 0.3421 0.3684 tcpdwk1mon-98 0.7181 0.9904 1.1237 1.1336 1.2203 1.2704 1.3104 tcpdwk1tue-98 0.6796 0.9422 1.0721 1.0804 1.1566 1.2009 1.2415 tcpdwk2wed-98 0.4927 0.7035 0.7887 0.8080 0.9022 0.9390 1.0755 tcpdwk2thu-98 0.5730 0.8782 0.9867 1.0081 1.1236 1.1945 1.3179 tcpdwk2fri-98 0.2450 0.2823 0.2995 0.3092 0.3360 0.3517 0.5054 tcpdinswk1mon-99 0.2630 0.4639 0.5048 0.5125 0.5337 0.5551 0.6076 tcpdinswk1tue-99 0.3039 0.5186 0.5788 0.5875 0.6125 0.6366 0.7188 tcpdinswk1wed-99 0.2678 0.4734 0.5210 0.5284 0.5431 0.5634 0.6031 tcpdinswk2mon-99 0.2670 0.4439 0.4927 0.4996 0.5172 0.5393 0.5749

Snort versions Tcpdump files

BC-range BC-range- BC-range- BC-range- BC-range- dsize<> dsize<>-RO-RD dsize> dsize>-RO-RD Inside.tcpdump-00 0.6664 0.0144 0.0138 0.5293 0.2468 Outside.tcpdump-00 0.7276 0.0245 0.0249 0.5987 0.2833 sampledata01-dump 0.2203 0 0 0.2203 0.0275 tcpdins-00 0.3113 0.0057 0.0051 0.2408 0.1039 tcpdwk1mon-98 1.1237 0.0077 0.0093 0.9642 0.3240 tcpdwk1tue-98 1.0721 0.0075 0.0092 0.9275 0.3229 tcpdwk2wed-98 0.7887 0.0067 0.0059 0.6779 0.2592 tcpdwk2thu-98 0.9867 0.0153 0.0110 0.8788 0.3857 tcpdwk2fri-98 0.2995 0 0 0.2804 0.0539 tcpdinswk1mon-99 0.5048 0.0138 0.0132 0.4020 0.1916 tcpdinswk1tue-99 0.5788 0.0122 0.0117 0.4210 0.2202 tcpdinswk1wed-99 0.5210 0.0106 0.0099 0.4083 0.1902 tcpdinswk2mon-99 0.4927 0.0107 0.0098 0.3845 0.1710

15

BC-range – Buttercup with only the ‘range’ keyword and a range of +-200.

BC-range-dsize<> – Buttercup with a range of +-200 and ‘dsize’ <> values (values derived from the size of the shellcode).

BC-range-dsize<>-RO-RD – Buttercup with a ‘range’ of +-200 and ‘dsize’ <> values (values derived from the size of the shellcode), with the ‘rangeoffset’ and ‘rangedepth’ keywords included.

BC-range-dsize> – Buttercup with a ‘range’ of +-200 and a ‘dsize’ > value (size of shellcode obtained from buffer overflow exploits).

BC-range-dsize>-RO-RD – Buttercup with a range of +-200 and a ‘dsize’ > value (size of shellcode obtained from buffer overflow exploits), with the ‘rangeoffset’ and ‘rangedepth’ keywords included.

Table 5: Processing times (in seconds) of different versions of Snort

                                    Snort versions
Tcpdump files       No. of packets  Snort-2.0.0  BC-range  BC-range-dsize<>  BC-range-dsize<>-RO-RD
phase-1-dump-00     40              0.311        0.301     0.308             0.314
phase-1-dump-2-00   4               0.156        0.181     0.162             0.160
phase-2-dump-00     158             0.12466      0.21532   0.12144           0.13130
phase-2-dump2-00    6               0.1394       0.1335    0.1330            0.1353
phase-3-dump-00     225             0.10749      0.19784   0.12670           0.42505
phase-3-dump-2-00   72              0.12826      0.19548   0.46062           0.47378
phase-4-dump-00     520             0.54868      0.36267   0.33663           1.63335
phase-4-dump-2-00   203             0.17332      0.53444   0.54236           0.76637
phase-5-dump-2-00   954             0.30983      0.36433   0.35798           0.48870
sampledata01-dump   14523           1.33127      5.76940   3.51988           3.43390
tcpdwk3mon-98       793256          73.11217     215.2422  219.2653          230.4325
tcpdwk3tue-98       393566          37.42337     135.6459  125.3525          149.3899

where

Snort-2.0.0 – original snort-2.0.0 with all rules files included in snort.conf.

BC-range – Buttercup with all rules files included in snort.conf and only the ‘range’ keyword with a range of +-200.

BC-range-dsize<> – Buttercup with all rules files included in snort.conf and the ‘range’ keyword with a range of +-200 and the ‘dsize’ (<> values) keyword.

BC-range-dsize<>-RO-RD – Buttercup with all rules files included in snort.conf and the ‘range’ keyword with a range of +-200 and the ‘dsize’ (<> values), ‘rangeoffset’ and ‘rangedepth’ keywords.

[Line graph "Percentage of alerts vs Range values". X-axis: range values (0 to 600). Y-axis: percentage of alerts (0 to 0.9). Series: inside.tcpdump-00, outside.tcpdump-00.]

Fig. 5: Graph showing change in percentage of alerts with change in address range values for 2 tcpdump files.

[Bar graph "Bar graph for percentage of alerts for various address range values". X-axis: tcpdump files (inside.tcpdump_00, tcpdwk2wed-98, tcpdinswk1wed-99). Y-axis: percentage of alerts (0 to 1.2). Series: +-50, +-100, +-200, +-250, +-300, +-400, +-500.]

Fig. 6: Bar graph showing percentage of alerts for various address range values for 3 tcpdump files.

[Bar graph "Bar graph showing percentage of alerts for various versions of Snort". X-axis: tcpdump files (inside.tcpdump_00, tcpdwk2wed-98, tcpdinswk1wed-99). Y-axis: percentage of alerts (0 to 0.9). Series: BC-range, BC-range-dsize<>, BC-range-dsize<>-RO-RD, BC-range-dsize>, BC-range-dsize>-RO-RD.]

Fig. 7: Bar graph showing percentage of alerts for various versions of Snort for 3 tcpdump files.

Performance: In Table 2, we see that the number of alerts generated by Buttercup is far greater than the number generated by Snort. This is most probably due to a large number of false positives generated by Buttercup, since the version of Buttercup used here does not contain the ‘dsize’, ‘rangeoffset’ and ‘rangedepth’ keywords.

From Table 3, we observe that as the address range values are increased from +-50 to +-500, there is a corresponding rise in the percentage of alerts generated. The rise is, however, not linear, as can be observed from the graph in Fig. 5. The percentage of alerts increases sharply between range values of 50 and 100, increases less sharply between 100 and 200, changes little between 200 and 300, and then increases more sharply again between 300 and 400 and between 400 and 500. From Fig. 6, which shows a bar graph comparing the percentage of alerts for the various address range values, we can see that the percentages for the range values of 200 and 250 are the closest in value. We therefore conclude that the optimum range values lie between 200 and 250.
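The step-by-step growth described above can be checked with a little arithmetic. The following is a minimal sketch that uses only the inside.tcpdump-00 row of Table 3; the function name `step_increases` is hypothetical and not part of Buttercup. Because the range steps are not equally wide, it reports both the raw increase and the increase per unit of range.

```python
# Percentage of alerts for inside.tcpdump-00, copied from Table 3.
RANGES = [50, 100, 200, 250, 300, 400, 500]
ALERT_PCT = [0.3488, 0.6213, 0.6664, 0.6746, 0.6927, 0.7165, 0.7973]


def step_increases(ranges, pcts):
    """Return (low, high, increase, increase_per_unit) for adjacent range values."""
    steps = []
    for i in range(1, len(ranges)):
        low, high = ranges[i - 1], ranges[i]
        increase = round(pcts[i] - pcts[i - 1], 4)
        steps.append((low, high, increase, increase / (high - low)))
    return steps


steps = step_increases(RANGES, ALERT_PCT)
for low, high, increase, per_unit in steps:
    print(f"+-{low} -> +-{high}: +{increase:.4f} ({per_unit:.6f} per unit of range)")

# The step with the smallest growth per unit of range.
flattest = min(steps, key=lambda s: s[3])
print("flattest step:", flattest[0], "to", flattest[1])
```

For this one trace, the flattest step is the one from +-200 to +-250, which is consistent with the optimum chosen above. Other rows of Table 3 can be substituted for `ALERT_PCT` to repeat the check.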

Table 4 compares the percentage of alerts generated by different versions of Buttercup for various tcpdump files. Fig. 7, which shows the same comparison as a bar graph for three of the files, makes the trend clearer: the percentage of alerts is greatest when only the ‘range’ keyword is used, lower when the ‘dsize’ keyword (with symbol ‘<>’) is included, and lowest when the ‘rangeoffset’ and ‘rangedepth’ keywords are also included. Hence, by studying a buffer overflow exploit carefully and determining the size of the shellcode and the part of the shellcode that contains the return addresses, the number of false positives can be brought down considerably, thereby enabling more accurate detection of buffer overflow attacks.
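To make the effect of each keyword concrete, here is a minimal sketch of how such a check might be applied to a single packet payload. It is not the Buttercup or Snort implementation. The function name, the parameter names, the little-endian 4-byte word scan at every byte offset, the inclusive size bounds, and the reading of ‘rangeoffset’ and ‘rangedepth’ as a start position and a window length are all assumptions made for illustration.

```python
import struct
from typing import Optional, Tuple


def payload_matches(payload: bytes,
                    ret_addr: int,
                    spread: int = 200,
                    dsize_between: Optional[Tuple[int, int]] = None,
                    dsize_greater: Optional[int] = None,
                    rangeoffset: int = 0,
                    rangedepth: Optional[int] = None) -> bool:
    """Hypothetical check: does the payload hold a word near ret_addr?"""
    size = len(payload)
    # 'dsize' with '<>': only consider payloads whose size lies in a window
    # (inclusive bounds are an assumption of this sketch).
    if dsize_between is not None:
        low, high = dsize_between
        if not (low <= size <= high):
            return False
    # 'dsize' with '>': only consider payloads larger than a given size.
    if dsize_greater is not None and size <= dsize_greater:
        return False
    # 'rangeoffset' / 'rangedepth': restrict the part of the payload scanned.
    end = size if rangedepth is None else min(size, rangeoffset + rangedepth)
    window = payload[rangeoffset:end]
    # 'range': look for any 32-bit word within +-spread of the return address.
    lo, hi = ret_addr - spread, ret_addr + spread
    for i in range(len(window) - 3):
        (word,) = struct.unpack_from("<I", window, i)  # little-endian assumed
        if lo <= word <= hi:
            return True
    return False


# Illustrative payload: NOP sled followed by repeated (made-up) return addresses.
ADDR = 0xBFFFF5A0  # hypothetical address, not taken from the paper
attack = b"\x90" * 100 + struct.pack("<I", ADDR + 50) * 4
print(payload_matches(attack, ADDR))                                   # range only
print(payload_matches(attack, ADDR, dsize_between=(100, 150)))         # plus dsize <>
print(payload_matches(attack, ADDR, rangeoffset=100, rangedepth=16))   # plus RO/RD
```

Each extra condition can only reject packets that the ‘range’ check alone would have flagged, which is why the alert percentage falls as keywords are added. The same property means that a wrong offset or depth can hide a real attack.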

However, the drawback of narrowing the part of the payload in which to look for address ranges is that some buffer overflow attacks may go undetected if the ‘rangeoffset’ and ‘rangedepth’ values are miscalculated or if the shellcode is modified considerably. The same behavior repeats when the symbol ‘>’ is used with the ‘dsize’ keyword, but the percentage of alerts is far greater than when the symbol ‘<>’ is used. Hence, calculating the range in which the size of the shellcode falls helps us detect buffer overflow attacks more accurately than just looking for packets that are larger than a given size. It must be pointed out that such a size range does exist, since there are few ways of modifying the size of a particular shellcode other than varying the number of NOPs included.
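The gap between the ‘<>’ and ‘>’ variants can be summarized by averaging each column of Table 4 over the 13 traces. The sketch below does only that; the averaging is an illustration added here rather than an analysis from the evaluation itself, and the names `TABLE4` and `column_means` are hypothetical.

```python
from statistics import mean

# Percentage of alerts per trace, copied column by column from Table 4
# (13 traces, in the table's row order).
TABLE4 = {
    "BC-range": [0.6664, 0.7276, 0.2203, 0.3113, 1.1237, 1.0721, 0.7887,
                 0.9867, 0.2995, 0.5048, 0.5788, 0.5210, 0.4927],
    "BC-range-dsize<>": [0.0144, 0.0245, 0, 0.0057, 0.0077, 0.0075, 0.0067,
                         0.0153, 0, 0.0138, 0.0122, 0.0106, 0.0107],
    "BC-range-dsize<>-RO-RD": [0.0138, 0.0249, 0, 0.0051, 0.0093, 0.0092,
                               0.0059, 0.0110, 0, 0.0132, 0.0117, 0.0099,
                               0.0098],
    "BC-range-dsize>": [0.5293, 0.5987, 0.2203, 0.2408, 0.9642, 0.9275,
                        0.6779, 0.8788, 0.2804, 0.4020, 0.4210, 0.4083,
                        0.3845],
    "BC-range-dsize>-RO-RD": [0.2468, 0.2833, 0.0275, 0.1039, 0.3240, 0.3229,
                              0.2592, 0.3857, 0.0539, 0.1916, 0.2202, 0.1902,
                              0.1710],
}


def column_means(table):
    """Average percentage of alerts for each Buttercup version."""
    return {version: mean(values) for version, values in table.items()}


means = column_means(TABLE4)
# Print versions from the highest to the lowest average alert percentage.
for version, avg in sorted(means.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{version:24s} {avg:.4f}")
```

Averaged this way, both ‘<>’ variants sit well below both ‘>’ variants, in line with the observation above.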

Table 5 compares the processing times of four different versions of Snort for different tcpdump files and also lists the number of packets in each file. The tcpdump files used here are not the same as the ones used in the previous three cases. These are the smaller tcpdump files, which did not generate many alerts and hence were unsuitable for evaluating Snort in the first three cases. They are, however, useful here. Because performance is compared against unmodified Snort with all its rules files included, all the rules files are included in snort.conf. With the larger tcpdump files, this configuration generates too many alerts and Snort halts, so those files cannot be used for determining the processing times.

It can be observed that the processing time increases sharply when the ‘range’ keyword is included, as compared to the unmodified version. However, when the ‘dsize’ keyword is included, the processing time decreases, since only packets whose payload size falls within a specific range are searched for the address ranges. We would expect the processing time to decrease further when the ‘rangeoffset’ and ‘rangedepth’ keywords are added, since the part of the payload in which to look for address ranges is narrowed further, but this does not happen. In fact, the processing time increases slightly. We attribute this to the extra processing involved in handling the two new keywords. This behavior holds for most of the tcpdump files but, as can be observed from Table 5, the results differ for some of them. The reason is that, owing to the complexity of some of the exploit codes, the ‘rangeoffset’ and ‘rangedepth’ values could not be determined for all the rules. Hence, some of the rules have just the ‘range’ and ‘dsize’ keywords, which leads to the inconsistency in the results across tcpdump files. A final observation is that as the size of the tcpdump files increases, the processing time increases significantly.
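As a rough illustration of the cost on the two largest traces, the sketch below derives the slowdown relative to unmodified Snort and the average time per packet from the Table 5 figures. Only the numbers come from the table; the names `TABLE5_LARGE`, `VERSIONS` and `summarize` are hypothetical, and the per-packet figure is a simple average that ignores start-up cost.

```python
# (packets, Snort-2.0.0, BC-range, BC-range-dsize<>, BC-range-dsize<>-RO-RD)
# Times are in seconds, copied from Table 5.
TABLE5_LARGE = {
    "tcpdwk3mon-98": (793256, 73.11217, 215.2422, 219.2653, 230.4325),
    "tcpdwk3tue-98": (393566, 37.42337, 135.6459, 125.3525, 149.3899),
}
VERSIONS = ("Snort-2.0.0", "BC-range", "BC-range-dsize<>",
            "BC-range-dsize<>-RO-RD")


def summarize(row):
    """Slowdown versus unmodified Snort and microseconds per packet."""
    packets, *times = row
    baseline = times[0]
    return {
        version: {
            "slowdown": seconds / baseline,
            "us_per_packet": seconds / packets * 1e6,
        }
        for version, seconds in zip(VERSIONS, times)
    }


for trace, row in TABLE5_LARGE.items():
    print(trace)
    for version, stats in summarize(row).items():
        print(f"  {version:24s} x{stats['slowdown']:.2f}  "
              f"{stats['us_per_packet']:.1f} us/packet")
```

On these two traces every Buttercup variant takes roughly three to four times as long as unmodified Snort, and the ‘dsize’ keyword reduces the time on one trace but not on the other, which matches the inconsistency noted above.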

VII. Conclusion

In this paper, we focus on the weakness of signature-based Network Intrusion Detection Systems in detecting polymorphic attacks. When a regular attack, for which an IDS already has a signature in its signature database, is modified or transformed, the IDS might fail to identify it correctly. The same principle can be applied to future Internet worm attacks that we have not seen before.

We present a new solution called “Buttercup” to counter attacks based on buffer-overflow vulnerabilities (such as CodeRed, Nimda, Slammer, and Blaster). We have implemented our idea in SNORT and included 19 return address ranges of buffer-overflow vulnerabilities. We introduce three new keywords in SNORT, namely ‘range’, ‘rangeoffset’ and ‘rangedepth’, and use them together with the existing Snort keyword ‘dsize’ to detect packets that potentially contain return address values lying within specific ranges. In an evaluation against a suite of 13 TCPdump traces, the false positive rate of our best algorithm is as low as 0.01%. This indicates that, potentially, Buttercup can drop 100% of worm and other attack packets on the wire while sacrificing only 0.01% of the good packets. We believe that our solution is simple and practical, since an exploit is normally known long before the worms based on that particular exploit are developed and launched.

Currently, Buttercup needs an accurate input of the return address ranges to be effective. For high-speed Internet worms, we are developing solutions that allow Buttercup to intelligently discover previously unknown address ranges. With this capability, we could even handle attacks with totally “unknown” exploits.

Acknowledgement

This research is sponsored by NSF and ITRI.
