Wednesday, July 16, 2008

Notes from DIMVA 2008

While attending [DIMVA 2008](http://www.dimva2008.org/), I took notes, as I did when attending [DeepSec](http://carolina.mff.cuni.cz/~trmac/blog/2007/notes-from-deepsec-2007/) last year. Slides from the talks are not currently available; the web site only lets you buy the proceedings. I hope the notes are useful.



### Dynamic Binary Instrumentation-based Framework for Malware Defense

Presents a mechanism for running untrusted programs using dynamic binary instrumentation (using [Pin](http://rogue.colorado.edu/pin/)). Two separate Xen domains / VMware hosts are used: a testing environment, in which the program is instrumented to collect information about execution traces and basic blocks, and a production environment, which uses a generated policy to make sure the program does not deviate from previously observed behavior.

The program is run in the testing environment (using automatically-generated and manually-prepared inputs), and execution traces (information about basic blocks and input data, e.g. system call arguments) are generated. These traces are searched for specified malicious behavior, and a "policy" that describes allowed traces is generated. Execution in the testing environment is roughly 26× slower, but it was implied this step can be performed in the background or perhaps by a trusted server.

In the production environment, the instrumentation is much lighter, and verifies that the program uses only the allowed traces (traces that were not observed in the testing environment cause the program to be stopped). The overhead is roughly 20%.
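
To make the idea concrete, here is a minimal sketch (my own, not the paper's code) of what such a policy could look like: the testing environment writes out basic-block traces, the policy is simply the set of observed block transitions, and the production-side check rejects anything else.

```python
# Hypothetical trace format: one basic-block address per line, hex-encoded.
def build_policy(trace_files):
    """Collect the allowed (previous_block, next_block) transitions
    observed during the testing-environment runs."""
    allowed = set()
    for path in trace_files:
        previous = None
        with open(path) as f:
            for line in f:
                block = int(line.strip(), 16)
                if previous is not None:
                    allowed.add((previous, block))
                previous = block
    return allowed


def check_transition(policy, previous, current):
    """Called from the (much lighter) production instrumentation."""
    if previous is not None and (previous, current) not in policy:
        raise RuntimeError("unobserved trace: 0x%x -> 0x%x" % (previous, current))
```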

The main drawback is that unexpected traces in the production environment were observed in 6.8% of executions, which is way too high. The paper also does not mention anything about the size of the policy, I guess this might be a problem for multi-megabyte executables. Overall, I don't know what this achieves compared to running the application in some kind of sandbox—a sandbox that has less than 20% overhead sounds very plausible. Some concerns were also presented about the possibility of malware detecting that it is being instrumented.

### Learning and classification of malware behavior

Attempts to classify malware by its observed behavior (e.g. names of opened files or created registry keys). Automatic clustering of the observed behavior does not give very good results, so the authors instead "machine-learn" the labels assigned to the malware by COTS anti-virus products. When learning, some operations are generalized (e.g. `open("\\windows\\system\\abcdef")` is treated as `open("\\windows\\system\*")`) to capture common patterns.
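
A toy version of the generalization step might look like this (my reading of the idea, not the paper's code): concrete arguments are rewritten into patterns before feature extraction, so that all variants of a family yield the same behavioral feature.

```python
import re

def generalize(operation, path):
    """Turn a concrete operation into a generalized behavioral feature."""
    path = path.lower()
    # Collapse file names under the Windows system directory into a wildcard,
    # so open("\windows\system\abcdef") becomes open("\windows\system\*").
    path = re.sub(r'^(\\windows\\system\\).*$', r'\1*', path)
    return "%s(%s)" % (operation, path)

assert generalize("open", r"\windows\system\abcdef") == r"open(\windows\system\*)"
```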

The model created by machine learning can be used to extract the common features of a malware family (e.g. the mutex name used by a worm). The model can identify new variants of existing malware families with 69% accuracy; when fed benign Windows binaries, all were correctly categorized as "unknown".

### Data space randomization

A source-to-source transformation that encrypts almost all data by XORing it with a random mask, using different masks for different objects if they do not alias through pointers, so an overflow of one buffer does not give the attacker control of other data. Execution overhead is 15% on average, up to 30%.
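
The actual system is a source-to-source C transformation; the Python fragment below only illustrates the effect of the transformed code on two non-aliasing objects with independent masks: an overflow that overwrites the neighbouring object's raw bytes decodes to garbage rather than an attacker-chosen value.

```python
import os

# Each non-aliasing object gets its own random mask.
mask_buf = int.from_bytes(os.urandom(4), "little")
mask_len = int.from_bytes(os.urandom(4), "little")

# Stores keep only the masked representation "in memory".
stored_buf = 0x41414141 ^ mask_buf   # some buffer contents
stored_len = 16 ^ mask_len           # an adjacent integer

# Every load in the transformed program unmasks with the object's own mask.
assert stored_len ^ mask_len == 16

# An overflow that overwrites stored_len with raw attacker bytes (without
# knowing mask_len) decodes to an unpredictable value, not 0xdeadbeef.
stored_len = 0xdeadbeef
print(hex(stored_len ^ mask_len))
```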

Only "overflow candidates" are XORed, the rest (scalars of which address is not taken) is physically separated using PROT_NONE pages (I'm worried what this would do to stack requirements, and the cost of mprotect() might be higher than the cost of encryption in some cases).

Currently does not handle shared libraries (a common aliasing model would have to be computed when loading the library, which might result in replacing several distinct masks by a single mask—especially difficult when loading libraries at run-time, after the distinct masks were already used). It is also likely that with `void *` or `char *` pointers and very large-scale software, especially if no source code is available for some libraries, eventually all or almost all data would alias and the protection gained by using different masks for different variables would vanish.

### XSS-GUARD: Precise Dynamic Prevention of Cross-Site Scripting Attacks

A transformation of Java `.class` files of web applications to create a "shadow" web page while creating the "real" output page. The shadow page contains the same literal strings as the real page, and safe data (`aaaa`) instead of any variables. Thus the shadow page has the same general structure as the real page, but it does not contain any successful XSS attacks—and because the safe data has the same length as the variable string written to the real page, matching document structure between the shadow and real pages is easy.

A Firefox parser is then used to search for scripts (in all the possible places) on the real page. If the corresponding script is not found on the shadow page, an XSS attack was detected. If a script was found, but it is not literally the same (e.g. when generating JavaScript data), the script is parsed and its syntactic structure is compared to detect XSS attacks.
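
A rough sketch of the comparison logic (mine, not the authors'; the real system uses Firefox's HTML parser, for which a crude regex stands in here). Because the shadow page replaces variable data with same-length filler, script offsets line up between the two pages.

```python
import re

SCRIPT_RE = re.compile(r'<script\b.*?</script>', re.IGNORECASE | re.DOTALL)

def find_scripts(page):
    """Map offset -> script text; a stand-in for the browser-grade parser."""
    return {m.start(): m.group(0) for m in SCRIPT_RE.finditer(page)}

def check(real_page, shadow_page):
    shadow_scripts = find_scripts(shadow_page)
    for offset, script in find_scripts(real_page).items():
        if offset not in shadow_scripts:
            raise ValueError("XSS: script at offset %d has no shadow counterpart" % offset)
        if script != shadow_scripts[offset]:
            # Same position, different text (e.g. generated JavaScript data):
            # here the real system parses both and compares syntactic structure.
            pass
    return True
```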

The performance overhead is up to 14% when comparing strings, and up to 42% when parsing JavaScript is necessary; considering that this eliminates all XSS attacks (unless the target browser parses HTML differently and detects scripts in more places), this might very well be worth it.

### VeriKey: A Dynamic Certificate Verification System for Public Key Exchanges

A certificate-handling policy that avoids showing the user the "unknown SSL certificate" dialog.

If the server provides a certificate signed by a trusted CA, accept it if its hostname has the same TLD (solving the `https://paypal.com/` vs. `https://www.paypal.com/` problem), reject it if the hostname has a different TLD. There's a risk of crossing trust boundaries if the TLD has subdomains with separate owners—the classic "`.co.uk` is not a TLD" problem.

If the certificate is not signed by a trusted CA, a trusted set of verification servers is used to connect to the SSL server, and the certificates returned to each are compared. If the verification servers are located worldwide, and selected to share as few common routes with the client and each other as possible, this prevents a MITM near the client (e.g. by the WiFi network provider), and makes a large-scale MITM difficult to implement because it would have to attack many distinct paths. The exception is a MITM very near the server (e.g. on the router of the server's LAN), in which case no attack is detected.
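
The comparison itself is simple; a sketch assuming the verification servers expose some RPC returning the certificate fingerprint they see (faked below by callables), using only the standard `ssl` module for the local side.

```python
import hashlib
import ssl

def local_fingerprint(host, port=443):
    """SHA-256 over the PEM certificate the client itself receives."""
    pem = ssl.get_server_certificate((host, port))
    return hashlib.sha256(pem.encode()).hexdigest()

def verify_via_vantage_points(host, fingerprint, vantage_points):
    """vantage_points: callables returning the fingerprint each verification
    server sees for `host` over its own network path (hypothetical RPC)."""
    remote = [vp(host) for vp in vantage_points]
    # Everyone sees the same certificate: accept despite the missing CA
    # signature. Any disagreement suggests a MITM near the client.
    return all(fp == fingerprint for fp in remote)
```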

Overall, this is less secure than ideal SSL, probably more secure than how SSL is used in practice even by IT-savvy people, and much more secure than how SSL is used by barely-knowledgeable computer users. Sending the names of SSL servers to the verification servers has privacy implications, and it's not clear who would build such an extremely trusted verification server network (perhaps some global companies, for internal use). The technique can be implemented in a browser, or at a LAN proxy by installing the proxy's CA certificate on clients and letting the proxy "perform a MITM", giving clients connections signed by the proxy's CA instead of the original connections with self-signed certificates; I'm not sure whether this would work with client certificates.

### Keynote talk: "The Future of Network Security Monitoring"

Various observations from the speaker's job at GE.

* In the real world, security is not a 100% requirement, only a trade-off: "Is it acceptable that x% of your laptops are under remote control?" — "What's the cost?" It is therefore important to have hard data on security.
* Companies focus on easy-to-measure but irrelevant "input" metrics (such as unpatched systems or systems with the latest anti-virus update installed) instead of the "output" metrics (intrusions detected) and metrics that truly affect outputs.
* Anti-virus software is "very" vulnerable, and widely installed, so it might be worth it not to install any—it is certainly not obvious that it should be installed. ("If one AV is good, why not two or three?")
* The saying that "80% of intrusions are caused by insiders" is not backed by any hard data; the last reference is a 1986 study.
* BGP hijacking is very common, IP addresses are not always reliable.
* Windows OS security is getting better recently, the biggest problem is currently third-party Windows applications.
* "Where else do victims investigate their own crime scene?"
* ("Eating carrots improves night vision" is marginally true, but not significant; It is a popular knowledge because it was published by UK war propaganda to divert attention from the fact they have invented a radar.)

### On Race Vulnerabilities in Web Applications

Race conditions ("x = read(A); write(A, x + 1)" without any locking) are quite common in web applications because programmers don't think about them as parallel programs. PHP does not support any locking mechanism and depends on locks in the database.

The authors have written a tool that logs SQL queries performed in PHP and analyzes them off-line to discover interdependencies (write after a read of the same attribute). Heuristics are used to avoid false positives caused by disjoint WHERE clauses (a constraint solver can be used to check whether WHERE clauses are disjoint, but not in the general case—it cannot handle nested queries). The tool does not support database synchronization primitives and does not examine anything but SQL.
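
A much-simplified version of the off-line analysis (my sketch; the real tool also compares WHERE clauses with a constraint solver): flag handlers that read a table and later write the same table with no locking query in between.

```python
import re
from collections import defaultdict

READ_RE  = re.compile(r'\bSELECT\b.*\bFROM\s+(\w+)', re.IGNORECASE)
WRITE_RE = re.compile(r'\b(?:UPDATE|INSERT\s+INTO)\s+(\w+)', re.IGNORECASE)
LOCK_RE  = re.compile(r'\b(?:FOR\s+UPDATE|LOCK\s+TABLES)\b', re.IGNORECASE)

def find_read_then_write(query_log):
    """query_log: list of (handler_name, sql_text) pairs in execution order."""
    reads = defaultdict(set)        # handler -> tables read so far
    candidates = []
    for handler, sql in query_log:
        if LOCK_RE.search(sql):
            reads[handler].clear()  # assume the handler serialized itself
            continue
        read = READ_RE.search(sql)
        if read:
            reads[handler].add(read.group(1).lower())
        write = WRITE_RE.search(sql)
        if write and write.group(1).lower() in reads[handler]:
            candidates.append((handler, write.group(1)))
    return candidates
```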

The tool has detected around 40 race conditions in common software (WordPress, phpBB), a few of them are security vulnerabilities.

### On the Limits of Information Flow Techniques for Malware Analysis and Containment

"Information flow technique" means observing how values are copied between variables, and determining whether a security-sensitive variable can be modified by untrusted data. The article argues that the tools to detect information flow might be useful for "regular" programs, but are very easy to defeat by malware—e.g. using control flow to transfer data instead of copying variable values explicitly.

### Embedded Malware Detection using Markov n-grams

"Embedded malware" = malware in "data" files; anti-virus software searches only some portions of files because searching them thoroughly would be too slow. Signature-based techniques are not even sufficient in some formats because the data might be split in several chunks in the file. (I guess the typical embedded malware would be based on an exploit of the data format decoder.) A detector that can quickly scan the whole file to suggest likely malware locations is proposed.

The detector is based on comparing the probability of 2-byte sequences with their conditional probability based on 1-byte probability; it produces very significant distinguishing values for malware (with a cutoff at 5σ).
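
A sketch of the scoring idea (the exact statistic below is mine, not the paper's): compare observed bigram frequencies in a sliding window against what the single-byte frequencies would predict, and flag windows whose deviation is more than 5σ above the file-wide mean.

```python
import math
from collections import Counter

def window_score(window: bytes) -> float:
    """KL-divergence-like deviation of bigram frequencies from the
    independence model built from unigram frequencies."""
    n = len(window)
    unigrams = Counter(window)
    score = 0.0
    for (a, b), count in Counter(zip(window, window[1:])).items():
        observed = count / (n - 1)
        expected = (unigrams[a] / n) * (unigrams[b] / n)
        if expected > 0:
            score += observed * math.log(observed / expected)
    return score

def suspicious_offsets(data: bytes, size=1024, step=512):
    scores = [window_score(data[i:i + size])
              for i in range(0, max(1, len(data) - size), step)]
    mean = sum(scores) / len(scores)
    sigma = (sum((s - mean) ** 2 for s in scores) / len(scores)) ** 0.5 or 1.0
    return [i * step for i, s in enumerate(scores) if s > mean + 5 * sigma]
```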

(The best results were on data formats that are mainly used for compression, such as MP3 or JPEG. It seems to me that the detector is basically searching the compressed data, which should be fairly similar to white noise, and looking for structured data.)

### Keynote talk: "From Virtual Machines to Virtual Infrastructure: How Virtualization is Reshaping the Enterprise and What this Means for Security"

A large part was an introduction to/advertisement for virtualization. Security-related observations:

* The increased flexibility removes some obstacles that previously prevented increases of system/network complexity.
* It is easy to keep running old software versions in VMs
* Many "transient", currently unnecessary VMs may be kept suspended and only running from time to time, so one-time infections may reappear, patches/anti-virus updates are applied late, and remote vulnerability scans do not find the suspended VMs. VM suspension also makes it difficult to expire passwords and keys.
* It is now possible to place VMs in a separate virtual network while installing patches—or install them "off-line" on the VM image.
* Machine ownership might not be as clear as in the physical world; people may clone VMs and give them to each other; in the extreme it might be necessary to disconnect a physical machine if the VM administrator cannot be found.
* Stateful firewalls might have problems with VM mobility
* VM data may be left behind on hosts after migration, even if it is security-sensitive—and the VM can not control this.
* VM rollback may interfere with external state (e.g. Windows machine passwords)
* VM rollback allows "impossible" replay attacks (e.g. S/Key authentication can be replayed after rollback)
* A VM may generate the same random numbers after rollback, which might be used to generate all sorts of keys.
* Inter-VM traffic is not observable on the internal LAN, so IDS must be on the network edge
* Plans to move IDS/anti-virus outside the VM to make it less vulnerable to attacks from within the VM
* Encrypted inter-VM communication can be observed by reading the internal state from the VM (can be used to observe encrypted malware communication)
* Recording/replaying VMs is quite cheap (56 Kb/s + 5–15% CPU), can be used for forensics

### Expanding Malware Defense by Securing Software Installations

Installation is an attractive attack vector because it runs as administrator.

The system runs installation in a sandbox, checks the results using a policy (e.g. "does not overwrite files owned by other packages"), then copies the installed files over from the sandbox and modifies the executables to execute in another sandbox.
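
As an example of the kind of policy check involved, here is a guess at the mechanics (not the paper's code) of the "does not overwrite files owned by other packages" rule on an RPM-based system.

```python
import os
import subprocess

def ownership_conflicts(sandbox_root):
    """Yield (target_path, owning_package) for every file the sandboxed
    installation would copy over a file already owned by an installed package."""
    for dirpath, _dirs, filenames in os.walk(sandbox_root):
        for name in filenames:
            relative = os.path.relpath(os.path.join(dirpath, name), sandbox_root)
            target = os.path.join("/", relative)
            owner = subprocess.run(["rpm", "-qf", target],
                                   capture_output=True, text=True)
            if owner.returncode == 0:      # rpm reports an owning package
                yield target, owner.stdout.strip()
```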

(IMHO malware running as a user can do enough harm, and is much simpler to install, making the total risk at least comparable to installing software as an administrator. Especially on the chosen Linux distribution platform, the vast majority of system-wide installed software is probably either present in the distribution (and thus implicitly trusted) or business-critical (e.g. Oracle), and there's comparatively little demand for permanently running installed software in a sandbox.)

### FluXOR: detecting and monitoring fast-flux service networks

(A fast-flux network is a set of zombies that proxy HTTP requests to the real server, hiding the location of the server. A domain under the attacker's control is used to direct clients to the zombies.) Describes the results of the authors' network monitoring: searching first for malicious hostnames, then for domains that contain these hostnames, and enumerating the zombies reachable through each domain.
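
A crude version of the zombie-enumeration part (my sketch, skipping the TTL check because the standard `socket` module does not expose it): a fast-flux domain keeps returning new, unrelated A records, so repeatedly resolving it accumulates a suspiciously large set of IPs.

```python
import socket
import time

def collect_ips(hostname, rounds=10, pause=30):
    """Resolve `hostname` repeatedly and return all distinct IPs seen."""
    seen = set()
    for _ in range(rounds):
        try:
            infos = socket.getaddrinfo(hostname, 80, proto=socket.IPPROTO_TCP)
            seen.update(info[4][0] for info in infos)
        except socket.gaierror:
            pass
        time.sleep(pause)
    return seen   # dozens of unrelated addresses within minutes is suspicious
```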

Around 160 new domains are found daily. A commenter said that fewer than 25 botnets in total, probably around 10, are using DNS redirection.

### Traffic Aggregation for Malware Detection

Observes network traffic at the border and tries to find malware on the LAN by clustering data about hosts (destination, start of payload, host platform) and looking for unusual clusters (using AI techniques).

Host platform is identified using TTL data and by checking whether the host contacts the MS time server. The system usually identifies 2–3 clusters per hour of traffic, and the authors suggest the clusters can be examined manually.
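
A very rough stand-in for the aggregation step (the paper uses proper clustering; this just groups flows by the coarse features mentioned above and flags groups with many internal hosts).

```python
from collections import defaultdict

def suspicious_aggregates(flows, min_hosts=10):
    """flows: iterable of (internal_host, dst_ip, payload_prefix, platform)."""
    groups = defaultdict(set)
    for host, dst, prefix, platform in flows:
        groups[(dst, prefix[:16], platform)].add(host)
    # Many distinct internal hosts sharing destination, payload start and
    # platform is the kind of cluster handed to an analyst.
    return {key: hosts for key, hosts in groups.items() if len(hosts) >= min_hosts}
```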

### Rump session

Lots of small presentations, some random notes:

* There is an EU-funded project to develop a safe OS for the internet (with a 2-year evaluation period). One of the projects is SPACLik, based on SELinux and a "high-level security policy language". (Won't the system be too obsolete after it is developed and evaluated?)
* MS patch Tuesday is quite noticeable on statistics of zombie availability
* Hyperjacking (using HW virtualization for hiding malware) is detectable because running in a VM is detectable. A researcher (from IBM) suggests always running a "preventive" hypervisor that prevents installing another one, but allows running trusted "nested" hypervisors.
* It was argued that the obsession with small, trusted code that will "solve all security problems" is misguided: microkernels don't help because exploiting one of the basic trusted servers is quite sufficient—and Linux demonstrates that a large monolithic kernel can have a quite small attack surface. "Solving security" by slowing systems down by tens of percent simply doesn't pay; security breaches are too rare to make it worth it. Hardware appliances are not secure because they are simple (they are actually millions of lines of Verilog), but because they have a small attack surface.

### The Contact Surface: A Technique for Exploring Internet Scale Emergent Behaviors

Describes grouping TCP flow data and graphing log(number of source IPs) vs. log(number of destination IPs). The graph showed large anomalies that abruptly stopped, caused by SYN-only flows. A hypothesis (consistent with the graph trace) is that the anomaly is caused by randomized scanning of the whole internet, and is observable only at such a large scale (several /8 networks); at smaller scales there would be nothing noticeable on the graph.
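
The plotted quantity is easy to reproduce from flow records; a sketch (assuming flows arrive as simple tuples) of computing the points of the log-log graph per time bin.

```python
import math
from collections import defaultdict

def contact_surface_points(flows, bin_seconds=300):
    """flows: iterable of (timestamp, src_ip, dst_ip); returns one
    (log #distinct sources, log #distinct destinations) point per time bin."""
    bins = defaultdict(lambda: (set(), set()))
    for timestamp, src, dst in flows:
        sources, destinations = bins[int(timestamp) // bin_seconds]
        sources.add(src)
        destinations.add(dst)
    return [(math.log(len(s)), math.log(len(d)))
            for s, d in bins.values() if s and d]
```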

### The Quest for Multi-headed Worms

Describes using correlation and other techniques to detect that distinct attacks are caused by a single worm that randomly switches between attack modes.

### A Tool for Offline and Live Testing of Evasion Resilience in Network Intrusion Detection Systems

Creates traces of ambiguous packet data and feeds them to IDSes to compare their behavior. The results show that Snort reports many useless messages and misses real evasion attempts.
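
As an illustration of what such an ambiguous trace can look like (my guess at one of the classic Ptacek/Newsham-style cases, not necessarily what the tool emits): two TCP segments claiming the same sequence range with different payloads, written with scapy.

```python
from scapy.all import IP, TCP, Raw, wrpcap

def overlapping_segments(src="10.0.0.1", dst="10.0.0.2", seq=1000):
    """Two segments with identical sequence numbers but different payloads;
    the IDS and the end host may reassemble different streams."""
    header = IP(src=src, dst=dst) / TCP(sport=1234, dport=80, flags="PA", seq=seq)
    benign = header / Raw(b"GET /index.html HTTP/1.0\r\n\r\n")
    attack = header / Raw(b"GET /../../../etc/passwd HTTP/1.0\r\n\r\n")
    return [benign, attack]

wrpcap("overlap.pcap", overlapping_segments())   # feed to the IDS under test
```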
