Tuesday, January 25, 2011

Notes from 27th Chaos Communication Congress - day 4

Here are some notes from the final day of the 27th Chaos Communication Congress. See also
day 1, day 2, day 3.

PDF


Can have a PDF that displays different output depending on OS/Locale - even
without JavaScript. Can do a lot from JavaScript, Postscript if "signed by
trusted cert". PDF is a container - can contain Flash that is auto-started.

PDF streams: ambiguous syntax for determining input size - can overlap other
data in the file. Document metadata is readable and writable in
JavaScript. Lots of metadata that can be used for storing arbitrary data.
Redefinitions of a PDF object - last one wins, even ignores the xref table.

%PDF header can be anywhere in first 1024 bytes => can make a file that is
both valid PDF and {ZIP, EXE, GIF, HTML}.
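
A minimal sketch of the lenient header check described above, assuming a reader that accepts the `%PDF` marker anywhere in the first 1024 bytes. The function name and the sample bytes are hypothetical; real readers differ in how strict they are.

```python
def pdf_header_offset(data: bytes, window: int = 1024) -> int:
    """Return the offset of the %PDF marker inside the first `window` bytes, or -1."""
    # Only the first `window` bytes are searched, so the whole marker
    # has to lie inside that window to be found.
    return data[:window].find(b"%PDF")


# Hypothetical polyglot-style input: a GIF signature first, a PDF header later.
sample = b"GIF89a" + b"\x00" * 100 + b"%PDF-1.4\n"
offset = pdf_header_offset(sample)
```

A file whose first bytes satisfy another format's parser can still pass such a check, which is what makes the combined PDF/ZIP/GIF files possible.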

Antivirus evasion: Not tested - most AVs didn't find even a very common exploit; if
they did, then only using a generic signature. Various format confusion
methods confuse some AVs.

Lightning talks



  • TV-B-Gone for N900 - N900 has an IR xmitter
  • Monitoring spyware - CERT-Polska monitoring ZeUS (not a virus,
    needs to be dropped by something else).
  • SystemTap: "a code injection framework" - need only 3 lines to sniff IM from libpurple
    http://stapbofh.krunch.be/
  • "Hacking governments" - Talk proof of concept: graphing relationship structure of relevant people?
  • UAVP-NG: UAV ("quadcopter") PCBs + software, GPLv3: http://ng.uavp.ch/
  • Privacy: "transparency is more difficult than having things in the open"
  • http://incubator.apache.org/clerezza/: "Semantic web" linking social
    networks
  • bblfish.net = W3C federated social web XG: "WebID"


Data monitoring in terabit Ethernet


It is easy to monitor bus topology; point-to-point harder - especially duplex
(2x link capacity needed). Observing optical traffic can be done using a
splitter, but must be done for each direction separately. This is easier
with switches - can copy traffic, but combined traffic may be too much for
the analysis port.

"Data Mediation Layer": a device that collects observed traffic from >=1
sites, distributes it to >=1 analysis machines (based on rules): aggregation
(>1 input → 1 output), regeneration (1 input copied to >1 output),
distribution (depending on content), filtering (L2-L4), manipulation ("packet
slicing" = discarding packet content, masking (to hide sensitive data),
timestamping, annotation with input port number). The DML can filter out
most traffic and consolidate what remains (therefore fewer and weaker
analysis machines are necessary).
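
The DML functions listed above can be sketched as a small pipeline. This is only an illustration: the `Packet` type, the port-80 rule and the 4-byte slice length are hypothetical and not taken from any real device.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, Iterator, List


@dataclass(frozen=True)
class Packet:
    dst_port: int      # L4 destination port, used by the filter rule
    data: bytes        # raw packet bytes
    in_port: int = -1  # annotation: which input the packet arrived on


def aggregate(inputs: List[Iterable[Packet]]) -> Iterator[Packet]:
    """Aggregation: merge several inputs into one stream, annotating the input port.

    Inputs are drained one after another here; a real device would interleave them.
    """
    for port, stream in enumerate(inputs):
        for pkt in stream:
            yield replace(pkt, in_port=port)


def keep(pkts: Iterable[Packet], rule: Callable[[Packet], bool]) -> Iterator[Packet]:
    """Filtering: pass on only the packets that match the rule."""
    return (p for p in pkts if rule(p))


def slice_packets(pkts: Iterable[Packet], keep_bytes: int) -> Iterator[Packet]:
    """Packet slicing: keep only the first `keep_bytes` bytes of each packet."""
    return (replace(p, data=p.data[:keep_bytes]) for p in pkts)


tap_a = [Packet(80, b"GET / HTTP/1.1"), Packet(22, b"SSH-2.0")]
tap_b = [Packet(80, b"POST /login")]
out = list(slice_packets(keep(aggregate([tap_a, tap_b]), lambda p: p.dst_port == 80), 4))
```

Only the two port-80 packets reach the analysis side, each truncated and tagged with the input it came from.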

Examining existing filters: stored on device - could perhaps use serial TTY
for access?

Web GUI: firmware updates not automated. Gigamon allows getting the filter
list without authenticating.

DML machines have default accounts.

How the Internet sees you


Autonomous system = a network operated under a single policy: only sees what
passes through (assuming no cooperation between entities, or data collection
by law enforcement). A tap = mirror port, optical splitter, or a function in
the switch.

Surveillance (e.g. law enforcement): can see everything, but have to store /
analyze it all.

A "flow" = set of IP packets passing during a certain time interval = (src
IP, src port, dst IP, dst port); ~50B/flow, flow ID and data volume are
stored. "Netflow" = export of flow records: lower data rate, but no packet
contents, higher router overhead (could fill up the flow table?).
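
A rough sketch of a flow table keyed by the tuple above, storing only the data volume per flow. The time interval (flow expiry) is left out for brevity, and the addresses are documentation examples, not measured data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

FlowKey = Tuple[str, int, str, int]  # (src IP, src port, dst IP, dst port)


def build_flow_table(packets: Iterable[Tuple[str, int, str, int, int]]) -> Dict[FlowKey, int]:
    """Sum the bytes of each flow; every packet is (src_ip, src_port, dst_ip, dst_port, size)."""
    table: Dict[FlowKey, int] = defaultdict(int)
    for src_ip, src_port, dst_ip, dst_port, size in packets:
        table[(src_ip, src_port, dst_ip, dst_port)] += size
    return dict(table)


packets = [
    ("192.0.2.1", 40000, "198.51.100.7", 80, 1500),
    ("192.0.2.1", 40000, "198.51.100.7", 80, 500),
    ("192.0.2.2", 40001, "198.51.100.7", 443, 60),
]
flows = build_flow_table(packets)
```

Three packets collapse into two records, which is where the lower data rate comes from; the packet contents are gone.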

NetFlow v5: common, need v9 for IPv6. IPFIX ~ NetFlow v9: uses "information
elements" = (field ID, value) pairs.

Storage requirements: large ISP (2M flows/s: 2 PB/s all data, 4 TB/day
netflow).

sFlow: sampling - only e.g. 1/4000 packets => little data, low overhead
(only need to copy headers, not parse the packet), can miss data.
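
A simplified sketch of 1-in-N sampling that copies only the start of each sampled packet. Taking every N-th packet with a counter and the 64-byte header length are assumptions made for illustration; actual implementations may pick samples randomly.

```python
from typing import Iterable, Iterator


def sample_headers(packets: Iterable[bytes], rate: int = 4000,
                   header_len: int = 64) -> Iterator[bytes]:
    """Yield the first `header_len` bytes of every `rate`-th packet."""
    for i, pkt in enumerate(packets):
        if i % rate == 0:
            # Copy the header bytes only; the packet is never parsed.
            yield pkt[:header_len]


traffic = [bytes(1500)] * 8000  # 8000 dummy packets of 1500 bytes each
samples = list(sample_headers(traffic))
```

Anything that fits entirely between two samples is never seen, which is the "can miss data" part.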

Handling meaning of IP addresses: By logging DNS queries and answers, can
understand virtual hosts without parsing HTTP. Also can reveal otherwise
unannounced domains (completely new domain with millions of users probably
means malware).
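
A sketch of the idea above, assuming the DNS log is already available as (queried name, answered IP) pairs. The function names and the example hosts are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def build_ip_to_names(dns_log: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Map each answered IP address to the set of names that resolved to it."""
    names: Dict[str, Set[str]] = defaultdict(set)
    for name, ip in dns_log:
        names[ip].add(name)
    return names


def label_flow(names_by_ip: Dict[str, Set[str]], dst_ip: str) -> List[str]:
    """Return the candidate host names for a flow's destination IP."""
    return sorted(names_by_ip.get(dst_ip, ()))


dns_log = [
    ("www.example.org", "203.0.113.5"),
    ("blog.example.net", "203.0.113.5"),  # second virtual host on the same IP
]
names_by_ip = build_ip_to_names(dns_log)
labels = label_flow(names_by_ip, "203.0.113.5")
```

Matching against only the names the same client looked up shortly before the connection would narrow the candidates further.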

Using this, ISPs can do accounting/billing, but can also build a user's
profile - based on (restricted set of) used services, but also by connections
to automatic update servers (a signature visible on IP/DNS level).

Experiment on 27c3: 1/4k packets captured, anonymized IPs => can't do DNS, nothing
stored.

"If you want to be anonymous, be a sheep" so as not to stand out - no
clearing cookies, no Tor... "Do not connect to a known Tor exit - use a
special bridge". IPv6 privacy: enabled by default on Windows, disabled on
Linux - probably does not help anyway.

Open source tools: NFSen, ntop

RC4/WEP key recovery attacks


Goal: automated cryptanalysis tools.

Overview of WEP: Uses RC4 stream cipher = key stream generator to XOR with
plaintext. WEP can lose packets, so each packet is encrypted independently,
RC4 key = (secret key, 24b counter = IV), IV contained in the packet.

RC4: consists of key schedule and a pseudo-random generator. Key scheduling
has a strong bias towards the secret key. PRGA: Only 2 bytes in S are swapped
per PRG byte => can find biases of the PRGA, allowing us to guess at the
scheduled key from the keystream (then we can guess at keystream bytes).
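
For reference, the two RC4 stages described above, written out as a short sketch. The function names are illustrative, and `wep_style_keystream` assumes the WEP convention of prepending the 3-byte IV to the secret key; nothing here comes from the speakers' tools.

```python
from typing import List


def rc4_ksa(key: bytes) -> List[int]:
    """Key schedule: permute S = 0..255 under control of the key."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    return s


def rc4_prga(s: List[int], n: int) -> bytes:
    """Generator: produce n keystream bytes, swapping two entries of S per byte."""
    s = list(s)  # work on a copy so the scheduled state stays untouched
    i = j = 0
    out = bytearray()
    for _ in range(n):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)


def wep_style_keystream(iv: bytes, secret_key: bytes, n: int) -> bytes:
    """Per-packet keystream: the RC4 key is the IV followed by the secret key."""
    return rc4_prga(rc4_ksa(iv + secret_key), n)


# Encryption is an XOR of the plaintext with the keystream.
keystream = rc4_prga(rc4_ksa(b"Key"), 9)
ciphertext = bytes(p ^ k for p, k in zip(b"Plaintext", keystream))
```

Since the IV travels in the clear, an attacker knows the first 3 bytes of every per-packet key, and the first keystream bytes are the ones the bias searches concentrate on.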

Looking for more PRGA biases:

  • Starting with 1st keystream byte and all relevant inputs, statistically
    find biases. Too many possibilities still - restrict some values to {-1,
    0, 1} => Found new biases, with strength varying by round and values.
  • To try values other than {-1, 0, 1}, used a Fourier transform chosen
    such that the key schedule correlation applies, and found more biases.


"Black box approach": Just use first 256 bytes of keystream and key bytes as
a linear equation. This is too large => limit to first L bytes of both key
and output => found new biases again (new because we observe correlations
"caused" by PRGA rounds).

Attacks on WEP - improvements: Can recover sum of key bytes (again through a
bias). => To recover WEP key with P=1/2, need only 9,800 encrypted packets.
