Keynote
5 years ago speaker said at CCC "we have lost the war": a perfect storm:
post-9/11 paranoia, EU data retention, climate change. As of today, war not
actually lost: German constitutional court started protecting privacy
- OTOH Netherlands has a constitution, but not a court, so the majority can
ignore constitution => fearmongering and other aspects of usual politics;
Netherlands has now become a cautionary tale WRT privacy.
Political recommendations: watch party funding, "literally by all means
defend the constitution".
Speaker "mildly bipolar", was recommended anti-depressants - "being unhappy
has become socially unacceptable". If depression is a force that pushes us
to make painful but necessary changes, antidepressants prevent necessary
change - perhaps there might be a political-pharmaceutical complex one day?
Success on voting machines: Illegal in both Germany and Netherlands (Germany
"safe" - constitutional court, in Netherlands will need to fight this war
over and over again with each single mayor) E-voting in Brazil: black box,
gets an ID card from each voter - future versions will even collect a
fingerprint.
Wikileaks: Speaker did not participate in the latest release, "possible
ramifications scared the bejezus out of me", "I can't live from a backpack".
Important, but outcome is uncertain: "Not sure what has been unleashed" -
attacks on Internet freedom will certainly increase. US proposes to be able
to get plaintext from any service - "Crypto war 2.0 starting". "Anonymous is
getting on my nerves" - "real hackers would not release real names in PDF
metadata", lacking a "level of maturity" - "we" [at CCC] might attack, but
nothing good comes out of it.
Politicians don't know what's going on, can't control it, can only pretend
they are in control for the voters. "Hackers don't have the answers", but
understand the dangers of complexity - "lack of slack"; "CCC does not cause
chaos - we have prevented some aspects, and we understand chaos a little".
Living in a world of separate viewpoints/narratives, from "Apple, Google,
Facebook and the geographically-challenged traditional governments"
Future: Basic story remains - we lost the war. "It's going to be a mess":
"difficult times, not end times" - build trust relationships, diverse skill
sets, be flexible.
CCC logistics: too many of us, need to move out to be able to attract new
people. DEFCON given as a counterexample for expansion, probably does not
apply here - CCC never had their problems.
Code deobfuscation by optimization
http://code.google.com/p/optimice/
An IDA plugin to decode/simplify semantics on an obfuscated code, optionally
"assemble" into a new code segment for further processing in IDA.
Handling obfuscation: Small basic blocks, "push+return" to break IDA graphing
are simplified/converted. Fake paths (conditional jumps to nonsensical
bytes) are simplified. Overlapping instructions are duplicated.
Implementation: Build a CFG. For instruction semantics, use "MazeGen's XML
at ref.x86asm.net" to track all inputs/outputs. Optimizations performed: JMP
threading, conditional jump simplification, dead code removal, various
heuristics, e.g. push+ret->jmp. Constant propagation/folding unimplemented.
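A minimal sketch of two of the listed rewrites (push+ret->jmp and JMP threading) on a toy CFG. The tuple-based instruction format and the block labels are assumptions made for illustration only; this is not the plugin's code or data model.

```python
JUMPS = ("jmp", "jz", "jnz")

def rewrite_push_ret(insns):
    """Replace each 'push X; ret' pair with a single 'jmp X'."""
    out = []
    i = 0
    while i < len(insns):
        if (i + 1 < len(insns) and insns[i][0] == "push"
                and insns[i + 1] == ("ret",)):
            out.append(("jmp", insns[i][1]))
            i += 2
        else:
            out.append(insns[i])
            i += 1
    return out

def thread_jumps(blocks):
    """Retarget jumps that land on a block containing only another jmp."""
    def final_target(label, seen=()):
        body = blocks.get(label)
        if body and len(body) == 1 and body[0][0] == "jmp" and label not in seen:
            return final_target(body[0][1], seen + (label,))
        return label

    result = {}
    for label, body in blocks.items():
        new_body = []
        for insn in body:
            if insn[0] in JUMPS:
                new_body.append((insn[0], final_target(insn[1])))
            else:
                new_body.append(insn)
        result[label] = new_body
    return result

def simplify(blocks):
    """Apply push+ret rewriting first, then jump threading."""
    rewritten = {label: rewrite_push_ret(body) for label, body in blocks.items()}
    return thread_jumps(rewritten)

# Hypothetical obfuscated input: a push+ret "jump" into a chain of tiny blocks.
cfg = {
    "entry": [("push", "L1"), ("ret",)],
    "L1": [("jmp", "L2")],
    "L2": [("jmp", "L3")],
    "L3": [("nop",)],
}
simplified = simplify(cfg)
```

After this, the now unreferenced blocks L1 and L2 would be candidates for dead code removal.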
Contemporary Profiling of Web Users
Research on defeating web proxies / anonymizers / Tor, etc. "In private
communication research, dummy traffic was researched in the last 20 years and
has never been a solution"
Proxies that remove JavaScript:
We want to limit JS: it can e.g. get screen size, local date (=>clock offset
and drift). Existing proxy projects are dead.
The proxies 1) remove <script> and 2) move content out of <noscript>, both
using regular expressions. PHProxy attack: <noscript><scr</noscript>ipt>.
Glype: same attack on <object>. Also, Java can load JS as well:
...showDocument(..."javascript:..."...).
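To illustrate why the two regex passes interact badly, here is a minimal sketch. The function naive_filter is a hypothetical stand-in written for this example, not the actual PHProxy or Glype code; it only assumes the two steps described above run in that order.

```python
import re

def naive_filter(html):
    # Step 1: drop complete <script>...</script> elements.
    html = re.sub(r"<script\b.*?</script>", "", html, flags=re.I | re.S)
    # Step 2: unwrap <noscript> so its content becomes visible.
    html = re.sub(r"</?noscript>", "", html, flags=re.I)
    return html

# The tag name is split by <noscript> tags, so step 1 sees no <script> tag.
payload = "<noscript><scr</noscript>ipt>alert(1)</scr<noscript></noscript>ipt>"
filtered = naive_filter(payload)
print(filtered)
```

Step 2 removes the <noscript> tags and thereby reassembles the <script> element after the script filter has already run.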
When JavaScript is enabled, but filtered: When DOM modifications are used to
hide objects, it is usually possible to access the originals. NoScript can
forbid 3rd party JS for tracking, but code can still load 3rd party CSS.
Another specific problem is filtering <object archive=...>, but not
<object><param name=archive .../>.
Identifying users by web profile
Assuming an anonymizer, or watching users on a DNS server. Identifying the
user is easy with static IP; dynamic IP "should" protect (changing IP
address on DSL, or Tor changing the routing each 10 minutes).
Uses standard machine learning mechanisms for word frequencies to
learn on host access frequencies, using a "multinomial naive Bayes
classifier". In experiment, successfully identified 77% of "links" (user on
day D => user on day D+1). Accuracy is good even with 10-minute sessions
(i.e. Tor). Longer time between learning and classification doesn't hurt
much.
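A minimal sketch of the idea, treating hosts like words in a text classifier. The uniform prior over users, the Laplace smoothing, and the example host names are assumptions made for illustration; the talk's actual feature processing is not reproduced here.

```python
import math
from collections import Counter

def train(sessions_by_user):
    """sessions_by_user: {user: Counter(host -> number of requests)}."""
    vocab = set()
    for counts in sessions_by_user.values():
        vocab.update(counts)
    model = {}
    for user, counts in sessions_by_user.items():
        total = sum(counts.values())
        # Laplace-smoothed log-probability of each host for this user.
        model[user] = {
            host: math.log((counts[host] + 1) / (total + len(vocab)))
            for host in vocab
        }
    return model

def classify(model, session):
    """Return the user whose host profile best explains the session."""
    best_user, best_score = None, -math.inf
    for user, logp in model.items():
        # Hosts never seen during training are ignored.
        score = sum(n * logp[h] for h, n in session.items() if h in logp)
        if score > best_score:
            best_user, best_score = user, score
    return best_user

# Day D profiles (hypothetical data), then an unlabeled day D+1 session.
day_d = {
    "alice": Counter({"news.example": 8, "mail.example": 2}),
    "bob": Counter({"games.example": 7, "mail.example": 3}),
}
model = train(day_d)
linked = classify(model, Counter({"news.example": 3, "mail.example": 1}))
```

Linking a session on day D+1 to a profile from day D is then a single classify() call per session.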
Recommendations: Change IP address frequently and do not continue previous
activities after the change. Use _separate_ proxies for each activity.
Randomly distributing activity across multiple proxies does not help - each
proxy has similar data. Visiting only popular pages does not help much.
Detection of bots and other strange users
Motivation: heavy load by bots, proprietary databases crawled by
competitors.
- If load balancing: make it deterministic (e.g. md5 of client's IP), look
for "incorrectly" connecting users. This is trivial, but actually works -
many bots are lazy and just connect to host 0.
- Observe behavior: client that does not access images/JS, connects too often.
User-Agent:
Fake user agents are often too old (IE 5.5). HTTP header
characteristics (order/capitalization) allow quite specific user-agent
detection, and thus detection of faked User-Agents. Both techniques are
easier to apply "after" load balancing proxies because the proxies will
defragment the input, making evasion more difficult.
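The deterministic load balancing check can be sketched as follows. Taking the first 4 bytes of the MD5 digest modulo the number of backends is an assumption made for this example; the notes only say "md5 of client's IP".

```python
import hashlib

def expected_backend(client_ip, n_backends):
    """Backend index that a well-behaved client at this IP is sent to."""
    digest = hashlib.md5(client_ip.encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big") % n_backends

def is_suspicious(client_ip, contacted_backend, n_backends):
    """True if the client talked to a backend it was never directed to."""
    return contacted_backend != expected_backend(client_ip, n_backends)

# Example with a documentation-range address and 4 hypothetical backends.
ip = "192.0.2.17"
right = expected_backend(ip, 4)
wrong = (right + 1) % 4
```

A bot that ignores the redirect and always connects to host 0 is flagged for every IP whose expected backend is not 0.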
Local attacks on Tor
"Local" = connection between client and Tor entry node. Attempting web site
fingerprinting using traffic analysis only (timing, packet sizes), want to
see if a specified site was visited. Timing information is mostly useless
due to Tor's load and circuit changes; Tor "should" protect against size
fingerprinting due to fixed-size cells.
Machine learning again: Need to train this with each browser separately, and
extract separate requests (not mixed with unrelated sites). Using
"Multinomial naive Bayes", "Support Vector Machines", training on packet
sizes and direction, "ignoring ACKs" = counting the total size of a transfer
in one direction until the direction changes.
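The "ignoring ACKs" feature can be sketched like this. The input format (signed payload sizes, positive for client to server, negative for server to client, 0 for a pure ACK) is an assumption chosen for the example.

```python
def direction_bursts(packets):
    """Sum payload sizes per direction until the direction changes.

    packets: iterable of signed payload sizes; 0 means a pure ACK,
    which is skipped and does not count as a direction change.
    """
    bursts = []
    for size in packets:
        if size == 0:
            continue
        if bursts and (bursts[-1] > 0) == (size > 0):
            bursts[-1] += size  # same direction: extend the current burst
        else:
            bursts.append(size)  # direction changed: start a new burst
    return bursts

# Hypothetical trace: request, ACKed response in three cells, two requests.
trace = [512, 0, -512, -512, 0, -512, 512, 512]
features = direction_bursts(trace)
```

The resulting list of burst sizes (or a histogram of it) is what would be fed to the classifier.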
Detection accuracy when training against "all possible sites" is >95% on
OpenSSH, OpenVPN, IPsec. Tor started with 3%, can now get 55% => feasible.
"Jap": interestingly the problem is more difficult on the free version than
on the premium one.
Detection when distinguishing between a few "interesting" sites and "rest of
Internet": training with a representative sample for "rest". With 5
"interesting" sites need ~2000 samples for <1% false positives, will get
67% false negatives.
Recommendations: Do multiple things at the same time (tabs, Internet
radio...) - decreases success to ~10%.
Automatic identification of crypto primitives in software
http://code.google.com/p/kerckhoffs/
A master's thesis, limitations: Assumes no obfuscation, no JIT/interpretation,
only limited to crypto (nothing else, and not block cipher mode).
Existing tools: signature based, or dynamic instrumentation measuring
percentage of "bitwise" operations (globally / in a basic block / function),
looking for loops that change entropy.
Implementation: Intel's PIN for dynamic instrumentation to get insn-level
execution trace. Then reconstructs CFG ("dynamic" = including indirect
jumps), jump taken/not taken statistics, detects loops, memory ranges/areas.
Algorithm identification methods:
- Excessive use of bitwise instructions.
- Sequences of (instruction mnemonics, constant operand) pairs, find fingerprints - e.g. combinations unique for an implementation.
- Loops (X often unrolled): observe (number of executions of the loop as a whole, number of iterations, number of instructions in a loop).
- Look for a specific relation (e.g. AES (input, key, output)) between blocks of data with a suitable size
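The first method (excessive use of bitwise instructions) can be sketched over an instruction trace that has already been grouped into basic blocks. The mnemonic set, the 0.55 threshold and the minimum block length are assumptions picked for illustration, not values from the thesis.

```python
BITWISE = {"and", "or", "xor", "not", "shl", "shr", "sar", "rol", "ror"}

def bitwise_ratio(mnemonics):
    """Fraction of instructions in a block that are bitwise operations."""
    if not mnemonics:
        return 0.0
    hits = sum(1 for m in mnemonics if m.lower() in BITWISE)
    return hits / len(mnemonics)

def flag_blocks(blocks, threshold=0.55, min_len=8):
    """Return names of blocks that look like crypto by the bitwise heuristic."""
    return [
        name for name, mnemonics in blocks.items()
        if len(mnemonics) >= min_len and bitwise_ratio(mnemonics) > threshold
    ]

# Hypothetical basic blocks taken from an execution trace.
blocks = {
    "round": ["xor", "rol", "and", "mov", "xor", "shr", "or", "add", "xor", "ror"],
    "glue": ["mov", "push", "call", "add", "pop", "ret", "mov", "cmp"],
}
suspects = flag_blocks(blocks)
```

The minimum length avoids flagging tiny blocks where a single xor already dominates the ratio.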
libUSB
A generic introduction to USB - releases, limitations, transfer types, endpoints, descriptors, usage of libusb on various OSs.
Desktop on Linux
From the PoV of a UNIX sysadmin... It seems difficult to keep up with the
technology changes. Presentation overshadowed with explanations by Lennart
Poettering.
Distributions focusing on "Dumbest Assumable User", few sysadmin controls
available/visible.
MM frameworks: too many layers (=> loses relevant information on the way with
Phonon+GStreamer+PulseAudio). GStreamer backend to Phonon unmaintained,
still used by default in "some distributions"
GDM: complicated - why do we need a full GNOME session? (Answer: a11y pulls
audio, which pulls bluetooth, ...; g-p-m necessary for default power
policy). GDM doesn't handle systems with many users well; can't show all 3.5k
names in any case. When that is disabled, still shows recent users and
"Other" - users mistake login screen for a screen lock.
ConsoleKit: Sorry state of documentation: "Defining the problem: To be
written" after all these years. Intended to manage separate seats, but ACL
changes illusory without revoke(). Not robust - changes persistently-stored
ACLs but keeps only in-process state => ConsoleKit crash leaves around
obsolete ACLs.
D-Bus complaints: Nonsensical name spacing ("you need a domain", "narcissistic
naming": based on project name, doesn't tell what it actually does). Would
like implementation-independent interfaces [X where to get the
implementations?]. TCP transport: "no authentication, no authorization, no
encryption". (Lennart: ['we' agreed that] "D-Bus won't be used across the
network full stop")
IPv6
Many "old" vulnerabilities were carried forward from IPv4, previously
presented:
- Neighbor discovery spoofing <-> ARP spoofing
- Duplicate address detection DoS [answer "this is a duplicate" to everything]
- Rogue autoconfiguration server <-> rogue DHCP server
Routerless networks: Sending a router advertisement with 0 lifetime "kills"
the router for clients. Per RFCs clients treat any address as _link-local_
if no router exists.
Unexpected RA on an IPv4-only network: switches on dual stacks. Thus we can
bypass IPv4-only firewalls, can MITM on IPv6 because IPv6 transport is
preferred to IPv4.
RA flooding: 1m bogus RAs DoSes Cisco, Windows, old Linux (100% cpu).
Remote ping scans: Originally thought infeasible due to large address space,
broadcast doesn't exist. But we can still use search engines, DNS, common
addresses. Randomly chose 17k DNS names. The following "host address" (= host part
within network) sources exist:
- Autoconfiguration: either link-local = based on MAC (can guess if you know
the (company standard) manufacturer), or "Privacy option" = random and
changing from time to time
- DHCP: allocated sequentially! => "if you got one, you got all"; common
ranges based on example documentation
- Manually configured: ::1, ::2, ..., or ::service_port, IPv4 address, and
simple variants of these.
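The host-part sources above translate into a short candidate generator. This is a sketch under assumptions: the port list, the number of low addresses, and reading "::service_port" as the decimal port written in hex digits (port 80 becomes ::80) are choices made for this example, and the prefix is from the documentation range.

```python
import ipaddress

def eui64_interface_id(mac):
    """Modified EUI-64 host part: flip the universal/local bit, insert ff:fe."""
    b = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    eui = bytes([b[0] ^ 0x02]) + b[1:3] + b"\xff\xfe" + b[3:6]
    return int.from_bytes(eui, "big")

def candidates(prefix, ipv4=None, ports=(22, 25, 53, 80, 443), low=16):
    """Guess likely manually configured addresses inside a /64 prefix."""
    base = int(ipaddress.IPv6Network(prefix).network_address)
    ids = list(range(1, low + 1))                # ::1, ::2, ...
    ids += [int(str(p), 16) for p in ports]      # ::80, ::443, ...
    if ipv4 is not None:
        ids.append(int(ipaddress.IPv4Address(ipv4)))  # embedded IPv4 address
    seen, out = set(), []
    for host_id in ids:
        if host_id not in seen:
            seen.add(host_id)
            out.append(ipaddress.IPv6Address(base + host_id))
    return out

guesses = [str(a) for a in candidates("2001:db8::/64", ipv4="192.0.2.10")]
mac_based = ipaddress.IPv6Address(
    (0xFE80 << 112) | eui64_interface_id("00:11:22:33:44:55"))
```

For MAC-based addresses the search space shrinks to the low 24 bits once the manufacturer prefix is known, which is why knowing the company standard vendor helps.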
Overall, can easily guess ~70% of host addresses. A scan only needs to try
~2100 host addresses (1-20 seconds) to get 70-80% of hosts, similarly try
~1500 common host names. A scan may return a router's "not available"
message for a different network, giving us more targets. We can iterate
between guessing hosts on a network, and using reverse DNS to get more
starting points. Altogether we can identify ~90-95% of servers (not counting
other kinds of hosts).
Multicast DoS: Multicast background: A "query" router periodically prompts
for confirmation of existence of multicast receivers. We can spoof
"unsubscribing" message, but this will cause another prompt and resumption of
traffic. If we become the "query" router, we can avoid sending the prompt.
"Query" is voted by local link address => 0000000 wins [nobody configures a
router on 0]. Then we can unsubscribe the network, except that other routers
would assume the router is dead if no prompts were seen. To avoid this, send
the prompts - but only to a "router-only" multicast MAC.
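The election rule is simply "lowest address wins", which a short sketch makes concrete. The addresses below are made up for the example.

```python
import ipaddress

def elect_querier(link_local_addrs):
    """Return the address that wins the election: the numerically lowest."""
    return min(ipaddress.IPv6Address(a) for a in link_local_addrs)

# Two legitimate routers and an attacker claiming the all-zero host part.
routers = ["fe80::1", "fe80::211:22ff:fe33:4455"]
attacker = "fe80::"
winner = elect_querier(routers + [attacker])
```

Since no legitimate router is normally configured with host part 0, the attacker's address cannot be undercut.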
To see if a Windows/Linux computer is sniffing the network: Send a packet
(ping) to an _unused_ multicast address, see if the host responds.
Side channels in IPv6: "IPv6 is a side channel" - too much functionality,
cannot be reasonably filtered
Code available at http://www.mh-sec.de/downloads/thc-ipv6-1.4.tar.gz .
Will start www.ipv6{security,hacking}.info
for secure configuration advice.
Mitigations:
- ACLs on L3 switch (e.g. don't allow RAs from client ports), if supported
- IPSec, but a pain
- Secure Encrypted neighbor Discovery - basically happening in the switches,
not supported yet anywhere, still has problems.
- More secure client configuration - not always possible
- Detection of attacks is easy, prevention unknown
Analyzing Stuxnet
Presented by Microsoft "to set the facts straight". Analyzed within a few
weeks after this came in, but not allowed to talk about it at the time.
2 interesting things: _4_ 0-day vulnerabilities, attack on SCADA.
Discovered by VirusBlokAda (Belarus, not known by Microsoft), who sent a
sample in ~July; eventually got the original LNK files. Others are looking at
this as well - need to "know ahead" about threats, ~1 MB of binary; full
knowledge sharing with Kaspersky.
Methodology: "initial triage" - identify surprising code, clues for
vulnerabilities, then discuss details with developers of relevant code.
Total time ~30-40 man-hours in 3-4 days to find the vulnerabilities. Later
completely decompiled 2 components to buildable C.
Attackers: "Don't ask me who the author was". Components were written by
different people. Aiming for 100% reliability, high impact. Developed on
removable media (path embedded in file is B:...). Shell code does not use
simple "call" insns - always "ret".
Attack 1: LNK files
Dumped LNK as text, identified the buggy DLL; all done in ~1 hour. Bug: .CPL
has icons inside => must do LoadLibrary(), which calls DllMain(). Fix:
limit icon loading to only registered .CPLs.
Impact: Arbitrary code execution without privilege escalation - only a
foothold for further attacks. Looked at attack vectors - in addition to
USB, could use WebDAV (remote attack) => fixed it "out of band",
"telemetry" told them users were being affected. 100% reliable attack
vector. Apparently some people knew about this for years.
Attack 2: Task scheduler
Debugging was not really helpful => using process monitor, event logs -
noticed that task files were accessed. Bug: XML file storing task data
(including the user to run it as => can escalate to LocalSystem) _writable
by user_, authenticated using a CRC32 hash (which was protected against user
access); CRC32 collisions are easy. Fix: use SHA256 (kept files writable
"for compatibility" [with what??? the authentication would break writing
anyway])
100% reliable - but only works on >= Vista.
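To show why "CRC32 collisions are easy", here is a sketch that appends 4 bytes to arbitrary data so that it matches a chosen CRC32. It relies only on CRC32 being affine over GF(2) and on zlib.crc32. The XML strings are made up, and this is not claimed to be the method or file format used in the actual attack.

```python
import zlib

def forge_crc32(data, target):
    """Return 4 bytes such that zlib.crc32(data + suffix) == target."""
    base = zlib.crc32(data + b"\x00" * 4)
    # Effect of flipping each single suffix bit on the resulting CRC.
    rows = []
    for bit in range(32):
        suffix = (1 << bit).to_bytes(4, "little")
        rows.append((zlib.crc32(data + suffix) ^ base, 1 << bit))
    # Gaussian elimination over GF(2): basis indexed by highest set bit.
    basis = {}
    for vec, combo in rows:
        while vec:
            top = vec.bit_length() - 1
            if top not in basis:
                basis[top] = (vec, combo)
                break
            bvec, bcombo = basis[top]
            vec ^= bvec
            combo ^= bcombo
    # Express the wanted CRC difference as a combination of suffix bits.
    want, solution = target ^ base, 0
    while want:
        bvec, bcombo = basis[want.bit_length() - 1]
        want ^= bvec
        solution ^= bcombo
    return solution.to_bytes(4, "little")

# Hypothetical task definitions: the modified file must keep the old CRC32.
original = b"<Task><RunAs>user</RunAs></Task>"
modified = b"<Task><RunAs>LocalSystem</RunAs></Task>"
patch = forge_crc32(modified, zlib.crc32(original))
forged = modified + patch
```

With a cryptographic hash such as SHA256 no comparable shortcut is known, which is why the fix replaces the checksum.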
Attack 3: Keyboard Layout
Eventually found "not immediately obvious" code - searching in win32k.sys,
NtVirtualAllocateMem, keyboard layout loading, some IDA-unidentified code.
Tried various things, finally noticed the code looks like a shell code and
inserted a break point in it to get a back trace. Bug: <=XP allowed loading
keyboard layouts from any directory, indexed a function array using an
unvalidated user-controlled integer. Attack looked for a suitable user-land
address following the original table, copied attack code there.
100% reliable - only <= XP, so we can assume the attacked environment is
not monolithic.
Attack 4: Printer spooler
Kaspersky reported suspicious spooler RPC. Network trace: guest printing to
files in %system%. Spooler should have switched to the client account, but
it doesn't for Guest because it is too limited, so it uses System instead.
Windows by design automatically runs a .MOF file dropped in there :)
This all only works if anonymous connections are allowed, which is very
uncommon in corporations