31C3 Keynote
- Electronic music
Trustworthy Secure Modular Operating System Engineering
- TCB = application + dependencies (crypto, XML) + GUI (font/image renderers parsing arbitrary data) + language runtime + kernel + hardware (+ compiler)
- Ways to minimize the TCB:
- Compartments to limit impact of an attack: chroot, Solaris Zones, FreeBSD jail, Linux containers; Xen, KVM; not actually minimizing the TCB, only the impact
- More layers to detect known attacks (ASLR, firewall, IDS) — make the TCB even larger
- Alternatively, start with a clean state: focus on the necessary APIs of the internet (TCP/IP, DHCP, DNS, HTTP, TLS, SASL, XMPP, git, ssh, IMAP, …) and having persistent data storage, ignore traditional component layering
- Choosing a programming language: minimize accidental complexity, static typing (used for lightweight invariant enforcement), explicitly marking side effects (and using them very rarely)
- Unikernel = a specialized VM image compiled from a modular stack of application code + system libraries + configuration
- Mirage OS:
- BSD/MIT license, uses OCaml, compiles to a Xen VM image or other outputs; ~2 MB for an HTTPS server
- Single address space, no C library, running directly on the hypervisor, not including unneeded infrastructure (file systems, user database)
- Uses the OCaml module system; the same application code can use various module stacks: (UnixELF+UnixSocket+Cohttp+MyHomePage, or UnixELF+UnixTuntap+MirNet+MirTCP+Cohttp+MyHomePage, or XenBoot+(XenStore,XenEvtchn)+XenRing+XenNetif+MirNet+MirTCP+Cohttp+MyHomePage)
- Targeting Cubieboard 2 (ARM)
- Xen Security: Each PCI ID can be mapped to an individual VM (done by Qubes), HV separates VMs; the network interface is mapped as shared memory. Inter-VM communication possible using shared memory
- Implemented: TCP/IP, DHCP, HTTPS, DNS, IMAP; git-like persistent branchable store; TLS; deployment via git/github (the whole VM as a single binary blob in git)
- Performance: Serving static HTTP data, similar to Linux on ARM. Startup time ~20 ms, so can start services on-demand (the DNS server starts the service=VM when answering the request for the service host name)
- The TCB: Xen, MiniOS (printf and a few other stubs), OpenLibm (math library), libgmp, OCaml runtime, OCaml code, OCaml and C compilers
- http://openmirage.org
- Entropy source: provided by a Xen domain 0, which runs Linux
- Xen dependency critical to bypass the need to write hardware drivers
- TLS implementation:
- To avoid timing side-channels (and their amplification through GC), the core algorithms (block ciphers, hashes, HMACs) are in loop-free, non-allocating C; asymmetric crypto in OCaml with libgmp for bignums
- ASN.1: the structure can be described with a DSL closely mirroring the ASN.1 notation, from which both a parser and a generator are derived (see the sketch below)
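A rough Python sketch of the idea (hypothetical names, not the actual OCaml API): a single declarative description of a structure, from which both a parser and a generator are derived:

```python
# Hypothetical combinator sketch: one declarative description of a
# structure yields both a parser and a serializer.
import struct

class U8:
    @staticmethod
    def parse(buf, off): return buf[off], off + 1
    @staticmethod
    def build(v): return bytes([v])

class U16:
    @staticmethod
    def parse(buf, off): return struct.unpack_from(">H", buf, off)[0], off + 2
    @staticmethod
    def build(v): return struct.pack(">H", v)

class Seq:
    """A fixed sequence of named fields, each with its own codec."""
    def __init__(self, *fields): self.fields = fields   # (name, codec) pairs
    def parse(self, buf, off=0):
        out = {}
        for name, codec in self.fields:
            out[name], off = codec.parse(buf, off)
        return out, off
    def build(self, values):
        return b"".join(c.build(values[n]) for n, c in self.fields)

# One description, two directions:
header = Seq(("version", U8), ("length", U16))
blob = header.build({"version": 3, "length": 512})
assert header.parse(blob)[0] == {"version": 3, "length": 512}
```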
- TLS API works with blobs as input/output for application data / protocol packets, then separate layers connect this with the network or sockets
- 20k LOC (vs. 100k for PolarSSL or 350k for OpenSSL)
Reproducible Builds
- “Why would anyone want to compromise me, I am not interesting” vs. attackers target project’s users through the developers
- Bitcoin: malicious modifications to binaries could result in irrevocable theft, the developer could be blamed, and the developers can’t prove they were hacked; so reproducible builds protect developers
- Tor concerned about coercion of their build engineers
- Critical bugs can be very small (CVE-2002-0083, a remote root exploit, fix is a single-bit difference)
- Obstacles: different compilers or optimizations, header files, library versions, build-environment metadata (e.g. full paths, timestamps), metadata in archive formats, signatures/key management (must stay private), profile-guided optimizations
- Doing reproducible builds today: Bitcoin, Tor, ?; others interested (Debian, Fedora, ?, Mozilla)
- Tor Browser:
- Full package set (incl. incremental updates) signed by multiple builders; supports anonymous independent verification (i.e. there is no public list of targets to attack).
- Does not require dedicated hardware or non-free software (OS X and Windows binaries are cross-compiled from Linux)
- Toolchain components: For Windows, MinGW-W64, wine+py2exe, nsis; For OS X, Toolchain4 and Crosstools-ng forks, mkisofs + libdmg-hfsplus
- Gitian: wraps qemu-kvm and LXC. Compilation stages described using YAML “descriptor”.
- Normalizes build environment (host name, user name, build paths, tool versions, kernel, fake time)
- Removes hardware dependency
- Limitations: Ubuntu guests only; needs authentication helpers for non-git input; partial compilation state is problematic (destroys in-progress state normally, fake time problematic for dependency checks, stages can be saved but it is difficult); supports only one qemu-kvm or LXC slave at a time
- Remaining issues
- readdir() output order depends on FS history (i.e. speed of the computer and relative order of operations); required LC_ALL=C sorting (see the sketch after this list)
- Uninitialized memory in tools (binutils, libdmg-hfsplus)
- Time zone and umask
- Deliberately generated entropy (signatures)
- Authenticode and Gatekeeper signatures
- LXC still leaks (?)
- CPU feature detection during build
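A minimal Python sketch of this kind of normalization (illustrative only, not any project's actual tooling): force a stable traversal order and clamp archive metadata so the output does not depend on filesystem history or build time:

```python
import os, tarfile

def deterministic_tar(src_dir, out_path, fixed_mtime=0):
    """Build a tar archive whose bytes do not depend on FS history."""
    def normalize(info):
        info.mtime = fixed_mtime          # no build-time timestamps
        info.uid = info.gid = 0           # no builder identity
        info.uname = info.gname = "root"
        return info
    with tarfile.open(out_path, "w") as tar:
        for root, dirs, files in os.walk(src_dir):
            dirs.sort()                   # readdir() order is unstable; sort
            for name in sorted(files):
                path = os.path.join(root, name)
                tar.add(path, arcname=os.path.relpath(path, src_dir),
                        recursive=False, filter=normalize)
```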
- Authentication of dependencies: by default, fetched through Tor; all authenticated; authentication is fairly problematic (many projects have weak or no signatures, so use a hard-coded SHA256 hash, verified by downloading from multiple places)
- Most software is not as complicated as Tor Browser
- e.g. it’s much easier on Android (Android APKs do not need exact hash matches: they use Java JAR signatures, which sign only the contents, not file timestamps, and do not depend on file order)
- FDroid: can submit source code; FDroid does a separate build and checks that the output of that build still verifies against the developer’s signature (the FDroid signing key is a single point of failure, but having to compromise both the developer and FDroid is better than not having FDroid)
- Debian reproducible builds:
- Massive integration effort. ReproducibleBuilds wiki is a gathering place for many problems and solutions. ~60% of builds now build reproducibly.
- dh_strip_nondeterminism removes the known frequently occurring differences (e.g. timestamps); a major improvement, but because it modifies the binary, it is a single point of failure
- Future work:
- Remove the Ubuntu dependency in Gitian (to allow Debian)
- Counter trusting trust: Diverse Double Compilation (David A. Wheeler) protects against backdoors in compilers, but not against the rest of the build environment; still, a trusted compiler lets us cross-compile the Debian base system and run it on many architectures
- Multi-signature/consensus updates: is everyone getting the same update, or are some users getting a malicious update from the Tor project? Perhaps use the Tor Consensus, the Bitcoin blockchain, or Certificate Transparency to publish the update list
- GNU ld on 32-bit had some kind of issue with SHA-1 in build ID of very large files (possibly ≥ 4 GB), switched to gold
- Re: Trusting Trust: it is difficult to write an automatically-propagating backdoor (and with Diverse Double Compilation we can theoretically use a 20-year old compiler to predate the threat actor’s existence), more realistic is some kind of ongoing access to the build systems and periodic updates
Revisiting SSL/TLS Implementations
- Bleichenbacher attack: adaptive chosen-ciphertext attack: send modified versions of the ciphertext carrying the pre-master secret (RSA is malleable, so the effect of ciphertext changes on the cleartext is predictable; see the sketch below) and observe side channels in the PKCS#1 padding unwrapping (whether the cleartext starts with 0x00 0x02, or possibly more detailed format checks, which mostly just weaken the server's usefulness as an oracle)
- Originally the oracle is received TLS alerts; can also use timing
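A toy demonstration of the malleability the attack relies on (tiny insecure parameters, for illustration only): multiplying a ciphertext by s^e mod n multiplies the hidden plaintext by s mod n, so the attacker can derive related ciphertexts whose effect on the plaintext is predictable:

```python
# Textbook RSA with tiny parameters:
p, q, e = 61, 53, 17
n = p * q                                # public modulus
d = pow(e, -1, (p - 1) * (q - 1))        # private exponent

m = 42                                   # the "pre-master secret"
c = pow(m, e, n)                         # intercepted ciphertext

s = 7                                    # attacker-chosen factor
c2 = (c * pow(s, e, n)) % n              # modified ciphertext
assert pow(c2, d, n) == (m * s) % n      # plaintext transformed predictably
```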
- TLS protocol countermeasure: implementations are required to handle padding errors in a way invisible to the peer. As for timing, TLS 1.2 gives pseudocode that always generates a random value and uses it if the padding is wrong (sketched below); TLS 1.0/1.1 seem to suggest generating the random value only on wrong padding (i.e. they explicitly suggest a timing channel)
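A sketch of the TLS 1.2-style countermeasure (simplified padding check and hypothetical function; real implementations must additionally make the check itself constant-time, which Python cannot guarantee):

```python
import os

def unwrap_premaster(padded: bytes) -> bytes:
    """Never signal a padding error; fall back to random data instead."""
    fallback = os.urandom(48)        # generated unconditionally, up front
    # Simplified check (real PKCS#1 validation inspects the whole padding):
    ok = len(padded) >= 51 and padded[0] == 0x00 and padded[1] == 0x02
    premaster = padded[-48:]
    # No alert, no early return; a bad pad just makes the handshake fail
    # later at Finished verification:
    return premaster if ok else fallback
```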
- T.I.M.E. TLS testing framework is great for fuzzing TLS (but written in Java)
- Setup for timing attacks:
- Use a deterministic, non-GC language
- Network setup: No wireless, as near as possible to target, high-quality routers
- Disable power management (SpeedStep, CPU C states)
- Don’t use fancy network cards that do interrupt coalescing
- Stop all tasks/daemons/GUI
- Skip the first few hundred measurements to account for cache warmup
- Measure the time between sending the last byte of the request (having sent the previous bytes before starting the timer) and receiving the response
- Java Secure Socket Extensions: has a distinguishable error message if the plaintext starts with 0x00 0x02 and plaintext contains a zero byte preceded with non-zero bytes
- Strength of the oracle increases with key length, from 0.2% for 1024-bit to 74% for 4096-bit
- OpenSSL 1.0.1i generates a random pre-master secret on padding failure, i.e. there is a timing side-channel, ~1.5 microseconds, which is practical to attack at least in a local network; but OpenSSL has a very strict PKCS#1 check, which makes the oracle very weak (2.7×10⁻⁸; would require about 5×10¹² requests)
- Java Secure Socket Extension: simple TLS implementation that uses exceptions and the like, and lax checking makes the oracle strength 60%
- Cavium hardware TLS accelerators (used in TLS appliances): Doesn’t verify first byte of the padding, only the second one, which requires modification of the attack.
- Lessons:
- Bad/fragile design in protocols may be haunting us for decades
- The same padding error from 1995 TLS was done again in XML encryption in 2012, and again in something JSON-like, “but we need to support old algorithms or our standard won’t be used”.
- The PSS signature scheme of PKCS#1 v2 should not have the same problem.
SS7: Locate, Track, Manipulate
- There are companies selling tracking of people, e.g. SkyLock
- SS7 is a protocol suite used by most telecommunication network operators between switches; designed at the time of few, state-controlled, companies; no authentication built in
- Extended by the Mobile Application Part (MAP): support for cell phones
- CAMEL Application Part: allows custom services
- Still no authentication in these recent extensions
- Getting access to SS7 is getting easier:
- Can be bought from telcos or roaming hubs for a few hundred € a month.
- Usually needs roaming agreements, but some telcos are reselling their roaming agreements
- Some network operators leave their equipment unsecured on the internet
- Femtocells also have full access to SS7, and are hackable
- SS7 components: Home Location Register, Visitor Location Register (a copy of the HLR record close to geographical position), Mobile Switching Center (routes call/SMS, collocated with the VLR)
- Addressing by “global title”, looks like a phone number
- Location tracking: the network needs to know the cell closest to you; there are databases of cell locations; in cities cell ID may be specific down to street level
- Commercial tracking: advertise 70% availability, excludes Israel and USA (an Israeli-USA company :) )
- SS7/MAP: the anyTimeInterrogation service can query the HLR for cell ID and IMEI (which reveals the phone type); the HLR in turn queries the VLR/… for the up-to-date data
- Many networks (e.g. all networks in Germany) actually block anyTimeInterrogation now
- But we can query the MSC/VLR directly if we know the IMSI instead of the phone number (and we can get the IMSI and global title of the current VLR from the HLR); this works for a lot of networks, VLR/MSC mostly accept any request from anyone
- Observations of a German network operator
- Started filtering all network-internal messages at borders
- Attack traffic dropped more than 80%, looked for sources: some of that was misconfiguration
- Other uses:
- Shipping company was tracking vehicles
- SMS service provider for banks (second factor) wanted to observe SIM switching
- Some attacks are still happening, so there must be other information sources or they are brute-forcing the VLR/MSC
- In the US, E911 requires location within 300m, can even ask the GPS position from the phone.
- The location server used by emergency services thankfully requires authentication
- But the switching centers used to route the request don’t do authentication, only address verification… but there are two sender addresses, one used for routing and one for verification
- It is also possible to manipulate the subscribe data in the VLR (i.e. disable incoming and/or outgoing calls/SMS/data, or delete the subscriber from the VLR altogether), the VLR does not do authentication
- CAMEL: defines events for which the VLR should ask the home network “GSM service control function”, which can modify or cancel the operation (e.g. during roaming, rewrite the phone number to refer to the home country)
- The attackers can modify the VLR, so they can also change the service control function address—and intercept call requests and reroute calls (set up a MITM)
- Apparently this is actually happening in Ukraine, requests came from a Russian SS7 network
- Location updates (roaming or within the country): VLR/MSC sends an updateLocation to the HLR, HLR saves the address of that VLR for routing, and sends a copy of the subscriber data
- updateLocation is not authenticated, so an attacker can “steal” the incoming calls and SMS
- USSD codes (*#…) can also be executed for other subscribers (on some carriers, transfer prepaid credits; set/delete call forwarding (e.g. to a premium-rate number))
- TMSI de-anonymization: an attacker can find out the phone numbers of subscribers around him. Paging (e.g. to notify of an incoming call) has to happen unencrypted, so an anonymous TMSI (Temporary Mobile Subscriber Identity) is normally used to hide the IMSI. But the MSC can be asked for the IMSI if the TMSI is known (and updateLocation gives the MSISDN for the IMSI)
- LTE:
- Uses the Diameter protocol in the core; SS7 is becoming a legacy protocol, but a lot of the design ported, e.g. still no end-to-end authentication, and GSM/UMTS will be around for a long time
- There are interfaces from LTE to SS7 for compatibility/interoperability
- Countermeasures for operators:
- Remove all necessities to hand out IMSI and current VLR/MSC to other networks
- Primarily needed for SMS routing; an alternative is SMS Home Routing (sending through the home network)
- All MAP and CAP messages only needed within a network should be filtered at the borders
- also filter sendRoutingInfo for voice calls if “Optimal Routing” is not used
- No countermeasures for subscribers
Mobile Self-defense
- SS7: used for exchange of signaling messages and cryptographic keys between networks, and also within a network
- Many networks (e.g. all German networks) have stopped responding to anyTimeInterrogation after the services misusing it were publicized; but there are many other protocol messages
- SS7 actually enables 5 kinds of abuse: tracking, interception, DoS, fraud (charging someone else or calling for free), spam
- Local interception: intercept radio transmission (3G is pretty easy), then ask for the current decryption key (sendIdentification, supposedly used for MSC roaming within the network; all 4 German operators, and most other networks, have responded to this from another country)
- sendIdentification can be simply blocked
- 3G should prevent IMSI catchers (network is authenticated), but SS7 can be used to ask for the encryption key (still requires local presence)
- Can add plausibility checking / border firewall
- Can also intercept both incoming (activate call forwarding) and outgoing (add a number rewriting rule) calls, without being near to the phone
- Should at least only allow roaming partners to do this, and then some plausibility checking
- The SS7 requests should be rate-limited to handle clear abuse.
- Encryption:
- GSM, A5/1 now trivial: capture with a 20-year-old phone, then crack the cipher in a few seconds
- 3G: software-defined radio to capture data, then query SS7 SendIdentification to get the decryption key (also works for A5/3)
- (anyway, 5 operators in 4 countries do no encryption at all, some even have encryption on GSM but have it disabled on 3G because it is harder to intercept)
- gsmmap.org rates operators on ease of interception and tracking, provides country reports
- Want to especially highlight one issue: length of the encryption key. The NSA has apparently broken 64-bit A5/3 (which has been used to “upgrade GSM security to UMTS standard level” but is not actually as strong as UMTS; there is A5/4, which actually is strong enough). If SIM cards (not USIM) are used, even 3G uses only 64-bit keys and is not good enough
- Self-defense: many of the abuses can be detected
- Not generally available at the OS level, but there are debugging interfaces: releasing SnoopSnitch for Android ≥ 4.1 (rooted, but not CyanogenMod, which removes some needed files?; Qualcomm chipset)
- Improved IMSI catcher detection compared to the old CatcherCatcher
- Can optionally submit reports to gsmmap.org
- In theory this should be legally actionable, but there are no statistics because nobody is looking for abuse (in Germany, some loopholes were fixed within 2 weeks of being notified, after 20 years of vulnerability, so they are probably not motivated to look for abuse)
- There are about 800 companies legally connected to SS7, but there are various resellers and perhaps hacked endpoints and the like
- SS7 implementation/transport: initially a network between western-European operators, over dedicated cables. Eventually routing providers (7 or 9 big ones) appeared between the telcos; they typically connect telcos using VPNs over the Internet.
- There is a separate interconnect for data roaming (GRX), which is inherently data so could be run over the open Internet
- People seem to be running Diameter (the LTE interconnect), which is inherently IP, over the open Internet…
- There seems to be no practical reason to have the anyTimeInterrogation message in the protocol
- Countermeasures without abandoning SS7: anyTimeInterrogation and sendIdentification can be fixed easily enough, but various other messages are needed at least to allow roaming (so should at least block senders that are not roaming partners)
- How to get things fixed? Bad publicity, or at least possibility of bad publicity, seems to work: this is not malicious, just negligence
- Didn’t talk about various other abuses of SS7 (primarily fraud, there is also spam and other things)
- Don’t think there is much more to be discovered in the SS7 standard, but vendors use SS7 as a pipe for vendor-specific commands (e.g. found a “ResetHLR” command)
Code Pointer Integrity
- Memory safety violation = an invalid dereference (e.g. a dangling pointer (temporal) or an out-of-bounds access (spatial))
- Threat model: Assuming attacker can read/write data, read code, but not modify program code or influence program loading
- Other defenses:
- Data execution prevention: bypassed by ROP
- Stack canaries (and similar secrets): bypassed by reading the memory contents (allowed per the threat model)
- Memory safety: 3k LOC application vs. 500k Python runtime, 2.5M libc, 15.803M kernel—no way to get to safe languages in a practical timeframe
- Retrofitting memory safety to C/C++:
- SoftBound+CETS: 116% overhead, not quite compatible; CCured: 56% overhead; AddressSanitizer: 73% overhead, only probabilistic
- (SoftBound: adds additional metadata to pointers, containing lower and upper bounds; every pointer dereference adds a bounds check, see the sketch below)
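The SoftBound idea, illustrated conceptually in Python (the real system inserts these checks into compiled C code at the compiler level):

```python
class FatPointer:
    """A pointer that carries base/bound metadata (the SoftBound idea)."""
    def __init__(self, memory, base, bound):
        self.memory, self.base, self.bound = memory, base, bound

    def load(self, offset):
        addr = self.base + offset
        if not (self.base <= addr < self.bound):   # inserted spatial check
            raise MemoryError("out-of-bounds dereference")
        return self.memory[addr]

mem = bytearray(b"AAAABBBBSECRET")
p = FatPointer(mem, base=0, bound=8)   # "pointer" to the first 8 bytes
p.load(3)                              # ok
# p.load(10) would raise instead of silently reading past the object
```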
- So, instead of protecting everything, strong protection for code pointers only => 1.9% (CPS) or 8.4% (CPI) overhead
- Code Pointer Separation:
- Splitting between safe memory (which may contain code pointers) and regular memory:
- On the heap, code-pointer separation: split memory view: safe memory = “control plane”, containing either a code pointer or NULL; regular memory contains everything but code pointers, with layout unchanged
- Stack: split into safe stack and regular stack. Safe stack contains local variables that we can prove are safely accessed (including return addresses), regular stack contains the rest
- Executable counts as unsafe memory
- Split happens using “hardware-based” separation (?), e.g. segments
- Attack method: The attacker can attack non-code pointers to redirect so that some other code pointer is called.
- Security guarantees:
- Attacker cannot forge new code pointers
- A code pointer is either immediate or assigned from another code pointer
- An attacker can only replace existing functions through an indirection (foo->bar->func() => foo->baz->func2())
- Code Pointer Integrity = Code Pointer Separation + protecting “sensitive pointers” = code pointers and pointers used to access sensitive pointers
- sensitive pointer definition based on a conservative type-based approximation; on SPEC2006 ≤ 2.6% accesses are sensitive
- Also does bound checking (?! the same one that supposed to have 116% overhead?)
- Additional guarantees:
- Accessing sensitive pointers is safe (separation + bound checks at runtime)
- Accessing regular data is fast
- Summary:
- CPS: 0.5% to 1.9% overhead (~2.5% of memory accesses), strong protection in practice
- CPI: 8.4% to 10.5% overhead (~6.5% memory accesses), formal guarantee of safety
- Safe stack only: prevents ROP, ? overhead
- Implementation:
- LLVM on x86_64 and partially x86, on OS X, FreeBSD, Linux
- “Great support” for CPI on OS X and FreeBSD on x64, “fairly good” on other architectures
- Upstreaming in progress, starting with safe stack (“coming to LLVM soon”), will then continue with CPI and CPS
- https://github.com/cpi-llvm
- Practicality: recompiled FreeBSD userspace and ≥ 100 packages
- Memory overhead: low single digits (using a compact hash map for the protected memory)
- Applicability to kernel code: assembly would be a difficulty (not discussing the segmentation/page table needs?)
- How is the safe memory separated: (missed a few, segments?); just put it into a random place in the 64-bit space; can also “blind out” that space by ANDing all non-code pointers to make it inaccessible (costs ~2-3%)
- Out of scope:
- JIT is out of scope; could make the JIT produce safe code but that would need it to be trusted; or just rely on the JITed language being safe by design (again essentially trusting the JIT)
- Casting function pointers to void * / char * (which explodes the type analysis to cover everything as protected)
- Weird HW / critical data, e.g. having page tables as data subject to attack
- For C++, this causes all classes with virtual method tables (and anything referring to them) to be classified as sensitive (didn’t seem that they view this as a significant issue?!)
- Hardware assist? Intel MPX: looked at it, but not available yet; advertised as a debugging feature, rumored to have 40% overhead
ECCHacks
- http://ecchacks.cr.yp.to
- Uses:
- PK signatures: signed OS updates, SSL certificates, e-passports
- PK encryption: SSL key exchange, locked iPhone mail download
- secret key encryption: disk encryption, bulk SSL encryption
- Why ECC? To break original DH and RSA, “index calculus” keeps getting faster; it has decreased the cost of breaking RSA-{1024,2048} from 2^{120,170} to 2^{80,112}; “extremely unlikely” that index calculus will ever work on ECC
- Explaining ECC: clock cryptography:
- x² + y² = 1 on [−1,1]×[−1,1]; instead parametrize as x = sin alpha, y = cos alpha. Because (sin(alpha1+alpha2), cos(alpha1+alpha2)) = (sin alpha1 cos alpha2 + cos alpha1 sin alpha2, cos alpha1 cos alpha2 − sin alpha1 sin alpha2), we can define (x1, y1) + (x2, y2) = (x1y2 + y1x2, y1y2 − x1x2)
- Defining addition gives us scalar multiplication
- Special cases: (x1, y1) + (0, 1) = (x1, y1); (x1, y1) + (-x1, y1) = (0, 1)
- Then use the same definition over finite fields, e.g. F7 = { 0, 1, 2, 3, 4, 5, 6 } = { 0, 1, 2, 3, - 3, -2, -1 }
- Scalar multiplication by n can be done in O(log n) additions instead of O(n); we use this scalar multiplication as the trapdoor function:
- “Clock D-H protocol”: standardize p and a base point (x, y) in Clock(Fp); Alice chooses secret a and gets public key a(x, y), Bob chooses secret b and gets public key b(x, y); then a(b(x,y)) = b(a(x,y)) is the shared secret (sketched below)
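The whole clock D-H fits in a few lines of Python; p = 1000003 with base point (1000, 2) is a toy choice (1000² + 2² ≡ 1 mod p), far too small for real use:

```python
p = 1000003                       # toy prime
BASE = (1000, 2)                  # on the clock: (1000**2 + 2**2) % p == 1

def clock_add(P, Q):
    (x1, y1), (x2, y2) = P, Q
    return ((x1 * y2 + y1 * x2) % p, (y1 * y2 - x1 * x2) % p)

def scalar_mult(n, P):
    """O(log n) clock additions via double-and-add."""
    R = (0, 1)                    # neutral element
    while n:
        if n & 1:
            R = clock_add(R, P)
        P = clock_add(P, P)
        n >>= 1
    return R

a, b = 123456, 654321             # Alice's and Bob's secret scalars
A = scalar_mult(a, BASE)          # Alice's public key
B = scalar_mult(b, BASE)          # Bob's public key
assert scalar_mult(a, B) == scalar_mult(b, A)   # the shared secret
```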
- Actually, warnings:
- Many choices of p are unsafe
- Clocks aren’t elliptic: index calculus can attack the clock, so it actually needs much larger (RSA-sized) primes for the same strength
- There is a timing side-channel (for the full operation and possibly even sub-operations due to electronic emissions)
- Elliptic curves: x² + y² = 1 − 30x²y² (an Edwards curve, used as the example)
- Addition: (x1, y1) + (x2, y2) = ((x1y2 + y1x2)/(1 + dx1x2y1y2), (y1y2 − x1x2)/(1 − dx1x2y1y2))
- Other elliptic curves: pick an odd prime p and a non-square d in Fp; then x² + y² = 1 + dx²y² is a “complete Edwards curve”: the addition law is well-defined (denominators are never 0), and a non-square d is easy to find (half of the nonzero elements of a finite field are non-squares)
- To avoid slow divisions: instead of dividing a by b, store the fraction (a/b) and use fraction arithmetic; or “projective coordinates”: (x, y, z) represents (x/z, y/z); or “extended coordinates”: (x, y, z) projective plus t = xy/z
- So, standardize a prime p, a safe non-square d, and a base point (x, y) on the curve; then use the same D-H algorithm (see the sketch below)
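A sketch of complete Edwards addition over a toy prime (parameters chosen for illustration; the assert checks that d is a non-square via Euler's criterion):

```python
p = 1009                          # toy prime
d = 998                           # non-square mod p (needed for completeness)
assert pow(d, (p - 1) // 2, p) == p - 1   # Euler's criterion: d is a non-square

def edwards_add(P, Q):
    """One formula, no special cases (complete when d is a non-square)."""
    (x1, y1), (x2, y2) = P, Q
    t = d * x1 * x2 * y1 * y2 % p
    x3 = (x1 * y2 + y1 * x2) * pow(1 + t, -1, p) % p
    y3 = (y1 * y2 - x1 * x2) * pow(1 - t, -1, p) % p
    return (x3, y3)

def on_curve(x, y):
    return (x * x + y * y - 1 - d * x * x * y * y) % p == 0

# Find some point by brute force (fine at toy scale):
P = next((x, y) for x in range(1, p) for y in range(p) if on_curve(x, y))
Q = edwards_add(P, P)
assert on_curve(*Q)               # addition stays on the curve
```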
- Packet overhead at high security level: 32 bytes for Alice’s public key, 24 bytes for nonce, 16 bytes for authentication; “all of this is fast enough so that we can afford to encrypt all packets”
- Example: p = 2²⁵⁵ − 19, d = 121665/121666; x² + y² = 1 + dx²y² is a safe curve, or alternatively −x² + y² = 1 − dx²y² (there is a bijection between them)
- Other curves than Edwards exist…; for historical reasons, standardized on Weierstrass curves (with addition having 6 cases!)
- All of the public standards protect against computing the secret key from the public key
- Can still attack D-H when the addition formulas are used on points not on the curve, which leaks information about the private key (mentioned in a footnote in the standard, and patented!)
- So, instead protect against implementation errors by changing the protocol (e.g. making it impossible to send a point not on the curve)
- Curve25519 standardization: first brought up in 2010, still discussing
Crypto tales from the trenches
- Typical case: whistleblower not technically knowledgeable, "install PGP" a showstopper. Also, sources may not know they are sources and this makes it explicit
- Setting up Silent Circle took an hour, gave up due to usability
- Key management too complicated; in Windows creating a revocation key requires the CLI; Poitras doesn't have a revocation key after 2–3 years of PGP use
- OTR setup with a shared secret: neither remembered the right answer
- There is a "burner" phone app for iPhone just masquerading an area code?!
- Burner phones are difficult to buy anonymously in the USA, stereotypically only drug dealers need it.
- All the journalists use Tor, e.g. to hide their interest in a target
- Keeping encrypted notes vs. collaboration
Why is GPG “damn near unusable”?
- Automating key exchange makes things pretty workable, except users still fall for phishing attacks
- Really transparent security doesn’t seem trustworthy, and people don’t notice that data was sent in plaintext instead
- Users have generally poor understanding of the email mechanism and threat models
- Examples of usability failures:
- Key reuse
- Low password entropy
- Enigmail encrypted and unencrypted messages look very similar
- GUI applications still use a lot of jargon
- Principles:
- “Design is made for people”; this is abstract, so need to observe people, search for usability failures
- Give feedback, make the systems visible
- “how do you know what happens when you push a button, and how do you know how to push it”?
- “how do you know what state the system is in?”
- Mental models, engineering to make failure impossible
- Knowledge transfers: using metaphors, standardization (design languages)
- “Everything is designed”, so try to reverse-engineer the intent
- Look at examples of evil design (www.ryanair.com)
- Case studies:
- PGP/GnuPG is not installed by default, users don’t change defaults
- CryptoCat: Uses JavaScript crypto, had a crypto weakness (but then so did PGP 1.0). Asking “How can we make crypto fun?”
- Other issues:
- Standardizing on ciphers
- API usability
- Key management: possible options “key continuity management” automating both key exchange and use (e.g. TOFU)
- Multi-device support
- Metadata leakage
- Introductions/user identifiers: key fingerprints are not memorable
- End-user understanding
Beyond PNR: Exploring Airline Systems
- http://saper.info/talk/31c3/
- Airport systems: Air Traffic Control, Flight Information Display Systems, baggage control, access control, announcements, lost&found, TIMATIC (a database of passport and visa requirements), World Tracer (knows about every piece of luggage worldwide)
- Weather: messages with forecast (TAF) and actual weather (METAR: http://www.aviationweather.gov/adds/metars/ )
- ATC systems: Single European Sky, EUROCONTROL (listing all aircraft departures/landings/states, with a public website)
- Airline systems:
- Inventory: airplanes, schedules, seating configuration, controls the sales process
- Reservations/ticketing
- Departure control (check-in, boarding pass, …) and flight management
- Load control (how many passengers, cargo)
- Avionics
- In-flight entertainment
- Flight planning software (e.g. planning needed fuel)
- Sales:
- Mostly via Global Distribution Systems, linking travel agents/web portals and airlines (URLs for checking reservations in parentheses): Sabre (virtuallythere.com), Galileo (travelport.com), Amadeus (checkmytrip.com); heavily regulated, ticket availability restricted by country
- Airlines also sell directly
- PNR (= “booking” = “reservation”): describes the itinerary of a passenger or a group
- air travel, but also hotels, car rentals
- No formal industry standard but can be exchanged
- Identified by “record locator” (originally a direct disk block number), unique within an inventory system (i.e. may have several identifiers when going across airlines); short-term, can be reused
- Putting in a reservation, once you have access, is free
- Electronic Ticket: fare has been paid
- Valid between two cities on a given date
- Identified with a long number, never reused
- Includes the paid fare and terms & conditions (“fare note”)
- Accessing the reservation systems:
- Mainframes using the 3270 terminal: interactive interface with real-time processing
- IBM OS “TPF” (Transaction Processing Facility): <100 installations worldwide (airlines, credit card transactions); born as Airline Control Program in 1960s
- Typical workstation:
- Output devices: bag tag printers, boarding printers, hardcopy (dot matrix) printer
- Input devices: Magnetic stripe reader (for loyalty & credit cards), OCR scanner, barcode scanner, keyboard, RFID? (not seen for passport use yet)
- Used to be hardwired terminals; now Common Use Terminal Equipment (on Windows XP, allows use by different airlines) or self-service kiosks
- Wiring and communications:
- SITA (Société Internationale de Télécommunications Aéronautiques)
- ARINC (Aeronautical Radio, Incorporated)
- Pre-internet WAN; every device (e.g. a printer) has an individual address
- Messaging via TELEX (P1024, specification requires payment)
- Can usually guess recipient address: (airport code), (airline code), two characters??
- Wireless communication:
- ACARS (Aircraft Communications Addressing and Reporting System): radio link between aircraft and ground
- Automatic Dependent Surveillance: broadcasts flight information:
- ADS-A: addressed peer-to-peer
- ADS-B: broadcast (used by flightradar24.com)
- other modes, ADS-C … in question/answer mode
- Can also transmit telex
- Military IFF is just ADS with crypto
- acarsd (acarsd.org) available for sniffing these transmissions
- PNR details:
- Bots add messages (e.g. ticket must be bought by $date); there are facilities for comments visible only to the one who booked the reservation
- Many abbreviations used because TELEX is very costly (sometimes paying for every byte)
- Advance passenger information: travel documents added to the PNR; ~30 countries require this, a condition for some electronic visas (with automated checks that the passport is valid and the like); the USA also requires a USA address
- Access to PNR:
- PAXLST contains basic data, mainly for visa purposes. Uses a structured text format (pre-XML); trend to use it for all PNR transmissions
- PNRGOV (government): almost complete information with history
- Pushed to the government before and after departure; government can also pull specific enquiries
- Entering USA:
- Transportation Security Administration
- Secure Flight: requiring information about everyone flying into, out of, or over the USA
- Do Not Fly watchlist
- SSSS (secondary screening prior to departure): the check-in agent is informed, should be printed or written on the boarding pass.
- If selected too frequently, can complain to the DHS and obtain a special number to provide during reservation
- Customs and Border Protection:
- Requests API information for automated ESTA processing
- “Advisory program”: advisors in select overseas locations: “on-site training and technical assistance”, can ask passengers questions. A recent job description required Secret security clearance, describes “advising” whether high-risk passengers should be allowed to enter the USA
- PNRGOV: PNR+history+subject information… being standardized, already being exchanged EU-USA. Format documentation is public.
- Future:
- Migration away from telex to structured records
- Move away from peer-to-peer to a hub-spoke model
- Standardization of PAXLST and PNRGOV
- New technology: tablets for flight crews, automatic biometric passport control
- Why do agents sometimes exchange boarding passes? Sometimes because more information has been added (either voluntary like a loyalty number, or required like a passport number); other than that it shouldn’t really happen. Boarding passes are standardized but there are various interoperability issues.
Security Analysis of the Estonian Internet Voting System
- Security requirements:
- Integrity: outcome matches voter intent (i.e. recorded as intended and correctly counted)
- Ballot secrecy: weak form: nobody can figure out how you voted; strong form: not even if you try to prove it
- Internet voting: also has server-side threats (DoS, insider attacks, remote intrusion, possibly by states); can’t just take a machine and test it, and attacking a live server would be problematic
- US absentee internet voting system: ran a public test
- had a script injection bug
- Had open access to webcams in the data center :)
- Estonian voting system:
- Released source code to the server, client is still closed-source
- Reviewed publicly released video footage of operational procedures (configuring, setting up the server)
- From the voter’s perspective:
- Download an application (Windows, Mac or Linux), use the ID smart card
- Encrypted ballot = encrypt(PK_election, random pad, ballot); signed ballot = sign(SK_user, encrypted ballot)
- Verification: based on the random pad, can receive the encrypted ballot (without the signature), which allows brute-forcing the ballot values (the encryption is deterministic? see the sketch below)
- As a safeguard against coercion, verification is only allowed 3 times, within 30 minutes; and can change the vote at any time during the period
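Why the deterministic encryption makes verification equivalent to brute force, sketched in Python (the encrypt function is a stand-in; only its determinism matters):

```python
import hashlib

def encrypt(pk: bytes, pad: bytes, ballot: bytes) -> bytes:
    """Stand-in for the real scheme; only its determinism matters here."""
    return hashlib.sha256(pk + pad + ballot).digest()

def recover_vote(pk, pad, ciphertext, candidates):
    for ballot in candidates:                 # the ballot space is tiny
        if encrypt(pk, pad, ballot) == ciphertext:
            return ballot

pk, pad = b"election-pk", b"random-pad-known-to-verifier"
c = encrypt(pk, pad, b"candidate-17")
found = recover_vote(pk, pad, c, [b"candidate-%d" % i for i in range(100)])
assert found == b"candidate-17"
```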
- Counting: election servers verify and strip the signatures (checking that every voter votes only once), then the encrypted ballots are transferred on a DVD to the counting server, which is the only machine that holds SK_election and can read the ballot values
- Implicitly trusted components:
- Voter’s client (client-side malware can steal the PIN and cast a replacement vote at any later smart card use).
- To defeat the verification app, malware can just bet that the voter won’t verify within the 30-minute window, or possibly install malware on the phone as well
- Counting server: nobody double-checks the results. Airgapped, but installed from a DVD that is created by downloading a fresh copy of Debian (i.e. by a computer on the internet, so it could be compromised)
- Operational security, based on official videos:
- wifi SSID and password posted on the wall, visible in the video
- Using Windows shareware installed over http to edit config files
- Digitally signing the voting client on a computer that does P2P file sharing
- Video includes root password keystrokes, someone’s ID card PIN, the key that opens the data center door
- Final official voting results transferred on a USB stick containing a huge number of various other files
- Eventually the findings went public:
- The Reform Party (incumbent) loves the e-voting system as their technology; the Centre Party hates the system; all media are closely affiliated with one of the parties
- Interpreted as either proving the system is flawed (by the centre party) or as an attack by the centre party (by the reform party)
- Official response was that they have accounted for all of this and there was no problem
- Got the head of security drunk and got their root password…
- Estonian CERT: “E-voting is (too) secure” dismissing this
- Lessons:
- The I-voting approach is not secure
- This is a national security issue, not a government IT problem, due to risk of state attacks
- Politics can obscure major technical problems
- Recommending to discontinue until there are “fundamental security advances”
- In general: want major fraud at least as hard as with paper, and so far there is no technology like that.
- Could also attack the signature stripping server, but that would leave more evidence.
Finding the Weak Crypto Needle in a Byte Haystack
- Stream cipher: generates a keystream from a key.
- Key reuse with a stream cipher: reveals XOR of plain texts
- History:
- VENONA decrypting Soviet one-time pad reuse
- Dircrypt malware: uses RC4, encrypts all files with the same key (but also includes the key in the encrypted files anyway)
- Ramnit malware traffic: uses encrypted block, reuses the key
- Microsoft Office 2003 document encryption: reuses the keystream for different files
- Approach: XORspace: given two inputs, create a 2D table XORing every byte of one input with every byte of the other input.
- Expect a non-random distribution of byte frequencies along the diagonals (= sections where the two inputs advance in step); look for streaks giving sufficient evidence.
- Quantification using naive Bayesian log-odds; want enough evidence to get at most one false positive across the whole input; this is fairly strict but in practice still loose enough to detect real reuse (see the sketch below).
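A minimal version of the XORspace idea (crude byte-frequency scoring instead of the talk's log-odds; key and texts made up):

```python
import hashlib

def diagonal_scores(a: bytes, b: bytes):
    """Score each alignment (diagonal of the XOR table) for non-randomness."""
    scores = {}
    for shift in range(-len(a) + 1, len(b)):
        vals = [a[i] ^ b[i + shift]
                for i in range(len(a)) if 0 <= i + shift < len(b)]
        if len(vals) >= 16:
            # crude stand-in for the log-odds: ASCII XOR ASCII is biased
            # towards 0x00 and other values below 0x20
            scores[shift] = sum(v < 0x20 for v in vals) / len(vals)
    return scores

keystream = hashlib.sha256(b"k1").digest() + hashlib.sha256(b"k2").digest()
p1 = b"attack at dawn, bring the documents with you, please hurry now."
p2 = b"retreat at dusk and burn every document we still have on hand.."
c1 = bytes(x ^ k for x, k in zip(p1, keystream))
c2 = bytes(x ^ k for x, k in zip(p2, keystream))
scores = diagonal_scores(c1, c2)
assert max(scores, key=scores.get) == 0   # the aligned diagonal stands out
```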
- (In general: don’t use stream ciphers, use AEAD modes. Recent TLS versions don’t even have stream ciphers. But malware uses stream ciphers frequently because they are easy to implement.)
Attacks on UEFI Security, Inspired by Darth Venamis’s Misery and Speed Racer
- Attacking successive levels of protections:
- BIOS_CNTL register:
- BIOS write enable, BIOS lock enable (an attempt to set write-enable causes an SMI, whose handler can reset write-enable before the attacker can write)
- Move from ICH (northbridge/southbridge) to PCH (Platform Controller Hub) added SMM_BWP: only allow writes if all processors are in SMM (without this, the BIOS write-enable bit is temporarily set, which allows another core to write before the SMI handler reverts it)
- most systems don’t currently set SMM_BWP (? <10% of ~2-year-old systems use it)
- ICH does not have SMM_BWP; can use “protected range” registers to protect, but still may be vulnerable
- With write protect effective, we need to break into SMM. The SMM lock down bits are cleared during reset, and ACPI S3 is a sufficient kind of reset. S3 wakeup involves executing a “boot script” stored in ACPI NVS which includes the contents of lockdown registers like BIOS_CNTL
- ACPI NVS is just an ordinary RAM with no additional protections! We could alter the boot script, alter data referred to by “dispatch” opcode (~eval), or alter the location of the boot script.
- Easiest is to change the pointer to the boot script to point to a modified copy.
- The UEFI development kit supports an “SMM lockbox”, storing the boot script in SMM-protected memory; only seen on one UEFI development motherboard, and there it contains a “dispatch” opcode pointing to unprotected memory.
- What to do in the boot script?
- SMM is protected from the CPU by SMM Range Registers, enabled before boot scripts run, but protected from DMA by TSEG, which is unlocked in boot-script context. So, set TSEG to make it ineffective: bypass TSEG in the boot script, then use the OS hard disk driver to DMA-write to the wanted address.
- All surveyed UEFI systems were vulnerable. Requires reverse-engineering the boot script format, which is vendor-dependent.
- Flash protections: protected region registers restrict writes to flash.
- UEFI non-volatile variables must be writable at OS runtime (the OS can change them), so at least these are writable in SMM mode.
- Found many instances of code that is vulnerable to malicious SMM code writing untrusted values to non-volatile variables, causing buffer overflows.
- Much easier: in the UEFI reference code, “capsule update” mechanism uses “authenticated variables” to authenticate updates, so the attacker can just use the normal firmware update path. UEFI is generally considering SMM to be in the trusted base; eventually there will be a new HW feature to protect against SMM attacks.
- It turns out ~70% of systems do not even use the PR masks; only HP does, but with incomplete coverage
- Kernel code/driver signing protect us in theory (prevents doing DMA, accessing hardware) but one can exploit a vulnerable signed driver
- The vulnerabilities should be fixable with BIOS updates.
Too Many Cooks—Exploiting the Internet of TR-069 Things
- TR-069 = CPE WAN Management Protocol: used to provision, monitor and configure home routers; v1.0 in 2004, up to v1.4 amendment 5 in 2013
- The home router talks to an Auto Configuration Server over SOAP; the client always initiates the session, the server sends {Get,Set}ParameterValues
- Earlier: many implementations and configuration flaws in ACS servers
- The ACS can request the CPE to initiate a connection; this is mandatory, i.e. every CPE is a server listening, by default on the port 7547 (which is the second most popular open port in the world, 1.2% IP addresses; though some ISPs use a different port)
- Aside: port 80: ~70M devices; 50% web servers, 50% IoT, diverse uses
- port 7547: ~45M devices, all IoT
- TR-069 census: port 7547 over the entire IP address space
- 1.18% responded, i.e. 46M devices all over the world
- Server implementations:
- 52% RomPager, 19% gSOAP, 6% mini_httpd, 8% KTT-SOAP, rest Apache
- RomPager: embedded HTTP server by Allegro Software, optimized for space
- 98.04% is RomPager 4.07, then see 4.03, 4.34 and 4.51; 4.07 still being embedded into newest firmware
- RomPager 4.07:
- Released in 2002; 2.2M devices serving it on port 80, 11.3M on port 7547 (“to our knowledge the most popular version of any service ever available on the public internet”); ~200 different devices ship it
- Of all the firmwares including RomPager 4.07, all have the same ZynOS header, targeting mips32
- ZynOS is a RTOS, no file system or permissions, just one binary file. Notorious for “rom-0” attack giving the attacker the full router configuration (including passwords) without any authentication just by accessing port 80.
- Fuzzing: crash on overflow of Authorization:Basic user name: unprotected strcpy() gives arbitrary code execution.
- Generally applicable, but memory layout depends on model and firmware version, and a bad guess means a crash and a new IP allocation = no more attempts
- Wrote ZORDON: ZynOS Remote Debugger (Over the Net) using the existing memory read/write primitives (“over the net?”)
- Another vulnerability: each dynamic HTTP request causes [??? lost; memory overwrite by concurrent requests because the request state is shared??]
- “Misfortune Cookie” vulnerability: RomPager supports cookies, but no dynamic memory allocation: so it preallocates 10 cookies of 40 bytes each, named C0–C9; but it actually accepts C$arbitrary_integer, giving us an arbitrary memory write (request shape sketched below). A few magic cookies allow bypassing any authentication and browsing the admin interface using “any port” (any port open by the router, it seems, i.e. also CWMP listening on the public internet)
- In some countries, 50% routers are vulnerable
- There is no way to turn CWMP off (i.e. to avoid this vulnerability) with the standard firmware!
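Shape of a probe against this bug (everything here is a placeholder: the cookie index, payload, and target address are made up; the real offsets are model- and firmware-specific):

```python
import socket

HOST, PORT = "192.0.2.1", 7547        # placeholder target (TEST-NET address)
request = (
    "GET / HTTP/1.1\r\n"
    f"Host: {HOST}\r\n"
    # RomPager reserves slots C0..C9 only; an out-of-range index turns the
    # "store this cookie" logic into a write outside the array (index and
    # value below are made up, not working offsets):
    "Cookie: C107373883=AAAAAAAA\r\n"
    "\r\n"
)
with socket.create_connection((HOST, PORT), timeout=5) as s:
    s.sendall(request.encode())
    print(s.recv(4096))
```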
- Supply chain / manufacturing process: AllegroSoft => single chipset vendor providing a SDK => device manufacturers (Asus, D-Link, Huawei, TP-Link, ZTE) => ISPs customizing firmware
- The fix propagation process is ~non-existent.
- Vendor communication: AllegroSoft statement: “we provided a patched version in 2005, but we can’t force any vendor to upgrade to the latest version”
- ISPs could probably protect in some ways, e.g. using an internal IP range for the 7547 port and making it unavailable from the public internet; or at least change the port to not be that obvious (but still vulnerable to a full scan)
The Matter of Heartbleed
- Estimated 24–55% of HTTPS websites were initially vulnerable
- The heartbeat extension is not necessary for TLS/TCP, but useful on other transports
- Tracking vulnerability: scanned Alexa Top 1M, 1% of IPv4 regularly. Checked for a non-compliant OpenSSL behavior instead of directly exploiting the vulnerability.
- Evidence that 44 of the top 100 were initially vulnerable; some remained vulnerable after 24 hours. All of the top 500 were patched within 48 hours.
- First scan, 2 days after:
- 45% of sites support HTTPS, 60% of those support heartbeat. Impact on Alexa sites: 24% lower bound (per support of TLS 1.1/1.2, which indicates a new enough OpenSSL), 55% upper bound (per heartbeat support and server software used). Overall, 11% of IPv4 hosts support heartbeat, 6% vulnerable.
- Patching behavior: ~4% of Alexa 1M are still vulnerable today, ~2% of IPv4 hosts are vulnerable.
- Tracking attacks, on an otherwise unused IPv4 address
- First evidence of attacks is before the tracking scans started
- No evidence of indiscriminate internet-wide scans prior to disclosure; first scan was 22 hours after disclosure, then overall 6k attempts.
- Looking at the attack tracker and a honeypot, only 11 hosts scanned both, and only 6 scanned more than 100 hosts at ???; i.e. indiscriminate attacks were actually rare.
- Global vulnerability notification:
- 2 weeks after disclosure decided to make a global notification; split them into 2 groups, which allowed to observe a 47% increase in patching caused by the vulnerability notification—i.e. vulnerability notification actually works (even in cases when the notification bounced)
- Handling the cryptographic key compromise:
- Only 10.1% of known vulnerable sites replaced their certificates, 14% of them just reused the same private key! Only 4% websites revoked their vulnerable certificates.
- Lessons:
- Lack of attention to critical open source projects led to this
- Looking at the past year, the HTTPS ecosystem remains incredibly fragile
- The PKI is not equipped to handle massive revocation; need to account for massive correlated revocation need
- Recent advances in scanning allow us to quickly respond and measure
- Notification of vulnerability, based on measurements, actually works.
Heartache and Heartbleed
- CloudFlare was told about the vulnerability, patched it locally, planned to disclose Apr 9
- Why tell CloudFlare? “every CloudFlare server can serve every site, which would make it very bad” (???)
- Actually disclosed Apr 7 by OpenSSL
- Codenomicon launched heartbleed.com, hit mass media
- Then, “we had our site fixed and some free time”
- Keep our scanner up: wrote a Go service, hosted on Amazon+CloudFlare. Up to 22k scans per minute, total 203M tests in the first 14 days.
- Use our network as a honeypot: logged every heartbeat with a mismatched length.
- Logs from April 9th, identifying tools per message size: >60% ssltest.py, ~10% filippo.io (?); same a week later, i.e. didn’t see massive private scanning
- Figure out what to do about the certificates
- The risk: typically not logged, a single request gets up to 64k of memory
- Cookies, session state, passwords were known to be revealed
- What about private keys?
- Reading OpenSSL code, it seemed that the private key is allocated during startup, all temporary copies have memory zeroed after use, and the server is single-threaded so there should be no risk.
- Set up a challenge; which resulted primarily in trolling (placing likely-looking data into the server memory by sending requests)
- 10 hours to get the key, 12 people got the keys within the first 24 hours
- The mechanism: some temporary variables were not wiped in some cases of the Montgomery multiplication
- Most solved it by checking every 128-byte block for prime factors of the modulus; one did it in 50 requests due to the small public exponent.
- Revoking certificates:
- Three methods to do this:
- CRL: flat file of revoked certificates; GlobalSign CRL grew from 22 kB to 4.7 MB, i.e. 30 Gbps + 100 Gbps waves every three hours = self-DDoS
- OCSP: broken, Chrome ignores it (hard fail breaks captive portals, soft fail allows dropping the OCSP response)
- CRLSets: Google’s proprietary method, they collect ~all CRLs and update the browser with a new list—but this only applies for EV certs, and updated only when the browser is updated; i.e. helped for none of the 100k CloudFlare certificates
- Ended up patching Chrome with a local hack that recognizes CloudFlare and revokes all old certificates
- Conclusions:
- Disclosure in Open Source is hard
- Disable features by default
- Many “attacks” were just scans
- Crowdsourcing (the challenge) was effective
- Revocation needs a solution
- Support OpenSSL
- CloudFlare is blocking Tor; the main problem is that a lot of spam comes through it
Security Analysis of a Full-Body X-Ray Scanner
- Bought one on eBay from a European seller, who had bought a ~new one at a US government surplus auction
- Works by scanning an X-ray beam across the subject, measuring backscatter
- Can see bones very close to the skin, zipper and rivets from jeans
- Radiation safety:
- Ordinarily, a scan is 70–80 nSv, i.e. ~24 minutes of background exposure or eating one banana
- Some safety controls: verifying the beam is moving, etc.; can be overridden by the software in ROM; but the out-of-ROM embedded system does not have sufficient control to over-radiate.
- Privacy:
- TSA was claiming the machine could not save images, but the machine as manufactured actually can (on a floppy disk :) )
- X-rays backscatter in all directions, so any nearby adversary can reconstruct the images as well
- Efficacy:
- Wrote image-viewing malware which detected an X-ray-visible (but not human-visible) pattern and used it as a trigger to show a benign image instead.
- Attacks without tampering with the machine:
- The skin is reflecting backscatter, but both a metal gun and the background don’t, so it is possible to simply carry metal weapons outside of the body area. Possible mitigation: scan also from the side(s).
- Plastic explosives: can detect C4 blocks, but primarily by its edges—actually by shadows. So can mold the C4 into a thin pancake (and use the metal detonator as a belly button replacement when carrying the pancake over the belly).
- A few of these attacks were predicted even without access to the machine; having a machine makes fine-tuning the attacks obviously easier.
- Mitigations:
- The machines were pulled from the airports anyway :) Also disclosed to DHS and Rapiscan, but didn’t get much engagement.
- Suggested mitigations: perform also side scans, pair with metal detectors.
- DHS Office of Inspector General Report:
- The 106 machines that were decommissioned were put away into a warehouse, with public enough access and a fence only added later.
- At least one of the decommissioned machines was not properly wiped of TSA software.
- This is a “secret development model” data point: Either the TSA didn’t find these flaws, or found them and deployed anyway. We don’t know which one…
- Suggesting more third-party audits.
- The machines were withdrawn because the manufacturer couldn’t write “automatic target recognition” (image analysis without showing the images to people) before the deadline; so they may be redeployed if the software is written, and in the meantime can be used by other agencies that aren’t prohibited from viewing the images.
- There is a public report of where these machines went, mostly courthouses and jails.
- TSA is still contracting with others to develop such machines.
- Lessons:
- The effectiveness is limited by the physics of the method (X-ray single-value brightness per pixel)
- A software compromise can make the physics irrelevant.
- Procedures are critical, but not as reliable as embedded software; the information may get forgotten. (e.g. the UI really nudges the operator to doing 2, not 4, scans per subject.)
- Adversarial thinking and testing matters.
- Simplicity and modular design help. (E.g. discrete logic and simple protocols instead of SoCs)
- Secrecy of the design didn’t prevent discovering the attacks even without having access to the machine; OTOH the ability to test on a real machine may help against fine-tuning; the attacks got much better with a few iterations of refinement.
- It’s not clear that restricting access is feasible given the wide deployment.
- They have lost track of all the surplus machines, reportedly one can now be bought for $4k.
- Did not test explosives inside a human body, didn’t get a willing subject. It’s clear the scanner can see a little into the body, won’t even speculate on the more details.
- Could probably have a checkerboard or something like that on the background to detect the outline better, but the TSA is deploying scanning from two sides at the same time so there is no background.
- Journalists were able to use the FOI act to get saved images from a courthouse; so it is clearly possible to save images; the TSA just claims that their software does not do that (but can still take pictures with a cell phone…)
- Someone in the audience suggested to use e.g. pig skin to mask contraband: difficult to do in practice with raw meat, and it would have to be tapered to avoid unexpected shadows.
- “Best way to smuggle a gun is to wrap it in a simulated plastic explosive”, and it is trivial to buy the simulated plastic explosive.
- Did not test leather clothes.
- Quantifying excess deaths from the widespread deployment: Given the low levels of radiation, not confident in the models for impact that we have, but the models suggest that there is <1 excess death so far.
Let’s Build a Quantum Computer!
- Building blocks:
- Qubit: can be in two states (~0,1) at the same time, with a phase between them. Imagine it as interference between two waves: depending on the phase, they combine into a different waveform over time.
- Every measurement also changes the state; this causes a “collapse of the wave function”: we measure only one of the states, with a probability that depends on its amplitude.
- Multi-qubit superposition states: we can set up a multi-qubit register where both states of each qubit are equally likely; the result is that the register holds all 2ⁿ states in superposition.
- Quantum gates: ~functions, but a different set of functions: e.g. cannot simply copy a state. There is a universal gate, though. A quantum gate converts a superposition of input states into a superposition of the function values for all input states.
- Quantum entanglement: can have a function that returns a superposition of (01,10); then reading/collapsing one qubit also inherently collapses the state of the other qubit, even before we have measured it!
- Cracking passwords:
- Can have a quantum function that evaluates (password validation + copy of password) for all inputs at once, but then we can read the wanted variant only with probability 1/N, i.e. no better. The Grover search operator, after sqrt(N) invocations, raises the probability of reading that variant to ~1; i.e. brute force goes from O(N) to O(sqrt(N)) [is that O(sqrt(N)) time or that amount of hardware??] (see the sketch below)
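A toy classical simulation of Grover's amplitude amplification (state vector of N real amplitudes; oracle plus inversion about the mean):

```python
import math

N, marked = 256, 123
amp = [1 / math.sqrt(N)] * N            # uniform superposition over N states

iterations = round(math.pi / 4 * math.sqrt(N))
for _ in range(iterations):
    amp[marked] = -amp[marked]          # oracle: flip the marked amplitude
    mean = sum(amp) / N
    amp = [2 * mean - a for a in amp]   # diffusion: inversion about the mean

print(iterations, amp[marked] ** 2)     # 13 iterations, probability ~0.99
```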
- Shor’s algorithm for factorization: ~n³ operations, where n is the number of bits of the input
- Physical implementations:
- Ion trap: 50-100 qubits
- Superconducting quantum processors:
- Challenges:
- Decoherence: The surrounding environment in the vicinity is also measuring/manipulating the qubits, collapsing the quantum state.
- Gate fidelity & qubit-qubit coupling: it is difficult to reliably switch on/off qubit coupling of many qubits with high precision
Correcting copywrongs
- Worst of both worlds:
- English tradition, with fair use decided by judges: i.e. opaque, benefits those who can go to court.
- Continental traditions, with inalienable rights of authors and fair use exceptions written in the law: i.e. inflexible, the law must change for new types of use.
- EU law is both inflexible and opaque: users’ rights are not harmonized; there are 20 optional copyright exceptions and every member state decides which to adopt, so one has to both wait for law updates and have lawyers to understand the result.
- Examples:
- “Freedom of panorama”, i.e. taking a photo of a public place does not violate architectural copyright even if a building is a subject of the picture: not an exception in most countries. (Wikipedia ended up taking a photo of flags in front of the building, with the building only in the background.)
- Parody: In some countries this is an exception against copyright of the subject of the parody, but not an exception against copyright of others’ works used within the parody
- Different copyright terms: EU only has a minimum copyright length.
…
EMET 5.1—Armor or Curtain?
- Demonstrated on a Firefox Array.reduceRight() missing bounds check, which allows addressing memory relative to the array.
- First step is leaking an absolute address by allocating two array objects, and reading the first one using the second one.
- Code execution by overwriting virtual method table pointer within an object; but need to bypass Data Execution Prevention.
- EMET overview: injects EMET.dll into all protected applications, then hooks functions. Thus, paradoxically, bypassing EMET is best done by relying on contents of EMET.dll.
- Locating EMET: Can either find the GetModuleHandle function, or get the location of one of the EMET hooks.
- Simplify: use ROP only for a minimal JS runtime corruption, then do the rest in JavaScript: Use a string to read the hook data, to write directly to the stack, or to read all of EMET.dll.
- Bypassing EMET ROP protection:
- Bypass each separately
- e.g. a check that a function was called rather than returned/jumped into (by checking the return address). Bypass: return into code that does call that function.
- Or edit the EMET global disable flag
- Or use hard-coded system calls instead of the hooked functions; that is less reliable, the numbers change even between service packs—but we can just read the syscall number from the hooked/copied function.
- Export Address Table access filtering: a HW breakpoint on the AddressOfFunctions fields, checking that a “loaded module” is reading the field. Bypass by using a ROP gadget or reusing an existing function (both are inherently within a “loaded module”), or by just disabling the HW breakpoint (hard-coding system call functions; it used to be possible to call a user-land function for this, but that is now also hooked).
- Is EMET being deployed? They don’t see it much, because there are false positives / compatibility problems.
- “In the default configuration Windows is harder to exploit than Linux” because PIE is not default in Linux.
DP5: PIR for Privacy-Preserving Presence
- PIR = Private Information Retrieval: get data from a DB server without revealing to the server what is being retrieved; not trying to hide who is asking for the data. (E.g. to prevent whois requests triggering a front-running domain registration.)
- Trivial: just let the DB server send its complete DB to the client.
- Computational PIR: Alice encrypts her query to her own public key; using partial homomorphy, the server is able to create an encrypted response from that encrypted query and the plaintext DB, and sends the encrypted response back to Alice.
- Information-theoretic PIR: Assume multiple servers with unlimited power, but not colluding; much faster than computational PIR.
- e.g. treat the DB as an r x s matrix D of r records of s bits each; then ask each server for the sum (XOR) of a subset of rows of the DB, choosing the subsets so that each looks random on its own but they add up to exactly the wanted record (sketch below).
- Extensions: variable-sized records, look up by keyword or SQL instead of record index, robustness to nonresponsive or malicious servers
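- A minimal 2-server XOR sketch of this (my illustration; assumes equal-sized records and non-colluding servers):

```python
import secrets

# Each server sees a subset of row indices that is uniformly random on its
# own; the two subsets differ only in the wanted row, so XORing the two
# replies cancels everything except the wanted record.
db = [b"rec0", b"rec1", b"rec2", b"rec3"]     # r records, s bits each
wanted = 2

mask1 = [secrets.randbits(1) for _ in db]     # random row subset
mask2 = list(mask1)
mask2[wanted] ^= 1                            # differs only at `wanted`

def server_reply(db, mask):                   # XOR of the selected rows
    acc = bytes(len(db[0]))
    for rec, bit in zip(db, mask):
        if bit:
            acc = bytes(a ^ b for a, b in zip(acc, rec))
    return acc

r1, r2 = server_reply(db, mask1), server_reply(db, mask2)
print(bytes(a ^ b for a, b in zip(r1, r2)))   # b'rec2'
```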
- Shamir secret sharing: choose a random polynomial passing through the point (0, secret); hand out other points on the polynomial to the participants; then (degree of polynomial + 1) points are needed to reconstruct the secret (sketch below).
- Can be extended for error correction
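- A minimal Shamir sketch over GF(p) (my illustration; the parameters are arbitrary):

```python
import random

p = 2**127 - 1                      # prime modulus; all arithmetic mod p

def make_shares(secret, k, n):
    # degree-(k-1) polynomial with f(0) = secret; shares are points on it
    coeffs = [secret] + [random.randrange(p) for _ in range(k - 1)]
    f = lambda x: sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):            # Lagrange interpolation at x = 0
    secret = 0
    for xi, yi in shares:
        num = den = 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % p
                den = den * (xi - xj) % p
        secret = (secret + yi * num * pow(den, -1, p)) % p
    return secret

shares = make_shares(123456789, k=3, n=5)
print(reconstruct(shares[:3]))      # any 3 of the 5 shares suffice
```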
- Private presence system: want friend registration, presence setting and status query; assuming a passive adversary and some amount of honest/non-compromised and non-colluding servers; security guarantees: privacy and integrity of the network connections; unlinkability over sessions
- DP5 protocol:
- Assuming Alice and Bob share a key!
- Uses PIR to let a user query status of each friend.
- Instead of registration/deregistration, split time into separate epochs, and have just set status / query status.
- For each epoch, (K_epoch, ID_epoch) = PRF(K_ab, epoch); then store the status under ID_epoch, encrypted using K_epoch, and let friends do a PIR lookup for ID_epoch (sketch below)
- To overcome problems with handling the large database, split into a long-epoch DB containing a public key for each user (not relationship), and a short-epoch DB containing the presence status. [Why does using a different key decrease the short-epoch DB size at all?]
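- A hypothetical sketch of the per-epoch derivation (my guess at the shape, using HMAC as the PRF; key sizes and encodings are illustrative, not the DP5 wire format):

```python
import hmac, hashlib

def epoch_values(k_ab: bytes, epoch: int):
    # derive an epoch-specific encryption key and lookup ID from the
    # long-term shared friend key K_ab and the epoch number
    out = hmac.new(k_ab, epoch.to_bytes(8, "big"), hashlib.sha256).digest()
    return out[:16], out[16:]                 # (K_epoch, ID_epoch)

k_ab = b"\x00" * 32                           # key Alice and Bob already share
k_epoch, id_epoch = epoch_values(k_ab, 17025)
# Bob uploads Enc(K_epoch, status) under ID_epoch; Alice recomputes both
# values and issues a PIR query for ID_epoch.
```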
- Cost: “monthly per-user cost” goes toward $1 per user at 1M users
- Sharding the user database: would only allow users within a shard to be friends with each other.
Thunderstrike: EFI Bootkits for Apple MacBooks
- Boot ROM: the 64 Mbit SPI serial ROM has a public datasheet, so dumped the contents
- A single-byte change of the boot ROM bricks the machine; but the CPU fans first spin up and then spin down a few seconds later, so “something” is checking the ROM. Replaced the initial few instructions of the boot ROM with an infinite loop to see what is spinning down the fans; the fans keep spinning, so the check is done by software somewhat under control of the boot ROM. Found that “zero bytes” in an EFI header actually contain a CRC32 checksum of the rest of the firmware. (An actual cryptographic check done from the same ROM it is checking would be pointless anyway.)
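- Which is why a corrupted ROM bricks the machine while offering no security: anyone who modifies the firmware can simply recompute the CRC32. A sketch (mine; the field offset and the exact coverage of the checksum are simplified):

```python
import zlib

def fix_crc(firmware: bytearray, crc_offset: int) -> bytearray:
    # recompute the CRC32 over everything except the checksum field itself,
    # then patch the field so the modified image still "verifies"
    body = firmware[:crc_offset] + firmware[crc_offset + 4:]
    crc = zlib.crc32(body) & 0xFFFFFFFF
    firmware[crc_offset:crc_offset + 4] = crc.to_bytes(4, "little")
    return firmware
```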
- The ROM contents are all high-entropy; found the start of the BIOS block via the EFI firmware volume format; its “proprietary” compression format was identified by the magic header as LZMA.
- Preventing flash writes from the OS: the FLOCKDN bit can only be cleared by a reset. Firmware updates happen using a “bless” utility which sets up an EFI “recovery boot” using a .scap file. SCAP is undocumented, but a UUID identified an EFI_CAPSULE_HEADER preceding the EFI firmware volume, and similarly a UUID revealed an RSA-2048 SHA-256 signature header. So the “bless” tool does not check the signature; only something during the firmware flash process does. (And if the signature check fails, the machine is bricked.)
- Looking for the UUID strings, found the code for the signature checking; it does the RSA check and then calls into the firmware image to do the actual firmware write. Tested bypassing this RSA check in the ROM to make sure no other hardware verifies the RSA signature.
- Thunderbolt: opens the PCI bus and has an Option ROM facility to run code from the device. Created an option ROM with an infinite loop to show that option ROMs are loaded during the recovery-mode boot, but only after the SCAP signature is verified. OTOH, one can let the signature check pass and hook the actual firmware-writing process (to just write something else if the Option ROM is large enough, or to modify the signed firmware only so that it contains a different public key and an appropriate CRC32 checksum, then reboot and flash a firmware “properly” signed with the replaced key).
- Note that once the public key is replaced, there is no way software can recover from the firmware overwrite.
- Weaponizing:
- Could intercept hardware in shipment, evil maid attack, border inspections.
- Early boot code can be used to write to Option ROMs, i.e. they can spread virally.
- Might be possible to do with just a root exploit, using the embedded Option ROMs(?)
- The Boot ROM can use SMM or virtualization to hide its own presence.
- Unknown whether this can be performed with two MacBooks connected via Thunderbolt directly, without modifying hardware.
- Mitigations:
- Apple’s proposed fix: No longer load Option ROMs during a firmware update (on new machines); but will still be loaded on normal boots; also it is possible to downgrade the firmware to an older vulnerable version (if it exists).
- Could re-add TPM; would not prevent the attack but would detect it.
- Implement Option ROM driver signing.
- Just disable Option ROMs (the Thunderstrike exploit does this itself to prevent re-exploitation/cleanup). Apple devices don’t need them; the drivers are embedded in the firmware.
- There should be a way to disable PCIe on Thunderbolt entirely, because users can’t tell the difference between miniDisplayPort and Thunderbolt; e.g. enabling it should require a firmware password (which is currently checked only after Option ROMs run). OTOH Option ROMs are only called during boot, so a reboot would be necessary, which would be noticeable; but a “slot screamer” performing an active DMA attack would still be possible.
- HW modification, hardwiring write-protect and the like.
- SecureBoot Option ROM signing would prevent this.
Computer Science in the DPRK
- An American found an online video about a university being founded, and simply emailed that he was interested in teaching there.
- The initial student body was “determined to be” men; when asked about admitting women, the response was that there may be a nursing program.
- Supposedly because the taught material was not supposed to be discussed publicly, and men are expected to be less talkative.
- Taught in English; students had 1 year of English instruction at the university, and probably some high-school English.
- Had to have a guide to leave the campus.
- Internet:
- Star Joint Venture (DPRK and a Thai company): manage fiber cables
- Koryo Link (a joint venture with a European company): provides cellular access; very expensive; ~1M subscribers (in a capital of 4M people)
- He had unfiltered internet access, but there was a campus-level proxy that required authentication.
- No DHCP; had to manually enter the gateway address.
- Undergraduate students didn’t have Internet access; graduate students did, somewhat restricted: they had to go to a proctored room.
- There is an internal intranet:
- Typically a physical location has either the worldwide one or the internal one.
- For those who don’t have access, there is a “library service” that can find and download a document on request.
- About 3k sites in there.
- Their own DNS system.
- Didn’t see any IPv6.
- Computers:
- Generally running Windows (mostly XP, a little 7), apparently mostly imported from China.
- RedStar:
- Didn’t see anyone seriously using RedStar. Gut feeling is that it is more used for industrial control (due to vendor lock-in).
- Looks very OS X-like.
- Default installation doesn’t give you root access.
- Some applications clearly skinned, others apparently new. There is /Applications/*.app with a close copy of OS X bundle format.
- Bokem.app: says “you can’t run this unless you are root”, but users don’t have root access. If overridden, it allows creating encrypted disks; includes AES, Blowfish and Twofish.
- Web browser: renamed Firefox
- Mobile network:
- Lot of feature phones, now Android devices.
- Had an Android tablet with an analog TV but no wifi or Bluetooth.
- Ice Cream Sandwich-level Android.
- Includes a book library of leaders’ speeches etc., reskinned Angry Birds.
- Settings limited; USB does not work.
- To get apps, go to a physical store and get them loaded in.
- Can be booted into Android recovery mode, “only slightly broken”.
- The book library is actually accessing a separate encrypted partition?
- Basic sense is that they want to have people knowledgeable about Android to help expand the customized Android ecosystem.
- Technology availability:
- “Not uncommon” to see phones around the capital; no idea about the situation outside the capital.
- Didn’t see many laptops but wasn’t in situations where he’d expect them.
- Brought in a Chromebook and Raspberry Pis; they weren’t checked at import.
The Perl Jam: Exploiting a 20-year Old Vulnerability
- Perl lists vs. typing driven by operand scalar/array/… context
- CGI module: CGI->param() returns either a scalar, or a list if there were several occurrences.
- OWASP documents this incorrectly.
- Assigning a list as a value within a hash is treated as an embedded list->hash conversion, allowing the list to inject key/value pairs. (Known since 2006; stand-in sketch below.)
- Bugzilla: some privileges are given via an email regex; after email validation, user creation runs with a hash that includes a CGI parameter; i.e. one can send a multi-valued “real name” to override the login name with an unverified email, gaining those regex-based privileges.
- ($a, $b, $c) = @_; used to expand function parameters; if a parameter is a list, it is spliced into the function parameters, allowing a list to override the following parameters (even ignoring the other submitted parameters entirely)
- DBI module: DBI->quote() used against SQL injections; DBI->quote(CGI->param(‘…’)) can be used to change the quoting “type”, and the number quoting “function” just returns the passed object unmodified.
- => TWiki RCE, arbitrary file upload; MovableType SQL Injection.
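- The bug class, sketched in Python as a stand-in (the original is Perl’s implicit list flattening; here the splice is made explicit to show the effect of a multi-valued parameter landing in key/value position):

```python
def build_user(extra_pairs):
    # trusted defaults first, then attacker-influenced pairs spliced in last
    record = {"login_name": "verified@example.com", "admin": False}
    record.update(extra_pairs)      # later pairs silently override earlier ones
    return record

# a multi-valued "realname" parameter expands into extra key/value pairs:
attacker_pairs = [("realname", "Bob"), ("login_name", "evil@x")]
print(build_user(attacker_pairs))   # login_name overridden by the attacker
```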
- Summary: Perl is a hazardous, bizarre language; stop using it.
- use strict; doesn’t help; warnings “only warn you” [? does it mean there are warnings on the program always, or only when exploited?]
- A new version of CGI.pm adds a warning (only) on multi-valued parameters.
- Why didn’t anybody notice within the 20 years?
- Mitigation: cast every value to a scalar or do something else.
- Prepared statements in DBI are safe with respect to SQL injection, but a list can still inject/override the following arguments.
unhash—Methods for Better Password Cracking
- Default/known passwords:
- Collected default passwords list
- Some hardcoded backdoors: HP Storage; Kronos Access Control (airport access control, opening doors); Morpho Itemiser 3 (scanning for narcotics)
- Also VPN/FW/antispam devices.
- sshpot.com collects data from ssh honeypots, which allows collecting ongoing attack patterns = others’ attack methods (unknown whether these are backdoors or leaked inside information)
- Difficult to collate the sources, explanations of individual entries, …; proposing a centralized tagged repository; “contact me”
- Cracking human-created passwords:
- Want a data-driven approach, based on password dumps.
- Clean data (including occurrence counts) is actually hard to get.
- Simple machine learning fails due to overtraining.
- Word lists for various languages: couldn’t use hunspell/aspell because people don’t use only correctly spelled words, also various names. Wikipedia database dumps are ideal starting points for language-specific word lists, further improved by scraping some Google results.
- This toolset is also useful for targeting a specific industry / interest group.
- Manually wrote classifiers for identified password classes; then re-classified the data set to find a new unknown subset, wrote a new classifier for it, and iterated… (sketch after this list)
- i.e. not actually autonomous machine learning; the resulting rule set is ~ specific to the small set of used languages
- Result is a sequence/recurse rule describing the password class method; this can be submitted to a guessing algorithm to generate other guesses.
- Ended up finding 23% more passwords in the validation set than John the Ripper (at the cost of not finding short truly random passwords).
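- A toy sketch of the classify-then-generate loop (mine; one handwritten classifier for a “word + digits” class, replayed as a guess generator):

```python
import re

CLASSES = {"word+digits": re.compile(r"^([a-z]{3,12})(\d{1,4})$")}

def classify(pw):
    # tag a cracked password with the first matching class, if any
    return next((name for name, rx in CLASSES.items() if rx.match(pw)), None)

def generate(wordlist, max_digits=2):
    # replay the "word + digits" rule as a guessing algorithm
    for word in wordlist:
        for n in range(10 ** max_digits):
            yield f"{word}{n}"

print(classify("dragon99"))                  # -> "word+digits"
print(list(generate(["winter"], 1))[:3])     # ['winter0', 'winter1', 'winter2']
```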
Why Are Computers So @#!*, and What Can We Do About It?
- “Fundamental process is trial and error of building and testing execution paths”, and there are too many execution paths and states
- Computers are discrete; we can’t do “load testing” and conclude that it will be safe for less demanding uses.
- Possible improvements:
- Do engineering better: doesn’t change root causes
- Should use 1970s languages (ML) instead of 1960s (BCPL, C); languages that have actually been designed using well-defined semantics for syntax, types and the like.
- Prove correctness: getting better (verified “C-like” and “core ML” compilers, LLVM optimization passes, L4 hypervisor, crypto protocols)
- Specify+test behavior of key interfaces (e.g. already happened for {x86,ARM,Power} multiprocessor behavior, TCP and sockets API, C/C++11 concurrency models, OCaml core language, C, ELF linking and loading)
- Even the CPUs are primarily described in prose, not a formal specification. So create a formal, somewhat abstracted/simplified, model/specification and test that the HW is not doing anything unexpected.
- Conversely, this allows using the model as an oracle: is an observed behavior allowed by the model?
- Wrote this, and converted it to OCaml and also JavaScript (i.e. it can run in the browser).
- (Nondeterminism (input dependence, or implementation leeway) makes testing difficult.)
- Recap: cost/benefit tradeoffs:
- Use better programming languages.
- Make real specifications of key abstractions, executable as test oracles, optimized for clarity rather than performance, and usable as basis for eventual proofs.
- Full formal verification
Let’s Encrypt
- Expanding from protecting credit card numbers to logins to session cookies… Really, “Networks are untrustworthy and communication needs to be protected”:
- Sidejacking and location tracking through some other server’s cookies
- Integrity of software downloads
- Reader privacy
- Protection against keyword-based censorship in networks
- Protection against ad injection, tracking header injection or malware injection by ISPs (none of these depends on whether the server has any confidential data or expects an attack)
- Barriers to adoption:
- Perception that TLS is slow or resource-intensive;
- Difficulty integrating into load-balancing designs
- Cost and effort of obtaining and managing PKI certificates; even knowledgeable people typically need an hour if not doing it regularly.
- “Let’s Encrypt”:
- EFF, University of Michigan, Mozilla
- A fully automated CA to issue certificates to any site quickly and at no charge.
- Accepted by browsers: the CA will be cross-signed by IdenTrust
- [Why is IdenTrust helping their competition?]
- Sponsored by Akamai and Cisco
- Only “domain validation”, that the applicant controls the domain name or a server at that domain name.
- Operation:
- There will be a client application talking to the CA; the ACME protocol is being developed for this.
- The client asks for a certificate; the server issues some challenges to prove domain ownership and, if successful, gives out a certificate.
- Organization: a California nonprofit, expecting to be available to the public in June 2015
- Verification method: DVSNI: the verifier asks the applicant to put up, on a separate domain name using SNI, a self-signed certificate containing some server-provided information. This proves control of the web server (not only of the domain). (Sketch at the end of this section.)
- Plan: A single command to both obtain and deploy the certificate; working on something to parse and modify Apache (and others) configuration. Will also have a stand-alone “get a certificate” mode.
- Revocation implemented, only requires the private key.
- Looking at short-lived certificates (as an alternative to CRLs/OCSP), not decided yet.
- Misissuance mitigations being considered:
- Perhaps publishing all certificates in Google Certificate Transparency.
- May prevent issuance for a domain that already has a certificate (from them or anyone?) unless they can prove control of the key.
- Considering a mechanism for a domain to ask to never issue a cert for them.
- DNS spoofing countermeasure: validate from multiple places (also tried using Tor, not sure whether that will happen)
- Wider integration:
- Would like to integrate this in every server OS or web server.
- Will submit the ACME protocol for standardization.
- Currently no plans for non-TLS certificates; for non-HTTPS use it should more or less work, because the CN/subjectAltName is the host name, so one can ask for an HTTPS certificate and use it for other protocols.
- Already support multi-domain-name certificates
- Don’t know how to do wildcard certificates safely; but with a <1 minute issuance time wildcard certificates may not be needed as much.
- Eric Rescorla’s involvement in a NSA backdoor: “impression is that he was called upon to edit/issue the standard as a chair of the working group, not that he has endorsed this.”
- Don’t have specific plan for how DANE relates to this.
- Having shared hosters automatically deploy this for their customers would be welcome (how does this allow the customer to leave?)
- Unknown whether this would allow issuing for .onion
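- A rough sketch of the DVSNI-style probe mentioned above (my reconstruction of the shape, not the ACME implementation; a real verifier would parse the returned certificate rather than do a crude containment check):

```python
import socket, ssl

def dvsni_probe(server_ip: str, challenge_sni: str, token: bytes) -> bool:
    # ask the applicant's server, via SNI, for the CA-invented challenge
    # name, and inspect the self-signed certificate it serves back
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE          # the challenge cert is self-signed
    with socket.create_connection((server_ip, 443), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=challenge_sni) as tls:
            der = tls.getpeercert(binary_form=True)
    return der is not None and token in der
```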
Now I Sprinkle Thee with Crypto Dust
- (A multi-talk)
- Dename:
- Consensus-based system to map users to public keys
- (Other options: PGP, SSH/TOFU, x509)
- First come, first served [i.e. no kind of verified identity?]; changing a public key requires a signature by the old one.
- Merkle tree, public keys in leaves, interior nodes = hashes of leaves
- Public “leader servers” collect update operations, coordinate their order, and publish them; then separate verifier servers verify the operations’ legitimacy and sign the resulting tree.
- Allows arbitrary consensus requirements.
- New development in OTR:
- Reference implementations are incomplete (?)
- Clients are “often” unsafe/unstable (took Pidgin 6 months to fix an arbitrary code execution exploit)
- LEAP Encryption Access Project:
- Want to bring back federated service protocols instead of monopolies; improve on them by protecting users from providers and providers from users
- LEAP:
- “a platform” to make it easy to run a provider
- New protocols: Soledad (searchable client-encrypted synchronized database), Bonafide (user registration, authentication, password change etc.), automated key management for OpenPGP
- Goal: make email work using this. “Had this working a year ago” but many details are left
- Pixelated?
- Want to increase the cost of blanket surveillance by making encryption widespread.
- i.e. a web interface for LEAP-based email.
- You can decide where your keys are.
- Equinox:
- DNSSEC/DANE: DoS issues, ICANN has root keys, TLD operators have TLD keys.
- Tor allows domain owners and users to bypass MITMs (but DNS over UDP is inefficient over Tor).
- Dark Internet Mail Environment
- Just published architecture and specifications: https://darkmail.info/; looking for feedback.
- Goals: “make encryption automatic for the masses”; link security of the system to the strength of the password and the endpoint computer.
- Uses “DMTP” but “relatively transport-agnostic”
- Core: a service provider posts a DNS record with their public key.
- The envelope is moved out of the SMTP protocol into the encrypted message; so the sender’s server learns which domain, but not which account, the message is going to, and the recipient’s server learns which domain, but not which account, it came from.
- Connects to port 26 using TLS, or 25 STARTTLS MODE=DMTP
Tor: Hidden Services and Deanonymization
- How Tor works: nested encryption: guard (entry)->relays->exit. The client downloads a full list of relays and randomly selects a path. (Layering sketch below.)
- Every relay publishes a descriptor to the “authorities”, there are 9-10 Directory Authorities (with public keys embedded in the client). Authorities test relays, sign their descriptor, and establish a consensus.
- Statistics: 1500 guard relays, 1191 exit relays, 6 bad exit relays detected (e.g. fiddling with the traffic, doing MITM attacks, patching downloaded binaries)
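- The nested encryption, as a toy sketch (mine; real Tor negotiates per-hop circuit keys via telescoping and uses its own cell format, not Fernet):

```python
from cryptography.fernet import Fernet   # pip install cryptography

# The client wraps the payload once per hop; each relay peels exactly one
# layer, so the guard never sees the payload and the exit never sees the client.
hop_keys = [Fernet.generate_key() for _ in range(3)]   # guard, middle, exit

cell = b"GET / HTTP/1.1"
for key in reversed(hop_keys):       # innermost layer belongs to the exit
    cell = Fernet(key).encrypt(cell)

for key in hop_keys:                 # guard peels first, exit peels last
    cell = Fernet(key).decrypt(cell)
print(cell)                          # b'GET / HTTP/1.1'
```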
- Deanonymization:
- Controlling consecutive relays, can pass metadata between them
- Controlling the guard and exit relay, can correlate packet counts/times
- Countermeasures:
- Tor wants to make it difficult to become users’ guard relay: picks a node on boot and keeps using it for months. (Or could also regularly switch guard relays, but that increases the risk that some traffic will be attacked.)
- Could use high latency to make traffic correlation difficult (if the application isn’t interactive)
- Could add padding (where/to/what?)
- Hidden services:
- The hidden service picks three “introduction points”, and builds circuits to them (i.e. the introduction points do not know Bob’s identity), and publishes the introduction points. Then a connection setup chooses a (different) rendezvous point, and both client and server connect a circuit through it.
- The introduction point database is a distributed hash table
- Ran 40 Tor nodes over 6 months; after 25 hours each node was added into the DHT (Tor has increased that interval recently); then was recording both published hidden service descriptors, and requests for them.
- Then crawled all registered services (web only, not images, to avoid child porn)
- Estimating there are about 45k hidden services on average; totally observed 80k unique hidden services.
- ~28k of them were only seen (actually sampled) for 1 day, and then quickly falling to zero; average is 4–5 sampled days for popular services.
- Comparing the list of onion addresses from this sample and something earlier, 50k addresses have disappeared, ~65k new addresses added, only ~10k addresses are in both sets.
- Most popular hidden services: the top 40 spots are botnet C&C servers, most of which no longer exist. Sefnit and Skynet are the most visible.
- Skynet owner had a Twitter account and did a reddit AmA (and was arrested last year)
- Categories of hidden web services (only measuring HTTP and HTTPS):
- By number of sites, drugs are most popular, then marketplaces, fraud, bitcoin, mail, wiki, whistleblowing… Also revenge porn, child abuse.
- By access popularity (measuring directory requests, i.e. something between visits and visitors; unknown whether people or machines), abuse sites are the overwhelming majority.
- Possible ways to shut down a site:
- A single individual can intentionally place relays at the 6 positions of the DHT responsible for a hidden service’s descriptor, preventing its location from being found in the DHT.
- Relay operators could add a blacklist of names to refuse to carry in the DHT (either individually, or in the default Tor software) [who would maintain the blacklist?]
- Deanonymizing hidden service users: traffic confirmation attacks are much more powerful: on the DHT lookup, one can send arbitrary traffic to the requestor through their guard node, and with a cooperating guard node could deanonymize them fully. I.e. controlling a percentage of guard nodes makes it possible to deanonymize someone, even if not a specific user.
- Deanonymizing hidden services: the same; can with reasonable likelihood deanonymize some service.
- Reportedly (by a tor2web operator?) child protection agencies are regularly crawling the child porn sites to collect more images to add to their databases; but the number of actual users is reportedly fairly low.