Side-Channel Analysis: Analysis Tutorial

This tutorial describes how to use the black-box crawler from this paper:
Peter Chapman and David Evans. Automated Black-Box Detection of Side-Channel Vulnerabilities in Web Applications. In 18th ACM Conference on Computer and Communications Security (CCS 2011), Chicago, IL. 17-21 October 2011.

Contact

This is an initial release with a lot to be improved upon. Please feel free to contact Peter Chapman at pchapman@cs.virginia.edu with any questions or comments.

Work Abstract

Web applications divide their state between the client and the server. The frequent and highly dynamic client-server communication that is characteristic of modern web applications leaves them vulnerable to side-channel leaks, even over encrypted connections. We describe a black-box tool for detecting and quantifying the severity of side-channel vulnerabilities by analyzing network traffic over repeated crawls of a web application. By viewing the adversary as a multi-dimensional classifier, we develop a methodology to more thoroughly measure the distinguishability of network traffic across a variety of classification metrics.

Requirements

The system must have Java SE 6 installed.

Overview

The Side-Channel Analysis (SCA) System is divided into two components. The first component crawls a web application, recording network traffic (see the Crawl Tutorial). The second analyzes the recorded traffic to search for side-channel leaks in the application. This tutorial describes the second component.

Running The Analysis

To run the analysis, execute:

java -jar analysis.jar <Data Directory> <Host IP>

For example,

java -jar analysis.jar google-output 192.168.1

In the above command, the host IP refers to the first three octets of the IP address from which the crawl was run. This input allows the WiFi metrics to distinguish between incoming and outgoing traffic, as would be visible to an adversary. Make sure to execute the binary from the root of the project directory.

Modifying the Analysis

By default, the analysis performs all of the calculations described in the next few sections. In practice, you may want to modify these settings by editing edu.virginia.pmc8p.sca.SCAFinder.runDetectors().

Distance Metrics

Total-Source-Destination. The Total-Source-Destination (TSD) metric returns the summed difference of bytes transferred between each party. In a trace containing only a server and a client, it is the difference in the number of bytes transferred to the client, added to the difference in the number of bytes transferred to the server. This metric is easily defeated by basic packet padding, which hides the actual lengths of the packets.
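
As a minimal sketch (an illustration, not the tool's implementation), the TSD distance between two traces summarized by their per-direction byte totals is:

// Sketch of the Total-Source-Destination distance between two traces,
// each summarized as total bytes sent to the client and to the server.
static long tsdDistance(long toClientA, long toServerA,
                        long toClientB, long toServerB) {
    return Math.abs(toClientA - toClientB)
         + Math.abs(toServerA - toServerB);
}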

Size-Weighted-Edit-Distance. The Size-Weighted-Edit-Distance (SWED) metric adds robustness by tracking the control flow of the transferred information. Unlike the Total-Source-Destination metric, the sequence of transfers matters: every transfer is treated as a symbol, src -> dst, in a string representing the sequence of transfers. The distance is the Levenshtein distance between the translated strings, but to give weight to the transfer sizes, the cost of each edit operation (insertion, deletion, and substitution) is the number of bytes being inserted or deleted. A minimum cost, set by a configuration value a, lends sufficient weight to smaller transfers (such as TCP ACKs). If the source and destination are the same, the cost is simply the difference in transfer sizes.
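
A minimal dynamic-programming sketch of this computation follows. The trace representation (parallel direction and size arrays) and the cost of substituting transfers with different directions are assumptions made for illustration, not the tool's actual code:

// Size-weighted edit distance between two traces (illustrative sketch).
// Each transfer i is a direction symbol dir[i] (src -> dst) with a byte
// count size[i]. Insertions and deletions cost max(size, a); substituting
// a transfer in the same direction costs the difference in sizes.
// Mismatched directions are costed here as delete plus insert (an
// assumption).
static long swed(int[] dirA, long[] sizeA,
                 int[] dirB, long[] sizeB, long a) {
    int n = dirA.length, m = dirB.length;
    long[][] d = new long[n + 1][m + 1];
    for (int i = 1; i <= n; i++)
        d[i][0] = d[i - 1][0] + Math.max(sizeA[i - 1], a);
    for (int j = 1; j <= m; j++)
        d[0][j] = d[0][j - 1] + Math.max(sizeB[j - 1], a);
    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= m; j++) {
            long del = d[i - 1][j] + Math.max(sizeA[i - 1], a);
            long ins = d[i][j - 1] + Math.max(sizeB[j - 1], a);
            long sub = dirA[i - 1] == dirB[j - 1]
                ? d[i - 1][j - 1] + Math.abs(sizeA[i - 1] - sizeB[j - 1])
                : d[i - 1][j - 1] + Math.max(sizeA[i - 1], a)
                                  + Math.max(sizeB[j - 1], a);
            d[i][j] = Math.min(del, Math.min(ins, sub));
        }
    }
    return d[n][m];
}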

Edit-Distance. Since the simple packet-padding defense dramatically affects the size distributions of transfers, we use the Edit-Distance (ED) metric to understand how well an attacker can do using only the control flow of the network transfer. Like the previous metric, every transfer in the trace is a symbol in a string. The Edit-Distance is the Levenshtein distance between two strings where all edit operations have an equal cost. Since this metric is independent of transfer sizes, the Edit-Distance reveals how well an attacker can do against a perfect packet-padding strategy.
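
For contrast, a unit-cost sketch over the direction sequence alone, again illustrative rather than the tool's code:

// Plain edit distance over the sequence of transfer directions; sizes
// are ignored, modeling an adversary facing perfect packet padding.
static int editDistance(int[] dirA, int[] dirB) {
    int n = dirA.length, m = dirB.length;
    int[][] d = new int[n + 1][m + 1];
    for (int i = 0; i <= n; i++) d[i][0] = i;
    for (int j = 0; j <= m; j++) d[0][j] = j;
    for (int i = 1; i <= n; i++)
        for (int j = 1; j <= m; j++)
            d[i][j] = Math.min(
                Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                d[i - 1][j - 1] + (dirA[i - 1] == dirB[j - 1] ? 0 : 1));
    return d[n][m];
}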

Random. The Random metric serves as a baseline for judging the distinguishability gained from the distance metrics beyond the assumption that the adversary can distinguish page breaks. In every metric, the nearest-centroid classifier will not consider classes that require a different number of transitions than the example in question. The Random metric assigns a random distance between 1 and 1000, regardless of the two examples being compared. Hence, the only useful classifiability gained from the Random metric results from the assumption that the adversary can identify when state transitions occur.
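
The metric itself is a one-liner (sketch):

// Random baseline: a uniform distance in [1, 1000] for any comparable
// pair of traces; any residual classification power comes only from the
// transition-count filter described above.
static int randomDistance(java.util.Random rng) {
    return 1 + rng.nextInt(1000);
}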

Threat Models: WiFi versus ISP

We differentiate between WiFi and ISP scenarios (see the paper for details). To use the ISP scenario, make sure to set the minSize parameter to 70. For the WiFi scenario, use the WiFi versions of the Edit-Distance-based metrics and set minSize to 0.

Measuring Leaks

K-Fold Tests. As a baseline for testing the existence and exploitability of side-channel leaks, we construct a multi-class classifier that labels network traces according to the action performed to generate them. Our classifier uses a nearest-centroid approach, assigning an unknown trace to the class of the nearest class centroid, where nearest is defined according to one of the distance metrics. Since the exact distribution of each class is unknown, we estimate the centroid by attempting to construct a trace that minimizes the total distance from the examples in the class. We validate the performance of the classifier with K-fold cross-validation. The higher the success rate of the classifier, the more likely an attacker will be able to exploit a leak based on the properties measured by the metric. Ideally, a well-protected system would not allow an attacker to create a classifier that performs better than random guessing. (A hypothetical usage example follows the parameter list below.)

public KFoldTest(double percentageTraining, Metric m,
                 ArrayList<Integer> numberOfReturns, 
                 double threshold, int minSize)
percentageTraining
Set to -1 to use all-but-one (leave-one-out) folds; otherwise, give the percentage of examples to use as the training set.
m
The distance metric used.
numberOfReturns
A list of numbers corresponding to the top X returns as used in the paper. It's more efficient to give them as a list instead of calculating them separately.
threshold
Set to 1.
minSize
Described above. Set to 70 for ISP, 0 for WiFi.
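
For example, a hypothetical ISP-scenario instantiation; only the constructor signature above comes from the tool, and metric stands in for whichever Metric implementation is selected in SCAFinder.runDetectors():

import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical usage: leave-one-out folds, top-1/5/10 returns,
// threshold 1, ISP threat model (minSize = 70).
ArrayList<Integer> numberOfReturns =
    new ArrayList<Integer>(Arrays.asList(1, 5, 10));
KFoldTest kFold = new KFoldTest(-1, metric, numberOfReturns, 1.0, 70);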

Entropy Calculations. Previous work measured the severity of leaks using bits of entropy or reduction power. Both measurements are a function of the size of the uncertainty set the attacker has when classifying a given network trace: given a network trace, how many classes can be eliminated as possible generators of that trace? Bits of entropy indicate the amount of ambiguity an attacker faces in classifying a network trace. As a concrete example, if a network trace is identical across four actions, that trace is said to have log₂(4) = 2 bits of entropy.

Ideally, we would measure the entropy for every possible network trace, looking at the number of classes that could possibly create each trace. To find the entropy of the system, we sum the entropy of each trace weighted by the probability of that trace occurring. In practice, however, it is infeasible to enumerate every possible trace, so we instead use the corpus of traces generated by our testing. To simplify our model, we assume that each user action is equally probable.

The key difficulty in calculating entropy lies in determining the size of the uncertainty set for a given trace. In our analysis we take the estimated centroid for each class, then find the threshold distance from the centroid such that a certain percentage of the samples in the class are within that distance of the centroid. We use the threshold distance as the boundary for distinguishability.
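
Under these assumptions, the computation reduces to the following sketch, where the uncertainty-set sizes come from the threshold test just described (an illustration, not the tool's code):

// System entropy: the average over traces of log2 of each trace's
// uncertainty-set size, assuming every trace is equally probable.
static double systemEntropy(int[] uncertaintySetSizes) {
    double sum = 0;
    for (int size : uncertaintySetSizes)
        sum += Math.log(size) / Math.log(2); // log2 |uncertainty set|
    return sum / uncertaintySetSizes.length;
}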

public EntropyCalculator(Metric m, double threshold,
                         int minSize, boolean sameSize)
m
The distance metric used.
threshold
The threshold as described above.
minSize
Described above. Set to 70 for ISP, 0 for WiFi.
sameSize
Set to false.

Fisher Criterion Calculations. The Fisher criterion is essentially the ratio of the between-class variance to the within-class variance of the data: the higher the value, the more classifiable the data. The Fisher criterion is used in linear discriminant analysis to construct strong classifiers. Since the classifications are already known (we know which user actions created the network traces), we use the Fisher criterion as a measurement of the severity of side-channel leaks. (A numeric sketch follows the parameter list below.)

public FisherCriterionCalculator(Metric m, double threshold,
                                 int minSize, boolean sameSize)
m
The distance metric used.
threshold
Set to 1.
minSize
Described above. Set to 70 for ISP, 0 for WiFi.
sameSize
Set to false.
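
As referenced above, a numeric sketch of the criterion for two classes of scalar values (an illustration, not the tool's implementation):

// Fisher criterion for two classes of scalar values: the ratio of
// between-class variance to within-class variance.
static double fisherCriterion(double[] a, double[] b) {
    double meanA = mean(a), meanB = mean(b);
    double between = (meanA - meanB) * (meanA - meanB);
    double within = variance(a, meanA) + variance(b, meanB);
    return between / within;
}

static double mean(double[] xs) {
    double s = 0;
    for (double x : xs) s += x;
    return s / xs.length;
}

static double variance(double[] xs, double mu) {
    double s = 0;
    for (double x : xs) s += (x - mu) * (x - mu);
    return s / xs.length;
}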

License

Copyright 2011, Peter Chapman.

Licensed under the Apache License, Version 2.0 (the "License") available from http://www.apache.org/licenses/LICENSE-2.0.