A new analysis of website fingerprinting (WF) attacks targeting the Tor web browser has revealed that an adversary can infer a website frequented by a victim, but only in scenarios where the threat actor is interested in a specific subset of the websites visited by users.
“While attacks can exceed 95% accuracy when monitoring a small set of five popular websites, indiscriminate (non-targeted) attacks against sets of 25 and 100 websites fail to exceed an accuracy of 80% and 60%, respectively,” researchers Giovanni Cherubin, Rob Jansen, and Carmela Troncoso said in a recently published paper.
Tor Browser offers “unlinkable communication” to its users by routing internet traffic through an overlay network of more than six thousand relays, with the goal of hiding a user’s originating location and usage from third parties conducting network surveillance or traffic analysis. It does this by building a circuit that passes through an entry, a middle, and an exit relay before forwarding requests to the destination IP addresses.
On top of that, requests are encrypted once for each relay to further hinder analysis and avoid information leakage. Tor clients are not anonymous to their entry relays, but because the traffic is encrypted and requests pass through multiple hops, entry relays cannot identify a client’s destination, just as exit nodes cannot identify the client.
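The layering described above can be illustrated with a minimal sketch. It is a toy, not Tor's actual cell cryptography: the keystream construction, the `RELAY_KEYS` table, and the function names are all hypothetical, and key negotiation and integrity checks are left out entirely. It only shows the idea that the client adds one layer per relay and that each relay can remove only its own layer.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key + counter (illustrative only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Add or remove one layer; XOR with the same keystream undoes itself."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


# Hypothetical keys the client shares with each hop of its circuit.
RELAY_KEYS = {"entry": b"entry-key", "middle": b"middle-key", "exit": b"exit-key"}


def client_wrap(request: bytes) -> bytes:
    """Client applies one layer per relay: exit layer first, entry layer last."""
    cell = request
    for hop in ("exit", "middle", "entry"):
        cell = xor_layer(RELAY_KEYS[hop], cell)
    return cell


def relay_peel(hop: str, cell: bytes) -> bytes:
    """A relay removes only the layer that matches its own key."""
    return xor_layer(RELAY_KEYS[hop], cell)


request = b"GET / HTTP/1.1"
cell = client_wrap(request)
after_entry = relay_peel("entry", cell)           # still two layers left
after_middle = relay_peel("middle", after_entry)  # still one layer left
after_exit = relay_peel("exit", after_middle)     # plaintext request
```

In this sketch the entry relay sees who the client is but only an opaque cell, while the exit relay recovers the request without knowing where it originated. The packet sizes and timing are untouched by the encryption, which is the side channel that fingerprinting attacks rely on.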
Website fingerprinting attacks on Tor aim to break these anonymity protections by letting an adversary who observes the encrypted traffic patterns between a victim and the Tor network predict the website the victim is visiting. The threat model devised by the academics assumes an attacker who runs an exit node, so as to capture the diversity of traffic generated by real users. The exit node serves as a source of Tor traffic traces, and a machine learning-based classification model is built on top of the collected information to infer users’ website visits.
The adversary model involves an “online training phase that uses observations of true Tor traffic collected from an exit relay (or relays) to continuously update the classification model over time,” explained the researchers, who ran entry and exit relays for a week in July 2020 using a custom version of Tor v0.4.3.5 to extract the relevant exit information.
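To make the idea of an online training phase concrete, here is a minimal sketch of a classifier that is updated one labeled trace at a time. It is not the model used in the paper: a simple nearest-centroid rule stands in for the researchers' classifiers, and the trace format (signed packet sizes), the `features` function, the class name, and the site labels are all assumptions made for illustration.

```python
from collections import defaultdict
from math import dist


def features(trace):
    """Summarize a trace of signed packet sizes (>0 outgoing, <0 incoming)."""
    outgoing = [p for p in trace if p > 0]
    incoming = [-p for p in trace if p < 0]
    # Packet counts per direction and byte volumes in kilobytes.
    return (len(outgoing), len(incoming), sum(outgoing) / 1000, sum(incoming) / 1000)


class OnlineCentroidClassifier:
    """Keeps a running mean feature vector per site and updates it per trace."""

    def __init__(self):
        self.sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0])
        self.counts = defaultdict(int)

    def update(self, trace, label):
        """Online step: fold one newly observed labeled trace into the model."""
        for i, value in enumerate(features(trace)):
            self.sums[label][i] += value
        self.counts[label] += 1

    def predict(self, trace):
        """Return the site whose centroid is closest to the trace's features."""
        f = features(trace)
        return min(
            self.counts,
            key=lambda lab: dist(f, [s / self.counts[lab] for s in self.sums[lab]]),
        )


clf = OnlineCentroidClassifier()
# Hypothetical labeled traces, standing in for exit-relay observations.
clf.update([600, -1500, -1500, -1500], "site_a")
clf.update([600, -1500, -1500], "site_a")
clf.update([600, 600, 600, -500], "site_b")
clf.update([600, 600, -500], "site_b")

guess_a = clf.predict([600, -1500, -1500, -1400])
guess_b = clf.predict([600, 600, -400])
```

With only two candidate sites the toy separates the traces easily. As more sites with similar traffic shapes are added to the monitored set, their centroids crowd together, which gives some intuition for why accuracy in the study dropped as the set grew from five to 25 and 100 websites.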
To mitigate any ethical and privacy concerns arising from the study, the paper’s authors stressed the safety precautions incorporated to prevent the leakage of sensitive websites that users may visit via the Tor browser.
“The results of our real-world evaluation demonstrate that WF attacks can only be successful in the wild if the adversary aims to identify websites within a small set,” the researchers concluded. “In other words, untargeted adversaries that aim to generally monitor users’ website visits will fail, but focused adversaries that target a particular client configuration and website may succeed.”