In April, an independent researcher launched a tool called OnionScan, which probes dark web sites for various vulnerabilities and other issues. Now, another researcher has described how to deploy that tool en masse using Python, in order to more efficiently scan sites.
Open source intelligence trainer and author Justin Seitz has published the results of around 8,000 site scans using this method. The idea, Seitz told Motherboard, was to let others start the kind of large-scale analyses that are usually too technically difficult for non-technologists to jump into. The creator of OnionScan, however, is concerned that with all of this data collected in one place, sites across the dark web could be deanonymized by a large number of people fairly quickly.
“It’s the technology and resulting data that interest me, not the moral pros and cons”
OnionScan, written in the programming language Go, searches for a slew of potentially sensitive pieces of information related to Tor hidden services, which are identified by their .onion addresses. For example, it looks for metadata in uploaded images or exposed server status pages (which can reveal true IP addresses and other sites hosted by the same person). When used against multiple targets, it can find shared encryption keys, implying a strong correlation between different sites.
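That last correlation step can be illustrated with a short sketch. Assuming a list of OnionScan JSON reports, grouping them by a shared SSH key fingerprint surfaces services likely running on the same host; the `sshKey` and `hiddenService` field names here are assumptions about the report format, not a documented schema:

```python
from collections import defaultdict

def group_by_fingerprint(reports):
    """Group hidden services whose scan reports share an SSH key
    fingerprint -- a strong hint they run on the same server."""
    groups = defaultdict(list)
    for report in reports:
        fp = report.get("sshKey")
        if fp:
            groups[fp].append(report["hiddenService"])
    # Only groups with more than one site suggest a correlation.
    return {fp: sites for fp, sites in groups.items() if len(sites) > 1}
```

Run against a batch of reports, any group with two or more entries is worth a closer look.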
In April, using the OnionScan tool, Motherboard found eight illegal sites leaking potentially identifying data about their owners.
On Thursday, Seitz published a detailed blog post on how to use OnionScan more efficiently. (Disclosure: I know Seitz as I’m a paying student on his Python course).
In his write-up, Seitz steps through setting up a server with Tor, installing all the necessary software prerequisites and Go, and automating OnionScan to loop through a list of Tor hidden services.
So, whereas before you typically had to deploy OnionScan against one site at a time, this script can cycle through a whole list and collate the results automatically.
“If more people begin publishing these results then I imagine there are a whole range of deanonymization vectors”
But Sarah Jamie Lewis, the creator of OnionScan, warns that publishing the full dataset like this may lead to some Tor hidden services being unmasked. In her own reports, Lewis has not pointed to specific sites or released the detailed results publicly, and instead only provided summaries of what she found.
“If more people begin publishing these results then I imagine there are a whole range of deanonymization vectors that come from monitoring page changes over time. Part of the reason I destroy OnionScan results once I’m done with them is because people deserve a chance to fix the issue and move on—especially when it comes to deanonymization vectors,” Lewis told Motherboard in an email, and added that she has, when legally able to, contacted some sites to help them fix issues quietly.
According to Motherboard’s analysis, 309 of the 8,167 sites scanned by Seitz have a server-status page exposed, although that does not necessarily mean that all of those are also leaking identifying information.
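A tally like Motherboard's can be pulled from the collected reports with a few lines. The `foundApacheModStatus` field name is an assumption about OnionScan's JSON output, flagging sites whose Apache server-status page was reachable:

```python
def summarize_mod_status(reports):
    """Count scanned services that expose an Apache mod_status
    page and list their addresses for follow-up."""
    exposed = [r["hiddenService"] for r in reports
               if r.get("foundApacheModStatus")]
    return len(exposed), exposed
```

As the article notes, an exposed status page is a lead, not proof: each flagged site still has to be checked for actual identifying information.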
Seitz, meanwhile, thinks his script could be a useful tool to many people. “Too often we set the bar so high for the general practitioner (think journalists, detectives, data geeks) to do some of this larger scale data work that people just can’t get into it in a reasonable way. I wanted to give people a starting point,” he said.
“I am a technologist, so it’s the technology and resulting data that interest me, not the moral pros and cons of data dumping, anonymity, etc. I leave that to others, and it is a grey area that as an offensive security guy I am no stranger to,” he continued.
Whatever the case, hidden service operators may now be a lot more aware of some of their site’s issues—and so may anyone else.