Abstract | The Tor network hosts many single-vendor marketplace websites with a wide range of offers. Some of these vendor websites may be run by the same operators. This paper presents a method for detecting similarities between such vendor websites in order to uncover possible operational structures behind them. To this end, similarity values between darknet websites are computed by combining features from three categories: structure, content, and metadata. A dataset is created from a first run of the method followed by manual validation. Based on this dataset, the most important features are extracted using decision trees. The structure features HTML-Tag, HTML-Class, and HTML-DOM-Tree, together with the metadata features File Content and Links-To, proved particularly important and effectively highlight similarities between darknet websites. With the support of the similarity detection method, it was found that only 49% of 258 single-vendor marketplaces were unique, i.e. no similar sites existed. In addition, several duplicates of vendor websites were found, which made up 20%. |
---|
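
As an illustration of how per-feature similarity values might be combined into a single score, the following minimal sketch compares two pages on their sets of HTML tags and HTML classes. The abstract does not specify how the features are measured or weighted, so the Jaccard index, the equal weights, and all names used here (`TagClassCollector`, `page_features`, `combined_similarity`) are assumptions made for illustration, not the method of the paper.

```python
from html.parser import HTMLParser


class TagClassCollector(HTMLParser):
    """Collect the sets of tag names and CSS class names used in a page."""

    def __init__(self):
        super().__init__()
        self.tags = set()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        self.tags.add(tag)
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def jaccard(a, b):
    """Jaccard index of two sets; two empty sets are treated as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def page_features(html):
    """Extract the two illustrative structure features from raw HTML."""
    collector = TagClassCollector()
    collector.feed(html)
    return {"tag": collector.tags, "class": collector.classes}


def combined_similarity(html_a, html_b, weights=None):
    """Weighted mean of per-feature Jaccard scores (weights are assumed)."""
    weights = weights or {"tag": 0.5, "class": 0.5}
    fa, fb = page_features(html_a), page_features(html_b)
    total = sum(weights.values())
    return sum(w * jaccard(fa[k], fb[k]) for k, w in weights.items()) / total
```

Two pages built from the same template but with different text would score high under this sketch, while pages with disjoint markup would score low. Content and metadata features would need their own comparison functions before being added to the weighted mean.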