Computer Networking Research Laboratory

Dept. of Electrical & Computer Engineering, Colorado State University

1. Search Terms - Raw Data

This dataset contains BitTorrent search terms extracted from a single website, the only one with both search terms and time information.

1.1 Data Format


1.2 Datasets

dataset1.tgz
    Start time: 2010-06-08 18:00 MDT
    End time: 2010-06-27 09:41 MDT
    Duration: 18 days 15 hours 41 minutes
    Sampling interval: 5 seconds
    No. of samples: 320,239
    No. of queries: 9,669,035
    No. of distinct search terms: 1,353,662

dataset2.tgz
    Need to extract. Has some missing samples.
    Start time: 2010-06-29 11:53 MDT
    End time: 2010-07-05

dataset3.tgz
    Still collecting.
    Start time: 2010-07-05 09:00 MDT
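
As a quick consistency check on the dataset1.tgz figures listed above, the following Python sketch recomputes the duration from the start and end times and estimates how many samples a strict 5-second interval would yield. The function names are illustrative only and are not part of the dataset tooling; both timestamps are treated as local MDT wall-clock times, which is safe here because no daylight-saving change falls inside the collection period.

```python
from datetime import datetime, timedelta

# Figures copied from the dataset1.tgz listing; both times are in MDT.
START = datetime(2010, 6, 8, 18, 0)
END = datetime(2010, 6, 27, 9, 41)
INTERVAL = timedelta(seconds=5)
REPORTED_SAMPLES = 320_239


def collection_duration(start: datetime, end: datetime) -> timedelta:
    """Wall-clock length of the collection period."""
    return end - start


def nominal_sample_count(duration: timedelta, interval: timedelta) -> int:
    """Number of samples a perfectly regular sampler would have taken."""
    return duration // interval


if __name__ == "__main__":
    duration = collection_duration(START, END)
    nominal = nominal_sample_count(duration, INTERVAL)
    print("duration:", duration)
    print("nominal samples:", nominal)
    print("reported / nominal: {:.4f}".format(REPORTED_SAMPLES / nominal))
```

The recomputed duration matches the listed 18 days 15 hours 41 minutes, and the reported sample count is slightly below the nominal count, which is what one would expect if a small number of samples were missed or the effective interval was marginally longer than 5 seconds.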

2. Search Clouds - Raw Data

This dataset contains BitTorrent search clouds from several websites. Some sites indicate only the font size and color of each search term (the color indicates the type of content), while two sites also provide the number of search requests for each search term. However, the data collection time period is unknown.

2.1 Data Format

Line 1 - unixtime
Rest of the lines - search_term<tab>font_size<tab>no_of_queries
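
The format above can be read with a few lines of Python. The following is a minimal sketch, not the lab's own script, and it rests on several assumptions: the first line is a Unix timestamp in seconds, fields are separated by a single tab, the font size is kept as a string because its units are not specified, and the query count may be empty or absent for sites that do not publish it. The names `CloudEntry` and `parse_cloud_snapshot` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass
class CloudEntry:
    term: str
    font_size: str            # kept as text; units are not documented
    queries: Optional[int]    # None when the site gives no query count


def parse_cloud_snapshot(text: str) -> Tuple[datetime, List[CloudEntry]]:
    """Parse one search-cloud snapshot: a unixtime line, then tab-separated rows."""
    lines = text.splitlines()
    if not lines:
        raise ValueError("empty snapshot")

    # Line 1: unixtime (assumed to be seconds since the epoch, UTC).
    timestamp = datetime.fromtimestamp(float(lines[0].strip()), tz=timezone.utc)

    entries: List[CloudEntry] = []
    for line in lines[1:]:
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        term = fields[0]
        font_size = fields[1].strip() if len(fields) > 1 else ""
        raw_count = fields[2].strip() if len(fields) > 2 else ""
        queries = int(raw_count) if raw_count.isdigit() else None
        entries.append(CloudEntry(term, font_size, queries))
    return timestamp, entries


if __name__ == "__main__":
    # Made-up sample content, for illustration only.
    sample = "1277843426\nubuntu\t14\t120\nsome movie\t10\n"
    when, cloud = parse_cloud_snapshot(sample)
    print(when.isoformat())
    for entry in cloud:
        print(entry)
```

Returning `None` for a missing count, rather than 0, keeps sites that report no query numbers distinguishable from terms that genuinely received zero queries.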

2.2 Datasets

extratorrent_dataset1.tgz
    Start time: 2010-06-29 16:30:26 EDT
fenopy_dataset1.tgz
seedpeer_dataset1.tgz
tapedown_dataset1.tgz
torrentbit_dataset1.tgz
torrentscan_dataset1.tgz
torrentsection_dataset1.tgz
torrenttractor_dataset1.tgz
youbittorrent_dataset1.tgz

3. Scripts to Extract/Process

3.1 For Search Terms

3.2 For Search Clouds