Similarity Enhanced Transfer

Similarity-Enhanced Transfer (SET) is a technique for improving the speed at which peer-to-peer file sharing and content distribution systems can share data. SET works by finding similar copies of the desired file, and looking for subsets of those copies that match (or are similar to) subsets of the desired file. If these are found, the similar copies can be used as additional download sources, which can increase the download rate as long as the downloader's connection is not already saturated.
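The matching step described above can be illustrated with a minimal sketch. It assumes that files are split into fixed-size chunks and that chunks are compared by their SHA-1 hash; the chunk size, the function names (chunk_hashes, usable_sources) and the sample data are hypothetical and chosen only for illustration, and a real system may well divide files differently.

```python
import hashlib


def chunk_hashes(data, chunk_size=4):
    """Split data into fixed-size chunks and return the SHA-1 digest of each."""
    return [
        hashlib.sha1(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def usable_sources(target_hashes, candidates, chunk_size=4):
    """Map each chunk index of the desired file to the similar files
    that hold an identical chunk (hypothetical helper).

    target_hashes: chunk hashes of the desired file, in order.
    candidates: dict of file name -> file contents for similar files.
    """
    candidate_sets = {
        name: set(chunk_hashes(data, chunk_size))
        for name, data in candidates.items()
    }
    sources = {}
    for index, digest in enumerate(target_hashes):
        holders = [name for name, hashes in candidate_sets.items()
                   if digest in hashes]
        if holders:
            sources[index] = holders
    return sources


# Toy data: "remix" shares the two middle chunks with the desired file.
desired = b"AAAABBBBCCCCDDDD"
similar_files = {
    "remix": b"XXXXBBBBCCCCYYYY",
    "other": b"ZZZZZZZZ",
}
result = usable_sources(chunk_hashes(desired), similar_files)
print(result)
```

In this toy run, chunks 1 and 2 of the desired file can also be fetched from "remix", while "other" contributes nothing. Only the chunk hashes of the desired file are needed to find the extra sources, not its contents.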
Method
The developers of SET found that if a particular piece of content has several different versions available for download from a P2P network, there may be enough similarity between the files in the different releases that they can all be used as download sources for a single version. In particular, they found (quoted from their report):
* MP3 music files with identical sound content but different header bytes (artist and title metadata or headers from encoding programs) were 99% similar.
* Movies and trailers in different languages were often 15% or more similar.
* Media files with apparent transmission or storage errors differed in a single byte or small string of bytes in the middle of the file.
* Identical content packaged for download in different ways (e.g., a torrent with and without a README file) were almost identical.
SET uses a technique called handprinting to locate these similar files.
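The text does not define handprinting, so the following is only a sketch under a stated assumption: a file's handprint is taken to be the k smallest chunk hashes of that file, and two files are treated as candidates for similarity when their handprints share at least one hash. The fixed-size chunking, the value of k, and the names handprint and maybe_similar are hypothetical.

```python
import hashlib


def chunk_hashes(data, chunk_size=4):
    """Split data into fixed-size chunks and return the SHA-1 digest of each."""
    return [
        hashlib.sha1(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def handprint(data, k=2, chunk_size=4):
    """Assumed definition: the k smallest distinct chunk hashes of a file."""
    distinct = sorted(set(chunk_hashes(data, chunk_size)))
    return set(distinct[:k])


def maybe_similar(a, b, k=2, chunk_size=4):
    """Flag two files as possibly similar if their handprints overlap."""
    return bool(handprint(a, k, chunk_size) & handprint(b, k, chunk_size))


# Two versions that differ only in their last chunk, and an unrelated file.
version_a = b"AAAABBBBCCCCDDDD"
version_b = b"AAAABBBBCCCCEEEE"
unrelated = b"WWWWXXXXYYYYZZZZ"

print(maybe_similar(version_a, version_b))
print(maybe_similar(version_a, unrelated))
```

Under this assumption, each file is summarized by a small, fixed number of hashes, so similar files can be looked up with a constant number of queries regardless of file size. With k = 2 and only one differing chunk per file, the smallest shared hash is guaranteed to appear in both handprints; with fewer shared chunks, detection becomes probabilistic.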
Application areas
SET could be used to improve the speed of:
* peer-to-peer file sharing
* content distribution systems
* cooperative web caching
 