The Kazaa Overlay
The KaZaA Overlay: A Measurement Study
Jian Liang
Department of Computer and
Information Science,
Polytechnic University,
Brooklyn, NY, USA 11201
Email: jliang@cis.poly.edu
Rakesh Kumar
Department of Electrical and
Computer Engineering,
Polytechnic University,
Brooklyn, NY, USA 11201
Email: rkumar04@utopia.poly.edu
Keith W. Ross
Department of Computer and
Information Science,
Polytechnic University,
Brooklyn, NY, USA 11201
Email: ross@poly.edu
September 15, 2004
Abstract
Both in terms of number of participating users and in traffic volume, KaZaA is
one of the most important applications in the Internet today. Nevertheless, because
KaZaA is proprietary and uses encryption, little is understood about KaZaA's
overlay structure and dynamics, its messaging protocol, and its index management.
We have built two measurement apparatus - the KaZaA Sniffing Platform
and the KaZaA Probing Tool - to unravel many of the mysteries behind KaZaA.
We deploy the apparatus to study KaZaA's overlay structure and dynamics, its
neighbor selection, its use of dynamic port numbers to circumvent firewalls, and
its index management. Although this study does not fully solve the KaZaA puzzle,
it nevertheless leads to a coherent description of KaZaA and its overlay. Furthermore,
we leverage the measurement results to set forth a number of key principles
for the design of a successful unstructured P2P overlay. The measurement results
and resulting design principles in this paper should be useful for future architects
of P2P overlay networks as well as for engineers managing ISPs.
1 Introduction
On a typical day, KaZaA has more than 3 million active users sharing over 5,000
terabytes of content. On the University of Washington campus network in June 2002,
KaZaA consumed approximately 37% of all TCP traffic, which was more than twice
the Web traffic on the same campus at the same time [8]. With over 3 million satisfied
users, KaZaA is significantly more popular than Napster or Gnutella ever was. Sandvine
estimates that in the US 76% of P2P file sharing traffic is KaZaA/FastTrack traffic and
only 8% is Gnutella traffic [23]. Clearly, both in terms of number of participating users
and in traffic volume, KaZaA is one of the most important applications ever carried
by the Internet. In fact, it can be argued that KaZaA has been so successful that
any new proposal for a P2P file sharing system should be compared with the KaZaA
benchmark. However, largely because KaZaA is a proprietary protocol which encrypts
its signalling messages, little has been known to date about the specifics of KaZaA's
overlay, the maintenance of the overlay, and the KaZaA signalling protocol.
In this paper we undertake a comprehensive measurement study of KaZaA's overlay
structure and dynamics, its neighbor selection, its use of dynamic port numbers to
circumvent firewalls, and its index management. Although this study does not fully
solve the KaZaA puzzle, it nevertheless leads to a coherent description of KaZaA and
its overlay, while providing many new insights about the details of KaZaA.
To unravel the mysteries of the KaZaA overlay, we developed two measurement
apparatus: the KaZaA Sniffing Platform and the KaZaA Probing Tool. The KaZaA
Sniffing Platform is a set of KaZaA nodes that are forced to interconnect in a
controlled manner with one another, while one node is also connected to hundreds of
platform-external KaZaA nodes. The KaZaA Sniffing Platform collects KaZaA
signalling traffic, from which we can draw conclusions about the structure and dynamics
of the KaZaA overlay. The KaZaA Probing Tool establishes a TCP connection with
any supplied KaZaA node, handshakes with that node, and sends and receives arbitrary
encrypted KaZaA messages with the node. It is used for analyzing node availabilities
and KaZaA neighbor selection. Both of these apparatus consume limited resources.
One of the contributions of this paper is to show how it is possible to obtain extensive
overlay information of a large-scale overlay application with a low-cost measurement
infrastructure.
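
To make the probing idea concrete, the following is a minimal sketch of only the
first step described above: opening a TCP connection to a supplied node and
recording whether it was reachable. It is not the KaZaA Probing Tool itself. The
names probe_node, ProbeResult, and availability are hypothetical, and the KaZaA
handshake and encrypted message exchange are deliberately omitted because the
protocol is proprietary and not specified in this text.

```python
import socket
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProbeResult:
    """Outcome of one connection attempt (hypothetical record type)."""
    host: str
    port: int
    reachable: bool
    connect_seconds: Optional[float]  # None when the connection failed


def probe_node(host: str, port: int, timeout: float = 3.0) -> ProbeResult:
    """Attempt a single TCP connection and report whether it succeeded."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            # A real probing tool would handshake with the node here.
            # That step is an assumption-free gap: it is left out on purpose.
            elapsed = time.monotonic() - start
            return ProbeResult(host, port, True, elapsed)
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return ProbeResult(host, port, False, None)


def availability(results: List[ProbeResult]) -> float:
    """Fraction of probes that reached the node (0.0 for an empty list)."""
    if not results:
        return 0.0
    return sum(r.reachable for r in results) / len(results)
```

Probing the same node repeatedly over time and passing the collected results to
availability gives a rough estimate of how often that node is reachable, which is
one simple way to approximate a node availability measurement.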
We use these tools to obtain insight into the following questions:
• It is well-known that the KaZaA overlay is organized in a two-tier hierarchy
consisting of Super Nodes (SNs) in the upper tier and Ordinary Nodes (ONs) in
the
...
...