Heavy-tailed distributions (also known as power-law distributions) have been observed in many natural phenomena, both physical and sociological. One example is the geographic distribution of people around the world: most places are completely empty or barely populated, while a relatively small number of locations are very densely populated.
A distribution is said to have a heavy tail if:
$$P[X > x] \sim x^{-\alpha}, \quad \text{as } x \to \infty, \quad 0 < \alpha < 2$$
This means that regardless of the behavior of the distribution for small values of the random variable, if the asymptotic shape of the distribution is hyperbolic, it is heavy-tailed [7]. The simplest heavy-tailed distribution is the Pareto distribution, which is hyperbolic over its entire range and has probability density function:
$$p(x) = \alpha k^{\alpha} x^{-\alpha - 1}, \quad \alpha, k > 0, \quad x \geq k$$
and its cumulative distribution function is given by:
$$F(x) = P[X \leq x] = 1 - (k/x)^{\alpha}$$
where $\alpha$ is the shape parameter and $k$ represents the smallest value the random variable can take.
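The Pareto cumulative distribution function can be inverted in closed form, which gives a simple way to draw samples. The following is a minimal sketch under that assumption, not a description of any particular tool's implementation. The function names `pareto_cdf` and `sample_pareto` and the parameter values are illustrative only.

```python
import random


def pareto_cdf(x, alpha, k):
    """F(x) = 1 - (k / x)**alpha for x >= k, and 0 below k."""
    return 0.0 if x < k else 1.0 - (k / x) ** alpha


def sample_pareto(alpha, k, rng):
    """Draw one Pareto variate by inverting the CDF: x = k / (1 - u)**(1 / alpha)."""
    u = rng.random()  # uniform on [0, 1), so 1 - u lies in (0, 1]
    return k / (1.0 - u) ** (1.0 / alpha)


rng = random.Random(1)   # fixed seed, only for repeatability
alpha, k = 1.2, 1.0      # illustrative values, not taken from the text
samples = [sample_pareto(alpha, k, rng) for _ in range(100_000)]

# Compare the empirical and theoretical CDF at the theoretical median,
# which solves F(x) = 1/2 and equals k * 2**(1 / alpha).
median = k * 2.0 ** (1.0 / alpha)
empirical = sum(s <= median for s in samples) / len(samples)
print(f"F(median) = {pareto_cdf(median, alpha, k):.3f}, empirical = {empirical:.3f}")
```

Every sample is at least `k`, and the fraction of samples below the theoretical median should be close to one half.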
Heavy-tailed distributions have properties that are qualitatively different from those of commonly used light-tailed distributions such as the exponential, normal or Poisson distribution.
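Two of these differences can be made concrete with a short worked derivation. It assumes the standard Pareto form with shape parameter $\alpha$ and minimum value $k$, that is, density $\alpha k^{\alpha} x^{-\alpha-1}$ for $x \geq k$. The rate $\lambda$ of the exponential distribution is introduced here only for comparison.

```latex
\begin{align*}
E[X^n] &= \int_k^\infty x^n \,\alpha k^{\alpha} x^{-\alpha-1}\,dx
        = \alpha k^{\alpha} \int_k^\infty x^{\,n-\alpha-1}\,dx
        = \begin{cases}
            \dfrac{\alpha k^n}{\alpha - n}, & n < \alpha \\[2ex]
            \infty, & n \geq \alpha
          \end{cases} \\[2ex]
\text{Pareto:}\quad
P[X > s + t \mid X > s] &= \frac{(k/(s+t))^{\alpha}}{(k/s)^{\alpha}}
        = \left(\frac{s}{s+t}\right)^{\alpha} \to 1
        \quad \text{as } s \to \infty, \quad s \geq k,\ t > 0 \\[2ex]
\text{Exponential:}\quad
P[X > s + t \mid X > s] &= \frac{e^{-\lambda (s+t)}}{e^{-\lambda s}} = e^{-\lambda t}
\end{align*}
```

The first line shows that the mean is finite only for $\alpha > 1$ and the variance is infinite for $\alpha \leq 2$, so the whole heavy-tailed range $0 < \alpha < 2$ has infinite variance. The last two lines contrast the memoryless exponential, whose conditional tail does not depend on $s$, with the Pareto: the larger the value already observed, the more likely it is to grow further.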
In the Internet, heavy-tailed distributions have been observed both in traffic characterization and in topological properties. In the area of traffic characterization, evidence indicates that Ethernet traffic exhibits self-similar properties [14], as do WAN traffic [20] and traffic specifically associated with WWW transfers [7]. The main implication of these discoveries is that most previous analytic work in Internet studies adopted assumptions such as exponentially distributed packet interarrivals, and conclusions reached under such assumptions may be misleading or incorrect in the presence of heavy-tailed distributions.
In the context of topological properties, recent empirical studies
[9] have shown that Internet topologies exhibit power laws of
the form $y \propto x^{a}$ for the following relationships: (P1) outdegree
of node (domain or router) versus rank, (P2) number of nodes versus
outdegree, (P3) number of node pairs within a neighborhood versus
neighborhood size (in hops), and (P4) eigenvalues of the adjacency
matrix versus rank. Prior to this discovery, most Internet studies and
analyses had been done using underlying topologies that lack such
properties (e.g. random networks). Several possible causes and
plausible analytical models that explain the appearance of these
properties in Internet topologies have been proposed. However, the
area of topology characterization has not been explored as extensively,
and the causes of such power laws have not been
convincingly established. In [2] the authors indicate that two
possible causes are (F1) preferential connectivity and (F2)
incremental growth. In [16] the authors examine these two
factors in the formation of Internet topologies, plus (F3)
distribution of nodes in space, and (F4) locality of edge connections.
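To illustrate how factors (F1) and (F2) can interact, the following is a minimal sketch of a growth process in which nodes are added one at a time and each new node attaches to existing nodes with probability proportional to their current degree. It is an illustration under simplifying assumptions (undirected edges, a small fully connected seed graph), not the exact procedure of [2] or [16]. The function name `grow_preferential` and its parameters are hypothetical.

```python
import random
from collections import Counter


def grow_preferential(n, m, rng):
    """Grow an undirected graph to n nodes; each new node adds m edges.

    Targets are picked with probability proportional to their current degree.
    """
    if m < 1 or n < m + 1:
        raise ValueError("need m >= 1 and n >= m + 1")
    # Assumed seed graph: a clique on m + 1 nodes, so every node has degree m.
    edges = [(u, v) for u in range(m + 1) for v in range(u + 1, m + 1)]
    # Each node appears in this list once per incident edge, so a uniform
    # draw from it selects a node with probability proportional to degree.
    endpoints = [node for edge in edges for node in edge]
    for new in range(m + 1, n):          # (F2) incremental growth
        targets = set()
        while len(targets) < m:          # (F1) preferential connectivity
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            edges.append((t, new))
            endpoints.extend((t, new))
    return edges


edges = grow_preferential(n=2000, m=2, rng=random.Random(7))
degree = Counter(node for edge in edges for node in edge)
ranked = sorted(degree.values(), reverse=True)
print("largest degrees:", ranked[:5], "median degree:", ranked[len(ranked) // 2])
```

Plotting `ranked` against rank on log-log axes is one way to inspect a relationship like (P1); a few nodes accumulate a very large degree while most stay close to `m`.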
There is agreement that heavy-tailed distributions are ubiquitous in the Internet. To the best of our knowledge, it has not been conclusively established whether node placement and the distribution of bandwidths and delays are heavy-tailed, although Paxson observed wide variability in path characteristics such as losses, round-trip times and bandwidth [19], and high variability is one of the hallmarks of heavy-tailed distributions.
BRITE incorporates heavy tails in the topology generation for some models. In particular, for models such as Waxman and Barabási-Albert, the user can choose to place the nodes in the plane according to a heavy-tailed distribution. Furthermore, for all the models provided, including the Imported file model, the user can choose to assign bandwidths to links according to a heavy-tailed distribution. The idea is to generate graphs annotated with bandwidth information in order to study the effect that such distributions may have on the performance of certain protocols and algorithms. Finally, for the experimental Bottom-up hierarchical generation model, the user can choose to assign routers to ASs according to a heavy-tailed distribution.
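As a rough illustration of what annotating links with heavy-tailed bandwidths can look like, the sketch below draws a Pareto-distributed value for each edge of a toy topology. It is not BRITE's implementation: the function `assign_bandwidths`, the bounds `bw_min` and `bw_max`, the clamping of large values to `bw_max`, and the shape value are all assumptions made for this example.

```python
import random


def assign_bandwidths(edges, bw_min, bw_max, alpha, rng):
    """Map each edge to a Pareto-distributed bandwidth clamped to [bw_min, bw_max]."""
    if not 0 < bw_min <= bw_max:
        raise ValueError("need 0 < bw_min <= bw_max")
    bandwidths = {}
    for edge in edges:
        # paretovariate(alpha) has minimum value 1, so scaling by bw_min
        # makes bw_min the smallest possible bandwidth.
        bw = bw_min * rng.paretovariate(alpha)
        # Assumed policy: cap the rare very large draws at bw_max.
        bandwidths[edge] = min(bw, bw_max)
    return bandwidths


edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]  # toy topology
bw = assign_bandwidths(edges, bw_min=10.0, bw_max=1024.0, alpha=1.2,
                       rng=random.Random(3))
for edge, value in bw.items():
    print(edge, f"{value:.1f}")
```

Clamping distorts the tail slightly by piling probability mass at `bw_max`; rejecting and redrawing out-of-range values is an alternative with a different trade-off.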