advanced algorithm (python)

pablo90

thebarabasialbertmodel.pdf

Home >Computer Science homework help >advanced algorithm (python)

Advanced Algorithms and Computational Models (module A)

The Barabasi-Albert Model

Giacomo Fiumara

giacomo.fiumara@unime.it

2015-2016

1 / 67

The Barabasi-Albert Model

Hubs represent the most striking di�erence between a random

and a scale-free network

The existence of hubs such as google.com or molecules like ATP (involved in a number of chemical reactions) raises two

questions:

� Why do so di�erent system as the WWW or the cell converge

to a similar scale-free architecture?

� Why does the random network model fail to reproduce the

hubs and the power laws observed in real networks?

2 / 67

Growth and Preferential Attachment

Why are hubs and power laws absent in random networks?

The answer was found in 1999 (Barabasi and Albert, 23944

citations)

Author highlighted two hidden assumptions of the Erdös-Rényi

model

� Networks expand through the addition of new nodes

� Nodes prefer to link to the more connected nodes

3 / 67

Growth and Preferential Attachment Networks expand through the addition of new nodes

� In 1991 the WWW had a single node, the �rst webpage built

by TBL, the creator of the Web

� Today the Web has over a trillion (1012) documents

� This number was reached through the continuous addition of

new documents by millions of individuals and institutions

4 / 67

Growth and Preferential Attachment Networks expand through the addition of new nodes

5 / 67

Growth and Preferential Attachment Nodes prefer to link to the more connected nodes

The random network model assumes that we randomly choose

the interaction partner of a node

Most real network nodes prefer to link to the more connected

nodes (preferential attachment)

6 / 67

Growth and Preferential Attachment Nodes prefer to link to the more connected nodes

� We are familiar with only a tiny fraction of the trillion web

documents. We are more likely yo link to a high-degree node

than to a node with only few links

7 / 67

Growth and Preferential Attachment Nodes prefer to link to the more connected nodes

� No scientist can attempt to read the more than a million

scienti�c papers published each year. The more cited is a

paper, the more likely that we hear about it and eventually

read it. We cite what we read, therefore our citations are

biased towards the more cited publications (the high-degree

nodes of the citation network)

8 / 67

Growth and Preferential Attachment Nodes prefer to link to the more connected nodes

� The more movies an actor has played, the more familiar is a

casting director with her skills. Hence, the higher the degree of

an actor in the actor network, the higher are the chances that

she will be considered for a new role

9 / 67

Growth and Preferential Attachment Nodes prefer to link to the more connected nodes

10 / 67

Growth and Preferential Attachment

The random network model di�ers from real networks in two

important characteristics:

� Growth

Real networks are the result of a growth process that

continuously increases N. In contrast, the random network model assumes that the number of nodes N is �xed

� Preferential Attachment

In real networks new nodes tend to link to the more connected

nodes. In contrast, nodes in random networks randomly

choose their interaction partners

11 / 67

The Barabasi-Albert Model

The Barabasi-Albert (BA) model is de�ned as follows:

We start with m0 nodes, the links between which are chosen arbitrarily, as long as each node has at least one link. The

network develops following two steps:

� Growth � At each timestep we add a new node with m (≤ m0) links that connect the new node to m nodes already in the network

� Preferential attachment � The probability Π(k) that a link of the new node connects to node i depends con the degree ki as

Π(ki ) = ki∑ j kj

12 / 67

The Barabasi-Albert Model

13 / 67

The Barabasi-Albert Model

� Preferential attachment is a probabilistic mechanism

� A new node is free to connect to any node in the network,

whether it is a hub or has a single link

� However, it is highly probable that a new node connects to a

high degree node

� After t timesteps, the BA model generates a network with N = t + m0 nodes and m0 + mt links

� The obtained network has a power-law degree distribution with

degree exponent γ = 3

14 / 67

The Barabasi-Albert Model

15 / 67

Degree Dynamics

To understand the emergence of the scale-free property it is

necessary to focus on the time evolution of the BA model

� In the model, an existing node can increase its degree each

time a new node enters the network

� This node will link to m of the N(t) nodes already present in the system

� The probability that one of these links connects to node i is

Π(ki ) = ki∑ j kj

16 / 67

Degree Dynamics

� Let us approximate the degree ki with a continuous real variable

� The rate at which an existing node i acquires links as a result of new nodes connecting to it is

dki dt

= mΠ(ki ) = m ki

N−1∑ j=1

� The coe�cient m describes that each node arrives with m links

� Hence, node i has m chances to be chosen

� Moreover N−1∑ j=1

kj = 2mt −m = m(2t −1)

17 / 67

Degree Dynamics

Therefore dki dt

= m ki

m(2t −1) =

ki 2t −1

For large t the −1 term can be neglected in the denominator, obtaining

dki ki

= 1

which can be integrated using the fact that ki (ti ) = m (node i joins the network at time ti with m links). We obtain:

ki (t) = m

( t

)β Where β = 1

2 is called dynamical exponent

18 / 67

Degree Dynamics Some considerations

ki (t) = m

( t

)β

� The degree of each node increases following a power-law with

the same dynamical exponent β. Hence all nodes follow the same dynamical law

� The growth in the degrees is sublinear (β < 1). This is a consequence of the growing nature of the BA model: each

node has more nodes to link to than the previous node.

Hence, with time the existing nodes compete for links with an

increasing pool of other nodes

19 / 67

Degree Dynamics Some considerations

ki (t) = m

( t

)β

� The earlier node i was added, the higher is its degree ki (t). Hence, hubs are large because they arrived earlier, a

phenomenon called �rst-mover advantage in marketing and

business

� The rate at which the node i acquires new links is given by

dki (t)

dt =

1 √ tit

indicating that in each time step older nodes acquire more

links (as they have smaller ti). The rate at which a node acquires links decreases with time as t−1/2

20 / 67

Degree Dynamics Some considerations

21 / 67

Degree Distribution

� To calculate the degree distribution of the BA model in the

continuum approximation we �rst calculate the number of

nodes with degree smaller than k, i.e. ki (t) < k

� Using

ki (t) = m

( t

)β we can obtain

ti > t

( m

) 1/β

� In the BA model, a node is added at equal time step.

Therefore, the number of nodes with degree smaller than k is

( m

) 1/β

22 / 67

Degree Distribution

� Altogether there are N = m0 + t nodes, which becomes N ≈ t in the large t limit

� Therefore the probability that a randomly chosen node has

degree k or smaller, which is the cumulative degree distribution, follows

P(k) = 1− ( m

) 1/β

� By taking the derivative we obtain the degree distribution

pk = ∂P(k)

∂k =

m1/β

k1/β+1 = 2m2k−γ

with γ = 1 β

+ 1 = 3

23 / 67

Degree Distribution

The continuum theory predicts the correct degree exponent,

but it fails to accurately predict the pre-factors. The exact

degree distribution of the BA model can be obtained using

other approaches and is

pk = 2m(m + 1)

k(k + 1)(k + 2)

24 / 67

Degree Distribution

pk = 2m(m + 1)

k(k + 1)(k + 2)

The degree exponent γ is independent of m, a prediction that agrees with the numerical results

m0 = m = 1 (blue), 3 (green), 5 (grey), 7 (orange)

25 / 67

Degree Distribution

pk = 2m(m + 1)

k(k + 1)(k + 2)

The degree distribution of the BA model is independent of

both t and N. This is in agreement with real networks that di�er in age and size

N = 50000 (blue), N = 100000 (green), N = 200000 (grey) 26 / 67

Absence of Growth or Preferential Attachment

The coexistence of growth and preferential attachment in the

BA model leads to a question: Are they both necessary for the

emergence of the scale-free property?

Is it possible to generate a scale-free network with only one of

the two ingredients?

27 / 67

Absence of Growth or Preferential Attachment Model A: absence of preferential attachment

Model A starts with m0 nodes and follows these steps:

� Growth � At each time step we add a new node with m (≤ m0) links that connect to m nodes added earlier

� Preferential Attachment � The probability that a new node

links to a node with degree ki is

Π(ki ) = 1

m0 + t −1

� This means that Π(ki ) is independent of ki, indicating that new nodes choose randmly the nodes they link to

28 / 67

Absence of Growth or Preferential Attachment Model A: absence of preferential attachment

� For Model A the continuum theory predicts that ki (t) increases logarithmically with time

ki (t) = mln

( e m0 + t −1 m0 + ti −1

) � Consequently, the degree distribution follows an exponential

p(k) = e

m exp

( −

) � An exponential function decays faster than a power law, hence

it does not support hubs

� The lack of preferential attachment eliminates the scale-free

character of hubs from network. Indeed, as all nodes acquire

links with equal probability, we lack a rich-get-richer process

29 / 67

Absence of Growth or Preferential Attachment Model A: absence of preferential attachment

m0 = m = 1 (circles), 3 (squares), 5 (diamonds), 7 (triangles)

30 / 67

Absence of Growth or Preferential Attachment Model B: absence of growth

Model B starts with N nodes and evolves following this step

� Preferential Attachment � At each time step a node is

selected randomly and connected to node i with degree ki already present in the network, where i is chosen with probability Π(k). Π(0) = 0, therefore nodes with k = 0 are assumed to have k = 1, otherwise they can not acquire links

� In Model B the number of nodes remains constant during the

evolution of the network, while the number of links increases

linearly with time. As a result:

ki (t) = 2

N t

31 / 67

Absence of Growth or Preferential Attachment Model B: absence of growth

� At early time, when there are only a few links in the network

(L � N) each new link connects previously unconnected nodes. In this stage, the evolution is indistinguishable from the

BA model with m = 1

� After a transient period, the node degree converge to the

average degree and the degree develops a peak. For

t → N(N −1)/2 the network becomes a complete graph in which all nodes have degree kmax = N −1, hence pk = δ(N −1)

32 / 67

Absence of Growth or Preferential Attachment Model B: absence of growth

t = N (circles), t = 5N (squares), t = 40N (diamonds)

33 / 67

Absence of Growth or Preferential Attachment

� The absence of preferential attachment leads to a growing

network with a stationary but exponential degree distribution

� The absence of growth leads to the loss of stationarity, forcing

the network to converge to a complete graph

� This failure of models A and B to reproduce the empirically

observed scale-free distribution indicates that growth and

preferential attachment are simultaneously needed for the

emergence of the scale-free property

34 / 67

Measuring Preferential Attachment

Growth and preferential attachment are jointly responsible for

the scale-free property

� Growth is easily detectable: all large networks have reached

their size by adding new nodes

� Preferential attachment is also present in real networks, and

can be detected experimentally

35 / 67

Measuring Preferential Attachment

Preferential attachment relies on two distinct hypotheses:

� The likelihood to connect to a node depends on that node's

degree k. This is in contrast with the random network model, for which Π(k) is independent of k

� The functional form of Π(k) is linear in k

Both hypotheses can be tested by measuring Π(k), which can be determined for systems for which we known the time at

which each node joined the network

36 / 67

Measuring Preferential Attachment

Consider a network for which we have two di�erent maps,

taken at time t and t + ∆t

37 / 67

Measuring Preferential Attachment

Π(ki ) = ki∑ j

For nodes that changed their degree during the ∆t time frame we measure

∆ki = ki (t + ∆t) −ki (t)

The relative change is

∆ki ∆t ∼ Π(ki )

This approximation holds if ∆t is small, so that the changes in ∆k are modest. But ∆t must not be too small so that there are still detectable di�erences between the two networks

38 / 67

Measuring Preferential Attachment

The curve ∆ki/∆t can be very noisy

To reduce this noise the cumulative preferential attachment

function is used instead

π(k) = k∑

ki =0

Π(ki )

� In the absence of preferential attachment we have

Π(ki ) = constant, hence π(k) ∼ k � If preferential attachment is present, i.e. if Π(ki ) = ki, it follows π(k) ∼ k2

39 / 67

Measuring Preferential Attachment

40 / 67

Measuring Preferential Attachment

� In the previous Figure are shown the measured π(k) for four real networks

� For each system we observe a faster than linear increase in

π(k), indicating the presence of preferential attachment

� Π(k) can be approximated with

Π(k) ∼ kα

� For the Internet and the citation networks we have α ≈ 1, indicating that Π(k) depends linearly on k

� For the co-authorship and the actor networks the best �t

provides α ≈ 0.9±0.1, indicating the presence of a sublinear preferential attachment

41 / 67

Measuring Preferential Attachment

The theoretical results predict the existence of four scaling

regimes:

� No preferential attachment (α = 0) � The network has a simple exponential degree distribution. Hubs are absent and

the resulting network is similar to a random network

� Sublinear regime (0 < α < 1) � In this region fewer and smaller hubs are present than in a scale-free network. As

α → 1 pk follows a power law over an increasing range of degrees

� Linear regime (α = 1) �This corresponds to the BA model, hence the degree distribution follows a power law

� Superlinear regime (α > 1) � The high-degree nodes are very attractive. In this con�guration the earliest nodes become

super hubs and all subsequent nodes link to them.

42 / 67

The Origins of Preferential Attachment

The key role of preferential attachment poses another

question: where does it come from? From this question two