Discrete Mathematics in the Modern World
1
Mathematics - Driven by Needs BC: calendar - astronomy architecture - geometry navigation - trigonometry Middle Ages: currency conversion - algebra introduction of arabic numberals Rennaissance: first printed maths book: Peurbach’s Theoricae nova planetarum (1472) 16th -19th century: science - calculus gambling - probability, combinatorics 20th century: economics - game theory efficiency - linear programming Computer age: algorithmic theory, numerical maths, cryptography, finite mathematics, graph theory
2
Graphs Def: A graph is an object consisting of (i) points in the plane (the vertices) (ii) lines joining the points (the edges)
Rem: Often used synonymously: network Clarification: A graph is not
3
Ex: A map with cities and freeways is a graph
4
Ex: Consider only cities and freeways
5
Ex: London Underground is a graph
6
Ex: The structural formula of Butane is a graph
7
Ex: (i) network of metabolic pathways (ii) study of genes (iii) computer networks (iv) telephone networks (v) social networks (friendship graph) Ex: Characterisation of interval graphs led to Nobel Prize for Microbiology for Benzer’s work on the fine structure of genes.
8
Def: Distance between vertices a and b: dist(a, b) = #steps needed to get from a to b. Ex: Graph below: d(a, b) = 1 and d(a, c) = 2.
Rem: If a graph models a transportation network, then dist(a, b) ∼ travel time from a to b Def: diameter = largest of all distances. Ex: Above: diam(G) = 2.
9
Rem: In a transportation network: Diameter ∼ maximum travel time. Rem: In a sociological network: Diameter ∼ measure of cohesion.
10
Rem: The friendship graph F : Vertices = people, edges = friendships. Rem: Very big, hard to study F . Q: Diameter of F ? Experiment: (S. Milgram, 1967) (i) starter receives folder with name + address of target, (ii) hands folder to someone closer to target, (iii) many folders reached targets in ≤ 6 steps. Conclusion: diam(G) is about 6, the SIX DEGREES OF SEPARATION. Rem: Some objections, but more or less accepted. Mathematics says...
11
Def: The degree of a vertex is the number of vertices it is joined to. Ex: Graph below: deg(a) = 3 and deg(c) = 2. The overall average degree is 3.2.
Rem: Friendship graph: degree = # friends. Reasoning: We know: (i) F has, say, 5.000.000.000 vertices, (ii) F has average degree about, say, 42, (iii) 99% of all graphs satisfying (i) and (ii) have diameter about 6. so we conclude probably diam(F ) ≈ 6.
12
Erd¨ os, Renyi: Theory of Random Graphs: Many properties hold for either close to 100% of all graphs, or for close to 0%, depending on the average degree. Theo: Of all graphs with n vertices and average degree d, where d ≥ log n, almost 100% have log n diam(G) ≈ constant × . log d¯ Rem: log n is much smaller than n, log n ≈ # digits of n Cor: Most likely diam(F ) is very small.
13
Power Law Distributions Lotka’s Law (1926): Let A(k) = # authors who published k scientific articles. Then 1 A(k) ≈ constant × 2 . k Let A(k) be the number of authors who published exactly k articles. If, say, 1000 authors wrote one paper, then approximately A(1)
A(2)
A(3)
A(4) A(5) . . .
1000 1000 1000 . . . 1000 1000 4 9 16 25 1000 = 250 = 111 = 64 = 40 . . .
A(k) follows a power law with exponent 2.
14
Rem: Typical for power law: many authors published 1 paper, fewer published 2, even fewer published 3,... Rem: Power law =“heavy tail distribution” (polynomial, not exponential)
15
Zipf’s Law (1952): Suppose all English words are listed in order of frequency: w1 being the most common word, w2 the second most common word, etc. If W (k) = # occurrences of wk per 100 words of standard text, then W (k) follows a power law with exponent 1: 1 W (k) ≈ const × . k Rem: Similar for all human languages and some programming languages. Awerbach (1913) City sizes follow a power law.
16
Def: Let G be a large graph. Let Deg(k) = #vertices of degree k. If Deg(k) follows a power law, then we say that G is a power law graph. Observation Many graphs are power law. Year 1999 2002 1998 1999 2005 2002 1998 2000 2001
Network Social: phone calls emails film actors Information: www.nd.edu the web word co-occurr. citation netw. Biological: metabolic netw. protein interact.
# vert.
d
exp.
47 million 59912 449.913
3.16 1.44 3.48
2.1 1.5 2.3
269.504 53 billion 460902 783.339
5.55 70.1 8.57
2.1 2.1 2.7 3.0
765 2115
9.64 2.12
2.2 2.4
17
The Web Rem: Prime example of a PLG: WWW
Rem: Important pages have large in-degree. indeg(google) = 4, indeg(P D home) = 1. Rem: ment:
WWW grows by preferential attach-
A new page is more likely to be linked to pages that already have many links. Rem: Graphs that grow by preferential attachment are usually PLG.
18
Theo: Of all PLG with n vertices and given average degree d, almost 100% have diam(G) ≈ constantd × log log n. Meaning: PLG have extremely small diameter. Study: The web has diameter about 19. Rem: F also grows by preferential attachment. So F is also a power law graph. Corollary: If F is a PLG, then probably diam(F ) is extremely small.
19
Searching the Web Rem: search engines consist of 3 parts: crawler: surfs the web and sends data on the content of web pages to the search engine indexer: builds an index (list of key words of each page) query engine: checks which pages have relevant content, then ranks the pages found. Difficult part: Ranking Rem: Old search engines (AltaVista, Lycos) were text based. Google uses the structure of the web graph. Vast improvement!
20
Bad idea: Use in-degree for ranking.
Solution: PageRank algorithm (L. Page, S. Brin, 1998) Tool: Use random walks along edges: If we are at the School of Maths page then Prob(SoM −→ SAMS) =
1 1 = . outdeg(SoM) 4
Idea: Rank according to # visits.
21
Def: For a web page A define visits(A) as # times A is visited total number of steps of a long random walk. visits(A) =
Idea: Rank pages according to visits. Determine visits: Discrete Markov chains with transition matrix P where (
Pi,j =
1 outdeg(i)
0
if i links to j, otherwise,
but if vertex i has outdeg(i) = 0, then let 1 1 1 1 ith row = ( , , , . . . , ) n n n n to avoid getting stuck. Add, with 15% probability, a random jump from vertex i to any vertex. New transition matrix Q = 0.85P + 0.15J, where J is the ‘all 1’ n × n matrix. Qt is ≥ 0 and primitive. By Perron-Frobenius it has a unique eigenvector E > 0. If |E| = 1 then E corresponds to a stationary state: visit(i) = Ei. 22
Ex: A typical random graph with most vertices having the same degree:
23
Ex: A typical power law graph with many vertices of small degree and few vertices of large degree :
24