INSIDE PAGERANK
Bejo Thampi, S7 CSE, Roll No: 14, SNGCE
On: 26/08/2009
Guided by: Mr. Sujith Kumar
CONTENTS
• Introduction
• PageRank
• PageRank calculation
• Optimization of PageRank
• Removal of Dangling Pages
• Conclusion
• References
Introduction
• Represents how important a page is on the web.
• Numeric value between 0 and 10.
• Old algorithms considered only the content of the page, causing web spamming.
• Considers both content and anchor text (backlinks).
PageRank
• PageRank is a link analysis algorithm.
• Determines a page’s ranking in the search results.
• A trademark of Google; the patent is assigned to Stanford University.
Continue…
Quoting from the original Google paper, PageRank:
• interprets a link from page A to page B as a vote, by page A, for page B.
• also analyzes the page that casts the vote.
Calculation of PageRank
• Two methods: Simplified Algorithm and Damping Algorithm
• Based on probability distribution
Simplified Algorithm
In the general case, the PageRank value for any page u can be expressed as:
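For reference, the simplified form is usually written as below. The symbols B_u (the set of pages that link to u) and L(v) (the number of outgoing links on page v) are assumed notation and do not appear in the slides.

```latex
PR(u) = \sum_{v \in B_u} \frac{PR(v)}{L(v)}
```

Each page v that links to u passes on an equal share of its own PageRank through each of its outgoing links.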
Damping Algorithm
• PR(V) = (1 - d) + d(PR(V2)/1)
• d (damping factor) = 0.85
• Calculated using Simple Iterative Method
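A minimal sketch of how the simple iterative method might be run on a small link graph. It assumes the usual generalisation of the formula to pages with several incoming links, PR(p) = (1 - d) + d * sum(PR(q) / outlinks(q)) over all pages q that link to p. The function name `pagerank`, the dictionary layout and the three-page graph are illustrative assumptions and are not taken from the slides.

```python
def pagerank(links, d=0.85, iterations=40, start=1.0):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / out(q)) over pages q linking to p.

    `links` maps each page to the list of pages it links to. Every page is
    assumed to have at least one outgoing link (no dangling pages).
    """
    ranks = {page: start for page in links}
    for _ in range(iterations):
        new_ranks = {}
        for page in links:
            # Share of rank received from every page that links to `page`.
            incoming = sum(
                ranks[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_ranks[page] = (1 - d) + d * incoming
        ranks = new_ranks
    return ranks


# Hypothetical three-page graph: A -> B, C; B -> C; C -> A.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
for page, value in sorted(ranks.items()):
    print(page, round(value, 4))
print("average:", round(sum(ranks.values()) / len(ranks), 4))
```

In this sketch every pass computes all new values from the previous pass, so the order in which pages are visited does not matter.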
Example
• Two pages: Page A and Page B
• Each page has one outgoing link, i.e. hA = 1 and hB = 1
Guess 1
Start with a guess of 1.0:
PR(A) = (1 - d) + d(PR(B)/1)
PR(B) = (1 - d) + d(PR(A)/1)
i.e.
PR(A) = 0.15 + 0.85 * 1 = 1
PR(B) = 0.15 + 0.85 * 1 = 1
Here the numbers aren’t changing, so we consider this a good guess.
Guess 2
Guess at 0 instead and re-calculate:
PR(A) = 0.15 + 0.85 * 0 = 0.15
PR(B) = 0.15 + 0.85 * 0.15 = 0.2775
And again:
PR(A) = 0.15 + 0.85 * 0.2775 = 0.385875
PR(B) = 0.15 + 0.85 * 0.385875 = 0.47799375
Continue…
• On the 39th iteration, PR(A) = 0.999999 and PR(B) = 1.0000000.
• On the 40th iteration, PR(A) = 1.000 and PR(B) = 1.000, and the average PageRank = 1.00000.
Principle: the “normalized probability distribution” (the average PageRank for all pages) will be 1.0.
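The two guesses can be replayed with a few lines of Python. This is a sketch, assuming PR(A) is updated first and PR(B) is then computed from the freshly updated PR(A), which is what the Guess 2 figures imply. The function name `iterate_two_pages` is hypothetical.

```python
def iterate_two_pages(guess, d=0.85, iterations=40):
    """Two pages that link only to each other, one outgoing link each.

    PR(A) is updated first; PR(B) then uses the freshly updated PR(A).
    Returns the (PR(A), PR(B)) pair after every iteration.
    """
    pr_a = pr_b = guess
    history = []
    for _ in range(iterations):
        pr_a = (1 - d) + d * (pr_b / 1)
        pr_b = (1 - d) + d * (pr_a / 1)
        history.append((pr_a, pr_b))
    return history


from_one = iterate_two_pages(1.0)
from_zero = iterate_two_pages(0.0)

print("guess 1.0, first pass:", from_one[0])
print("guess 0.0, first two passes:", from_zero[:2])
print("guess 0.0, 40th pass:", from_zero[-1])
```

Starting from either guess, both values settle towards 1.0, which matches the principle that the average PageRank is 1.0.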
Variations
• Google Toolbar Rank: The Google Toolbar's PageRank feature displays a visited page's PageRank as a whole number between 0 and 10.
• SERP Rank: Rank on the search engine results page (SERP), the result returned by a search engine in response to a keyword query.
Optimization of PageRank
Three fundamental areas to look at when trying to optimize the PageRank for a site:
1. The links you choose to have link to you, i.e., which ones you choose, and how much effort you put into getting them.
Continue…
2. Who you choose to link out to from your site
• Maximising PageRank feedback and minimising PageRank leakage
3. The internal navigational structure and linkage of your pages
• Distribute PageRank within your site
Internal Linking
Hierarchical:
Courtesy: http://www-db.Stanford.edu/~backrub/google.html
Continue…
Looping:
Courtesy: http://www-db.Stanford.edu/~backrub/google.html
Removal of Dangling Pages
• Dangling links are simply links that point to any page with no outgoing links.
• These links are simply pages that have not been downloaded yet.
• Ranking is affected.
• Need to be removed.
http://www.iprcom.com/papers/pagerank/index.html
Continue…
• Eliminates dangling pages before PageRank calculation.
• Done by introducing a dummy page.
• The dummy page has a link to itself and is pointed to by every dangling page.
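A minimal sketch of the dummy-page idea, assuming the link graph is held as a dictionary that maps each page to its list of outgoing links. The helper name `add_dummy_page` and the page name `"DUMMY"` are illustrative choices and do not come from the source.

```python
def add_dummy_page(links, dummy="DUMMY"):
    """Return a copy of `links` in which every dangling page points to a dummy page.

    A dangling page is one with no outgoing links, including pages that are
    only ever linked to and never appear as a key. The dummy page links to itself.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    patched = {page: list(links.get(page, [])) for page in pages}
    for targets in patched.values():
        if not targets:
            targets.append(dummy)
    patched[dummy] = [dummy]
    return patched


# Hypothetical graph: C has no outgoing links, D is linked to but not listed.
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": []}
patched = add_dummy_page(graph)
for page in sorted(patched):
    print(page, "->", patched[page])
```

After this step every page has at least one outgoing link, so an iterative calculation can run without a special case for dangling pages.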
Continue…
Courtesy: http://www.iprcom.com/papers/pagerank/index.html
Conclusion
• Parameter involved in Google’s ranking of the answers to a given query.
• Optimal computation of PageRank
• Optimization of PageRank
References
1. “The Google Story” - David A Vise
2. http://www.iprcom.com/papers/pagerank/index.html
3. http://en.wikipedia.org/wiki/PageRank
4. http://www.siteall.com/guide/
THANK YOU
QUESTIONS?