An introduction to social network analysis June 11, 2008 David Lazer Director Program on Networked Governance Kennedy School of Government Harvard University
Networks
Definition:
Paradigmatic focus on relationships Emergent/self-organized interconnected forms Units are indeterminate
Key issues:
How does the configuration of networks affect how individuals and systems function? How to study networks?
A brief history of the study of networks
Roots in sociology and anthropology going back to early days of those fields Sociometry in 1930s (Moreno) Substantial interest in social psychology in 1940s-1960s (Festinger, Milgram, Newcomb) Economic sociology 1970s-present (Granovetter, White, Uzzi) Explosion of social capital research in 1990s (Putnam, Burt) Invasion of the physicists (Watts, Barabasi, Newman)
Networks in political science
Examples go back (at least) to 1938. Applications to:
Legislative processes Public opinion IR Interest groups
But until very recently, a very thin research tradition: it does not fit into dominant paradigms
Network analysis
Overview of foci of current social network research Research design Some examples of applications
Multiple levels of network theory
Systems level: what network structures function well for what tasks? Positional level: how does the individual position in the network affect that individual? Relational level: what drives the configuration of the network?
Overview of some social network “ideas”
How do networks affect how systems and individuals function?
How are networks structured?
How do networks affect systemic and individual functioning?
Regulation Circulation Coordination Control
Regulation vs Circulation
Coordination and control: centralized vs decentralized nets
Network structure
Small worlds (Milgram; Watts and Strogatz) Scale free networks (Barabasi, Stanley) Homophily (Merton, Lazarsfeld)
Small world networks
Scale free networks
Homophily: birds of a feather
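One simple way to quantify homophily is to compare the share of ties that connect same-group people with the share expected if ties were formed without regard to group. The sketch below is a minimal illustration on hypothetical data (the names, groups, and ties are invented here, not taken from any study cited in these slides).

```python
from itertools import combinations

def homophily_share(edges, group):
    """Fraction of ties that connect two nodes in the same group."""
    same = sum(1 for a, b in edges if group[a] == group[b])
    return same / len(edges)

def baseline_share(group):
    """Fraction of all possible pairs that share a group (random-mixing baseline)."""
    pairs = list(combinations(group, 2))
    same = sum(1 for a, b in pairs if group[a] == group[b])
    return same / len(pairs)

# Hypothetical toy data: six people in two groups, six undirected ties.
group = {"Ann": "x", "Bob": "x", "Cat": "x", "Dan": "y", "Eve": "y", "Fay": "y"}
edges = [("Ann", "Bob"), ("Bob", "Cat"), ("Ann", "Cat"),
         ("Dan", "Eve"), ("Eve", "Fay"), ("Cat", "Dan")]

observed = homophily_share(edges, group)   # 5 of 6 ties are within-group
expected = baseline_share(group)           # 6 of 15 possible pairs are within-group
print(f"same-group ties: {observed:.2f}, random baseline: {expected:.2f}")
```

An observed share well above the baseline is consistent with homophily, although it does not by itself show why similar people are tied.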
How to do social network research?
Types of data Research foci Design issues
Types of network data
One mode vs two mode Whole network vs egocentric Different types of relationships
One mode vs two mode
One mode: person to person
Jack likes Jill
Two mode: person to event
Jack
Jill
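Two-mode data are often converted to one-mode data by linking people who share an event. The following is a minimal sketch of that projection; the events and the third person ("Joan") are hypothetical and serve only to make the example non-trivial.

```python
from itertools import combinations

# Hypothetical two-mode data: person -> set of events attended.
attendance = {
    "Jack": {"picnic", "seminar"},
    "Jill": {"picnic", "seminar", "concert"},
    "Joan": {"concert"},
}

def project_to_people(attendance):
    """Return a one-mode network: (person, person) -> number of shared events."""
    ties = {}
    for a, b in combinations(sorted(attendance), 2):
        shared = len(attendance[a] & attendance[b])
        if shared:  # keep only pairs with at least one event in common
            ties[(a, b)] = shared
    return ties

person_ties = project_to_people(attendance)
print(person_ties)  # {('Jack', 'Jill'): 2, ('Jill', 'Joan'): 1}
```

The projection discards which events produced each tie, so the original two-mode data should be kept if that information matters.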
Whole network
Network visualization of Members who traveled at least 10 days together (Williams 2006)
Egocentric
From: Assessing the Social and Behavioral Science Base for HIV/AIDS Prevention and Intervention: Workshop Summary (1995)
Types of relationships
Communication Affection Advice Proximity Power Multiplexity of relationships
Data
What kind of data might be appropriate?
Survey Any communication, meeting Proximity Affiliation Behavioral
Ramifications of missing/noisy data
Some network measures degrade more than others
The coming revolution in observational data
Impact of removal of links from a 7-million-person mobile phone network: weak vs strong tie removal. Source: "Structure and tie strengths in mobile communication networks" by J.-P. Onnela, J. Saramäki, J. Hyvönen, G. Szabó, D. Lazer, K. Kaski, J. Kertész, and A.-L. Barabási, PNAS 104, 7332-7336 (2007)
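The kind of comparison involved can be sketched on a toy network: remove ties in order of strength (weakest first or strongest first) and track the size of the largest connected component. Everything below is hypothetical. The six-node network is built so that the one weak tie is a bridge between two tightly knit groups, so the outcome is by construction and is not evidence about the mobile phone data.

```python
def largest_component(nodes, edges):
    """Size of the largest connected component (ties treated as undirected)."""
    adj = {v: set() for v in nodes}
    for a, b, _weight in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:  # depth-first search over one component
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best

def after_removal(nodes, edges, n_removed, weakest_first):
    """Largest component after dropping n_removed ties, ranked by weight."""
    ranked = sorted(edges, key=lambda e: e[2], reverse=not weakest_first)
    return largest_component(nodes, ranked[n_removed:])

# Hypothetical weighted ties: two strong triangles joined by one weak bridge.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b", 10), ("b", "c", 9), ("a", "c", 8),
         ("d", "e", 10), ("e", "f", 9), ("d", "f", 8),
         ("c", "d", 1)]

weak = after_removal(nodes, edges, 1, weakest_first=True)
strong = after_removal(nodes, edges, 1, weakest_first=False)
print(weak, strong)  # 3 6
```

On real data the same two curves would be traced over many removal steps rather than a single one.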
Research foci
Individual level System level Network structure Micro to macro, using computational models
Analysis I: individual-level outcomes
Impact of being in a particular place in the network
E.g., impact of centrality (degree, reachability, betweenness) on outcomes; benefit of brokering between other actors (Burt)
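The three centrality measures named above can be computed from an adjacency list. This is a minimal sketch for small, unweighted, undirected networks; the betweenness routine follows the Brandes accumulation scheme, and the five-node "bowtie" network with a node labelled "broker" is a hypothetical example.

```python
from collections import deque

def degree(adj):
    """Number of direct ties per node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def reachable_within(adj, source, max_steps):
    """How many other nodes can be reached from source in at most max_steps."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if dist[v] == max_steps:
            continue  # do not expand beyond the step limit
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return len(dist) - 1

def betweenness(adj):
    """Shortest-path betweenness (unnormalized), each pair counted once."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        dist, order = {s: 0}, []
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        sigma[s] = 1
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):        # accumulate dependencies back to s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return {v: b / 2 for v, b in score.items()}  # undirected: halve the sum

# Hypothetical network: two pairs that connect only through "broker".
adj = {
    "a": {"b", "broker"}, "b": {"a", "broker"},
    "broker": {"a", "b", "c", "d"},
    "c": {"d", "broker"}, "d": {"c", "broker"},
}
print(degree(adj)["broker"])            # 4
print(reachable_within(adj, "a", 1))    # 2
print(betweenness(adj)["broker"])       # 4.0
```

Here the broker lies on every shortest path between the two pairs, which is the structural position the brokerage argument refers to.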
Analysis II: system-level outcomes
Impact of structure of overall network
E.g., density of connections has inverse U relationship with performance in creative settings (Leenders, Uzzi) Impact of centralization on signal aggregation (Bavelas) Various research on small groups
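Two system-level descriptors mentioned above, density and centralization, can be illustrated by comparing a star-shaped and a ring-shaped group of five. The sketch assumes undirected ties and uses Freeman's degree centralization as one possible centralization index; the node names are hypothetical.

```python
def density(nodes, edges):
    """Share of possible undirected ties that are present."""
    n = len(nodes)
    return 2 * len(edges) / (n * (n - 1))

def degree_centralization(nodes, edges):
    """Freeman degree centralization: 1.0 for a star, 0.0 when all degrees are equal."""
    deg = {v: 0 for v in nodes}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    n = len(nodes)
    top = max(deg.values())
    return sum(top - d for d in deg.values()) / ((n - 1) * (n - 2))

# Hypothetical five-person groups: a centralized star and a decentralized ring.
nodes = ["hub", "w", "x", "y", "z"]
star = [("hub", v) for v in ["w", "x", "y", "z"]]
ring = [("hub", "w"), ("w", "x"), ("x", "y"), ("y", "z"), ("z", "hub")]

print(density(nodes, star), degree_centralization(nodes, star))  # 0.4 1.0
print(density(nodes, ring), degree_centralization(nodes, ring))  # 0.5 0.0
```

The two groups have similar density but opposite centralization, which is why the two properties are usually reported separately.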
Analysis III: network structure
Dyadic correlates of the configuration of the network
E.g., homophily, distance (McPherson)
Mid-level features (e.g., triads, quads, etc) Structure “reduction” (Newman, Frank) Descriptions of overall structure
Degrees of separation, clustering (small world) Degree distribution (scale free)
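The overall-structure descriptors listed above (degrees of separation, clustering, degree distribution) can be sketched as follows for a small, connected, undirected network. The five-node network is hypothetical, and the functions are a minimal illustration rather than a replacement for a network analysis package.

```python
from collections import Counter, deque
from itertools import combinations

def clustering(adj, v):
    """Share of v's neighbor pairs that are themselves tied (local clustering)."""
    nbrs = adj[v]
    if len(nbrs) < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (len(nbrs) * (len(nbrs) - 1))

def average_path_length(adj):
    """Mean shortest-path distance over all reachable pairs (degrees of separation)."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from s
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def degree_distribution(adj):
    """Mapping from degree to the number of nodes with that degree."""
    return dict(Counter(len(nbrs) for nbrs in adj.values()))

# Hypothetical network: a triangle (a, b, c) with a two-step tail (c-d-e).
adj = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e"}, "e": {"d"},
}
mean_clustering = sum(clustering(adj, v) for v in adj) / len(adj)
print(round(mean_clustering, 3))            # 0.467
print(average_path_length(adj))             # 1.7
print(degree_distribution(adj))
```

High clustering combined with short average paths is the usual small-world signature, while a heavy-tailed degree distribution is the usual scale-free signature; a five-node example is far too small to show either.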
Analysis IV: computational modeling
In emergent processes, snapshot may not reflect micro-level processes (Schelling) Agent-based modeling: very simple assumptions about behavior, which (sometimes) yield surprising systemic patterns. Ex: my work on the social structure of exploration and exploitation
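A minimal agent-based sketch in the spirit of Schelling's segregation model is given below. It is a simplified variant chosen for brevity, not Schelling's original specification: agents of two types sit on a ring, an agent is unhappy when fewer than half of its four nearest neighbors share its type, and one randomly chosen unhappy agent at a time swaps places with a randomly chosen agent of the other type. All parameter values and names are assumptions made for illustration.

```python
import random

def like_share(cells, i, radius):
    """Share of agent i's neighbors (within radius on a ring) of its own type."""
    n = len(cells)
    nbrs = [cells[(i + k) % n] for k in range(-radius, radius + 1) if k != 0]
    return sum(1 for t in nbrs if t == cells[i]) / len(nbrs)

def mean_like_share(cells, radius):
    """Average like-neighbor share over all agents (a crude segregation index)."""
    return sum(like_share(cells, i, radius) for i in range(len(cells))) / len(cells)

def run(cells, radius=2, threshold=0.5, max_steps=5000, seed=0):
    """Swap unhappy agents with other-type agents until all are happy or steps run out."""
    rng = random.Random(seed)
    cells = list(cells)  # work on a copy
    for _ in range(max_steps):
        unhappy = [i for i in range(len(cells))
                   if like_share(cells, i, radius) < threshold]
        if not unhappy:
            break
        i = rng.choice(unhappy)
        others = [j for j in range(len(cells)) if cells[j] != cells[i]]
        j = rng.choice(others)
        cells[i], cells[j] = cells[j], cells[i]
    return cells

# Hypothetical starting point: 20 agents of each type in random order.
shuffler = random.Random(1)
start = list("A" * 20 + "B" * 20)
shuffler.shuffle(start)
final = run(start)

print("before:", "".join(start), round(mean_like_share(start, 2), 2))
print("after: ", "".join(final), round(mean_like_share(final, 2), 2))
```

Comparing the index before and after shows whether a mild individual preference produced more clustering at the system level in a given run; a snapshot of the final arrangement alone would not reveal the rule that generated it.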
Network visualization Source: Pajek Homepage (http://research.lumeta.com/ches/map/gallery/index.html)
When useful?
To see unanticipated patterns More useful for small networks and egocentric networks Tools for pattern recognition in larger networks (?) Powerful tools for presenting ideas Software: Netdraw, Pajek, Visone
Example: Networks among State Health Officials
Legend:
Relationships: Grey ties = overlapping ties; Red = Talk in general; Black = Pandemic preparedness; Dark grey ties = Professional development
Nodes: Region 1: red; Region 2: blue; Region 3: black; Region 4: grey; Region 5: pink (Territories)
Example: Flight patterns movie (Aaron Koblin)
http://vw.indiana.edu/07netsci/entries/#flight
Research design
Statistical challenges Design challenges
Statistical challenges
Interdependence of observations
For example, if A talks to B, and B talks to C, is it more likely that A talks to C (transitivity)?
Statistical methods to deal with these interdependencies (QAP, P*, ERGM)
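Of the methods listed, QAP is the simplest to sketch: correlate two relationship matrices, then repeatedly relabel the actors of one matrix (permuting its rows and columns together) to see how often a correlation that large arises when the labels are scrambled. The code below is a minimal illustration, not a full implementation; the two five-person matrices ("talk" and "advice"), the number of permutations, and the one-sided p-value formula are assumptions made for the example.

```python
import random
from statistics import mean

def off_diagonal(m):
    """Flatten a square matrix, skipping the diagonal (self-ties)."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(n) if i != j]

def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def qap_test(a, b, n_perm=1000, seed=0):
    """Observed correlation and a one-sided permutation p-value."""
    rng = random.Random(seed)
    n = len(a)
    flat_a = off_diagonal(a)
    observed = correlation(flat_a, off_diagonal(b))
    at_least = 0
    for _ in range(n_perm):
        p = list(range(n))
        rng.shuffle(p)
        # Relabel actors in b: the same permutation is applied to rows and columns.
        permuted = [[b[p[i]][p[j]] for j in range(n)] for i in range(n)]
        if correlation(flat_a, off_diagonal(permuted)) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_perm + 1)

# Hypothetical data: who talks to whom, and who seeks advice from whom.
talk = [[0, 1, 1, 0, 0],
        [1, 0, 1, 0, 0],
        [1, 1, 0, 1, 0],
        [0, 0, 1, 0, 1],
        [0, 0, 0, 1, 0]]
advice = [[0, 1, 0, 0, 0],
          [1, 0, 1, 0, 0],
          [0, 1, 0, 1, 0],
          [0, 0, 1, 0, 1],
          [0, 0, 0, 1, 0]]

r, p_value = qap_test(talk, advice)
print(round(r, 3), round(p_value, 3))
```

Because whole rows and columns move together, each permuted matrix keeps the internal structure of the original network, which is what lets the test respect the interdependence of dyads.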
Design challenges
The causal nexus: whither the causal arrow?
Network to node? Node to network? Omitted variable driving both?
The value of control The value of longitudinal data
Example: studying social influence
How to dissect cause and effect of social influence? Problem of unobserved heterogeneity Some roommate studies Study of policy school students
Keys to studying social influence in this study
Longitudinal data Measurement of views at inception of system Implausibility of alternative explanation that network is related to unobserved heterogeneity
The network of influence
Shape: triangle = section 1; square/diamond = section 2; circle/octagon = section 3
Color: dark blue = 1 (extremely liberal); blue = 2; light blue = 3; gray = 4; pink = 5; red = 6 (conservative)
Size: larger = became more conservative; smaller = became more liberal; in-between = no change
Paradigm shortcomings
Until recently, almost all work was based on snapshots of small systems. Lack of attention to causal nexus. Lots of attention on flow in networks, but little data on actual flow. Relational focus obscures interplay of nodal-level factors and network.