Scaling a Widget Company by Dathan Vance Pattishall
Contents
Who am I?
Introduction
Now I work at RockYou
Wh en
I st
art
ed
TEAM
Problems
Problems
Fix them now and not later
Stability Know Your Data
What data is added? How is data
accessed? How does data grow? Identify the power users?
Stability
Rate issues
Design to Fix the problem
Federation User 1’s Data User 2’s Data User 3’s Data …. User N’s Data
User 1’s Data
User 3’s Data
User 2’s Data
User N’s Data
Federation te
ri ew
D
as e r inc put T NO ough s oe thr
Federation
This Increases Write Throughp ut
How does one Federate?
Enter Global Lookup Cluster • Hash Lookups are fast, can do 45K qps • Ownerid -> Shard_id • Groupid -> Shard_id • Tagid -> Shard_id • Url_id -> Shard_id • Front By memcache • Use NDB to add capacity horizontally and HA
Write Multiple Views of the Data
Keep Data Consistent
What if I need an ID to represent a row REPLACE INTO Tickets VALUES(‘a’); Get a ID back CREATE TABLE `TicketsGeneric` ( `id` bigint(20) unsigned NOT NULL auto_increment, `stub` char(1) NOT NULL default '', PRIMARY KEY (`id`), UNIQUE KEY `stub` (`stub`) ) ENGINE=MyISAM AUTO_INCREMENT=7445309740
Eventually Consistent • Can repair if a race condition occurs and one form of the data is saved, less then .0001% in my environment.
But what if I need a global view of the table • Cron Jobs • Front by Memcache • Offline Tasks to atomic write job and return the page quickly i.e. defer writes to Many RECPT – Pure PHP – Like GEARMAND uses IPC distributed across servers – Does 100Million actions per day and scales linearly
What about maintenance
What about Shard Misbalance?
Migrate them • object_id -> shard_id, lock shard_id for object_id • Migrate the user • If error die, send alert • Takes less then 30 seconds per primary object • Currently shards are self balancing, can migrate 4 million users in 8 days, at slowest setting.
What about managing datasize • Enter Shard Types – Archive Shard – Sub Shards
• One way a DBA can scale is to partition and allocate a server per table, why not by shard. • Allows for bleeding edge techs, have 10 shards of XTRA-DB
What about Split Brain?
Friend Queries • Currently I brute force it • Hopefully by the time I present this my Java Layer is done.. So I’ll talk about that.
Friend Queries MULTI-GET from Shards
Get this book
Capacity Planning
Capacity Planning
Look at the Cluster as a whole
Stats
BCP • Duplicate mySQL statement based replication outside of mySQL • Use SQL comments – SHOW BINLOG EVENTS – PARSE Comments – patch mySQL to show charsets and SET TIMESTAMP events – /* DC:$N:$SERVERID:MEMCACHE KEY $PATH_TO_SQL_EXECUTION*/ – Execute statement IF NOT DC:$N
• Open source this?
Questions