Scaling Postgresql Pgday Eu 2009

  • Uploaded by: Gavin M. Roy
  • 0
  • 0
  • June 2020
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Scaling Postgresql Pgday Eu 2009 as PDF for free.

More details

  • Words: 793
  • Pages: 29
Scaling PostgreSQL Under Fire Gavin M. Roy Chief Technology Officer myYearbook.com pgDay Europe 2009

About myYearbook.com 2007 - 100M Page views per Month 2009 - 1.5B Page views per Month Top 5 Social Network in the United States as measured by Hitwise Top 25 trafficked site in the United States as measured by ComScore 99% Uptime

The 1am Phone Call

Growth is a double-edged sword.

Database Project Considerations • Good Schema • Is it designed to Scale? • Is it designed to be Maintained? • Good Hardware • Will you have enough space for unexpected growth? • Will it be fast enough to handled additional load? • Will it be stable in production under load?

Database Project Considerations • Good Planning • What will you do when there is a failure? • How long will it take to recover? • What kind of failures will you have? • How will you handle upgrades and downtime?

Internet Startup Growth Cycle 1. Prototype 2. Launch 3. Re-Engineer (Fix problems) 4. Add new functionality 5. Repeat Steps 3 and 4

Internet Startup Growth Cycle • Steps 1 & 2 • Limited Budget • Limited Time & Resources • Steps 3 & 4 • Increased Budget • Limited Time & Resources

“The best laid schemes o’ mice an’ men Gang aft agley” - Robert Burns, To a Mouse

Prepare for Growth • Hardware • CPU Horsepower based upon need • Disk based upon need • RAM based upon budget. • Get 2

“The Cloud”

Plan for Growth

Concurrency & Data Growth

Concurrency in PostgreSQL • Each running PostgreSQL connection carries overhead • More connections == Slower queries • Pool your connections • pgBouncer - light weight libevent based pooling daemon • pgPool II - Does pooling and much, much more • Language specific pooling

Table Partitioning • Supported in PostgreSQL as of 8.1 • Excellent method for maintaining data • Allows for removal of aged data without bloat • Focused SELECTS while allowing ad-hoc SELECT across all partitions

Vertical Partitioning Data • Isolate application data in different database servers • Replicate common data needed for joins

Horizontally Partitioning Data • Spread table data across multiple database servers • Use a CRC or Hashing algorithm to determine server location of data • Roll your own in your client code • Use pl/Proxy • Plan for growth, use multiple server slots per server

“Anything that can possibly go wrong, does.” - Jack Sack

Recovering from Failures • Daily backup is not enough • Disaster recovery option • Replicate data for failover and maintenance • via Replication Tools like Londiste, Bucardo and Slony • via Warm Standby (PITR Log Shipping) • Hot standby? Maybe in 8.5

PostgreSQL is YeSQL

Saying no to the naysayers

No[SQL] is Not an Option* • Key/Value store databases are not new, they’re just trendy • Facebook still has to use Memcached even though it developed the Cassandra distributed key/value database. • Same type of developers who jumped on MySQL because “it was fast” • Data without schema has limited use • Ad-hoc reporting is a key business value that is realistically unachievable in non-structured data • Find the balance of speed and simplicity with normalization

How to kill PostgreSQL performance.

Inadequate Hardware • Not enough RAM • Not enough disk bandwidth • Depending on topology • Not enough disks • Not enough controllers • Slow communication bus • Over-saturated CPU

Lock Contention == Death • Nothing brings a server to its knees faster than a long running Exclusive Lock • Even high share count locks go hand-and-hand with slowness • Reduce lock contention • Use partitioning schemes • Use concurrent operations for maintenance related activities

Bloat == Slower Death • MVCC is both integral to PostgreSQL performance and impacts performance • Enter Heap Only Tuples (HOT) • Updates to non-indexed columns in rows can re-use the same tuple • Dead tuples can be re-used • Index Bloat • Massive index bloat can occur in high-write transaction databases • Address by concurrently reindex tables without locking*

“Knowledge is Power” - Sir Francis Bacon

Reacting can be Proactive • check_postgres.pl by Greg Mullane • Nagios plugin • Bloat • Management activity such as last analyze and vacuum • wal file count, txid wrap around, sequence exhaustion • many other items it checks

Be Trendy • Know your database behavior over time • Predict future issues and behavior • Identify issues as they occur • Review impact of maintenance • Know if your heap use exceeds your index use • Look for daily change

Trending and Analysis at myYearbook.com • Cacti - http://www.cacti.net • Posuta - http://code.google.com/p/posuta/ • pgFouine - http://pgfouine.projects.postgresql.org/ • Staplr - http://github.com/gmr/staplr

Fin • Questions? • Follow me on Twitter: http://twitter.com/Crad • Leave feedback: http://2009.pgday.eu/feedback

Related Documents

Postgresql
May 2020 12
Postgresql
June 2020 4
Postgresql Pratico.pdf
December 2019 19
Scaling Desin
May 2020 32

More Documents from "Tanju YILDIZ"