How MySQL Powers Web 2.0

An overview of how MySQL helps power Web 2.0 technologies and companies

A MySQL® Technical White Paper
August 2006
Copyright © 2006, MySQL AB All logos are trademarks of their respective companies
Table of Contents

Executive Summary
What is Web 2.0?
    Characteristics and Core Competencies
    The Web is the Platform
    An Architecture of Participation
    O’Reilly’s Hierarchy of “Web 2.0-ness”
MySQL & Web 2.0 Applications
    Virtual/Online Communities & Worlds
        Linden Lab/Second Life and MySQL
        Additional Characteristics
    Web Syndication & Feeds
        FeedBurner and MySQL
        Additional Characteristics
    Blogs
        LiveJournal and MySQL
        Additional Characteristics
    Social Networking
        Mixi.jp and MySQL
        Additional Characteristics
    Wikis
        Wikipedia and MySQL
        Additional Characteristics
    Customized and Advanced Meta Search Engines
        Technorati and MySQL
        Additional Characteristics
    File, Image & Video Sharing
        Flickr and MySQL
        Additional Characteristics
    Online Gaming
        PokerRoom.com and MySQL
        Additional Characteristics
Technology Requirements of Web 2.0
    The Benefits of Community & Open Source
    Linux
    Apache
    MySQL
    PHP, Perl and Python
    Ruby on Rails
    Ajax
    memcached
How MySQL Powers Web 2.0
    “Fail Fast – Scale Fast”
    Scale Out vs. Scale Up
    MySQL Replication
    MySQL Cluster for High Availability
    MySQL Query Cache
    Pluggable Storage Engine Architecture
    10 Reasons to Choose MySQL for Web 2.0 Applications
Conclusion
About MySQL
Additional MySQL & Web 2.0 Resources
    MySQL and Web 2.0 Portal
    Web 2.0 Articles
    White Papers
    Case Studies
    Press Releases, News and Events
    Live Webinars
    Webinars on Demand
Executive Summary

The Internet continues to evolve at an accelerating rate, with new technological innovations introduced all the time. These changes show up in hardware computing power, software versatility and networking speeds, and they constantly force us to rethink not only how we use the web today but also what new possibilities lie ahead. The rapid pace at which these new technologies and services are being integrated into our lives is dramatically changing the way we communicate, socialize, share and locate information, entertain ourselves and shop for goods and services. It also creates unique opportunities (and challenges) for companies and organizations looking to make the best use of these innovations.

This rapidly evolving landscape of “next generation” technologies and companies is being categorized as “Web 2.0”. Because these applications predominantly “live” online, they harness a strong collaborative and collective nature. Where the web was once a static and passively consumed experience, it is now dynamic, transactional and interactive; participation is not optional, it is mandatory.

The companies delivering these applications and services are taking advantage of lowered barriers to market entry by making full use of open source software running on commodity off-the-shelf hardware. This has allowed Web 2.0 companies to meet their capacity and performance requirements incrementally over time. It is no surprise that a common characteristic of many Web 2.0 websites, applications and companies is their use of the LAMP (Linux, Apache, MySQL, PHP, Perl & Python) open source stack. This allows fast-growing sites to deliver performance, scalability and reliability to millions of users at a fraction of the cost of proprietary databases.
MySQL enables up-and-coming Web 2.0 sites like Wikipedia, FeedBurner and digg, as well as established web properties like Craigslist, Google and Yahoo!, to scale out and meet the ever-increasing volume of users, transactions and data.

The information presented here will be valuable to entrepreneurs about to create their own Web 2.0 business, to existing web properties wishing to bring their applications to the next level, and to the large number of enterprises interested in leveraging Web 2.0 technologies. You will also gain an understanding of how MySQL can be used in conjunction with other open source components to deliver low-cost, reliable, scalable, high performance Web 2.0 applications.
What is Web 2.0?

Web 2.0 can generally be thought of as the technologies and web sites that leverage users and developers in a socially collaborative manner in order to rapidly develop data and applications with a high level of integration across platforms and other services.

The term “Web 2.0” was coined in 2004 during a brainstorming session between Tim O’Reilly of O’Reilly Media and MediaLive International, a company which puts on technology tradeshows. It was originally intended as the name of an upcoming conference showcasing new web-based companies and technologies that had emerged after the dot-com bubble. The term has since been dismissed as a marketing buzzword, co-opted and validated several times over by various individuals and companies. It has typically been used to describe the new technologies and companies that are revolutionizing the way we use and think about the World Wide Web.

Tim O’Reilly expands further on the definition of the term in his article “What is Web 2.0”, September 30, 2005:

“Web 2.0 applications are those that make the most of the intrinsic advantages of that platform: delivering software as a continually updated service that gets better the more people use it, consuming and remixing data from multiple sources, including individual
users, while providing their own data and services in a form that allows remixing by others, creating network effects through an “architecture of participation” and going beyond the page metaphor of Web 1.0 to deliver rich user experiences.”

In the following sections we delve deeper into four ideas central to the discussion of Web 2.0, which O’Reilly and others have elaborated on since the initial emergence of the term. They include:

• Characteristics and Core Competencies of Web 2.0
• The Web is the Platform
• An Architecture of Participation
• Hierarchy of “Web 2.0-ness”
Characteristics and Core Competencies

O’Reilly names the seven characteristics and core competencies of Web 2.0 companies in his article, “What is Web 2.0”, September 30, 2005, and we elaborate on them in this section.

“They should be in the business of providing services not packaged software, while enabling cost effective scalability.” A key point here is how frequently Web 2.0 companies leverage MySQL Replication in a “scale-out” configuration. Scale-out enables the use of low-cost commodity servers to increase database performance and scalability incrementally, at a fraction of the price of traditional “fork-lift” or “scale-up” methods. This capability is critical for companies that experience explosive growth and adoption in very short time frames.

“They should also exercise control over unique, difficult to replicate data sources which get richer the more individuals use and contribute to them.” A site might include a database of customer reviews and recommendations on products. You can imagine the difficulty of attempting to recreate all the unique, varied and unbiased opinions you may find on a particular product. Amazon’s customer reviews are an everyday example. Amazon “controls” the review data, which would be difficult to replicate without the same customer traffic and level of participation its customers engage in. A similar use can be found in eBay’s “Seller Ratings”. For many companies, this competency revolves around controlling data that competitors find prohibitive to replicate, either because of licensing costs from private data providers or because of an inability to engage users to “create” the data.

“Trusting users as co-developers.” This concept revolves around the idea that users actively assist, in various capacities, in the development of the application. To some degree, this is not a new concept. The open source community has relied on this model of active contribution since its inception. More specifically, the users and development community are active participants in developing, testing, requesting enhancements and reporting bugs. Even companies offering applications which are not “open source” have employed this methodology. It can take the form of introducing new functionality at an accelerated pace, mimicking the open source community’s “release early, release often” development model. It can also be accomplished by monitoring usage patterns in order to learn which functionality is being used and creating value, and which is not. Adopting a development state of “perpetual beta” allows a company to be more responsive in adopting new technologies and usage patterns, and more adaptable to changing business conditions. An example of an application still in “beta”, yet with a large user base contributing either actively or passively, is Google’s Gmail.

“Harnessing collective intelligence.” “Collective intelligence” refers to the level of participation a website reaches when the users themselves are actively deciding what is important and what provides value to them. Websites which offer product reviews allow the users amongst themselves to rate products which present
a good value and those that do not. Other examples include Wikipedia, where the users are charged with creating and policing data on the system, and any site which leverages “tags” created by the users themselves to help aggregate and locate relevant data. The idea of collective intelligence can be applied to just about any system where the users have been empowered to decide what is important or accurate and what is not.

“Leveraging the long tail through customer self service.” The term “long tail” was coined by Chris Anderson in a 2004 Wired Magazine article to describe certain business and economic models such as those of Amazon.com or Netflix. The point was that customer self-service can be combined with effective data management in order to offer goods and services which appeal to users outside the mainstream. The belief is that the aggregate of all these non-mainstream users is much larger than the mainstream audience. For example, online booksellers and DVD rental sites draw a significant portion of their revenue from titles that have long disappeared from the general public’s radar. Another example can be seen in the growing popularity of music trading sites like LaLa.com, whose business revolves around bringing “like-minded” individuals together to trade music amongst themselves, regardless of whether their tastes fall well outside the mainstream. It could be argued that this is not necessarily a new concept: many traditional booksellers, video rental and music stores can attest that a significant portion of their business stems from albums which have long disappeared from the charts, movies which have not screened in a theatre for years, or books by authors who have long since fallen out of favor with the general public. Web 2.0 companies have realized that their applications must be designed to serve not only popular tastes but also the interests of those on the fringes.
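The long-tail argument above is essentially arithmetic, and a small worked example may make it more concrete. The sketch below assumes, purely for illustration, that demand for the title at popularity rank r is proportional to 1/r (a Zipf-like curve), that the catalog holds 100,000 titles, and that a traditional store stocks only the top 100. None of these figures come from Anderson's article or from any company named here; the function names are hypothetical as well.

```python
def zipf_demand(num_titles: int, exponent: float = 1.0) -> list[float]:
    """Hypothetical demand per title, ranked from most to least popular."""
    return [1.0 / (rank ** exponent) for rank in range(1, num_titles + 1)]


def tail_share(demand: list[float], head_size: int) -> float:
    """Fraction of total demand that falls outside the top `head_size` titles."""
    total = sum(demand)
    return sum(demand[head_size:]) / total


# Assumed catalog: 100,000 titles, of which a physical store stocks the top 100.
demand = zipf_demand(100_000)
share = tail_share(demand, head_size=100)
print(f"Share of demand outside the top 100 titles: {share:.0%}")
```

Under these assumed numbers, the titles outside the top 100 account for more than half of total demand. Real demand curves differ from site to site, so the result should be read as a demonstration of the reasoning rather than as a measurement.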
“Software above the level of a single device.” Web 2.0 applications must strive to make data portable and accessible. Users are coming to expect that their data can be accessed and synchronized across many devices, such as MP3 players, PDAs, cell phones, kiosks and more traditional computing platforms like workstations and laptops. The data must be indifferent to the hardware or operating system platform on which it is accessed.

“Lightweight user interfaces, development models, AND business models.” On this point, the interfaces through which users access data must be “lightweight” and highly portable, but still capable of delivering a rich end-user experience. Programming techniques and frameworks like Ajax and Ruby on Rails can be thought of in this respect. Using commodity off-the-shelf hardware and open source software, and leveraging development communities and users for testing, allows Web 2.0 companies to enter established markets or create new markets at lower cost. It also allows their applications to be in “perpetual beta”, constantly adapting to the changing conditions of the marketplace and the needs of end users.
The Web is the Platform

The Web has become “the destination where it all happens”. The exchange and distribution of ideas, and the ways we socialize, conduct business, work and play, are increasingly finding their way on to the Web. In some cases it has become the primary way in which particular interactions are now conducted. Of course, at the heart of these interactions are the people, applications and data that drive them.

Not many years ago it would have been hard to imagine the Web as a strategically important platform for many of the things that are now commonplace, like trading stocks, booking travel, conducting commerce, bartering for goods and services, finding new or old friends, or even finding a potential life mate. This perception was often due to the fact that the applications making use of the web as a platform were sluggish, had few security controls, were graphically uninteresting, or were held captive by the speed of the end user’s internet connection. When comparing these characteristics against the existing desktop
applications of the day, it is no wonder some people found it hard to imagine that the web could ever be considered a viable platform over the desktop.

Fast forward a few years and web applications are now beginning to provide end-user experiences that are comparable, if not better. Email is a good example of an application that was originally relegated to the desktop if you wanted any advanced features. It can now be accessed over the web from essentially anywhere in the world with an internet connection, with minimal loss in functionality compared to a desktop client. In some cases there is even increased functionality: portability (if we include accessing email over a PDA or cell phone), advanced search capabilities, contact sharing between devices, no need for local backups and almost unlimited “theoretical” storage capacity. A similar trend is well under way for spreadsheets, word processing and calendaring. The evolution we are witnessing is that of the web quickly becoming the next “desktop”, or more specifically, the next operating platform on which applications are designed to run exclusively.
An Architecture of Participation

The concept of “an architecture of participation” is typically used to describe companies, technologies and projects intentionally designed for contribution from developer communities and individual users, with an emphasis on empowerment and openness. This concept is often closely linked to open source projects and companies. It is worth noting that being open source does not automatically mean a technology or company exhibits “an architecture of participation”. However, it is often much easier for open source companies and projects, as they are likely to have a devoted and often vibrant developer community. Proprietary products often find it difficult to cultivate a participatory quality without heavy subsidization. This can be further complicated if the source code is closed or the exposed APIs are complex, making even peripheral contributions difficult.

A “release early and release often” development cycle, characteristic of open source software, is an excellent way to enlist a community of volunteers and parties with vested interests in the software to test and help debug code. New features are often introduced in strategic locations on a website or within an application to help gauge their popularity or usability. This helps developers understand whether a feature should be more widely deployed and enhanced, or abandoned altogether.

“An architecture of participation” also relates to the idea of users creating meaningful and valuable data for themselves. Often the application simply provides the framework and tools to empower users in this capacity. Practical manifestations include seller ratings, user recommendations and restaurant reviews. Some examples of applications which typify this concept include:

• Feeds: Users and applications allow their content to be picked up for distribution to subscribers
• Blogs: Users create site content and drive traffic
• Social Networking: Users create site content and build a network through their social channels
• Wikis: Users contribute articles and manage the content for accuracy and relevance
O’Reilly’s Hierarchy of “Web 2.0-ness”

O’Reilly also articulated a “hierarchy” of the degrees to which an application possesses or typifies Web 2.0 attributes. The levels are described in his article, “Levels of the Game: The Hierarchy of Web 2.0 Applications”, July 17, 2006, and we elaborate on them here.
“Level 3: Could ONLY exist on the Web and draws its essential power from the network and the connections it makes possible between people and applications.” These applications require the collective online activity of users in order to become more valuable. Examples include eBay’s “seller ratings”, Wikipedia’s articles and del.icio.us’s aggregation of tags and sharing of bookmarks. The larger the network of users who contribute to or depend on the data, the more valuable it becomes.

“Level 2: Could exist offline, but has unique advantages by being online.” Photo sharing applications are an example. Unlike desktop photo management applications such as Adobe Photoshop Album or Google’s Picasa, online applications like Flickr gain unique advantages by being online: images can be shared publicly with other users, and they can be indexed and searched online via tags and other metadata.

“Level 1: Can and does exist offline, but gains additional functionality by being online.” This level can usually be assigned to productivity applications which sometimes benefit from collaboration. O’Reilly uses the example of Writely. He points out that this word processing application can be used offline when remarks or comments are not required from others (as is true in the vast majority of cases). But when collaborative editing and review are required, its online attributes make it much more efficient and effective than trying to reconcile the markups from multiple reviewers on the same document individually. The same idea can be extended to calendaring software: unless the calendar needs to be viewed, edited or shared by others, there is little advantage beyond the ability to access it online.

“Level 0: The application has primarily taken hold online, but it would work just as well offline if you had all the data in a local cache.” Keeping all the data locally is, of course, prohibitive in many cases because of the amount of data required, or because the data is licensed or proprietary. O’Reilly’s examples include MapQuest, Yahoo! Local and Google Maps.
MySQL & Web 2.0 Applications

In this section we explore some Web 2.0 applications and the companies delivering them. We also highlight how MySQL is being leveraged in each scenario.
Virtual/Online Communities & Worlds

A virtual/online community or world, sometimes referred to as a “virtual reality”, is a group of people who gather and communicate on the web, often through virtual identities. “Inhabiting” these worlds is typically done for recreational and entertainment purposes. The companies which host virtual communities and worlds make extensive use of Web 2.0 technologies and business models. Communication features like instant messaging, forums and email are typically offered by these sites in order to foster communication among their members. Software tools which permit a high level of customization and personalization of these identities are also made available.

These virtual communities depend upon a high level of social interaction and participation among their members, not only to function but also to remain dynamic and grow. Although communities are typically moderated, especially those where children congregate, some allow the bulk of the moderating to be done by the users themselves, thereby empowering the end users to decide what is acceptable and worthwhile.

Popular examples of virtual/online communities include: Habbo Hotel, Neopets, Linden Lab (Second Life), and Cyworld.
Linden Lab/Second Life and MySQL

Second Life is a 3-D virtual world entirely built and owned by its residents. Since opening to the public in 2003, it has grown explosively and today is inhabited by 370,997 people from around the globe. The Second Life “world” is hosted on servers that are owned and maintained by Linden Lab. Second Life provides users with tools to interact with and modify the virtual world they inhabit. The vast majority of the content in the Second Life world is created by the users themselves. Because of this empowerment and the high degree of participation among its users, the Second Life world is comprised of many rich and diverse cultures. It is also worth noting that Linden Lab actively encourages users to retain the intellectual property rights to any objects they create within the Second Life world.
Additional Characteristics

• Business is growing over 20% per month
• Expected to have anywhere from 1 to over 3 million users by 2007
• At peak times 5000-5500 users working in parallel
• Over 1100 servers and about 2000 CPUs to support the “virtual world”
• Actively replacing several proprietary portions of Second Life with open-source technologies
• MySQL used to manage user accounts, inventories and presence information
• High data storage and performance requirements
• Technology Stack: Debian Linux, Apache, MySQL, PHP/Perl/Python and Ruby on Rails
For more information about Linden Lab/Second Life and MySQL please see:

Interview with Ian Wilkes, Director of Operations at Linden Labs
http://dev.mysql.com/tech-resources/interviews/ian-wilkes-linden-lab.html

My Second Life Runs on MySQL: War Stories from the Metaverse
http://mysqluc.com/presentations/mysql06/wilkes_ian.pdf
Web Syndication & Feeds

A web “feed” is typically an XML-based document which contains content items, often summaries of stories or blog posts with web links to longer versions. News websites, blogs and podcasts are common sources for these feeds, but they are also used to deliver structured information like current weather data. RSS and Atom are currently the two main web feed formats.

Web syndication describes the practice of making an information source, such as a blog, available for feed distribution. It is very similar to other syndicated media like television and radio programs or news stories distributed over “the wire”. Likewise, the contents of a web feed may be shared and posted by other web sites.

Users typically subscribe to feeds directly using aggregators or feed readers, which combine the contents of multiple web feeds for presentation. Subscribing is typically done by manually entering the URL of a feed or by clicking a link on the page.
Because web feeds are designed to be machine-readable rather than human-readable, they can also be leveraged to automatically transfer information from one website to another, without any human intervention.

Popular examples of Web Syndication, Feed Management, Feed Aggregator sites and readers include: FeedBurner, digg, Feedster, MyYahoo!, and Google Reader.
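To illustrate what “machine-readable” means in practice, the following minimal sketch reads the items out of a small RSS 2.0 document using Python's standard library. The feed content, titles and example.com URLs are hand-written placeholders, and `read_items` is a hypothetical helper name; this is not how any particular aggregator named above is implemented.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written RSS 2.0 document; titles and URLs are placeholders.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/</link>
    <description>Placeholder feed for illustration</description>
    <item>
      <title>First post</title>
      <link>http://example.com/posts/1</link>
      <description>Summary of the first post.</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/posts/2</link>
      <description>Summary of the second post.</description>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml: str) -> list[dict[str, str]]:
    """Extract title, link and summary from every <item> in an RSS document."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "summary": item.findtext("description", default=""),
        })
    return items


items = read_items(SAMPLE_FEED)
for entry in items:
    print(entry["title"], "->", entry["link"])
```

An aggregator would repeat this step for each subscribed feed URL and merge the results for presentation. Atom feeds use different element names and an XML namespace, so they would need their own handling.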
FeedBurner and MySQL
FeedBurner is the world's largest feed management provider. Their Web-based services help bloggers, podcasters and commercial publishers promote, deliver and profit from their content on the Web. FeedBurner also offers the largest advertising network for feeds that brings together an unprecedented caliber of content aggregated from the world's leading media companies, A-list bloggers and blog networks and individual publishers.
Additional Characteristics
• Provides services for over 170,000 bloggers, podcasters and publishers
• Handles over 270,000 feeds
• 1 million hits per day
• 11 million subscribers in 190 countries
• MySQL Replication for Scale-Out leveraged for reads and snapshot backups
• Query Cache leveraged in this very high-read environment
For more information about FeedBurner and MySQL please see:
FeedBurner: Scalable Web Applications Using MySQL and Java
http://www.mysqluc.com/cs/mysqluc2006/view/e_sess/8099
Blogs
Weblogs, or blogs as they are more commonly known, are personal websites in a journal or diary format. Text, images, videos and files make up the majority of the content on blogs. They typically allow visitors to post comments and other messages in response to the blogger's posts. “Pingbacks” and “trackbacks” can be leveraged so that readers can easily follow a conversation that spans several blogs.

It is vital for blog building applications and blog hosting sites that the database(s) they leverage are:
• Easy to Use: For administrators and end-users if they must interact directly with the database.
• Reliable: Many users may depend on the service to be available round the clock.
• Scalable: Blogging is incredibly popular and is experiencing explosive growth. The sheer amount of data, coupled with its varying formats, including text, video, audio and image files, puts additional demands on the database.
Popular examples of blog websites include: LiveJournal, Blogger, MySpace, and Wordpress.com.
LiveJournal and MySQL
LiveJournal is a simple-to-use communication tool that lets you express yourself and connect with friends online. You can use LiveJournal in many different ways: as a private journal, a blog, a social network and much more.
Additional Characteristics
• InnoDB for transactions
• MyISAM for logging and static data for fast reads
• MySQL Replication for Scale Out
• Technology Stack: Debian Linux, MySQL, F5 Load Balancing, mod_perl, memcached
For more information about LiveJournal and MySQL please see:
Inside LiveJournal’s Backend
http://www.danga.com/words/2004_mysqlcon/mysql-slides.pdf
LiveJournal’s Backend: A History of Scaling
http://mysqluc.com/cs/mysqluc2005/view/e_sess/6257
Social Networking
Social networking websites enable users to socialize online based on common interests or causes. These sites normally offer an interactive, user-submitted network of blogs, profiles, groups, photos, MP3s, videos and even internal e-mail or messaging systems. It is estimated that there are currently well over 300 hosted social networking websites on the internet.

Popular examples of social networking sites include: MySpace, Friendster, and Mixi.jp.
Mixi.jp and MySQL
Mixi is the largest social networking site in Japan. Members can create diaries, share photos, post messages, and participate in discussions.
Additional Characteristics
• More than 100 MySQL servers in production
• About 10 additional servers are being added every month
• Explosive growth: 3.7 million users signed up within the first 3 years
• Dynamic: 70% are active users (less than 72 hours since last login)
• Technology Stack: Linux, Apache, MySQL, Perl and memcached
• MySQL Replication for Scale-Out
• MySQL leveraged to store meta-data about stored images
“MySQL delivered the right balance of features, reliability, performance and scalability, making MySQL a perfect fit for scaling-out our system.”
Batara Kesuma, CTO, mixi, inc

For more information about Mixi.jp and MySQL please see:
Mixi Delivers Massive Scale-Out with MySQL
http://www.mysql.com/why-mysql/case-studies/mysql-cs-mixi
Mixi.jp: Scaling Out with Open Source
http://mysqluc.com/cs/mysqluc2005/view/e_sess/6257
Wikis
A wiki is a type of site that permits users to easily add, remove and edit content on wiki pages within the web browser without having to know HTML. Wikis also have built-in tools for discussing, tracking and implementing changes to content. This is critical when erroneous or incorrect information is posted and needs to be resolved and then corrected. The high level of interaction and empowerment it gives users makes it an excellent tool for projects which require high degrees of collaboration.

Another interesting characteristic of wikis is that they generally do not maintain any sort of access restrictions, or if they do, they tend to be quite minimal. Wikis are commonly used for both public and private project communication, intranets, documentation and knowledge bases.

Popular examples of public Wiki sites include: Wikipedia, WikiWikiWeb, Wikitravel, World66, and Susning.nu.
Wikipedia and MySQL
Wikipedia is the most popular Wiki site on the internet. It is a free encyclopedia built collaboratively by the general public using Wiki software. From the Wikipedia website, co-founder Jimmy Wales has described Wikipedia as "an effort to create and distribute a multilingual free encyclopedia of the highest possible quality to every single person on the planet in their own language."
Additional Characteristics
• The site receives over 3000 page views per second
• In turn, over 25,000 SQL requests are issued per second
• Technology Stack: Linux, Apache, MySQL, memcached and PHP
• Data is aggregated, compressed, many times duplicated and replicated
• Reliability and availability are key database concerns
• MySQL Replication for Scale-Out
• Low Cost: Despite being in the top 20 of the most heavily trafficked sites in the world, it relies on donations to stay up and running. This requires the flexibility to deliver performance and scalability on commercial off-the-shelf (COTS) hardware components.
For more information about Wikipedia and MySQL please see:
Wikipedia, MySQL and Free Software
http://mysqluc.com/cs/mysqluc2005/view/e_sess/6179
Wikipedia: Cheap and Explosive Scaling with LAMP
http://mysqluc.com/presentations/mysql06/mituzas_wikipedia.pdf
Customized and Advanced Meta Search Engines
This group of Web 2.0 applications includes specialized search engines that can crawl, index and return results about relevant blogs, photos, podcasts, videos, personal media, and classified postings. In general, this is the area of the web where one of the largest challenges is managing the sheer volume of new data constantly being created. Often “mashup” techniques are leveraged to search various data sources and unite them into one comprehensive view.
• Millions of users leverage these search engines either explicitly or through services
• Hundreds of millions of new posts are created every day
• This creates billions of hyperlinks
• In turn, there is a constant expansion of data, meta-data and relationships created every minute
Popular examples of customized and advanced meta search sites include: Craigslist, Feedster, Technorati, and Trulia
Technorati and MySQL
Technorati is the world’s leading search engine of weblogs. Their services help individuals search and organize bloggers and their content and posts. Technorati also tracks other forms of media, including video blogs, podcasts, and movies and videos in real time. All this online activity is monitored and indexed within minutes of posting. Technorati provides a real-time view into the global conversation of the web, helping end-users locate what is interesting, topical, or entertaining to them, as it happens.
According to Technorati’s data:
• About 75,000 new blogs appear each day
• About 1.2 million new blog posts every day, or about 50,000 updates an hour
• Technorati tracks over 50 million blogs and counting
Additional Characteristics
• Data accumulation is done in real time
• Data continues to grow exponentially
• Data is being queried intensively for up-to-the-minute results
• New tag data amounts to almost 600,000 tags a day, and the rate increases every day
• MySQL Replication for Scale Out and asynchronous calculations
• An interoperable open source technology stack
• Ability to deliver a lower cost development and business model
For more information about Technorati and MySQL please see:
Technorati: Scaling the Real Time Web
http://mysqluc.com/presentations/mysql06/carroll_dorion.ppt
File, Image & Video Sharing
File, image and video sharing sites and the online communities which develop around them are considered excellent examples of Web 2.0 applications. These web sites provide users an easy and collaborative way to share personal photographs, videos and files. These types of services are commonly used by bloggers as an easy way to store and manage their online images. Much of these sites' popularity derives from the online community tools they provide, which allow content to be tagged and browsed using a folksonomy.

Popular examples of file, image and video sharing sites include: Flickr, YouTube, SmugMug, Snapfish, and SimpleStar.
“The exponential growth of our PhotoShow platform demands systems that scale rapidly and cost-effectively. MySQL provides the mission-critical, high-volume systems that uniquely meet our needs.”
Mike Edmunds, CEO, SimpleStar
Flickr and MySQL
Flickr is a digital photo sharing website and web services suite, and an online community platform. Flickr allows photo submitters to categorize their images by use of keyword "tags" (a form of metadata), which allow searchers to easily find images concerning a certain topic such as place name or subject matter. Flickr provides rapid access to images tagged with the most popular keywords. Because of its support for user-generated tags, Flickr repeatedly has been cited as a prime example of effective use of folksonomy. Also, Flickr was one of the first websites to implement tag clouds.

Flickr also allows users to categorize their photos into "sets", or groups of photos that fall under the same heading. However, sets are more flexible than the traditional folder-based method of organizing files, as one photo can belong to many sets, or one set, or none at all. Flickr's "sets", then, represent a form of categorical metadata rather than a physical hierarchy.
Additional Characteristics
• Technology Stack: RedHat Linux, Apache, MySQL, PHP, Perl, and Java
• Over 25,000 transactions per second at peak times
• MySQL Replication for Scale Out
• Full Text search
For more information about Flickr and MySQL please see:
Flickr and PHP
http://www.niallkennedy.com/blog/uploads/flickr_php.pdf
Online Gaming
Online gaming can be broken down into roughly two main categories: games which involve wagering and games which do not. Non-wagering games are typically known as massively multiplayer online games (MMOGs). This year, DFC Intelligence sized the online gaming market at about $3.4 billion, with growth expected to exceed $13 billion by 2011. Curiously, MMOGs are expected to remain the leading game category, despite appealing to a smaller segment of players.

One of the Web 2.0 competencies that MMOG companies employ is the reliance on online channels to deliver their gaming services. This allows them to adopt a business model that relies on subscriptions rather than shrink-wrapped products on retail shelves. Their business models also exist 100% online and often make use of the “long tail” of customer self-service, creating many games and offering almost limitless customization options. MMOGs are incredibly popular and are experiencing unprecedented growth, with several of the top games having millions of subscribers.

Popular examples of online gaming sites include: Mythic Entertainment (Dark Age of Camelot), Ongame (PokerRoom.com), SimDynasty, and TombRaiderChronicles.
PokerRoom.com and MySQL
Ongame’s PokerRoom.com is one of the largest poker sites online. During peak hours, more than 12,000 players occupy poker tables, playing over 2 million hands of poker a day. Since each bet and each played hand needs to be recorded, the database is required to handle up to 2,000 transactions per second. This means well over 100 million queries are being issued to a MySQL database. The database comprises more than 300 tables holding upwards of 20 gigabytes of data. The largest table, which logs played poker hands, contains over 30 million rows of data.
“When growing at our speed, it’s essential that tools are easy to use, since there simply isn’t time to back for three months and carefully make design changes.”
Anders Thor, DBA, Ongame
Additional Characteristics
• Technology Stack: RedHat Linux and MySQL
For more information about PokerRoom.com and MySQL please see:
PokerRoom.com Powers High Transaction Online Poker System with MySQL and HP
http://www.mysql.com/why-mysql/case-studies/mysql-hp-ongame-casestudy.pdf
Technology Requirements of Web 2.0
Next we will examine some of the technological components commonly employed by many Web 2.0 companies when developing and delivering applications. All of those we will explore are low cost and open source. They are also highly portable and can scale very well on commodity-based hardware platforms. This is critical for Web 2.0 entrepreneurs, who choose MySQL overwhelmingly because it is affordable, open source and does the job reliably.

As their businesses grow, accommodating increasing numbers of users and more traffic, they develop additional reasons for staying with MySQL as their database platform. This is because MySQL offers advanced feature sets and, when used in conjunction with Scale Out, allows companies to multiply their traffic and users while keeping costs low. Ultimately this allows a company to increase its margins and profits as it supports ever increasing volumes of data.
The Benefits of Community & Open Source
As mentioned, many of the components that Web 2.0 companies rely on to deliver their applications and services are open source. These companies derive several key benefits from building on open source software components:
• Lower Total Cost of Ownership: This enables lower up-front and long-term costs associated with the development, delivery and execution of new business models.
• Component & Application Freedom: Choosing open source software prevents vendor lock-in to specific hardware or software stacks. This high degree of interoperability across hardware and software platforms can also be leveraged against the abundance of tools and applications for use throughout the development cycle, from design and modeling, to testing, versioning and day-to-day operations management.
• LAMP: Fortunately, there is a proven open source stack which has been consistently leveraged by companies big and small to deliver scalable, cost-effective, and interoperable applications. This has been achieved by leveraging the tight integration of Linux (operating system), Apache (web server), MySQL (database) and PHP/Perl/Python (programming and scripting).
• Support: Support for open source components can usually be obtained at no cost from a community of users and developers. It can also be purchased in the form of professional consulting and technical support, which is available from most of the open source vendors, as well as from both large and small ISVs. In this regard, MySQL offers both professional consulting services and technical support through a subscription to MySQL Network. To learn more about MySQL’s professional services and MySQL Network please visit:
http://www.mysql.com/consulting/
http://www.mysql.com/network/

“Without the LAMP software stack, many Web 2.0 companies would have never got off the ground.”
Tim O’Reilly, CEO, O’Reilly Media
Linux
Linux is a Unix-like operating system which is free and open source. All of the source code is available for anyone to use, modify and redistribute. This is in contrast to proprietary operating systems like Windows. Linux has been around since the early 1990s and has at this point become the fastest growing operating system in the world. Much of this success can be attributed to the fact that it is a low-cost, secure, scalable and highly interoperable alternative to proprietary operating systems. Linux can be found running on everything from hand-held devices to hardware components, desktop computers to massive computing clusters. All of these characteristics make it an excellent choice for Web 2.0 applications.

Linux shares a long tradition of compatibility with MySQL and other open source components by serving as the operating system component of the LAMP (Linux, Apache, MySQL, Perl, Python & PHP) technology stack. MySQL offers binaries and support on many popular Linux distributions like RedHat, Debian, SUSE and Ubuntu. Generic RPMs & TAR packages are also available for other, more specialized distributions.
Apache
The Apache HTTP Server Project is an open-source Web server. It is known for being secure, highly portable across many operating systems, efficient in utilizing resources and extensible. According to a Netcraft Web Survey in February 2006, the Apache Web Server continues to be the world’s most popular web server, with over 70% of websites leveraging it within their technology stacks. Apache has been extended with compiled modules for interfacing with Perl, Python and PHP. The Apache Web Server also shares a long tradition of compatibility with MySQL and other open source components by serving as the web server component of the LAMP stack.

We should also note the emerging popularity of another web server called lighttpd, also compatible with MySQL, which is especially popular with developers using Ruby on Rails.
MySQL
Within the LAMP stack, MySQL is the database component. The database component serves as the critical piece of software which manages the data leveraged by the applications and web servers. MySQL is a multithreaded, multi-user, SQL Database Management System (DBMS) with over six million installations. MySQL is the database of choice for consistently delivering lower TCO, reliability, performance and ease of use. Many of the largest and fastest growing Web 2.0 companies are designing, developing and going into production using MySQL. As their needs grow in terms of capacity, availability and performance, MySQL continues to assist in satisfying these requirements.

As with many of the components which make up the LAMP stack, MySQL is made available under the GNU General Public License. We discuss in more depth the characteristics and features which make MySQL such a popular choice for Web 2.0 companies in Section 5.
PHP, Perl and Python
PHP, Perl and Python comprise the dynamic programming and scripting language components of the LAMP stack. Some shared characteristics which make their use appealing to Web 2.0 developers are that they are all free, open source, highly interoperable and well suited to dynamic database-backed websites and applications.

PHP: Hypertext Preprocessor, or simply PHP, is an open-source language for producing dynamic web content, mainly in server-side applications. PHP typically runs on a web server, taking PHP code as its input and rendering Web pages as output. PHP is a very popular server-side alternative to Microsoft’s ASP.NET and Adobe’s ColdFusion. PHP works extremely well with all the components within the LAMP stack. According to php.net, it is estimated that over 20 million domains on the internet make use of the language. PHP includes many free and open source libraries. PHP actually provides two different MySQL API extensions:
• mysql: available for PHP versions 4 and 5; intended for use with MySQL versions prior to MySQL 4.1.
• mysqli: stands for “MySQL, Improved”; available only in PHP 5. It is intended for use with MySQL 4.1.1 and later. This extension fully supports the authentication protocol used in MySQL 5.0, as well as the Prepared Statements and Multiple Statements APIs. In addition, this extension provides an advanced, object-oriented programming interface.
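The idea behind prepared statements can be shown in a few lines. The sketch below is not the mysqli API (which is a PHP extension); it uses parameter binding in Python's standard-library sqlite3 module as a stand-in database so that it runs without a server. The `users` table, its rows and the `find_user` helper are made up for illustration.

```python
import sqlite3

# In-memory stand-in database (an assumption for the sketch, not MySQL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)",
                 [("alice",), ("bob",)])


def find_user(connection, name):
    """Look up a user with a bound parameter instead of string concatenation.

    The SQL text stays fixed; the value is sent separately, so input that
    looks like SQL is treated as plain data.
    """
    cursor = connection.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,))
    return cursor.fetchone()


print(find_user(conn, "alice"))
print(find_user(conn, "x' OR '1'='1"))  # treated as a literal name, no match
```

Keeping the statement text separate from its values lets the server reuse the parsed statement and removes a common source of SQL injection.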
Perl is a dynamic procedural language often used for CGI scripts or as “glue” tying together systems and interfaces not specifically designed to be interoperable. CGI, or Common Gateway Interface, is a standard protocol for interfacing external application software with an information server, commonly a web server. This allows the server to pass requests from a client web browser to the external application. The web server can then return the output from the application to the web browser.

Python is an open source scripting language similar to Perl and Ruby. It has been used to develop many large software projects such as the Zope application server and the BitTorrent file sharing system. According to wiki.python.org, it is also used extensively by websites like Google, Yahoo Groups and Yahoo Maps.
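The CGI exchange described above can be sketched in a few lines of Python. Under CGI the web server passes request details in environment variables such as QUERY_STRING and relays whatever the script writes to standard output back to the browser. The `name` parameter and the `build_response` function are hypothetical, chosen only for this example.

```python
import os
from html import escape
from urllib.parse import parse_qs


def build_response(environ):
    """Build a complete CGI response: headers, a blank line, then the body."""
    # The server places the part of the URL after "?" in QUERY_STRING.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    # Escape user input before placing it in HTML.
    body = "<html><body><p>Hello, {}!</p></body></html>".format(escape(name))
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # The web server runs the script once per request and forwards stdout.
    print(build_response(os.environ), end="")
```

Because a new process is started for every request, high-traffic sites tend to prefer embedded interpreters such as the Apache modules mentioned earlier.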
Ruby on Rails
Ruby on Rails is a free and open source framework written in Ruby, optimized for rapidly developing database-driven web-based applications. This easy-to-use framework requires less code and minimal configuration. Although Ruby on Rails ships with a default database and web server components, production environments typically rely on the addition of MySQL and Apache. It is also characterized as being highly interoperable across a variety of platforms and components.

Web 2.0 application development typically requires that features and enhancements be developed quickly. An emphasis is also placed on reusability and availability so other applications and web services can consume them. Ruby on Rails does an excellent job of providing a framework to achieve these requirements.

For Ruby on Rails developers working with MySQL, there is the MySQL-Ruby module. It is a Ruby API for accessing the MySQL server. It has the same functions as the very popular MySQL C API. It is available at: http://www.tmtm.org/en/mysql/ruby/ .
Ajax
Asynchronous JavaScript and XML, or Ajax as it is more commonly known, is a development technique for creating rich, visually appealing and interactive web applications. Web pages which leverage Ajax are more responsive because they exchange smaller amounts of data with the web server. This means that the entire web page does not have to be “refreshed” or completely reloaded in the user’s browser after each interaction. This results in web pages which have increased interactivity, speed and usability. These are all key components to help Web 2.0 companies deliver applications which are highly interactive and deliver rich end-user experiences which rival those of desktop applications. For these reasons, Ajax is already being widely leveraged in both consumer and business applications.

The core components of the Ajax technique include:
• XHTML (or HTML) and CSS, leveraged for markup and style information
• The Document Object Model (DOM), which represents an HTML or XML document as a tree structure and is accessed with a client-side scripting language, usually JavaScript
• The XMLHttpRequest object, used to exchange data asynchronously with the web server
• XML as the format to transfer data between the server and client
memcached
memcached is a popular open source distributed memory caching system originally developed by Danga Interactive for the blogging website LiveJournal. It is traditionally leveraged to enhance the performance and responsiveness of dynamic content websites backed by databases. This is achieved by caching data and objects in memory, thereby reducing the amount of data that needs to be read from the database. The performance characteristics it can deliver are faster page loading for users, more efficient resource utilization, and faster database access times in the event of a memcached miss.

In more technical detail, memcached acts as a large hash table, caching data as it is being requested by clients. Although it was originally designed to improve the performance of database queries, it has been extended to cache server-side objects as well. In essence, any operation which is resource or time intensive can benefit from the use of memcached.

It goes without saying that this technology is of great advantage to Web 2.0 applications, which by definition are very dynamic and data driven. This is in contrast to the static, non-interactive web sites characteristic of the early years of the Web. Many websites that make use of memcached, such as LiveJournal, Slashdot and Wikipedia, also make use of MySQL.
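The caching behavior described above is often called the cache-aside pattern: look in the cache first, fall back to the database on a miss, then store the result for the next request. The sketch below illustrates it under loud assumptions: a plain Python dictionary stands in for a memcached client, and `SlowDatabase`, `CachedProfiles` and the key format are hypothetical names invented for this example.

```python
class SlowDatabase:
    """Stand-in for a MySQL server; counts how often it is queried."""

    def __init__(self, rows):
        self.rows = rows
        self.queries = 0

    def load_profile(self, user_id):
        self.queries += 1
        return self.rows[user_id]


class CachedProfiles:
    """Cache-aside access: check the cache, then fall back to the database."""

    def __init__(self, database):
        self.database = database
        self.cache = {}  # stand-in for a memcached client (a hash table)

    def get(self, user_id):
        key = "profile:%d" % user_id
        if key in self.cache:            # cache hit: no database work
            return self.cache[key]
        profile = self.database.load_profile(user_id)  # cache miss
        self.cache[key] = profile        # remember it for later requests
        return profile

    def update(self, user_id, profile):
        self.database.rows[user_id] = profile
        # Drop the stale entry so the next read reloads fresh data.
        self.cache.pop("profile:%d" % user_id, None)


db = SlowDatabase({1: "alice"})
profiles = CachedProfiles(db)
profiles.get(1)
profiles.get(1)
print(db.queries)  # only the first read reached the database
```

Invalidating on update, rather than rewriting the cached value, is one common design choice; it keeps the write path simple at the cost of one extra miss.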
How MySQL Powers Web 2.0
The importance of data within the context of Web 2.0 cannot be overstated. An overarching goal of Web 2.0 application design is for the system to harness collective intelligence and network effects so that ownership of unique and difficult-to-reproduce data can be established. Obviously, at the core of this story is the database which will store and dispense the data. O’Reilly has gone so far as to state in his “What is Web 2.0” article that “Data is the Next Intel Inside”.

There are many reasons why MySQL is being leveraged time and again by established web companies and new emerging Web 2.0 companies. The most glaring one is cost. Web 2.0 entrepreneurs
“Fail Fast – Scale Fast”
In a July 10, 2006 article entitled “Operations: The New Secret Sauce”, Tim O’Reilly states that “Web 2.0 has been summed up as ‘Fail Fast, Scale Fast.’” The point being made is that the operational environment will be a key differentiator in the success of Web 2.0’s adoption, especially in the enterprise. These services are required to fail over and scale transparently without any disruption in service to the consumer. The ability to handle explosive growth in user volume, transactions and storage capacity is critical to delivering services on a web-based platform. To this end, MySQL addresses these types of issues via Scale Out, MySQL Network and an ecosystem of partner solutions.
Scale Out vs. Scale Up
Scale-out using MySQL enables organizations to cost-effectively solve database capacity issues that result from increased traffic and transaction volumes. In particular, scale-out with MySQL provides organizations with the following advantages:
• Easily and cost-effectively add capacity to your database infrastructure using open source software and commodity hardware.
• Reduce hardware costs by incrementally adding several low-cost commodity systems rather than upgrading high-cost mainframe-class systems.
• Reduce software costs and eliminate up-front licensing by scaling out with MySQL.
• Improve response time and availability by improving the performance of your system so users experience fewer interruptions.
• Improve scalability using MySQL Replication to distribute large workloads to individual server nodes for execution.
• Increase flexibility by right-sizing the initial purchase of commodity hardware and software, then incrementally adding capacity.
Virtually any application that has a rapidly growing volume of users, transactions or data may be a candidate for more cost-effective deployment using open source technology combined with a scale-out architecture. MySQL is widely used for the following Scale-Out architectures:
• Web Scale-Out to improve the performance, scalability, and availability of web applications such as e-commerce, content management, session management, search, and security.
• Data Warehousing Scale-Out to improve the performance and availability of traditional data warehousing (e.g. centralized data warehouse and data marts) as well as real-time Operational Data Stores.

Scale-Up
– Vertical
– Expensive SMP hardware
– Proprietary software
– Platform lock-in
– “Fork Lift” to increase capacity & performance

Scale-Out
– Horizontal
– Commodity Intel/AMD hardware
– Open source software
– Platform independence
– Add servers to increase capacity & performance
For more information on how to Scale Out with MySQL, consult the white paper titled “Guide to Cost-effective Database Scale-Out using MySQL”, available at: http://www.mysql.com/why-mysql/white-papers/
MySQL Replication
MySQL Replication is the key enabler of “Scale Out” discussed in the previous section. Scale Out is leveraged extensively by Web 2.0 sites and applications. MySQL natively supports one-way, asynchronous replication. Replication works by simply having one server act as a master, while one or more servers act as slaves. This asynchronous approach is in contrast to the synchronous replication which is a characteristic of MySQL Cluster.

Asynchronous data replication means that data is copied from one machine to another, with a resultant delay. Often this delay is determined by networking bandwidth, resource availability or a predetermined time interval set by the administrator. However, with the correct components and tuning, replication itself can appear to be almost instantaneous to most applications. Synchronous data replication implies that data is committed to one or more machines at the same time, usually via what is commonly known as a “two-phase commit”.

In standard MySQL Replication, the master server writes updates to its binary log files and maintains an index of those files in order to keep track of the log rotation. The binary log files serve as a record of updates to be sent to slave servers. When a slave connects to its master, it determines the last position it has read in the logs on its last successful update. The slave then receives any updates which have taken place since that time. The slave subsequently blocks and waits for the master to notify it of new updates.

Figure 1 illustrates a basic Scale Out implementation using MySQL Replication.
[Figure 1: Basic Scale Out using MySQL Replication. Components shown: three Web/App Servers, a Load Balancer, a Master MySQL Server (writes and reads) and two Slave MySQL Servers (reads), with replication from the master to the slaves.]

Replication offers the benefits of reliability, performance, and ease of use:
• In the event the master fails, the application can be designed to switch to the slave.
• Better response time for clients can be achieved by splitting the load for processing client queries between the master and slave servers. Queries which simply “read” data, such as SELECTs, may be sent to the slave in order to reduce the query processing load on the master. Statements that modify data should be sent to the master so that the data on the master and slave do not get out of sync. This load-balancing strategy is effective if non-updating queries dominate. (This is normally the case.)
• Another benefit of using replication is that database backups can be performed using a slave server without impacting the resources on the master. The master continues to process updates while the backup is being made.
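The replication flow and the read/write split described in this section can be modeled with a toy simulation. This is only a sketch of the idea, not MySQL's implementation: the `Master`, `Slave` and `Router` classes are hypothetical, the "binary log" is a Python list, and each slave pulls updates when `catch_up` is called so that the asynchronous delay is visible.

```python
import itertools


class Master:
    """Accepts writes and records each one in an ordered log."""

    def __init__(self):
        self.data = {}
        self.binlog = []  # stand-in for the binary log files

    def write(self, key, value):
        self.data[key] = value
        self.binlog.append((key, value))


class Slave:
    """Keeps a copy of the data and remembers its position in the log."""

    def __init__(self, master):
        self.master = master
        self.data = {}
        self.position = 0  # last log position this slave has applied

    def catch_up(self):
        """Apply every update logged since the last position read."""
        pending = self.master.binlog[self.position:]
        for key, value in pending:
            self.data[key] = value
        self.position += len(pending)
        return len(pending)


class Router:
    """Sends writes to the master and spreads reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self._next_slave = itertools.cycle(slaves)

    def write(self, key, value):
        self.master.write(key, value)

    def read(self, key):
        return next(self._next_slave).data.get(key)


master = Master()
slaves = [Slave(master), Slave(master)]
router = Router(master, slaves)

router.write("user:1", "alice")
print(router.read("user:1"))  # a slave that has not caught up yet
for slave in slaves:
    slave.catch_up()
print(router.read("user:1"))  # the update is now visible on the slaves
```

The first read shows why applications that must see their own writes immediately sometimes send those particular reads to the master.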
MySQL Cluster for High Availability Database high availability is becoming more and more relevant as entire business models begin to be based on the premise that the underlying data that drives a company’s applications must be consistently available. MySQL Cluster has begun to play a critical role in many of these operations, specifically in the area of “session management”. MySQL Cluster was originally designed to meet the throughput and response time requirements needed by some of the most demanding enterprise applications in the world. In a nutshell, MySQL Cluster can be described as a shared-nothing, synchronous database cluster which supports automatic fail over, transactions and in-memory data storage without any special networking, hardware or storage requirements. Designing the system in this way allows MySQL Cluster to deliver both highly availability and reliability, since single points of failure have been eliminated. Any node can fail
without affecting the system as a whole. An application, for example, can continue executing transactions even though a data node has failed. MySQL Cluster has also been shown to handle tens of thousands of distributed transactions per second, replicated across data nodes.

As of version 5.1, MySQL Cluster supports data storage not only in main memory (RAM), but also on disk. In-memory storage increases application performance and limits I/O bottlenecks, because transaction logs are written to disk asynchronously. Disk-data support, in turn, lets the cluster hold ever larger data sets that do not require the performance characteristics of in-memory data.

MySQL Cluster delivers an extremely fast, sub-second fail over, so your applications can recover quickly in the event of a software, network or hardware failure. MySQL Cluster uses synchronous replication to propagate transaction information to all the appropriate data nodes. This also eliminates the time-consuming operation of recreating and replaying log files that is typically required by clusters employing shared-disk architectures. MySQL Cluster data nodes are also able to automatically restart, recover, and dynamically reconfigure themselves in the event of failures, without developers having to program any fail over logic into their applications.

MySQL Cluster implements automatic node recovery, which ensures that any fail over to another data node will find a consistent set of data. Should all the data nodes fail due to hardware faults, MySQL Cluster ensures an entire system can be safely recovered in a consistent state by using a combination of checkpoints and log execution. Furthermore, as of version 5.1, MySQL Cluster ensures systems are available and consistent across geographies by enabling entire clusters to be replicated across regions.
We have illustrated an example MySQL Cluster architecture below in Figure 2. A brief description of the main components of a MySQL Cluster follows as well.

• Data Nodes are the main nodes of the system. All data is stored on these nodes. Data is replicated between data nodes to ensure it is continuously available in case one or more of the data nodes fail. These data nodes, in turn, handle all database transactions.
• Management Nodes handle the cluster configuration and are used to change the setup of the system. Only one management server node is required, but there is also the option of running additional management nodes in order to increase fault tolerance. The management node is only used at startup or during a system re-configuration, which means the cluster is operable without the management node being online.
• MySQL Nodes are the MySQL Servers accessing the clustered data nodes. By incorporating this design, the MySQL Server provides developers a standard SQL interface to program their applications against. This eliminates the need for any special application programming in order to interact with the cluster.
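The behaviour of replicated data nodes can be made concrete with a toy model in Python. It is a sketch of the general idea only (synchronous writes to every live replica, reads from any live replica, recovery by copying from a peer) and makes no claim about how the NDB storage engine is implemented. The `DataNode` and `NodeGroup` classes and the session keys are hypothetical names chosen for the example.

```python
class DataNode:
    """One in-memory data node (a toy stand-in, not the NDB engine)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.alive = True


class NodeGroup:
    """A set of data nodes that each hold a full copy of the same rows."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def _live(self):
        live = [n for n in self.nodes if n.alive]
        if not live:
            raise RuntimeError("no live data node in this group")
        return live

    def write(self, key, value):
        # Synchronous replication: the write returns only after every
        # live replica has applied it.
        for node in self._live():
            node.rows[key] = value

    def read(self, key):
        # Any live replica can answer, since all live replicas are in sync.
        return self._live()[0].rows[key]

    def restart(self, node):
        # Node recovery: copy the current rows from a live peer, then rejoin.
        node.rows = dict(self._live()[0].rows)
        node.alive = True


a, b = DataNode("A"), DataNode("B")
group = NodeGroup([a, b])
group.write("session:42", "cart=3")
a.alive = False                       # simulate a data node failure
group.write("session:43", "cart=1")   # the group keeps accepting writes
print(group.read("session:42"))       # cart=3
group.restart(a)
print(a.rows == b.rows)               # True
```

The application code in the last few lines contains no fail over logic of its own, which is the property the text above attributes to MySQL Cluster.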
[Figure 2: MySQL Cluster. Diagram labels: Web/App Servers; MySQL Servers or NDB API for all writes and reads; NDB Storage Engine; Data Nodes (memory and disk); Management Servers.]

For more information concerning MySQL Cluster and how MySQL can be part of your session management architecture, please visit:
http://www.mysql.com/products/database/cluster/
http://www.mysql.com/why-mysql/white-papers/
MySQL Query Cache

Another feature that is used extensively by Web 2.0 applications is MySQL’s Query Cache. Accessing application data, query plans, or database metadata in RAM is much faster than repeatedly retrieving that same information from disk or building it from scratch. The MySQL Query Cache, introduced in 4.0.1, can deliver excellent gains in the response times of both basic and resource-intensive SQL statements. The MySQL Query Cache stores the text of SELECT queries issued by clients to the MySQL database server, together with their results. If an identical statement is received, the results are returned from the query cache rather than parsing and executing the statement again. Other characteristics of the MySQL Query Cache include:

• No stale data is ever returned to clients.
• Cached data is flushed whenever an UPDATE or other statement modifies the underlying table, which invalidates the cached result set.
• The query cache is not applicable for server-side prepared statements.
• The expected overhead for enabling the query cache is about 10-15%. However, queries can run anywhere from 200 to 250% faster when the cache is used correctly.
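The two central rules above, reuse on an identical statement and invalidation when a table changes, can be illustrated with a small Python model. This is a sketch of the concept, not of the server's internals. The `QueryCacheModel` class is hypothetical, and the caller passes in the table names explicitly, whereas the real server determines them itself.

```python
class QueryCacheModel:
    """Toy model of a result cache keyed by exact statement text."""

    def __init__(self, execute):
        self._execute = execute   # callable that actually runs a SELECT
        self._results = {}        # statement text -> cached result
        self._by_table = {}       # table name -> set of statement texts
        self.hits = 0

    def select(self, sql, tables):
        """Return the result for `sql`, reusing a cached copy if one exists."""
        if sql in self._results:  # only byte-for-byte identical text matches
            self.hits += 1
            return self._results[sql]
        result = self._execute(sql)
        self._results[sql] = result
        for table in tables:
            self._by_table.setdefault(table, set()).add(sql)
        return result

    def table_modified(self, table):
        """Drop every cached result that read from `table`."""
        for sql in self._by_table.pop(table, set()):
            self._results.pop(sql, None)


calls = []


def run(sql):
    # Stand-in for real query execution; records each statement it runs.
    calls.append(sql)
    return [("row", len(calls))]


cache = QueryCacheModel(run)
q = "SELECT name FROM users WHERE id = 1"
cache.select(q, ["users"])
cache.select(q, ["users"])       # served from the cache
cache.table_modified("users")    # e.g. after an UPDATE on users
cache.select(q, ["users"])       # executed again
print(len(calls), cache.hits)    # 2 1
```

Because the lookup key is the exact statement text, two queries that differ only in letter case or spacing are cached separately in this model, so applications benefit most when they issue repeated queries with identical text.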
The MySQL Query Cache should be enabled in the following scenarios:
• Identical queries are issued by the same or multiple clients on a repetitive basis.
• The underlying data being accessed is static or semi-static in nature.
• Queries are resource intensive, or are brief but produce result sets that are computed in a more complex manner.
• The data will be presented across many successive web pages that a user may be navigating through.
For more information about MySQL’s Query Cache visit:
http://dev.mysql.com/tech-resources/articles/mysql-query-cache.html
Pluggable Storage Engine Architecture

MySQL's unique Pluggable Storage Engine Architecture (PSEA) gives the developers of database-driven Web 2.0 applications the flexibility to choose from a portfolio of purpose-built storage engines that are optimized for specific application domains: OLTP, Read-Intensive Scale-out, High-Availability Clustering, Data Archiving, Data Warehousing, and more. The PSEA also provides a standard set of server, drivers, tools, management, and support services that are leveraged across all the underlying storage engines. Figure 3 below shows how the PSEA fits into the overall design of the MySQL Server.
[Figure 3: Pluggable Storage Engine Architecture. Diagram labels: Connectors (Native C API, JDBC, ODBC, .NET, PHP, Python, Perl, Ruby, VB); MySQL Server with Connection Pool, Enterprise Management Services & Utilities, SQL Interface, Parser, Optimizer, and Caches & Buffers; Pluggable Storage Engines (MyISAM, InnoDB, Cluster, Falcon, Archive, Federated, Merge, Memory, Partner, Community, Custom); File System (NTFS, NFS, SAN, NAS); Files & Logs (redo, undo, data, index, binary, error, query, and slow).]

For more information about MySQL’s Pluggable Storage Engine Architecture visit:
http://solutions.mysql.com/engines.html
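In practice, the storage engine is chosen per table through the ENGINE clause of CREATE TABLE. The Python sketch below builds such statements as strings. The workload-to-engine mapping is an assumption made for illustration, not a recommendation from this paper, and the helper `create_table_sql` and the table definitions are hypothetical. The function does not escape identifiers, so it is not suitable for untrusted input.

```python
# Hypothetical mapping from workload type to a MySQL storage engine name.
ENGINE_FOR_WORKLOAD = {
    "oltp": "InnoDB",
    "read_intensive": "MyISAM",
    "high_availability": "NDBCLUSTER",
    "archive": "ARCHIVE",
    "in_memory": "MEMORY",
}


def create_table_sql(table, columns, workload):
    """Build a CREATE TABLE statement that names the engine explicitly."""
    engine = ENGINE_FOR_WORKLOAD[workload]
    column_sql = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return f"CREATE TABLE {table} ({column_sql}) ENGINE={engine}"


# Example tables, invented for this sketch.
print(create_table_sql(
    "orders", [("id", "INT PRIMARY KEY"), ("total", "DECIMAL(10,2)")], "oltp"))
print(create_table_sql(
    "page_views", [("url", "VARCHAR(255)"), ("seen_at", "DATETIME")], "archive"))
```

Because the engine is a property of each table, one database can mix engines: for example, a transactional engine for orders and an archive engine for historical logs, all reached through the same connectors and SQL interface.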
10 Reasons to Choose MySQL for Web 2.0 Applications

• Scalability and Flexibility: The MySQL database server provides the ultimate in scalability, with the capacity to handle everything from deeply embedded applications with a footprint of only 1MB to massive data warehouses holding terabytes of information. Platform flexibility is a stalwart feature of MySQL with all flavors of Linux, UNIX, and Windows being supported. And, of course, the open source nature of MySQL allows complete customization for those wanting to add unique requirements to the database server.

• High Performance: A unique storage-engine architecture allows database professionals to configure the MySQL database server specifically for particular applications, resulting in amazing performance. Whether the intended application is a high-speed transactional processing system or a high-volume web site that services a billion queries a day, MySQL can meet the most demanding performance expectations of any system. With high-speed load utilities, distinctive memory caches, full text indexes, and other performance-enhancing mechanisms, MySQL offers all the right ammunition for today's critical business systems.

• High Availability: Rock-solid reliability and constant availability are hallmarks of MySQL, with customers relying on MySQL to guarantee around-the-clock uptime. MySQL offers a variety of high-availability options from high-speed master/slave replication configurations, to specialized Cluster servers offering instant fail over, to third party vendors offering unique high-availability solutions for the MySQL database server.

• Robust Transactional Support: MySQL offers one of the most powerful transactional database engines on the market. Features include complete ACID (atomic, consistent, isolated, durable) transaction support, unlimited row-level locking, distributed transaction capability, and multiversion transaction support where readers never block writers and vice-versa. Full data integrity is also assured through server-enforced referential integrity, specialized transaction isolation levels, and instant deadlock detection.

• Web and Data Warehouse Strengths: MySQL is the de-facto standard for high-traffic web sites because of its high-performance query engine, tremendously fast data insert capability, and strong support for specialized web functions like fast full text searches. These same strengths also apply to data warehousing environments where MySQL scales up into the terabyte range for either single servers or scale-out architectures. Other features like main memory tables, B-tree and hash indexes, and compressed archive tables that reduce storage requirements by up to eighty percent make MySQL a strong standout for both web and business intelligence applications.

• Strong Data Protection: Because guarding the data assets of corporations is the number one job of database professionals, MySQL offers exceptional security features that ensure absolute data protection. In terms of database authentication, MySQL provides powerful mechanisms for ensuring only authorized users have entry to the database server, including the ability to block users down to the client machine level. SSH and SSL support are also provided to ensure safe and secure connections. A granular object privilege framework is present so that users only see the data they should, and powerful data encryption and decryption functions ensure that sensitive data is protected from unauthorized viewing. Finally, backup and recovery utilities provided through MySQL and third party software vendors allow for complete logical and physical backup as well as full and point-in-time recovery.
• Comprehensive Application Development: One of the reasons MySQL is the world's most popular open source database is that it provides comprehensive support for every application development need. Within the database, support can be found for stored procedures, triggers, functions, views, cursors, ANSI-standard SQL, and more. For embedded applications, plug-in libraries are available to embed MySQL database support into nearly any application. MySQL also provides connectors and drivers (ODBC, JDBC, etc.) that allow all forms of applications to make use of MySQL as a preferred data management server. It doesn't matter if it's PHP, Perl, Java, Visual Basic, or .NET, MySQL offers application developers everything they need to be successful in building database-driven information systems.

• Management Ease: MySQL offers exceptional quick-start capability with the average time from software download to installation completion being less than fifteen minutes. This rule holds true whether the platform is Microsoft Windows, Linux, Macintosh, or UNIX. Once installed, self-management features like automatic space expansion, auto-restart, and dynamic configuration changes take much of the burden off already overworked database administrators. MySQL also provides a complete suite of graphical management and migration tools that allow a DBA to manage, troubleshoot, and control the operation of many MySQL servers from a single workstation. Many third party software vendor tools are also available for MySQL that handle tasks ranging from data design and ETL, to complete database administration, job management, and performance monitoring.

• Open Source Freedom and 24 x 7 Support: Many corporations are hesitant to fully commit to open source software because they believe they can't get the type of support or professional service safety nets they currently rely on with proprietary software to ensure the overall success of their key applications. Questions of indemnification come up often as well. These worries can be put to rest with MySQL, as complete around-the-clock support as well as indemnification is available through MySQL Network. MySQL is not a typical open source project, as all the software is owned and supported by MySQL AB, and because of this, a unique cost and support model is available that provides a combination of open source freedom and trusted software with support.
• Lowest Total Cost of Ownership: By migrating current database-driven applications to MySQL, or using MySQL for new development projects, corporations are realizing cost savings that many times stretch into seven figures. By using the MySQL database server and scale-out architectures that utilize low-cost commodity hardware, corporations are finding that they can achieve amazing levels of scalability and performance, all at a cost that is far less than that offered by proprietary and scale-up software vendors. In addition, the reliability and easy maintainability of MySQL means that database administrators don't waste time troubleshooting performance or downtime issues, but instead can concentrate on making a positive impact on higher level tasks that involve the business side of data.

“We would not have been able to achieve our goals so quickly without the high-quality support we have received through MySQL Network. The rapid response and creative solutions provided by MySQL’s trained staff have been invaluable.”
Chris Lunt, Dir of Engineering, Friendster
Conclusion

As we have seen, a common characteristic of many top emerging Web 2.0 companies and of established web properties is that they rely on MySQL as a critical piece of infrastructure within their technology stacks to deliver performance, scalability and reliability to millions of users. The MySQL database server consistently offers a lower total cost of ownership without having to sacrifice performance, reliability or
ease of use in the process. Its open source development model and community, together with its ubiquity, its interoperability, and the ability to implement scale-out, make MySQL the logical choice for the next generation of Web 2.0 web sites, applications, services and companies. Web 2.0 is the future, and open source and MySQL are the fabric. It has never been easier to take advantage of new concepts and technologies at such a low cost. Welcome to World 2.0!
About MySQL

MySQL AB develops and markets a family of high performance, affordable database servers and tools. Our mission is to make superior data management available and affordable for all. We contribute to building mission-critical, high-volume systems and products worldwide. MySQL AB is defining a new database standard. This is based on its dedication to providing a less complicated solution suitable for widespread application deployment at a greatly reduced TCO. MySQL's robust database solutions embody an ingenious software architecture while delivering dramatic cost savings. With superior speed, reliability, and ease of use, MySQL has become the preferred choice of corporate IT Managers because it eliminates the major problems associated with downtime, maintenance, administration and support. MySQL is a key part of LAMP (Linux, Apache, MySQL, PHP / Perl / Python), a fast growing open source enterprise software stack. More and more companies are using LAMP as an alternative to expensive proprietary software stacks because of its lower cost and freedom from lock-in. Our flagship product is MySQL, the world's most popular open source database, with more than 10 million active installations. Many of the world's largest organizations, including Sabre Holdings, Cox Communications, The Associated Press, NASA and Suzuki, are realizing significant cost savings by using MySQL to power Web sites, business-critical enterprise applications and packaged software. MySQL AB is a second generation, open source company, with dual licensing that supports open source values and methodology in a profitable, sustainable business
Additional MySQL & Web 2.0 Resources

MySQL and Web 2.0 Portal
http://www.mysql.com/industry/web/

Web 2.0 Articles
“What is Web 2.0”, Tim O’Reilly, 9/30/2005
http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html
“Levels of the Game: The Hierarchy of Web 2.0 Applications”, Tim O’Reilly, 7/17/2006
http://radar.oreilly.com/archives/2006/07/levels_of_the_game.html
“Operations: The New Secret Sauce”, Tim O’Reilly, 7/10/2006
http://radar.oreilly.com/archives/2006/07/cloudy_with_a_chance_of_server_1.html

White Papers
http://www.mysql.com/why-mysql/white-papers/

Case Studies
http://www.mysql.com/customers/

Press Releases, News and Events
http://www.mysql.com/news-and-events/

Live Webinars
http://www.mysql.com/news-and-events/web-seminars/

Webinars on Demand
http://www.mysql.com/news-and-events/on-demand-webinars/