The Essential Guide to Choosing a Clustering Alternative

March 2006

By Alan Sugano

Special Advertising Supplement: This special advertising section was produced by Windows IT Pro in conjunction with Neverfail and appears as an insert in the March 2006 issue of Windows IT Pro.

You know the dilemma. You have strict uptime requirements, but your company is hesitant about investing in a Storage Area Network (SAN) and the Microsoft Cluster Service (MSCS) solution. Of course, downtime costs vary significantly from company to company, but any downtime has the potential to cost your company thousands, tens of thousands, or hundreds of thousands of dollars per hour. So what are the alternatives to a SAN and MSCS solution? This essential guide examines the critical issues you should consider when evaluating alternatives to a SAN and MSCS.

Clusters Go Down

Regardless of the high availability solution you select, it still has the potential to go down. Corruption in Active Directory (AD), power problems, cluster heartbeat problems, Domain Name System (DNS) problems, and other global cluster issues can all bring down your server cluster. Therefore, before you implement any clustering solution, make sure that the infrastructure where the cluster will be installed is stable and solid. This includes WAN links with adequate speed that, ideally, are fault tolerant; properly configured domain controllers (DCs) with stable and reliable AD replication; adequate and properly configured DNS servers; and proper power protection. A clustering solution will protect you from server and disk failure, but it will not fix basic infrastructure issues. Unfortunately, clusters do go down, so plan accordingly.

Comparing Clustering Solutions

What does a cluster look like? The diagrams in Figures A and B show what an MSCS cluster might look like compared to a third-party data replication cluster. Because MSCS typically uses a SAN, it is more complex than a data replication cluster. The SAN adds significant cost and complexity, but every cluster node can “see” the data on the SAN, eliminating the need for data replication. Data replication clusters store a separate copy of the data on the primary and secondary cluster nodes, each on its own locally attached storage, just like a nonclustered file server. Let’s look at what a simple two-node cluster might look like from MSCS and from a third-party vendor such as Neverfail.

Microsoft Clustering Solution

Typically, Microsoft’s clustering solution is implemented with a SAN in an Active/Passive configuration. Although it is possible to have an Active/Active cluster, it is not recommended due to memory fragmentation issues, especially with applications like SQL Server. A simple two-node Active/Passive cluster configuration is shown in Figure A.

In Figure A, two servers (one active and one passive) are connected to a SAN via a Fibre Channel switch. All devices connected to the SAN have dual connections, so no single connection is a single point of failure. However, the SAN itself is potentially a single point of failure unless it, too, is configured for redundancy. A cluster heartbeat on a dedicated network is used to monitor the health of all nodes in the cluster. The major advantage of using a cluster with a SAN appears when you have more than two nodes in the cluster. MSCS supports clusters of more than two nodes (up to eight on Windows Server 2003 Enterprise Edition). For example, you can have a four-node cluster with three active nodes and one passive node acting as a backup for the other three active nodes. Replication between the nodes is not an issue because all of the data resides on the SAN.

Neverfail Clustering Solution

The diagram in Figure B is simpler than the MSCS diagram because it does not require a SAN. All data changes are replicated from the active node to the passive node, and each node stores its own copy of the data internally. As with MSCS, a cluster heartbeat on a dedicated network is used to monitor the health of the primary and secondary servers. This is an excellent solution for a geographically dispersed cluster, because the data replication can take place over a WAN, automatically replicating changes to a remote location.
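The heartbeat used by both designs can be illustrated with a short Python sketch. This is a simplified illustration only, not how MSCS or Neverfail implement it; the class name `HeartbeatMonitor`, the 5-second timeout, and the node name `server1` are assumptions made for the example.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each node (hypothetical sketch)."""

    def __init__(self, timeout_seconds=5.0, clock=time.monotonic):
        # The timeout is an assumed value, not a vendor default.
        self.timeout_seconds = timeout_seconds
        self.clock = clock
        self.last_seen = {}

    def record_heartbeat(self, node):
        """Call whenever a heartbeat arrives from a node."""
        self.last_seen[node] = self.clock()

    def is_alive(self, node):
        """A node counts as alive if it reported within the timeout."""
        seen = self.last_seen.get(node)
        if seen is None:
            return False
        return (self.clock() - seen) <= self.timeout_seconds

    def should_fail_over(self, active_node):
        """The passive node takes over only when the active node goes silent."""
        return not self.is_alive(active_node)


# Usage with a fake clock so the example does not need to sleep.
fake_time = [0.0]
monitor = HeartbeatMonitor(timeout_seconds=5.0, clock=lambda: fake_time[0])
monitor.record_heartbeat("server1")
fake_time[0] = 3.0
print(monitor.should_fail_over("server1"))  # False: heartbeat is 3 seconds old
fake_time[0] = 9.0
print(monitor.should_fail_over("server1"))  # True: heartbeat is 9 seconds old
```

A silent heartbeat can also mean that the heartbeat network itself has failed while the active node is still running, which is one reason to ask vendors how their product behaves when the heartbeat link has a problem.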

Figure A: Microsoft Server Cluster Diagram. Components shown: Server 1 (active node) and Server 2 (passive node), a Fibre Channel switch connecting both servers to a Fibre Channel SAN, a cluster heartbeat switch, a network switch, and Workstations 1 through 3.

Figure B: Neverfail Cluster Diagram. Components shown: Server 1 (active node with cluster software) and Server 2 (passive node with cluster software), a cluster heartbeat switch, a network switch, and Workstations 1 through 3.

Cluster Monitoring

Ideally, the cluster should monitor all potential sources of failure and handle them appropriately. In addition to general server hardware failure, the solution should correctly deal with problems related to data corruption, registry corruption, networking problems, application failures, and anything else that has the potential to cause cluster downtime.

Cluster Applications

What applications do you plan to cluster? File Server, Exchange, SQL Server, SharePoint, BlackBerry Enterprise Server (BES), Oracle? Microsoft does a good job of providing support for its own server products, but what about other server applications such as BES or Oracle? If you want to cluster a non-Microsoft application, third-party vendors often support it. Some vendors can develop support for any server-based application using a development kit for their clustering solution.

Back Up the Cluster

Regardless of the clustering solution, you still need to back it up. A cluster does not protect you from viruses, corrupted data, accidentally deleted files, and similar problems. Make sure you can store the backup off-site on a separate type of media. We suggest full daily backups of the cluster. Because most installations will contain a significant amount of storage, make sure that your backup solution has the capacity and performance necessary to back up your cluster within your backup time window.
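The backup-window question comes down to simple arithmetic: the amount of data divided by the sustained backup throughput has to fit inside the window. The Python sketch below illustrates that check; the function names and the sample figures (500GB of data, 30MB per second, an 8-hour window) are assumptions for illustration, not measurements or recommendations.

```python
def backup_hours(data_gb, throughput_mb_per_s):
    """Hours needed for a full backup at a sustained throughput."""
    seconds = (data_gb * 1024) / throughput_mb_per_s
    return seconds / 3600


def fits_window(data_gb, throughput_mb_per_s, window_hours):
    """True if a full backup completes inside the backup window."""
    return backup_hours(data_gb, throughput_mb_per_s) <= window_hours


# Assumed example: 500GB of cluster data, 30MB/s sustained, 8-hour window.
print(round(backup_hours(500, 30), 1))   # 4.7
print(fits_window(500, 30, 8))           # True
print(fits_window(2000, 30, 8))          # False: about 19 hours are needed
```

Sustained throughput in practice is usually lower than the rated speed of the backup device, so measure it on your own hardware before relying on a calculation like this.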

Operating System (OS) Requirements

Most vendors require the same OS on the primary and secondary servers, so make sure you install a supported OS on both. Many of the non-MSCS solutions work with the standard versions of Windows Server 2003 and do not require the Enterprise versions of the products, which MSCS does require.

SAN or No SAN?

Most of the clustering alternatives do not require a SAN. Doing without a SAN removes it as a single point of failure, which matters unless the SAN is fully redundant. Typically, non-MSCS solutions use locally attached storage for their cluster. Because of this architecture, most vendors require the secondary server to have an equal or greater amount of disk space than the primary server. But what if you already have a significant investment in SAN technology? If you plan to use a SAN with a non-MSCS solution, make sure that the third-party vendor’s product will work with your data stored on a SAN. It’s important to tell the vendor what type of storage and SAN (Fibre Channel or iSCSI) you have, to verify that it’s compatible with the vendor’s clustering solution.

Cluster Software Requirements

How much disk space, memory, and processor load will the cluster software take? What happens if the cluster service is stopped on the server? What happens if there is a problem with the cluster heartbeat?

Third-Party Reviews

The history and stability of your vendor of choice are also important. Be sure to ask the following questions:
• How happy are the people using the product?
• Is the product easy to use?
• Does the product require significant training?
• Are there current customers available to discuss the product?
• Has the product won any awards?

Server Hardware Requirements

In a perfect world, both the primary and secondary server hardware would be identical. But what if they aren’t? Some vendors allow you to have different hardware for your primary and secondary servers, allowing you to leverage existing hardware. Most vendors still require the same OS and equal or greater disk space on the secondary server, but other than that the hardware can be different. Many cluster solutions require a dedicated connection for the cluster heartbeat, which allows the nodes to monitor each other’s health. This requires at least two network cards in each cluster node.

From a practical standpoint, I suggest using cluster nodes that are at least similar. This helps ensure similar performance if the primary server goes down. In other words, don’t use a quad-processor Pentium 4 server with 32GB of RAM as your primary server and a single-processor Pentium II with 512MB of RAM as your secondary server without expecting performance issues.

The applications that you plan to cluster can have a significant influence on your server selection. For example, if you plan to cluster a more processor-intensive application such as SQL Server, you might increase the speed and number of processors and the amount of memory, whereas a disk-intensive application such as Exchange requires more investment in the disk subsystem.

Cost

As with any IT purchase, cost is a significant consideration. Consider asking the following questions:
• Is the product sold directly, through resellers, or through other channels?
• How is the product licensed? By server cluster, by application?
• Are there annual support fees?
• Are product upgrades included in the support costs?

Although third-party clustering products aren’t free, they’re easy to cost justify because they do not require a SAN. Even though SAN prices are falling, the amount of money saved on the SAN should more than pay for the cost of the clustering software.

Cluster Installation and On-going Maintenance

Although good documentation exists on MSCS, it’s not the easiest solution to install. MSCS is very sensitive to service pack levels. Ask these questions about any alternative you evaluate:
• How difficult is it to install the cluster software?
• How difficult is it to set up the passive servers in your cluster? Do you need to manually install the same applications as on the primary server? Do you need to manually ensure that all applications are configured identically?
• After the solution is installed, how easy is it to maintain and troubleshoot?
• If the cluster has a problem, can the cluster tools assist in a fast and accurate diagnosis?
• Is additional training required to perform basic troubleshooting and maintenance tasks (like applying service packs), or can you figure it out by reading the manual or knowledge base?
• How are service packs and critical updates handled?
• Is a special procedure required to install a service pack on the cluster?
• Are service packs supported on the cluster as soon as they are released?

Ease and Method of Failover

One of the big advantages of Microsoft’s clustering solution is automatic failover without the administrator’s intervention. Some third-party clustering solutions require a manual failover, while others can fail over automatically.
• If the vendor does support automatic failover, can the failover still be controlled manually?
• How do clients address the cluster? With a virtual server name, the primary server name, or another method?
• When a failover occurs, does the secondary server “take over” the attributes of the primary server, or is the failover handled by a different method?
• Do workstations have to be remapped to the secondary server?
• If the vendor supports automatic failover, how long does it take? Seconds? Minutes?
• What is the impact on the users when a primary server fails over to the secondary server? Do users have to reboot their workstations? Do they lose data? Do they have to reload their application? Is it completely transparent?

Make sure the expectations of management and your end users match the capabilities of your clustering solution.

Almost as important as the failover method is the fail-back method.
• After the primary server is repaired, how does the cluster software handle a fail-back to the primary server?
• Is the fail-back automatic or manually controlled?
• Can you override an automatic fail-back?
• For data replication clusters, does the software check that the data has been fully replicated from the secondary server back to the primary server before a fail-back is allowed?
• Can you force a fail-back to the primary node without having the data fully replicated?

Automation and flexibility are key components of any failover or fail-back cluster solution.
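The fail-back questions above describe a guard that a data replication cluster could apply before returning service to the primary server. The Python sketch below shows one way such a check might be expressed. It is a hypothetical illustration, not any vendor’s logic; `ReplicationStatus`, `can_fail_back`, and the `force` flag are names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class ReplicationStatus:
    pending_bytes: int      # changes on the secondary not yet copied to the primary
    primary_healthy: bool   # result of a health check on the repaired primary


def can_fail_back(status, force=False):
    """Return (allowed, reason) for a fail-back request (hypothetical policy)."""
    if not status.primary_healthy:
        return False, "primary server is not healthy"
    if status.pending_bytes > 0 and not force:
        return False, "replication to primary is incomplete"
    if status.pending_bytes > 0 and force:
        return True, "forced fail-back; unreplicated changes may be lost"
    return True, "data fully replicated"


# Replication still in progress: fail-back is refused unless forced.
print(can_fail_back(ReplicationStatus(pending_bytes=4096, primary_healthy=True)))
print(can_fail_back(ReplicationStatus(pending_bytes=4096, primary_healthy=True), force=True))
print(can_fail_back(ReplicationStatus(pending_bytes=0, primary_healthy=True)))
```

Returning a reason alongside the decision keeps the manual override visible to the administrator, which matches the guide’s emphasis on both automation and flexibility.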

WAN Considerations and Disaster Recovery (DR)

You can use some clustering solutions to replicate data across a WAN to a DR site or other off-site location. Of course, the amount of data that changes on the server has a significant influence on the load that replication will place on the WAN link. Some vendors offer a WAN module that compresses replication data before it’s sent across the WAN. Ideally, the vendor should replicate just the file changes across the WAN and not the entire file. For example, if a user modifies a 2MB Word document and changes only a few sentences, those changes may represent, at most, 1KB of actual data. Does the solution replicate the 1KB of changes or the entire 2MB file?

The type of data that is replicated also should be taken into consideration. If the WAN module only supports data compression and you’re replicating files that are already compressed (e.g., graphics or MP3 files), data compression will probably not improve replication performance. Depending on your WAN speeds, it may take longer to compress the data than to send it over a fairly high-speed WAN link. Where is that threshold? If you plan to use the cluster in this manner, make sure to ask the vendor what WAN speeds are required to ensure timely and efficient replication.

When planning WAN link speeds, take into account the available bandwidth based on the current WAN load, not just the speed of the WAN link itself. If you plan to replicate data to an off-site location, see if your WAN firewall or router has Quality of Service (QoS) capabilities to ensure the server cluster gets a guaranteed amount of WAN bandwidth. This will help reduce the problems of data staleness due to WAN bottlenecks. If you plan to replicate data to a remote site, monitor your existing server to get an idea of the amount of data that typically changes daily. This will help you plan WAN link speeds and estimate how long it will take for changes to be reflected on the remote server.
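The difference between replicating changes and replicating whole files can be estimated with simple arithmetic. The Python sketch below works through the 2MB document example; the link speed (a 1.5Mbps line with half of its bandwidth available) and the function name `transfer_seconds` are assumptions chosen for illustration, and the calculation ignores protocol overhead and latency.

```python
def transfer_seconds(size_bytes, link_mbps, available_fraction=1.0):
    """Seconds to send size_bytes using only the available share of a link."""
    usable_bits_per_second = link_mbps * 1_000_000 * available_fraction
    return (size_bytes * 8) / usable_bits_per_second


FULL_FILE = 2 * 1024 * 1024   # the 2MB Word document from the example
DELTA = 1024                  # roughly 1KB of changed data

# Assumed link: 1.5Mbps with half of the bandwidth free for replication.
full = transfer_seconds(FULL_FILE, 1.5, 0.5)
delta = transfer_seconds(DELTA, 1.5, 0.5)

print(f"whole file:   {full:.1f} seconds")    # about 22.4 seconds
print(f"changes only: {delta:.3f} seconds")   # about 0.011 seconds

# The same function estimates how long a day's worth of changes would take.
daily_change_bytes = 5 * 1024**3              # assumed 5GB of daily changes
print(f"daily changes: {transfer_seconds(daily_change_bytes, 1.5, 0.5) / 3600:.1f} hours")
```

Using the available fraction rather than the rated link speed reflects the point above about planning around current WAN load.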

Data Rollback

Some products work in conjunction with Microsoft’s Volume Shadow Copy Service (VSS), which lets you roll back the server to a point in time. Other products use their own method to perform data rollbacks, while other vendors do not support data rollback.
• If your company requires data rollback, make sure your company’s requirements are compatible with the cluster’s capabilities.
• Do you need data rollback to a specific point in time, or do you need data rollback to infinite points in time?
• How much performance and disk space are you willing to sacrifice in order to get this functionality?
• When a data rollback is performed, will the cluster roll back just the data or the entire application state? Depending on the reason for the rollback, just rolling back the data may not be enough to fully recover the cluster, and may leave the cluster in an unstable state.
• When a rollback is necessary, how efficient is the rollback mechanism?
• Can the rollback be performed in minutes?

Efficiency is a key factor in quickly rolling back the state of a cluster.

On-going Support

It always pays to know what kind of support you can expect from a vendor.
• Is there an annual maintenance fee?
• Does the vendor have a knowledge base?
• How long does it take for technical support to answer the phone?
• How knowledgeable is the tech support staff? Are they able to resolve most issues without escalating the case?
• How many customers are already using the product?
• How easy will it be to install upgrades? Can upgrades be done in-house, or will they require someone from the vendor to complete the installation?

Cluster Changes and Maintenance

Some clustering solutions are very sensitive to changes on the OS.
• How are service packs and critical updates handled?
• Is a special procedure required to install a service pack on the cluster?
• Are service packs supported on the cluster as soon as they are released?

Cluster Monitoring

Once the cluster is in place, how is it monitored?
• Are health checks performed on the server pairs to ensure they are in a healthy state?
• Can email notifications be sent out automatically in the event of a server failover?
• How good are the troubleshooting and monitoring tools at ensuring the cluster’s reliability?

These critical issues should help you narrow your clustering solution to a few vendors. MSCS is not the only game in town, and there probably is a clustering solution that is perfect for your company’s budget and uptime requirements.

Alan Sugano is the president of ADS Consulting Group, which specializes in networking, custom programming, Microsoft .NET Web development, and SQL Server development.
