IPStor™ SAN+NAS
FalconStor, Inc. 125 Baylis Rd. Melville, NY 11747 1-631-777-5188
Table of Contents
The Storage Explosion
Introduction of SAN and NAS
NAS
SAN
Current Challenges
A Sensible Alternative
The Storage Explosion

There is no doubt that the demand for storage has been increasing. Business documents have evolved from simple text to rich graphic presentations. Email messages now contain voice and video; databases contain multimedia objects; even TV set-top appliances now use a hard drive to buffer and store up to 48 hours of programming for later viewing. iCommerce, eCommerce, and other ever-expanding eBusiness solutions that require corporate data centers to provide information in various formats have all contributed to the explosion in demand for storage capacity. Enterprise-class businesses are deploying huge RAID devices that are terabytes (TB) in size. According to industry analyst Dataquest, average desktop consumption of storage space has grown from 1.4 GB in 1997 to 3.5 GB in 1999 and is projected to reach 14 GB in 2003. For corporate data centers, worldwide RAID capacity deployment will grow to 1.3 million TB by 2003, a compound annual growth rate of 79%.

Even though the cost per megabyte of storage is declining at a rate of 35% to 40% per year, business requirements for fault tolerance, high availability, disaster recovery, and online backup have significantly increased the cost of managing that storage. As stored data becomes more critical and irreplaceable, uninterrupted availability of data and fast, complete recovery become absolute requirements. Before the electronic age, it was possible to re-enter purchase orders from the file cabinets when a storage system failed. In today's eBusiness environment, where no paper backup exists, lost data translates into lost business. In the world of eCommerce, competition is heated: increasingly savvy customers have the freedom to choose from a wide range of products and services, and a company without the ability to store and retrieve its data when and where it is needed will lose "the battle" to competitors with that ability.

[Figure: Ratio of storage management costs to hardware costs, 1984 to 1999, rising to 8:1 by 1999.]

It is estimated that the ratio of management cost to acquisition cost for storage has grown over time, reaching 8:1 in 1999 (source: SNW, 1999). This means that for every $1 spent on purchasing storage systems, another $8 will be spent on ongoing maintenance and management.
Introduction of SAN and NAS

SAN and NAS are two ways to re-organize a system into a separately managed storage farm and server farm. In theory, separating the storage from the server follows the tried-and-true "divide and conquer" military strategy that has historically enjoyed great success, and the end result should be a reduction in overall management complexity. SAN and NAS are, however, two very different approaches to storage management, and they address two very different needs.
The locally-attached configuration below illustrates the traditional storage-to-server relationship:
[Figure: Locally-attached storage. NT, Solaris, Linux, AIX, and NetWare servers on a 10/100/1000 Mb LAN, each with its own SCSI-attached storage; a NAS appliance sits on the same LAN.]

Locally attached:
- Every file server and application server has its own storage.
- It is not possible to share storage.
- Backup traffic can degrade LAN performance and server performance.
- If each server has its own backup device, cost and management complexity are significantly increased.
As the diagram shows, a locally-attached storage system requires that the storage devices be managed individually by each server. Backup typically must be done locally at each server, resulting in high hardware and software costs (a backup device per server, plus a backup software license for each server) and extremely high management costs. Alternatively, it is feasible to deploy a single, central backup server: with agent software deployed on each server, data is backed up to the central server over the LAN. But a central backup server introduces its own issues. LAN traffic has increased significantly over the years, to the point where backup can only be performed during "quiet" or off-peak hours, and in today's 24x7 enterprise environment it is increasingly difficult to find quiet moments for backup.

Furthermore, servers cannot share the storage space locally available to one another. In an environment where the space consumption rate of each server is unpredictable, the end result is either constant over-estimation of space requirements, which wastes storage space, or servers running at maximum space usage. Once a particular server runs out of space, increasing its storage requires significant work, and service interruption is inevitable.

NAS

Network-Attached Storage (NAS) was embraced as a great idea in the same way network-attached printers were. Providing a network printing service used to require that a print server be set up on the network before printers could be attached to each print server's LPT or COM port. As printers became more intelligent, it became viable to embed the network interface and printer management software directly inside the printer. Such network-attached printers can be plugged directly into the corporate network, where users easily find and connect to them via a "printer share". As the number of users grows, additional network-attached printers can be plugged in.
Overall, print service management is greatly simplified by the separation of server and printer.

Similarly, as storage appliances have become more intelligent, NAS has become viable in the form of network-attached storage appliances. A NAS box is essentially a storage device with a built-in network interface, network operating system, and storage allocation software. It can be plugged directly into the corporate LAN, making itself accessible via one or more "file shares". Users and groups are assigned read/write privileges and space quotas. As the number of users grows and free space runs low, additional NAS boxes can be plugged in.

Although NAS simplifies the "provisioning" of storage to clients, it does not address the problem of backup. Either backup has to be performed from a central backup server, which puts traffic on the LAN, or an unconventional backup using proprietary means supplied by the NAS vendor has to be applied, which inevitably increases management overhead.

Speed is another issue in a NAS environment. An application running on a client that accesses storage on a NAS box has to go down the entire seven layers of the networking protocol stack, across the LAN wire, and up the seven layers again to reach the data, then transmit it back. This is why, despite the ease of space management, NAS cannot be used for data-intensive server applications. Some high-performance database systems actually perform direct, raw I/O using SCSI commands to avoid the inefficiency of the server's OS file system. Since NAS access goes through the network redirector and the client's OS file system, raw I/O is not possible; the sketch below makes this distinction concrete. In short, NAS is ideal for "sharing" files among general users and for some non-data-intensive application servers.

[Figure: NAS, an intelligent storage box directly attached to the LAN. Unix clients reach it via NFS file I/O and Windows clients via SMB file I/O. Application I/O goes through the OS and network redirector; raw I/O is impossible, and the backup issue is not resolved.]
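To make the buffered-versus-raw distinction concrete, here is a minimal Python sketch (Linux-specific; the mount point and device path are hypothetical). The first read goes through the client's file system and network redirector, the only path a NAS client has; the second bypasses the file system cache entirely via O_DIRECT, the kind of raw block access a database performs and a NAS share cannot offer.

```python
import mmap
import os

# Path 1: buffered file I/O through the OS file system / network redirector.
# This is how a NAS client reads, e.g. from an NFS- or SMB-mounted share.
with open("/mnt/nas_share/data.bin", "rb") as f:   # hypothetical NFS mount
    block = f.read(4096)

# Path 2: raw, unbuffered block I/O straight to the device (Linux O_DIRECT).
# High-performance databases use this to bypass the file system; it is only
# possible on locally or SAN-attached block devices, never through a NAS share.
fd = os.open("/dev/sdb", os.O_RDONLY | os.O_DIRECT)  # hypothetical SAN device
buf = mmap.mmap(-1, 4096)        # O_DIRECT requires a page-aligned buffer
os.readv(fd, [buf])              # read one 4 KB block directly from the disk
os.close(fd)
```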
SAN

For data-intensive servers, SAN is the solution. As the following diagram illustrates, although the storage devices are detached from the servers and centralized in a storage farm, a high-speed path still connects the storage back to the servers. The SCSI protocol is preserved, making each server believe that a SCSI host adapter is still present with SCSI storage devices attached. The keyword here is "high-speed": for server performance to remain high (or actually improve), the maximum speed of the SAN connection must be a multiple of the typical speed required by a single server-storage pair. Otherwise, when the servers access their assigned storage over a common SAN connection, they will fight for bandwidth.
[Figure: Fibre Channel SAN. NT, Solaris, Linux, AIX, and NetWare servers (plus a NAS appliance) on a 10/100/1000 Mb Ethernet LAN connect over a Fibre Channel SAN to native FC devices and, through an FC-SCSI bridge, to existing SCSI devices.]
In 1992, Fibre Channel (FC) emerged as the de facto standard for implementing a SAN, as it was the only viable gigabit-speed transport at the time. When Fibre Channel became available, the SCSI bus had a maximum transfer rate of 20 MB/s; SCSI drives were therefore all under 20 MB/s in burst throughput, and much lower (about 5 to 10 MB/s) in sustained throughput. The gigabit rate of FC can sustain close to 90 MB/s of throughput. This yields a comfortable bus-to-drive ratio of about 10:1, making it possible to have tens of server-drive pairs in active transfer at the same time. Beyond raw bit rate, the FC protocol was designed to be highly efficient for storage traffic, which is characterized by large block transfers: FC can move megabytes of data in a single transaction, reducing CPU utilization.

Being primarily a "connectivity" tool, FC left many fundamental storage management issues unaddressed, such as backup, snapshot, replication, mirroring, and virtualized storage. Logically, FC is simply a "super SCSI bus" providing any-to-any connectivity among many hosts and storage devices that may be far apart. There is no entity on the FC SAN that "supervises" data activity or acts as a master to perform data-moving operations.

This proves to be a problem for security. For example, how do you prevent a particular host from accessing a particular drive? FC has some provisions for access control, such as LUN masking (the ability to hide a storage device from one or more hosts) and zoning (the ability of an FC switch to group ports into access zones so that only devices and hosts in the same zone can see each other). However, these are simple name-based matching schemes that are far from "hacker-proof", as the sketch below illustrates. Additional steps are needed to secure each access path, and it is easy to inadvertently leave a back door open. Name spoofing and loop snooping are easily achievable in any FC deployment. Experience shows that true network data security can only be achieved through industry-approved encryption algorithms and key-based authentication.

As with the SCSI bus, FC also has fundamental issues with device sharing. Although tape devices can be shared effectively via the SCSI reserve/release mechanism, multiple host servers cannot logically share a disk device unless the hosts are members of a cluster, because the file system on a disk volume is typically designed to be written by a single host. This problem alone makes it impossible to share a single large RAID device among multiple hosts; each host will see all the sectors of the RAID.
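To see why name-based matching is weak, the following Python sketch models LUN masking as a simple lookup keyed on the initiator's World Wide Name (WWN). This is an illustrative model, not any vendor's actual switch or adapter code, and the WWNs are made up. The point is that because the WWN is a self-reported identifier, any host that presents a permitted name passes the check, which is exactly what makes spoofing possible.

```python
# Illustrative model of LUN masking: a table mapping each initiator's
# World Wide Name (WWN) to the set of LUNs it may see. The WWNs are made up.
lun_masks = {
    "21:00:00:e0:8b:05:05:04": {0, 1},       # e.g. the database server's HBA
    "21:00:00:e0:8b:0a:12:9f": {2},          # e.g. the mail server's HBA
}

def host_may_access(reported_wwn, lun):
    """Grant access purely on the name the initiator reports about itself."""
    return lun in lun_masks.get(reported_wwn, set())

# A legitimate request passes:
assert host_may_access("21:00:00:e0:8b:05:05:04", 0)

# So does a spoofed one: a rogue host presenting a copied WWN gets the same
# access, because nothing in the scheme proves the requester owns that name.
assert host_may_access("21:00:00:e0:8b:05:05:04", 1)
```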
On the connectivity side, FC, as a new link-layer protocol, has no provision for remote routing; additional FC-to-IP protocol routers must be deployed to join two FC SANs that are far apart. Furthermore, the only way to give a server access is to install an FC adapter in that server and deploy the necessary FC cabling.

Still, 8% of corporate data centers (source: EMC, November 1999) are deploying FC to fill their immediate need for centralized storage management, despite the aforementioned issues. This means that existing investments in SCSI devices are potentially wasted.

[Figure: SAN adoption pie chart (source: EMC, November 1999): Completed 8%, Implementing 13%, Evaluating 35%, Waiting 44%.]

Niche vendors have started to capitalize on these shortcomings by producing FC-to-IP routers, FC-to-SCSI routers (to support existing SCSI devices), and so-called "SAN Managers", yet another layer of hardware and software on top of FC to improve access control and provide disk sharing. Innovative storage appliance makers have also started building high-end, sophisticated RAID devices (typically a large cabinet with many drives and an intelligent controller) with built-in virtual volume managers that logically divide their storage space into individually addressable virtual drives, so that the space can be shared effectively among many hosts. However, these solutions virtualize only within the cabinet; for a data center that requires several cabinets, it is impossible to virtualize across them. Advanced features such as mirroring and replication typically work only in a homogeneous (single-vendor) environment, due to the lack of cooperation between competing vendors.

In short, FC leaves many fundamental storage security and connectivity issues unsolved, and the many disjointed point solutions designed to fill these niches introduce even more management overhead. The storage market warrants a highly sophisticated, high-performance, vendor-neutral solution that works with all management tools, reporting utilities, and maintenance procedures.

Current Challenges

The storage management challenge has led to innovative ways of providing storage: NAS and SAN. Although debates continue about the relative advantages of NAS and SAN, the reality is that these two storage topologies do not compete; rather, both are needed in corporate data centers. NAS represents a quick and easy way to add shareable storage space for users and workgroups doing general-purpose file sharing, or for application servers that are not storage-intensive. SAN represents a way to separate servers and storage into two independently managed layers while maintaining SCSI block-level access by the servers, thereby simplifying the overall IT infrastructure. FC has come a long way and emerged as the de facto means to implement a SAN.
However, its various shortcomings have called for multiple niche solutions to fill gaps and holes in the areas of connectivity, storage virtualization, storage management, and security. Given that demand for enterprise storage keeps increasing, the market is ripe for a more sensible approach to SAN and NAS: one that not only makes good use of FC but also taps the vast connectivity of IP/iSCSI via Gigabit Ethernet. Furthermore, rather than leaving virtualization and storage management in the hands of disjointed niche vendors, these functions should be part of the network storage infrastructure itself. In addition, SAN and NAS storage should be provided from a single virtualized storage pool and managed under a unified management umbrella. This ensures that storage space allocation and storage management (replication, backup, etc.) are shared by both SAN and NAS, eliminating the need to manage yet another box.

A Sensible Alternative

The IPStor network storage infrastructure software by FalconStor, Inc. is the only solution on the market based on the vision of providing unified SAN-plus-NAS services. The design of IPStor incorporates the following key technologies, allowing it to function as enterprise-class storage infrastructure software:
• Supports FC, SCSI, and future iSCSI- and InfiniBand-based storage devices.
• Interfaces with application servers over a standard/Gigabit Ethernet IP infrastructure using a storage-over-IP (SAN/IP™) protocol to provide virtualized storage resources.
• Provides NAS services using the CIFS (Common Internet File System) and NFS (Network File System) protocols on virtualized storage resources shared with the SAN.
As listed below, IPStor incorporates some of the most in-demand enterprise-class storage services to facilitate the consolidation of NAS and SAN and reduce management overhead:
• Storage virtualization with dynamic volume re-sizing
• Mirroring
• Snapshot (for a point-in-time copy of a NAS resource; see the copy-on-write sketch after this list)
• Remote replication
• High availability through active-active failover
• SNMP management integration
• Access security (authenticated Diffie-Hellman)
• Zero-impact backup and restore using standard backup software on the IPStor server, eliminating the need for backup software on each application server
• LAN-free and server-less backup using standard backup software, eliminating LAN traffic during backup
• Java-based management console
• End-to-end diagnostics and reporting
• NAS service through CIFS/SMB and NFS
• SAN service through the IPStor SAN Client driver, a virtual SCSI adapter driver running on each application server
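As a minimal illustration of how a point-in-time snapshot can be kept cheap, the sketch below implements the classic copy-on-write idea in Python. It is a toy model under assumed semantics, not IPStor's actual implementation: a block is copied aside only when it is first overwritten, so the frozen snapshot view consumes space only for changed blocks.

```python
class CopyOnWriteSnapshot:
    """Toy copy-on-write snapshot over a volume modeled as a list of blocks."""

    def __init__(self, volume):
        self.volume = volume                # live volume, updated in place
        self.saved = {}                     # block index -> pre-snapshot contents

    def write(self, index, data):
        # Copy the original block aside the first time it is overwritten.
        if index not in self.saved:
            self.saved[index] = self.volume[index]
        self.volume[index] = data

    def read_snapshot(self, index):
        # Frozen view: the saved copy if the block changed, else the live block.
        return self.saved.get(index, self.volume[index])


vol = [b"A", b"B", b"C"]                    # three tiny "blocks"
snap = CopyOnWriteSnapshot(vol)             # take the point-in-time snapshot
snap.write(1, b"X")                         # the live volume moves on...
assert vol[1] == b"X"
assert snap.read_snapshot(1) == b"B"        # ...but the snapshot still sees "B"
```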
IPStor's Linux-based software runs on any Intel-based server. All storage devices are attached to one or more IPStor servers equipped with SCSI, Fibre Channel, and/or InfiniBand/iSCSI host adapters. All of the physical storage devices are discovered, aggregated into a storage pool, and virtualized. A Java console, which can be run from anywhere on the IP network, provides an easy GUI for total management of the IPStor server(s). Using this console, a few mouse clicks are all it takes to create SAN or NAS "virtual resources" of any size from the available storage pool. Once created, a SAN resource can be assigned to any application server running the IPStor SAN Client driver; the application server's operating system (Windows NT, Windows 2000, Linux, Solaris, HP-UX, IBM AIX, etc.) will then function as if a host adapter were installed with a drive attached. For NAS, no client-side driver is needed, because the client's operating system already has network redirectors for CIFS/SMB and NFS. A virtual resource can be dynamically resized at any time to increase its capacity, as the sketch below illustrates. Also from the Java console, it is possible to right-click on a SAN or NAS virtual resource to define a mirror or a snapshot, or to perform scheduled replication or backup; these advanced storage services are set up and performed the same way whether the virtual resource is SAN or NAS.

With IPStor, the question of SAN or NAS is no longer asked. The right question is: how can IPStor's unified SAN-plus-NAS software storage infrastructure be deployed today to address your ever-increasing storage needs?
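The console workflow just described, carving virtual resources from a pooled set of physical devices and growing them later, can be sketched as follows. This is a hypothetical model of storage virtualization in general, not IPStor's internals: capacity from heterogeneous devices is aggregated into one pool, and named virtual volumes are created and resized against it without touching any application server.

```python
class StoragePool:
    """Toy model of a virtualized storage pool (hypothetical, for illustration)."""

    def __init__(self):
        self.free_gb = 0
        self.volumes = {}                    # volume name -> size in GB

    def add_device(self, capacity_gb):
        """Discover a physical device (SCSI, FC, ...) and pool its capacity."""
        self.free_gb += capacity_gb

    def create_volume(self, name, size_gb):
        """Carve a SAN or NAS virtual resource out of the shared pool."""
        if size_gb > self.free_gb:
            raise ValueError("not enough pooled capacity")
        self.free_gb -= size_gb
        self.volumes[name] = size_gb

    def resize_volume(self, name, new_size_gb):
        """Grow or shrink a virtual resource without touching any server."""
        delta = new_size_gb - self.volumes[name]
        if delta > self.free_gb:
            raise ValueError("not enough pooled capacity")
        self.free_gb -= delta
        self.volumes[name] = new_size_gb


pool = StoragePool()
pool.add_device(500)                         # e.g. a SCSI RAID cabinet
pool.add_device(1000)                        # e.g. an FC RAID cabinet
pool.create_volume("mail_db", 200)           # assigned to a SAN client
pool.resize_volume("mail_db", 350)           # grown online from the same pool
```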
Appendix A - Sample IPStor Deployment
[Figure: Sample IPStor deployment. SAN clients (application servers on NT, Win2K, Solaris, ...), NAS clients (SMB/NFS), general LAN clients, and a NAS appliance sit on a 10/100/1000 Mb Ethernet LAN/WAN. A Gigabit Ethernet switch connects them to the IPStor server, which presents SAN virtual drives and SMB/NFS NAS shares; behind it, an FC SAN attaches FC devices and SCSI devices. A Java-based management console manages the IPStor server.]
For more information, please contact:
FalconStor, Inc. 125 Baylis Rd. Melville, NY 11747 1-631-777-5188 www.falconstor.com