Advanced Computing in the Age of AI | Monday, April 22, 2024

The Human Element in SQL High Availability in Virtual Environments 

Always present but often ignored is the human element of information technology. Systems and software all get installed, configured and maintained by people. And not a single one of those people on the IT staff could possibly be an expert in every one of the technologies involved. So every engineer or administrator approaches every issue from a different perspective and with a different toolbox of potential solutions.

Perhaps no scenario represents the need to accommodate this diversity in the human element more than a business-critical Microsoft SQL database application running in a virtualized environment where at least four different administrators might be involved. Like the blind men describing the elephant, the virtual machine, storage area network, Windows, and database admins all perceive the need for high availability quite differently:

  • The VM and SAN admins are likely to prefer whatever HA capabilities are built into the hypervisor and storage configuration, respectively;
  • The Windows admin will want to update and patch all of the system software multiple times every week;
  • The database admin may want to use best-in-class AlwaysOn Availability Groups provided in SQL Server Enterprise Edition;
  • And, of course, the CIO and CFO will have their own, very different perspectives on the operations of and the budget for the IT department.

It is important to recognize that although the goals of each constituent may conflict, there is no right or wrong in these application scenarios. In fact, enterprises can find strategies that get the best possible outcome in each situation by seeking common ground.

Using SLAs to Establish Common Ground

A proven way to get the different disciplines to pursue a common solution is to establish a common goal, such as a shared requirement to comply with the service level agreement. It can be useful to create a table or matrix that assigns these values, at a minimum, to each application:

  • Recovery Time and Recovery Point Objectives (RTO and RPO)
  • Mean Time To Repair (MTTR)
  • Criticality to the business (with a high, medium, or low performance priority)
  • Maintenance needs
  • Tier-level classification

There may be other criteria that are critical in some environments, and some of these might not be pertinent. For example, MTTR can be irrelevant for certain types of failures, such as that of a single drive in a RAID array or one networking path in a multi-path configuration (although these failures can affect application performance).
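One way to picture such a matrix is as a small typed record per application. The sketch below is only an illustration: the field names, the example applications and every value in it are hypothetical placeholders, not recommendations from this article.

```python
from dataclasses import dataclass
from enum import Enum


class Criticality(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass(frozen=True)
class SlaEntry:
    """One row of the SLA matrix; all field names are illustrative."""
    application: str
    rto_minutes: float        # Recovery Time Objective
    rpo_minutes: float        # Recovery Point Objective
    mttr_minutes: float       # Mean Time To Repair
    criticality: Criticality  # high, medium, or low priority
    maintenance_needs: str    # free-text description
    tier: int                 # tier-level classification, 1 = most critical


# Hypothetical rows; the numbers are placeholders only.
SLA_MATRIX = [
    SlaEntry("reporting-sql", 240, 60, 480, Criticality.LOW, "weekly", 3),
    SlaEntry("order-entry-sql", 2, 0, 60, Criticality.HIGH, "monthly, off-hours", 1),
]


def strictest_first(matrix):
    """Sort rows so the tightest RTO, then the tightest RPO, come first."""
    return sorted(matrix, key=lambda e: (e.rto_minutes, e.rpo_minutes))
```

Sorting by the strictest objectives gives the four admins a shared starting point: the application at the top of the list is the one whose HA design has to be agreed first.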

Options for Virtual HA SQL Applications

With the SLA requirements agreed, the team can evaluate its options and choose a mutually satisfactory HA solution. Understanding the strengths and weaknesses of each category of HA solution helps the team select the option that best meets those requirements.

VM-based HA provisions are designed primarily to provide recovery from hardware failures. This means that if the operating system or application crashes, from the hypervisor’s perspective there is no failure to recover from, and users will experience an outage. Recovery from a hardware failure can take five minutes or longer, depending on available resources. So while HA for VMs may be necessary, it is not sufficient for a business-critical SQL application.

Rapid failover demands standby resources, which require either shared storage (a potential single point of failure) or real-time replication of all data involved. Most SAN solutions have built-in provisions for data redundancy and protection, but these often fail to meet the RPO and/or performance requirements of business-critical SQL applications.

Finally, there are the HA options available with SQL Server. AlwaysOn Availability Groups enable carrier-class high availability, but the need to license the Enterprise Edition makes this too expensive for most database applications. Fortunately, there is a way to achieve equivalent results for a fraction of the cost.
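The comparison described in this section can be sketched as a simple filter over candidate options. Everything here is an assumption made for illustration: the `HaOption` fields, the option names and all of the figures are placeholders, apart from the five-minute VM recovery time mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HaOption:
    name: str
    recovery_minutes: float    # expected time to restore service
    data_loss_minutes: float   # worst-case window of lost writes
    detects_app_failure: bool  # notices OS or application crashes?


def meets_sla(option, rto_minutes, rpo_minutes, needs_app_failover=True):
    """Return True if the option satisfies RTO, RPO and failure coverage."""
    if needs_app_failover and not option.detects_app_failure:
        return False
    return (option.recovery_minutes <= rto_minutes
            and option.data_loss_minutes <= rpo_minutes)


# Placeholder figures for illustration; measure your own environment.
OPTIONS = [
    HaOption("hypervisor HA", 5.0, 0.0, False),
    HaOption("SAN replication (asynchronous)", 10.0, 15.0, False),
    HaOption("SQL Server failover cluster", 1.0, 0.0, True),
]


def shortlist(options, rto_minutes, rpo_minutes):
    """Names of the options that meet the given objectives."""
    return [o.name for o in options if meets_sla(o, rto_minutes, rpo_minutes)]
```

For example, `shortlist(OPTIONS, rto_minutes=2, rpo_minutes=0)` keeps only the entries that both recover within two minutes and react to application-level failures, which is the gap in VM-based HA noted earlier.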

A More Affordable Business-class HA SQL Configuration

The simpler a solution, the better it often is, and that holds for SANLess cluster software. SANLess cluster technology uses real-time, block-level replication to synchronize local storage between the primary and one or more secondary nodes (as shown in the diagram). This makes it possible to leverage the familiar and proven Windows Server Failover Clustering (WSFC) technology to enable business-class failover protection.


Compatibility with Windows Server Failover Clustering makes SANLess clusters a simple, affordable and effective solution for SQL applications requiring high availability in virtualized environments.

Despite the designation as SANLess, SANs are supported; they are just not required. The reason is the use of storage-agnostic, block-level synchronization, which gives administrators the flexibility to replicate using configurations of their choice: SAN and/or SANLess, and any combination of physical, virtual and/or cloud-based configurations both within and across datacenters.
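To make the idea of storage-agnostic, block-level synchronization concrete, here is a toy in-memory model. It is a sketch under loud assumptions, not how any SANLess product is implemented: `BlockDevice` and `SyncMirror` are hypothetical names, the "volumes" are Python lists, and real replication also has to handle networking, write ordering and resynchronization after a failure.

```python
class BlockDevice:
    """In-memory stand-in for a volume, addressed by block number."""

    def __init__(self, num_blocks, block_size=512):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]

    def write(self, index, data):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self.blocks[index] = bytes(data)

    def read(self, index):
        return self.blocks[index]


class SyncMirror:
    """Mirror each block write from a primary to its secondaries."""

    def __init__(self, primary, secondaries):
        self.primary = primary
        self.secondaries = list(secondaries)

    def write(self, index, data):
        # The call returns (acknowledges the write) only after the
        # primary and every secondary hold the same block.
        self.primary.write(index, data)
        for replica in self.secondaries:
            replica.write(index, data)
        return True


primary = BlockDevice(num_blocks=8)
secondary = BlockDevice(num_blocks=8)
mirror = SyncMirror(primary, [secondary])
mirror.write(3, b"\x2a" * 512)
```

Because the mirror depends only on a `write(index, data)` interface, it does not care whether a replica sits on local disk, a SAN or a cloud volume, which is the sense in which the approach is storage-agnostic.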

The simplicity of the approach also enables SANLess clusters to perform fast, non-disruptive failover and failback as needed for the planned downtime used to perform upgrades, maintenance and testing.

SANLess clusters deliver a more affordable high availability solution for two reasons. The first is that by leveraging WSFC, SANLess clusters can use AlwaysOn Failover Cluster Instances, which are included in the much less expensive Standard Edition of SQL Server. The second is the ability to use local storage in fully redundant, high-availability configurations, which eliminates the need for significantly more expensive redundant SANs.

About the Author

Tony Tomarchio, Director of Field Engineering, SIOS Technology Corp., is responsible for defining and delivering technical pre-sales services, support and best practices to SIOS customers, prospects and partners. Tony has more than a decade of experience providing systems management and high availability solutions to enterprise customers. Prior to joining SIOS, Tony served as the Global Sales Engineering lead for the Oracle systems management practice. Tony joined Oracle through the acquisitions of Sun Microsystems and Aduva, Inc., where he served as the lead Sales Engineer / Technical Account Manager and played a critical role in product adoption and evolution. Tony holds a BS in Computer Science from California Polytechnic (Cal Poly) State University, San Luis Obispo.
