SQL Server High Availability – DBCC 2021 – Part 1

Any one of my customers has asked, or should have asked, “do I need a certain level of high availability for my SQL server or one of my SQL servers?” Regrettably, these deliberations often come to a halt when it comes to the budget… Naturally, such HA-solutions are more expensive, and depending on the solution chosen and the environment’s configuration, appropriate licenses must be made available. However, if one calculates the costs that will occur in the “worst-case” scenario during or as a result of an outage, it should be clear that these costs are reversible.

I had the opportunity to give a speech about it last week at the Data Blaster Community Conference 2021 (thank you, SQLPASS Deutschland), and here is a summary of what I spoke about.

I’ve shown right from the start how high such costs can be, and I’ve done so with the help of a Percona online calculator. The SQL Server goes down for 1-2 hours, 5 employees are assigned to the Restore, each earning $100,000 per year, and 100 additional employees cannot work correctly, earning an average of $50,000 per year. In the following screenshot, you can see the numbers that can occur during an emergency, during recovery, or as a result of the trouble.

Why does SQLServer High Verfuegbarkeit matter? - Percona Calculator

With total costs of about 8.000.000 Euros for a 2 hour SQL Server outage, I’d consider spending an additional 250.000 Euros for the installation to add a second server and equip it with the required licenses. But, as I’ll explain in the following sections, such high availability can be seen in the SQL Server setup dialog, but first, let’s clear up the terminology… 😉

Disaster Recovery vs. High Availability

On wikipedia.de, the term “high availability” is described as follows:

High availability (HA) is a characteristic of a system that aims to ensure an agreed level of operational performance, usually uptime, for a higher than standard period.

https://en.wikipedia.org/wiki/High_availability

In comparison, the concept of disaster recovery.

Disaster Recovery assumes that the primary site is not recoverable (at least for some time) and represents a process of rebuilding data and services to a secondary survived site, which is opposite to restoring back to its original place.

https://en.wikipedia.org/wiki/Disaster_recovery

When it comes to high availability, it’s all about keeping the time of an outage or downtime as low as possible so that all connected systems can keep running as quickly as possible without data loss or manual intervention. Naturally, before the installation, you should think about the process design, the goals to be achieved, and the need.

  • What is the purpose of the business?
  • What would it cost the company if they are unable to work?
  • Is there, on the whole, a compromise between technology and business? Does it really have to be the minimum downtime (<10 seconds), or is it possible that maximum downtimes of less than a minute are sufficient?
  • What is it that technology can accomplish?
  • Knowledge, processes, and all those must all be followed to the letter of the law?

And only after these questions have been answered can you think about the actual solution for high availability and how it can/should be implemented.

The AlwaysOn Failover-Cluster (FCI)

High Availability - SQL Server - AlwaysOn Failover Cluster

We now come to one of the possible solutions for achieving high availability of SQL Servers inside your own datacenter, the Windows Failover Cluster (basically identical to a Linux Failover Cluster, but you should refer to the documentation of the respective distribution and cluster software).

Initially, a Windows Failover Cluster is created from two Windows Servers with the additional installed Failover-Cluster feature, or, to put it another way, both servers are logically (and in AD) connected. Both are familiar with one another and are now aware that they belong together, exchanging several relevant health-status information. This Windows Failover Cluster is given its own IP address and DNS name and a Cluster-Named-Object (CNO) in Active Directory, which later “manages” the cluster.

I can’t say anything about the required storage requirements because it varies depending on the environment… Whether it’s a SAN, NAS, NFS, HCI, or something else, it’s always about a storage system that can give both servers access. Microsoft adds the following:

Contrary to the availability group, an FCI must use shared storage between all nodes of the FCI for database and log storage. The shared storage can be in the form of WSFC cluster disks, disks on a SAN, Storage Spaces Direct (S2D), or file shares on an SMB. This way, all nodes in the FCI have the same view of instance data whenever a failover occurs. This does mean, however, that the shared storage has the potential of being the single point of failure, and FCI depends on the underlying storage solution to ensure data protection.

https://docs.microsoft.com/en-us/sql/sql-server/failover-clusters/windows/always-on-failover-cluster-instances-sql-server?view=sql-server-ver15&WT.mc_id=DP-MVP-5002576#FCIelements

Now you can start installing SQL Servers; on Node 1, you’ll set up the foundation for a SQL Server Failover Cluster (Installation of a Failover-Cluster), while on Node 2, you’ll only add a node to an existing SQL Server Cluster.

Depending on how this cluster is set up – whether it’s only one instance or many – SQL Server licenses must be acquired. You can choose between two operating modes: active/passive or active/active. When using active/passive, SQL Server instances can only run on one cluster node, but they can run on both nodes for a limited time in the event of a failure. When using active/active, all nodes can be used equally and for an unlimited amount of time, which is especially useful when dealing with several instances because it allows some kind of load-balancing.

If a node in the cluster breaks, the second node can no longer communicate with the other side and tries to get the broken services into his side as quickly as possible and start them there.

Unfortunately, a hybrid scenario is not possible due to the lack of scalable resources on-premise and in the cloud. However, a Failover-Cluster can be built in the Azure Cloud as an alternative. To do so, you’ll need a Proximity Placement Group, Managed Disks with the “Shared Storage” feature enabled, and the two virtual machines described above, whose installation and operation are essentially identical.

High Availability - Azure SQL Server Failover Cluster - DBCC2021

More on SQL Server high availability will be covered in a next blog post.

TSQL Tuesday # 96: Community people who have influenced my way

TSQL Tuesday # 96: Community people who have influenced my way

T-SQL Tuesday is a recurring blog series started by Adam Machanic (b | t), a blogger hosts a SQL Server topic every month, and anyone can write a blog post on this particular topic. This month, Ewald Cress (b | t) is our host and it’s about the people who have influenced us in our lives with data in the community.

For a long time, I had no relation to any community… in my youth, I was a volunteer at the German Red Cross and now on the road again with “you”.

Who influenced me in the last few years?

Actually, everything started with PASSCAMP 2013… Ok, maybe a bit earlier when I signed up for Twitter in mid-2012 and read a lot about a #sqlfamily.

Definitely worth mentioning Brent Ozar, who has inspired me with his knowledge and blog contributions for years and continues to bring in the daily DBA life. Also, his contributions to the topic “Why should I run a blog?” count among my reasons mentioned again and again. Unfortunately, I have not been able to meet him personally yet, but that’s guaranteed to happen. His funny way of explaining complicated things to one person is just great!

Then in 2013, it was my visit to the PASSCamp and thus my first direct contact with the German sqlfamily aka SQLPass. Here you can only list the usual suspects, who are always present at such events. => Oliver Engels, Tillmann Eitelberg, Kostja Klein, Niko Neugebauer and Andreas Wolter, just to name a few… I found the group dynamics awesome and all that love between them… everyone knows each other, everyone laughs at each and everyone talks to everyone, no one is left behind!

I wanted to join in, somehow belong to them… but how?

Henning L. @hlhr_dev Jun 2 thanks to all the speakers and specially to @sql_williamd for this great event #SQLGrillen

So after this experience, I started to get more involved with the PASS and its activities and found among others like Cathrine Wilhelmsen, whose community activities also spurred me and showed me how to start => more participation in local or national activities of the PASS. Then it came to SQLGrillen from William Durkin and the session of Andre Kamann about “PoSh meets Ola Hallengren” and another meeting with Andre Essing, which in total motivated me to jump over my own shadow and act as a spokesman in the PASS. During the following two SQLSaturdays (Rhineland and Munich 2016) I was on the road for my first time as a volunteer and was able to get a taste of the national activities of PASS Germany. As luck would have kicked in or the data network rolled the balls, the SQLSaturday in Munich was the next step in my “community career” and brought me the Azure Meetup Hamburg.

In 2017, I had my first public talks, some in-house, some in the PASS user group Hamburg, at SQLGrillen 2017 (Thanks to William) and twice at SQLSaturday Rheinland 2017 (thanks to Olli, Tillmann and Kostja)… all this in conjunction with my blog and Twitter activities helped me to be awarded by Microsoft with the MVP Award.

I would like to say thank you to following people of the German #sqlfamily:

Oliver Engels
Tillmann Eitelberg
Kostja Klein
Andre Essing
William Durkin
Andre Kamann

A special thanks go to Gabi Münster for the support during my first public appearance, in the end, it’s all about a “kick my ass” to jump over my shadow. Of course, many conversations and Twitter contacts with numerous other community members (Chrissy, Claudio, Rob, Dirk, Volker and, above all, Conny!) helped me to feel as a member (at least of the German) SQL family! THANK YOU SO MUCH! Further goals are planned for 2018 😉

A very special thanks go to my team lead Thorsten Moeller, who supports me again and again in all my activities and a much bigger “Thank you” goes to my wife, who also supports these activities and always “kept my back”!