Azure Arc-enabled Data Services – Cloud Summit 2021

Also, yesterday I was able to speak at an international community event – the Cloud Summit – about Azure Arc-enabled Data Services, of course mainly about how to operate an Azure SQL Managed Instance in your own data center. As already at DataSaturday #14 in Oslo, I started with the various options of Azure Arc and presented the individual components of the environments, what is needed, and how this can be deployed relatively quickly. In the end, there was again a detailed demonstration of how you can deploy a data controller on Kubernetes and how you can then roll out a managed instance or PostgreSQL Hyperscale using Azure Data Studio.

I was pleased to be part of the Cloud Summit 2021 event, which lasted several days. The lecture itself was a very successful one from my point of view; everything worked, time was a bit short, but you can work on it / improve it in the future.

With this short article, I just wanted to put the slides online and thank the organizers and sponsors of the Cloud Summit very much!

Cloud Summit 2021 – Bjoern Peters – Azure Arc-enabled data services

Data Saturday #14 – Oslo 2021 – Introduction into Azure Arc Data Services

Today the time had come – I was finally able to speak at Data Saturday (formerly SQLSaturday) in Oslo … that was finally not due to the organizers but rather to me or the pandemic, since I hadn’t dared to give lectures until 2020 Keeping English, I hadn’t submitted any sessions for Oslo either. I would have loved to have been there, of course, because Norway and Oslo are beautiful; I now also know several community members in the region that I haven’t seen for a long time. Still, this year it had to stay virtual for the time being, maybe yes, in person again next year.

But now to my lecture on the introduction to Azure Arc or Azure Arc Data Services, which was on Saturday, September 4th, 2021, at 9:45 am. Azure Arc itself is, first of all, a platform, a service that Microsoft provides in Azure to manage and administer the administration of various environments – hardware, virtual machines, Windows, Linux, AWS, GCP, or whatever – in a central location.

Introduction into Azure Arc

This means that you can now, for example, connect your on-premise Windows or Linux servers to the Azure portal to administer them from there. Or from the multitude of services from the cloud, such as security or compliance, these benefits come with Azure Arc, in which you can integrate your on-premise systems into the analyzes of these Azure services. In addition, automation such as patching or backups can be created and administered from the Azure portal or other Azure services, which means a significant advantage in the standardization of systems and processes.

And not only with on-premise servers or systems, but also with multi-cloud and “disconnected” systems, including Kubernetes clusters or SQL servers

  • Azure Arc-enabled servers
  • Azure Arc-enabled Kubernetes
  • SQL Server auf Azure Arc-enabled Servers
All applications, systems, and environments are centrally united.

Azure Arc Data Services

Azure Arc Data Services is understood to mean the possibility of running an Azure SQL Managed Instance or a Postgres Hyperscale on “any” Kubernetes cluster; whether this Kubernetes cluster is in your own data center or another cloud does not matter. The main thing is it has to be a CNCF-certified Kubernetes cluster.

However, to operate a managed instance in your own data center, you have to follow several steps … on the one hand, you should turn the Kubernetes cluster into an Azure Arc enabled Kubernetes; on the other hand, you have to have a “connector/agent” on this cluster roll out the data controller. This data controller establishes the interface between Azure, the Azure portal, and the SQL Managed Instance. Now you can deploy and use a fully functional SQL Managed Instance on this cluster in various ways – Azure Portal, Azure CLI, or Azure Data Studio.

Deployment of an Azure SQL Managed Instance on a data controller
on your own Kubernetes cluster

From a personal point of view, it is a great story to have a central point of contact for all services and applications used and thus to use the additional (security-enhancing) features from Azure on all systems. All in all, an improvement for the company!

My slides and further links

DataSaturday-Oslo-2021-Azure-Arc-Data-Services.pdf

Microsoft Documentation

If you want to try out this possibility of the rollout of Azure Data Services yourself, then I can only recommend everyone to deal with Microsoft’s Jumpstart scenarios; here, prefabricated environments are rolled out with which you can then easily test what works how where:

https://azurearcjumpstart.io/overview/

All that remains to say is THANK YOU to all listeners and the organizers; it was a great pleasure and a lot of fun. #CommunityRocks

Why is it called “SQL aus Hamburg” or “Nord DBA”?

Actually, I should have written this post years ago 😉 to shed light on the history or the genesis of this blog. But as the saying goes … it’s never too late.

At that time, I took my first initiatives and activities with the German SQL Server community, attended the first events, and made the first contacts. Unfortunately, I realized that my knowledge of the SQL Server was relatively low even after more than 10 years. I wouldn’t say I liked this myself, and I have set the goal to deal more with the product, basically, exactly the story of why I am active in the community. At the time, I was employed as a database administrator and, together with my colleague Thomas, looked after our customers’ SQL servers. Thomas knew a lot about SQL Server from other areas, and I already knew a lot, so we complemented each other wonderfully. Together we wanted to run this blog and write about our daily experiences, problems, or challenges. Since we both came from Hamburg – in a larger team spread across Germany – and it was our main task – the SQL Server – the search for a name was relatively easy …

Due to my active self-employment at that time, I already had a corresponding webspace, and the domain “SQL-aus-Hamburg.de” was quickly registered. The first blog post wasn’t long in coming, and then it got a little quieter around our blog 🙁

We had probably thought it would be easier, and the work demand grew… after such a long time; I can’t say why we didn’t write anything anymore… we just weren’t that far back then… Thomas always had good ideas and gave me the headlines, so to speak… but unfortunately, he never wrote any post himself (shouldn’t be a reproach!)… and then everything turned out differently… Thomas died unexpectedly and suddenly. So after that, I had to continue this blog under the same name.

For me, this blog is still, as you can see with the last posts on the Azure SQL Database Refresh, simply a notebook in which I review my own experiences for myself so that I might find them at some point or remind me when I was able to solve a specific problem successfully.

Why SQL-aus-HH ???

In social media, you can find me under @SQL_aus_HH; For the German-speaking reader, no problem at all to draw a connection between Hamburg and HH… initially, I had never toyed with the thought that at some point, I would also write articles in English or I even give talks at international events… so here again to clarify for the “foreigners.”

Hamburg is a Hanseatic city and therefore bears the abbreviation HH for “The Hanseatic City of Hamburg” on the German license plate.

As early as the Middle Ages, merchants earned a golden nose with long-distance trade. The Hanseatic League was particularly successful. This is what an alliance of cities and trade associations called itself about 700 years ago. The Hanseatic League was closed to simplify trade between the members, the Hanseatic cities. At that time, numerous tariffs, different currencies, and units of measurement made long-distance trade difficult.
[…]
Ultimately, only Hamburg, Bremen, and Lübeck officially bore the title of Hanseatic city. Hamburg and Bremen have partly retained their Hanseatic independence to this day. Besides Berlin, they are the only German cities that are also separate federal states. Their license plates also indicate the medieval city federation: HH stands for the Hanseatic city of Hamburg and HB for the Hanseatic city of Bremen.

http://stadtgeschichtchen.de/artikel/stadtgeschichte/was-ist-eine-hansestadt/
http://stadtgeschichtchen.de/artikel/stadtgeschichte/was-ist-eine-hansestadt/

Since this blog was supposed to be run by two people at the time, I had also thought about a name for my “personal appearance” or wanted to be a little more “creative” in connection with a favicon for the blog … than “Northern DBA” … please don’t ask me about the reasons/thoughts that led to this name … I can’t say it anymore. Still, this favicon has been around for a good 10 years as a “distinguishing mark” for this blog.

If you have any questions about the genesis of my blog … write to me.

SQL Server High Availability – DBCC 2021 – Part 1

Any one of my customers has asked, or should have asked, “do I need a certain level of high availability for my SQL server or one of my SQL servers?” Regrettably, these deliberations often come to a halt when it comes to the budget… Naturally, such HA-solutions are more expensive, and depending on the solution chosen and the environment’s configuration, appropriate licenses must be made available. However, if one calculates the costs that will occur in the “worst-case” scenario during or as a result of an outage, it should be clear that these costs are reversible.

I had the opportunity to give a speech about it last week at the Data Blaster Community Conference 2021 (thank you, SQLPASS Deutschland), and here is a summary of what I spoke about.

I’ve shown right from the start how high such costs can be, and I’ve done so with the help of a Percona online calculator. The SQL Server goes down for 1-2 hours, 5 employees are assigned to the Restore, each earning $100,000 per year, and 100 additional employees cannot work correctly, earning an average of $50,000 per year. In the following screenshot, you can see the numbers that can occur during an emergency, during recovery, or as a result of the trouble.

Why does SQLServer High Verfuegbarkeit matter? - Percona Calculator

With total costs of about 8.000.000 Euros for a 2 hour SQL Server outage, I’d consider spending an additional 250.000 Euros for the installation to add a second server and equip it with the required licenses. But, as I’ll explain in the following sections, such high availability can be seen in the SQL Server setup dialog, but first, let’s clear up the terminology… 😉

Disaster Recovery vs. High Availability

On wikipedia.de, the term “high availability” is described as follows:

High availability (HA) is a characteristic of a system that aims to ensure an agreed level of operational performance, usually uptime, for a higher than standard period.

https://en.wikipedia.org/wiki/High_availability

In comparison, the concept of disaster recovery.

Disaster Recovery assumes that the primary site is not recoverable (at least for some time) and represents a process of rebuilding data and services to a secondary survived site, which is opposite to restoring back to its original place.

https://en.wikipedia.org/wiki/Disaster_recovery

When it comes to high availability, it’s all about keeping the time of an outage or downtime as low as possible so that all connected systems can keep running as quickly as possible without data loss or manual intervention. Naturally, before the installation, you should think about the process design, the goals to be achieved, and the need.

  • What is the purpose of the business?
  • What would it cost the company if they are unable to work?
  • Is there, on the whole, a compromise between technology and business? Does it really have to be the minimum downtime (<10 seconds), or is it possible that maximum downtimes of less than a minute are sufficient?
  • What is it that technology can accomplish?
  • Knowledge, processes, and all those must all be followed to the letter of the law?

And only after these questions have been answered can you think about the actual solution for high availability and how it can/should be implemented.

The AlwaysOn Failover-Cluster (FCI)

High Availability - SQL Server - AlwaysOn Failover Cluster

We now come to one of the possible solutions for achieving high availability of SQL Servers inside your own datacenter, the Windows Failover Cluster (basically identical to a Linux Failover Cluster, but you should refer to the documentation of the respective distribution and cluster software).

Initially, a Windows Failover Cluster is created from two Windows Servers with the additional installed Failover-Cluster feature, or, to put it another way, both servers are logically (and in AD) connected. Both are familiar with one another and are now aware that they belong together, exchanging several relevant health-status information. This Windows Failover Cluster is given its own IP address and DNS name and a Cluster-Named-Object (CNO) in Active Directory, which later “manages” the cluster.

I can’t say anything about the required storage requirements because it varies depending on the environment… Whether it’s a SAN, NAS, NFS, HCI, or something else, it’s always about a storage system that can give both servers access. Microsoft adds the following:

Contrary to the availability group, an FCI must use shared storage between all nodes of the FCI for database and log storage. The shared storage can be in the form of WSFC cluster disks, disks on a SAN, Storage Spaces Direct (S2D), or file shares on an SMB. This way, all nodes in the FCI have the same view of instance data whenever a failover occurs. This does mean, however, that the shared storage has the potential of being the single point of failure, and FCI depends on the underlying storage solution to ensure data protection.

https://docs.microsoft.com/en-us/sql/sql-server/failover-clusters/windows/always-on-failover-cluster-instances-sql-server?view=sql-server-ver15&WT.mc_id=DP-MVP-5002576#FCIelements

Now you can start installing SQL Servers; on Node 1, you’ll set up the foundation for a SQL Server Failover Cluster (Installation of a Failover-Cluster), while on Node 2, you’ll only add a node to an existing SQL Server Cluster.

Depending on how this cluster is set up – whether it’s only one instance or many – SQL Server licenses must be acquired. You can choose between two operating modes: active/passive or active/active. When using active/passive, SQL Server instances can only run on one cluster node, but they can run on both nodes for a limited time in the event of a failure. When using active/active, all nodes can be used equally and for an unlimited amount of time, which is especially useful when dealing with several instances because it allows some kind of load-balancing.

If a node in the cluster breaks, the second node can no longer communicate with the other side and tries to get the broken services into his side as quickly as possible and start them there.

Unfortunately, a hybrid scenario is not possible due to the lack of scalable resources on-premise and in the cloud. However, a Failover-Cluster can be built in the Azure Cloud as an alternative. To do so, you’ll need a Proximity Placement Group, Managed Disks with the “Shared Storage” feature enabled, and the two virtual machines described above, whose installation and operation are essentially identical.

High Availability - Azure SQL Server Failover Cluster - DBCC2021

More on SQL Server high availability will be covered in a next blog post.

Azure SQL VM – no possibility to configure the disk usage

A customer had asked me the last few days to help him optimize a SQL server on an Azure VM. The customer always had slight performance problems with this SQL server, so he first asked the application manufacturer or supervisor, but they didn’t want to or could help. The only answer they received was that a disk latency of 5 ms is recommended for the DATA disks of the SQL Server… Officially, in Azure, this can only be achieved with ULTRA disks. In this blog post, I don’t want to detail how this might be fulfilled. Just show you an error that I noticed during the analysis.

Starting with a small intro, I talked to the customer and showed him my calculation of what it would cost to equip the server with 2 TB ultra disks and whether it might not be more reasonable and adequate to put the SQL server through its paces and optimize it (also the application or its T-SQL) and in the last step to go to ultra disks if necessary.

The customer agreed, and we have taken the SQL Server under review. Even if they are using a marketplace image from Microsoft (thanks to the product team!), which are already configured very well, there might be some more screws to adjust.

Azure SQL VM and the data disks

If you follow the recommendations for Azure SQL in a virtual machine, you can read the recommendations there:

Stripe multiple Azure data disks using Storage Spaces to increase I/O bandwidth up to the target virtual machine’s IOPS and throughput limits.

https://docs.microsoft.com/en-us/azure/azure-sql/virtual-machines/windows/performance-guidelines-best-practices-storage

In a first effort, we replaced the 1TB disk with several small disks to increase the maximum throughput from 200MB to 625MB. So far, this has been accomplished without any downtime, the customer has been delighted so far. But in this activity, I also came across a message in the Azure portal on this virtual machine or the SQL Server configuration overview, which confused me a little; Google wasn’t a real help.

In the Azure portal, there is the possibility to roll out several extensions for all SQL servers within an Azure VM, which are intended to support the admin-team in managing the VM or the SQL server via Powershell, Azure CLI, or the portal and provide an overview on data from the logs and metrics. Microsoft uses the “SQL Server IaaS Extension Query Service” or “SQL Server IaaS Agent extension” for this purpose. Both collecting data from the virtual machine and the SQL Server and display it in a user-friendly way in the portal. Unfortunately, that was the problem with this customer VM, and these metrics were not 100% available, as the following screenshot shows.

Azure SQL VM - No data in the portal for disk usage

No data on the storage could be determined in the further “Configure” overview, although all extensions were rolled out on the server and did not display any data even after one or more reboots. So I went looking for the cause…

Extension-Log and fn_trace_gettable

As the Microsoft documentation states, there are different modes for different requirements, and of course, you can also notice a local error log, which you can use for troubleshooting. This error log is an XE file and a trace file, neither of which can be read with Notepad only with SQL Server Management Studio… so the easiest way to access the trace file with a T-SQL system function to determine that the system user apparently has no or too few authorisations. Usually, the extension is located directly on the C drive in the directory C:\SQLIaaSExtension, so you have to use this path accordingly in the SQL query.

SELECT * FROM fn_trace_gettable('C:\SQLIaaSExtension\Log\log.trc', default);  
GO  

By executing this T-SQL statement to examine the extension log file, I received the following line, among other things:

alter server role [sysadmin] add member [NT Service\SQLIaaSExtensionQuery]

When seeking for the user, I discovered that the customer had deleted this user – for whatever reason (later I found out that this apparently did not happen “by accident”, on other SQL servers in Azure, the user is also missing)

USE [master]
GO

/****** Object:  Login [NT SERVICE\SqlIaaSExtensionQuery]    Script Date: 5/10/2021 3:35:08 PM ******/
DROP LOGIN [NT SERVICE\SqlIaaSExtensionQuery]
GO

/****** Object:  Login [NT SERVICE\SqlIaaSExtensionQuery]    Script Date: 5/10/2021 3:35:08 PM ******/
CREATE LOGIN [NT SERVICE\SqlIaaSExtensionQuery] FROM WINDOWS WITH DEFAULT_DATABASE=[master], DEFAULT_LANGUAGE=[us_english]
GO

ALTER SERVER ROLE [sysadmin] ADD MEMBER [NT SERVICE\SqlIaaSExtensionQuery]
GO

As a test, I recreated this user, authorized it, and checked in the Azure portal whether there was a change in the representations… Yes, the SQL Server IaaS Agent Extension can now access the SQL Server and read out the necessary and relevant data.

Azure SQL VM - Data in the portal for disk usage