John McCormack DBA

SQL Server Databases and Cloud


Pool breaks to avoid burnout in IT – tsql2sday

12th January 2021 By John McCormack 2 Comments

Avoid burnout in IT

[Image: T-SQL Tuesday logo]

This month’s T-SQL Tuesday is hosted by James McGillivray (b|t) and he says:

Breaks are critical for our mental health. Write a post about relaxation techniques, dream destinations, vacation plans or anything else relating to taking a break for your own mental health.

Covid-19 WFH burnout

When we were first locked down in our homes in 2020, due to the first wave of Coronavirus, I felt like there was not much to do other than work. Football was cancelled, my kids’ activities were cancelled, and we couldn’t go to pubs or restaurants. So I worked. Extra hours were available to me and I just did them, often at my desk from 8-8. I had a small spare bedroom office which suited me fine when I worked from home one day a week, but it wasn’t ideal for spending so much time in.

I don’t know if I reached burnout, but I know I was a bit fatigued. Work was paying the bills, but beyond it I had no appetite to do much else online. Friends and family were meeting by Zoom and I didn’t feel like joining in. My own blogging productivity mostly stopped. I had been wanting to create some training content and I just couldn’t face it. I had an expensive mountain bike gathering dust because I just didn’t want to go outside.

New House

Fast forward to September and we moved house. I moved to a larger house with a dedicated garage/office. I know I’m incredibly lucky to be able to do this, but it also allowed me one more thing: my own pool table. I’ve wanted one forever. Actually, I would have loved a snooker table, but I couldn’t squeeze one into the space.

[Images: outside the garage; John playing pool; upstairs in the office]

My short term escape to avoid burnout in IT

When I’m taking a break, I like to knock a few balls around on the pool table. Having something to do in my own house has been a godsend, with other facilities constantly being shut down to prevent the spread of Covid-19. It’s great to have the option to do this at home.

Relaxation goals

I’ve made it one of my 2021 goals to take 28 days holiday in a warm climate this year. We have an extended family holiday booked for Florida at the end of June. I don’t know if this will happen but I hope it does. I’d also like to spend two other single weeks away somewhere like Tenerife. Having these planned will break up the year and motivate me to work hard the rest of the time.

My wife and son went to Tenerife last January with my sister-in-law and niece. I could have gone, but we knew nothing of coronavirus at the time and I thought I would just work to bring in some money, since I was on a short-term contract. I was going to take a break in March, and also planned on going to London and Lingen for SQLBits and DataGrillen during the year. Missing that trip was my biggest regret of 2020, because I ended up going nowhere.

Even if you can’t go anywhere, it’s still worth taking time off work to help avoid burnout in IT. I just hope this year that I do get to enjoy some sunshine with my family and friends.


Filed Under: Personal, T-SQL Tuesday Tagged With: burnout, personal, pool, relaxation

Firefighting – When your SQL Server is in big trouble

6th January 2021 By John McCormack Leave a Comment

When your SQL Server is in big trouble

It’s so important to stay calm when your SQL Server is in big trouble. I’m talking about when blocking is through the roof, CPU is pinned, and people are yelling from all directions. Staying calm isn’t just a state of mind; you need a process that you work through, that you have practised and know inside out.

Part 1

How bad is it?

[Image: silhouette of a firefighter in front of a blaze]

In this post, I want to describe what I call a priority 2 problem. It is serious and business impacting, but the server is online. However, it needs immediate attention or it could escalate to a P1.

P1 – Business critical system(s) are unavailable
P2 – Serious impairment of functionality on a critical system
P3 – Performance is slower than usual but requests are completing
P4 – Performance is generally sub optimal and should be improved

Get a colleague on comms

One thing that is guaranteed to happen is that people who are not directly involved in fixing the issue will want updates. They have good reason to want updates and are well meaning, but the constant need to reply to emails or Teams messages will seriously impact the speed at which you are working.

Back in the good old pre-Covid days of office working, this could easily be achieved by someone sitting near you and the two of you communicating verbally. With remote working being more and more common now, I recommend having some kind of audio channel open so that you can speak when you need to; silence is OK too. This could be a phone call on speaker, or a Teams call. The technology isn’t the main thing here; the idea is that you can give updates vocally to someone capable of communicating with the wider organisation.

Where possible, your colleague should be technically capable of answering related questions. They should open a chat in Teams or Slack or whatever software your organisation prefers and provide regular updates there. They should answer questions, keep to the update schedule and most importantly, divert people away from you.

A pre-practised routine

Now the scenarios can be different. In this one, I want to talk about how I would handle a SQL Server which appears to be functioning (barely) but is extremely slow; in other words, a priority 2 incident as defined above. Helpdesk supervisors are calling tech support to advise that customers are calling them because the website is unusable. On top of that, the MI team and data scientists are contacting the DBAs directly because their queries won’t run.

Have a script or checklist

In my script, I tend to use well-established community stored procedures; the sort that most DBAs know about and many use. If you start trying to write new queries in the middle of a slowdown, that is going to cost you time. Stick with what works, what is established, and what you have used before and know well.

I’ve mentioned these in previous posts but the main things I am going to run are:

  1. sp_whoisactive – https://github.com/amachanic/sp_whoisactive
  2. sp_blitzfirst – https://www.brentozar.com/first-aid/
  3. sp_blitzlock – https://www.brentozar.com/first-aid/

sp_whoisactive

I always run this first because it is designed specifically for showing you current database activity, and it has saved my bacon so many times in the past. Simply running it parameter-free will show you what is running on your SQL Server right now. The output is ordered by query duration descending, so the long runners are at the top. It can give you blocking information, information on wait types, and details of open transactions as well. For advanced features, you need to use some parameters.
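
For illustration, here is roughly how I might call it when firefighting. This is just a sketch; all of these are documented sp_whoisactive parameters, but check the version you have installed:

EXEC sp_whoisactive
    @find_block_leaders = 1,          -- roll blocked session counts up to the lead blocker
    @get_plans = 1,                   -- include the query plan for each request
    @get_transaction_info = 1,        -- show details of open transactions
    @sort_order = '[start_time] ASC'; -- oldest (longest running) requests first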

sp_blitzfirst

If you can find what you need with sp_whoisactive, you may not even need to use this stored procedure. Where it comes into its own is in telling you when certain things have recently changed or are out of the ordinary.

For example, sp_blitzfirst will tell you:

  1. If the plan cache has been recently erased
  2. If you have high CPU utilization from a program other than SQL Server
  3. How long each wait type has been waiting during the previous 5 seconds

If something stands out that you don’t understand, there will be data in the URL column that you can copy/paste into a web browser. The web page will give you an explanation of the problem.
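
As a rough sketch of how I might call it during an incident (@Seconds and @ExpertMode are both documented parameters of the proc):

-- Sample activity over a 10 second window and include the expert-level detail
EXEC sp_BlitzFirst @Seconds = 10, @ExpertMode = 1;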

sp_blitzlock

This proc is all about deadlocks. You might not need to run it if the first two have given you enough information to fix the issue at hand. However, if deadlocks are at the root of your problems, this will tell you which queries are involved and when. It’s also easier to read than deadlock graphs.

There are more things that I’ll run in the fullness of time, but we’ll leave them for later. I only want to focus on these 3 things for now, to zero in on the immediate problem.

The script

USE dba

/*
	To run this, you also need to install:
	- sp_whoisactive
	- sp_blitzfirst
	- sp_blitzlock
*/

-- Severity 20 raises a fatal error, to stop you if you run the whole script by mistake
RAISERROR ('Don''t run it all at once',20,-1) WITH LOG

-- sp_whoisactive with various parameters
EXEC sp_whoisactive @find_block_leaders = 1 --,@get_locks = 1
EXEC sp_whoisactive @sort_order = 'sql_text' -- Tells at a glance if you have a lot of the same query running. For the F5 report runner troublemakers


-- What has been hurting us in the last 5 seconds. Look for wait stats, and anything out of the ordinary, such as the plan cache has been recently erased.
EXEC dbo.sp_BlitzFirst @expertmode = 1


-- Are we experiencing deadlocks?
EXEC sp_BlitzLock

-- Deadlocks in last hour
DECLARE	@StartDateBlitz datetime = (SELECT DATEADD(HH,-1,GETDATE())), @EndDateBlitz DATETIME = (SELECT GETDATE())
EXEC sp_BlitzLock @EndDate = @EndDateBlitz, @StartDate = @StartDateBlitz
GO



/*  Some other things to consider

	Have the usual optimisation jobs run as expected (stats updates, index maintenance etc.)?
	If one proc has regressed badly, could it help to clear only that plan from the cache, or to recompile it?
	EXEC sp_blitz -- Although not as handy as the others for real time issues. Better used as a health check
	EXEC sp_blitzcache -- More useful for helping you identify the resource hungry queries, allowing you to focus your tuning efforts
	EXEC sp_readerrorlog 0,1, 'memory'
*/
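
On the point above about clearing a single plan rather than the whole cache, here is a minimal sketch. The procedure name is hypothetical; substitute the object you suspect has regressed:

-- Find the cached plan for the suspect procedure
SELECT cp.plan_handle, cp.usecounts, st.[text]
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.[text] LIKE N'%YourRegressedProc%'; -- hypothetical name

-- Mark just that procedure for recompilation on its next run,
-- rather than flushing the entire plan cache
EXEC sp_recompile N'dbo.YourRegressedProc';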

The calm after the storm

I will dedicate a full post to this, but for now it is important to say that slow queries or resource intensive queries should be identified and added to a backlog for fixing. If one particular query is likely to go bad again, fixing it should be treated as a priority, so that we don’t see the same issue escalating again.

You should also do a SQL Server health check, to satisfy yourself that you don’t have a sub-optimal configuration causing you issues.


Related to: When your SQL Server is in big trouble

Locks, blocks and deadlocks in SQL Server
Zero Budget DBA – SQLBITS 2020

Filed Under: front-page, Performance Tuning, SQL Server Tagged With: firefighting, sql server troubleshooting, triage

Locks, blocks and deadlocks in SQL Server

8th December 2020 By John McCormack 1 Comment

Terminology matters: Locks, blocks and deadlocks

I’ve lost count of how many times people have told me there is deadlocking in the database, only to check and see no deadlocks have occurred. In this scenario, it is usually blocking they are trying to describe. As DBAs or developers, it is important to know the difference between locks, blocks and deadlocks.

Read on, or just skip to the video

[Video: Locks, blocks and deadlocks in SQL Server on YouTube]

What are SQL Server locks

Locks are essential for ensuring the ACID properties of a transaction. Various SELECT, DML and DDL commands generate locks on resources. For example, in the course of updating a row within a table, a lock is taken out to ensure the same data cannot be read or modified at the same time. This ensures that only data that is committed to the database can be read or modified. A further update can take place after the initial one, but they cannot be concurrent. Each transaction must complete in full or roll back; there are no half measures.

It should be noted that isolation levels can have an impact on the behaviour of reads and writes, but this is generally how it works when the default isolation level is in use.

Lock types

I don’t want to write a full post about lock types, mainly because the ultimate guide already exists, along with a matrix showing lock compatibility across all possible lock combinations. For a simple explanation of the basics:

  1. If data is not being modified, concurrent users can read the same data.
    1. As long as the isolation level is the SQL Server default (Read Committed).
    2. This behaviour changes, however, if a higher isolation level such as serializable is being used.
  2. If data is being modified, a concurrent select query will have to wait to acquire the shared lock it needs to read the data.

What is blocking

Blocking is the real-world impact of locks being taken on resources, and other lock types being requested which are incompatible with the existing lock. You need to have locks in order to have blocking. In the scenario where a row is being updated, a lock type of IX or X means that a simultaneous read operation will be blocked until the data modification lock has been released. Similarly, data being read blocks data from being modified. Again, there are exceptions to these based on the isolation level used.
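
If you want to see those locks for yourself, here is a minimal sketch using the sys.dm_tran_locks DMV, filtered to the current database:

SELECT request_session_id,
       resource_type,   -- OBJECT, PAGE, KEY etc.
       request_mode,    -- S, X, IX and so on
       request_status   -- GRANT = lock held, WAIT = blocked
FROM sys.dm_tran_locks
WHERE resource_database_id = DB_ID();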

Blocking, then, is a perfectly natural occurrence within SQL Server. In fact, it is vital for maintaining ACID transactions. On a well-optimised system, it can be hard to notice and doesn’t cause problems.

Problems occur when blocking is sustained for a longer period of time, as this leads to slower transactions. A typical connection timeout from a web app is 30 seconds, so anything above this leads to lots of exceptions. Even at 10 or 15 seconds, it can lead to frustrated users. Very long blocking can bring whole servers to a standstill until the lead blockers have cleared.

Identifying blocking

I simply use Adam Machanic’s sp_whoisactive stored procedure. You could use sp_who2 if you absolutely can’t use third-party scripts, but sp_whoisactive is pure T-SQL, so argue your case.

EXEC sp_whoisactive @find_block_leaders = 1

To kill or not to kill

Sometimes you may have no option but to kill spids in order to clear blocking, but it is not desirable. I’m generally a bit happier killing a select query if it is causing blocking, because it won’t result in a DML transaction failing. It might just mean that a report or user query fails.
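
If you do decide to kill, the spid comes from the sp_whoisactive output; the session id below is purely illustrative:

KILL 57;                 -- terminates the session and rolls back its open transaction
KILL 57 WITH STATUSONLY; -- run afterwards to report progress of that rollback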

Multiple identical blockers

If you have multiple blockers and they are all similar or identical, it could mean that an end user is rerunning something that keeps timing out at the app layer. These app timeouts don’t correlate to SQL timeouts, so it can be the case that the user just keeps hitting F5, oblivious that this is making the problem worse. I’m a lot happier killing these spids, but where possible it’s important to tell the end user, so they don’t keep doing the same thing.

It could also be that a piece of code which is called regularly has regressed and no longer completes quickly. You’ll need to fix this or the blocking headache won’t go away.

What are deadlocks?

A deadlock occurs when two or more processes each hold a lock that the other needs, while also waiting on the other process to finish before they can move on. With a scenario like this, something has got to give, or they will be in a stand-off until the end of time. Deadlocks are resolved by SQL Server picking a victim, usually the least expensive transaction to roll back. This is like having one of your blocking queries automatically killed to get things moving again. It’s far from ideal; it leads to exceptions and may mean that some data intended for your database never got there.

How to check for deadlocks

I like to use sp_blitzlock from Brent Ozar’s first responder kit. If I’m in firefighting mode, I’ll just check for the previous hour. You can also pick out deadlocks from the SQL Server Error Log, or you can set up extended events to capture them.

-- Deadlocks in last hour
DECLARE @StartDateBlitz datetime = (SELECT DATEADD(HH,-1,GETDATE())),@EndDateBlitz DATETIME = (SELECT GETDATE())
EXEC sp_BlitzLock @EndDate = @EndDateBlitz, @StartDate = @StartDateBlitz
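
On the extended events option: the built-in system_health session already captures deadlock graphs, so a sketch like this can pull recent ones without any extra setup:

-- Read deadlock reports from the system_health session's event files
SELECT CONVERT(xml, event_data).value('(event/@timestamp)[1]', 'datetime2') AS EventTime,
       CONVERT(xml, event_data).query('(event/data/value/deadlock)[1]') AS DeadlockGraph
FROM sys.fn_xe_file_target_read_file('system_health*.xel', NULL, NULL, NULL)
WHERE object_name = 'xml_deadlock_report';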

Simulating blocking

If you want to simulate blocking, you can try this on the WideWorldImporters sample database.

/*
   Run each of these, in order, in a different SSMS window.
*/
-- Query 1 (This naughty person went to lunch and never committed their update)
BEGIN TRANSACTION
UPDATE [WideWorldImporters].[Sales].[Customers]
SET CustomerName = 'SpinTail Toys (Head Office)'
WHERE customerID  = 1
-- COMMIT
-- Only run the commit above after all the queries have been run and you have observed blocking. Query 2 will finish instantly.

-- Query 2 (I just want my select results, but there is an uncommitted transaction blocking me)
SELECT *
  FROM [WideWorldImporters].[Sales].[Customers]
WHERE customerID  = 1

-- Query 3 (Check the wait_info)
USE DBA

EXEC sp_whoisactive @find_block_leaders = 1

-- You should see a wait type of LCK_M_S on your select query. This means the thread is waiting to acquire a shared lock.

The image below shows the output of the 3 queries side by side. Query 1 completes quickly, but notice it is uncommitted. Query 2 will not complete until Query 1 is committed or rolled back. Running Query 3 (sp_whoisactive) lets you know which spids are causing the blocking and which are being blocked.

[Image: output of the three queries side by side]

I’ve tried to keep this post on locks, blocks and deadlocks focused on the differences. I haven’t gone too technical with isolation levels, wait types or lock compatibility. The post is aimed at newer DBAs and developers, to help them grasp the terminology and understand the distinct purposes of locks, blocks and deadlocks.


Popular posts on johnmccormack.it

How do I set up database mail for Azure SQL DB Managed Instance
Put tempdb files on D drive in Azure IAAS

Filed Under: front-page, Guides, SQL Server Tagged With: blocking, blocks, deadlocks, locking, locks, SQL server, sql server blocking

PowerShell Splatting – What was wrong with backticks?

8th December 2020 By John McCormack Leave a Comment

T-SQL Tuesday

[Image: T-SQL Tuesday logo]

This month’s T-SQL Tuesday is hosted by Lisa Griffin Bohm. (b|t). Lisa asks “This month, I’d like those of you who have presented, or written a presentation, to share something technical THAT DID NOT RELATE to the topic of the presentation, that you’ve learned in writing or giving the presentation.” I’m going to write about how I came across PowerShell Splatting and how it made me better at PowerShell, despite presenting on a cloud topic.

PowerShell splatting

At DataScotland 2019, I did a presentation on AWS RDS for SQL Server. The technical content was about how RDS works, what you can do with it, how to provision it, and so on. As part of my demos, I decided to use AWS PowerShell commands. When I had given this presentation at previous events, I had used the AWS CLI, so I had to update my code examples. I’m OK with PowerShell, but I’m not an expert. I just wanted to show that there were multiple ways to interface with AWS.

My code was full of backticks. You could say I was daft about backticks. I loved them, and thought they made my PowerShell code readable because they stopped the lines from running off the monitor. Then someone asked me why I didn’t use PowerShell splatting. “Whatting?” I asked. I had never heard of splatting.

At the break, I spoke to a couple of people who were more experienced in PowerShell than me. They advised that PowerShell splatting was the way to go for large commands, because it makes the code more readable. More readable was definitely something I was interested in, so I decided to go away and learn about splatting, and update my code for future events.

So what is PowerShell Splatting?

Rather than passing a long list of parameters into a command, you can create a variable in advance to hold those values. The variable is an array or a hash table and includes as many parameters as you need. If you need to pass in parameter names and values, use a hash table; if you just need to pass in a list of parameter values, use an array. Then, when you run the command, you simply pass in that one variable instead of all the individual parameters.

Example command before splatting

New-RDSDBInstance -dbinstanceidentifier "datascotland-posh" -region "eu-central-1" -VpcSecurityGroupId "sg-00a1234g567c3d4ab" `
    -allocatedstorage 20 -dbinstanceclass "db.t2.micro" -engine "sqlserver-ex" `
    -masterusername "rds_name" -masteruserpassword "secure_pw_here" -availabilityzone "eu-central-1a" `
    -port 50000 -engineversion "14.00.3049.1.v1"

Example command after splatting

$HashArguments = @{
    dbinstanceidentifier = "datascotland-posh"
    region               = "eu-central-1"
    VpcSecurityGroupId   = "sg-00a1234g567c3d4ab"
    allocatedstorage     = 20
    dbinstanceclass      = "db.t2.micro"
    engine               = "sqlserver-ex"
    masterusername       = "rds_name"
    masteruserpassword   = "secure_pw_here"
    availabilityzone     = "eu-central-1a"
    port                 = 50000
    engineversion        = "14.00.3049.1.v1"
}
New-RDSDBInstance @HashArguments

At a glance

As you can see, the second example, which uses splatting, is easier to read, and you can pick out the value of each parameter at a glance. It was worth learning in order to make my code more readable and improve my overall PowerShell knowledge.

“Writing this post has just made me realise that I should update my RDS course ↙ as the examples in it don’t use splatting. 🤦‍♂️”

SQL Server on Amazon RDS (Free Course)

Filed Under: front-page, PowerShell, T-SQL Tuesday, Uncategorised Tagged With: powershell, splatting, t-sql tuesday

Your SQL Server indexes can’t always be perfect

27th November 2020 By John McCormack 1 Comment

Perfection is not the aim

Your SQL Server indexes can’t always be perfect. What I mean by that is that you can’t index for every query on a busy SQL Server. In this post, I’m going to describe just one table from a server I have been tuning recently. It had 26 indexes, and that was not unusual for this database. This most likely happened over time as a result of developers creating an index per query that used the table, as well as following some perceived best practices (with the best of intentions).

Did you say 26 indexes?

Yes, one of the heavily used tables on this database had 26 indexes. It had one clustered index on an ID column, which is fairly standard. There were 25 further non clustered indexes, including an index per foreign key. A shared code base runs across several customers. Some customers use all the functionality, some only use parts of it, but they all get the same database schema. And all of those indexes.

Current usage and how to fix

Remove unused indexes

9 of the non clustered indexes on this table were read from less than once per day, but were written to up to 18,000 times per day. The servers had been online for 180 days at the time of checking, and some of the index usage counts were in single or double figures. These needed to be dropped, as they were not bringing anything to the party.

I got this info by running sp_BlitzIndex and comparing the index usage stats against the server uptime. (I took care to run it on all production servers for all customers, including read-only secondaries.)
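
As a sketch of that check (@GetAllDatabases and @Mode are documented sp_BlitzIndex parameters; the uptime comes from a standard DMV):

-- Index inventory with usage details, across every database on the instance
EXEC sp_BlitzIndex @GetAllDatabases = 1, @Mode = 2;

-- How long has the server been up? Usage stats only cover this window.
SELECT sqlserver_start_time FROM sys.dm_os_sys_info;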

Combine similar indexes

This left 16 remaining non clustered indexes to review.

  • 9 of which had been used several million times, with a maximum of 32 million seeks.
    • 5 out of those 9 appeared to heavily overlap other indexes.
      • I needed to see if these could be combined into 1 or 2 indexes.
    • 4 out of those 9 were fairly unique.
      • Those 4 would be retained.
  • 7 had been used hundreds of thousands of times and didn’t appear to have much overlap.

This leaves us with 4 non clustered indexes being retained, 9 being dropped, and 12 to be reviewed in more detail. This is the most difficult stage of the process, as I need to decide which indexes can be merged and which can be dropped. It’s always best to be cautious with this part, because removing the wrong index could slow down an important process.

Of the 12 indexes that needed a more detailed look, I decided to drop 8 and create 3 new ones: a net reduction of 5 indexes for this stage. Multiple groups of indexes had the same keys but different include columns. Some of the include column ranges were very wide and included large data types such as large NVARCHARs. The replacement indexes used the same keys, in the same order, but had a more targeted range of include columns.
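
To illustrate the pattern, here is a hypothetical example; none of these object names come from the real system:

-- Two overlapping indexes: identical keys, different and wide include lists
DROP INDEX IX_Orders_CustomerID_A ON dbo.Orders;
DROP INDEX IX_Orders_CustomerID_B ON dbo.Orders;

-- One consolidated replacement: same keys, same order, targeted includes only
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID_Consolidated
ON dbo.Orders (CustomerID, OrderDate)
INCLUDE (OrderStatus, TotalDue); -- no wide NVARCHAR columns carried along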

Perfection? No. Better? Yes.

In total, we have a net reduction of 14 non clustered indexes, which still leaves us with 11. Your SQL Server indexes can’t always be perfect, but 11 is better than 25. I’d really have loved to get that count down to 5, but it is a lot better than it was. This process should be repeated after the new index structure has been in place for a month, with the aim of further consolidation if possible. The new indexes won’t be perfect; some queries may not be served as well by the new consolidated indexes, or by having a rarely used index removed, but I was trying to improve the overall health of the SQL Server. Reducing the number of writes that were hammering the storage is a step in the right direction.

How did we get so many indexes?

Index creation was all developer-led, so when new functionality was rolled out, indexes were created for each new process. This resulted in a lot of overlapping indexes. I don’t blame the developers; there should just have been a process for reviewing index requests and comparing them to what we already had.

How to avoid in future

I see two key approaches to helping avoid a repeat of this death by indexing.

  1. Don’t hoard the analysis. Sharing the results with the development team will help increase awareness of this death by indexing scenario.
  2. Implement an approval process for pull requests, where a DBA is in the approvers list if the change includes a new index.

Verify improvements

Index changes can affect multiple queries that all use the same table, so testing just one query or stored procedure is not enough to release with confidence. Where possible, use a load testing environment which mirrors production, and replay a repeatable workload. By baselining a typical workload, then making the changes and taking a new benchmark, you can measure the improvements overall and for specific database calls.

With this client, there was a load testing environment with synthetic API calls to the application that attempted to mirror production. Not everyone has this luxury when testing, but capturing and replaying a workload, and measuring the differences, is vital for this type of performance tuning. It can help you catch any serious regressions, as well as giving you the confidence that your performance will improve.

If you don’t already have a suitable load testing environment, the tool I recommend to get you going is an open source project called WorkloadTools by Gianluca Sartori.

More handy links

A successful performance tuning project
How DBATools can help with performance tuning

Filed Under: front-page, Guides, Performance Tuning Tagged With: indexes, Performance tuning, SQL Performance, SQL server
