Unraveling the Mystery of Automatic Checkpoints on TempDB

Blogs

Common Problems in Apache Spark(5s) – Part 1
July 31, 2023
SQL Server Security Features
August 13, 2023

Unraveling the Mystery of Automatic Checkpoints on TempDB

This article delves into a significant challenge we encountered while serving one of our largest enterprise clients. During our engagement, we faced a complex issue that demanded careful attention and resolution. In this write-up, we’ll explore the details of this problem and how we navigated through it, shedding light on the valuable insights we gained along the way.

“Have you ever experienced the frustration of your carefully crafted database transactions being unexpectedly rolled back, leaving you scratching your head for answers? In the world of SQL Server, automatic checkpoints on TempDB might just be the stealthy culprit behind these seemingly perplexing rollbacks. Let’s dive into the depths of this intriguing phenomenon and shed light on its impact in the realm of database management.”

Frequently, our customers are engaged in running intensive workloads to process vast volumes of data, generating powerful reports essential for their operations.Additionally, they performed significant processing jobs, adding to the criticality of these tasks.

TempDB in SQL Server serves as the core of the system, responsible for handling all temporary workloads generated by various tasks and processes. Managing the transient data and operations that occur during this critical period. Our customers consistently encounter the same error, which has persisted from several months. “The transaction log for database ‘TempDB’ is full due to ‘ACTIVE_TRANSACTION’.”  And the transactions were getting rollback.

After conducting a thorough analysis of the problem, we discovered that the issue was caused by a configuration change related to the target recovery time, which directly impacted the checkpoint process. This misconfiguration was identified as the root cause of the problem.

The symptoms exhibited by the database when the target recovery time is set to 0 are indicative of using indirect checkpoints.

  • When transaction logs are being processed at a high frequency of fulfillment.
  • When the number of physical reads is more than logical reads.
  • When buffer cache ratio hits less than 60%.

Figure 1

In the previous version of SQL Server 2016, the target recovery time was set to 0, indicating an automatic checkpoint process. However, starting from the 2016 version, the default setting for the target recovery time was changed to 60. When our customer upgraded to SQL Server 2019, the configuration remained unchanged at 0, causing the problem to persist.

After implementing the configuration change from 0 to 60, our customer is no longer encountering the previously observed errors during the quarter-end workload. As a result, all their tasks and processes are running smoothly without any challenge.

Please note that indirect checkpoints operate differently from automatic checkpoints, and their impact on highly transactional databases can vary significantly.

Note: click the link below to get better insights on checkpoints.

https://learn.microsoft.com/en-us/sql/relational-databases/logs/database-checkpoints-sql-server?view=sql-server-ver16

 


thejas.krishna

Leave a Reply

Your email address will not be published. Required fields are marked *