Automating Restoration of Transaction Log Backups in SQL Server Using Python
Introduction:
In the world of database management, ensuring the integrity and consistency of data is crucial. One key aspect of this is regularly backing up transaction logs, especially in scenarios where data is frequently updated. In this technical blog, we’ll explore how to automate the process of applying transaction log backups to a SQL Server using Python. The provided Python script utilizes the pyodbc library to connect to SQL Server instances, pandas for data manipulation, and subprocess for executing SQL commands.
Scenario:
The client have a 4 node Always on Availability group cluster, out of this 4 nodes 1 node is in synchronous commit mode and other 2 nodes are in asynchronous commit mode. Copy only backup is generated using secondary node, there is one full backup every 24 hours and transactional log backup every 15 seconds.
Challenge:
To add any new node using backup the LSN number of primary should match with LSN number of new node, for that the backups should be restored in same manner in which it was generated. The main challenge here is to restore all the transactional log backup in a automated way.
Approach:
Prerequisites:
Python Script Overview:
The provided Python script is divided into several sections:
1. Connecting to Source and Destination Servers:
The script establishes connections to both the source and destination SQL Server instances using pyodbc. Ensure that the connection details (server name, database name, username, and password) are correctly configured for your environment.
2. Functions for Getting Last LSN and File List:
Two functions, get_lsn and get_file_list, are defined to retrieve the last Log Sequence Number (LSN) applied to the destination server and to get a list of transaction log backup files, respectively. The LSN is crucial for filtering and applying the correct transaction log backups.
3. Function for Restoring Log Backups:
The restore_backup function iterates through the transaction log backup files, executing SQL Server commands to restore each file to the destination database.
4. Execution Loop:
The script includes a loop that continually checks for new transaction log backups, restores them, and prompts the user whether to continue running the script.
Note: Before implementing the script, ensure you have the necessary permissions and understand the implications of automating transaction log backups.
Python code:
##---------------------------------------------------------------------------------------------- import pyodbc import pandas as pd import numpy as np from pathlib import Path import os import subprocess import shutil ##---------------------------------------------------------------------------------------------- ## There will be two servers first from where we will be referring the file and second server where we will be restoring the files. ## Connection string to the source server server1 = 'servername' # Replace with 127.0.0.1,21443 for production and start tunneling for the same database1 = 'master' # For restoration master database will work username1 = 'test' # Specify the user which has appropriate access password1 = 'password1###' # Establishing a connection with database cnxn1 = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER='+server1+';DATABASE='+database1+';UID='+username1+';PWD='+ password1) cursor1 = cnxn1.cursor() ## Checking connection to source query='select @@servername' df=pd.read_sql_query(query,cnxn1) print(df) ##---------------------------------------------------------------------------------------------- ## Connection string to the destination server server2 = 'servername' # For destination server, we can give server name no need to do tunneling as it is local server. database2 = 'master' username2 = 'sa' password2 = 'password2####' # Establishing a connection with database cnxn2 = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER='+server2+';DATABASE='+database2+';UID='+username2+';PWD='+ password2) cursor2 = cnxn2.cursor() ## Checking connection to source query='select @@servername' df=pd.read_sql_query(query,cnxn2) print(df) ##---------------------------------------------------------------------------------------------- ## Function to get last LSN which has been applied to destination server def get_lsn(db_name): get_lsn=(''' with cte as (SELECT s.server_name,d.name AS database_name,m.physical_device_name,CASE m.device_type WHEN 2 THEN 'Disk' WHEN 102 THEN 'Backup Device (Disk)' WHEN 5 THEN 'Tape' WHEN 105 THEN 'Backup Device (Tape)' WHEN 7 THEN 'Virtual Device' END AS device_type ,CAST (s.backup_size / 1048576.0 AS FLOAT) AS backup_size_mb ,CAST (s.compressed_backup_size / 1048576.0 AS FLOAT) AS compressed_backup_size_mb ,s.backup_start_date,s.first_lsn,s.backup_finish_date,s.database_backup_lsn ,s.last_lsn,s.checkpoint_lsn ,CASE s.[type] WHEN 'D' THEN 'Database (Full)' WHEN 'I' THEN 'Database (Differential)' WHEN 'L' THEN 'Transaction Log' WHEN 'F' THEN 'File or Filegroup (Full)' WHEN 'G' THEN 'File or Filegroup (DIfferential)' WHEN 'P' THEN 'Partial (Full)' WHEN 'Q' THEN 'Partial (Differential)' END AS backup_type ,s.recovery_model,ROW_NUMBER () OVER (PARTITION BY s.database_name, s.database_backup_lsn ORDER BY s.backup_start_date) AS Row FROM msdb.dbo.backupset s INNER JOIN (SELECT database_name, MAX(backup_set_id) AS max_backup_set_id FROM msdb.dbo.backupset WHERE [type] = 'D' GROUP BY database_name ) t ON s.database_name = t.database_name and s.backup_set_id >= t.max_backup_set_id INNER JOIN msdb.dbo.backupmediafamily m ON s.media_set_id = m.media_set_id RIGHT OUTER JOIN sys.databases d ON s.database_name = d.name AND s.recovery_model = d.recovery_model_desc COLLATE SQL_Latin1_General_CP1_CI_AS ) select max(first_lsn) from cte where database_name='{}' '''.format(db_name)) #print(subquery) lsn=cursor2.execute(get_lsn) val=(lsn.fetchone())[0] return(val) ##---------------------------------------------------------------------------------------------- ## Function to get list of files in dataframe def get_file_list(lsn): query=''' with cte as (SELECT bs.database_name, bs.first_lsn, bs.last_lsn , database_backup_lsn, -- For tlog and differential backups, this is the checkpoint_lsn of the FULL backup it is based on. checkpoint_lsn, backuptype = CASE WHEN bs.type = 'D' AND bs.is_copy_only = 0 THEN 'Full Database' WHEN bs.type = 'D' AND bs.is_copy_only = 1 THEN 'Full Copy-Only Database' WHEN bs.type = 'I' THEN 'Differential database backup' WHEN bs.type = 'L' THEN 'Transaction Log' WHEN bs.type = 'F' THEN 'File or filegroup' WHEN bs.type = 'G' THEN 'Differential file' WHEN bs.type = 'P' THEN 'Partial' WHEN bs.type = 'Q' THEN 'Differential partial' END + ' Backup', CASE bf.device_type WHEN 2 THEN 'Disk' WHEN 5 THEN 'Tape' WHEN 7 THEN 'Virtual device' WHEN 9 THEN 'Azure Storage' WHEN 105 THEN 'A permanent backup device' ELSE 'Other Device' END AS DeviceType, bms.software_name AS backup_software, bs.recovery_model, bs.compatibility_level, BackupStartDate = bs.Backup_Start_Date, BackupFinishDate = bs.Backup_Finish_Date, LatestBackupLocation = bf.physical_device_name, backup_size_mb = CONVERT(DECIMAL(10, 2), bs.backup_size / 1024. / 1024.), compressed_backup_size_mb = CONVERT(DECIMAL(10, 2), bs.compressed_backup_size / 1024. / 1024.), begins_log_chain, bms.is_password_protected FROM msdb.dbo.backupset bs LEFT JOIN msdb.dbo.backupmediafamily bf ON bs.[media_set_id] = bf.[media_set_id] INNER JOIN msdb.dbo.backupmediaset bms ON bs.[media_set_id] = bms.[media_set_id] ) select database_name,first_lsn,last_lsn,database_backup_lsn,backuptype,BackupStartDate,BackupFinishDate,LatestBackupLocation from cte where first_lsn>{} and backuptype='Transaction Log Backup' and database_name='sample_bkp' '''.format(lsn) df1=pd.read_sql_query(query,cnxn1) return(df1) ##---------------------------------------------------------------------------------------------- def restore_backup(server2,username2,password2,database2): b=[] for i in df.LatestBackupLocation[:5]: bak='''RESTORE LOG sample_bkp FROM DISK = '{}' WITH NORECOVERY'''.format(i) print(bak) aa=subprocess.check_output('''sqlcmd -S {} -U {} -P {} -d {} -Q "{}"'''.format(server2,username2,password2,database2,bak)) b.append(aa) return(b) ##---------------------------------------------------------------------------------------------- ## Execution iteration=0 a=1 while a==1: lsn=get_lsn('sample_bkp') #get_lsn function will fetch lsn number which will be used as filter for next step print(lsn) df=get_file_list(lsn) #get_file_list function will get list of all directory list in dataframe #df.head() print(df.shape) result=restore_backup(server2,username2,password2,database2) #restore_backup function restores log backup to destination database print(result) lsn1=get_lsn('sample_bkp') #get_lsn function will fetch lsn number which will be used as filter for next step print(lsn1) df1=get_file_list(lsn1) #get_file_list function will get list of all directory list in dataframe #df.head() print(df1.shape) new_rows=df1.shape[0] print("We got {} new transaction log backup files".format(new_rows)) a=int(input("Will you continue running the script, if yes enter 1 else press any other key: ")) iteration+=1 # Default is 0 and with every iteration it will if a==1: print(''' This will be the {} iteration '''.format(iteration)) continue else: print('Exiting the script....................') break ##---------------------------------------------------------------------------------------------- ## END ##----------------------------------------------------------------------------------------------
Naveen Kumar Singh