SQL Server: Tips and Tricks - 2 (2017)
How to Find Percentage of Null Values in every Column of all the Tables in SQL Server Database
Scenario:
You are working as SQL Server / TSQL Developer in Auto Insurance Company. You are working on data validation and data analysis for your project. You got this requirement in which you need to find the Percentage of Null values in every column for all the tables in SQL Server Database.
This can be really helpful in analysis so you would know how much data you are really getting or you are getting no values (Null) and plan the storage. Also depending upon the analysis you might find some records those should not be Null but you have receive Null for those Columns, so data corrections can be next step.
Solution:
We are going to use Cursor to find percentage and Null Row Count in each of the column in all the tables in SQL Server Database. This process can take long time to calculate as it will be running for each column in database if Column allows Nulls. If Column is Not Nullable, we don't need to check that columns as nobody can insert null values in those columns.
I will suggest to run the script on small database and take the backup of production database and restore to UAT and run query on database in UAT. In case your database is small, you might be fine. Again you don't want to hurt performance of running processes, you might want to schedule to run after hours.
USE YourDBName
GO
DECLARE @DatabaseName VARCHAR(100)
DECLARE @SchemaName VARCHAR(100)
DECLARE @TableName VARCHAR(100)
DECLARE @ColumnName VARCHAR(100)
DECLARE @FullyQualifiedTableName VARCHAR(500)
--Create Temp Table to Save Results
IF OBJECT_ID('tempdb..#Results') IS NOT NULL
DROP TABLE #Results
CREATE TABLE #Results (
DatabaseName VARCHAR(100)
,SchemaName VARCHAR(100)
,TableName VARCHAR(100)
,ColumnName VARCHAR(100)
,TotalTableRowCount INT
,NullRecordCount INT
)
DECLARE Cur CURSOR
FOR
SELECT C.Table_CataLog
,C.Table_Schema
,C.Table_Name
,C.Column_Name
,'[' + C.Table_CataLog + ']' + '.[' + C.Table_Schema + '].'
+ '[' + C.Table_Name + ']' AS FullQualifiedTableName
FROM information_schema.Columns C
INNER JOIN information_Schema.Tables T ON C.Table_Name = T.Table_Name
AND T.Table_Type = 'BASE TABLE'
and C.IS_Nullable='YES'
OPEN Cur
FETCH NEXT
FROM Cur
INTO @DatabaseName
,@SchemaName
,@TableName
,@ColumnName
,@FullyQualifiedTableName
WHILE @@FETCH_STATUS = 0
BEGIN
DECLARE @SQL VARCHAR(MAX) = NULL
SET @SQL = ' Select ''' + @DatabaseName + ''' AS DatabaseName, '''
+ @SchemaName + ''' AS TableName,
''' + @TableName + ''' AS SchemaName,
''' + @ColumnName + ''' AS ColumnName,
(Select count(*) from ' + @FullyQualifiedTableName + ' with (nolock))
AS TotalTableRowCount,
count(*) as NullRecordCount from ' + @FullyQualifiedTableName
+ 'with (nolock) Where ' + @ColumnName + ' IS NULL'
--Print @SQL
INSERT INTO #Results
EXEC (@SQL)
FETCH NEXT
FROM Cur
INTO @DatabaseName
,@SchemaName
,@TableName
,@ColumnName
,@FullyQualifiedTableName
END
CLOSE Cur
DEALLOCATE Cur
SELECT *,
Cast((NullRecordCount/Cast(TotalTableRowCount as Numeric(13,1)))*100 AS Numeric(4,1))
AS NullPercentColumnValues
FROM #Results
where NullRecordCount <>0
--drop table #Results
Here are my results when I ran above query on one of my database, it will return you DatabaseName, SchemaName, TableName, ColumnName, TotalTableRowCount, NullRecordCount and Null Percentage Column Values as shown below.