Monitoring Database Mail for SQL Server

Phil Ekins (@SQLPhilUK), Principal Architect

The second blog in this series on Database Mail discussed how to troubleshoot and identify some common issues seen with Database Mail and some options for resolution.

In this blog, we shift our focus to being proactive, and how we can get SQL Server to identify, self-heal, and finally, if needed, notify us when email is down.

So let’s revisit the elements from the second blog in this series with a proactive approach in mind. First and foremost, the three system views we have been leveraging are:

  • dbo.sysmail_allitems
  • sysmail_help_queue_sp
  • .sysmail_help_status_sp

 

They are all limited to being helpful only while being run as sysadmin.

Workarounds to the SysAdmin Requirement

Hopefully your SQL Agent service account does not have that level of access. Instead, we can create procedures with the EXECUTE AS logic. These three procedures should be secured (permissions wise), as sensitive information could potentially be seen via access to the subject line.

Note: we are using procedures and temp tables to process results due to the inability to create a view with the execute as approach.

USE [msdb]
GO

CREATE PROC [dbo].[HoB_sysmail_help_queue_sp]
WITH EXECUTE AS 'dbo'
AS
BEGIN
       EXEC msdb..sysmail_help_queue_sp
END
REVERT
GO

CREATE PROC [dbo].[HoB_sysmail_help_status_sp]
WITH EXECUTE AS 'dbo'
AS
BEGIN
        EXEC msdb..sysmail_help_status_sp
END
REVERT
GO

CREATE PROC [dbo].[HoB_sysmail_allitems] @Filter INT
WITH EXECUTE AS 'dbo'
AS
IF (@Filter = 0)
BEGIN
      SELECT mailitem_id, recipients, send_request_date, sent_status, sent_date
      FROM msdb.dbo.sysmail_allitems
END
IF (@Filter = 1)
BEGIN
      SELECT mailitem_id, recipients, send_request_date, sent_status, sent_date
      FROM msdb.dbo.sysmail_allitems
      WHERE sent_status = 'sent'
END
IF (@Filter = 2)
BEGIN
      SELECT mailitem_id, recipients, send_request_date, sent_status, sent_date
      FROM msdb.dbo.sysmail_allitems
      WHERE (sent_status = 'unsent' OR sent_status = 'retrying')
END
IF (@Filter = 3)
BEGIN
      SELECT mailitem_id, recipients, send_request_date, sent_status, sent_date
      FROM msdb.dbo.sysmail_allitems
      WHERE sent_status = 'failed'
END
REVERT
GO

 

Now we can use these procedures, via a scheduled process, to review if the queue has started, and if not attempt to start it.

Check and Restart the Mail Queue

In examples #1 and #2 below we check, attempt to start, recheck and finally raise an error if the start fails.

Example #1

DECLARE @ResultOutput VarChar(64)
CREATE TABLE #Result ([Status] VarChar(64))

INSERT INTO #Result ([Status]) EXEC msdb..[HoB_sysmail_help_status_sp]

SELECT @ResultOutput=[Status] FROM #Result

IF @ResultOutput = 'STOPPED'
BEGIN
       EXEC msdb.dbo.sysmail_start_sp
END

WAITFOR DELAY '00:00:10'

TRUNCATE TABLE #Result
INSERT INTO #Result ([Status]) EXEC msdb..[HoB_sysmail_help_status_sp]

SELECT @ResultOutput=[Status] FROM #Result

IF @ResultOutput = 'STOPPED'
BEGIN
       RaisError ('Unable to Start Mail Queue', 15, 10)
END

DROP TABLE #Result


Example #2

DECLARE @ResultOutput VarChar(64)
CREATE TABLE #Result ([queue_type] nVarChar(6),
       [length] INT,
       [state] nVarChar(64),
       [last_empty_rowset_time] DateTime,
       [last_activated_time] DateTime)

INSERT INTO #Result ([queue_type], [length], [state], [last_empty_rowset_time], [last_activated_time])
       EXEC msdb..[HoB_sysmail_help_queue_sp]

SELECT @ResultOutput=[queue_type] FROM #Result WHERE [queue_type] = 'mail'

IF @ResultOutput IS NULL
BEGIN
      EXEC msdb.dbo.sysmail_start_sp
END

WAITFOR DELAY '00:00:10'

TRUNCATE TABLE #Result
INSERT INTO #Result ([queue_type], [length], [state], [last_empty_rowset_time], [last_activated_time])
       EXEC msdb..[HoB_sysmail_help_queue_sp]

SELECT @ResultOutput=[queue_type] FROM #Result WHERE [queue_type] = 'mail'

IF @ResultOutput IS NULL
BEGIN
      RaisError ('Unable to Start Mail Queue', 15, 10)
END

DROP TABLE #Result

Check for Failed Emails

We also need to check if we have any failed entries. In a healthy system, the failed count should be zero. In example #3 we are using the prior six hours as our sample, but this can obviously be adjusted as needed.

Example #3

DECLARE @ResultOutput BIGINT
CREATE TABLE #Result ([mailitem_id] INT,
[recipients] VarChar(MAX),
[send_request_date] DateTime,
[sent_status] VarChar(8),
[sent_date] DateTime)

-- passing parameter 3 to return failed emails only
INSERT INTO #Result
EXEC msdb.dbo.HoB_sysmail_allitems 3

SELECT @ResultOutput=COUNT(send_request_date) FROM #Result
       WHERE send_request_date > DateAdd(HOUR,-6,GetDate())

IF @ResultOutput > 0
BEGIN
      RaisError ('Failed eMails Have Been Logged in the Prior 6 Hours', 15, 10)
END

DROP TABLE #Result

 

Check for a Stalled Mail Queue

Finally, in example #4, we check for a stalled queue.

Example #4

DECLARE @ResultOutput BIGINT
CREATE TABLE #Result ([mailitem_id] INT,
       [recipients] VarChar(MAX),
       [send_request_date] DateTime,
       [sent_status] VarChar(8),
       [sent_date] DateTime)

-- passing parameter 2 to return unsent (and retrying) emails only
INSERT INTO #Result
EXEC msdb.dbo.HoB_sysmail_allitems 2

SELECT TOP 1 @ResultOutput=DateDiff(MINUTE,MAX(send_request_date),sent_date) FROM #Result
             GROUP BY send_request_date, sent_date
             ORDER BY MAX(send_request_date) DESC

IF @ResultOutput > 15
BEGIN
      RaisError ('Stalled eMails in Queue for Greater than 15 Minutes', 15, 10)
END

DROP TABLE #Result


How We Can Alert While Mail is Down?

Obviously, we can’t use SQL Server directly to tell us that SQL Server can’t talk to us… so we have to be more creative.

Thankfully, PowerShell also has a simple mail interface that can be called from within SQL, as shown below.

Example #5

$computer = hostname


Send-MailMessage -From "noreply@company.com" `
    -To "DBAGroup@company.com" `
    -Subject "$computer - Database Mail Issues Have Been Detected" `
    -Body "Database Mail Errors Have occurred - Execute these Commands to review
     SELECT TOP 100 * FROM [msdb].[dbo].[sysmail_event_log] ORDER BY 1 DESC
     SELECT TOP 100 * FROM [msdb].[dbo].[sysmail_allitems] ORDER BY 1 DESC" `
    -SmtpServer "smtp.company.com"

Pull It All Together as a Complete Solution

So the final step is to review these examples, create a SQL Agent Job, and configure the on-failure step to run the PowerShell email step.

Here is how that should look to handle the RaisError logic:

The T-SQL used to create this job is listed below.

USE [msdb]
GO

BEGIN TRANSACTION
DECLARE @ReturnCode INT
SELECT @ReturnCode = 0


IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories WHERE name=N'[Uncategorized (Local)]' AND category_class=1)
BEGIN
EXEC @ReturnCode = msdb.dbo.sp_add_category @class=N'JOB', @type=N'LOCAL', @name=N'[Uncategorized (Local)]'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

END

DECLARE @jobId BINARY(16)
EXEC @ReturnCode = msdb.dbo.sp_add_job @job_name=N'Monitor Database Mail',
            @enabled=0,
            @notify_level_eventlog=0,
            @notify_level_email=0,
            @notify_level_netsend=0,
            @notify_level_page=0,
            @delete_level=0,
            @description=N'No description available.',
            @category_name=N'[Uncategorized (Local)]',
            @owner_login_name=N'sa', @job_id = @jobId OUTPUT
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Check Queue',
            @step_id=1,
            @cmdexec_success_code=0,
            @on_success_action=3,
            @on_success_step_id=0,
            @on_fail_action=4,
            @on_fail_step_id=4,
            @retry_attempts=0,
            @retry_interval=0,
            @os_run_priority=0, @subsystem=N'TSQL',
            @command=N'DECLARE @ResultOutput VarChar(64)
CREATE TABLE #Result ([Status] VarChar(64))

INSERT INTO #Result ([Status]) EXEC msdb..[HoB_sysmail_help_status_sp]

SELECT @ResultOutput=[Status] FROM #Result

IF @ResultOutput = ''STOPPED''

BEGIN

EXEC msdb.dbo.sysmail_start_sp

END

WAITFOR DELAY ''00:00:10''

TRUNCATE TABLE #Result

INSERT INTO #Result ([Status]) EXEC msdb..[HoB_sysmail_help_status_sp]

SELECT @ResultOutput=[Status] FROM #Result

IF @ResultOutput = ''STOPPED''

BEGIN

RaisError (''Unable to Start Mail Queue'', 15, 10)

END

DROP TABLE #Result

',
            @database_name=N'msdb',
            @flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Check for Failed emails',
            @step_id=2,
            @cmdexec_success_code=0,
            @on_success_action=3,
            @on_success_step_id=0,
            @on_fail_action=4,
            @on_fail_step_id=4,
            @retry_attempts=0,
            @retry_interval=0,
            @os_run_priority=0, @subsystem=N'TSQL',
            @command=N'DECLARE @ResultOutput BIGINT
CREATE TABLE #Result ([mailitem_id] INT,
       [recipients] VarChar(MAX),
       [send_request_date] DateTime,
       [sent_status] VarChar(8),
       [sent_date] DateTime)

-- passing parameter 3 to return failed emails only
INSERT INTO #Result
       EXEC msdb.dbo.HoB_sysmail_allitems 3

SELECT @ResultOutput=COUNT(send_request_date) FROM #Result
       WHERE send_request_date > DateAdd(HOUR,-6,GetDate())

IF @ResultOutput > 0
BEGIN
       RaisError (''Failed eMails Have Been Logged in the Prior 6 Hours'', 15, 10)
END
DROP TABLE #Result
',
            @database_name=N'msdb',
            @flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Check for Stalled Queue',
            @step_id=3,
            @cmdexec_success_code=0,
            @on_success_action=1,
            @on_success_step_id=0,
            @on_fail_action=4,
            @on_fail_step_id=4,
            @retry_attempts=0,
            @retry_interval=0,
            @os_run_priority=0, @subsystem=N'TSQL',
            @command=N'DECLARE @ResultOutput BIGINT
CREATE TABLE #Result ([mailitem_id] INT,
[recipients] VarChar(MAX),
[send_request_date] DateTime,
[sent_status] VarChar(8),
[sent_date] DateTime)

-- passing parameter 2 to return unsent (and retrying) emails only
INSERT INTO #Result
EXEC msdb.dbo.HoB_sysmail_allitems 2

SELECT TOP 1 @ResultOutput=DateDiff(MINUTE,MAX(send_request_date),sent_date) FROM #Result
GROUP BY send_request_date, sent_date
ORDER BY MAX(send_request_date) DESC

IF @ResultOutput > 15
BEGIN
      RaisError (''Stalled eMails in Queue for Greater than 15 Minutes'', 15, 10)
END

DROP TABLE #Result
',

@database_name=N'msdb',
@flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Send eMail Alert on Failure',
            @step_id=4,
            @cmdexec_success_code=0,
            @on_success_action=1,
            @on_success_step_id=0,
            @on_fail_action=1,
            @on_fail_step_id=0,
            @retry_attempts=0,
            @retry_interval=0,
            @os_run_priority=0, @subsystem=N'PowerShell',
            @command=N'$computer = hostname

Send-MailMessage -From "noreply@company.com" `
    -To "DBAGroup@company.com" `
    -Subject "$computer - Database Mail Issues Have Been Detected" `
    -Body "Database Mail Errors Have occurred - Execute these Commands to review
             SELECT TOP 100 * FROM [msdb].[dbo].[sysmail_event_log] ORDER BY 1 DESC
             SELECT TOP 100 * FROM [msdb].[dbo].[sysmail_allitems] ORDER BY 1 DESC" `
    -SmtpServer "smtp.company.com"
',

              @database_name=N'master',
              @flags=0

IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobschedule @job_id=@jobId, @name=N'Every Hour',
            @enabled=1,
            @freq_type=4,
            @freq_interval=1,
            @freq_subday_type=8,
            @freq_subday_interval=1,
            @freq_relative_interval=0,
            @freq_recurrence_factor=0,
            @active_start_date=20181024,
            @active_end_date=99991231,
            @active_start_time=0,
            @active_end_time=235959
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
COMMIT TRANSACTION
GOTO EndSave
QuitWithRollback:
    IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION
EndSave:
GO

 

Conclusion

As we have show in parts one, two and three of this blog series, while Database Mail is typically a “set it and forget it” process, there are still considerations and proactive steps needed to ensure the health of your Database Mail and, big picture, the functionality of your SQL Server.

Table of Contents

Related Posts