Archives for category: SQL Server

I was attempting to create a modularized approach to a set of calculations in SQL Server.  The result was the dreaded “Nested stored procedure” error.  I had to use stored procedures instead of functions because I was using a cursor.

I figured that I could resolve this issue pretty easily if I could only pass in a TABLE variable to one of my stored procedures.  At first glance, this did not appear to be a possibility until I discovered TYPES.

TYPES are one of those features that is not readily apparent in SQL Server, because I cannot see them in the SQL Server Manager.  For my purposes, a TYPE is like a named table variable in SQL Server.  And because it is a named variable, you can pass it into a stored procedure just like any other variable.

Here is an example of creating a TYPE in SQL:

CREATE TYPE MonthlyTimesheetEntries AS TABLE
(
timesheet_id int identity,
month_date datetime,
consultant_id int,
consultant_name varchar(50),
billing_code int,
billing_code_name varchar(50),
entry_date datetime,
total_hours money,
billing_rate money,
total_amount money
)

If you need to add or modify any of the fields defined in the type, you can simply drop the TYPE and recreate it, assuming you don’t have any dependencies.

DROP TYPE MonthlyTimesheetEntries

Please Note:  Once you pass the TYPE into a stored procedure, you can no longer DROP it.  To work around this, I backup my stored procedure, delete it.  Drop and recreate the type, and then recreate my stored procedure.

Once you have created the TYPE, you can use it as a stored procedure parameter:

// you must pass it in as read only

CREATE PROCEDURE [dbo].[sCalculateMonthlyInvoiceAmount]
@month_date datetime,
@timesheet_entries ManagedServiceMonthlyTimesheetEntries READONLY
AS
BEGIN

END

And you can simply populate and pass in the TYPE like this:

// populate type

DECLARE @timesheet_entries MonthlyTimesheetEntries
DECLARE @month_date datetime

SET @month_date = ‘8/1/2012’

INSERT INTO @timesheet_entries
(

     timesheet_id,
month_date,
consultant_id,
consultant_name,
billing_code,
billing_code_name,
[entry_date],
[total_hours],
[billing_rate],
total_amount
)
EXEC sGetTimesheetEntries @month_date

// pass in type

EXEC sCalculateMonthlyInvoiceAmount @month_date, @timesheet_entries

Advertisements

In some cases, I like to use SQL Server Functions to encapsulate some of my SQL Server logic into reusable and maintainable components.

Recently, I have found the use of Table-valued Functions to be extremely helpful.  With Table-valued functions, you can return multiple values, which is useful enough in itself, or even entire result sets.

The following is an example of creating a simple Table-valued Function that prorates a monthly amount based on the number of days which have elapsed.  Notice that the return value is a TABLE variable.

CREATE FUNCTION [dbo].[fnProrateMonthlyAmount]
(
@Month datetime,            — month for which the proration will be calculated
@DurationStart datetime,    — start date of the proration duration
@DurationEnd datetime,      — end date of the proration duration
@MonthlyAmount money,       — monthly amount to be prorated
@MonthlyHours money         — monthly hours to be prorated
)

RETURNS @RtnValue table
(
TotalDays int,             — total days in month
TotalAmount money,         — total monthly amount
ProratedDays int,          — number of prorated days
HoursPerDay money,         — number of hours per day
ProratedHours money,       — number of prorated hours
AmountPerDay money,        — dollar amount per day
ProratedAmount money       — prorated dollar amount
)
AS
BEGIN

    — declare variables
DECLARE @MonthEnd datetime
DECLARE @TotalDays int, @ProratedDays int
DECLARE @HoursPerDay money, @ProratedHours money
DECLARE @AmountPerDay money, @ProratedAmount money

— initialize variables
SET @Month = DATEADD(DAY, -(DATEPART(DAY, @Month)-1), @Month)
SET @MonthEnd = DATEADD(DAY, -1, DATEADD(MONTH, 1, @Month))
SET @TotalDays = DATEPART(dd, @MonthEnd)
SET @ProratedDays = DATEDIFF(DAY, @DurationStart, @DurationEnd) + 1
SET @AmountPerDay = ROUND(@MonthlyAmount/@TotalDays, 2)
SET @HoursPerDay = ROUND(@MonthlyHours/@TotalDays, 2)
SET @ProratedAmount = @MonthlyAmount
SET @ProratedHours = @MonthlyHours
IF @TotalDays > @ProratedDays
BEGIN
SET @ProratedAmount = @ProratedDays * @AmountPerDay
SET @ProratedHours = @ProratedDays * @HoursPerDay
END

— return results
INSERT INTO @RtnValue
(
TotalDays,
TotalAmount,
ProratedDays,
HoursPerDay,
ProratedHours,
AmountPerDay,
ProratedAmount
)
SELECT
@TotalDays,
@MonthlyAmount,
@ProratedDays,
@HoursPerDay,
@ProratedHours,
@AmountPerDay,
@ProratedAmount

Return
END

 

I can simply render the results of the function like this:

SELECT * FROM
dbo.fnProrateMonthlyAmount
(
‘7/1/2012’,
‘7/1/2012’,
‘7/31/2012’,
10000,
50
)

Which gives me this output:

image

Or I can stuff the results of the function into a TABLE variable or SQL TYPE:

DECLARE @prorate TABLE
(
TotalDays int,
TotalAmount money,
ProratedDays int,
HoursPerDay money,
ProratedHours money,
AmountPerDay money,
ProratedAmount money
)

INSERT INTO @prorate
(
TotalDays,
TotalAmount,
ProratedDays,
HoursPerDay,
ProratedHours,
AmountPerDay,
ProratedAmount
)
[dbo].[fnManagedServicesProrateAmount](@Month, @DurationStart, @DurationEnd, @MonthlyRevenue, @MonthlyHours)

 

And if I know that the results are a single record with multiple values, and I want to work with these values to further my calculations, I can stuff the result directly into variables:

SELECT TOP 1
@TotalDays = TotalDays,
@TotalAmount = TotalAmount,
@ProratedDays = ProratedDays,
@HoursPerDay = HoursPerDay,
@ProratedHours = ProratedHours,
@AmountPerDay = AmountPerDay,
@ProratedAmount = ProratedAmount
FROM
[dbo].[fnManagedServicesProrateAmount](@Month, @DurationS, @DurationEnd, @MonthlyRevenue, @MonthlyHours)

I will admit that sometimes I have been known to use temporary tables rather than table variables, because of what I thought were limitations of the table variable. Turns out, table variables can do almost anything that a temporary table can do. Plus, they have the added bonus of having better performance, especially if they are used in a stored procedure. The reason for this is that a stored procedure that contains a temporary table cannot be precompiled. A precompiled script typically executes must faster. Also, while table variables exist only in the same scope in which they are created, they can be returned and passed into a stored procedure by creating a User Defined Type. So the only major limitation of a table variable versus a temporary table is that transaction logs are not recorded for table variables. Therefore, you cannot use them in the scope of a transaction.  One other difference, of course, is that table variables automatically go out of scope, while temporary tables must be explicitly DROPPED.

One thing I must admit that it took me a while to figure out is how to join with a table variable. Seems like it would be easy. For example, support I want to update a temporary table with the results of a subquery:

UPDATE @Users SET billable = (SELECT TOP 1 au.billable FROM application_user au WHERE au.[user_id] = @Users.[user_id]) WHERE u.billable IS NULL

The example above results in an error on the highlighted line stating that I must declare @Users. One simple change to this statement resolve the issue very easily.

UPDATE @Users SET billable = (SELECT TOP 1 au.billable FROM application_user au WHERE au.[user_id] = u.[user_id]) FROM @Users u WHERE u.billable IS NULL

This was definitely one of my DUH moments!  Smile

I have a set of records for employee utilization, and I wanted a quick way to rank them from highest utilization to lowest utilization. It turns out that the ROWNUMBER() function did exactly what I needed.

Here is an example of my data:

ConsultantName UtilizationPercent
Consultant A 100.5%
Consultant B 87.4%
Consultant C 95.0%
Consultant D 101.2%

The following query can be used to return the results of my data by ranking:

SELECT
ConsultantName
UtilizationPercent,
ROW_NUMBER() OVER (ORDER BY UtilizationPercent DESC) AS ConsultantRanking
FROM
ConsultantUtilization
ORDER BY
ConsultantRanking

This query returns the following:

ConsultantName UtilizationPercent ConsultantRanking
Consultant D 101.2% 1
Consultant A 100.5% 2
Consultant C 95.0% 3
Consultant B 87.4% 4

I am starting to appreciate the absolute awesomeness of table variables. Rather than using those dreaded cursors, I use a temporary working table and use update statements to apply my calculations. The performance benefits are incredibly tangible.

Table variables are very powerful and can do just about anything a temporary table can do. The only major difference is that once a table variable is declared, it cannot be altered. In addition, all indexes must use unique constraints.

Declaring a Table Variable with a Primary Key

DECLARE @work TABLE
(
consultant_id int primary key,
consultant_name varchar(50),
month_date datetime,
project_id int,
project_name varchar(255),
total_hours float
)

Declaring a Table Variable with a Composite Primary Key

DECLARE @work TABLE
(
consultant_id int,
consultant_name varchar(50),
month_date datetime,
project_id int,
project_name varchar(255),
total_hours float
primary key (month_date, consultant_id, project_id)
)

Declaring a Table Variable with a Composite Unique Index

DECLARE @work TABLE
(
id int identity primary key,
consultant_id int,
consultant_name varchar(50),
month_date datetime,
project_id int,
project_name varchar(255),
total_hours float
unique clustered (id, consultant_name)
)

Have you ever tried to add a column to a SQL Server Table using SQL Server Management Studio, only to be confronted with the following error message:

Saving changes is not permitted. The changes you have made require the following tables to be dropped and re-created. You have either made changes to a table that can’t be re-created or enabled the following Prevent saving changes that require the table to be re-created.

Not only that, but it forces you to cancel out of the dialog and you lose all of the changes you made. *Not cool*

While it is a best practice to always use TSQL when creating or altering the structure of Tables (so that you can save your scripts), it would be convenient on occasion to allow database developers to make small structural changes using SQL Server Management Studio.

Turns out, allowing a table to be dropped and recreated to alter its structure is an option you can configure in SQL Server Management Studio.

  1. Launch SQL Server Management Studio
  2. Choose Tools à Options
  3. On the left, expand Designers and select Table and Database Designers
  4. Uncheck the checkbox entitled Prevent saving changes that require table re-creation

  1. Click OK
In my never ending quest to completely phase out SQL cursors, I am truly discovering some truly awesome features about SQL Server that I have never looked deeply enough before to consider.
My requirement: I need to determine the total value for a project based on a set of monthly deliverables. These deliverables each contain a start and end date, a monthly revenue value, and a monthly set of hours.
For example:
Deliverable
Monthly Hours
Monthly Revenue
Contract Start
Contract End
Deliverable 1
20
$ 2,000
1/1/2012
10/7/2012
Deliverable 2
80
$ 8,400
10/8/2012
12/31/2012
Deliverable 3
1
$ 500
10/8/2012
10/7/2013
What makes this interesting is that for those months in which the contract either doesn’t start on the 1st or end of the last day of the month, the Monthly Revenue must be prorated out: Total Days/Revenue Per Day
I created the following Table Valued function to handle the calculations:
FUNCTION [dbo].[fnCalculateManagedServicesDeliverableTotals]
(
    @deliverable_id int,
    @published int
)
RETURNS @RtnValue table
(
    deliverable_id int,
    published int,
    total_hours money,
    total_revenue money
)
My goal, as I said, is to calculate the total project value. Before I gave up cursors, I would have simply created a cursor that iterated each project deliverable and updated some variables with the results of my Table Valued function. I wanted to accomplish the same thing WITHOUT USING A CURSOR.
The first thing I did was to create and populate a table variable to hold my project deliverables:
DECLARE @deliverables TABLE
(
    deliverable_id int
)
I then created another table variable to hold my results:
DECLARE @work TABLE
(
    deliverable_id int,
    total_amount money,
    total_hours money
)
This is where it got a little sticky for me. I wanted to insert into my @work table the deliverables from my @deliverables table, along with the results of my Table Valued function.
Initially I tried this and got a SQL syntax error: L
INSERT INTO @work
(
    deliverable_id,
    total_amount,
    total_hours
)
SELECT
    pd.deliverable_id,
    d.total_revenue,
    d.total_hours
FROM
    @deliverables AS pd
    INNERJOIN dbo.fnCalculateManagedServicesDeliverableTotals(pd.deliverable_id, @published)AS d
WHERE
    pd.deliverable_id = d.deliverable_id
It turns out that this is entirely possible, as long as you are not passing in a value or variable to the table function as parameter, but NOT a column from the joining table.
SQL Server does have a solution for this called CROSS APPLY. If you use CROSS APPLY for INNER JOINS and OUTER APPLY for LEFT OUTER JOINS, then you have the ability to create a join between two table valued expressions, which in my case is a TABLE VARIABLE and the results of a TABLE VALUED FUNCTION.
INSERT INTO @work
(
    deliverable_id,
    total_amount,
    total_hours
)
SELECT
    pd.deliverable_id,
    d.total_revenue,
    d.total_hours
FROM
    @deliverables AS pd
    CROSSAPPLY dbo.fnCalculateManagedServicesDeliverableTotals(pd.deliverable_id, @published)AS d
WHERE
    pd.deliverable_id = d.deliverable_id
Here are my results:

I find that I am consistently trying to determine the first and last day of the current month using Transact SQL. The following is a cheat sheet for calculating easily those dates:

DECLARE@todayDATETIME,@first_of_month DATETIME,@last_of_month DATETIME

 

SET@today=GETDATE()

SET@first_of_month=DATEADD(dd,-(DATEPART(dd, @today)-1),@today)

SET@last_of_month=DATEADD(dd,–1,DATEADD(mm, 1, @first_of_month))

 

PRINT@today

PRINT@first_of_month

PRINT @last_of_month

When you install the Microsoft SQL Server Management Studio, the default settings do not allow you to save any changes to the structure of your tables that would cause the tables to be dropped and recreated, which can be extremely limiting when you are needing to make a quick change using the interface.

The good news is that it is very easy to change this setting by following these steps:

  1. Open SQL Server Management Studio
  2. From the file menu, choose Tools à Options
  3. From the left menu, choose Designers
  4. Uncheck the box entitled Prevent saving changes that require table re-creation
  5. Press OK to save

That’s it!

There are two basic techniques for updating one table with information from another, in this example, we are updating an OLAP Dimension table with the results from an OLTP table.
 
Flush and Fill Technique
The first and easiest method is called the “flush and fill” technique. Basically, you remove all of the data from the Dimension table and fill it with data from an OLTP table. This is the simplest method by far, but can be extremely inefficient if we’re talking about tables with a lot of rows. The only thing you need to decide here is whether you are going to use the DELETE command or the TRUNCATE command.
 
DELETE COMMAND
Did you know that if you run the following command…
 
DELETE FROM [dbo].[DimDeliverable]
 
…and the CUSTOMER table contains 5,000, that the command will be executed 5,000 times, once for each row, because it deletes them one-by-one? Also, if you use the DELETE command, your IDENTITY key does not automatically RESEED itself. You would also need to explicitly reseed the identity field…
DBCCCHECKIDENT(‘DimDeliverable’,RESEED)
…or use this statement to reseed the identity field to start at a specific value.
DBCC CHECKIDENT (‘DimDeliverable’, 1)
 
TRUNCATE COMMAND
An alternative to the delete command is the TRUNCATE command. The nice thing about the truncate command is that it is executed only once, so it is much more efficient and it also automatically reseeds your identity key. How does it work? Well, the truncate command deallocates data pages which internally store the data in SQL Server. One thing to note, however, is that while the truncate command is the quickest way to clear out a SQL table, it cannot be used on a table with foreign key constraints. You would first need to remove the constraints as shown in the following example:
 
— temporarily remove constraint
ALTERTABLE[dbo].[DimDeliverable]DROPCONSTRAINT[PK_DimDeliverable]
 
— truncate table
TRUNCATETABLE[dbo].[DimDeliverable]
 
— re-add constraint
ALTER TABLE [dbo].[DimDeliverable] ADD CONSTRAINT [PK_DimDeliverable] PRIMARY KEY ([DeliverableKey])
 
Incremental Loading Technique
This technique is much more efficient than the “flush and fill” technique in that it only updates the items which have changed. While it is more efficient, it is definitely not simpler, so you will probably use the “flush and fill” technique on your smaller dimension tables. As of SQL 2008, a new command was introduced to support the “incremental loading” technique called MERGE.
Type 1:  Updates are overwritten and history of changes is not kept. 
This statement does the following: 1) if a record does not exist, create it; 2) if a record exists but is different, update it; 3) if a record exists in the target but not in the source, then deactivate it.
 
use CatapultDatawarehouse
Go
— Declare Variables
DECLARE @BC int = 570
— TYPE 1 (Overwrites history) – DimDeliverable:  CATINDB06.CAT_DB.project_deliverable
BEGIN TRY — perform in transaction so if it fails values are rolled back
      MERGE DimDeliverable AS [Target] — begin merge statements (merge statements end with a semi-colon)
      USING CATINDB06.CAT_DB.project_deliverable AS [Source]
      ON[Target].[DeliverableId]=[Source].[deliverable_id]
      WHEN MATCHED AND — record exists but values are different
      (
            [Target].[Name] <> [Source].[deliverable_name]
            OR [Target].[EstimatedHours] <> [Source].[est_hours]
            OR [Target].[EstimatedRate] <> [Source].[est_billing_rate]
            OR [Target].[EstimatedRevenue] <> [Source].[total_amount]
      )
      THEN UPDATE SET — update records (Type 1 means record values are overwritten)
            [Target].[Name] = [Source].[deliverable_name]
            , [Target].[EstimatedHours] = [Source].[est_hours]
            , [Target].[EstimatedRate] = [Source].[est_billing_rate]
            , [Target].[EstimatedRevenue] = [Source].[total_amount]
      WHEN NOT MATCHED BY TARGET — record does not exist
      THEN INSERT — insert record
      (
            [Name]
            ,[EstimatedHours]
            ,[EstimatedRate]
            ,[EstimatedRevenue]
      )
      VALUES
      (
            [Source].[deliverable_name]
            , [Source].[est_hours]
            , [Source].[est_billing_rate]
            , [Source].[total_amount]
      )
      WHEN NOT MATCHED BY SOURCE — record exists in target but not source
      THEN DELETE — delete from target
      OUTPUT $action, inserted.*, deleted.*; — output results
      COMMIT TRAN
END TRY
BEGIN CATCH
      ROLLBACK TRAN
ENDCATCH
Go
 
Type 2:  Slowly Changing Dimensions track historical changes. 
In this case the merge statement is simply checking to see if a row exists but contains changes.  If this is the case then the merge statement simply deactivates that row but keeps its data intact for historical purposes.  A new row then is created and marked as the current (most up-to-date) information using an INSERT statement that is wrapped around the SQL MERGE.
 
— TYPE 2 (Tracks history) – DimDeliverable:  CATINDB06.CAT_DB.project_deliverable
BEGIN TRY — perform in transaction so if it fails values are rolled back
      INSERT INTO DimDeliverable
      (
            [Name]
            , [EstimatedHours]
            , [EstimatedRate]
            , [EstimatedRevenue]
            , [IsCurrentRow]
      )
      SELECT
            [deliverable_name]
            , [est_hours]
            , [est_billing_rate]
            , [total_amount]
            , 1
      FROM
      (
            MERGE DimDeliverable AS [Target] — begin merge statements (merge statements end with a semi-colon)
            USING CATINDB06.CAT_DB.project_deliverable AS [Source]
            ON [Target].[DeliverableId] = [Source].[deliverable_id] AND [Target].[IsCurrentRow] = 1
            WHEN MATCHED AND — record exists but values are different
            (
                  [Target].[Name] <> [Source].[deliverable_name]
                  OR [Target].[EstimatedHours] <> [Source].[est_hours]
                  OR [Target].[EstimatedRate] <> [Source].[est_billing_rate]
                  OR [Target].[EstimatedRevenue] <> [Source].[total_amount]
            )
            THEN UPDATE SET — update records (Type 1 means record values are overwritten)
                  [Target].[IsCurrentRow] = 0
                  , [Target].[ValidTo] = GETDATE()
            WHEN NOT MATCHED BY TARGET — record does not exist
            THEN INSERT — insert record
            (
                  [Name]
                  ,[EstimatedHours]
                  ,[EstimatedRate]
                  ,[EstimatedRevenue]
                  , [IsCurrentRow]
            )
            VALUES
            (
                  [Source].[deliverable_name]
                  , [Source].[est_hours]
                  , [Source].[est_billing_rate]
                  , [Source].[total_amount]
                  , 1
            )
            WHEN NOT MATCHED BY SOURCE — record exists in target but not source
            THEN DELETE — delete from target
            OUTPUT $action AS Action, [Source].* — output results
      ) AS MergeOutput
      WHERE
            MergeOutput.Action = ‘UPDATE’
      ;
      COMMIT TRAN
END TRY
BEGIN CATCH
      ROLLBACK TRAN
ENDCATCH
Go