Showing posts with label SQL Server 2008.

Monday, September 8, 2014

New Add-on for SSMS: T-SQL Flex

Do you ever copy query results out of SQL Server Management Studio, paste them into your text editor, and then perform gold-medal find + replace gymnastics to create T-SQL scripts?

Do you ever need to copy a small amount of data from one SQL server to another, but don’t want to set up a linked server or take the time to create an SSIS package?

Do you ever need to create simple Excel sheets from SQL Server queries – preferably one tab per query result and with valid formatting (such as showing dates correctly and not truncating leading zeros)?

Have you ever wanted to quickly script a temp table that has the exact schema (including nullability and precision) of a query, stored procedure, or table-valued function?

If you answered yes to any of the above questions, there is a new add-on for SQL Server Management Studio called T-SQL Flex that you may like.

[Screenshot: T-SQL Flex scripting query results to INSERT statements]

T-SQL Flex is a free, open-source add-on for SQL Server Management Studio that does only two things:

  1. It scripts SQL query results to INSERT statements.
  2. It scripts SQL query results to Excel-compatible spreadsheets (specifically, XML Spreadsheet 2003 format).

Because T-SQL Flex does only these two things, it is very easy to use.  Simply paste in your query and click the “Run ‘n’ Rollback” button.  T-SQL Flex will create an ADO.NET transaction, run your query, collect the results, and then roll back the transaction.
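The "Run 'n' Rollback" behavior is conceptually the same as wrapping a script in an explicit transaction yourself; here is a minimal T-SQL sketch of the idea (the table and column names are placeholders, and T-SQL Flex actually manages the transaction through ADO.NET rather than in the script):

BEGIN TRANSACTION;

-- Any modifications made here are visible inside the open transaction,
-- so the results can be captured and scripted...
INSERT INTO dbo.SampleTable (SampleColumn) VALUES ('example');
SELECT SampleColumn FROM dbo.SampleTable;

-- ...and then everything is undone, leaving the database unchanged.
ROLLBACK TRANSACTION;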

You can check out the latest release here:

https://github.com/nycdotnet/TSqlFlex/releases

If you like T-SQL Flex, please let Steve know on Twitter:

https://twitter.com/nycdotnet

Thanks to the team at Red-Gate that created the SIP framework used by T-SQL Flex.

Happy scripting!

Tuesday, July 23, 2013

Query to Find "Update" Dependencies on a Table or View (SQL Server 2012)


The built-in dependency finder in SQL Server Management Studio does not provide the ability to distinguish between dependencies that are read-only and dependencies that are read-write.  However, SQL Server 2012 does make this information available via its dynamic management views.

 

The query below returns all of the objects that reference the table (or view) identified by @SchemaName and @ObjectName and that are able to modify its data; specifically, that means an INSERT, UPDATE, or DELETE operation.

 

Note that dependency tracking in SQL Server is not perfect because dynamic SQL can hide references, but this will hopefully still be useful information.


 
DECLARE @SchemaName NVARCHAR(128) = 'dbo';
DECLARE @ObjectName NVARCHAR(128) = 'MyTableName';
 
SELECT re.referencing_schema_name, re.referencing_entity_name,
  ref.*
FROM sys.dm_sql_referencing_entities(@SchemaName + '.' + @ObjectName,'OBJECT') re
CROSS APPLY sys.dm_sql_referenced_entities(referencing_schema_name + '.' +
    referencing_entity_name,'OBJECT') ref
WHERE ref.is_updated = 1
  AND ref.referenced_entity_name = @ObjectName
  AND (ref.referenced_schema_name IS NULL OR ref.referenced_schema_name = @SchemaName);
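As an example of the dynamic SQL caveat, a hypothetical procedure like the one below (HiddenUpdater and SomeColumn are made-up names) does modify the table, yet it would not show up in the results above because the table name only exists inside a string:

CREATE PROC dbo.HiddenUpdater
AS BEGIN
    -- SQL Server records no dependency between this procedure and
    -- dbo.MyTableName because the reference is only built at run time.
    DECLARE @sql NVARCHAR(MAX) =
        N'UPDATE dbo.MyTableName SET SomeColumn = SomeColumn;';
    EXEC sp_executesql @sql;
END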

 

This post was edited to fix code formatting and to specify SQL Server 2012 as this sadly does not appear to be supported in SQL Server 2008 or before.

 

References

MSDN: sys.dm_sql_referenced_entities (Transact-SQL)

MSDN: sys.dm_sql_referencing_entities (Transact-SQL)

Monday, October 1, 2012

SQL in the City - New York 2012

I was able to attend the SQL in the City event in New York this past Friday.  I highly recommend attending the next SQL in the City day in your area if you can spare the time and travel.  I found it to be a very well put-together event with useful content and excellent networking opportunities.

At the event I was able to meet both Steve Jones (who compassionately shook my hand as I walked in dripping wet from the rain during my walk over to 3rd Avenue) and Grant Fritchey (Scary DBA).  It was great to have even a brief conversation with each of them and introduce myself.  I also got to meet Product Manager David Atkinson and Developer David Simner and talk briefly about the awesome SQL Source Control and SQL Compare products.  The presentations I attended, "Database Maintenance Essentials", "Red Gate Tools - the Complete Lifecycle", and "The Whys and Hows of Database Continuous Integration" were all very informative and useful.

The highlight of the day was finding out about Red-Gate's Virtual Restore product.  This tool essentially allows you to mount a .BAK file as a database.  It supports multiple databases mounted from the same .BAK file, and it is not destructive to the .BAK file.  From the description, it works very similarly to VMs that are mounted from a disk file and then store changes in a disk "diff" file.  This tool allows you to save both space and time and sounds very awesome for developers.

I must say that I think Red-Gate has made a critical error with the marketing of this tool.  Developers hear "restore" and think "DBA tool"; in fact, I suspect that's what the Red-Gate marketing team thought too, since the application is included in their DBA bundle only.  I would absolutely use this tool as a developer, and I might even use it as often as I use SQL Compare and the rest of the more developer-oriented tools.  It's a game-changer, and Red-Gate could probably be making a lot more money from it just by getting the word out that it's not only a DBA tool.  For most of the other Red-Gate developer tools, a savvy developer could imagine cooking up some sort of "good enough" workaround to avoid buying it (though certainly not in less time than the license would reasonably cost), but I would never even attempt creating something like Virtual Restore.  I will absolutely be trying out this software later this week.  Red-Gate - you've got to get the word out, and maybe even think about changing the name!

Thanks very much to everyone at Red-Gate for putting this event together.  I hope to see you again at the next SQL in the City.

Sunday, June 24, 2012

Excellent SSIS 2012 Upgrade Video

This presentation gives an excellent overview of the process of upgrading SQL Server 2008 SSIS packages to work with SQL Server 2012 SSIS.  I’m very impressed at the much-improved capabilities for deploying SSIS packages in a less-scary way using package and project parameters, support for SSIS “environments” (“prod”, “dev”, “test”, etc.) and several other improvements including SQL Server 2012 automatically archiving previous versions of your packages when a new version is deployed.  It seems that it may actually be practical to use the built-in SSIS deployment functionality now.

http://channel9.msdn.com/Events/TechEd/NorthAmerica/2012/DBI329

This presentation was delivered by Sven Aelterman at TechEd 2012 – well done Sven!

Tuesday, May 15, 2012

Another Way To Insert Many Rows Very Fast From Your .NET Application

There was a great deal of feedback to my previous post One Way To Insert Many Rows Very Fast From Your .NET Application.

The post focused on the performance benefit of using a single SQL statement with an XML-based parameter to do inserts of many rows to a table versus issuing an insert "per row".

Several commenters on the original post mentioned that with SQL Server 2008 and higher, it is possible to pass a Table-Valued Parameter (TVP) to SQL Server and that this would save SQL Server the overhead of parsing the XML before saving the data.  I had used TVPs before inside databases, but never as a parameter to an externally-facing DB object.  Here's how I added a third insert method to my test application:

Creating the Type

When creating a table-valued type in your database, you have two main options: specific or generic.  The specific case would be to create a type called something like ProductIDList and the generic case would be to create a type called something like UniqueIntegerList.  I like using generic Table-valued types when I can, because it helps to avoid "bunny types" - basically needing to have many duplicate types that seem to multiply on their own.  Your mileage may vary here - I find that this is easier for me than maintaining "EmployeeIDList", "ProductIDList", "WidgetIDList", and the like, when these might be all the same definition.

So I created this "generic" table-valued type:

CREATE TYPE dbo.UniqueIntegerList AS TABLE
(
    TheInteger INT NOT NULL
    PRIMARY KEY (TheInteger)
);

Creating the Save Stored Procedure

Next, I created a new stored procedure which would accept my new Table-Valued Type as a parameter.

CREATE PROC DoTableValuedParameterInsert(@ProductIDs dbo.UniqueIntegerList READONLY)
AS BEGIN

    INSERT INTO ProductsAccess(ProductID)
    SELECT TheInteger AS [ProductID]
    FROM @ProductIDs;

END

 

In this procedure, I am passing in a parameter called @ProductIDs.  This is of type "dbo.UniqueIntegerList" which I just created in the previous step.  SQL Server looks at this and says "oh I know what this is - this type is actually a table". Since it knows that the UniqueIntegerList type is a table, I can select from it just like I could select from any other table-valued variable.  You have to mark the parameter as READONLY because SQL 2008 doesn't support updating and returning a passed table-valued parameter.
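Before wiring up the .NET side, you can exercise the procedure directly from a query window by declaring a variable of the new type and passing it in; a quick sketch:

DECLARE @IDs dbo.UniqueIntegerList;

-- Populate the table-valued variable just like an ordinary table.
INSERT INTO @IDs (TheInteger) VALUES (1), (2), (3);

-- Pass it to the stored procedure; inside the procedure it is read-only.
EXEC dbo.DoTableValuedParameterInsert @ProductIDs = @IDs;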

Creating the Save Routine

Then I had to create a new save routine on my business object that would call the new stored procedure.  The way you prepare the Table-Valued parameter is to create a DataTable object with the same column signature as the Table-Valued type, populate it, and then pass it inside a SqlParameter object as SqlDbType.Structured.

Public Sub SaveViaTableValuedParameter()

  'Prepare the Table-valued Parameter
  Dim objUniqueIntegerList As New DataTable
  Dim objColumn As DataColumn = objUniqueIntegerList.Columns.Add("TheInteger", _
             System.Type.GetType("System.Int32"))
  objColumn.Unique = True

  'Populate the Table-valued Parameter with the data to save
  For Each Item As Product In Me.Values
    objUniqueIntegerList.Rows.Add(Item.ProductID)
  Next

  'Connect to the DB and save it.
  Using objConn As New SqlConnection(DBConnectionString())
    objConn.Open()
    Using objCmd As New SqlCommand("dbo.DoTableValuedParameterInsert")
      objCmd.CommandType = CommandType.StoredProcedure
      objCmd.Connection = objConn
      objCmd.Parameters.Add("ProductIDs", SqlDbType.Structured)
      objCmd.Parameters(0).Value = objUniqueIntegerList
      objCmd.ExecuteNonQuery()
    End Using
    objConn.Close()
  End Using
End Sub

 

So was it faster using a Table-Valued Parameter?

My old test harness used runs of up to 504 rows in up to 50 test runs.  When doing a small number of rows and 50 trials, using the XML-based parameter versus the Table-valued parameter didn't seem to make a detectable or significant difference.   The two methods kept trading places and the difference was indistinguishable from noise.

 

When I upped the test runs to 1,000 batches at a time, the pattern became more clear.

 

[Chart: XML parameter insert vs. table-valued parameter insert timings]

Lower is faster.  Test runs 1-3 done with the XML insert first; tests 4-6 done with the TVP insert first.  Testing was done with SQL Server 2008 Developer Edition SP2 (10.0.4311.0).

So the answer is that the Table-valued parameter is clearly faster when you start getting into several hundred inserts in one call, but using a TVP does not appear to provide an advantage over using an XML parameter below 100 rows, at least with the way that I've implemented this.  I would hesitate to say that the XML is "faster" below 100 rows to insert - the difference is so slight (under 2 ms per call) that it could be due to the implementation details in my test program and not SQL Server-related.

Pros and Cons

One of the advantages of using a TVP is that you can create a primary key and zero or more other unique indexes on it - this might help with performance when implementing "Upsert" or MERGE patterns.  XML data doesn't come in with indexes applied, so it could definitely be slower for scenarios that involve a join (not tested in this blog post).  It seems that if the count of rows you need to insert per call is fairly large and performance is your only consideration, using a TVP could be a good choice.
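To make that concrete, here is a minimal sketch of an insert-if-missing upsert using the same TVP type; this is just an illustration of the pattern, not code from my test harness:

-- The primary key on TheInteger gives SQL Server a useful structure to
-- join against the target table.
MERGE dbo.ProductsAccess AS target
USING @ProductIDs AS source
    ON target.ProductID = source.TheInteger
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ProductID) VALUES (source.TheInteger);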

In my opinion, using XML is more flexible than using a TVP approach because you aren't required to maintain types inside the database along with the stored procedures and other code.  Using an XML parameter also requires somewhat less code in the persistence layer to implement (and practically none if the XML serialization logic lives in the business layer).  Lastly, XML serialization code is reusable in other scenarios (serializing an object as XML to disk, to a log, or in unit tests, for example), whereas the persistence code for a TVP is only useful for passing whole objects to SQL Server.

Wrap-up

Even in the largest scenario tested, the insert work averaged 24ms to insert 500 rows of data via the XML parameter versus 14.2ms for the TVP.  This type of difference isn't perceptible to a user, though clearly it could add up for a large application with many users calling this code frequently.  However, either approach compares very favorably to the average 719ms to complete a 500 row insert measured in my previous post using the non-optimized 1x1 insert method implemented by many ORMs.

As always, the best choice depends on the requirements for your application.  Either of the two methods tested, XML or a TVP, is a strong candidate for inserting many rows very fast from your .NET application, and each holds some advantages over the other.

I greatly enjoyed hearing the feedback to the prior blog post; I'd love to hear your thoughts on this additional article!

Friday, May 11, 2012

One Way To Insert Many Rows Very Fast From Your .NET Application

This is my attempt to explain something that I have wanted to figure out for a long time.  I have read many articles on this very subject but they always seem to fall short for various reasons - generally by including way more detail or complexity than is necessary to answer the question or get someone going.  Due to the work of others who have written similar articles on this subject in the past (particularly MVP Michael Coles), I have eventually figured this out.  This is my attempt to present this material in an end-to-end yet bare-bones and hopefully easily-understandable format for any .NET or SQL developer who needs to do this.

Note: I've written a follow-up to this article based on the excellent feedback: Another Way To Insert Many Rows Very Fast From Your .NET Application

The issue:

Recently, I was asked to troubleshoot the startup performance of a VB.NET application with a SQL 2008 back-end.  The startup routine was taking up to a minute or so to complete in locations with a fast DB connection, and up to 15 minutes in locations with a slow DB connection.

By stepping through the application and inserting Debug.Print() statements liberally, I narrowed-down the "slow" part of the routine to a loop that was inserting product IDs into a table.  Basically, the way the app worked was that when it was launched, it would figure out which products were relevant to the active user and then it would insert those into a table that was used with INNER JOINs in all the rest of the screens and reports.

There was a table in the SQL database that looked a bit like this (pretend there is only ever one user for this example):

CREATE TABLE ProductsAccess(
    ProductID INT NOT NULL
    PRIMARY KEY (ProductID));

 

The VB.NET code that initialized the product list for the user in the DB was something like this (using typical ORM stuff):

 

'Delete all records in the table
ProductsAccess.DeleteAll()

'now populate all products that the user can choose in a loop
For i = 0 To objProductList.GetUpperBound(0)
    ProductsAccess.Insert(objProductList.Item(i).ProductID)
Next

 

Behind the scenes, this meant at least 1 call to the database per row to be inserted, plus the first call to clear out any existing rows.  Since it was typical for our users to have several thousand products available to them, this could take a very long time, particularly with remote users on high-latency connections.

 

It wasn't the amount of data that was the problem - it was the count of the calls that was the issue.  Instead of saying "Insert product 1", "Insert Product 2", "Insert Product 3", etc., I needed to say "Insert Products: 1, 2, 3," etc.

 

What I did:

Here is the simple version of how I changed the insert routine to cut the number of database calls to insert "n" rows from "n+1" to "always 2":

 

I created this stored procedure:

CREATE PROC ProductsAccessXMLInsert(@ValidXMLInput XML)
AS BEGIN

    INSERT INTO ProductsAccess (ProductID)
    SELECT Col.value('@ID','INT')
    FROM @ValidXMLInput.nodes('//Products/Product') Tab(Col)

END

 

Then I modified my .NET code to look something like this (note that I left the DeleteAll method untouched):

 

'Delete all records in the table
ProductsAccess.DeleteAll()

'Build a valid XML string (you should use System.Xml in real code - this is just a demo)
Dim strBuffer As New Text.StringBuilder
strBuffer.Append("<Products>")
For i = 0 To objProductList.GetUpperBound(0)
    strBuffer.Append(" <Product ID=""" & objProductList.Item(i).ProductID.ToString & """ />")
Next
strBuffer.Append("</Products>")

'now populate all products that the user can choose by passing a single XML document
ProductsAccess.InsertXML(strBuffer.ToString)

 

How it works:

The XML generated in the loop would look something like this:

<Products>
  <Product ID="1" />
  <Product ID="2" />
  <Product ID="3" />
</Products>

 

The new InsertXML() method called the new ProductsAccessXMLInsert stored procedure, passing all of the generated XML in only a single call to the DB.

 

Public Sub InsertXML(TheXML As String)
    Using objConn As New SqlConnection(DBConnectionString())
        objConn.Open()
        Using objCmd As New SqlCommand("dbo.ProductsAccessXMLInsert")
            objCmd.CommandType = CommandType.StoredProcedure
            objCmd.Connection = objConn
            objCmd.Parameters.Add("ValidXMLInput", SqlDbType.Xml)
            objCmd.Parameters(0).Value = TheXML
            objCmd.ExecuteNonQuery()
        End Using
        objConn.Close()
    End Using
End Sub

 

Inside the ProductsAccessXMLInsert stored procedure, SQL Server does a SELECT from the input XML parameter's built-in nodes() method (which behaves like a table-valued function), using the passed XPath query ("//Products/Product").  The XML "columns" are then accessible using the Col alias (this comes from the Tab(Col) tag after the XPath query).  In the SELECT clause, I referenced the ID attribute of the XML nodes and instructed SQL Server to cast it as an integer (these are the two parameters passed to the Col.value(,) function in the SELECT clause).  XML attribute names must be prefixed with an @ symbol.  (For node names you can leave the @ off.)  Then I'm simply doing an insert to the ProductsAccess table from the results of my SELECT.
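If you want to see the shredding in isolation, something like this can be run directly in a query window (the XML literal is just the sample document from above):

DECLARE @ValidXMLInput XML = N'<Products>
  <Product ID="1" />
  <Product ID="2" />
  <Product ID="3" />
</Products>';

-- nodes() returns one row per matching Product element; value() extracts
-- the ID attribute from each row and casts it to an integer.
SELECT Col.value('@ID', 'INT') AS ProductID
FROM @ValidXMLInput.nodes('//Products/Product') Tab(Col);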

 

Performance Difference:

I did four sets of tests using insert sets of 10 rows, 50 rows, 100 rows, and 504 rows (the number of products in the AdventureWorks Production.Product table which was the test data source).  I did three sets of tests for each count of rows with doing the one-by-one inserts first, and then three series of tests doing the XML inserts first.  All runs were measured using the .NET Stopwatch class.

 

So how much better was this single call using XML versus using the standard row-by-row INSERT calls?  Here are the results:

 

(lower milliseconds to complete is better)

 

[Chart: milliseconds to complete, XML parameter insert vs. row-by-row inserts, for batches of 10, 50, 100, and 504 rows]

 

Conclusion:

The XML insert method for our scenario of needing to insert many rows into our SQL database in one shot averaged more than 97% faster than looped inserts when processing 504 rows of data, and was still more than 81% faster with a mere 10 rows of data to insert.  This was averaged over 6 runs of 50 tries of each method and by my eye, the results were fairly consistent.

 

I hope you found this a useful introduction on one way to efficiently handle multi-insert scenarios with SQL Server using an XML parameter.  I would love to hear any comments or questions you might have.

 

Notes and Disclaimers:

  1. This is merely one way to insert many rows very fast from your .NET application.
  2. The term "many" may mean different things for you or your application.
  3. To keep the performance testing fair, I first refactored the existing insert method to use a stored procedure that took the product ID as a parameter so it wasn't using wonky ORM-generated code:

    CREATE PROC DoSingleInsert(@ProductID INT)
    AS BEGIN
        INSERT INTO ProductsAccess (ProductID) VALUES (@ProductID);
    END

    Also, the inserts were processed in a tight loop using a single SqlCommand object with a strongly-typed parameter, just updating the parameter value and calling ExecuteNonQuery().  This is about as efficient as ADO.NET typically gets, though I would love to hear if there is a way to drive a SqlCommand object harder.

  4. Testing was done with the application and SQL Server running off two ports of the same gigabit switch to minimize the factor of network latency.

  5. To keep this post as simple as possible, I am pretending that there is only ever one user for this application - my real table also has a UserID field, among others.  Additionally, in my real solution I do a MERGE against my XML parameter, which allowed me to also eliminate the DELETE call - perhaps this can be a future blog post, but a rough sketch appears after this list.

  6. Dear fellow developer: please don't do what I did above and use a StringBuilder to generate XML.  The moment you have a funny character such as <, >, or &, you will be in a world of pain.  You should use the features of the System.XML namespace to generate XML, however that was more complicated than I wanted to get into for this blog post.

  7. The slow part of the original code in this application was basically the last four lines of code in a massive subroutine.  I had seen dozens of things earlier that I thought might be the culprit, but actually were OK.  In doing this work, I was reminded of the lesson to not do premature optimization of code; I could have spent an hour or two making all the "kind-of fast" code "super fast" and the end result would have been completely unnoticeable due to the four lines of boat-anchor DB access code which were the main problem.
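Regarding note 5, here is a rough sketch of the MERGE-against-XML idea; it assumes the simplified single-user ProductsAccess table from this post and is not the production code:

CREATE PROC ProductsAccessXMLMerge(@ValidXMLInput XML)
AS BEGIN

    MERGE ProductsAccess AS target
    USING (SELECT Col.value('@ID','INT') AS ProductID
           FROM @ValidXMLInput.nodes('//Products/Product') Tab(Col)) AS source
        ON target.ProductID = source.ProductID
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (ProductID) VALUES (source.ProductID)
    -- Rows that are no longer present in the XML are removed, which is what
    -- makes the separate DeleteAll() call unnecessary.
    WHEN NOT MATCHED BY SOURCE THEN
        DELETE;

END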

See Also:

Creating a deep hierarchy with FOR XML

Using the MERGE Statement

Thursday, January 19, 2012

Case Sensitive searches in T-SQL

I just ran into an issue with one of my applications where a SQL statement had updated a status code field to a lower-case status code such as “active”.  We were expecting “Active” in the application, and as a result our application began exhibiting odd behavior.  There are a host of things that went wrong here and that could have been done better, but it certainly does illustrate an issue with a case-insensitive database (our collation is the USA default, SQL_Latin1_General_CP1_CI_AS) interacting with a case-sensitive application.  Here are some ideas we came up with.

How to find the incorrect records

First off, we have to find the offending records.  I was able to run this to find the records we needed.

SELECT ID, [Status] FROM MyTable WHERE [Status] = 'active' COLLATE SQL_Latin1_General_CP1_CS_AS;

Note the COLLATE keyword refers just to the string ‘active’.  It’s not a query modifier like ORDER BY would be – instead it actually directly modifies the value that precedes it.

Sidebar: This is conceptually similar to how some implicit typing works in VB.Net – In VB, coding x=“s” implicitly declares x as a string type, but x=“s”c implicitly declares x as a character type; x=1.1F declares x as a single precision float, but x=1.1D declares x as a decimal type.  The COLLATE directive in this T-SQL statement declares the text 'active' as SQL_Latin1_General_CP1_CS_AS (case sensitive Latin ascending sort) so SQL will treat it as such when evaluating the statement.

Another note is that STATUS is a reserved word in T-SQL, so I am delimiting my field name, [Status], with square brackets. Delimiting with square brackets instead of single- or double-quotes makes SQL Server very happy – particularly the IntelliSense feature of SQL Management Studio.

So this is how you do a case-sensitive search on a SQL server with a case-insensitive sort order.  (For non-USA default SQL server installations, you should specify your current server collation and change “CI” to “CS”.)
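One way to convince yourself the collation override is doing something is to count both ways in a single query; the outer filter uses the column's default case-insensitive collation and matches every casing, while the case-sensitive test inside the SUM counts only the lower-case offenders:

SELECT COUNT(*) AS AllCasings,
       SUM(CASE WHEN [Status] = 'active' COLLATE SQL_Latin1_General_CP1_CS_AS
                THEN 1 ELSE 0 END) AS LowerCaseOnly
FROM MyTable
WHERE [Status] = 'active';  -- case-insensitive: matches 'Active', 'ACTIVE', 'active', ...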

How to fix the incorrect records (only)

I was able to fix the records in my database by using code like this.  This has the advantage of only doing updates on the rows that have the incorrect case.  To do this in a different environment, you’d have to provide a line per bad code.  I am using SET XACT_ABORT ON in this case so that the entire transaction will be stopped and rolled-back automatically in case of any errors encountered.

SET XACT_ABORT ON;
BEGIN TRAN;

  UPDATE MyTable SET [Status] = 'Active'
    WHERE [Status] = 'active' COLLATE SQL_Latin1_General_CP1_CS_AS;

  UPDATE MyTable SET [Status] = 'Inactive'
    WHERE [Status] = 'inactive' COLLATE SQL_Latin1_General_CP1_CS_AS;

  UPDATE MyTable SET [Status] = 'Withdrawn'
    WHERE [Status] = 'withdrawn' COLLATE SQL_Latin1_General_CP1_CS_AS;

COMMIT TRAN;

If you wanted to fix this in a single statement, you could do something like this:

UPDATE MyTable
  SET [Status] = CASE
    WHEN [Status] = 'active' COLLATE SQL_Latin1_General_CP1_CS_AS THEN 'Active'
    WHEN [Status] = 'inactive' COLLATE SQL_Latin1_General_CP1_CS_AS THEN 'Inactive'
    WHEN [Status] = 'withdrawn' COLLATE SQL_Latin1_General_CP1_CS_AS THEN 'Withdrawn'
  END
  WHERE [Status] IN ('active' COLLATE SQL_Latin1_General_CP1_CS_AS,
    'inactive' COLLATE SQL_Latin1_General_CP1_CS_AS,
    'withdrawn' COLLATE SQL_Latin1_General_CP1_CS_AS);

 

Notice how it is required to put the COLLATE after every string.  The “Sidebar” above will explain why.

 

How to prevent this from happening in the first place

In this situation, I didn’t have a foreign key relationship on this field, so the best I could do was enforce a constraint.  This is the constraint I put into place:

 

ALTER TABLE MyTable
ADD CONSTRAINT MyTable$ValidStatus
  CHECK([Status] IS NULL OR
    [Status] IN ('Active' COLLATE SQL_Latin1_General_CP1_CS_AS,
      'Withdrawn' COLLATE SQL_Latin1_General_CP1_CS_AS,
      'Inactive' COLLATE SQL_Latin1_General_CP1_CS_AS));

With this constraint in place, an insert or update that tried to set the value of the Status field to something other than NULL, “Active”, “Withdrawn”, or “Inactive” would fail.  I can check that my check constraint is working by querying the sys.check_constraints view.  If is_not_trusted comes back as 0, then my constraint should be working.

 

SELECT is_not_trusted, * FROM sys.check_constraints

  WHERE name = 'MyTable$ValidStatus';