There was a great deal of feedback to my previous post One Way To Insert Many Rows Very Fast From Your .NET Application.
The post focused on the performance benefit of using a single SQL statement with an XML-based parameter to do inserts of many rows to a table versus issuing an insert "per row".
Several commenters on the original post mentioned that with SQL Server 2008 and higher, it is possible to pass a Table-Valued Parameter (TVP) to SQL Server and that this would save SQL Server the overhead of parsing the XML before saving the data. I had used TVPs before inside databases, but never as a parameter to an externally-facing DB object. Here's how I added a third insert method to my test application:
Creating the Type
When creating a table-valued type in your database, you have two main options: specific or generic. The specific case would be to create a type called something like ProductIDList and the generic case would be to create a type called something like UniqueIntegerList. I like using generic Table-valued types when I can, because it helps to avoid "bunny types" - basically needing to have many duplicate types that seem to multiply on their own. Your mileage may vary here - I find that this is easier for me than maintaining "EmployeeIDList", "ProductIDList", "WidgetIDList", and the like, when these might be all the same definition.
So I created this "generic" table-valued type:
CREATE TYPE dbo.UniqueIntegerList AS TABLE
(
TheInteger INT NOT NULL
PRIMARY KEY (TheInteger)
);
Creating the Save Stored Procedure
Next, I created a new stored procedure which would accept my new Table-Valued Type as a parameter.
CREATE PROC DoTableValuedParameterInsert(@ProductIDs dbo.UniqueIntegerList READONLY)
AS BEGIN
INSERT INTO ProductsAccess(ProductID)
SELECT TheInteger AS [ProductID]
FROM @ProductIDs;
END
In this procedure, I am passing in a parameter called @ProductIDs. This is of type "dbo.UniqueIntegerList" which I just created in the previous step. SQL Server looks at this and says "oh I know what this is - this type is actually a table". Since it knows that the UniqueIntegerList type is a table, I can select from it just like I could select from any other table-valued variable. You have to mark the parameter as READONLY because SQL 2008 doesn't support updating and returning a passed table-valued parameter.
Creating the Save Routine
Then I had to create a new save routine on my business object that would call the new stored procedure. The way you prepare the Table-Valued parameter is to create a DataTable object with the same column signature as the Table-Valued type, populate it, and then pass it inside a SqlParameter object as SqlDbType.Structured.
Public Sub SaveViaTableValuedParameter()
'Prepare the Table-valued Parameter
Dim objUniqueIntegerList As New DataTable
Dim objColumn As DataColumn = objUniqueIntegerList.Columns.Add("TheInteger", _
System.Type.GetType("System.Int32"))
objColumn.Unique = True
'Populate the Table-valued Parameter with the data to save
For Each Item As Product In Me.Values
objUniqueIntegerList.Rows.Add(Item.ProductID)
Next
'Connect to the DB and save it.
Using objConn As New SqlConnection(DBConnectionString())
objConn.Open()
Using objCmd As New SqlCommand("dbo.DoTableValuedParameterInsert")
objCmd.CommandType = CommandType.StoredProcedure
objCmd.Connection = objConn
objCmd.Parameters.Add("ProductIDs", SqlDbType.Structured)
objCmd.Parameters(0).Value = objUniqueIntegerList
objCmd.ExecuteNonQuery()
End Using
objConn.Close()
End Using
End Sub
So was it faster using a Table-Valued Parameter?
My old test harness used runs of up to 504 rows in up to 50 test runs. When doing a small number of rows and 50 trials, using the XML-based parameter versus the Table-valued parameter didn't seem to make a detectable or significant difference. The two methods kept trading places and the difference was indistinguishable from noise.
When I upped the test runs to 1,000 batches at a time, the pattern became more clear.
Lower is faster. Test runs 1-3 done with the XML insert first; tests 4-6 done with the TVP insert first. Testing was done with SQL Server 2008 Developer Edition SP2 (10.0.4311.0).
So the answer is that the Table-valued parameter is clearly faster when you start getting into several hundred inserts in one call, but using a TVP does not appear to provide an advantage to using an XML parameter below 100 rows, at least with the way that I've implemented this. I would hesitate to say that the XML is "faster" below 100 rows to insert - the difference is so slight (under 2 ms per call) that it could be due to the implementation details in my test program and not SQL Server-related.
Pros and Cons
One of the advantages of using a TVP is that you can create a primary key and zero or more other unique indexes on it - this might help with performance when implementing "Upsert" or MERGE patterns. XML data doesn't come in with indexes applied, so it could definitely be slower for scenarios that involve a join (not tested in this blog post). It seems that if the count of rows you need to insert per call is fairly large and performance is your only consideration, using a TVP could be a good choice.
In my opinion, using XML is more flexible than using a TVP approach because you aren't required to maintain types inside the database along with the stored procedures and other code. Using an XML parameter also requires somewhat less code in the persistence layer to implement (and practically none if the XML serialization logic lives in the business layer). Lastly, XML serialization code is reusable in other scenarios (serializing an object as XML to disk, to a log, or in unit tests, for example), whereas the persistence code for a TVP is only useful for passing whole objects to SQL Server.
Wrap-up
Even in the largest scenario tested, the insert work averaged 24ms to insert 500 rows of data via the XML parameter versus 14.2ms for the TVP. This type of difference isn't perceptible to a user, though clearly it could add up for a large application with many users calling this code frequently. However, either approach compares very favorably to the average 719ms to complete a 500 row insert measured in my previous post using the non-optimized 1x1 insert method implemented by many ORMs.
As always, the best choice depends on the requirements for your application. Either of the two methods tested XML or a TVP are strong candidates for inserting many rows very fast from your .NET application, and each holds some advantages over the other.
I greatly enjoyed hearing the feedback to the prior blog post; I'd love to hear your thoughts on this additional article!