Tuesday, December 6, 2011

Excel for the DBA, Part 1–The Basics

This will be a series of handy hints for people that know SQL and are frustrated by Excel.  I am going to be writing this for Excel 2007 (what I currently have at work) but the tips should work for most recent versions.  The SQL stuff should also work with SQL 2000 and higher, but I have only tested on 2008.

The Basics

Run this on a test DB in your SQL Server, making sure you’re set to the default of “Results to Grid” (CRTL-D):

SELECT 1 AS [ID],'Jim' AS [Name], 'Red' AS [FavoriteColor],

CAST('20110101 14:00:00' AS DATETIME2) AS [ModifedDate],

'007' AS [SpecialCode] UNION ALL

      SELECT 2, 'Sally','Blue', '20110205 12:30:00','025' UNION ALL

      SELECT 3, 'Tim', 'Orange', '20110308 04:17:00','097'  UNION ALL

      SELECT 4, 'Todd', 'Green', '20110512 18:22:00','101'  UNION ALL

      SELECT 5, 'Ann', 'Orange','20110719 19:59:00','002'

 

 

Left-click the origin of the results (the blank space left of the “ID” column header and above row number 1) to select all the results.  Then right click in the same spot and choose “Copy with Headers”.  Launch Excel and paste the data into Sheet1 in cell A1.  You should get a mess that looks like this:

image

Several things can quickly be fixed up about the presentation of this data.

  1. First of all, for some insane reason, Excel does not properly handle formatting date data pasted from SQL Management Studio – it assumes that the correct thing to do is format it as a time.  At least in this case, the data is still there and it’s just a presentation issue.
    • Highlight column D by clicking the column D header (above where it says “ModifiedD”) and then right clicking in the same location and choosing Format Cells…
    • Under the first tab, select category “Date” and then pick the closest one to what you want to see.
    • If you find one that is close, but not exact, select it, and then click “Custom” where you can see the actual formatting code under the “Type:” field for customization.  “Military” SQL time format for the Gregorian calendar is yyyy-mm-dd hh:mm:ss;@
    • Don’t worry about selecting the column header row when you do the formatting – Excel will usually ignore it (see below for some ways that this magic works).
  2. To fix the column widths, on the other fields, left-click the origin in Excel.  This should do a “Select All” on the current sheet.  Then, double click the vertical divider between cells A and B in the column header (when you are in the right place, the pointer should look like a vertical bar with an arrow pointing left and right).  If you did it correctly, all of the columns should have auto-sized to the width of the data.  You can do this for individual columns by just not doing “Select All” first.
  3. The other issue is that the leading zeros on the “SpecialCode” field have been dropped.  In this case, unfortunately, Excel has actually dropped the data and is treating the data as numeric.  There are two ways to fix this – a formula or a format.
    • To fix via formula, click on cell F2 and type =RIGHT(CONCATENATE("000",E2),3) and press <ENTER>.  Assuming you were doing a select statement in T-SQL from a table with a field named E2, this would be equivalent to RIGHT('000' + CAST(E2 AS VARCHAR(30)),3) To duplicate this formula to all successive cells in the column, you can just double-click on the “bump” on the lower right of the F2 cell border when it is selected.
    • The other way to fix it (since Excel actually does shred out the leading zeros) is to pre-format the column as text.  Select the column by clicking the column header for E, right-click and do Format Cells… and on the first tab, choose “Text”.  (You can also choose “Custom” and use the format code @ which basically means “whatever was in there”.).  Then go back to SQL Management Studio, reselect all and recopy the data with headers.  Then paste the data back in to Excel in A1 and it should no longer drop the leading 0s; the only difference is that now Excel puts a green triangle in the cell to warn you that the data is a “Number stored as text”.  Note that these format codes are doubly-useful for SQL DBAs as they are also used by SSRS as the number/date formatting codes.
      image
  4. The column headers in Excel are not differentiated from the data in any meaningful way.  Excel is a bit magic so it does make a difference to hint to it that row 1 is not data but header info.  The easiest way to do this is to click the “1” row header to select all of row 1 and hit CTRL-B to make the row bolded.  This is enough to ensure that Excel knows that these are really column names and not data.
    • Note that Excel is generally pretty smart and will realize that row 1 is a header row automatically if for example you have text in all fields in row 1, but some columns have numerics or other data types in all successive columns.
  5. One last handy thing is to do “Freeze Panes”.  Click Cell A2 (this is the upper left data field).  Then in Excel 2007, click the “View” ribbon tab and “Freeze Panes… Freeze Panes”.  In Excel 2003, it’s under Window… Freeze Panes.  Then if you scroll down, the column headers will remain, but only data will move. This is not super useful with a small data set, but it’s great for large ones.  Just remember to click cell A2 before you do it because anything above and to the left of the selected cell will be frozen when the feature is enabled.

Coming Next

In the next post, I will be showing you how to use VLOOKUP and some other functions in Excel.  VLOOKUP is an awesome function that allows Excel to do some fairly clever things to facilitate analyzing different sets of data (it is also how some users start to develop scary monster sheets, so be careful!).  I will also show you the Filter feature(awesome!!!) and also how to get a bit quicker with copying formulas around via the keyboard.

No comments:

Post a Comment