Build upon SQL's built-in statistical functions
Takeaway: Arthur Fuller defines statistical terms and functions and then demonstrates how you can benefit from using SQL's built-in statistical functions.
Even though SQL wasn't designed for calculating statistics, it does include a number of basic statistical functions. In this article, I'll define statistical terms and functions and then demonstrate how you can benefit from using SQL's built-in statistical functions.
Here are some basic statistics terms (which, if you're not very familiar with statistics, will come in handy when I define statistical functions below):
Now, here's a list of statistical functions (as well as some standard arithmetical functions that are useful in calculating statistics) and their meanings.
Notice that there aren't any built-in functions for calculating mode and median. Nevertheless, you can derive these values with a little effort.
Using the Northwind sample database, let's suppose that you want to determine the mode of quantities ordered. Here's how you do it:
SELECT TOP 1 quantity, COUNT(*) Count
FROM [order details]
GROUP BY quantity
ORDER BY Count DESC
Calculating the median isn't much more difficult (although there is a wrinkle). The basic strategy is to join the table to itself on the column of interest, using one table to count the rows that are less than the given value, and using the other table to count the rows that are greater than that value. This is the query that calculates the median:
SELECT a.quantity median
FROM [order details] a, [order details] b
GROUP BY a.quantity
HAVING
SUM(CASE WHEN b.quantity <=
a.quantity
THEN 1 ELSE 0 END)
>=(COUNT(*))/2
AND
SUM(CASE WHEN b.quantity >=
a.quantity
THEN 1 ELSE 0
END)>=(COUNT(*)/2)
Here's the wrinkle: The code above assumes that the number of rows is odd; therefore, there is guaranteed to be a middle row. When the number of rows is even, you have a choice of which row to return. In my experience, most statisticians return the lower of the two middle values.
I'm not recommending that you write an entire statistics package using SQL. But, if your needs are relatively simple, you can use the built-in functions, build upon them, and derive the basic statistical values.
TechRepublic's free SQL Server newsletter, delivered each Tuesday, contains hands-on tips that will help you become more adept with this powerful relational database management system. Automatically subscribe today!
Print/View all Posts Comments on this article
|
|
|
|
|
|
|
|
White Papers, Webcasts, and Downloads
- VMware Infrastructure: A Guide to Bottom-Line Benefits VMware Frustrated by the high cost of maintaining or building ever-larger data centers? Get the facts you need to formulate your Virtualization Action Plan. Download Now
- Why Isn't Server Virtualization Saving Us More? A Few Small Changes May Dramatically Increase Your Efficiency VMware Ever wonder why your company isn't saving more from its server virtualization? Making a few small changes could dramatically increase your efficiency. Download Now
- The Impact of Virtualization Software on Operating Environments VMware Today's use of virtualization technology allows IT professionals to ... Download Now
- Five Steps to Determine When to Virtualize YourServers VMware Thinking of virtualizing the servers at your company? Use this step-by-step guide to determine when's the best time to make your big move. Download Now
- Leveraging SMB ERP for an Economic Recovery ZDNet Times are tough but better days are sure to follow. In the wake of an ... Download Now
Article Categories
- Security
- Security Solutions, IT Locksmith
- Networking and Communications
- E-mail Administration NetNote, Cisco Routers and Switches
- CIO and IT Management
- Project Management, CIO Issues, Strategies that Scale
- Desktops, Laptops & OS
- Windows 2000 Professional, Microsoft Word, Microsoft Excel, Microsoft Access, Windows XP,
- Data Management
- Oracle, SQL Server
- Servers
- Windows NT, Linux NetNote, Windows Server 2003
- Career Development
- Geek Trivia
- Software/Web Development
- Web Development Zone, Visual Basic, .NET

