Sunday, December 15, 2013

ORM vs SQL Part 3: Filtering

The final of the 3 basic scenarios that nearly every application will need is filtering, the where clause in an sql statement. An application will have to display a list of products in a category, filter a list of users by email address, get an order by it's id or something along those lines. It's practically impossible to write an application with out it.

The most interesting thing about filtering is that it's rather simple to write a query that can handle all your scenarios. The problem is that it renders all your carefully planned indexes irrelevant or worse.

Sql server can be quite fickle about using indexes, depending on the cardinality, execution plan and phase of the moon. I'll show just how stupid it can sometimes be.

Filtering with an ORM

Once again we can build our dynamic query with an ORM quite simply:

  1. var query = Session.QueryOver<User>();
  2. if (string.IsNullOrEmpty(filterId))
  3.   query = query.Where(x => x.Id == filterId);
  4. if (string.IsNullOrEmpty(filterLastName))
  5.   query = query.Where(Restrictions.On<User>(x => x.LastName).IsLike(filterLastName));
  6. if (string.IsNullOrEmpty(filterEmail))
  7.   query = query.Where(Restrictions.On<User>(x => x.Email).IsLike(filterEmail));
  8. var result = query.List();
Once again, our ORM builds a prepared statement for every unique situation which can make use of the appropriate indexes if we have any. It's not neccessarily fast, that will depend on your performance tuning, but it's as fast as it can be.

Filtering with Sql

Dynamically filtering with sql is also quite simple. To get the same functionality as above:

  1. select TOP(20) id, lastName, firstName, email
  2. from users 
  3. where --lastName = '21250A22-6B3F-452C-9FE3-7EBBF216B13C'
  4.   (@id is null OR @id = @id)
  5.   AND
  6.   (@lastName is null or lastName Like @lastName + '%')
  7.   AND
  8.   (@email is null or email LIKE @email + '%')
This is much more elegeant than our solutions in the past two posts. The first problem is, what if you want some other logic, if you want to search by last name or email then this won't work. You'll either need something far more elaborate or you'll need several identical queries. The other problem is that it may not use an index, or even worse, will use the wrong index.

What index will this use?

Well the answer is it depends on a lot of things. The first query, collected statistics and moon phases being the primary influences. I don't know about you, but I like my systems to perform consistently. Assuming the above query is in a stored procedure:

  1. execute the procedure with @id=null, @lastName='abc', @email=null
  2. Query finishes in ~280ms. The index on lastName was used
  3. execute the procedure with @id=null, @lastName=null, @email='abc'
  4. Query finishes in ~11000ms, still using the index on lastName, not the index on email
  5. execute the procedure with @id='C15BAFAA-B772-4863-8849-00413B531C29', @lastName=null, @email=null
  6. Query finishes in ~11000ms, still using the index on lastName, not the primary key index
  7. execute the procedure with @id=null, @lastName=null, @email=null
  8. Query finishes in ~12000ms, still using the index on lastName
Deleting our indexes and repeating the queries in the same order takes ~50ms, ~60ms, ~20ms, 20ms.

So a query like this will not only render many of your indexes useless but can cause the wrong index to be used, which is much worse. Again the tailor made query that an ORM will generate will have much better performance than a generic one that you can create. 

No comments:

Post a Comment