Saturday, December 28, 2013

Creating an MVC framework part 2 - First Decisions and Code

In writing the first bit of code a quickly came across a few key decisions. The very first one was what sort of routing I wanted to include. This lead to a deeper question of what sort of applications I wanted the framework to be suitable for.

My career has largely been spent creating line of business applications and that is the priority focus of the framework. Eventually I would like to expand on that but with limited time you have to simplify or get no where. I also didn't intend for it to be particularly opinionated but I'm naturally getting pushed that way. Again I hope to make it more flexible in future but for now it's simply not a priority.

The first casualty of this is routing, otherwise known as pretty URL's. This feature is simply not that important in a line of business app and it simplifies things to not have to worry about it at this point. For now, the only route is /{controller}/{action}/{querystring}.

The other big decision I made is project structure. I'm sick of having various parts of an application strewn about the project, or several projects, based on type. The grouping of this framework is going to be done by function. A standard project layout will be along the lines of:


  1. MvcProject
  2. -\_Common
  3. ---_Layout.cshtml
  4. -\Product
  5. ---_Controller.cs
  6. ---List.js
  7. ---List.cshtml


Let there be Light


So I created a new web application and in the global.asax.cs I put:


  1. protected void Application_Start(object sender, EventArgs e) {
  2.   RouteTable.Routes.Add(new Route("{*url}", new RequestHandler()));
  3. }
This tells ASP that we want to handle all incoming URL's with our request handler, which is all I want for now. RequestHandler looks like this:

  1. public class RequestHandler : IRouteHandler, IHttpHandler {
  2.   IHttpHandler IRouteHandler.GetHttpHandler(RequestContext requestContext) { return this; }
  3.   bool IHttpHandler.IsReusable { get { return true; } }

  4.   void IHttpHandler.ProcessRequest(HttpContext context) {
  5.     context.Response.Write("Hello World!");
  6.   }
  7. }
IRouteHandler and IHttpHandler are the interfaces we need to implement to recieve requests from ASP. I'm not sure if it's a good idea to combine them or not but it works for now. The real meat and potatoes starts with ProcessRequest though, it might not look like much but this is the entry point to our framework. This is where the adventure begins.

Wednesday, December 18, 2013

Creating an MVC framework part 1 - Why???

So I was working on an application framework, something that would take some of the drudgery out of creating a new application. I like things to be nicely structured, but setting up this architecture takes time. Usually by the time I've set it up that urge to solve the original problem had waned and my projects folder was filled with yet another orphan.

I just wanted to solve the most common things. Validation, logging, permissions are the obvious ones, I find the implementation of these in microsofts MVC framework to be pretty terrible, it's almost always one of the first things I replace. I like the command/query model, so I wanted that in place. I want to be able to offer desktop/mobile integration further down the line. I want easy integration with knockoutjs. Most importantly, I don't want to have to set all this up every single time.

So as I'm sporadically putting various parts together I noticed one thing. MS MVC is getting in my way and stopping me from creating this framework, It must go! So what was once going to be an application framework is now going to be MVC framework as well.

A touch of arrogance


I think if your going to replace something then you need to replace it with something better, which requires at least a modest level of arrogance. You have to know how what you want to do better and hopefully have a rough idea on how to get there.

As I got further through my exploratory and brainstorming phase I started to get more confident that yes, I can make a better MVC.

Controllers


The first thing I started looking at is what I wanted my controllers to look like. I came up with this:


  1. public abstract class Controller {

  2.   [Route("/product/view/*/{Id}")]
  3.   public abstract object View(ProductDetailsQuery query);

  4.   public abstract object List(ProductListQuery query);

  5.   [Roles("ProductManager")]
  6.   [HttpPost]
  7.   public object Create(ProductCreateCommand command) {
  8.     System.Execute(command);
  9.     User.Message(MessageType.Success, "Product has been created");
  10.     return User.Redirect("/Product/List");
  11.   }
  12. }

There are a number of interesting things here, First of all, the controller is responsible for defining a public API, so routes and permissions are defined by attributes. MS MVC has recently added attribute based routing as well.

A permissions system will be provided by the framework that will allow role based and finer grained permissions.

The controller is abstract. For many common actions like view and list in the above sample, all the controller does is delegate responsibility. Abstract actions will have the ability to be routed by default. Saving those couple of lines of code was one of the main drivers for my framework, yes I'm that anal.

The HttpPost may seem fairly standard, but by convention this will enlist a transaction.

All action return objects. There are no view results, json results or anything like that. It's one object in, one object out. The format of the result is determined by the caller.

We don't have to check if ModelState.IsValid on *every single function*. That is done for us.

It's just called Controller. I will mention the project layout at a later date.

There are a couple of big interfaces in there, User and System. These are basically wrappers for real objects. I really like this abstraction because it makes it explicit who where interacting with. Where not interacting with NotificationSubSystem, where interacting with the user. We aren't interacting with the command executor, we are interacting with the system.

Fingers Crossed


This is the current plan any way. Some ideas may turn out to be bad ones, some might be brilliant and there could be many more additions. Finding out which is which should be a fun journey.












Sunday, December 15, 2013

ORM vs SQL Part 3: Filtering

The final of the 3 basic scenarios that nearly every application will need is filtering, the where clause in an sql statement. An application will have to display a list of products in a category, filter a list of users by email address, get an order by it's id or something along those lines. It's practically impossible to write an application with out it.

The most interesting thing about filtering is that it's rather simple to write a query that can handle all your scenarios. The problem is that it renders all your carefully planned indexes irrelevant or worse.

Sql server can be quite fickle about using indexes, depending on the cardinality, execution plan and phase of the moon. I'll show just how stupid it can sometimes be.

Filtering with an ORM


Once again we can build our dynamic query with an ORM quite simply:


  1. var query = Session.QueryOver<User>();
  2. if (string.IsNullOrEmpty(filterId))
  3.   query = query.Where(x => x.Id == filterId);
  4. if (string.IsNullOrEmpty(filterLastName))
  5.   query = query.Where(Restrictions.On<User>(x => x.LastName).IsLike(filterLastName));
  6. if (string.IsNullOrEmpty(filterEmail))
  7.   query = query.Where(Restrictions.On<User>(x => x.Email).IsLike(filterEmail));
  8. var result = query.List();
Once again, our ORM builds a prepared statement for every unique situation which can make use of the appropriate indexes if we have any. It's not neccessarily fast, that will depend on your performance tuning, but it's as fast as it can be.

Filtering with Sql


Dynamically filtering with sql is also quite simple. To get the same functionality as above:

  1. select TOP(20) id, lastName, firstName, email
  2. from users 
  3. where --lastName = '21250A22-6B3F-452C-9FE3-7EBBF216B13C'
  4.   (@id is null OR @id = @id)
  5.   AND
  6.   (@lastName is null or lastName Like @lastName + '%')
  7.   AND
  8.   (@email is null or email LIKE @email + '%')
This is much more elegeant than our solutions in the past two posts. The first problem is, what if you want some other logic, if you want to search by last name or email then this won't work. You'll either need something far more elaborate or you'll need several identical queries. The other problem is that it may not use an index, or even worse, will use the wrong index.

What index will this use?


Well the answer is it depends on a lot of things. The first query, collected statistics and moon phases being the primary influences. I don't know about you, but I like my systems to perform consistently. Assuming the above query is in a stored procedure:

  1. execute the procedure with @id=null, @lastName='abc', @email=null
  2. Query finishes in ~280ms. The index on lastName was used
  3. execute the procedure with @id=null, @lastName=null, @email='abc'
  4. Query finishes in ~11000ms, still using the index on lastName, not the index on email
  5. execute the procedure with @id='C15BAFAA-B772-4863-8849-00413B531C29', @lastName=null, @email=null
  6. Query finishes in ~11000ms, still using the index on lastName, not the primary key index
  7. execute the procedure with @id=null, @lastName=null, @email=null
  8. Query finishes in ~12000ms, still using the index on lastName
Deleting our indexes and repeating the queries in the same order takes ~50ms, ~60ms, ~20ms, 20ms.

So a query like this will not only render many of your indexes useless but can cause the wrong index to be used, which is much worse. Again the tailor made query that an ORM will generate will have much better performance than a generic one that you can create. 












Tuesday, December 10, 2013

ORM vs SQL Part 2: Ordering

Most applications will need to to control the ordering of the data being presented to users, the exact requirements will differ depending on the application and situation, but virtually all will require it. Sometimes the same data will presented in several places, each requiring it's own order for natural reasons. Other times dynamic tables will be presented and users will control what order the view information in. Some even allow full control and let the users specify multiple orders.

In the later two cases ordering usually goes hand in hand with filtering, but I'm going to leave the filtering until part three. In part one I looked at paging.


Ordering with an ORM


Simple ordering is straight forward with an ORM (and with sql for that matter):

  1. var users = session.QueryOver<User>().OrderBy(x => x.LastName).Asc.List();
In fact, at this level, sql is even simpler. An ORM starts to shine when you have some other logic controlling the ordering because you get a composable query object to work with, for example:
  1. var query = session.QueryOver<User>();
  2. if (orderByLastName == true)
  3.   query = query.OrderBy(x=>x.LastName).Asc;
  4. if (orderByLastLogin == true)
  5.   query = query.OrderBy(x=>x.LastLogin).Asc;
By composing queries like this, any ordering that can be logically described can be performed, your only limit is your own code and how much power and flexibility you want to give the end users.

The ORM will generate the sql needed for that individual scenario and make full use of our indexes.


Ordering with Sql


Simple ordering in sql is extremely straight forward:

  1. select TOP 10 * from users order by lastName

Most importantly, this will use the index on lastName. On my computer this will execute in ~5ms with 1,000,000 rows. This is essentially what our ORM will generate. 

Now, let's say our application is showing a list of users in two places, the first we are trying to find users by last name and the second we want to see who were the latest to login. We can write two stored procedures, which will be identical apart from the order by clause. Or we can add some dynamic ordering to the query:

  1. SELECT TOP 10 id, lastName, firstName
  2. FROM users
  3. order by CASE
  4.   WHEN @OrderBy = 'LastNameAsc' THEN lastName
  5.   WHEN @OrderBy = 'FirstName' THEN firstName
  6. END
The drawbacks for this are that the data types have to be the same in the case statement, so we can't use the same query to order by lastLogin. Likewise, we can't mix ascending and descending. And even if we are ordering on lastName, which has an index in this scenario, the index will not get used. It also performs quite badly, 753ms  which is more than two orders of magnitude slower for the same functionality.

In part one I said the new 2012 offset/next syntax performed quite slowly with dynamic ordering, this is why.


Ordering with Sql - Again


So being unable to mix datatypes is a deal breaker in our scenario. we want to order by either lastName or lastLogin. The easiest way to do this is to invite our old friend from part one, rownumber: 

  1. SELECT TOP 10 id, lastName, firstName, CASE
  2.   WHEN @OrderBy = 'LastNameAsc' THEN ROW_NUMBER() OVER(ORDER BY lastName)
  3.   WHEN @OrderBy = 'LastLoginDesc' THEN ROW_NUMBER() OVER(ORDER BY lastLogin desc)
  4. END  as rownumber 
  5. FROM users
  6. order by rownumber
Finally, we can order by multiple data types and by ascending or descending order. But it comes at a hefty price, it's yet another order of magnitude slower, ~5800ms, way beyond what I would consider acceptable.


Ordering and Paging


So now let's look at the combination of paging and ordering:
  1. SELECT TOP(@take) id, lastName, firstName, email, lastLogin 
  2. FROM (
  3.   SELECT id, lastName, firstName, email, lastLogin, CASE
  4.     WHEN @OrderBy = 'LastNameAsc' THEN ROW_NUMBER() OVER(ORDER BY lastName)
  5.     WHEN @OrderBy = 'LastLoginDesc' THEN ROW_NUMBER() OVER(ORDER BY lastLogin desc)
  6.   END  as rownumber 
  7.   FROM users
  8. ) as query 
  9. WHERE query.rownumber > @skip
  10. ORDER BY rownumber

Our simply little select isn't so simple any more and the performance has suffered terribly (~8500ms), to the point where it is immediately noticeable to end users. All because you thought ORM's were too slow.

We are quickly approaching the limits of what we can do with sql. The ORM can easily handle ordering over multiple columns dynamically, to do the same with sql would require a fundamentally different, and probably much more complex, strategy.


Why are ORM's faster?


In this scenario our ORM is 3 orders of magnitude faster than a stored procedure with the same functionality. This is simply because it generates queries dynamically, it doesn't have to account for every possibility at programming time because it can adjust at run time instead.

Sql, despite frequently being championed as faster, can not even make use of indexes appropriately as soon as you throw a real world situation at it.

The final score, with paging and ordering is ORM: ~10ms, Sql: ~8500ms.

Next up, I'll take a look at dynamic filtering.




















ORM vs SQL Part 1: Paging

In this series of posts I want to talk about the real world ramifications of not using an ORM for application development and why falling back to SQL and stored procedures is a poor choice.

Quite often the argument seems to be that an SQL select and a few joins is simple and effective (which it is), and that ORM's are a useless abstraction. The problem is that simple selects don't get you very far, even when creating the CRUDiest of applications.

In real applications, we have to do paging, sorting (dynamically), joining (dynamically) and filtering (dynamically). That's just on the read side, not even considering what an ORM can do for you when your writing data. I'm going to go through the abstractions an ORM provide and show how they make life easier for us.

Some caveats before I go on:

  1. I'm talking about general application development, web, desktop or mobile doesn't really matter here.
  2. I'm talking about general line of business applications. If your working on something more exotic you may have legitimate needs that an ORM doesn't cater for.
  3. The ORM I'm using will be nHibernate, but the same applies to all ORM's.
  4. I'll mostly be talking about Sql Server.
  5. Stored procedures have there place, if your ORM creating a legitimate bottleneck you should feel free to work around it entirely.


Paging with an ORM


Here is an example of paging a list with nHibernate, it's simple and it works, I don't even have to bother explaining it:


  1. Session.QueryOver<Product>().Skip(20).Take(10)


Paging with SQL


I'm going to assume Sql Server here but it varies from database to database. Before we go down that path I will also say that database independence is not a good reason to use an ORM.

A quick Google search for sql paging will give you quite a number of different possibilities. MySQL, for all it's faults, at least includes the ability to say "LIMIT 20, 10", which means skip 20, take 10. Those of use working with Sql Server aren't so lucky, it blows my mind that such  a common scenario wan't at the forefront of developers minds when Sql Server was first created and they've had to add various hacks over the last decade. Now that we can (hopefully) pretend that Sql Server 2000 no longer exists the best option seems to be:


  1. SELECT TOP (@take) id, name 
  2. FROM (
  3. SELECT 
  4. id, name, ROW_NUMBER() OVER(ORDER BY name) as rownumber 
  5. FROM products) as query 
  6. WHERE query.rownumber > @skip 
  7. ORDER BY query.name

This has good performance, it's relatively simple and it works well with dynamic ordering, which I'll get to in part two. This is exactly the sql that nHibernate will generate for you. Another possibility, as of Sql Server 2012 is:


  1. SELECT *
  2. FROM products
  3. ORDER BY name OFFSET @skip ROWS FETCH NEXT @take ROWS ONLY;

While still lacking the elegance of  MySQL, it works and is faster than the previous example. Throw in dynamic ordering though (part 2) and the performance goes out the window. There are other possibilities I haven't covered, the pros and cons of each is worth it's own blog series, though I won't be the one writing it (if you know of one, let me know and I will link it here).

The dumbest one I've seen, unfortunately the most common also, is to page on the client side (client being the application, not end user). The idea seems to be that you cache entire result sets and filter/order it in application code. People actually think this performs better, but I'll save this for a longer rant later.

That doesn't seem too bad


Well it isn't too bad but it's not such a simple query anymore. This extra complexity goes into nearly every query in the application, so it's a non-trivial amount of extra code to maintain. In later posts we'll see how this complexity is multiplied by other, equally necessary features.

Finally, compare it with the simplicity of doing the same in an ORM, which will execute the exact same query and perform the same, all that extra code starts to look pretty meaningless.

In part two I'll take a look at dynamic ordering.