Archive for June, 2008

HashSet to the rescue

Wednesday, June 25th, 2008

This week while working on NHibernate.Remote I came across an unusual performance problem. When sending objects back and forth across the wire, I need to walk the object graph to ensure that things like lazy sessions are updated to correspond to the local ISession. This is done through a combination of reflection and serialization (which is reflection anyway). I should be doing this in a custom serializer, but MS has made it so incredibly difficult to extend the XmlObjectSerializer that I’ve just decided to walk the graph after deserialization.

While walking the object graph I need to store references to the objects I’ve already probed so that I don’t end up in an infinite loop due to circular references. Due to a combination of poor code and some optimization oversights, I ended up with just over 15000 objects in my reference list, which was just a standard generic List<object>. A query for ~1000 records was taking over 4 seconds to complete, which simply was not acceptable.

At first I thought that it was just an unavoidable consequence of using so much reflection. However, in a bid to speed things up a little, I ran the ANTS profiler on it. To my complete surprise it was the reference list that was slowing things down. After looking at my code I identified the weak points and cut down the number of references I was storing in the list, eventually ending up with ~4000 entries for my 1000 record query. The query was now completing in well under a second.
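For anyone who hasn’t written this kind of cycle guard before, here is a minimal sketch of the idea. It is not the NHibernate.Remote code: the Node type and the GraphWalker class are made up for illustration, and the real walk goes through reflection rather than a fixed Children list. The part that matters is the Contains() check against the list of objects already seen.

```csharp
using System.Collections.Generic;

// Hypothetical node type, for illustration only.
public class Node
{
    public List<Node> Children = new List<Node>();
}

public static class GraphWalker
{
    // Visits every node reachable from root exactly once and returns how many were seen.
    public static int CountReachable(Node root)
    {
        List<object> visited = new List<object>();
        Walk(root, visited);
        return visited.Count;
    }

    private static void Walk(Node node, List<object> visited)
    {
        if (node == null) return;

        // Without this check a circular reference would recurse forever.
        // List<T>.Contains scans the list from the start, so the check
        // gets slower as more objects are recorded.
        if (visited.Contains(node)) return;

        visited.Add(node);
        foreach (Node child in node.Children)
        {
            Walk(child, visited);
        }
    }
}
```

With two nodes that point at each other, CountReachable stops after seeing each node once instead of looping.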

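Even though the problem was solved it still bugged me. I had only mitigated the problem by reducing the number of entries rather than making the lookup faster. Every Contains() call on a List<object> scans the list from the start, so the cost of the whole walk grows roughly with the square of the number of objects. The same slowdown would come right back as soon as someone increased the number of records returned by their queries.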

Then today I unexpectedly ran across the solution to my problem: HashSet<T>. This is a collection that is new in the .NET 3.5 framework and lives in the System.Core assembly. It uses hashing to speed up the Contains() method, and boy is it fast. Because it is a set and not a list it cannot contain the same element twice, but that happens to be just perfect for my needs. After running the same buggy code with HashSet<object> instead of List<object>, my 1000 record query took about half a second to complete. That’s even faster than the optimized code was using List<object>. Of course optimizing is still important, but I gained an important speed boost from this exciting new collection. I know this will definitely find its way into some of my other projects too.
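Here is a rough sketch of what a HashSet-based visited set can look like. The ReferenceComparer and VisitedSetDemo names are my own for this example and not part of the framework or of NHibernate.Remote. The comparer is an assumption on my part too: by default HashSet<object> uses each object’s own Equals() and GetHashCode(), so two distinct entities that compare equal would be treated as one. For tracking references, comparing by identity seems like the safer choice.

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical comparer that treats two objects as equal only if they are the same instance.
public sealed class ReferenceComparer : IEqualityComparer<object>
{
    bool IEqualityComparer<object>.Equals(object x, object y)
    {
        return object.ReferenceEquals(x, y);
    }

    int IEqualityComparer<object>.GetHashCode(object obj)
    {
        // Identity-based hash code that ignores any GetHashCode override.
        return RuntimeHelpers.GetHashCode(obj);
    }
}

public static class VisitedSetDemo
{
    // Counts how many distinct instances appear in items.
    public static int CountDistinctReferences(IEnumerable<object> items)
    {
        HashSet<object> visited = new HashSet<object>(new ReferenceComparer());
        int count = 0;
        foreach (object item in items)
        {
            // Add returns false when the instance is already in the set,
            // so a separate Contains() call is not needed.
            if (visited.Add(item)) count++;
        }
        return count;
    }
}
```

In a graph walk, the same `if (!visited.Add(node)) return;` pattern replaces the Contains()/Add() pair used with a list.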

Unit tests, FTW!

Wednesday, June 18th, 2008

I’ve been working on an inventory system for my company and have been encountering a serious amount of resistance from both my neurons and the problem at hand. So I’m giving the old noggin a brief respite and I’m back to working on NHibernate.Remote. However, my foray into inventory land has taught me some valuable lessons.

  1. Dependency injection is fun, and very useful when used right.  I’m using the Castle Windsor framework and it seems to be working very well.
  2. Unit tests work and are necessary to ensure proper functioning of your software.

Now some of you may be spewing coffee right about now after reading number 2, because unit tests just seem like such a no-brainer these days.  However <whisper>I’m the only one in my company who does them</whisper>.  Anyhow, I am convinced of the need for good unit tests, even if I’m not completely ready to go straight to TDD.

The point of this story is that I am taking some time to write unit tests for NHibernate.Remote, and I expect to find lots and lots of bugs.