I’m currently working a project with a major focus on insert – throughput and I’d thought I mention a few things I’ve done to improve that throughout that are departures from your typical CRUD .Net application.
- Don’t use identities.
- They require an additional read immediately after inserting
- They defeat insert batching
- Use Guid.Comb instead
- Creates Guids with just enough order to enable SqlServer to index effectively
- Generated by NHibernate without a round trip to the server.
- Enables deferment on inserts
- Enables insert batching
- Lock only aggregate roots and not every entity
- Control entity access through repositories for each aggregate root entity.
- Use pessimistic locking instead of optimistic locking (with timestamps/versions)
- Using a timestamp results in an additional read after each insert or update
- Defeats batching
- Use a table per hierarchy instead of a table per subclass (which is the default)
- A joined-subclass table strategy requires two inserts, one to the primary table and then the a second to the joined table.
- Can’t be batched
- Look hard at inheritance hierarchies. Batching only works for entities of the same type, so subclass entities cannot be batched with their base types. I whittled down a few classes and opted for null-properties (exposed as components) rather than subclass – you shouldn’t be building a hierarchy based on data anyway, but behaviour.
- Use second level caching for reference objects. I had a few entities were really almost static in nature but need to be referenced by each inserted record. I used the HashtableCache (not really a enterprise cache) to hold these entities. I am not concerned about the cache growing out of control and growing stale, and the application is not clustered so my choice I think stands up. By caching these entities I save a lot of reads from the database.
In general what I have attempted to do is to minimise the calls the database. NHibernate batching is brilliant at this, enabling any number of commands to be issued in a single ADO batch, but it works only under certain conditions.
There were a few other ideas put to me by the local DBA which I haven;t implemented but would consider:
- Drop foreign key constraints as the constraints are handled at the application layer
- Don’t have a primary key on the inserting tables, rather simply have indexed columns. This would enable SqlServer to heap insert rather than B-tree inserts, which if the primary key is sequential might become problematic as the B-tree would need to rebalance.
