
Swapping from EF6 to Dapper was one of the best choices we ever made with our project stack. It is such a relief to be able to hand-tune queries and transactions now. Initially, we were sold on the apparent simplicity of EF6, but as with many things there is a cost to an abstraction like this. In our case, the performance penalties and opaqueness were dealbreakers after a while. We saw an average speedup of 10x across our SQL operations after rewriting them as direct SQL, with Dapper handling the mapping of result sets into models.
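
For anyone who hasn't used it, the Dapper side is basically this: you write the SQL yourself and it maps the result set onto a POCO by column name. A minimal sketch (the Order class, table, and connection string are invented for illustration):

    using System.Collections.Generic;
    using Dapper;
    using Microsoft.Data.SqlClient;

    // Hypothetical POCO; Dapper maps result-set columns to properties by name.
    public class Order
    {
        public int Id { get; set; }
        public string CustomerName { get; set; } = "";
        public decimal Total { get; set; }
    }

    public static class OrderQueries
    {
        public static IEnumerable<Order> LargeOrders(string connectionString, decimal min)
        {
            using var conn = new SqlConnection(connectionString);
            // Hand-written SQL; Dapper only does parameter binding and mapping.
            return conn.Query<Order>(
                "SELECT Id, CustomerName, Total FROM Orders WHERE Total > @min",
                new { min });
        }
    }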

Writing your own SQL migrations is actually incredibly straightforward if you just think for a few minutes about how you would do it if ORMs didn't exist. Many database systems have ways to store metadata like schema versions outside the scope of any table structure, so you can leverage these really easily in your scripts/migration logic. EF6 uses an explicit migration table, which I was never really a huge fan of.
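
As a concrete example of the "metadata outside any table" idea: SQLite keeps a user_version integer in the database header (readable via PRAGMA), and SQL Server has extended properties. A minimal sketch of a migration runner keyed off the SQLite pragma (table names and scripts are placeholders; a real runner would load the scripts from files):

    using Microsoft.Data.Sqlite;

    // Placeholder migration scripts, ordered by version.
    var migrations = new (long Version, string Sql)[]
    {
        (1, "CREATE TABLE ProcessInstance (Id INTEGER PRIMARY KEY, Name TEXT);"),
        (2, "ALTER TABLE ProcessInstance ADD COLUMN Json TEXT;"),
    };

    using var conn = new SqliteConnection("Data Source=app.db");
    conn.Open();

    // Read the current schema version from the database header (no table needed).
    long current;
    using (var check = conn.CreateCommand())
    {
        check.CommandText = "PRAGMA user_version;";
        current = (long)check.ExecuteScalar()!;
    }

    foreach (var (version, sql) in migrations)
    {
        if (version <= current) continue;
        using var tx = conn.BeginTransaction();
        using var cmd = conn.CreateCommand();
        cmd.Transaction = tx;
        // Apply the script and bump the stored version together.
        cmd.CommandText = sql + $" PRAGMA user_version = {version};";
        cmd.ExecuteNonQuery();
        tx.Commit();
    }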

One thing we did that dramatically eased the pain of writing SQL was to use JSON serialization for most of our complex, rapidly-shifting business models, storing the serialized blob alongside a metadata row for each instance. Our project would be absolutely infeasible for us today if it weren't for this one trick. Serializing and deserializing a JSON column to and from a model containing 1000+ properties in complex nested hierarchies is far faster than trying to build up a query that would accomplish the same thing if explicit database columns existed for each property across the many tables.
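
In case it helps picture it, a minimal sketch of the read/write path under that scheme, assuming System.Text.Json and a hypothetical ProcessInstance table with metadata columns plus one Json column (all names invented):

    using System.Collections.Generic;
    using System.Text.Json;
    using Dapper;
    using Microsoft.Data.SqlClient;

    // Stand-in for the real models, which are large nested hierarchies.
    public record ProcessState(string Name, Dictionary<string, decimal> Figures);

    public static class StateStore
    {
        public static void Save(SqlConnection conn, int id, ProcessState state)
        {
            // Metadata lives in real columns; the model itself is one JSON blob.
            conn.Execute(
                "UPDATE ProcessInstance SET Json = @json, UpdatedUtc = SYSUTCDATETIME() WHERE Id = @id",
                new { id, json = JsonSerializer.Serialize(state) });
        }

        public static ProcessState? Load(SqlConnection conn, int id)
        {
            var json = conn.ExecuteScalar<string?>(
                "SELECT Json FROM ProcessInstance WHERE Id = @id", new { id });
            return json is null ? null : JsonSerializer.Deserialize<ProcessState>(json);
        }
    }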



If I was starting from scratch I would probably use a micro ORM like Dapper, or in my case Petapoco, which I have experience with.

There are so many "gotchas" to using EF that I'm not sure it was worth using to begin with (for a complex project, at least). Eight years into developing and maintaining a huge ecommerce platform built on EF, I think that if I started over I would probably not use EF.


This is just another take on EAV (https://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80...).

I'm unsure whether it's better than the typical way of modeling EAV, though; I'd have to think it through.


I am seriously glad I don't work on this project!


Care to elaborate on which aspect noted above is a problem from your perspective? I would be happy to provide more context and explain some of the reasoning behind our decisions.


> Deserializing a JSON blob to/from a column into/from a model containing 1000+ properties in complex nested hierarchies

> complex, rapidly-shifting business models

Your business logic classes have 1000+ properties. And you plan to not migrate them when the schema changes but leave many instances with old versions of the schema sitting in the datastore. Your application logic is going to get nasty!


The other aspect here is that the lifetime of these objects is very brief for our application. Typically 10-60 minutes. Schema changes, while breaking, are tolerable along business cycle boundaries.


Why use SQL at all? It sounds like you needed a key value store?


Some commenters have asked what kind of business would deploy this kind of technique.

I've done it. For MLS syncing software, there are lots of properties: not thousands, but it can be hundreds. And each MLS RETS feed has its own schema, so for any kind of logic portability this is necessary.

Actually, storing the raw data as a blob is a flexibility technique and is a separate concern from the number of fields. Since I can't predict the future set of optimized queries I'll need, and I don't want to constantly sync and resync (some MLSes will rate-limit you), this way I can store the raw data once and then parse it and update my tables/indexes very quickly.
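
To make the reprojection idea concrete, a small sketch: pull a couple of fields back out of the stored raw payloads and refresh a query table, without touching the rate-limited feed again (table and field names are invented; real RETS payloads and mappings vary per MLS):

    using System.Text.Json;
    using Dapper;
    using Microsoft.Data.SqlClient;

    public static class ListingProjector
    {
        // Rebuild a queryable index from raw payloads already stored locally.
        public static void Rebuild(SqlConnection conn)
        {
            var rows = conn.Query<(long Id, string Raw)>(
                "SELECT Id, RawJson FROM ListingRaw");

            foreach (var (id, raw) in rows)
            {
                using var doc = JsonDocument.Parse(raw);
                var price = doc.RootElement.GetProperty("ListPrice").GetDecimal();
                var status = doc.RootElement.GetProperty("StandardStatus").GetString();

                conn.Execute(
                    "UPDATE ListingIndex SET Price = @price, Status = @status WHERE ListingId = @id",
                    new { id, price, status });
            }
        }
    }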


I didn't quite understand how your solution worked in the end. Are you storing the entire object graph as a JSON blob alongside the relational data in the table, or are you simply storing the JSON blob instead of using relational data?

It's difficult for me to picture how Dapper even comes into play when you're doing this trick with the JSON blob.

Why not use NoSQL?

Also, 1000+ properties on an object? I know some domains occasionally surface these kinds of extreme cases, but is there not an alternative to having 1000 properties in one object?


For certain abstractions, we are looking to maintain the state of a business process over time:

The relational concern is the storage of metadata sufficient to locate & retrieve the state. E.g.: integer primary key, name of process, current transition in process, some datetime info, active session id, last user, etc.

The 'non-relational' portion is simply a final 'Json' column per row that contains the actual serialized state. This state model can vary wildly depending on the particular process in play, so by having it serialized we can get a lot of reuse potential out of our solution.

In terms of the models, it's not one gigantic class with 1000 properties. It's more like a set of 100+ related models with 5-30 properties each.
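
Roughly, a row looks something like this (a minimal illustrative DDL string; the column names here are invented for the example, not our actual schema):

    public static class Schema
    {
        // Relational metadata for locating/retrieving state, plus one JSON
        // column holding the serialized model graph for the process.
        public const string CreateProcessInstance = @"
            CREATE TABLE ProcessInstance (
                Id           INT IDENTITY PRIMARY KEY,
                ProcessName  NVARCHAR(200)    NOT NULL,
                CurrentStep  NVARCHAR(200)    NOT NULL,
                SessionId    UNIQUEIDENTIFIER NULL,
                LastUser     NVARCHAR(100)    NULL,
                CreatedUtc   DATETIME2        NOT NULL,
                UpdatedUtc   DATETIME2        NOT NULL,
                Json         NVARCHAR(MAX)    NOT NULL
            );";
    }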


Ah yup, that makes a lot more sense. I can imagine how "fun" that would be to try to do through EF.

It sounds like a workflow engine. I'm picturing one very generic table that tracks "this job id, this workflow type, this stage in the workflow, this entity, this state of the entity", where a single job has multiple of those entries over the lifetime of that job execution and the JSON blob is the current state of things, so that you don't have to go and recompute it.

Yeah, it seems like a reasonable choice.

What sticks out to me in a scenario like that is a good CQRS implementation. The write side of it pumps in the history of the job execution, then denormalizers run to project that into a shape amenable to being read by the application.
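
A toy sketch of what I mean by the denormalizer side, assuming an append-only JobHistory table on the write side and a one-row-per-job read model (all table and column names are made up):

    using Dapper;
    using Microsoft.Data.SqlClient;

    public static class JobProjection
    {
        // Project the latest state in a job's history into a read-optimized row.
        public static void Project(SqlConnection conn, long jobId)
        {
            var latest = conn.QuerySingleOrDefault<string?>(
                @"SELECT TOP 1 StateJson
                  FROM JobHistory
                  WHERE JobId = @jobId
                  ORDER BY Sequence DESC",
                new { jobId });
            if (latest is null) return;

            conn.Execute(
                @"MERGE JobReadModel AS t
                  USING (SELECT @jobId AS JobId) AS s ON t.JobId = s.JobId
                  WHEN MATCHED THEN UPDATE SET CurrentStateJson = @latest
                  WHEN NOT MATCHED THEN INSERT (JobId, CurrentStateJson) VALUES (@jobId, @latest);",
                new { jobId, latest });
        }
    }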




