I’ve been meaning to wade into the NoSQL debate for a while but couldn’t figure out exactly where to start. A comment on a Reddit post about what “backend to choose” got me excited enough to write:

What is a NoSQL DB? That term means nothing – every DB needs a “structured query language” to convert its storage schema into an abstraction you use in your code. “NoSQL” usually means “NoSchema” – or rather, “NoFixedSchema.”

Why is it sometimes important to have a fixed schema? Because you want to impose some semantics around an attribute lacking a value, and that usually means requiring some attributes or columns. When the required columns in a class of objects significantly outnumber the non-required or variable columns, you are essentially dealing with a fixed schema, and you might as well take advantage of the performance benefits this assumption gives you.
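To make "requiring some attributes" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The business table and its columns are made up for illustration; the point is only that a fixed schema lets the database itself reject a row that is missing a required value.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a small fixed schema (hypothetical table)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE business (
            id    INTEGER PRIMARY KEY,
            name  TEXT NOT NULL,   -- required
            city  TEXT NOT NULL,   -- required
            phone TEXT             -- optional, may be NULL
        )
        """
    )
    return conn


def try_insert(conn, name, city, phone=None):
    """Return True if the row was stored, False if the schema rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO business (name, city, phone) VALUES (?, ?, ?)",
                (name, city, phone),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised when a NOT NULL column is given no value.
        return False
```

With this in place, try_insert(conn, "Corner Cafe", "Austin") succeeds, while try_insert(conn, "Mystery Shop", None) is refused by the database, not by application code.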

If instead you have a use case like Yelp.com, where each business entity has a large number of non-fixed attributes that vary greatly from region to region, keeping the schema up to date with the new information you gather might not be worth the effort.
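As a rough illustration of the no-fixed-schema case, the sketch below keeps each business as a free-form JSON document. The attribute names (drive_through, nearest_station) and the businesses are invented, and a real document store would add indexing and querying on top; this only shows that a new regional attribute needs no schema change.

```python
import json


def to_document(business):
    """Serialize a business as a JSON document; any set of attributes is accepted."""
    return json.dumps(business, sort_keys=True)


def with_attribute(documents, key):
    """Return the decoded businesses that carry the given attribute."""
    return [b for b in map(json.loads, documents) if key in b]


# Two businesses whose attributes differ by region (hypothetical data).
docs = [
    to_document({"name": "Taqueria Uno", "city": "Austin", "drive_through": True}),
    to_document({"name": "Ramen Ichi", "city": "Tokyo", "nearest_station": "Shibuya"}),
]

# Only the Tokyo entry has a nearest_station attribute.
tokyo_style = with_attribute(docs, "nearest_station")
```

The trade-off is the flip side of the fixed-schema case: nothing stops a document from omitting an attribute you were counting on, so that check moves into your code.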

Your DB choice should reflect your real-world semantics. A wise friend recently said to me: SQL works for 90% of startups, NoSQL works for 90% of startups, and the 10% on either side are a volatile population. Those are wise words, indeed.

There’s more to be said, but that’s all I’ll say for now.