My Regular DBA Skillz Are Leading Me Astray

There’s a number of sort of ambient things that a DBA thinks about and some of them are a bit misleading in the context of phone applications. I’m feeling more and more like I have to rethink everything for the phone world.

For instance, the usual SQL situation is that there is a lot of concurrency, there are multiple potential applications using any given server and maybe sharing one or more databases. Further, it is often the case that each of the applications has different sorts of operations they are trying to facilitate and they have to be compatible. For instance, commonly you don’t mix OLTP type things with OLAP type things because the performance is horrible when you do; but on a phone you do that all the time — largely because there probably is nothing to compete with your OLAP when you run it (and also because there is less of a choice).

Usually the big considerations start from the fundamental question “What’s the unit of work?” The elementary operations are super-critical and the RI choices play a big role in getting sensible elementary operations. That still matters. But on phone again some of the usual choices don’t seem to make sense.

Even smallish details matter: for instance on larger system you might avoid a foreign key cascade delete like the plague if only because it makes small looking transactions end up being huge and huge transactions are big problem. But on a phone, do you care?

Sidebar: Huge transactions risk being disproportionally aborted because of deadlock or conflict, so you typically decompose them into a series of committable steps on the way to the overall operation. This lets you retry/restart the minimum amount to work and provide the smallest footprint with which to conflict with other operations. It also keeps the size of your Transaction Log down (if applicable). Retiring transactions fast is good for everyone.

Likewise, triggers are often avoided for similar reasons — they add arbitrary overhead to your transactions and it is often very hard to reason about exactly what will happen when triggers are combined with say FK cascade or when triggers start chain-reacting.

So these things are frequently shunned in favor of more predictable patterns. Maybe a series of well-defined stored procedures, each of which does a single elementary operation. Here each procedure is small and easy to understand, retires fast, and composes well with the others. Big helper procedures that do “kitchen sink operations” orchestrate a series of transactions to do the job, often moving between several legal but intermediate (and resumable) interim states.

In a big system you’d likely avoid doing OLAP and OLTP in the same database. Largely because the locking required to get a consistent results in the face of abundant updates is crazy. Those operations are basically not compatible. Big, filtered, aggregated queries just lock tons of stuff— even if they only eventually return a few rows. But does it matter on a phone? You have a much better idea what the total workload is. The “OLTP stuff” probably isn’t even happening while your “OLAP stuff” is going on.

So, this leaves us now wondering: how much of the typical “this is how it’s done” practices even remotely apply to a phone application with very limited concurrency and a rich mix of OLAP/OLTP by design? What if there is near zero ad-hoc SQL going on, do I need as much RI? What if the stored procedures just don’t change very much?

In that world many of the mistakes you might typically make — like trying to manually clean up an invoice that’s supposedly defunct and then typing the wrong id number into an ad-hoc SQL window (or a one-off cleanup script) are just not applicable. After all, there’s not one database to update here, there’s zillions, on zillions of phones, and nobody can run ad hoc sql against those databases without trying them on sample databases first.

Large transactions are probably still bad, and the chain reaction nature of cascade and triggers is kind of terrifying. But then client app triggers are pretty simple so maybe it all matters hella less?

There’s a lot to rethink here.

Written by

I’m a software engineer at Facebook; I specialize in software performance engineering and programming tools generally. I survived Microsoft from 1988 to 2017.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store