Test Your SQL Deletes
This is simple suggestion I’d assumed was available elsewhere on the internet, but which I couldn’t find anywhere when trying to reference this idea in conversation. While we should strive toward “individual users shouldn’t have DELETE
privileges in data warehouse environments,” sometimes (I’d be genuinely curious to know the breakdown on this one) that’s not a feasible outcome.
When you need to delete data from a SQL database, you write something like:
DELETE
FROM table
WHERE ... -- some subset condition which matches and is important to get correct
Before you run that query, strongly consider running the query as a SELECT
first. It’s a great way to see what you’re about to delete, before you do it. Then when you realize you meant >
and not <
, you’re only fixing code and not scrambling to fix the data.
SELECT *
-- DELETE -- +1/-1 line edit gets you a free test
FROM table
WHERE ... -- you wrote that subset condition correctly, right?
Caveats:
- No, I didn’t just tell you to delete your data. Use at your own risk.
- …you have (and test) backups, right?
- Transactions solve for some of this, provided you use them correctly
- OLAP databases make transactions function differently than OLTP databases
- That time you think you’re in a transaction you can
ROLLBACK
to fix a mistake and you’re not is never fun
- Modern analytic database features like time travel make fixing these kinds of mistakes easier. There’s still value in this strategy:
- Lessens reliance on backup / system-level reverts
- Builds discipline for that one time TM that it’s useful
- Costs practically nothing (the compute for the
SELECT
is relatively cheap, both compared to theDELETE
you’re about to run, as well as the time you’d spend fixing a mistake)