How I use git aliases

I talk about this often enough that I decided to lay out how I think about using git aliases in my development workflow in one consolidated (and hopefully more coherent!) place. I’m of the opinion that you can choose to either incorporate git (via the CLI or through some tool) meaningfully into your workflows, or tack it on at the end because you have to use it (“I’m required to check this in, ugh”). The natural result of that choice is that git can be either a productivity and code-quality boon or an annoying hoop to jump through.

Invest in Tooling

Adapted from a talk given at HopHacks 2022. Thesis: Investing a (small) amount of time in how you work pays dramatic dividends in terms of future effort-to-value. If you find yourself frequently thinking “Ugh, I never remember the flags I need for”, “I wish I didn’t have to do <repetitive, boring task>”, or “Why do my commits always have that ‘no newline at end of file’ warning?”, then you probably have a use case that would benefit from thinking about how you build tools for yourself.

Test Your SQL Deletes

This is a simple suggestion I’d assumed was available elsewhere on the internet, but which I couldn’t find anywhere when trying to reference the idea in conversation. While we should strive toward “individual users shouldn’t have DELETE privileges in data warehouse environments,” sometimes (I’d be genuinely curious to know the breakdown on this one) that’s not a feasible outcome.
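As a rough illustration of what testing a delete can look like (a sketch under my own assumptions, not necessarily the approach the full post takes): count the rows the predicate matches first, refuse to continue if that count is larger than expected, and roll back if the DELETE touches a different number of rows. The helper name `guarded_delete`, the `orders` table, and the `expected_max` parameter are all hypothetical, and SQLite stands in for a real warehouse.

```python
import sqlite3


def guarded_delete(conn, table, where_sql, params, expected_max):
    """Hypothetical helper: dry-run the predicate as a SELECT before deleting.

    `table` and `where_sql` are interpolated directly, so they must be
    trusted strings; only `params` is bound safely.
    """
    # Step 1: see how many rows the predicate would hit.
    to_delete = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {where_sql}", params
    ).fetchone()[0]
    if to_delete > expected_max:
        raise ValueError(
            f"predicate matches {to_delete} rows, expected at most {expected_max}"
        )

    # Step 2: delete inside a transaction (sqlite3 opens one implicitly).
    cur = conn.execute(f"DELETE FROM {table} WHERE {where_sql}", params)

    # Step 3: roll back if the delete did not match the dry run.
    if cur.rowcount != to_delete:
        conn.rollback()
        raise RuntimeError("deleted row count differs from the dry-run count")

    conn.commit()
    return cur.rowcount


# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO orders (status) VALUES (?)",
    [("open",), ("cancelled",), ("cancelled",), ("shipped",)],
)
conn.commit()

deleted = guarded_delete(conn, "orders", "status = ?", ("cancelled",), expected_max=2)
remaining = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(deleted, remaining)
```

The point of the sketch is the order of operations: the same WHERE clause runs as a SELECT before it ever runs as a DELETE, so a predicate that is broader than intended fails loudly instead of silently removing rows.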

Comparing Dates and Timestamps in SQL

We (humans) intuitively think we understand dates and times. There’s a lot out there on the subject of how this creates problems when it comes to programming (both of those are good reads, by the way). Within that space, something I’ve seen cause a decent amount of pain and frustration is the notion of date (i.e., YYYY-MM-DD) to timestamp (i.e., YYYY-MM-DD HH:MM:SS …) comparison in SQL. To sharpen my thinking on the subject, I wrote an initial version of this post, which I’ve expanded and tried to clean up a bit.
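To make the date-versus-timestamp problem concrete, here is a minimal sketch using SQLite from Python. The `events` table and its values are invented for illustration. SQLite stores these values as text and compares them as strings; engines with real DATE and TIMESTAMP types typically cast the date to midnight instead, which produces a similar surprise for anything after 00:00:00.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany(
    "INSERT INTO events (created_at) VALUES (?)",
    [
        ("2024-01-14 23:59:59",),
        ("2024-01-15 09:30:00",),
        ("2024-01-15 18:45:00",),
        ("2024-01-16 00:00:00",),
    ],
)

# Naive: "everything on 2024-01-15", comparing a timestamp column to a bare date.
naive = conn.execute(
    "SELECT COUNT(*) FROM events WHERE created_at BETWEEN ? AND ?",
    ("2024-01-15", "2024-01-15"),
).fetchone()[0]

# Half-open range: from the start of the day up to (not including) the next day.
half_open = conn.execute(
    "SELECT COUNT(*) FROM events WHERE created_at >= ? AND created_at < ?",
    ("2024-01-15", "2024-01-16"),
).fetchone()[0]

print(naive, half_open)  # prints "0 2"
```

The naive query finds nothing even though two events happened that day, because no timestamp on the 15th is less than or equal to the bare date. The half-open range says what was actually meant and does not depend on how the engine pads a date.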

Docker Cleanup

I’m using Docker (a lot) more often these days, both at home and at work, and wrote a set of short snippets to save some time fixing some painful yet simple-to-solve issues. Abandoned Containers: depending on the workflow, I find myself ending up with exited containers which aren’t ever going to be re-used. This can be a good thing - the container is an ephemeral artifact that we’re not coming back to.

Testing Your Code

I’ll own that I don’t always have great tests for my code. I just finished a hobby project where I’ve got 200 lines of code, but no tests (it’s got a README though, that’s gotta count for something, right?). “Make sure you’ve got tests for your code” is one of those maxims that we all say, and there are entire software development processes which are built on a foundation of tests. That’s great; but sometimes (and especially if you or the project are new to the context) it’s hard to live up to those standards and expectations.

Monorepos for Data Teams

Content from this discussion, adapted for a longer format. Within “it depends on your exact case”, I’m still pro mono-repo. The use cases I’ve seen where it might make sense not to be mono-repo were related to security (restricting the repo for PRs the way you need to) and function (completely disparate work). Broadly, there are 3 things I like about mono-repos: (1) workflows and tools (e.g., linters, bots, tests) get defined once; (2) you can do all of the work to deploy something in one PR; and (3) searchability: variable names and references are all in one repo. GitLab’s data team handbook is one of my favorites, but they’re not showing you their actual code, if I remember correctly.