drud.is

Costly mistakes

The most valuable experience of side projects is the mistakes you make. You need to pick an untested solution and try to figure things out. Errors ranging from “not optimal” to “what?!” are the name of the game.

In my current side project choosing SQLite was the most costly mistake. Simon Willison’s passion and social proof from Twitter’s indie hackers convinced me.

At first, SQLite just worked. It was easy to get started: no service to manage, the database is just one file, it’s fast and has available tooling.
Soon, locks on concurrent writes became a problem. First a small problem, then a bigger one until I discovered I had to enable WAL. Then, that I couldn’t share across different machines, not even through a NFS mount. So I copied the file, which was fast until the data size reached hundreds of MiB.
I also ran into issues due to the flexible typing, like dates written in a format that were not parseable, which I found way too late.

Problems kept mounting. So it was bye SQLite, hello PostgreSQL. I also considered MySQL, but PostgreSQL has more momentum and it can act as a vector database, which I’m not using at the moment but likely will for embeddings/RAG.

The migration was very painful. pgloader was useful, but only to an extent. SQLite’s convenient “INSERT OR IGNORE” is much more complicated to achieve, I hadn’t realized how often my referential integrity was compromised and also had incorrect values as a result of SQLite’s approach to typing. I had to rewrite most inserts, some queries (IFNULL), fixing column types (SQLite doesn’t have boolean nor datetime), and inconsistent values. I’m glad that my data is now much better, but boy was it time consuming.

What I love:

What I don’t:

Was the migration worth it? Absolutely! the previous setup was slowing me down terribly. Was it fun? No. Yet, what a great learning. To the next!

Learning: the hidden cost of seemingly simple solutions can far outweigh their initial convenience.