These days I’m spending a lot of my time making sure the PeakZebra product works.
But it’s clear to me that quite possibly the biggest challenge ahead is dealing with the dead certainty that there will be lots of customizations made to client instances of PeakZebra. In a way, that’s PeakZebra’s secret weapon: you get exactly what you need for your app.
I know how to manage this in a general, not-all-that-optimized way, but I’ve started thinking about how to scale it so that I can handle thousands of clients with lots of different deployments.
Deployment?
To start with, even what counts as a deployment needs tighter definition. There’s a lot of code in PeakZebra, and that code is tracked using git and GitHub. There needs to be a good way to deal with having different versions of the actual code across different deployments, because that’s definitely going to happen. It’s going to happen lots, I presume. So it isn’t as simple as having a single version of the code that every deployment runs.
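To make that a little more concrete, here’s roughly the sort of thing I imagine for pinning each deployment to its own version of the code. This is only a sketch: the pins.json file, the client names, and the directory layout are all invented for illustration, not anything PeakZebra actually does yet.

```python
# Hypothetical sketch: pin each client deployment to a specific git ref.
# The pins.json layout, client names, and paths are made up for illustration.
import json
import subprocess

def checkout_pinned_ref(repo_dir: str, ref: str) -> None:
    """Fetch and check out the ref a given deployment is pinned to."""
    subprocess.run(["git", "-C", repo_dir, "fetch", "--all"], check=True)
    subprocess.run(["git", "-C", repo_dir, "checkout", ref], check=True)

if __name__ == "__main__":
    # pins.json might look like:
    # {"acme-co": "v1.4.2", "widgets-inc": "clients/widgets-inc"}
    with open("pins.json") as f:
        pins = json.load(f)
    for client, ref in pins.items():
        checkout_pinned_ref(f"deployments/{client}/plugin", ref)
```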
Second, a production WordPress site is a big amalgamation of things that profoundly impact the site’s look and behavior, but that aren’t necessarily code that would be checked into a repo somewhere. You can add custom post types using a plugin, for example, and it’s genuinely important, should you need to recreate the site, that those custom types be captured somewhere.
Also, many changes that are made to a WordPress-based app result not in changes to code, but in changes to the database. We don’t really want to check in a new copy of the database every time anything at all gets tweaked, do we?
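The direction I’m leaning is to treat that database-resident configuration as something that gets exported into small, diff-able files, and to commit those instead of the database itself. Here’s a rough sketch using WP-CLI; the tracked option names and output paths are placeholders I’ve made up, not a finished list.

```python
# Sketch: capture the non-code state of a site as small, diff-able JSON files
# instead of committing a whole database dump. Relies on WP-CLI's
# `post-type list` and `option get` commands; option names and paths are examples.
import json
import subprocess
from pathlib import Path

TRACKED_OPTIONS = ["blogname", "permalink_structure", "active_plugins"]  # example keys

def wp(args, site_path):
    """Run a WP-CLI command against a given WordPress install and return stdout."""
    result = subprocess.run(
        ["wp", f"--path={site_path}", *args],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

def snapshot_site_state(site_path, out_dir):
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Registered post types, including ones added by plugins.
    post_types = wp(["post-type", "list", "--format=json"], site_path)
    (out / "post_types.json").write_text(post_types)
    # A whitelist of options worth tracking, rather than the whole options table.
    options = {
        name: json.loads(wp(["option", "get", name, "--format=json"], site_path))
        for name in TRACKED_OPTIONS
    }
    (out / "options.json").write_text(json.dumps(options, indent=2, sort_keys=True))

if __name__ == "__main__":
    snapshot_site_state("/var/www/client-site", "state-snapshots/client-site")
```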
Tracking the combinatorial explosion
But what’s the best way to track all this amalgamated stuff?
One could, in theory, just keep a full backup of each site (which is, not surprisingly, what most developers and agencies do). That doesn’t really answer the needs of the “backside” of all these deployments: the ongoing work of maintaining and updating them.
Whatever the system is, it needs to be quick and efficient for a developer to make changes to each system. Yes, there’s a more or less common codebase. But in order to make changes, the developer needs to know what other customizations have been made to this deployment and how to avoid inadvertently breaking prior modifications.
And there’s another twist. Some of the changes will make sense to propagate back to the “original” code base. But how will the developer know they aren’t going to break other sites with other modifications when the next update to the original code rolls out to all the other deployments?
Non-computable?
There’s almost certainly a “computer science” way to prove this is actually an unsolvable problem. But before we panic, let’s note that most changes, most of the time, won’t matter to other sites and aren’t likely to break much of anything else. Not in the WordPress world, where everything already runs on the assumption that each site is cobbled together from a combination of parts on top of the core code.
Basically all I’m aiming to do in this particular blog entry is lay out the problem set, but let me say a few things about how I’m viewing a solution.
First, I think there probably needs to be a scrubbed-data version of each deployment stored on local servers in a “ready to use” status. Imagine hundreds of local deployments, each with a copy of the code used in the relevant PeakZebra plugins.
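To give a sense of what “scrubbed” might mean in practice, here’s the kind of anonymization pass I have in mind, run with WP-CLI against the local copy. The table prefix and the specific scrubbing rules are assumptions for illustration, not a finished policy.

```python
# Sketch of the "scrubbed data" step: anonymize personal data in a local copy
# before it sits around as a ready-to-use deployment. Table names assume the
# default wp_ prefix; the scrubbing rules below are illustrative only.
import subprocess

SCRUB_QUERIES = [
    # Replace real email addresses with per-ID placeholders.
    "UPDATE wp_users SET user_email = CONCAT('user', ID, '@example.test');",
    # Replace display names with generic ones.
    "UPDATE wp_users SET display_name = CONCAT('User ', ID);",
    # Drop commenter email addresses entirely.
    "UPDATE wp_comments SET comment_author_email = '';",
]

def scrub_local_copy(site_path: str) -> None:
    """Run the anonymization queries against a local WordPress install via WP-CLI."""
    for query in SCRUB_QUERIES:
        subprocess.run(["wp", f"--path={site_path}", "db", "query", query], check=True)

if __name__ == "__main__":
    scrub_local_copy("/srv/local-deployments/client-site")
```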
Given that WordPress stores revision histories for posts, that capability, combined with regular backups and git repo storage for all the actual code, could potentially solve the problem of tracking thousands of revisions across thousands of sites.
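One way to tie those pieces together would be a small per-deployment manifest recording which code revision, state snapshot, and backup a given site corresponds to. Again, this is a sketch with invented field names, not a final schema.

```python
# Sketch of a per-deployment manifest: one small record tying together the
# code revision, the exported non-code state, and the most recent backup.
# Field names and example values are invented for illustration.
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass
class DeploymentManifest:
    client: str          # which client this deployment belongs to
    code_ref: str        # git commit or tag the site is running
    state_snapshot: str  # path/ID of the exported post types + options JSON
    backup_id: str       # identifier of the most recent full backup
    recorded_at: str     # when this manifest entry was written

def record_manifest(client: str, code_ref: str, state_snapshot: str, backup_id: str) -> str:
    manifest = DeploymentManifest(
        client=client,
        code_ref=code_ref,
        state_snapshot=state_snapshot,
        backup_id=backup_id,
        recorded_at=datetime.now(timezone.utc).isoformat(),
    )
    return json.dumps(asdict(manifest), indent=2)

if __name__ == "__main__":
    print(record_manifest("client-site", "a1b2c3d", "state-snapshots/client-site", "backup-2024-001"))
```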
What it doesn’t do is help the developer not shoot themselves in the foot while updating a particular site. More about this in future posts.