What is a difference between production and stage/dev/sandbox relational databases? No difference actually.
But … there is one detail, this is a SQL traffic … heavy SQL traffic, every second we have more than 1000 TPS (Transaction Per Second). That’s good for business basically, but nightmare for releases.
Just try to apply ALTER TABLE … ADD COLUMN … on relational database with heavy SQL traffic. There are 2 ways in this case.
- your change will be cancelled by timeout (because other parallel R/W sessions heavy use this table)
- your change will be applied but during alteration you will get a lot of exceptions on backends and frontends. Because this table is locked during this structure change.
Of course, you can say lets use controlled DB switchover and that’s true.
Matvey selected another approach. He gathers statistics about table usage based on metadata PostrgeSQL layer (statistics include counters for “select” / “insert” / ” delete” / “update” operations) and making prediction when the particular table will be have a minimal usage by SQL traffic.
The common architecture of solution is presented below.
… and please take a look on prediction visualisation. That’s really good!
There is a metric of accuracy of prediction for the future and it’s about 87%.