Great article!
Very practical tips in “Lessons Learned” — each one can become an article in itself… :)
Regarding citing human error as the root cause: I would recommend investigating further than that. There is a great article on blameless postmortems at Etsy:
The fundamental difference is that we don’t stop at “human error” as the reason for why something broke. Humans don’t generally come to work to do a bad job.
…This is why we focus not on the action itself — which is most often the most prominent thing people point to as the cause — but on exploring the conditions and context that influenced decisions and actions.
So, from reading your article, I would say that the lack of simulation tests, peer review, and a migration plan were contributing factors, among others…