this post was submitted on 05 Aug 2025

A software engineer has warned against trusting cloud data storage services in a painstakingly detailed blog post describing their own “complete digital annihilation” at the hands of AWS admins. Developer Abdelkader Boudih, pen name Seuros, says they had been a fee-paying AWS subscriber for a decade, with the cloud service becoming a firm part of their workflow. Suffice it to say, the developer’s long-standing relationship with AWS has now ended acrimoniously.

[–] tyler@programming.dev 2 points 20 hours ago

I meant to respond to this yesterday. We didn't lose the Lambda code; we lost Lambda versions, which are immutable snapshots of your function. There is no way to restore these once they're deleted (hence immutable).

We had every Lambda version's code tagged in GitHub as a release, and while we could have redeployed them, it would have taken just as long if not longer because of how long our deployments for the Lambdas in question were (20 minutes to 1.5 hours depending on the Lambda).

There were a lot of suboptimal things that happened to make it a shitshow, but essentially:

  • we should have been using function aliases from the beginning. Our versions were referenced directly, so redeploying would have required massive db changes downstream to point everything at the right Lambda versions.
  • AWS should have specified that what we were doing was not what they intended (they've updated the docs now, but at the time their docs literally just said that if you want immutable functions you can use function versions!).
  • we should have saved off the function.zips before deploying (we didn't think that was necessary because we had all the code and the artifact was the least important part of our deploy).
  • we should have had our own AWS account rather than using the company's 'shared' account, which was how everything was done at the time.
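To make the first bullet concrete, here's a minimal sketch of why an alias helps when a version gets deleted and republished. This is a toy in-memory model, not the real Lambda API: `FunctionRegistry`, the `pricing-rules` and `rules-2024-01` names, and the account ID are all made up for illustration, and the model assumes a republished function gets a new version number rather than reusing the old one. The only AWS-specific detail is the qualified ARN shape, where the trailing qualifier can be either a version number or an alias name.

```python
from dataclasses import dataclass, field


def qualified_arn(region: str, account_id: str, function_name: str, qualifier: str) -> str:
    """Build a qualified Lambda ARN; the qualifier is a version number or an alias name."""
    return f"arn:aws:lambda:{region}:{account_id}:function:{function_name}:{qualifier}"


@dataclass
class FunctionRegistry:
    """Toy stand-in for one function's versions and aliases (hypothetical, not boto3)."""
    next_version: int = 1
    versions: dict = field(default_factory=dict)  # version number -> code tag
    aliases: dict = field(default_factory=dict)   # alias name -> version number

    def publish(self, code_tag: str) -> str:
        # Assumption: numbers only ever go up, even after a delete.
        version = str(self.next_version)
        self.versions[version] = code_tag
        self.next_version += 1
        return version

    def point_alias(self, alias: str, version: str) -> None:
        if version not in self.versions:
            raise KeyError(f"no such version: {version}")
        self.aliases[alias] = version

    def delete_version(self, version: str) -> None:
        del self.versions[version]

    def resolve(self, qualifier: str) -> str:
        # An alias is looked up first; anything else is treated as a version number.
        version = self.aliases.get(qualifier, qualifier)
        return self.versions[version]


def demo() -> tuple:
    registry = FunctionRegistry()
    old = registry.publish("release-rules-2024-01")
    registry.point_alias("rules-2024-01", old)

    registry.delete_version(old)                      # the cleanup script strikes
    new = registry.publish("release-rules-2024-01")   # redeploy from the git tag

    try:
        registry.resolve(old)                         # a stored direct reference
        direct_ok = True
    except KeyError:
        direct_ok = False

    registry.point_alias("rules-2024-01", new)        # one repoint, no db changes
    return old, new, direct_ok, registry.resolve("rules-2024-01")
```

Anything downstream that stored the alias name keeps working after a single repoint, while anything that stored the raw version number has to be rewritten, which is the db migration we were staring at.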

This all resulted in a dumb devops dude getting a ticket to clean up our dev account because it was running out of Lambda storage space. He cleaned up the dev account with a script that was built to be run only against dev. Then he decided that, even though the ticket said to clean up just dev, he would take a look at prod and clean that one up too.

Thus managing to take down the entire company's sales infrastructure.
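For what it's worth, a guard like the following is roughly what that cleanup script was missing. It's a hedged sketch, not the script that was actually run: `plan_cleanup`, its parameters, and the account IDs are hypothetical, and it only plans deletions from plain lists rather than calling AWS. Note that the alias protection alone would not have saved us, since our versions were referenced directly; the account allow-list is the part that matters here.

```python
def plan_cleanup(account_id, allowed_account_ids, versions, alias_targets, keep_latest=3):
    """Return the version numbers that are safe to delete, oldest first.

    account_id          -- account the script is about to run against
    allowed_account_ids -- accounts the script is permitted to touch (dev only)
    versions            -- version identifiers for one function, may include "$LATEST"
    alias_targets       -- version numbers some alias currently points at
    keep_latest         -- how many of the newest numbered versions to always keep
    """
    # Refuse outright if this is not an account the script was built for.
    if account_id not in allowed_account_ids:
        raise PermissionError(f"cleanup is not allowed in account {account_id}")

    # "$LATEST" is not a published version and is never a deletion candidate.
    numbered = sorted((v for v in versions if v != "$LATEST"), key=int)

    # Protect the newest N versions plus anything an alias points at.
    newest = numbered[max(0, len(numbered) - keep_latest):]
    protected = set(newest) | set(alias_targets)

    return [v for v in numbered if v not in protected]


# Hypothetical account IDs for illustration only.
DEV_ACCOUNT = "111111111111"
PROD_ACCOUNT = "222222222222"

to_delete = plan_cleanup(
    DEV_ACCOUNT, {DEV_ACCOUNT}, ["$LATEST", "1", "2", "3", "4", "5"], {"2"}, keep_latest=2
)
```

With those inputs the plan covers versions 1 and 3 only, and pointing the same call at `PROD_ACCOUNT` raises `PermissionError` before anything is listed for deletion.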

The shared AWS account and the devops script to clean up Lambdas were built before I started at that company, but the rest of the code/architecture was mine and one other person's design. It worked really really well for what it was built for (immutable rules for specific points in time), but there were a lot of shortcomings and things we missed that resulted in everything going badly that month.