this post was submitted on 28 Oct 2025

420 points (99.1% liked)

‘There isn’t really another choice:’ Signal chief explains why the encrypted messenger relies on AWS (www.theverge.com)

submitted 1 week ago by schizoidman@lemmy.zip to c/technology@lemmy.world


cross-posted from: https://lemmy.zip/post/51866711

Signal was just one of many services brought down by the AWS outage.


[–] ICastFist@programming.dev 7 points 6 days ago

Tangent: Jami is p2p, so the only risk of going offline is if everyone in the group goes offline. It does lack several quality-of-life features, though.

[–] Tiger_Man_@szmer.info -4 points 6 days ago (2 children)

There always is another and better choice and it's called using your own fucking servers

[–] PrettyFlyForAFatGuy@feddit.uk 12 points 6 days ago* (last edited 6 days ago)

There was a big exodus to Signal a couple of years ago when Meta were fucking with their WhatsApp privacy policy, similar to the exodus from Reddit to Lemmy.

Having your infrastructure on a cloud provider lets you keep your costs in line with your current number of users. If you have a big influx, you can immediately scale up to accommodate them, and when that spike dies off, as spikes invariably do, you can scale back down instead of being left with a load of hardware you bought for new users (who have since fucked off) and now aren't using.

[–] MangoCats@feddit.it 3 points 6 days ago

using your own fucking servers

And/or peer to peer mesh. Personally, I WANT a system that has peak performance AND multiple fallbacks to prevent blackout single point of failure situations.

[–] blakemiller@lemmy.world 220 points 1 week ago (8 children)

Her real comment was that there are only 3 major cloud providers they can consider: AWS, GCP, and Azure. They chose AWS and AWS only. So there are a few options for them going forward: 1) keep doing what they’re doing and hope a single cloud provider can improve reliability, 2) move to a multi-cloud architecture, since more than one major provider going down simultaneously is much rarer than a single outage, or 3) build their own datacenters or use colos, which have a learning curve yet are still viable alternatives. Those that are serious about software own their own hardware, after all.

Each choice has its strengths and drawbacks. The economics are tough with any choice. Comes down to priorities, ability to differentiate, and value in differentiation :)
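
To put rough numbers on option 2, here is a minimal sketch of the availability arithmetic. It assumes provider outages are statistically independent and uses a made-up 99.9% availability figure for each provider; neither assumption comes from Signal or from any provider's SLA, and real outages that share DNS or other dependencies would make the combined figure worse.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(availabilities):
    """Availability of a service that is up if at least one provider is up.

    Assumes outages are independent, which correlated failures
    (shared DNS, shared upstream dependencies) can violate.
    """
    p_all_down = 1.0
    for a in availabilities:
        p_all_down *= 1.0 - a
    return 1.0 - p_all_down


def expected_downtime_hours(availability):
    """Expected hours of downtime per year at a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


# 0.999 is an illustrative figure, not a measured or promised one.
single = combined_availability([0.999])
dual = combined_availability([0.999, 0.999])

print(f"single provider: {expected_downtime_hours(single):.2f} h/yr")
print(f"two providers:   {expected_downtime_hours(dual):.4f} h/yr")
```

The gain only materializes if the application can actually run on either provider at any moment, which is where the engineering cost of option 2 sits.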

[–] axx@slrpnk.net 54 points 1 week ago

I'm sorry, what, a balanced and informed answer? Surely you must be joking!

[–] blah3166@piefed.social 27 points 1 week ago

Meredith mentioned in a reply to her posts that they do leverage multi-cloud and were able to fall back onto GCP (Google Cloud Platform), which enabled Signal to recover quicker than just waiting on AWS. I'd link to source but on phone, it's somewhere in this thread: https://mastodon.world/@Mer__edith/115445701583902092
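
For illustration only, falling back from one provider to another can be sketched as an ordered list of endpoints tried in turn. The function names and the simulated failure below are hypothetical; the thread says only that Signal fell back onto GCP, not how that was implemented.

```python
from typing import Callable, Sequence, Tuple


class AllProvidersFailed(Exception):
    """Raised when every configured provider rejected the request."""


def send_with_failover(
    payload: bytes,
    providers: Sequence[Tuple[str, Callable[[bytes], str]]],
) -> str:
    """Try each provider in order and return the first successful response."""
    errors = []
    for name, send in providers:
        try:
            return send(payload)
        except ConnectionError as exc:
            # Record the failure and move on to the next provider.
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for real network calls.
def aws_endpoint(payload: bytes) -> str:
    raise ConnectionError("us-east-1 unreachable")


def gcp_endpoint(payload: bytes) -> str:
    return f"accepted {len(payload)} bytes via gcp"


result = send_with_failover(b"hello", [("aws", aws_endpoint), ("gcp", gcp_endpoint)])
print(result)
```

The hard part in practice is not this loop but keeping state (databases, queues, DNS) usable from the second provider, as other comments in the thread point out.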


[–] magguzu@midwest.social 99 points 1 week ago* (last edited 1 week ago) (4 children)

So much talking out of ass in these comments.

Federation/decentralization is great. It's why we're here on Lemmy.

It also means you expect everyone involved, people you've never met or vetted, to be competent and be able to shell out the cash and time to commit to a certain level of uptime. That's unacceptable for a high SLA product like Signal. Hell midwest.social, the Lemmy instance I'm on, is very often quite slow. I and others put up with it because we know it's run by one person on one server that he's presumably paying for himself. But that doesn't reflect Lemmy as a whole.

AWS isn't just a bunch of servers. They have dedicated services for database clusters, cache stores, data warehouses, load balancing, container clusters, Kubernetes clusters, CDN, and web application firewalls, to name just a few. Every region has multiple datacenters, the largest of which, by far, are Northern Virginia's. By default most people use one DC, but multi-region, while a huge and expensive lift, is something they already have tools to assist with. Also, and maybe most importantly, AWS, Azure and GCP run their own backbones between their datacenters rather than relying on the shared one that you, me, and most smaller DCs are using.

I'm a DevOps Engineer but I'm no big tech fan. I run my own hobby server too. Amazon is an evil company. But the claim that "multi cloud is easy, smaller CSPs are just as good" is naive at best.

Ideally some legislation comes in and forces these companies to simplify the process of adopting multi-cloud. Right now you have to build it all yourself, and it is still very imperfect once you factor in things like databases and DNS, which is exactly what they lean on for vendor lock-in.

[–] Dragonstaff@leminal.space 5 points 6 days ago (1 children)

AWS needs to be broken up way more than Ma Bell ever did. We need to have open protocols developed so that there can be actual competition.

[–] jfrnz@lemmy.world 4 points 6 days ago (2 children)

There is actual competition though, from Google and Microsoft at a minimum.

[–] Dragonstaff@leminal.space 6 points 6 days ago (1 children)

3-5 companies in a sector is an oligopoly, which acts nearly the same as a monopoly. This is not "actual competition".

All of these companies cornered their own markets, and now they own the backbone of the internet.

If we broke up all of them and required open standards and interoperability then other companies could innovate.

[–] jfrnz@lemmy.world 0 points 5 days ago (1 children)

I’m not saying it’s good, but it’s not Ma Bell.

[–] Dragonstaff@leminal.space 1 points 5 days ago

How much of the economy in the 60s was telecommunications vs how much of the economy today relies on the internet?

[–] AwesomeLowlander@sh.itjust.works 3 points 6 days ago

3 companies is not competition, 3 companies is collusion.

[–] shalafi@lemmy.world 19 points 1 week ago

Can't find a screenshot, but when you're logged in and click for the screen to show all AWS products, holy shit. AWS is far more than most people think.

[–] douglasg14b@lemmy.world 18 points 1 week ago

Not to mention the fact that the vast majority of federated services have extremely unsustainable performance characteristics that make them effectively impossible to scale beyond hobby projects.


[–] qwerty@discuss.tchncs.de 29 points 1 week ago (6 children)

Session is a decentralized alternative to Signal. It doesn't require a phone number and all traffic is routed through a Tor-like onion network. Relays are run by the community and relay operators are rewarded with some crypto token for their troubles. To prevent bad actors from attacking the network, you have to stake some of those tokens before you can run a relay, and if your node misbehaves they will get slashed.

[–] tengkuizdihar@programming.dev 68 points 1 week ago (5 children)

Shame their entire node system relies on cryptobro tech.

Tor doesn't need a currency to back it up. I2P doesn't need a currency to back it up. Why the hell does Lokinet?

[–] qwerty@discuss.tchncs.de 20 points 1 week ago (18 children)

Tor relays only relay the traffic; they don't store anything (other than HSDirs, but that's minuscule). Session relays have to store all the messages, pictures, and files until the user comes online and retrieves them. Obviously all that data would be too much to store on every single node, so instead it is spread across only 5-7 nodes at a time. If all of those nodes were to go offline at the same time, messages would be lost, so there has to be some mechanism that discourages taking nodes offline without giving the network a notice period. Without the staking mechanism, an attacker could spin up a bunch of nodes and then take them all down relatively cheaply, leaving users' messages undelivered. It also incentivizes honest operators to ensure their node's reliability and rewards them for it, which, even if you run your node purely for altruistic reasons, is always a nice bonus. So I don't really see any downside to it, especially since the end user doesn't need to interact with it at all.
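
As a rough illustration of the "spread across only 5-7 nodes" idea, the sketch below picks a fixed-size replica set per recipient with rendezvous hashing and checks whether a message survives a set of nodes going offline. This is an assumed scheme for illustration, not Session's actual swarm assignment; the node names and `pick_replicas` are hypothetical.

```python
import hashlib
from typing import Iterable, List, Set


def pick_replicas(recipient_id: str, nodes: List[str], k: int = 5) -> List[str]:
    """Deterministically choose k storage nodes for a recipient.

    Uses rendezvous (highest-random-weight) hashing: every node gets a
    score for this recipient and the k lowest-scoring nodes are chosen.
    """
    def score(node: str) -> bytes:
        return hashlib.sha256(f"{node}:{recipient_id}".encode()).digest()

    return sorted(nodes, key=score)[:k]


def message_retrievable(replicas: Iterable[str], online: Set[str]) -> bool:
    """A stored message survives as long as one replica is still online."""
    return any(node in online for node in replicas)


nodes = [f"node{i}" for i in range(20)]
replicas = pick_replicas("alice", nodes)

# Losing every replica at once is the failure case staking is meant to deter.
everyone_else = set(nodes) - set(replicas)
print(message_retrievable(replicas, {replicas[0]}))
print(message_retrievable(replicas, everyone_else))
```

With only a handful of replicas per recipient, an attacker who controls all of them can drop messages, which is why making node churn expensive matters in this design.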


[–] e8d79@discuss.tchncs.de 30 points 1 week ago

I would not recommend it. Session is a Signal fork that deliberately removes forward secrecy from the protocol and uses weaker keys. The removal of forward secrecy means that if your private key is ever exposed, all your past messages could be decrypted.
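
For readers unfamiliar with forward secrecy, here is a heavily simplified sketch of a symmetric key ratchet. It is neither Signal's full Double Ratchet nor Session's protocol; the labels `b"message"` and `b"chain"` and the placeholder secret are assumptions made for illustration. Each message gets its own key, and the chain key is replaced by a one-way derivation, so stealing the current chain key does not let an attacker compute earlier message keys.

```python
import hashlib
import hmac
from typing import Tuple


def ratchet(chain_key: bytes) -> Tuple[bytes, bytes]:
    """Derive a one-time message key and the next chain key.

    HMAC-SHA256 is one-way, so the next chain key cannot be used to
    recover this chain key or this message key.
    """
    message_key = hmac.new(chain_key, b"message", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"chain", hashlib.sha256).digest()
    return message_key, next_chain_key


# Placeholder for a secret that a real protocol gets from key agreement.
chain = hashlib.sha256(b"placeholder shared secret").digest()

message_keys = []
for _ in range(3):
    key, chain = ratchet(chain)  # the old chain key is discarded here
    message_keys.append(key)

print(len(set(message_keys)))  # number of distinct per-message keys
```

A protocol that instead encrypts every message under one long-term key has no such property, which is the concern raised above.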

[–] arcterus@piefed.blahaj.zone 22 points 1 week ago (2 children)

The main issue with Session is they removed PFS when they redesigned everything. Also, it's admittedly been years since I tried it, but I remember the app being noticeably buggy.


[–] axum@lemmy.blahaj.zone 20 points 1 week ago* (last edited 1 week ago)

SimpleX literally solves the messaging problem. You can bounce through their default relay nodes or run your own to use exclusively or add to the mix. It's all very transparent to end users.

At most, an AWS outage would only have affected chats relayed through servers hosted on AWS.

SimpleX also doesn't require a fukkin phone number.


[–] net00@lemmy.today 13 points 1 week ago (8 children)

Didn't only 1 AWS region go down? Maybe before even thinking about anything else they should focus on redundancy within AWS.

[–] shalafi@lemmy.world 15 points 1 week ago* (last edited 1 week ago) (2 children)

us-east-1 went down. Problem is that IAM services all run through that DC. Any code relying on an IAM role would not be able to authenticate. Think of it as a username in a Windows domain. IAM encompasses all that you are allowed to view, change, launch, etc.

I hardly touched AWS at my last job, but listening to my teammates and seeing their code led me to believe IAM is used everywhere.


[–] majster@lemmy.zip 12 points 1 week ago (1 children)

They are serving 1-on-1 chats and group chats. That practically partitions itself. There are many server lease options all over the world. My assumption is that they use some AWS service and now can't migrate off. But you need an on-call team anyway, so you aren't buying that much convenience.

[–] boonhet@sopuli.xyz 18 points 1 week ago (5 children)

There are many server lease options all over the world

It increases complexity a lot to go with a bunch of separate server leases. There's a reason global companies use hyperscalers instead of getting VPSes in 30 or 40 different countries.

I hate the centralization as much as everyone else, but for some things it's just not feasible to go on-prem. I do know an exception. Used to work at a company with a pretty large and widely spread out customer base (big corps on multiple continents) that had its own k8s cluster in a super secure colocation space. But our backend was always slow to some degree (in multiple cases I optimized multi-second API endpoints into 10-200ms), we used asynchronous processing for the truly slow things instead of letting the user wait for a multi-minute API request, and it just wasn't the sort of application that you need to be super fast anyway, so the extra milliseconds of latency didn't matter that much, whether it was 50 or 500.

But with a chat app, users want it to be fast. They expect their messages to be sent as soon as they hit the send button. It might take longer to actually reach the other people in the conversation, but it needs to be fast enough that if the user hits send and then immediately closes the app, it's sent already. Otherwise it's bad UX.
