Reddit is experiencing a partial outage (redditstatus.com)
93 points by gundmc on Feb 21, 2023 | 74 comments



For me it looks like the "old" reddit design is especially affected by the outage. "new.reddit.com" works fine for me.


I tried to use new.reddit because it is the only thing working. Why is it that when I'm reading comments, 50% of my 1440p screen is just black background on the left and right? You can kind of fix the front page by putting it into "Compact" view, but the comment section has nothing akin to that.

This is such a usability disaster relative to old.reddit, and it hasn't been fixed for years. Who is this for? I still remember when Reddit had one of the best mobile websites, before they had an official app and then mysteriously the mobile site started to lose previously working functionality.


At a strategic level, surely the website is bad in order to drive people to the mobile apps. Simple enough.

But I wonder how this result was actually accomplished - are there Reddit developers deliberately adding JS bloat so scrolling is choppy? Are there KPIs that pages must not load in under N seconds or videos must not play successfully more than N% of the time? It's bad enough that it can't be an accident. I'd love to interview one of the devs of the new UI.


Yeah ... I mean, so-called dark patterns are so common nowadays that it would not surprise me at all if they made the website shitty on purpose.

It is like logging in to Gmail on a new computer. It has some 2FA that begs for my phone number, and the other options just don't work half the time.


I'm fairly sure this is some "design paradigm" or something. I've complained about it for years but I've never heard it described by anyone as an "issue" that's going to be fixed, as much as "a design some old people don't like".


I think the paradigm is called “use a ton of javascript to do something really simple”


Another "single page app" design that is slower than the thing it replaced.


I didn't even think of using new.reddit.com when old had trouble! I used the API and that worked fine. I'll have to keep the new.reddit.com trick handy in case it is needed sometime.


It's for kids who think their phone is their only computer and are just trying to see memes during class.


Can't wait for this to be one of the reasons they hard-deprecate old reddit :(


It will suck when they do that, but at least I’ll get whatever time I spend on Reddit back to funnel somewhere else.


As long as the API exists, so can your reddit consumption.


Is there a way to use this on desktop? I’m not sure I care enough about Reddit to worry too much.

Tbh I look forward to the day I can cut that time sink out. I just can’t peel myself away entirely.


Stellar and Comet are popular clients for macOS. They're both pretty nice, from what I remember.

Legere for Reddit has been recommended by my colleagues that use Windows but I can't vouch for it personally.


If you want to read - sure. Here's an example.

https://api.reddit.com/r/programming
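
A minimal Python sketch of reading that listing, assuming the endpoint returns Reddit's usual listing JSON (a `data.children` array whose items each carry a `data` object with `title` and `permalink`). The function names and the User-Agent string are made up for illustration, and the request may be rate-limited or rejected.

```python
import json
import urllib.request

# Hypothetical User-Agent string; generic ones are often throttled.
USER_AGENT = "hn-thread-example/0.1"


def fetch_listing(url="https://api.reddit.com/r/programming"):
    """Fetch a listing and return the decoded JSON (performs a network request)."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def extract_posts(listing):
    """Return (title, permalink) pairs from a listing-shaped dict."""
    children = listing.get("data", {}).get("children", [])
    return [
        (child["data"].get("title", ""), child["data"].get("permalink", ""))
        for child in children
        if "data" in child
    ]


# Usage (needs network access):
#   for title, permalink in extract_posts(fetch_listing()):
#       print(title, "https://www.reddit.com" + permalink)
```

Keeping the parsing separate from the fetch means the parsing can be exercised on a saved response when the site itself is down.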


I just use some home cooked python scripts on desktop, but I think there is a gnus.el client somewhere too. On my phone, Red Reader is great. It's on F-droid.org.


I believe they are planning API changes though, with restrictions that would leave sane alternative apps like Apollo barely functional, if at all.


But probably not.


I’m giving up Reddit for Lent, and would be somewhat pleasantly surprised if they did this while I’m gone. It would make just never, ever using it again much easier.


Maybe the cause and effect is reversed. We're seeing errors because the old reddit is deprecated.


Tested, old yields 503, new shows the normal content. I need a tin foil hat.


There are very understandable reasons this could be the case, like the "old.reddit" code base just being left untouched while an important API changed.


Which is amusing because the "new" reddit has consistently been loading 10x slower than old over the past few weeks.


I didn't find this to be the case, took a few tries on both, but at least the old one provided an error message


Let's be honest here...Reddit is _always_ experiencing a partial outage...


Well, the outage was in a larger part than usual.


"just roll it back" /s

Anybody want to guess root cause?

Do we have a "root cause" bingo card?

DNS

Database

What else is super likely?


What an odd comment. As a software engineer, my professional guess is the computers aren’t working the way they should.


I used to work for a very high-level director who was promoted many, many times (probably VP level+, easily $300k/yr total comp, probably 80 indirect reports overall in the org, probably 10-20 years of experience) whose entire incident-handling philosophy was "how quickly can we roll it back / why hasn't it been rolled back yet / have you tried rolling it back yet".


It's weird for it to be their _entire_ playbook, but most outages that I've exacerbated happened because I tried to fix things in a panic instead of just rolling back and then taking stock.

I often have to work hard to convince people of all experience levels that it's the best way forward.

- "It's just a little bug I can just fix it [and definitely won't make it worse with code that I haven't tested as rigorously right?]"

- "My KPI/bonus/project plan relies on this going out today"

- "My code is fine it's the infrastructure [that I didn't warn] that can't handle it. They need to fix their side now."

I don't know about your VP, but "how fast can we get back to before it was broken?" is reasonably the first thing you should be asking.


Incident response should always be: (1) get people enacting the final disaster recovery plan and rollback whilst we (2) see if we can recover from where we stand.

Doing #1 puts some serious boundaries on how bad it can get


I find it's usually the same people or teams responsible for both, so it's hard to do them in parallel.


This probably really depends on the type of business you have. I work for a CDN; our outages are usually caused by one of our network peers/providers borking things. There is nothing to roll back.


For sure, and you're not going to be able to roll back a failed power supply. I'm just saying it's a totally reasonable first and maybe even second question


"It works on my career"


“And that’s how docker was born”


If you're a director with 10-20 years experience and you're making $300k/year, you're REALLY doing it wrong.


"Let me run the reverse reverse migration script again"


Woe be upon thine fools that change code and database schema simultaneously.


You missed the other two common ones: permissions change and a disk filled up somewhere.

Before finding out the dead simple failure mode and fix, engineers need to spend countless hours diving into the most technically complex scenarios that might be happening but are irrelevant. Then they can reset permissions or add disk space or add back a DNS entry.


Reminds me of the old sysadmin who always made a file 10% the size of the disk named .root-emergency or similar. Disk filled up? Delete the file, get some breathing time, fix the problem, recreate the file.
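
A rough Python sketch of that ballast-file trick, under stated assumptions: the helper names `create_ballast` and `release_ballast` are hypothetical, and the file is filled with real zero bytes rather than created sparse so that the space is actually allocated. On filesystems with transparent compression, zeros may not reserve real space, so treat this as an illustration only.

```python
import os
import shutil

CHUNK = 1024 * 1024  # write in 1 MiB chunks to keep memory use small


def create_ballast(path, size_bytes=None, fraction=0.10):
    """Write a file of zero bytes and return its size.

    If size_bytes is None, use `fraction` of the total size of the
    filesystem that holds `path` (10% by default, as in the anecdote).
    """
    directory = os.path.dirname(os.path.abspath(path))
    if size_bytes is None:
        size_bytes = int(shutil.disk_usage(directory).total * fraction)
    remaining = size_bytes
    with open(path, "wb") as ballast:
        while remaining > 0:
            step = min(CHUNK, remaining)
            ballast.write(b"\0" * step)
            remaining -= step
        ballast.flush()
        os.fsync(ballast.fileno())  # push the data to disk, not just the cache
    return size_bytes


def release_ballast(path):
    """Delete the ballast file and return the number of bytes freed."""
    freed = os.path.getsize(path)
    os.remove(path)
    return freed
```

In an emergency the sequence is `release_ballast(path)`, fix the underlying problem, then `create_ballast(path)` again.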


Isn't that what ext filesystems on Linux do? IIRC the reserved portion is 5% which can be dropped if you need some headroom.


Yeah, the root-reserved blocks are tunable.

Won't save you if someone's running-as-root reporting job goes rogue and fills up the disk, though, while the file might... I mean, obviously one ought not have done that in the first place, but the real world is a whole thing.
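
To see how much headroom that reserve represents on a given mount, one rough sketch is to compare the total free blocks with the blocks available to unprivileged users, both of which `os.statvfs` reports on Unix-like systems. The function name `reserved_bytes` is made up here, and the difference is only an approximation of the root reserve on filesystems that account for space differently.

```python
import os


def reserved_bytes(path="/"):
    """Estimate the bytes free for root but not for unprivileged users."""
    stats = os.statvfs(path)
    # f_bfree counts all free blocks; f_bavail counts those usable by
    # non-root users. Both are expressed in units of f_frsize bytes.
    reserved_blocks = max(0, stats.f_bfree - stats.f_bavail)
    return reserved_blocks * stats.f_frsize


# Usage:
#   print(reserved_bytes("/"), "bytes currently held back for root")
```

On ext filesystems the percentage itself is adjusted with `tune2fs -m`, which needs root and is not done by this sketch.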


Yep. Often a system crash is caused by logging, which often logs ... as root.


Try SCE to AUX.


Investor left their bottle of Tequila sitting on the delete key?


that sounds Twitter-Musk-esque, amirite?


It's from the series Silicon Valley


Seems to be limited to loading comments in-app. Posts are loading fine and Reddit.com is showing comment threads.

Busted API deployment?


You should add "BGP config problems" and "cryptolocker"


A Tesla fire in Newark, CA took out an internet backbone?


Or a North American Fiber-Seeking Backhoe.


Just for fun I've bet on hardware failure.


Cat walking on keyboard?


It’s always DNS.


Pre-IPO jitters?


I noticed after last week’s “don’t be cute” discussion the error page stopped saying “you broke Reddit” and switched to a generic 503-sounding message


We didn't change anything AFAIK, different parts of the stack have different error messages


What is reddit and why would anyone go there?


It's the thing you put in your Duck Duck Go search (but do not do "!reddit" or it won't work) to find content from the closest thing the Internet still has to a reliable and genuine collection of opinions and reviews by actual disinterested humans.


Reddit is a website where anyone can make their own hackernews.


With the added bonus of a CEO who will helpfully edit your comments if your attempt wasn’t funny enough.


Let me be clear, I think it was an absolutely moronic move by the reddit CEO.

However, you make it sound like it happens all the time. In reality, it happened literally just once (multiple comments, but all within one hour), the edit itself was made obvious on purpose, and it has never happened again (it's been almost 7 years since then).

For additional context to what happened: the CEO replaced his own username (u/spez) in a few comments on r/the_donald (that were just variations of "fuck u/spez") with usernames of r/the_donald mods.

Again, I have a very low opinion of Reddit leadership team and their direction in general, but "CEO editing user comments" is not a concern I have about the website at all. Mostly because it was an isolated incident that happened over half a decade ago.


> and it has never happened again

how do you know exactly?


> but "CEO editing user comments" is not a concern I have about the website at all.

What concern would there be anyway? Like, who cares if someone edits your comment? It is not like you are storing your important life's work there – or at least I can't see it being a good idea to store your important life work there, even without tampering. Providing long-term durable storage is not the business they are in.


It has forums for niche hobbies like tabletop board gaming, woodworking, baking, etc., where people can get together and discuss why Patchwork is the greatest game ever, why that piece of wood is oak, showcase the very first cake they have ever tried to bake and why it looks good enough to be in a glass case in the Louvre, etc.


It has a rich niche community for everything; you will find brutally honest experts in everything.


It's the hive mind.


I think it's like teddit, but worse


One way they appear to be managing load is randomly banning users.

I've never had a report against me, I never post anything remotely spicy, I never even argue or flame. I moderated multiple subs.

And yet my 9-year, paid Reddit Gold account was randomly nuked. No explanation. Appeals denied. Support emails never answered.

I guess if your servers can't keep up, one way to cope is to just ban the folks who contribute the most to your site?


You really believe they ban people as a way of mitigating high server loads? I'm sorry but like, come on...


I do not literally believe that, no.

However I do often wonder what in the world is happening at Reddit HQ. Mitigating server load by banning users isn't too much sillier than some of their actual decisions.


Only partial?


Now how will I look at porn while I’m supposed to be working?



