This is a good question for any provider like AWS: what kinds of information do I leak with seemingly mundane choices like bucket names?
The other attack vector is insiders. Many organizations "shield" identifiable information behind UUIDs or some other scheme. In the event of a breach, the UUID might mean nothing to most people (it's not foolproof, though), but it opens more doors for an insider.
> Absolutely not my experience from those days, VPSs were rock-solid for me.
Likewise, my personal toy VPS actually dates back to 2009, and the only unplanned downtime it's ever had was when Linode was one of the targets of one of those big DDoS waves a few years back. Even then it was online and working fine internally, just limited in its ability to reach the outside world.
The VPS itself has been rock solid and in 15 years the only times it's not been at least trying to serve whatever I have on it have been when I've rebooted it for updates.
I found those specs really curious, since I worked in webhosting (including Rackspace, which is in the thread) and we didn't have anything near that low-end for a VPS. But reading further, I guess they're using some random fly-by-night webhost that doesn't even have a cPanel or Plesk license for the user to self-service; they base all of their pricing on Linode, and you have to call or open a ticket to manage your site.
I don't know the background of Satoshi, etc. (of course), but that's a $10 VPS and they're complaining in the thread about running out of memory. Why so cheap? Dedicated Xeon servers back then were $80-250 and VPSs filled every other price range.
And if they just needed to compile a Python widget on Linux with a desktop WM... we had the technology to do that locally in 2009.
It was an amateur open source project hosting a simple site for project collaboration, in an era before the FAANG bubble in engineer salaries. I think $10/month for a hobby thing is about right.
Also recognize that this was still in the "OMG how did computers get so fast" era. Our intuition about the time was still colored by the 486s on which we'd all installed Linux for the first time (or the SPARCstations we used at school, same deal). Even today a 100+ MHz device still "feels fast" to me, and that's coming from someone who writes audio firmware on 400-800 MHz DSP cores.
I mean, I just said I was in this business back then. I was probably 22 and making a whole $40k in my first "real" engineering job at Hostway, then Hostgator, then Rackspace, etc. And I owned multiple servers. A $40 VPS was not a big ask in 2009. I absolutely didn't make a FAANG salary, nor did 99.9% of our VPS customers. We used to stick 1-3k customers on one $350/month Dell PowerEdge.
We had millions of customers with VPSs across various price points, traffic levels, and businesses.
> Also recognize that this was still in the "OMG how did computers get so fast" era.
I'm so confused by this statement. I've been in datacenters since 2005, and if we ever had an "OMG fast CPUs" moment anywhere in that timeline it was when AMD Epyc came out around 2016 and we could push massive PCIe bandwidth for VFIO and the like. Beyond that, we've been sitting on various 2-4 GHz Xeons for 25 years. I'm confused by the 486 comparison; I had a 486 when I was 8, 31 years ago.
TWO THOUSAND AND NINE
2 0 0 9
"Before the SV balloon in salaries".. dude, we're talking under a $100/month... In 2009. Your comment makes it sound like this was the yesteryear of computers, like 1988 or something, this comment has me so confused.
The entire thread is them talking about putting a TiddlyWiki + phpBB on a VPS. That, along with LAMP-style stacks, was the most common hosting product sold, usually on $5-10/mo plans. I have a feeling they're actually using one of those scammy "free" webhosts you used to be able to find through Something Awful and the like, which would disappear after 2 months and were probably CSAM vectors.
It's just a very strange level of frugality, which is mentioned elsewhere in this HN thread. It sounds like they were only using donations to move forward; nothing wrong with that, it's just interesting how unskilled and under-researched their infrastructure choices were for someone who can seemingly create a thing like Bitcoin.
You lost me. You asked why someone would rent a bottom-tier hosting solution in 2009, and I gave you an answer as someone who was paying somewhere around that for hobby projects at right around the same time. I mean, I'm sure you're right in some sense that there were better choices; there always are.
But this wasn't a weird choice at all, unless you want to inform it with a BTC quote from 2018 or whatever.
I was just rhetorically reflecting on how strange it is that those are the specs they were using, when they clearly aren't a 12-year-old or a Russian warez slinger who can't get a VPS elsewhere and could afford something beyond a free or $10/mo VPS, WHILE they're arguing about running out of memory on the VPS when compiling. And... I assume they're technically savvy since they created Bitcoin, although I have absolutely no idea what that entailed.
Those specs for a VPS are absolutely pathetic even for 2009-era Xeon computing. I had quad-Xeon PowerEdges with 128GB+ RAM under my desk as workstations, pulled when customers stopped paying their bills, back in 2009 when I worked at Hostway. VPSs were leased with either dedicated or shared cores: shared was probably around $20-40, dedicated cores $60-250+. Datacenters used either VMware GSX/ESX or Virtuozzo.
I worked at all of the largest webhosts in the USA (not Hetzner etc. in Europe) from, I think, 2005-2015, so I'm pretty familiar with that era of hosting. The guy who owns the webhost they're using can't even spell his competitor Linode's URL/name. It's like they went to some random IRC channel and picked whatever free VPS host they could, which is a horrible idea. Whoever owned that VPS could absolutely figure out who these people were and interact with any file on their system he wanted to. I bet if someone tracked down whoever it is in that email talking about his webhost, they could trace that email thread back and figure everyone out. But fraud at webhosts was absolutely rampant back then, so there is the possibility the entire group faked their identities, if they even HAD to identify themselves at a fly-by-night webhost (narrator: they didn't).
When we took down Windows and Linux servers for CSAM, warez, fraud, etc., we'd often log into them before wiping them. They'd have IRC, ICQ, and so on open on the server, and we'd be able to take down an entire group by going through the logs that were still on it.
$7.99 at HOSTGATOR, who back then was the biggest budget webhost (outside of GoDaddy), barely got you a shared cPanel account on a server with another 2,000 customers lagging your site down. It was basically an SFTP account with a cPanel login, that's it. The webhost in question here doesn't even give them a login; they have to communicate with and have things done by its "customer support" (probably a call center in India full of sysadmins; those were huge back then).
A $9.99 VPS in 2009? oof.
Even more to the point of this thread, it shows that the tech/operations/infra skills of that team are... not very impressive. One guy didn't know Linux at all, and the "Linux guy" is the one picking a free VPS at a shady webhost that can't even compile his code. So that takes a lot of SRE types off the "who made Bitcoin" list. And this wasn't some "Linux is super rare" world; it was 2009. I had been using Linux since the 90s, when I wasn't even 10. We had hundreds of Linux sysengineers at each of these companies and no problem hiring them.
2009 Hostgator SHARED (NOT VPS/dedi) pricing was $7.95, although they didn't have VPS back then so I can't compare that. A dual-Xeon dedi was $219/mo; IIRC VPS started around $30-60 whenever that got added.
I think you are overestimating the skills of a couple of math nerds working on a hobby project in their spare time. Even working at a tech company with professionals who are paid to write computer software, a lot of people don't know much about operations. Heck, I had been in the industry for 10 years by that point, including one as a sysadmin, and I wouldn't know where to get webhosting because I was never employed to do that kind of work. Most people in the tech industry (and even more so for people in tech-adjacent academia) only really have deep knowledge in one or two areas of specialization.
That is absolutely ridiculous. This was 2009. The recession had just started and everyone who lost their job was opening up WordPress blogs and everything else. That's why Hostgator sold for $300 million 5 years later and made Oxley a fortune. 2008-2010 was the absolute largest webhosting boom that we've **ever** had. SO many webhosts started post-2008 to pull in that market. The webhosting market is absolutely dead nowadays.
Your grandma probably set up a blog to get some AdWords revenue. The entire market was based around WYSIWYG website theme GUIs and WordPress. Nothing technical was required until you needed a VPS/dedi.
Once again, I worked in the industry. I am very familiar with the types of mom-and-pop customers who were buying $9.99 shared hosting accounts. I spent a LOT of time on the phone supporting people.
Regardless, if they AREN'T skilled in infra/operations, like I said they aren't, then what's your point? You just corroborated that they're not skilled in infra, so thanks for repeating me? We have a group of programmers who are so unskilled with operations/infrastructure that I *know* they aren't SRE/infra types. That was my point: Satoshi or Hatoshi or whoever clearly wasn't an operations person.
If you consider someone who doesn't know ANY webhosts, or who is willing to use a free webhost, to be technically skilled in Linux, then we are absolutely not going to agree. That is absolutely pathetic, insecure, and stupid to do. Do not EVER use a free webhost. You shouldn't need a PhD in Debian to understand the implications of some random person or company, one that doesn't even charge you remotely normal fees, having access to your super-secret Bitcoin code.
You were a Linux sysadmin and didn't know any webhosts/DCs? Did you not work on Apache/nginx? If you did, you got your .htaccess configurations from webhosts. You probably got your ~/.ssh/config from webhost tutorials. You probably learned Postgres/MySQL through webhost tutorials. You probably learned systemd/etcd/etc. from webhost tutorials. You absolutely learned iptables through a webhost tutorial.
Almost *EVERY* single Linux tutorial from the 1990s through the mid-2010s was some sort of "set up a LAMP/WAMP stack for a bookstore company."
This sounds like a blatant lie, or you should've been nowhere near systems. You didn't know GeoCities? Angelfire? Tripod? GoDaddy? Linode? DigitalOcean? Rackspace? Hostway? The Planet? Liquidweb? 1&1? Hetzner? Or any of the tens of thousands of local datacenters we had to rack servers in? You were a Linux admin who had literally never had their own server hosted somewhere? Where exactly did you get PRODUCTION experience to become a sysadmin? I learned Linux over IRC, but I absolutely had tens or hundreds of servers throughout my growth. A HUGE number of us #linux people had eggdrops/shells even when we were little kids.
If you don't know any webhosts, then what exactly were you hosting in the datacenter where you were a Linux admin? Probably 80-90% of servers in a datacenter are Linux servers running a webserver. And if you're doing that, you're doing exactly what webhosts are doing, except they put a nice little cPanel/Plesk portal in front. I think you're using "Linux admin" a bit loosely here, or our skill sets are astronomically far apart.
This is some strange whitewashing of an entire era of computing that some seem to know NOTHING about, WHILE arguing with someone who was in the trenches at the time talking about EXACTLY what they did for a living. But no, please, have a random Hacker News person who wasn't involved in webhosting whatsoever describe the industry in 2009 to me.
Anything to play devil's advocate. You people are making it sound like this was the Wild West and webhosting was so hard and complex back then, in the yesteryear of... 2009.
I don't really understand your hostility here. When I was a sysadmin, I was working on Windows NT and Digital UNIX systems, and we were taking the radical step of moving some of those systems to Debian. There was no hosted website; it was on-prem IT. In subsequent jobs I developed software that ran on various versions of UNIX/Linux, but those were on-prem installs too. By the time I worked at a company that had a SaaS offering, there was a whole team of people dedicated to setting up the operations side. I wrote back-end application code; why would I ever need to set up a LAMP stack? At another company I did admittedly set up Apache and do a bit of PHP programming as glue to a Java back end, but once again... on-prem installs for enterprise clients.
I think you are seeing the world through your own lens of being an expert in web hosting. Sure, every software developer who has used Linux knows how to navigate a shell, and most software developers working in SaaS 15 years ago knew how to spin up a local web server. That doesn't mean they knew the names of every company offering web hosting services on the public internet, especially in America (assuming these Bitcoin devs were European), or that they knew about cPanel or WordPress or whatever other PHP content management system. It's a completely different area of expertise.
At least in the US, I believe we've proven that anti-competition regulations don't work and that the government realized it's easier to regulate the consumer than to deal with regulating massive corporations.
If you want to win reelection, you're much better off taking massive piles of cash from big businesses and regulating consumers to help create the monopolies. Trying to protect consumers by breaking up monopolies and promoting healthy market competition will see you leaving office in a hurry.
In the US we've proven that it's easiest to just talk about anti-competition regulation to buy votes and then never actually get around to improving or enforcing it. After all, if you solve problems, you lose platforms to run the next election on.
There are always more problems, but for example, it's a lot easier to get voters heated over abortion rights than it is over whether or not a national ID should carry biometric data.
> You barely have any .. anti-competition regulations
This is not true. However, the secrecy around enforcement (to avoid bad publicity) causes the casual observer to think so.
There are very large enforcement actions that take place regularly. They are far from perfect, and the failures tend to be the ones that are amplified in the media.
> This is not true. However, the secrecy around enforcement (to avoid bad publicity) causes the casual observer to think so.
That isn't the reason it isn't true. The US nominally has quite strong antitrust laws; the statutes are extremely broad in what they prohibit. But enforcement is lacking, and over time the courts have read the laws more narrowly than they were intended.
> the failures tend to be the ones that are amplified in the media.
The failures are prolific. In a functioning regulatory environment (whether because you don't have regulations that prop up incumbents or create regulatory barriers to entry, or because you break companies up and stop them from buying each other), you wouldn't have industries where any one company has more than 15% of the market. But that is common, not rare, and that is the measure of whether it's working.
> Is that realistic? Intuition is telling me that's very idealistic but I'm prepared to be surprised
There are many markets where this is the case. Which trucking company has significantly more than 15% market share? Which law firm? Which car insurance company? Which university? Which construction company?
Nearly all of the consolidated industries got there through some combination of mergers, vertical integration and regulatory barriers to entry. Even some of the "natural monopolies" like last mile telecommunications are only so because of regulatory choices -- the natural monopoly is actually the roads, which the government owns, and if they provided easy and affordable access to roadside cable trenches there would be much more competition for data service.
> You have? How? You barely have any. And the ones you do have you rarely enforce.
Lack of enforcement was actually part of my point. More broadly, though, we don't need more regulation so much as we need fewer legal protections that allow companies to get away with it.
> That sounds like a flaw in your political/electoral system not in anti-competition regulations.
No disagreements at all that our political and electoral system is flawed. I'm not so sure whether that's the direct cause here, though, or whether it's the other way around. Meaning, we could be here because runaway anticompetitive behavior led to political and regulatory capture, rather than the flawed political system being the proximal cause.
I think you are missing the third leg of this, which is how those monopolies extend outside of the US and act as part of the state's global soft power strategy in the other countries where they are also monopolies.
The US is no more likely to break them up than it is to cut its military budget in half.
> Current performance and test counts on a 40 core system are:
>
>     $ time make -j $(nproc) check SUBDIRS=.
>     13s
>
>     $ time make -j $(nproc) check RUN_EXPENSIVE_TESTS=yes
>     1m22.244s for 9 extra expensive tests
That's pretty respectable, given that coreutils includes 98 programs (some are simple, like yes(1) and true(1), but most of them are used millions of times a day to do real work: ls(1), kill(1), cat(1), wc(1)).
In fact, I used wc(1) to count the number of separate programs inside coreutils.
I have to wonder whether this was just someone thinking "I want to make `yes` as fast as possible" or whether there was an actual need to make such an elaborate program for something that spits out "y\n" repeatedly.
It also, frankly, feels like the wrong layer for such an optimization. I would have hoped there was a C "write to stdout" function that does all the buffering and performance tricks this thing does.
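For reference, here is a rough sketch of the batching trick being discussed; it's illustrative only, not the actual coreutils source. The idea, as described further down the thread, is to fill one BUFSIZ-sized buffer with copies of "y\n" once, then hand the whole buffer to write(2) in a loop, so each syscall moves roughly 8KB instead of 2 bytes:

    /* Sketch of the buffered approach; not the GNU implementation. */
    #include <stdio.h>      /* BUFSIZ */
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[BUFSIZ];   /* BUFSIZ is commonly 8192 bytes */
        size_t used = 0;

        /* Fill the buffer with "y\n" once, up front. */
        while (used + 2 <= sizeof buf) {
            memcpy(buf + used, "y\n", 2);
            used += 2;
        }

        /* One syscall per buffer; loop until write() fails or the
           process is killed by SIGPIPE when the reader goes away. */
        while (write(STDOUT_FILENO, buf, used) == (ssize_t) used)
            continue;
        return 0;
    }

The real program obviously handles an argument string and error reporting on top of this, but the "fill a buffer, then write it repeatedly" loop is the part being debated in this thread.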
> If you have a vague recollection of the internals of a Unix program, this does not absolutely mean you can’t write an imitation of it, but do try to organize the imitation internally along different lines, because this is likely to make the details of the Unix version irrelevant and dissimilar to your results.
> For example, Unix utilities were generally optimized to minimize memory use; if you go for speed instead, your program will be very different.
So I think a lot of coreutils etc. were written for extreme speed so that there would obviously be no crossover with existing UNIX source.
I love using it as an example of how implementing something so simple can lead to learning about so many seemingly unrelated things like context switches and memory alignment.
It's a dumb tradeoff, IMO. The job of `yes` is to produce output only when it's being read, and this implementation copies "y\n" until it fills up a BUFSIZ-sized buffer (8KB on my system), then outputs that buffer until the write fails. This means you're paying the cost to fill the buffer even if you're only reading one line (which is a common use case for "yes": responding with "y" to software that is asking for confirmation, which generally only reads input once). Which means "yes" will always occupy at least 8KB of RAM even though it doesn't need to, and you're spending thousands of CPU cycles copying into a buffer even though you don't need to.
That inefficiency is a far bigger sin than the "slowness" of yes needing a write() call for every line it emits, given the intended purpose of the command (which is not, as the code suggests, to saturate a Unix pipe as fast as you can).
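For contrast, here's a minimal sketch of the per-line behavior this comment is arguing for; again illustrative only, not any real implementation. One write(2) per "y\n", no pre-filled buffer, so answering a single confirmation prompt costs a single tiny syscall:

    /* Sketch of the naive per-line version. */
    #include <unistd.h>

    int main(void)
    {
        /* One 2-byte write(2) per line; the loop ends when write()
           fails or the process gets SIGPIPE once the reader exits. */
        while (write(STDOUT_FILENO, "y\n", 2) == 2)
            continue;
        return 0;
    }

The trade-off is the one discussed in the surrounding comments: this does the least possible work for the "answer one prompt" case, but pays one syscall per line when something actually consumes the output indefinitely.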
Except that it does not display just a single yes line by default, so its primary purpose is effectively to print them forever. Also, in this particular corner of the netsphere we should applaud the maddening research for performance that people put into that seemingly obvious command.
Edit: riquito is totally right here, I'm wrong. I'm leaving this for posterity, but this is all incorrect: stdout buffering by default will allow the "yes" process to write a lot of data even if you do a naive loop of printf calls. Writing to stdout will not block just because nothing's reading from it; the writes will be held in a decently-sized buffer by default, so even the naive approach will be wasting energy. I'll leave this for my own shame:
It prints them as long as the receiving process reads them. That’s how pipes work. When piping to a process that only reads from stdin once (you know, like the original purpose of the command, to respond “yes” to prompts from processes that are asking for confirmation), it will only write once (because stdout is line-buffered, so printing a string with a newline will block until something reads it.) Filling a buffer with 4,000 instances of “y\n” on the off chance that the receiving process will actually read all of those, is doing extra work that may not be needed.
“Maddening research for performance” in this case is coming at the expense of energy expenditure: you’re always allocating that memory and always filling a buffer with thousands of “y”s even when it’s just going to get thrown out. That should not be applauded. People should be just as concerned about energy usage as they are about wall clock time.
On the other hand, 8kb is just 2 virtual memory pages; overhead of running a process in the first place will be bigger than that. And the time of filling the 8kb buffer is likely not much more than the cost of a context switch to the reading process.
You're optimizing for the wrong thing. Sure, it's "inefficient" for a single prompt, but what about when you're doing a batch operation with millions of prompts that you actually need to be fast?
I expected to read a paper about some obscure Excel trick to manipulate stats output. Instead, this is just old-fashioned manipulation by hand or "imputation" as the paper describes it.
> In email correspondence seen by Retraction Watch and a follow-up Zoom call, Heshmati told the student he had used Excel’s autofill function to mend the data. He had marked anywhere from two to four observations before or after the missing values and dragged the selected cells down or up, depending on the case. The program then filled in the blanks. If the new numbers turned negative, Heshmati replaced them with the last positive value Excel had spit out.
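To make the described procedure concrete, here's a small, hypothetical sketch of roughly what that kind of autofill does. It's an assumption-laden simplification, continuing the series by the average step of the marked cells rather than reproducing Excel's exact fill-series behavior, plus the "replace negatives with the last positive value" step from the quote:

    /* Hypothetical illustration, not Excel: continue the marked series
       by its average step, then replace any negative result with the
       last positive value produced, per the procedure quoted above. */
    #include <stdio.h>

    static void autofill(double *v, int marked, int missing)
    {
        /* Average step across the marked observations (needs >= 2). */
        double step = (v[marked - 1] - v[0]) / (marked - 1);
        double last_positive = v[marked - 1];

        for (int i = 0; i < missing; i++) {
            double y = v[marked + i - 1] + step;
            if (y < 0)
                y = last_positive;   /* the "replace negatives" step */
            else
                last_positive = y;
            v[marked + i] = y;
        }
    }

    int main(void)
    {
        /* Three marked observations followed by three gaps to fill. */
        double series[6] = { 9.0, 6.5, 4.2 };
        autofill(series, 3, 3);
        for (int i = 0; i < 6; i++)
            printf("%.2f\n", series[i]);
        return 0;
    }

Run on a declining series, this spits out one extrapolated value, goes negative, and then just repeats the last positive number for every remaining gap, which is exactly why treating the result as real data is so problematic.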
The crazy thing about it is that the author doesn't seem to understand why it's bad. He doesn't appear to be hiding it. He just says "yeah, that's what I did, whoops, I forgot to say it in the paper." He's either decided that acting like a complete moron is better than being thought of as an intentional fraud, or else he really does think it was totally above board.
I had a similar thought. My interpretation is that he genuinely thinks what he did was ok, because Excel has computer magic.
This quote seems unintentionally telling:
> "If we do not use imputation, such data is almost useless,” Heshmati said. He added that the description of the data in the paper as “balanced” referred to “the final data” – that is, the mended dataset.
European providers benefit from lower cross-connect fees in datacenters and more internet exchanges for easy peering. It's not surprising they offer more bandwidth at the same cost.