Looks like it's a cut/paste error. If you do wget www.doioig.gov, this is the page you get. Notice the meta refresh that points to stackoverflow.com.
<!DOCTYPE HTML>
<html lang="en-US">
<head>
<meta charset="UTF-8">
<meta http-equiv="refresh" content="1;url=http://stackoverflow.com">
<script language="javascript">
window.location.href = "http://www.doi.gov/oig/index.cfm"
</script>
<title>Page Redirection</title>
</head>
<body>
If you are not redirected automatically, please click the link to continue to the <a href='http://www.doi.gov/oig/index.cfm'>U.S. Department of the Interior Office of Inspector General.</a>
</body>
</html>
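If you want to check a page like this without reading the markup by eye, here is a minimal sketch using only Python's standard library. The class and function names (`RefreshFinder`, `meta_refresh_targets`) are made up for illustration, and the sample string is just the one relevant tag from the page above, not a live fetch.

```python
from html.parser import HTMLParser


class RefreshFinder(HTMLParser):
    """Collects the target URL of every <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("http-equiv") or "").lower() != "refresh":
            return
        # The content attribute looks like "1;url=http://example.com":
        # a delay in seconds, a semicolon, then url=<target>.
        content = attr.get("content") or ""
        _, _, rest = content.partition(";")
        key, _, value = rest.strip().partition("=")
        if key.strip().lower() == "url":
            self.targets.append(value.strip())


def meta_refresh_targets(html):
    """Return the list of meta-refresh target URLs found in `html`."""
    finder = RefreshFinder()
    finder.feed(html)
    return finder.targets


# Only the relevant tag from the page quoted above.
page = '<head><meta http-equiv="refresh" content="1;url=http://stackoverflow.com"></head>'
print(meta_refresh_targets(page))
```

Comparing that list against the `window.location.href` value in the script tag is what exposes the mismatch here.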
Because it makes sense. These sorts of things normally are allowed to go through so that horrible unintended consequences can be wrought on the unsuspecting.
I bet there's some sort of logic on Google's side that says: if a government homepage URL redirects to a non-.gov URL, the .gov address should be the canonical page of the site that's being redirected to. That would explain why stackoverflow's homepage isn't listed in the results. Just a guess.
I could imagine the ranking favours .gov domains, which should be reliable. But surely for anything Google sees as a redirect, the redirect target should be the canonical URL?
Why would this rank them so high, though? Does the googlebot actually follow the http-equiv="refresh"? If so, would changing http://stackoverflow.com to, say, http://dell.com make them rank #1 for searching for dell as well?
That sounds too easy. Also, if doioig.gov redirects to stackoverflow.com that just makes doioig.gov an alias for Stack Overflow. An alias shouldn't rank higher than the real thing.
Given that the Department of the Interior has been forced to take the whole department (except for vital services) offline multiple times, I would not be surprised if it were hacked. I am hoping this doesn't get in front of a judge anytime soon, as it can have consequences for people caught in the way.
One such consequence: at one point a judge (curse his or her soul) decided that since the DOI needed to be off the internet, all "affiliates" needed to be off the internet too. This includes the BIA (Bureau of Indian Affairs), which covered both the .gov and .edu domains. At the time, many Tribally chartered Community Colleges[1] were told to disconnect from the internet mid-semester, even those colleges that paid for their own internet connection and had a .edu domain of their own.
Imagine having two weeks with no internet (most of our students don't have home internet) with classes going on. Finally, someone got the order rescinded for the schools.
I am not very fond of how the DOI handles its internet[2][3].
1) accredited just like state or private colleges with transferable classes.
2) don't even get me started about sending mail from a subdomain with no DNS entry for the sending mail server or subdomain and expecting us to not reject it.
So, apparently in 2001 a class action lawsuit accused the Department of the Interior of mismanaging Indian trust accounts. As part of that lawsuit, court-appointed hackers broke into DOI computers. The judge ordered all DOI computers to be disconnected from the internet. Some parts of the DOI came back online quickly, but others remained disconnected for over 6 years. Employee desktops couldn't even access the internet, so they could only send emails within their intranet.
Google the Cobell v. Norton lawsuit. It basically is a case study in how not to do internet anything along with some interesting physical site security problems.
Google's algorithm has gotten so big and so complicated over the years, that there are so many cracks and special cases that can cause sites to disappear from or be poorly ranked in search results, unless you're lucky enough to be huge in the tech scene or post here and get your comment seen by a Googler (as I have on occasion).
<plea>Any Googlers reading this, I'm looking into rebuttals of false DMCA requests being ignored by Google for months...</plea>
Also, the internet has gotten bigger, so the possibilities for cracks or errors have widened greatly.
That being said, I do think this specific failure has no reasonable explanation. No matter how I think about it, or look at it, the only explanation I can come up with is essentially: "What percentage of people would need to see a wrong version of a search result before it is reported" or "How long would it take us to fix a knowingly wrong result".
Agreed. A hacker put some malicious code on one of my sites over a year ago and it was quickly buried in search results after ranking 1-3 on page one for years. I fixed it, changed hosts, notified Google and have tried everything else I can think of and it's still nowhere to be seen - even though it ranks near the top of both Yahoo and Bing.
(This hypothesis is supported by a Google cache of doioig.gov showing the message "Due to security concerns, our website will be unavailable until transition to the Department of the Interior web domain occurs. We apologize for any inconvenience this may cause, and are working to speed up the transition. The following contact information is provided to assist you.")
Google obviously puts high trust in this domain, as it has many high-trust links, typical for .gov domains. In this case that seems to have been enough for the algo to assume that doioig.gov is the more important domain and should replace StackOverflow.
Everyone seems to be focusing on the .gov site but if you take a look at the Stack Overflow home page, the page that would normally get indexed highly, there is very little telling Google what it is.
There is no <meta name="description"> tag in the header. The H1 tag, important to Google, says "Top Questions". The content of the first page constantly changes.
Plus, I would bet that most of the links into Stack Overflow are to individual articles, not the home page. Any particular article probably doesn't outrank a popular .gov site.
This is just very poor SEO on Stack Overflow's part.
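A rough way to check those two signals on any page, as a sketch with Python's standard library only. The `SeoSignals` class and `seo_signals` function are hypothetical names, and the sample markup is a stand-in that mimics what is described above, not the actual Stack Overflow home page.

```python
from html.parser import HTMLParser


class SeoSignals(HTMLParser):
    """Records the meta description (if any) and the text of the first <h1>."""

    def __init__(self):
        super().__init__()
        self.description = None
        self.h1 = None
        self._in_h1 = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "description":
            self.description = attr.get("content")
        elif tag == "h1" and self.h1 is None:
            # Start collecting text for the first <h1> only.
            self._in_h1 = True
            self.h1 = ""

    def handle_data(self, data):
        if self._in_h1:
            self.h1 += data

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False


def seo_signals(html):
    """Return (meta description or None, first <h1> text or None)."""
    parser = SeoSignals()
    parser.feed(html)
    h1 = parser.h1.strip() if parser.h1 is not None else None
    return parser.description, h1


# Stand-in markup: no meta description, and an <h1> that says "Top Questions".
sample = "<html><head><title>Stack Overflow</title></head><body><h1>Top Questions</h1></body></html>"
print(seo_signals(sample))
```

A `None` in the first slot means the search engine has to guess its own snippet for the results page.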
I disagree it's necessarily poor SEO. They should do everything they can to ensure that pages that answer questions have better pageranks than their largely useless homepage. That's what gives the name stackoverflow power, questions actually getting answered.
Certainly the individual questions and answers should rank highly. They do a somewhat better job at SEO on those pages, although there still is no meta description tag, leaving Google to guess what to put on the search results page.
They could do a lot better on those pages too. The real content is buried below lots of javascript and other code. The higher on the page your real content is, the better. They do have pretty good titles and H1 tags on those pages, and the urls are okay, although I would move the name of the article up at least a level. Here is one: /questions/15181744/twitter-number-of-tweets-not-updating-testing-on-local
But the original topic here was a search for "stack overflow" and for that search there could be no better page than the Home page (well, maybe the About page, although if they did their job better they would get one of those results with lots of sub-section links). There was no real attempt to optimize that page for search engines. Optimizing it for the keyword "stack overflow" isn't going to hurt the rank of those question/answer pages.
Too good, even. Google has a bad habit of ranking a question page lower (for matching keywords) than some random other (more popular) question page that links to the relevant question page.
Not sure I disagree with the SEO on "Stack Overflow". Who is googling for that term?
Practically any technical question I've searched for has resulted in a stack overflow (or super user) #1 result. I'd say they are doing a pretty good job, even if they don't do all the old school SEO stuff.
According to the Google Adwords tool, "stack overflow" is googled 74,000 times each month. Not to mention all the permutations.
If I worked for Stack Overflow, I would also try to optimize for more general terms. The questions and answers are good at getting those long-tail keywords, but not so good at the more generic ones like "coding help" or "javascript help" or a hundred others. Stack Overflow isn't on the first page for either of those examples, or many others.
I've often wondered what all those extra bits of information mean and whether there's information encoded in there that I may not want to give someone.
Google results can dramatically change based on various parameters (location/time/browser/history/whatever-new-they-are-testing.) They presumably cram all this info into the search bar in case you want to repro the results.
Google almost universally provides the best results if you are actually searching for something.
Compare the results for queries like "that movie where a computer plays tic tac toe". A very reasonable search if you forget the name of War Games, but bing fails it utterly and completely; not a single mention of War Games until the second results page, and the wikipedia page for it doesn't appear until the third. Meanwhile on google the first results page barely has anything that isn't about War Games.
Now, if you just search "War Games" both will do fine. For that matter, so does Wikipedia's builtin search...
You can push it even further and get more vague, something like "that car that james may goes fast in" and while at that point google starts to degrade, it still easily beats out bing.
Suppose I am looking for "that movie with a button". Yup, google gets it, "The Box". Bing thinks I am thinking of Benjamin Button, which was google's second suggestion (I wasn't). Fair enough though, suppose I actually had been thinking about "that movie where the guy gets younger". Neither google nor bing does great, though google still definitely does better.
> Suppose I am looking for "that movie with a button". Yup, google gets it, "The Box". Bing thinks I am thinking of Benjamin Button, which was google's second suggestion
Hm. There is no one google. In the google that I see, for that search, it's the other way around: "Benjamin Button" is #1, and "The Box" is #2 (both on IMDB).
Life before search engines was different and it's easy to be blasé. IMHO, getting the one that you are vaguely thinking of anywhere in the top 5 is a great and amazing technical feat. And also good enough to jog your memory, so #1 or #2 makes no difference.
How do you allocate your searches? Does the better quality of results justify the inconvenience of using several engines or is there a mechanism I'm unaware of?
Well, most of the time I already know the site best suited for my queries (e.g., SO for programming, amazon for shopping, wikipedia for facts, Wolfram Alpha for calculations etc), so DuckDuckGo's bang syntax really comes in handy. Google is astonishingly good at parsing vague queries, but otherwise DDG or Bing are pretty much OK.
I've tried using Bing several times in the past just to give them a fair chance, but every time I came back to Google. They are really not on par in the search business. Actually, no company other than Google has really got the knack of internet searching.
Well, here's an anecdote: when searching for "stack overflow", the first result is more relevant on Bing than on google. So yeah, a few people will go with that one.
I interned at Microsoft so I've been using Bing since last May. Only really have issues with super obscure error messages that I encounter, since Google often indexes the source code that generates the error messages, and Bing just has nothing.
Gotta wonder though: should StackOverflow.com rank high for "stack overflow"? After all, a "stack overflow" isn't necessarily related to programming questions. Yes, you can ask questions on StackOverflow.com about stack overflows, but that's missing the point. So if I have a domain name that's a thing, but my site has very little content related to that thing, should I rank high for queries about that thing?
So are you saying people searching for "stack overflow" want the definition more often than the web site stackoverflow.com?
Because that doesn't make sense to me. Perhaps an upcoming engineer once or twice needs the definition in their life. And they will mostly go on to use the site as well.
Lazy people like me often type the approximate name of a web site into Google rather than trying to guess the exact url or bookmarking it. And we do this continuously.
I can't see the definition being more popular than the site.
Would a person struggling with a stack overflow actually google "stack overflow", though? Surely only a very fresh programmer would need to google it, and only if he/she knew what the concept was. But here's what an infinite recursion in a C program prints if I run it:
Segmentation fault: 11
Well, that's C for you. If I try Ruby:
test.rb:2: stack level too deep (SystemStackError)
Ok, still not the same nomenclature. Python:
RuntimeError: maximum recursion depth exceeded
Nope. How about Go? ... Actually, an infinitely recursing function in Go never completes on my machine. I wonder why. Perhaps it's not using the stack the way I expect.
If you do the same thing in Java:
Exception in thread "main" java.lang.StackOverflowError
Ok, there it is.
But if you do get that, would you not google "StackOverflowError", as opposed to "stack overflow"?
(Then again, my Google searches are perhaps uncommonly precise. If a function "foobar()" in library "libfoo" overflowed when processing HTTPS URLs, I would probably google for "foobar stack overflow libfoo https url".)
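For what it's worth, the Python case is easy to reproduce. A minimal sketch, assuming Python 3.5 or later, where the exception is named `RecursionError` (a subclass of the `RuntimeError` quoted above); the function names are made up for illustration.

```python
def recurse(depth):
    # Use the result of the recursive call, so every frame still has work
    # to do after the call returns and has to stay on the stack.
    return 1 + recurse(depth + 1)


def overflow_error_info():
    """Trigger unbounded recursion and report what the interpreter raises."""
    try:
        recurse(0)
    except RecursionError as exc:
        return type(exc).__name__, isinstance(exc, RuntimeError)


print(overflow_error_info())
```

Either way the phrase "stack overflow" never shows up in the output, so you would still end up googling the interpreter's wording rather than the concept.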
The Wikipedia entry for stack overflow (the concept) is the fourth hit on a search for "stack overflow". Should be acceptable to a newbie.
> Actually, an infinitely recursing function in Go never completes on my machine.
Maybe your function is tail-recursive? Try to use the return value from the recursive call in a nontrivial way, so that stack storage is necessary to store some local variable.
And my point was that GCC was not as helpful in that case. :-) With the GNU C library you will have to install your own signal handlers and jump through a lot of hoops to get anything more sensible than a "segmentation fault" error. You would think our tools would be a bit more modern by now.
Google couldn't care less about some abstract notion of whether results are "correct". They want to provide the results that people are looking for, and that's it. I'd wager that most people searching for "stack overflow" are after the site, not the concept. By putting stackoverflow.com at the top, they are providing most people with the result they seek.
Your site should rank high for queries about that thing if people routinely search for that term while attempting to find your site.
There are more links on pages with the words "stack overflow" near the link, or with the anchor text itself "stack overflow", pointing to stackoverflow.com than to any other page on the web. Lacking additional context, the web teaches Google's algorithm that when someone is talking about "stack overflow", they are referring to stackoverflow.com more often than anything else.
If you made a website called "Nikon Cameras" today but the site had nothing about Nikon Cameras on it, it would not rank well for the search "Nikon Cameras". Other people writing about Nikon Cameras would not link to your site more often than something actually about Nikon Cameras.
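To make the anchor-text idea concrete, here is a toy sketch. It is emphatically not Google's algorithm, just a vote count over a handful of invented links; `best_target_for` and every URL and anchor text in the sample data are made up for illustration.

```python
from collections import Counter


def best_target_for(phrase, links):
    """Return the URL most often linked with anchor text containing `phrase`.

    `links` is an iterable of (anchor_text, target_url) pairs.
    """
    phrase = phrase.lower()
    votes = Counter(url for text, url in links if phrase in text.lower())
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Invented sample data: each pair is (anchor text, link target).
links = [
    ("Stack Overflow", "stackoverflow.com"),
    ("asked on stack overflow", "stackoverflow.com"),
    ("stack overflow (the error)", "en.wikipedia.org/wiki/Stack_overflow"),
    ("Nikon cameras review", "example.com/nikon"),
]
print(best_target_for("stack overflow", links))
```

With these made-up links the site wins two votes to one, which is the whole argument in miniature: whoever the web links to when it says the phrase is who the phrase ends up meaning.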
The preferred .gov page has thousands of google plus likes. This seems like a fascinating example of google plus' terrible impact on google search and perhaps google corp.
Something strange happened here. Google displays `stackoverflow.com` content in the `doioig.gov` description. For example, I can see `careers 2.0` in the description of `doioig.gov`, which the site itself really doesn't have.
Well actually, "stack overflow" should ideally refer to the programming error. The site being talked about is "StackOverflow", and if that term is queried in any search engine, it should, and will provide the correct result.
Don't be silly, that's just the way their logo is stylized. Their FAQ states it's indeed "Stack Overflow", and their mobile version has a logo with a space.
Because it describes what a stack overflow is, rather than being a site named after the thing?
That would be my thinking. Ok so maybe wikipedia doesn't have to be the top result, but I'd rather have results relating to the thing I'm searching for come above sites just named after it.
I am willing to bet the City of London against a house-brick that the vast majority of people searching for 'ford' are not looking for a definition of the river crossing, but rather the motor vehicle company named after the man named after several other generations of men named after such a river crossing.
Oh maybe, and maybe they search for that more often than they search for the thing it's named after, and maybe more people would find it more useful that way.
Their logo has no space, nor does their web address. People typing in "stack overflow" I bet are not looking for the web site. I'm sure google analytics for stackoverflow.com could tell us if it was true.
doioig.gov ranks highly for both "stack" and "overflow" separately. That site is about stacks and overflows after all. Perhaps that's why combining the two gives it superpowers.
Do you have your preferences set to return more than the default 10 results? When set to the max of 100, I get a lot of suboptimal results. It's getting better than it used to be, but the worst case I found had slots 1-70 all occupied by the same site.
I think there must not be much testing of the non-default settings. I much preferred the earlier interface where results from the same site were grouped, capped at a small number, and there was a "More results from this site" link.
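The capping part of that older behaviour is simple enough to sketch. This is only an illustration of the idea, not how Google implemented it; `cap_per_site`, the default limit of 2, and the example URLs are all assumptions.

```python
from urllib.parse import urlsplit


def cap_per_site(urls, limit=2):
    """Keep the original order, but allow at most `limit` results per hostname."""
    seen = {}
    kept = []
    for url in urls:
        host = urlsplit(url).hostname
        seen[host] = seen.get(host, 0) + 1
        if seen[host] <= limit:
            kept.append(url)
    return kept


# Invented result list: one site hogging the first three slots.
results = [
    "http://a.example/1",
    "http://a.example/2",
    "http://a.example/3",
    "http://b.example/1",
]
print(cap_per_site(results))
```

Anything dropped by the cap is what the "More results from this site" link used to be for.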