Fairly ironic considering this entire article was basically rewritten in twenty minutes from two or three original blog posts, so that the NY Times could run ads against their own version. This entire article adds exactly nothing to the originals.
And what's more, whenever I check my RSS feed from the NYT I regularly see entries like this:
>The 6th Floor: Sentence of the Week
"Five diverse candidates this week."
It seems like the NYT is just trying to distract Google from the fact that 90% of their own content is not much better than the content farmed stuff they're criticizing.
I find your characterization inaccurate. What this author did was tie together a blog post from a week ago with Google's recent algorithm changes. The piece she quoted was an example of the sort of thing Google is trying to combat. What she adds is this connection. Readers of HN probably don't need these dots connected for us. But most readers of the NY Times probably do, since most of them do not have the experience and understanding of the web that most HN readers do. This is the sort of thing newspapers have done for as long as they've been around.
I find your claim that 90% of the content on the NY Times is "not much better than the content farmed stuff" absurd. Most of the writing on the NY Times has a clear voice and purpose, and much of it is original.
"I find your claim that 90% of the content on the NY Times is 'not much better than the content farmed stuff' absurd. Most of the writing on the NY Times has a clear voice and purpose, and much of it is original."
Even their so-called original writing often isn't original at all. Just look at their article on Choquequirao from this week, which took up half of the travel section:
Seem at all familiar? Probably because it's practically a word-for-word rewrite of this article from 2007, with a couple paragraphs from various other articles thrown in for good measure:
I think that one instance of the NY Times publishing a rewrite of their own travel article from four years ago is hardly damning. Compare this to, say, the front page, which links directly to 100+ original, recent articles and even 4 video features. In this context, I consider comparing the NY Times to a content farm absurd.
I don't think that's really fair. It's true that the NYT website is in the same fundamental business as content farms, that is using content to attract pageviews (forgetting the pay wall for a minute), but much more than 10% of the NYT is very high quality, compared to content farms. Sure, there's fluffy stuff printed every day, but there's also the kind of journalism and writing that you can't buy at .9c/word.
In this case, just look at the links to other Opinionator blogs on the right hand side of the linked article, for examples of stuff you won't find in any content farm.
* a five-part series on early computer scientists
* a post about a composer and musician growing up
* Stephen Strogatz's long (and excellent) series on math aimed at non-math people
* something about philosophy and rational choice that was dense enough that I couldn't skim quickly enough to summarize
Plus, I don't see ads on any of these pages. Are other people being served ads on the original article?
edited for formatting
edited again: in another browser, I see that this page does indeed have ads, so I was wrong about that.
In fairness, I regularly find information that's actually useful to me on about.com. Not the highest quality site, but I certainly wouldn't rank it with the Demand Medias of this world.
In fairness, you are right. I've found a few helpful pages on about.com. (I've found a lot more unhelpful ones, though.) And even Demand Media has shining moments. Thanks to eHow, I was able to fix a jammed disposal last month.
Wait, what?! Fuck, I hate that website. I was once searching for something like "Chernobyl cancer statistics" and, sure enough, about.com was there in the top 10 for that search. And that was after the Panda update. At this point I've started to give up; I'm just adding "wiki" to almost all of my searches, hoping that Wikipedia has something smart/accurate to say about the subject I'm interested in.
A friend of mine got it into his head that there was money to be made from setting up a system to generate bullshit sites in the hopes of attracting users and creating "communities". In fact, he had a couple of these bullshit sites up and was generating a surprising amount of income from them. Of course, he insisted that they were not bullshit sites and that his intent was to turn the promising ones into "real" sites.
I declined to take part in the venture because I felt this was deeply wrong. I also predicted that Google would do their best to stomp out these sites despite selling ads on them (to stop selling ads on them first they have to identify them as bullshit) and that his business might evaporate overnight. And rightly so.
I hope he didn't invest too much of his own money in it. Or better yet, that he scrapped the idea altogether.
What's ironic about this nonsense is that it helps a lot in pushing the NLP field forward.
I see it like the security field, where attacks and defenses push each other forward. The difference is that Google offers "security through obscurity", because we don't know how their bunch of algorithms work.
A large number of workers with Zero Marginal Product can be a sign of an economy in transition -- that is, a sign that people are employed in positions where they may have been productive in the past, but changes in the overall environment have rendered them obsolete.
This does not mean they're incapable of doing valuable work. It just means they haven't yet finished the transition from whatever no-value-added position they were in to a value-added position.
Though, I should say, that masks another contingency with possibly the same effect.
We have an economy which is predicated on cheap energy. As such it's frequently cheaper to produce goods by comparatively inefficient automated processes due to the reduction in manpower required, or to produce goods at a distance from their end user, as the reduction in labour costs from a remote supplier outweighs the transportation costs.
It seems plausible that an energy scarce world could reverse the economics of both situations. It would have enough other consequences that I'm not at all suggesting it as some sort of worker's utopia, but greater employment may be an interesting side-effect.
Do you define the usefulness of a piece of work as "fulfills a perceived need for the individual paying for it" or as "brings added value to our society"?
In the former case - every employee is doing useful work. The latter definition is way more restrictive.
The latter. Useful means that it provides value that somehow improves the human condition. Everything from food to fuel to art to architecture is that. Random junk on the net is not.
A central fallacy of present-day economics is the assumption that all economic activity is of equal value. GDP is a good example: A million dollars spent on healthcare increases GDP by a million. A million dollars spent on war increases GDP by a million. A million dollars spent on rubber dog crap increases GDP by a million. But these transactions are not of equal value.
I personally think that economic central planners and GDP growth targets are a major underlying cause of all this useless make-work. I call it the gerbil wheel economy.
Run, gerbil, run! Gotta make the opaque meaningless numbers bigger!
You can see it very clearly in very centrally planned economies that worship GDP growth, like China. They have entire abandoned cities-- things built for nobody, just to make opaque numbers bigger.
> A million dollars spent on healthcare increases GDP by a million. A million dollars spent on war increases GDP by a million. A million dollars spent on rubber dog crap increases GDP by a million. But these transactions are not of equal value.
Actually, they are. The folks who spent a million dollars on rubber dog crap could have spent it on something else but they spent it on rubber dog crap. That million dollars is the value of that rubber dog crap.
Rubber dog crap is a bad example; how about orthodontics? There most of the value is in increasing your "prettiness rank", but there is a positional externality in that you necessarily decrease someone else's rank.
A lot of orthodontics is fixing things so someone can chew correctly, reducing pain etc. I'll assume that you didn't mean those folks....
Maybe you meant cosmetic dentistry, but even there, should we forever brand someone who didn't take care of their teeth earlier in life? What about folks who were given tetracycline when they were kids (which permanently stains their teeth)?
Maybe there's a good example in dentistry, but let's see one.
As to your "positional" argument, how are you separating out the "bad" positional change from "her teeth kept us from seeing his superior skills"? Or, is the latter also bad?
If there's value to people in looking at beautiful artwork, isn't there value in looking at (more) beautiful people? Improvement of one's appearance isn't entirely relative to others. There are absolute gains.
You don't think that raising the overall self-esteem of a population has a net positive effect on their society's condition?
I mean yes you could argue that it might contribute to a class divide, but I think orthodontic treatment is much more affordable for the middle class than it was 20 years ago.
Are you sure? The millionaire rubber dog crap inventor just died and gave everything he had to malaria vaccines.
My point isn't that all dollars spent are equal, but that the world is so complicated that it is not easy to make a snap judgement about where the next big difference will come from.
In my humble opinion the Global Situation roughly is:
-> 80 % Survival mode, struggling to make ends meet (the so-called 3rd World, mostly)
-> 15 % Semi-Ok, hired hands in private and governmental institutions of various sizes (indebted wage slaves, mostly)
-> 5 % Ok, successful business owners, top managers, political leaders, rich and super-rich (various business sizes, amounts, market caps, etc... imagine any metric indicating financial and social success)
If this picture is mostly right => Perhaps, the really useless work isn't something which is rightly measured as the global number of humans busy with it.
I believe your estimates are overly optimistic. Then again, if you look at just the US, there is a huge gap between how people think wealth is distributed and reality.
The distribution is only going to get worse. I don't see the 0.1% at the top suddenly growing altruistic -- nor do I think any sort of political disruption can change this tendency in the long term. Such disruptions now seem to, more often than not, result in more assets being secured by those of means.
Whatever illusions of a democratic and fair society the 1900s instilled in us are just that: illusions. I predict that within my lifetime western society will indeed continue a trend towards the sci-fi predictions of the privately owned corporate state. To a large degree, this is already reality.
If he was being paid $28,000, he had to be producing at least $28,000 worth of value to AOL. Trying to be neutral here: the articles the AOL farmer wrote had to be of some value to produce that return (the actual required return is probably some percentage higher than the $28k). Eyes were reading the content and there was some demand for that content.
The ultimate issue should lie with the mechanisms used to drive that $28k+%return in value. If that mechanism is the Google organic funnel -> AdSense revenues, and if your demand is for quality content, isn't Google more of the problem than AOL? AOL is trying to survive and compete in a landscape where bad articles written on the cheap produce a higher aggregate value than quality articles written by authors who command a higher wage.
You're characterising value differently. I think the point being made is that the practical value of the articles is usually negligible: few who read them will get anything out of them. They're successful only because they game the existing system. They are essentially parasitic articles.
I'd agree, but isn't it somewhat unrealistic to expect companies to operate altruistically in a market where you have to dumb down writing in order to maximize profit on a large scale?
I wouldn't want to operate a business the way AOL does or the way Demand Media does, but if there is such a huge area of Google and search engines available for gaming, it's going to be gamed. There is too much incentive for some entity not to step in and profit.
Right, but nobody (well, probably a few people...) is asking human nature to change, much as we might find it distressing. I'd instead advocate changing the rules of the system to encourage actual value as opposed to the parasitic value generated by these articles.
(getting on a soapbox here) I think it's easy to forget that we don't use the capitalist system because it is somehow morally right and pure. We use it because, so far as we know, it's what works best to motivate people to generate real value for humanity as a whole. If the system stops working, then you tweak it, which is why we have regulation on things like monopolies, for example. In this case, Google has changed the system for us, and for that I'm very glad: incentives are now more aligned with producing quality content that's actually useful to someone.
Institutions are not strictly rational. It's not hard to find cases where people have been paid for 6+ months without actually doing anything other than showing up and reading a book.
'The insultingly vacuous and frankly bizarre prose of the content farms — it seems ripped from Wikipedia and translated from the Romanian — cheapens all online information.'
I doubt any decent translation would turn a random Wikipedia article into vacuous prose. The style may generally be cold, maybe 'soulless' (encyclopedic?), but more often than not there's a good amount of information and, more importantly, relevant sources.
> more often than not there's a good amount of information and, more importantly, relevant sources
Relevant sources "more often than not" on Wikipedia? As a Wikipedian, I have to disagree, based on my personal knowledge of what the stream of newly submitted articles looks like, what the articles identified for Guild of Copy Editors attention look like, and what long-standing articles on my watch list look like. I confess that I don't have a published source to back up my statement (or refute yours). Presumably, there is some legitimate ground for debate on what is a "relevant" source for a particular article. I do note that the Wikimedia Foundation itself thinks more work still needs to be done on improving article quality.
into the related articles. Most editor discussions I see on a variety of article talk pages revolve around "I don't believe this," or "You miswrote this" much more than "What's a relevant source about this?" Perhaps you read mostly articles about different subjects from the articles I read on Wikipedia, and I'd love to hear about examples of articles with good sourcing in general.
Thanks for your reply. I guess I was thinking of low-controversy articles such as http://en.wikipedia.org/wiki/Domain_Name_System , but given your background I'm sure your perception is closer to reality than mine.
More work always needs to be done, but you might be getting a skewed view of things by looking at problem articles. What happens if you pick a random sample of articles weighted by how many page views they get? That's closer to what the average reader sees.
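A rough sketch of what that kind of sampling could look like, assuming you already have per-article view counts from somewhere. The `sample_by_views` name and the numbers in the usage note are made up for illustration; this is not real Wikipedia data or any Wikimedia API.

```python
import random

def sample_by_views(page_views, k, seed=None):
    """Draw k article titles, each pick weighted by its view count.

    page_views: dict mapping title -> view count (hypothetical input).
    Sampling is with replacement, so a popular article can appear twice.
    """
    rng = random.Random(seed)  # private generator so a seed gives repeatable draws
    titles = list(page_views)
    weights = [page_views[title] for title in titles]
    return rng.choices(titles, weights=weights, k=k)
```

For example, `sample_by_views({"Domain_Name_System": 9000, "Some_obscure_stub": 12}, k=20)` would almost always be dominated by the popular article, which is the point: you end up judging sourcing on the pages people actually read.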
I think it was referring to copying content from Wikipedia, translating it into a different language and then back into English. That is probably one of the reasons Google is taking down the Translate API: to reduce this kind of manipulation.
Last week I was reading an article about producing random text that was disturbingly close to human-produced by analyzing word frequency of massive amounts of data.
It seems that it would be cheaper to produce articles this way instead of hiring low-wage workers to do it.
Hmmmm...I guess there's some money in there for someone who'd stoop to that.
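I don't know which technique that article used, so treat this as a guess at the general idea rather than its method: a word-level Markov chain, where each next word is picked from the words that followed the current one in the source text. The function names and the toy corpus are my own.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words that follow it in the text.

    Repeats are kept, so frequent followers are picked more often.
    """
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, length, seed=None):
    """Walk the chain from `start`, producing at most `length` words."""
    rng = random.Random(seed)
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: this word was never followed by anything
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)
```

Something like `generate(build_chain(corpus), "the", 50)` on a large enough corpus gives text that is locally plausible and globally meaningless, which is roughly the content farm house style.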
I've got a buddy who is a paid writer for AOL/HuffPost, and he writes great, originally researched articles. I was really sad to see recently through his author feed that he's now also churning out their signature content-less keyword articles and brief summaries that link to full blog posts on other sites. At least they're uncredited, but sad to see a good writer sell out like that.
After reading, I Googled "Rick Fox's mustache" and got the article in question as the #1 result, but isn't the gist of the article that Project Panda fixed this?
I stopped when she "grew confused" by the weird articles. Perhaps she shouldn't be viewing webpages at all. Most people would recognize this instantly as nonsense, click the back button, and select another search result.
Whoa. I knew some people sadly use Google even when they know the url, but this is just... wtf?!
No wonder there's so much phishing spam; people this uneducated about web usage must get owned all the time... Maybe it's time for primary schools to teach children the basics of how to use a computer a bit more safely. The problem is that every time I saw something like that, it was actually an advertisement for Microsoft products dispensed by teachers who didn't know what a URL is (and they're not to blame for that).
I suppose I could've explained it... It's ideal for keyboard-heavy users.
Instead of typing a search term into the search box and Ctrl+Arrow-ing to the search engine I want to use, with YubNub (in Firefox and Chrome, maybe IE?) you type in a command for what you want to do.
"g FDR" does a Google search of "FDR".
b Bing, y Yahoo, yt YouTube, CNN, ESPN, IMDB, and on and on.
There are even some slightly stronger commands:
"tr Chi Eng 晚安" uses Google Translate from Chinese to English, telling me "晚安" means "good night".
Yeah, I mostly use the "g" and "yt" commands, but I love the flexibility without having to arrange search engines. I just remember a few letters/commands. For a long time I'd resorted to typing "site:imdb.com" as my first search term into a Google search box. No more.
Also, you can define your own shortcuts! On a whim, I pointed XBLA to the Xbox Live marketplace at Xbox.com, and it works wonderfully. So now I can just type "xbla Trenched" and get to the page so I can buy the game with a click.
As sesqu said, this feature is built into Firefox. I only have a window-wide URL bar on my Firefox and extensively use these "keyworded" bookmarks.
So in my URL bar I can type "g foo" to search "foo" using Google, "wp foo" for Wikipedia fr, "wpe foo" for Wikipedia English, "imdb foo" for IMDb, "yt foo" for YouTube, "dm foo" for Dailymotion, "gdv foo" to load the document whose URL is "foo" into Google Docs Viewer, "tw" to take me to Twitter, "fb" to Facebook, "hn" to Hacker News, "mail" to Gmail, "reader" to Google Reader, "in" to LinkedIn... I don't have to keep the whole list in my head; it's mostly finger habit now.
What is really cool, since Firefox has the awesomebar, is that it is aware of this and displays the actual URL, so when I type "imdb foo bar" I see "http://www.imdb.com/find?s=all&q=foo+bar" at the top of the proposed URLs.
This is really handy for people like me who mostly use their keyboard: this powerful firefox feature is just a ctrl+L away when I'm already in firefox :-).
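For anyone curious what happens under the hood, here is a minimal sketch of the expansion step, assuming the usual convention that the bookmark URL contains a `%s` placeholder for the search terms. This is not Firefox's actual code. The `KEYWORDS` table and the `expand` function are hypothetical; only the IMDb URL comes from the example above, and the other templates are assumptions.

```python
from urllib.parse import quote_plus

# Hypothetical keyword table. Only the IMDb template is taken from the comment above.
KEYWORDS = {
    "imdb": "http://www.imdb.com/find?s=all&q=%s",
    "g": "https://www.google.com/search?q=%s",  # assumed template
    "hn": "https://news.ycombinator.com/",      # plain bookmark, no placeholder
}

def expand(typed):
    """Turn 'keyword search terms' into a URL, or None if the keyword is unknown."""
    keyword, _, query = typed.strip().partition(" ")
    template = KEYWORDS.get(keyword)
    if template is None:
        return None
    # URL-encode the terms (spaces become '+') and drop them into the template.
    return template.replace("%s", quote_plus(query.strip()))

print(expand("imdb foo bar"))  # http://www.imdb.com/find?s=all&q=foo+bar
```

Keywords without a placeholder, like "hn" here, just act as short names for a fixed URL.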
This functionality is actually built into Firefox and Chrome, and I'd be shocked if at least Opera didn't have it too. It's the "keyword" field in search engine settings.
I believe multiple parameters are also allowed, but not having used them, can't vouch.
To use an analogy, humanity is still the equivalent of a bunch of 3-year-olds alone in the middle of a busy city square as far as web maturity goes. Some might know not to wander into the car traffic but would still fall for a stranger with candy in his van.
The unification of search and URL bar in Firefox, Chrome, and mobile browsers makes this even more common, at least for me. I don't always know if the browser is going to find a bookmark, something in my history, search Google, or try to open a URL.
For what it's worth, they were definitely pushing it when I was in college, and as far as I know, it's expanding into the high school level and possibly further. And actually spreading it was an explicit goal of many librarians and librarians-to-be I knew.
It could be better, certainly, but it's also being worked on.
Design related spam seems to be an area where a stack exchange like site could step in and really take off.. lists of lists of lists of blogs of my favorite 50 photoshop brushes..
I agree, there's a lot of tiring unrelated stuff on the internet. The only way to avoid it, sadly, is to wade through it till you find what you're looking for.