
Google Directly Embedding Stack Overflow Responses in SERPs - Mojah
http://ma.ttias.be/google-directly-embedding-stack-overflow-responses/
======
SwellJoe
Stack Overflow utterly destroyed the other answer sites for just this reason:
they make choices based on what is best for the user, not Stack Overflow.
Check the traffic of Expert Sexchange, or Quora, vs. the entire SO network. If
all you care about is short term gains, you lose to the company that wants to
build a trusted brand.

At least that's the way I think it should work out, and in this case I think
it has. There is such a clear delineation between good and evil in this
market, and such a clear leadership position held by the "good", that it's
interesting that folks are wringing their hands worrying about whether SO gets
enough impressions out of this. They don't need/want merely impressions. They
want your trust and your participation. I know where I go when I have a
question...how about you?

~~~
eru
I go to Google (or Bing), and the search engine sends me to stack overflow.

To agree with you: occasionally, quora pops up. But I actively avoid them.

~~~
avinassh
Also, Quora forces you to sign up even to view all answers/comments (or you have
to do a silly URL hack, appending _?share=1_). Secondly, the quality of answers
on the Stack Overflow network is much better than on Quora. Lastly, I have seen
some answers deleted by Quora mods for their personal reasons.* There is no way
such a thing can happen on Stack Overflow.

* This [0] is the question I remember. Directi is an Indian company, and mods who work for Directi deleted the answers which showed Directi in a bad light. It doesn't matter whether the answers were correct or not; let the downvotes decide that. This kind of censorship is not cool.

[0] - [http://www.quora.com/Is-Directi-banned-from-IIT-placements](http://www.quora.com/Is-Directi-banned-from-IIT-placements)

~~~
danielweber
_Lastly, I have seen some answers deleted by Quora mods for their personal
reasons. There is no way such a thing can happen on Stack Overflow._

First, there is no way that there is "no way such a thing can happen" on SO.
SO has moderators who are human. They have guarded themselves against becoming
paywalled by putting a CC license on everything they make, but there's _no_
technical solution that can solve moderators being human.

Second, there are people who complained specifically about the pettiness of
the mods. Google for complaints about Stack Overflow and you'll find them,
including on HN.

_EDIT_: it now occurs to me your specific complaint may have been "mods
deleting answers." But mods on SO can delete questions, answers, and comments.
You can find people talking about this in meta.stackoverflow.com, among other
places.

~~~
lambda
Note that high-reputation users can see deleted questions and answers on SO,
and petition for them to be undeleted. So there is at least a check on the
ability of the moderators to delete things for personal reasons.

Most of the complaints about deletion or closing on SO are not about petty
personal reasons, but about a general community mindset that is a bit
too quick to jump to closing imperfect questions, questions that are close to
duplicates but not quite, questions that are somewhat subjective but still
have a lot of users interested in them, and so on.

~~~
slantyyz
>> Note that high-reputation users can see deleted questions and answers on
SO, and petition for them to be undeleted.

While overall SO's a great place to get questions answered, the problem, in
my mind, is that there might be a disconnect between the high-reputation users
and the newbies: both those who ask and those who come to SO to find answers to
the questions that get flagged down or deleted.

While I don't think that any of the high rep users are newbie haters, I do
think it can be challenging for them to see things from a newbie's point of
view.

~~~
sosborn
In my experience, the standard newbie questions have all been answered. A
quick search is all it takes to find the answer. The problem, of course, is
that newbies rarely know the right questions to ask (read: search), which is
to be expected because they don't possess the right amount of domain knowledge.

~~~
slantyyz
>> The problem, of course, is that newbies rarely know the right questions to
ask (read: search)

When I'm "learning a new tech" -- let's use switching from Rails to Node as
an example -- I usually go to Google first.

Ask a question like "why would I choose Framework X over Y" in Google's
searchbox, and your top results will have the same question on SO, except that
it has been closed because the answer is opinion-based. The few answers to the
closed question are usually fairly objective and insightful and leave me
wishing that more people were able to add their 2 cents.

Sometimes newbies ask a broad question to get a big picture perspective (from
multiple people answering) that can't be provided by a couple of blog entries
found in Google results. And yes, those questions might elicit opinionated
responses, but I would argue that the responses add more value to SO than they
take away.

~~~
crdoconnor
>Ask a question like "why would I choose Framework X over Y" in Google's
searchbox, and your top results will have the same question on SO, except that
it has been closed because the answer is opinion-based. The few answers to the
closed question are usually fairly objective and insightful and leave me
wishing that more people were able to add their 2 cents.

They are often _very_ outdated too - some were closed 3, 4, 5 years ago and
never updated.

~~~
slantyyz
>> They are often very outdated too - some were closed 3, 4, 5 years ago and
never updated.

This is true, but closing a question pretty much guarantees that it won't ever
be updated.

------
kaoD
> So it appears this is happening with Stack Overflow knowing about it and
> approving it, after all — they implemented schema.org.

Not really. As far as I know, implementing schema.org means your data is
structured, not that anyone can do whatever they want with it. The real reason
Google can do it is because Stack Overflow user submitted content is licensed
under CC BY-SA 3.0.

Unfortunately Google is not complying with the license since it requires
providing a link to it and indicating which changes were made[0].

There is a mechanism in schema.org to specify licenses[1]. I couldn't find it
in SO answers' attributes, but Google shouldn't assume that means they can do
whatever they want with the content! In fact, wouldn't that mean the content
is under copyright (unless specified otherwise) and therefore not
remixable/shareable? As far as I can tell, even if those are excerpts,
Google's use does not fall under fair use.

Anyways, props to SO for choosing CC BY-SA for their user submitted content. I
think it's fair (after all SO feeds from their users) and, even if detrimental
to their interests in the short term, in the long term builds trust among
their users.

[0] [https://creativecommons.org/licenses/by-sa/3.0/](https://creativecommons.org/licenses/by-sa/3.0/)

[1] [https://schema.org/license](https://schema.org/license)
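
For illustration only, here is a minimal sketch of what declaring the license in the structured data could look like. It assumes JSON-LD output (not necessarily the markup format SO uses), and the helper name `qa_jsonld`, the question text, and the answer URL are all hypothetical. `Question`, `Answer`, `acceptedAnswer`, and `license` are real schema.org terms.

```python
import json


def qa_jsonld(question_title, answer_text, answer_url, license_url):
    """Build a schema.org Question with one accepted Answer that declares its license.

    Hypothetical helper: the structure follows schema.org, the function is made up.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Question",
        "name": question_title,
        "acceptedAnswer": {
            "@type": "Answer",
            "text": answer_text,
            "url": answer_url,
            # `license` is defined on CreativeWork, which Answer inherits from.
            "license": license_url,
        },
    }


# Made-up example data, not a real Stack Overflow post.
doc = qa_jsonld(
    question_title="How do I link to another page in HTML?",
    answer_text="Use the href attribute on an a element.",
    answer_url="https://example.com/a/1",
    license_url="https://creativecommons.org/licenses/by-sa/3.0/",
)
print(json.dumps(doc, indent=2))
```

With the license carried next to the answer text, a consumer of the markup would at least have no excuse for not knowing which terms apply.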

~~~
bryanlarsen
CC BY-SA is not the license that Google is using.

Stack Overflow added metadata to their HTML to enable Google to use the
answers in their response box. This is the primary (and probably sole) reason
for this metadata, therefore it constitutes an implicit license grant to
Google.

IANAL.

~~~
kaoD
> CC BY-SA is not the license that Google is using.

Exactly, so their license is not compatible with SO's (which requires Share-
Alike) and therefore they're violating SO's license until they explicitly
comply with CC BY-SA in those excerpts.

What matters here is which license the source content is distributed under.

> Stack Overflow added metadata to their HTML to enable Google to use the
> answers in their response box. This is the primary (and probably sole)
> reason for this metadata

SO added metadata to enable _anyone_ to use their data in _any_ way (as long
as they comply with the license).

I'm very grateful for having structured data. Even if schema.org is a Google
(+ Microsoft + Yahoo + Yandex) initiative, the idea is to structure the
content, not to give it for free.

> therefore it constitutes an implicit license grant to Google.

I don't think it works that way. I'm not sure if "implicit licensing" will
hold in court (after all, there is an _explicit_ license contradicting it), but
even if it would, the only thing schema.org implies is that the data is
structured.

Here are schema.org terms and conditions:
[https://schema.org/docs/terms.html](https://schema.org/docs/terms.html) It
never says you give them your content for free.

> IANAL.

IANAL either :)

~~~
kbenson
As the owner of the content doesn't SO reserve the right to also release it
under whatever other license they deem useful? Is there some source stating they
didn't give Google express permission to use the content in the exact way it's
currently being used?

Edit: My thanks to the people who spelled out the terms. It appears it is
CC licensed.

~~~
icebraining
_As the owner of the content doesn't SO reserve the right_

SO isn't the copyright holder of the content; the users are.

[http://stackexchange.com/legal/terms-of-
service#3SubscriberC...](http://stackexchange.com/legal/terms-of-
service#3SubscriberContent)

~~~
DanBC
[http://stackexchange.com/legal/terms-of-
service](http://stackexchange.com/legal/terms-of-service)

> 3. Subscriber Content

> You agree that all Subscriber Content that You contribute to the Network is
> perpetually and irrevocably licensed to Stack Exchange under the Creative
> Commons Attribution Share Alike license.

~~~
kaoD
> under the Creative Commons Attribution Share Alike license.

~~~
DanBC
Did you read the words before those?

> irrevocably licensed to Stack Exchange

> You agree that all Subscriber Content that You contribute to the Network is
> perpetually and irrevocably licensed to Stack Exchange under the Creative
> Commons Attribution Share Alike license.

This bit seems like SE owns the content:

> You grant Stack Exchange the perpetual and irrevocable right and license to
> use, copy, cache, publish, display, distribute, modify, create derivative
> works and store such Subscriber Content and to allow others to do so in any
> medium now known or hereinafter developed (“Content License”) in order to
> provide the Services, even if such Subscriber Content has been contributed
> and subsequently removed by You.

~~~
icebraining
Yes, it's irrevocably licensed to SE under the CC-BY-SA license. Seems clear
to me.

 _This bit seems like SE owns the content:_

Yes, you grant those rights. That's exactly what granting them the content
under the CC-BY-SA license does. You don't transfer the copyright, so they
don't "own" the content.

~~~
DanBC
But the user no longer owns the content. The user can't control what SE does
with the content.

~~~
dragonwriter
> But the user no longer owns the content.

Yes, they do.

> The user can't control what SE does with the content.

To the extent that is true (because they have granted a generous license) that
doesn't change the fact that the user still _owns_ the content, can use it
themselves without permission from SE, and further can license it to others
without permission from SE.

The original owner having granted a generous but non-exclusive license to one
party does not make the party benefiting from the license the owner of the content
or stop the original owner from being the owner of the content with all the
privileges and rights that go with that.

------
TravisJamison
Just going to leave this here:
[https://s3.amazonaws.com/images.seroundtable.com/report-
scra...](https://s3.amazonaws.com/images.seroundtable.com/report-scraper-
example-1393540705.png)

------
blazespin
I suspect this actually increases exposure and traffic to SO. If you look at
the quoted data it is definitely not enough to decide if that's the right
answer. It's going to get you to click through as you really need context,
like vote ups, compare/contrast with other answers, look at context of
question, etc. I think it's a bright move on SO's part and they really do
control what they share with Google.

It's basically a free ad for SO. Smart.

Also, it could be that Google is paying SO to do this. You never know.

~~~
df07
They are not paying us :) They did work with us to develop a new schema.org
type for Questions ([http://schema.org/Question](http://schema.org/Question))
and Answers ([http://schema.org/Answer](http://schema.org/Answer)) which we
implemented.
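
To make the Question/Answer types concrete, here is a rough sketch of how a consumer might read such markup and build an excerpt that keeps the source link and license attached. It is not a description of what Google actually does. It assumes the page exposes the data as JSON-LD, and the function name `excerpt_with_attribution`, the sample data, and the URL are hypothetical.

```python
import json


def excerpt_with_attribution(jsonld_text, max_chars=80):
    """Return a short excerpt of the accepted answer plus its source and license.

    Returns None when the markup is not a Question or has no accepted answer.
    """
    data = json.loads(jsonld_text)
    if data.get("@type") != "Question":
        return None
    answer = data.get("acceptedAnswer")
    if not answer:
        return None
    text = answer.get("text", "")
    if len(text) > max_chars:
        # Truncate long answers and mark the cut.
        text = text[:max_chars].rstrip() + "..."
    return {
        "question": data.get("name"),
        "excerpt": text,
        "source": answer.get("url"),
        "license": answer.get("license"),
    }


# Made-up markup, not taken from a real page.
sample = json.dumps({
    "@context": "https://schema.org",
    "@type": "Question",
    "name": "How do I link to another page in HTML?",
    "acceptedAnswer": {
        "@type": "Answer",
        "text": "Use the href attribute on an a element.",
        "url": "https://example.com/a/1",
        "license": "https://creativecommons.org/licenses/by-sa/3.0/",
    },
})
result = excerpt_with_attribution(sample, max_chars=20)
print(result)
```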

~~~
comboy
So do you like what they are doing here or would you prefer they didn't?

------
jasonlfunk
> So it appears this is happening with Stack Overflow knowing about it and
> approving it, after all -- they implemented schema.org. But at the cost of
> pageviews?

Having a useful answer appear at the top of a Google search with an obvious
link to your site can only help you. It builds your brand as an authority and
provides the first link for people to click for problems. I think this is a
clear win for StackOverflow.

------
stalcottsmith
Duckduckgo has had this feature for a while... is Google playing catch up?

~~~
nolok
Given the difference in size and userbase, if duckduckgo were to be the one
playing catch up on new small features like this, they would really be doing
it wrong ...

~~~
nolok
I'm not quite sure why I'm being downvoted here ? I'm not criticizing but
pointing out the obvious.

Or do you think the smaller/newer players aren't supposed to be the ones
bringing new ideas and executions but are meant to merely follow the giants
from a distance and copy them ? In search or in other fields ... Because
that's what downvoting what I said implies.

------
mark_l_watson
While it is a good point about _maybe_ affecting page views, the use of
schema.org markup is a tide that lifts all boats: great technology. While
other embedded semantic markup schemes are also very useful, it makes sense
(to me) to have schema.org be the standard that gets used.

------
ww520
Is there a search engine tuned to programming or CS search? I.e. rank
programming related results higher?

~~~
aabajian
I've been working on such a thing here:

[http://gotoanswer.stanford.edu](http://gotoanswer.stanford.edu)

The hypothesis behind the search engine is that correct answers will share
some of the same tokens in common. For example a search for "mysql error 1045"
brings up posts from stackoverflow, linuxquestions, ubuntuforums, and
serverfault that all mention an incorrect password. The "best" answer is the
one that instructs you on how to reset your password.
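
A toy sketch of the shared-token idea, for readers who want something concrete. This is not the actual gotoanswer code: the scoring rule (each token counts once for every other post that also contains it) is a naive stand-in, and the function names and sample posts are made up.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_by_shared_tokens(posts):
    """Order posts so that those sharing the most tokens with other posts come first."""
    token_sets = [tokenize(post) for post in posts]
    # Number of posts each token appears in.
    doc_freq = Counter(tok for toks in token_sets for tok in toks)
    scored = [
        # A token contributes one point per *other* post that contains it.
        (sum(doc_freq[tok] - 1 for tok in toks), post)
        for toks, post in zip(token_sets, posts)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [post for _, post in scored]


# Hypothetical posts returned for a query like "mysql error 1045".
posts = [
    "Error 1045 means access denied: reset the mysql root password",
    "I got 1045 after upgrading; the password was wrong, reset it",
    "Try reinstalling your operating system",
]
ranked = rank_by_shared_tokens(posts)
for post in ranked:
    print(post)
```

A real system would need stop-word handling and length normalization, since this version rewards common words and long posts.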

Other queries you can try are:

"linux check hard drive space" "openssl aes" "reset safari 8 on yosemite" "how
to remove spilled wine from macbook?"

I had concerns as well about whether this type of search engine "steals"
content from other sites. I consulted with a few lawyers and their legal
consensus was that this fell under fair use since I was a) Only showing a
portion (specific posts) from each site and b) Transforming the posts in a
novel way (i.e. re-ranking them). The transformative requirement is the #1
factor for fair use:

[http://fairuse.stanford.edu/overview/fair-use/four-
factors/#...](http://fairuse.stanford.edu/overview/fair-use/four-
factors/#the_transformative_factor_the_purpose_and_character_of_your_use)

------
richmarr
Seems like a short-sighted strategy, unless they have a plan to pay for the
content they're using. Impressive though.

~~~
chriswarbo
Surely relying on ad revenue from people redirected from search engine results
is the short-sighted strategy? It makes your revenue stream rely on the whims
of uninterested third-parties.

The comments on the article even imply that _Wikipedia_ "suffers" from having
their content embedded in Google search results in this way; as if Wikipedia
relies on ads, or has free bandwidth to waste, or cares more about getting
visitors than spreading knowledge.

Unlike Wikipedia, StackOverflow is a commercial venture with ads, but are we
really so jaded that we expect them to sabotage those who want to use their
content? Have CC licenses turned into nothing more than trendy logos, to which
we only pay lip-service?

SO's informal opinion-of/relationship-with Google is well-documented, for
example:
[http://www.joelonsoftware.com/items/2008/04/16.html](http://www.joelonsoftware.com/items/2008/04/16.html)
[http://blog.codinghorror.com/trouble-in-the-house-of-
google/](http://blog.codinghorror.com/trouble-in-the-house-of-google/)
[http://blog.codinghorror.com/the-importance-of-
sitemaps/](http://blog.codinghorror.com/the-importance-of-sitemaps/)

~~~
jfoster
> Surely relying on ad revenue from people redirected from search engine
> results is the short-sighted strategy?

What's the long-sighted alternative?

> It makes your revenue stream rely on the whims of uninterested third-
> parties.

Involvement with search engines makes your traffic rely on the whims of
uninterested third-parties. Ad networks will predictably go wherever the users
are. Search engines are rather unpredictable.

~~~
bryanlarsen
"What's the long-sighted alternative?"

StackOverflow's primary income source is Stack Overflow Careers.

~~~
mooreds
Yes. That's why they took more investment this year:
[http://joelonsoftware.com/items/2015/01/20.html](http://joelonsoftware.com/items/2015/01/20.html)

------
dwd
If you google "html anchor element" you get an embedded result from
w3schools.com

Needs more testing for quality control.

