
Well put.

It is an interesting salvo in what I've started thinking of as the "data war." All three companies have a huge asset in data collection capability, and preventing the others from exploiting it is only the first skirmish among them.

It will be interesting to see if Google offers to pay additional monies to Apple in order to "restore" this pipeline, and whether or not Apple will agree. In one sense, Apple already gives up a data feed by sending search queries to Google.




Data wars is an understatement. Facebook is AMAZINGLY litigious on its data.

Do this. Create a fake company and say you wrote a spider to index Facebook public profile data and that you have like say 100GB ....

Watch how fast you get sued by Facebook.

Mind you this is public data that EVERYONE can see...


> public data that EVERYONE can see

The entire concept of law is based on the premise that not everything that is physically possible should be permitted.


The web makes things interesting in that OP's hypothetical company only has that data because Facebook willingly gives it to everyone who asks. It would be obviously wrong if they were using some exploit to trick Facebook's servers into divulging secrets.


> gives it to everyone who asks

Yeah nah, that's where the concept of agreements comes in. You walk up to Fes Boock and say:

― I want to have business with Fes Boock.

― Fes Boock will have business with you if you promise to not stab Fes Boock in the back.

― I give my word to not stab Fes Boock in the back.

Turns out, this thing is so valuable, it's supported by law everywhere that I know of, in multiple forms, including rather implicit ones such as “ToS.” Which is what allows Fes to sue the stabbing bastard.


To my knowledge, making an HTTP GET request and then receiving a document does not involve agreeing to any ToS, implicitly or otherwise. If the server only wanted to send the data over an authenticated channel, then why does it send the data?


My mailbox opens and closes for my mailman to collect outgoing mail and deposit incoming mail. But anyone can open it. That doesn't mean I want them to, or that they are allowed to. But if my mailbox doesn't want to allow access to private information, then why does it open for unauthorized individuals? Because physically securing it would be a pain in the ass, most people are honest, and if I can keep my mail safe through force of law and social contract, that's easier for everyone, including legitimate users of my mailbox (myself and my mailman).


Your argument holds for mailboxes because it is not a common use case of mailboxes that their owners want complete strangers to check as often as possible because they've left something they want taken.

A better real world analogy is a bulletin board on campus or a wooden power pole.

Let's suppose that it is super common that people staple flyers to power poles, with the expectation that people will read them as they pass by. Your analogy would claim that if I staple a letter to the power pole, expecting that only my friend that I told about the letter should read it, that passers-by are doing something unseemly by reading it, while being surrounded by want ads and for sale flyers that people do want read.

Websites are nothing like mailboxes. The vast majority of websites would prefer that as many people as possible read their contents as much as possible. Email would be a better analogy.


A request is communication with certain semantic content, which pulling on a mailbox handle lacks. There is no general understanding among people nor specific agreement between you and some other party that pulling on your mailbox handle is how to ask you for access to your correspondence.

This is not the case for HTTP. A network protocol is an agreement about the meaning of certain clusters of bytes sent over a network. When someone operates an HTTP server, a reasonable person could conclude that they take HTTP messages to mean what HTTP says they mean. A lot of cases get more interesting because there is also something generally understood to mean, "Please don't access the following resources by automated scraping, independently of whether my server decides to grant those requests."
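To make that last point concrete: the generally understood "please don't scrape this" signal is presumably robots.txt (my assumption; it isn't named above). A minimal sketch of how a well-behaved client might consult it before making a request, using Python's standard urllib.robotparser. The rules, bot name, and URLs below are hypothetical, and a real crawler would fetch the file from the site rather than hard-code it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, hard-coded for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /profiles/
"""

def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# A polite scraper would check before requesting each URL.
for url in ("https://example.com/about", "https://example.com/profiles/alice"):
    print(url, may_fetch(ROBOTS_TXT, "examplebot", url))
```

Note that the server would still answer the disallowed request if asked; the check lives entirely on the client side, which is what makes it a convention rather than an access control.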


I'm pretty sure that a server, being a stupid piece of inanimate junk, is unable to enter any agreements or disagreements. In contrast, people, being endowed with free will supported by the ability to reason, need to apply said will and reason when directing actions of pieces of junk, so as to follow the same procedures of inter-party conduct as in direct interaction.

Since a web server, by its primary mode of operation, does indeed more or less indiscriminately send replies to whomever makes a request, it follows that the duty of choice lies with the client. The person operating the client has to apply their reason and follow the inter-party conduct.


> Since a web server, by its primary mode of operation, does indeed more or less indiscriminately send replies to whomever makes a request, it follows that the duty of choice lies with the client.

Sorry, why doesn't the duty of choice lie with the server owner, who chooses to put the server online in the first place? What exactly are these rules you think exist? This is the first time I've ever heard of them.

> Since a web server, by its primary mode of operation, does indeed more or less indiscriminately send replies to whomever makes a request,

This is completely false. The server owner can authenticate GET requests and return an unauthorized response if the client is not permitted to access the document. We are not talking about a situation where a hacker attempts to brute force a password or gain unauthorized access to a server. If the server is on the internet serving anonymous GET requests with no authentication the reasonable assumption is that anyone is permitted to access the data.
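For what it's worth, here is a rough sketch of what "authenticate GET requests and return an unauthorized response" can look like, using only Python's standard http.server and HTTP Basic auth. The credentials, realm, and port are made up for illustration, and a real deployment would use TLS and a proper credential store; this only shows that refusing anonymous GETs is a few lines of work.

```python
import base64
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical credentials, hard-coded for illustration only.
EXPECTED = "Basic " + base64.b64encode(b"alice:secret").decode("ascii")

def status_for(auth_header):
    """Return 200 if the Authorization header matches, otherwise 401."""
    return 200 if auth_header == EXPECTED else 401

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status = status_for(self.headers.get("Authorization"))
        self.send_response(status)
        if status == 401:
            # Tell the client which authentication scheme is expected.
            self.send_header("WWW-Authenticate", 'Basic realm="private"')
        self.end_headers()
        self.wfile.write(b"the document\n" if status == 200 else b"unauthorized\n")

def make_server(port=8000):
    """Build (but do not start) a server; call .serve_forever() to run it."""
    return HTTPServer(("127.0.0.1", port), Handler)
```

With the server running, an anonymous GET gets a 401, and only a request carrying the matching Authorization header gets the document.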


Well, if you think that it would be more reasonable and expedient to require users to read a contract beforehand and then authenticate themselves to the service before accessing any content―please, knock yourself out on your site.

It appears that the rest of the web gets by pretty well using the legal framework I've described. Because, you know, they tend to choose what is pragmatic over what merely “can be done.”


Sure, but web scraping is a thing, and one that shouldn't be illegal. Therefore if data is public, it should be assumed to be... well, publicly accessible.


> Do this ... get sued by Facebook.

"It's a bold strategy, Cotton, let's see if it pays off for him"


Apple actively avoids knowing anything. The primary difficulty in collecting nothing, and having no access to it, is that users expect not to lose data, even in the case where they’ve lost all of their devices (which could obviously be just a single one). Making that possible was the topic of Ivan Krstic’s talk at Black Hat a few years ago.


>> In one sense, Apple already gives up a data feed by sending search queries to Google.

Apple does this under protest. Their top search queries are served through Siri lately, and the hope is that Siri will replace all search so they won't need to use Google anymore.


Siri isn't a search engine, it is a front end to a search engine[1]. That used to be Bing but now it is Google (see: http://fortune.com/2017/09/25/google-bing-default-iphone/)

There was a time when the Siri folks approached Blekko (which was an actual search engine with its own index, crawler, ranking, etc.) to discuss partnering with Apple (personally I think they should have bought us :-)). But, according to people who should know, there was a cultural mental block at Apple about providing web services at the time. The biggest thing like that they had done was Apple Maps and it was a 'mixed' success. Apple didn't see itself as being a search company.

I used to point out that Microsoft had a phone (Nokia), an operating system (Windows Phone), and a search engine. Google had a phone (Nexus), an operating system (Android), and a search engine. Apple had a phone (iPhone) and an operating system (iOS).

Since that time Microsoft dropped the OS and phone, and Apple never did build a real search engine.

[1] More precisely it is a front end to a simple knowledge base, a local index of things on your device, and when those things are exhausted an internet search engine.


In Safari, when you enter terms into the search bar, the "google suggestions" section is separate from the "siri knowledge" or "siri suggested website" results, which they surface at the top. It looks like Apple generates those independently of Google.


Siri suggested website is based on what your device knows about you (that is never uploaded from your device).


and if you select a link through siri, google doesn't get your query (data).


I really can't see how Apple can continue to be competitive with Siri without entering the search business (or partnering with another company).

I've noticed several times now where Google assistant has been able to answer questions about things in almost real time all thanks to Google's crawlers.

My friend asked it earlier whether USPS delivers mail during a polar vortex and Google assistant told them they didn't yesterday, at least in Chicago.


Here's what I wanna know: to what extent does Google actually "have a phone"?

I mean, when I think about Apple, I think of a company that designs the look, the internals, the case, the glass, the board layout, and even some of the chips. (Sure, they contract the manufacture out, but Apple is deeply involved with designing components on a low level -- not merely farming it all out to some device maker in Taiwan or China.)

But for Nexus/Pixel: how much is Google and how much is LG or Samsung or HTC (yes, I know they bought HTC). I mean, how deep do Google personnel in Mountain View really go? How much do they just hand off to outsiders? Is it comparable to what Apple does? Maybe so. I just can't quite see into it.


It's a fair question. When I was there, Google was all over the design of the handsets (the original 'Dream' phone), and they did the Nexus One with HTC. After I had left, they bought Motorola Mobility, which did the Moto phones, and that group mixed in with the Android handset folks. Then Lenovo bought it from them.

Google's biggest challenge was customer support, they just didn't do the whole "someone to pick up the phone and talk to you" thing.

So I'd say they have a core capability to do handset design (perhaps some of it residual) and they likely strongly influence the hardware they sell. Is their bench as deep as Apple's? No.


Not under protest, but for profit. The current figures are not public, but Google pays billions annually to Apple to remain the default search engine on iOS.


It's easy to change, and one of the default options is DuckDuckGo.


Do you think aapl needs google's "billions"? No, they're making way more money selling "privacy" and building a solid search engine to replace google is high priority for them.


>and building a solid search engine to replace google is high priority for them.

What are you basing that on, exactly? Apple doesn't exist in a market simply to "be" in that market. That's why they jettisoned things like their Airport routers


Why does Apple exist in a market? Say phones or laptops? Not calling you out, genuinely curious about your opinion.


So they’re going to build their own search engine? Siri doesn’t do anything outside of your own data.


What if Apple bought Duckduckgo?


Then they still need to build a search engine, DuckDuckGo is just a proxy for google.


No it's not. If anything it's a proxy for Bing, but that's one of many data sources[0] (another of which being DDG's own crawler[1]). I'm not aware of Google actually being one of those data sources; you might be thinking of StartPage, which is a proxy for Google.

[0]: https://duck.co/help/results/sources

[1]: https://duckduckgo.com/duckduckbot


They also have their own crawler https://duckduckgo.com/duckduckbot and additionally use several hundred "sources" for their results https://duck.co/help/results/sources with the main source seeming to be bing. I don't think google is one of these sources.


> DuckDuckGo is just a proxy for google

Bing, I thought?


I know that Startpage uses Google. And Searx uses Bing.

But I think that DuckDuckGo uses multiple sources. Although it's easy to restrict that to Google.


It's a lot more than "just" a proxy.


It's called "siri suggested website" or "siri knowledge" when I search in Safari.



