
A Former Google Executive Takes Aim at His Old Company with a Search Startup - tim_sw
https://www.nytimes.com/2020/06/19/technology/google-neeva-executive.html
======
jka
Common Crawl's public dataset for May/June 2020[1] weighs in at 53TB
(compressed).

As of 2019-05-15, SanDisk's 1TB SD card was listed[2] for sale at $449, and
the SDUC standard[3] now allows for cards of up to 128TB.
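To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It only uses the figures quoted above (53TB compressed, $449 per 1TB card, 128TB SDUC ceiling). The function names are made up for illustration, and it ignores index overhead, decompression and filesystem slack, so treat the result as a lower bound rather than a real sizing.

```python
import math

# Figures quoted in the comment above; everything else here is an assumption.
CRAWL_SIZE_TB = 53      # Common Crawl May/June 2020, compressed
CARD_SIZE_TB = 1        # SanDisk 1TB SD card
CARD_PRICE_USD = 449    # listed price as of 2019-05-15
SDUC_MAX_TB = 128       # maximum capacity allowed by the SDUC standard


def cards_needed(data_tb: float, card_tb: float) -> int:
    """Number of whole cards required to hold data_tb of data."""
    return math.ceil(data_tb / card_tb)


def storage_cost(data_tb: float, card_tb: float, card_price: float) -> float:
    """Total price of the cards, ignoring readers, redundancy and overhead."""
    return cards_needed(data_tb, card_tb) * card_price


cards = cards_needed(CRAWL_SIZE_TB, CARD_SIZE_TB)
cost = storage_cost(CRAWL_SIZE_TB, CARD_SIZE_TB, CARD_PRICE_USD)
fits_on_one_sduc_card = CRAWL_SIZE_TB <= SDUC_MAX_TB

print(f"{cards} cards, ${cost:,} total")
print(f"Fits on a single maximum-size SDUC card: {fits_on_one_sduc_card}")
```

At the 2019 price that is 53 cards, so not cheap today, but the compressed crawl is already well under what a single SDUC card is specified to hold.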

Elasticsearch[4] is a mature, proven and freely available open source search
engine, and new search engines continue to appear.

We might still be a few years out from this in practice, but I'd argue that
the search engine that could most verifiably preserve your privacy would be
one that runs on your own personal device(s).

[1] - [https://commoncrawl.org/2020/06/may-june-2020-crawl-archive-...](https://commoncrawl.org/2020/06/may-june-2020-crawl-archive-now-available/)

[2] - [https://www.backblaze.com/blog/hard-drive-cost-per-gigabyte/](https://www.backblaze.com/blog/hard-drive-cost-per-gigabyte/)

[3] - [https://en.wikipedia.org/wiki/SD_card#SDUC](https://en.wikipedia.org/wiki/SD_card#SDUC)

[4] - [https://elastic.co](https://elastic.co)

