
Amazon Kendra: Enterprise Search Service - garysieling
https://aws.amazon.com/kendra/
======
crawdog
Interesting to see more players joining the market. You can't walk into a
large Enterprise and start your search conversation with "Your developers just
build ____". Otherwise the customer will want to build it themselves.

The killer feature I haven't seen with many of these solutions is easy, out of
the box integration with internal systems (Atlassian Confluence, JIRA, Remedy,
SharePoint, FileSystem, Intranet). When you have a SaaS search engine it's
difficult to export that data... Even worse to secure it. Ironically, Plumtree
Software (bought by BEA -> Oracle) had all of this in their product in 2001.
What's old is new again... Those features are prime for a comeback.

I think this is a space where Elastic can do well with an on-prem or managed
cloud offering that is "behind the firewall", integrated with customer's
environment. Add in term vector search support, ML for document/query
understanding, and integration with customer's security model (Active
Directory) and it would be compelling.

~~~
fiddlewin
You will need very powerful hardware to deploy the deep learning models
on-prem for incremental learning.

And most of the time, while not indexing, the hardware would be sitting there
sleeping. Probably not very cost-effective for enterprises.

~~~
hueving
> the hardware would be sitting there sleeping. Probably not very
> cost-effective for enterprises.

Not to be condescending, but idle hardware isn't even on the radar as far as
waste goes in enterprises. An on-prem solution that is idle for 364 days of
the year is completely fine for most of these companies.

For the ones that do care even slightly, that's what virtual machines and
over-subscription are for.

~~~
crawdog
Also - see the rise in popularity of OpenShift/PCS/PKS - flexible
infrastructure is also catching on.

------
whitezebra
Hey HN, we're building a similar product at
[https://evertrove.co](https://evertrove.co) -- we don't have the limits
Kendra currently has, and integrate with a lot more services. We're still
early and figuring out what the pricing structure should be, but we're making
it a lot more competitive than Kendra is.

We'd love to talk to you if you're interested in using Kendra. We're also
wondering if there's more value on the Question Answering side of things, or
the document retrieval side of things? Would love your thoughts!

~~~
jamra
Is this HIPAA compliant? And how do we contact you? It's a little difficult
not having an email address.

~~~
evertrove
Sorry about that, our mistake. Please email us at founders@evertrove.co

------
MediumD
Shameless Plug Alert

Building a similar enterprise search product at
[http://landria.io/](http://landria.io/) that has a lot of additional features
& enhancements over a unified keyword index + ML.

We also have a terraform config if you would like to boot it up within your
own private cloud!

Any feedback would be greatly appreciated.

------
cj
This is cool, but much of the functionality they're demoing isn't available in
the preview. See the disclaimer at the bottom of the page:

> Kendra’s preview will not include incremental learning, query
> auto-completion, custom synonyms, or analytics. The preview will only offer
> connectors for SharePoint online, JDBC, and Amazon S3. It will be limited to
> a maximum of 40k queries per day, 100k documents indexed, and one index per
> account.

~~~
Aeolun
i.e. the preview is mostly useless for any enterprise...

------
msoad
I've worked with setting up Google Cloud Search. GCS is good for our use case
because all of our employees use Google G Suite for email, calendar, and
one-off sites. However, it took 2 years for it to be mature enough for us to
actually deploy it. We're still missing connectors for some major data sources
like Slack.

Hopefully Amazon moves faster and offers more out of the box data sources.
They are missing G Suite content that a lot of orgs are relying on these days.
Would be interesting to see what their strategy is there.

~~~
tcbasche
GCS - not to be confused with Google Cloud Storage ;)

------
denzil_correa
Probably this dictionary definition of Kendra (kendră) might make sense in
this context

kendra (IndE)

noun C

a centre for some activity (research, study, business, art, etc.)

------
citilife
Curious how it compares to my offering:

[https://insideropinion.com/](https://insideropinion.com/)

The main issue is giving access to documents, which most Enterprise customers
do not want to do... Further, most info is in employees' heads, not in
documentation.

~~~
garysieling
From the demo it looked like an alternate way to search things like corporate
portals. I.e. they're trying to improve the search that products like
SharePoint provide with some ML integration.

~~~
citilife
Very similar, it also learns from context of a conversation what is likely in
a document, a page, video, audio, etc. This alleviates the need to parse
media.

------
aerovistae
Thank God. Even today it's almost always faster to use Google than a site's
search bar. Maybe this will start to change that.

~~~
papito
Even Google's Search Appliance wasn't a game changer. Reality check - it's
DEAD.

Google disrupted the market by factoring links into its algorithm,
something that is rather meaningless in a proprietary context.

~~~
occamsrazorwit
I've heard from the industry that the issue with GSA was that it just wasn't
that good. It tried to adapt Google's core search into a box, but Google's
algorithm is only great because of scale. The box doesn't have as much of an
opportunity to learn from user interactions and thus fails to meet
expectations.

------
CodeSheikh
So the idea is I feed all of the content for my website to Kendra (hosted in
cloud) and whenever a user performs a search on my website, Kendra will return
results to me via a REST(?) call and I can display sorted results back to the
user, right? Is the index going to live locally within my ecosystem for faster
retrieval of results and Kendra can do updates to the index via some push
mechanism? To be honest, instead of bootstrapping a Lucene/Solr-esque
solution, it might not be a bad idea to ride your search on the shoulders of
Amazon's AI search giant.
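
As a rough illustration of the request/response flow asked about here, this is a minimal sketch assuming the index is hosted by the service and queried remotely on each search. It relies on the `query` operation of the boto3 `kendra` client (with `IndexId` and `QueryText`) and the `ResultItems` response shape as found in current boto3; the preview discussed in this thread may have differed. The helper name `search` and the flattened output keys are made up for the example.

```python
from typing import Any, Dict, List


def search(client: Any, index_id: str, text: str, limit: int = 5) -> List[Dict[str, str]]:
    """Send one query to the hosted index and flatten the hits for display.

    `client` is expected to behave like boto3.client("kendra"); the index
    itself stays on the service side in this sketch.
    """
    response = client.query(IndexId=index_id, QueryText=text)
    hits = []
    for item in response.get("ResultItems", [])[:limit]:
        hits.append({
            "type": item.get("Type", ""),  # e.g. ANSWER or DOCUMENT
            "title": item.get("DocumentTitle", {}).get("Text", ""),
            "excerpt": item.get("DocumentExcerpt", {}).get("Text", ""),
            "uri": item.get("DocumentURI", ""),
        })
    return hits
```

With real credentials this would be called as `search(boto3.client("kendra"), "<your-index-id>", "lunch menu")`, where the index id is a placeholder. Results come back in the order the service ranked them, so the website only has to render the list.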

------
davchana
I do not know if it is inspired by it or not, and my 2G internet is not fast
enough to open this page, but the name Kendra means "center" in Hindi, with
the exact spelling.

------
joeAtBiome
Hello everyone at HN! The team @ Biome
([https://www.trybiome.com](https://www.trybiome.com)) is building a unified
search platform for finding and organizing internal information. Biome
integrates with your existing SaaS applications (Github, Slack, etc.) to
surface content no matter where it’s stored.

If you are interested in a search solution like Biome, please feel free to
reach out so we can talk more and learn the best way we can empower your team
to be more productive.

------
collsni
Sure are a lot of product plugs going on in the comments.

------
stepstep1
Is Kendra just a wrapped Elasticsearch? Initial offering doesn't look like
much, only thing "new" is FAQ.

------
hooloovoo_zoo
It would be cool if they added this to their Kindle e-readers so you could
perform better searches of your library.

------
stepstep1
Is Kendra just a wrapped ES? Only thing novel is FAQ creation and the data
sources.

------
lovelearning
Coming from a Solr/Lucene/Algolia background, my opinions on this:

_What's good:_

==========

- Focused search for question and answer databases (such as customer FAQs)

- ML-based semantic search without requiring any explicit configuration

- Connectors for S3, AWS-hosted MySQL/PG, SharePoint. Searching data already
in the AWS ecosystem (S3, Aurora) is now easier, and likely faster and cheaper
too in some aspects like saving incoming/outgoing bandwidth

- Document-level access control at all pricing plans

- Managed search (similar to Algolia)

_What's similar to existing search systems (Solr / ES / Algolia):_

==========

- Indexing: All data has to be processed into "field:value" structure prior
to indexing

- Indexing file formats: Plain text, HTML, PDF, MS DOCX, MS PPT

- Searching: Usual boolean filters and faceting but only at field level.

- Searching: Field and value boosts for relevance, but only at index-time

- Results: Highlighting support

_What's missing:_

===========

- No multi-lingual support. Only English. Given that it's AWS, I'm very
surprised by this actually (or I've missed something in their docs)

- Can't configure text analysis for English. I feel this'll return relevant
results for formal-style content, but probably not for informal-style content
like emails.

- No connectors for common internal systems: Outlook, JIRA, Confluence

- No built-in support for CSV, XLS, JSON (that one's odd!). They'll all
require preprocessing which means additional infra costs.

- Doesn't seem to support range or query facets. I feel lack of range facets
is a big problem, especially for numerical data.

- No query-time relevance tuning

- No field-level access control

- Scores are not returned in results

- Common post-searching functionality is missing: rescoring, grouping,
clustering

_What's unknown:_

============

- I don't see any information about phrase or proximity searches. Of course,
they are usually relevance hacks in keyword-based systems, but sometimes users
really need exact phrase matches. Does their ML backend handle this somehow?

- All search systems fall short while handling proper nouns - names, places,
things, scientific names. It's possible to alleviate it to some extent using
part-of-speech aware indexing. Not sure if Kendra does it in its ML backend.
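
To make the CSV preprocessing point above concrete, here is a minimal sketch of turning CSV rows into flat "field:value" documents before indexing. It uses only the Python standard library; the function names, the `id` key and the number coercion are illustrative assumptions, not anything Kendra prescribes.

```python
import csv
import io
from typing import Dict, Iterator


def _coerce(value: str) -> object:
    """Turn numeric-looking strings into numbers; leave everything else as text."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def csv_to_documents(csv_text: str, id_field: str) -> Iterator[Dict[str, object]]:
    """Yield one flat field:value document per CSV row."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        doc: Dict[str, object] = {}
        for field, value in row.items():
            if field is None:
                continue  # cells beyond the header row have no field name
            value = (value or "").strip()
            if value:  # skip empty cells rather than index blank fields
                doc[field] = _coerce(value)
        doc["id"] = str(row[id_field]).strip()
        yield doc
```

For example, `list(csv_to_documents("sku,title\n42,Widget\n", id_field="sku"))` gives one document per row, which could then be handed to whatever ingestion call the search system offers. This is the step that would need its own compute somewhere, which is the infra cost mentioned above.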

------
xfalcox
Damn this is a really expensive alternative to Algolia.

~~~
jpadkins
it's internal enterprise search, not site search. harder problem.

~~~
hueving
Is it? The main difference is additional connectors and access control
filters. Neither is really hard from a technology standpoint.
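
For reference, an access control filter in its simplest form can be sketched as below. This is an assumption-laden toy, not how Kendra or any specific product implements it: `Hit`, `allowed_groups` and `filter_by_access` are hypothetical names, and each hit is assumed to carry the set of groups allowed to read it.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Hit:
    doc_id: str
    score: float
    # Groups allowed to read the document; empty means nobody (deny by default).
    allowed_groups: FrozenSet[str] = field(default_factory=frozenset)


def filter_by_access(hits: Iterable[Hit], user_groups: Iterable[str]) -> List[Hit]:
    """Keep only hits whose ACL shares at least one group with the user."""
    groups = frozenset(user_groups)
    return [hit for hit in hits if hit.allowed_groups & groups]
```

The sketch filters after retrieval, which keeps it short. A real deployment would more likely push the group filter into the query itself, since filtering afterwards throws off result counts and paging.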

~~~
softwaredoug
It's a nightmare when it comes to the relevance of all those things, with an
extremely long tail of queries, ranging from the lunch menu to narrow,
domain-specific questions about the company's business. There's
usually not enough traffic to figure out what a good result is for a given
query. And users would usually rather complain than be part of the solution.
On top of that, doing relevance across federated results is a big
headache...

Very different than a consumer-facing e-commerce product search, or searching
your blog, etc

------
mlboss
What kind of technology might they be using for this?

------
vkaku
Where's the AWS Kitchen Sink Service? :)

------
genS3
do they use the elastic fork they did a while ago?

~~~
arnocaj
They explicitly mention Question Answering. Could it be that they use
something like BERT trained with the SQuAD dataset, and fine-tuned on
additional content? If so, BERT is very intense in terms of required GPU
hardware...

~~~
genS3
pretty sure they use some of the BERT + rules + a classic search engine + a
LOT of marketing kool-aid

~~~
mfrye0
Can you elaborate on "BERT + rules + a classic search engine"?

We're due to upgrade our basic Postgres-based text search to a formal search
solution and are investigating different options.
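
One possible reading of "BERT + rules + a classic search engine" is a three-stage pipeline: keyword retrieval for recall, hand-written rules to adjust scores, then a neural model to re-rank the short list. The toy sketch below only illustrates that shape and says nothing about what Kendra actually does. Term overlap stands in for a real engine's scoring, the FAQ boost is an invented rule, and `model` is a placeholder callable where a BERT-style relevance scorer would go.

```python
import re
from typing import Callable, Dict, List, Set, Tuple

Doc = Dict[str, str]
Scored = List[Tuple[float, Doc]]


def _tokens(text: str) -> Set[str]:
    return set(re.findall(r"\w+", text.lower()))


def keyword_retrieve(query: str, docs: List[Doc], k: int = 10) -> Scored:
    """Stage 1: classic recall step, scored by query-term overlap."""
    terms = _tokens(query)
    scored = [(float(len(terms & _tokens(d["text"]))), d) for d in docs]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def apply_rules(query: str, scored: Scored) -> Scored:
    """Stage 2: hand-written rules, here doubling FAQ entries when the query is a question."""
    is_question = query.strip().endswith("?")
    return [(s * 2.0 if is_question and d.get("kind") == "faq" else s, d)
            for s, d in scored]


def rerank(query: str, scored: Scored, model: Callable[[str, str], float]) -> List[Doc]:
    """Stage 3: add the model's relevance score and sort the short list again."""
    rescored = [(s + model(query, d["text"]), d) for s, d in scored]
    rescored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in rescored]


def hybrid_search(query: str, docs: List[Doc],
                  model: Callable[[str, str], float], k: int = 10) -> List[Doc]:
    return rerank(query, apply_rules(query, keyword_retrieve(query, docs, k)), model)
```

The point of the split is cost: the expensive model only sees the `k` candidates that the cheap keyword stage lets through, so the GPU-heavy part runs on a handful of documents per query rather than the whole index.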

