
Open Sourcing Active Question Reformulation with Reinforcement Learning - jamesjue
https://ai.googleblog.com/2018/10/open-sourcing-active-question.html
======
breatheoften
I haven’t read the paper yet, but from the summary the agent rewrites
questions and learns to find “more rewarded” reformulations. How does the
environment derive a reward signal that actually reflects the quality of a
reformulation? That seems as hard as the original problem to me.

~~~
The_rationalist
I skimmed the paper (too fast, so I may be talking nonsense). The technique
generates, from a question, a list of differently phrased versions of the
same question; if the existing QA system fails on one, it can retry with
another phrasing of the same request. They may also combine answers from
subquestions into the globally visible answer, but that must have many issues.
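The retry idea described above can be sketched roughly as follows. Note that `reformulate` and `qa_system` are hypothetical placeholders for illustration, not the actual components from the paper (which uses a trained seq2seq rewriter):

```python
def reformulate(question):
    # Placeholder: a real system would use a learned question rewriter.
    return [question, question.lower(), question.rstrip("?") + ", exactly?"]

def qa_system(question):
    # Placeholder QA backend: returns an answer string, or None on failure.
    known = {"who wrote hamlet?": "William Shakespeare"}
    return known.get(question.lower())

def answer_with_retries(question):
    # Try each phrasing in turn until the backend produces an answer.
    for phrasing in reformulate(question):
        answer = qa_system(phrasing)
        if answer is not None:
            return answer
    return None

print(answer_with_retries("Who wrote Hamlet?"))  # -> William Shakespeare
```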

Nothing directly rewards reformulations, but the global answer can be
rewarded by user feedback. Yes, this indirection still seems like an issue.
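A minimal sketch of that indirection: the reward attaches to the final answer (user feedback, or overlap with a gold answer), and each reformulation is simply credited with the score of the answer it produced. The token-F1 scoring and the toy episodes below are illustrative assumptions, not the paper's setup:

```python
def f1_overlap(predicted, gold):
    # Token-level F1 between a predicted answer and a gold answer.
    p, g = predicted.lower().split(), gold.lower().split()
    common = len(set(p) & set(g))
    if common == 0:
        return 0.0
    precision, recall = common / len(p), common / len(g)
    return 2 * precision * recall / (precision + recall)

# Each (reformulation, answer) episode: the reformulation inherits the
# reward of the answer it led to -- the indirection discussed above.
episodes = [
    ("who authored hamlet", "William Shakespeare"),
    ("hamlet writer name", "Shakespeare"),
]
gold = "William Shakespeare"
rewards = {q: f1_overlap(a, gold) for q, a in episodes}
print(rewards)
```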

What I would do instead of this strategy is cluster extremely similar
formulations of the same question across different users, and then store,
for each frequent common question, a list of user-generated reformulations.
Of course the list would be tailored to the user's profile.
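The clustering idea could be sketched like this: group questions that share most of their content words, keeping each cluster as a list of user-generated reformulations of one canonical question. The stopword list, Jaccard similarity, and 0.3 threshold are all arbitrary choices for illustration, not a proposal from the comment:

```python
STOPWORDS = {"the", "a", "is", "of", "what", "who", "how", "do", "i"}

def content_words(question):
    # Lowercase, strip the question mark, and drop stopwords.
    return {w for w in question.lower().rstrip("?").split()
            if w not in STOPWORDS}

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_questions(questions, threshold=0.3):
    clusters = []  # each cluster: list of reformulations of one question
    for q in questions:
        words = content_words(q)
        for cluster in clusters:
            # Compare against the cluster's first (canonical) question.
            if jaccard(words, content_words(cluster[0])) >= threshold:
                cluster.append(q)
                break
        else:
            clusters.append([q])
    return clusters

qs = ["Who wrote Hamlet?", "who is the writer of hamlet",
      "What is the capital of France?"]
print(cluster_questions(qs))
```

In practice one would likely want embedding-based similarity rather than word overlap, since paraphrases often share few surface tokens.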

I haven't said how the similarity/identity of user formulations would be
determined, but I have a couple of heuristics in mind.

