
Voyages in sentence space - bkudria
https://www.robinsloan.com/voyages-in-sentence-space/
======
TuringTest
The voyage looks somewhat random, yet inspiring. I can imagine using it to
grab ideas for the overarching story arc in some storytelling.

> I am waiting at the entrance of the fantasy kingdom.

...

> I have a revelation about reality.

The main problem is that it doesn't look at all robust. A small change to a
single word in the start or end sentence produces a completely different
result. That doesn't look like a "space" at all, since intuitions about
closeness don't hold in it.

~~~
OscarCunningham
I guess that can happen sometimes in actual spaces too. The shortest path from
a point near the north pole to the south pole can change dramatically if you
slightly alter its endpoints. Given a high-dimensional space it could be that
most possible paths behave in this weird way.
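To make the sphere example concrete, here's a toy numeric sketch (pure
standard-library Python, hand-picked points; the spherical interpolation is my
own illustration, not anything from the article). Two start points about a
degree apart near the north pole send their shortest paths to the south pole
down nearly opposite meridians:

```python
import math

def slerp(p, q, t):
    """Spherical linear interpolation between unit 3-vectors p and q."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q))))
    omega = math.acos(dot)
    s = math.sin(omega)
    return tuple((math.sin((1 - t) * omega) * a + math.sin(t * omega) * b) / s
                 for a, b in zip(p, q))

def angle_deg(p, q):
    """Angle in degrees between two unit vectors."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q))))
    return math.degrees(math.acos(dot))

south = (0.0, 0.0, -1.0)

# Two start points just off the north pole, on opposite sides of it.
eps = 0.01
a = (math.sin(eps), 0.0, math.cos(eps))
b = (-math.sin(eps), 0.0, math.cos(eps))

# Midpoints of the shortest great-circle paths from each start to the south pole.
mid_a = slerp(a, south, 0.5)
mid_b = slerp(b, south, 0.5)

print(angle_deg(a, b))          # ~1.1 degrees: the endpoints barely differ
print(angle_deg(mid_a, mid_b))  # ~179 degrees: the path midpoints are nearly antipodal
```

The instability comes from the endpoints being nearly antipodal; away from
that degenerate configuration, geodesics do vary smoothly with their
endpoints.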

~~~
jimmytidey
How useful is the spatial analogy if the space in question defies all our
spatial intuitions?

I can see it has mathematical meaning, but isn't the whole post about having
intuitive understanding of sentence space?

~~~
gugagore
I think it's less about intuitive understanding, and more about having a flat,
numerical representation of a sentence that (qualitatively) satisfies some
properties.

------
xtiansimon
_"You can also increase or decrease the distance you peer into sentence space
from your initial location; as you increase it, the results get more diverse.
[...] What you’re seeing is the transition from the richness of arbitrary text
to the regularity of this particular sentence space."_

There's a huge gap in here. Mr Sloan has worked on the idea of a gradient in
language. He's worked to relate the craft of colors to sentence creativity.

But color craft on a computer is just an abstraction for picking colors out of
the gamut of screen displays. And this gamut is a subset of the visible
spectrum, which is a subset of the electromagnetic spectrum.

He didn't work at all on how this project relates to language itself. I guess
I have to read the papers.

What exactly is 'sentence space'?

~~~
YeGoblynQueenne
>> What exactly is 'sentence space'?

It's a multi-dimensional Cartesian space where each sentence occupies a single
point (i.e. a set of coordinates).

The point is that once you have such a sentence space, you can do geometry
over it, like calculating distances between sentences. The hope, of course, is
that the sentence space is constructed so as to encode the _semantic_
relations between sentences, so that, for instance, sentences close to each
other encode similar meanings.
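A toy sketch of the "geometry over sentences" idea (pure standard-library
Python; the bag-of-words encoder here is my own stand-in assumption — real
sentence spaces like the one behind the post use learned dense encoders, but
the distance computation works the same way):

```python
import math
from collections import Counter

# Toy stand-in for a learned sentence encoder: a bag-of-words count vector.
def embed(sentence, vocab):
    counts = Counter(sentence.lower().split())
    return [counts[w] for w in vocab]

def cosine_distance(u, v):
    """1 - cosine similarity: 0 for identical directions, 1 for orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (nu * nv)

sentences = [
    "the cat sat on the mat",
    "a cat sat on a mat",
    "stock prices fell sharply today",
]
vocab = sorted({w for s in sentences for w in s.lower().split()})
vecs = [embed(s, vocab) for s in sentences]

# Once every sentence is a point, "closeness" is just a distance computation.
print(cosine_distance(vecs[0], vecs[1]))  # smaller: near-paraphrases
print(cosine_distance(vecs[0], vecs[2]))  # 1.0: no shared words at all
```

Bag-of-words only captures word overlap; the learned spaces in the post aim to
place paraphrases close together even when they share no words.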

~~~
xtiansimon
Sorry if I wasn't clear--what's in there? You've mentioned 'each sentence' and
a 'multi-dimensional Cartesian space', but what _constitutes_ the 'space'?
Each sentence from where? What determines that a set of words is a sentence?

Is it a WordNet space? Is it a sampling from a corpus? A neural network
trained on what? the internet?

One tree? What's the forest? Old growth? Jungle? Alpine above the tree line?!

~~~
YeGoblynQueenne
Ah, OK, you were joking. Sorry for missing it, I had my humour removed at
birth :)

~~~
xtiansimon
"The glass is half-empty, or half-full" is not a joke, but an ambiguity of
language. “Sentence space” could refer both to the sentence itself,
numerically quantified, and to the context which makes a sentence
grammatically correct, composed of sensible words which go together. Colors
don’t walk.
Right?

My response is just a comment, but even if you disagree with my example, the
question remains—what constitutes the space the alternatives are drawn from?

------
atrilumen
Can this be used for style transfer?

I'm looking for a solution to create a consistent personality for a chatbot
where all the content is created by different team members.

------
gojomo
For a similar tool in the somewhat simpler-to-understand ‘word space’, see
Travis Hoppe’s ‘Transorthogonal Linguistics’:

[https://github.com/thoppe/transorthogonal-linguistics?files=...](https://github.com/thoppe/transorthogonal-linguistics?files=1)

Shayne Miel has observed that its results for the space from ‘sociology’ to
‘mathematics’ closely match an earlier XKCD comic:

[https://groups.google.com/d/msg/gensim/a8PF8CInRKk/6XWcp87WB](https://groups.google.com/d/msg/gensim/a8PF8CInRKk/6XWcp87WB)
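The traversal idea can be sketched with hand-made toy vectors (pure
standard-library Python; the vectors and vocabulary below are invented for
illustration — the real tool walks through word2vec vectors trained on a
Wikipedia dump): walk the straight line from one word's vector to another's
and report the nearest word at each step.

```python
import math

# Invented toy word vectors, chosen so the path king -> queen passes near
# prince and princess.
vectors = {
    "king":     [0.9, 0.8, 0.1],
    "queen":    [0.9, 0.1, 0.8],
    "prince":   [0.7, 0.7, 0.2],
    "princess": [0.7, 0.2, 0.7],
    "apple":    [0.0, 0.1, 0.1],
}

def nearest(point):
    """Vocabulary word whose vector is closest to the given point."""
    return min(vectors, key=lambda w: math.dist(point, vectors[w]))

def traverse(start, end, steps=5):
    """Nearest word at evenly spaced points on the line from start to end."""
    u, v = vectors[start], vectors[end]
    return [nearest([a + (i / steps) * (b - a) for a, b in zip(u, v)])
            for i in range(steps + 1)]

path = traverse("king", "queen")
print(path)  # passes through 'prince' and 'princess' on the way
```

With real word2vec vectors the intermediate words are whatever the corpus
places between the endpoints, which is what produces the sociology-to-
mathematics chain above.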

~~~
xtiansimon
Warmer. _"English dump of Wikipedia that was sentence and word tokenized by
NLTK."_ Wikipedia would certainly cover the morphology of a 'space'. I'm
curious how the linguistic relations are derived from 'context', which I
gather is what word2vec does. Although NLTK does have an interface to WordNet.

