
Show HN: Five Thousand Novels, Ranked by Vividness - benjismith
http://www.prosecraft.io/analysis/vividness/percentile/#50
======
christudor
This will come across as very mean-spirited but I find the idea of measuring a
novel’s “vividness” based on the kind of vocabulary it uses (or the voice of
the verb, etc.) to be completely bogus.

~~~
benjismith
The purpose of this metric is to give authors a practical tool they can use to
measure the degree of direct sensory language they use in their writing, and
compare it with authors they admire. It's not a value judgement.

There are other admirable qualities of writing (emotional, conceptual, etc.)
that are orthogonal to "vividness", and in the long run, I plan on developing
metrics for those qualities as well.

For example, take a look at the works of Jane Austen. Here's the analysis of
"Pride and Prejudice":

[http://prosecraft.io/library/jane-austen/pride-and-prejudice...](http://prosecraft.io/library/jane-austen/pride-and-prejudice/)

She's a brilliant writer, and her prose is highly emotional, but it isn't
especially vivid, according to my definition of "vividness", which is: prose
that evokes a sensory experience (with colors, textures, flavors, aromas,
sounds, and bodily sensations).
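
As a rough illustration of what a lexicon-based measure of this kind could look like, here is a minimal sketch. It is not the actual Prosecraft algorithm: the word list `SENSORY_WORDS`, the function name `vividness_score`, and the "fraction of tokens" scoring rule are all assumptions made up for this example.

```python
import re

# Hypothetical toy lexicon of sensory words (an assumption for this sketch;
# the real project's word list and weighting are not given in this thread).
SENSORY_WORDS = {
    "red", "blue", "golden", "rough", "smooth", "bitter", "sweet",
    "acrid", "smoke", "echoed", "cold", "warm",
}


def vividness_score(text: str) -> float:
    """Return the fraction of word tokens that appear in SENSORY_WORDS."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in SENSORY_WORDS)
    return hits / len(tokens)


if __name__ == "__main__":
    print(vividness_score("The acrid smoke was red"))
    print(vividness_score("It is a truth universally acknowledged"))
```

A score like this only counts vocabulary. It says nothing about emotional or conceptual qualities, which fits the point that it is a measurement and not a value judgement.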

There are plenty of ways to write a brilliant novel, and some of them involve
vivid sensory writing. But Jane Austen's brilliance comes from her handling of
emotional relationships.

For a modern example of the same phenomenon, take a look at one of my favorite
authors, Nick Hornby:

[http://prosecraft.io/library/nick-hornby/about-a-boy/](http://prosecraft.io/library/nick-hornby/about-a-boy/)

[http://prosecraft.io/library/nick-hornby/juliet-naked/](http://prosecraft.io/library/nick-hornby/juliet-naked/)

I've read every one of his novels, and they're all about human relationships,
but the prose itself isn't very vivid. Nothing wrong with that, though. It's
just a measurement.

As an author, it's helpful to be mindful of these kinds of measurements. The
same thing is true of "passive voice". Using a lot of passive voice is still a
legitimate way of writing, but it's helpful for an author to be aware of the
literary voice they're crafting:

[https://blog.shaxpir.com/thoughts-on-passive-voice-705fa4dbd...](https://blog.shaxpir.com/thoughts-on-passive-voice-705fa4dbd291)

~~~
ajmarcic
My impression is that your "vividness" metric is closed source. [0]

Your metric is wholly subjective without a derivation and formula. We have _no
clue_ what's being measured. Your results are susceptible to "Yeah, well,
that's just like your opinion man".

[0] [https://blog.shaxpir.com/writing-vivid-prose-33283e861358](https://blog.shaxpir.com/writing-vivid-prose-33283e861358)

~~~
glenstein
In these criticisms, I find a laudable impulse to protect language from being
captured by formal analysis, with an ethical impulse something like not
wanting to see an elephant caged at a zoo.

But in outcome, I always seem to see so much more detail in the positive
efforts to analyze than I do in the defenses of language as being beyond
analysis. The effort of analysis comes with deep engagement with the
different dimensions along which language can be expressive, while the gist of
the challenges to these analyses is "it's subjective. You just can't do it!"

And without even commenting on the substance of these respective arguments,
the types of output they tend to produce make me more sympathetic to those
making the effort to analyze.

~~~
christudor
I don't think this is about "protect[ing] language from being captured by
formal analysis", it's about calling out bogus analysis when you see it.

This "vividness" analysis is bogus for two reasons: (1) The idea that
individual words have objectively different levels of "vividness" completely
divorced from context (when and where the novel was written, which character
is speaking, etc.) is extremely debatable; (2) The idea that the "vividness"
of individual words makes the novel as a whole "vivid" is a logical fallacy
(the fallacy of composition).

I'm 100% behind the use of "formal analysis" to extract new insights into
literature and language – there are examples where it has been done very well
[1] – but analysis has to be robust, which I don't think it is here.

[1] [http://jonreeve.com/2016/07/paradise-lost-macroetymology/](http://jonreeve.com/2016/07/paradise-lost-macroetymology/)

------
benjismith
Here's an article I wrote, describing the idea behind the project, defining
the idea of "vividness", and explaining how the linguistic analysis works:

[https://blog.shaxpir.com/writing-vivid-prose-33283e861358](https://blog.shaxpir.com/writing-vivid-prose-33283e861358)

You can click around anywhere on the histogram chart, to see the different
percentile buckets. And you can click on any of the books, to see detailed
linguistics, including a snippet of the most vivid page in the book.

~~~
lebca
Thanks! I'm enjoying comparing novels I've read and finding surprises at
misremembering the prose of some of them. Was it a deliberate decision to not
include the ability to search for specific titles? I've been clicking through
the percentiles and realized I could save some time finding the titles if they
were all on one page (Ctrl-F) or available via a search box.

edit: manually editing the URL helps with this :)

edit2: and apparently the home page

~~~
benjismith
You can click on the logo in the upper-left corner, or you can go directly to
the homepage, [http://prosecraft.io](http://prosecraft.io), to search for
specific titles :)

~~~
Severian
I'm disappointed there is no Gene Wolfe. I'd be extremely interested in his
metrics.

------
dkuebric
How'd you assemble the corpus? Only some of these books are public domain; did
you have to buy/license the rest?

------
jmenn
Apologies if this is mentioned and I missed it, but does this account for
changes in word meaning or context over time? Earlier literature, such as
Austen, could be considered “not-vivid” unless you’re clued in to particular
hints/phrases. I’m thinking of, perhaps, the use of “Et cetera” for pudenda.

------
voidmain
Here is the "most vivid" page of the "most vivid" book:

"giants, and they were impaled by spear, lance, and crystal shard. A series of
explosive reports echoed across the battlefield as the giants stumbled upon
the Mistcloak’s tripwires, sending lethal blossoms of sharpened steel twisting
through the air. Fell and his minions moved through the giants like an
avalanche. The Under-King shifted his form to a flowing slab of stone and
crashed down upon giant flesh, pulverizing it to blue powder and red ash. Even
the animals, though weak and weary, tore into the giants with the primal fury
of the wild. Claw and fang stood with horn and hoof, wounding with equal
enmity. Beak and talon darted and gouged. The entire island of Mistgard stood
united against the foul armies of frost and fire. Devastation was rampant on
the mountain, but it was nothing compared to the wrath of the Storm Speaker.
Even the stoic Under-King was surprised at the power of the Oldest of Cubs. At
the back of the Pandyr’s armies, high atop the tallest of Fell’s battlements,
stood the lone figure of the Storm Speaker. He called forth and charmed the
very storms from the clouds beneath him and sent electric green-and-blue arcs
of lightning into the giants’ lines, blasting hundreds of their bodies off of
the battlefield and into the mist below. The world above burned. The radiant
morning light was blackened by acrid smoke, making the golden skull radiate a
brown and bloody glow. The Aesirmyr lay strewn with broken bodies: blue and
red"

------
ggchappell
This is an interesting analysis, but I have a serious problem with the strong
implication that high vividness = good.

At the rock bottom of the vividness scale, we find Jane Austen, Isaac Asimov,
Agatha Christie, C.J. Cherryh, and Danielle Steel -- all extremely popular
authors. And at the very top, we find George R.R. Martin, Roald Dahl, Poul
Anderson, Edgar Rice Burroughs, Ray Bradbury, and Kim Stanley Robinson -- also
popular, but generally not quite of the same stature as those on the first
list.

Possibly the reading public is slightly biased toward low vividness.
Meanwhile, I have at least two favorite authors on both lists.

------
sb8244
I was naturally curious about outliers, so I went to the most vivid book, which
is "Pygmy". I haven't read this book, but its style is described as
grammatically incorrect "English" written in a detached scientific tone. I
wonder how much this threw off the algorithm, causing such a high score (over
100% and nearly 25% higher than the second most vivid).

------
inputcoffee
This is great.

I wish we could do our own arbitrary style analysis on the data set sort of
the way one can do a factor analysis on a portfolio.

I would look at words that are common to Murakami and McCarthy compared
to the rest of the corpus, for instance.
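
A minimal sketch of that kind of two-authors-versus-corpus comparison, assuming you had the raw text yourself (the site does not expose it as far as this thread says). The function names and the strict "never appears in the rest" rule are assumptions for illustration; a real analysis would more likely compare relative frequencies.

```python
import re
from collections import Counter


def word_counts(text: str) -> Counter:
    """Count lowercase word tokens in a text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def shared_distinctive_words(author_a: str, author_b: str, rest: str,
                             min_count: int = 1) -> list:
    """Words both authors use at least min_count times and the rest never uses."""
    a = word_counts(author_a)
    b = word_counts(author_b)
    r = word_counts(rest)
    # Counter returns 0 for missing words, so no KeyError on lookup.
    shared = {w for w in a if a[w] >= min_count and b[w] >= min_count}
    return sorted(w for w in shared if r[w] == 0)


if __name__ == "__main__":
    print(shared_distinctive_words(
        "the well was dry",
        "a dry well in the desert",
        "the desert in a storm was",
    ))
```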

------
psalminen
Interesting project. Not surprised to see Chuck Palahniuk at the top of the
list, but was a little shocked at how much he dominated it.

------
kermittd
Incredible concept! On mobile (ios 6s) the website needs some work.

------
drenvuk
This is very cool. I want to search, can we search?

~~~
benjismith
You can search from the homepage: [http://prosecraft.io](http://prosecraft.io)

