
Proselint - g1n016399
http://proselint.com/
======
IanCal
This sounds interesting. As a bit of constructive criticism, please put some
examples high up.

You tell me it does cool things. Great, show me. I've looked about on the
various pages and can see only one example and I don't understand it:

    
    
        text.md:0:10: wallace.uncomparables Comparison of an uncomparable: 'unique' can not be compared.
    

What's the context of this, what's the error it would have caught in my
writing?

The tool is in a perfect place to show this off as it's text.

~~~
shawabawa3
Some people feel you should _never ever_ say things like "more unique", "most
unique" etc

Which I think is just as misguided as trying to force "data" to be plural,
or claiming that "less than 3" is wrong

~~~
amelius
In a mathematical context, something is either "unique" or it is not. There is
no in-between state.

But you can easily define it to mean something else. And you can even make
"uniqueness" comparable.

~~~
stared
In a mathematical world you operate with abstract objects. In the real world -
you need to abstract things; before that _everything_ is unique; after that -
well, depends on your abstraction. So unless you talk about mathematics,
things can be more or less unique.

~~~
MawNicker
This is an excellent and subtle comment. You seem like someone with a tolerance
for philosophical nit-picking. Please forgive me if I'm mistaken.

Instead of saying every _thing_ is unique we could simply say that there is no
_thing_. A _thing_ is itself an abstraction. The concrete world is without
inherently distinct _thing_ s. We must abstract _thing_ s for "unique" to
describe some _thing_ at all. As you implied, this process is arbitrary. Every
way in which you could abstract _thing_ s implies a distinct notion of
"uniqueness". To simply select one "uniqueness" (like mathematics) is
arbitrary. But to consider every possible "uniqueness" equally is also
arbitrary. Without prioritizing forms of "uniqueness" we can only construct a
partially ordered set. So when you void a fixation on mathematics, things can
be more, less or "incomparably" unique.

I suspect most pairs of things are incomparably unique. Further, I suspect
most binary qualities are predominantly incomparable. I don't know that you
should _never say_ things like "more unique" but it might be fair to issue a
warning in a prose linter. Any binary quality used as a continuum requires an
arbitrary combination of its distinct forms. If this isn't specified then it
only has meaning for those who already know what it is.

~~~
mbrock
Some philosophers, thinking especially of Graham Harman, have started reacting
against the now sort of commonplace idea that "there are no things (or
objects) in reality."

From a common sense perspective, it's obvious that there are things. Sure, you
can point out the flux and decay of all entities, but still, this table here
is a coherent thing even if it's made from parts in a temporary arrangement.

In some sense, philosophy itself is destroyed when you go down the path of
denying objects, since philosophy crucially deals with concepts, and concepts
are "thought objects."

Harman describes two modes of denying objects: undermining and overmining.
Undermining is the tendency to say "really, this object is just a composition
of these other particles," while overmining is the tendency to say "this
object is just a modulation in a grand monistic entity."

Instead of that, he recommends an ontology of objects that's pretty
interesting and fun to read about. He would, I think, agree that objects are
unique in that they are (in programmer jargon) "pointer equal" to only
themselves... and each real object, for that reason, has an infinity of
potential that's never exhausted by any "arbitrary" perception of it... yet
still, we perceive other objects not directly, but through aesthetic
caricatures, and on that level you might have different degrees of uniqueness.

~~~
MawNicker
Thank you very much for this comment. I'm an armchair philosopher and I hadn't
heard of Graham Harman. His notion of objects is beautiful. In one motion
nihilism both compels me to accept my sins and deprives me of any path to
salvation. Harman's objects capture the essential impetus of nihilism without
ultimately voiding conception. In fact, they even capture the paradox of
nihilism. The denial of objects _necessarily_ implies an objective system:
dualism. First there is an object contriving infinitely varied "caricature
objects". Then there must be another object that is (infinitely) not any of
those. This expression of our relationship to The Great Unknowable Reality is
much saner. It doesn't overmine. It doesn't undermine. It doesn't leave me
oscillating between affirmation and denial. Also, most importantly, I'm given
a clue to further knowledge. I _am_ that contriving object. This is just a
caricature of reality. My participation in its consideration is entirely
arbitrary. I'm haunted by the concern that knowledge exists which cannot be
captured by this freedom. But for now these objects certainly get us further
than _nothing_. ;)

------
jonstokes
I'm a writer and editor, and I dislike the idea of this tool quite a bit.

1\. Writing isn't coding. In coding, you can do various types of "cargo cult
programming" and "copypasta" and what-have-you -- in other words, as long as
the code runs you don't necessarily have to know why or how a programming
idiom or convention works, or how/why expressing it one way in code is better
than expressing it another way in code. This is definitionally untrue with
writing. If you don't know the why/how of something, then it's better for you
to botch it and let the reader attempt to parse it so at least they know what
they're dealing with and how to interpret it ("oh, this guy's a non-native
speaker, so I'll adjust my reception accordingly" or "ah, this person is kind
of clueless about the whole sexist language thing, which is good info for
me.").

2\. 90% of writing style advice falls into one of two categories: a) hotly
debated, and b) totally wrong. Most of it is in the latter category, and this
includes Strunk & White (just use google for numerous takedowns of that text).
I looked through the PR queue and saw that it consists of eager coders finding
style advice from various sources and trying to work that into the tool. That
is terrible, terrible, terrible... This will guarantee that the tool will
represent a collection of awful writing advice gleaned from dubious sources
and wielded with unforgiving ignorance.

This tool may be a terrible idea, but the idea of automated prose linting is
not terrible. Most beginner to intermediate writers have tics, and as an
editor I often have a couple of writer-specific find/replace things I do when
I get a new piece from a particular writer (e.g. "this person uses 'however'
when she means 'but', and this person overuses these four business jargon
terms", etc.). If editors were able to easily compose and execute
writer-specific linters from within something like Wordpress, that would probably be
pretty great.
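
A minimal sketch of what such a writer-specific linter might look like in Python. The writer names, the rule lists, and `lint_for_writer` are all made up for illustration; none of this is part of proselint or Wordpress.

```python
import re

# Hypothetical per-writer rule sets: each rule is (regex, note for the editor).
WRITER_RULES = {
    "alice": [
        (r"\bhowever\b", "often means 'but' for this writer; check the sense"),
    ],
    "bob": [
        (r"\b(synergy|leverage|paradigm|bandwidth)\b", "overused business jargon"),
    ],
}


def lint_for_writer(writer, text):
    """Return (line_number, column, matched_text, note) for each rule hit.

    Line numbers and columns are 1-based. Unknown writers get no rules.
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for pattern, note in WRITER_RULES.get(writer, []):
            for match in re.finditer(pattern, line, flags=re.IGNORECASE):
                hits.append((line_no, match.start() + 1, match.group(0), note))
    return hits
```

The point of keeping the rules in a per-writer table is that an editor only ever sees flags for tics that particular writer is known to have, and can still ignore any individual hit.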

But this particular command line tool is destined to be either totally unused
or massively abused.

I'm sorry, I hate to be mean... or, actually, there is a small part of me that
enjoys playing Mr. Party Pooper when I see a mob of enthusiastic programmers
trying to tie down some great cultural Gulliver with a thousand tiny little
automated, black-and-white rules.

~~~
suchow
Thanks for the feedback. These are issues we've thought about, and we came to
different conclusions:

re 2, you'll see at
[http://proselint.com/approach/](http://proselint.com/approach/) that one of
the guiding principles of Proselint is that we defer to experts. In practice,
that's meant almost all the advice comes from Bryan Garner's usage guide,
Garner's Modern American Usage. He is a careful compiler of advice and you'll
find that he is almost never "totally wrong", and when his advice is debated,
he knows it, notes it, and provides a thoughtful discussion.

re 1, we think of Proselint as eventually being useful as a training tool, a
way to learn the conventions. Note that natural languages are large, with so
many low-frequency terms that nobody can learn the whole language. Why err if
an automated tool can help? Consider for example demonyms, what you call
people from a certain place. How many people know, for example, that people
from Manchester are Mancunians, not Manchesterians? Rather than call someone
by the wrong name, with Proselint the voice of an expert gently corrects you,
and you learn a cool new word.

We aren't a mob of programmers, we are three people who love language, respect
it, and think we're 2% of the way to making a great tool, one that The New
Yorker could run over its stories to flag issues that its own editors would
flag anyways. (In fact, we've done this, running Proselint over a corpus of
highly vetted text, and have found numerous issues.)

~~~
jonstokes
Calling someone from Manchester a "Manchesterian" instead of "Mancunian" is
not wrong, or even necessarily bad. Rather, it communicates something to the
reader. Depending on the context, it could mean this person doesn't know that
the correct term is "Mancunian", and did not look it up or even know that it
should be looked up, all of which gives me useful info and context about the
writer and their education level and the amount of effort they put into the
piece and the amount of editing it underwent and so on. At the very least I
can surmise that the writer is not a Mancunian. Or, it could mean that the
writer is attempting to be clever.

Widespread use of proselint to correct this type of thing wouldn't improve
writing. Rather, it would just add another interpretive option to the above
range of scenarios, i.e. "ah, I can tell that this writer did or did not run
that proselint tool before submission, because their text is or is not
littered with boilerplate proselintisms."

The way to improve genuinely bad writing is not with rules and tools -- it's
with lots of reading, a little mentorship, and lots and lots and lots of
practice.

~~~
amperser
> Calling someone from Manchester a "Manchesterian" instead of "Mancunian" is
> not wrong, or even necessarily bad. Rather, it communicates something to the
> reader. Depending on the context, it could mean this person doesn't know
> that the correct term is "Mancunian", and did not look it up or even know
> that it should be looked up, all of which gives me useful info and context
> about the writer and their education level and the amount of effort they put
> into the piece and the amount of editing it underwent and so on. At the very
> least I can surmise that the writer is not a Mancunian. Or, it could mean
> that the writer is attempting to be clever.

If the only goal of writing were to allow accurate assessment of the writer,
then I would agree. But there are other reasons for writing — informing,
persuading, clarifying, &c. — where writing clear, consistent, and idiomatic
prose can help. Yours is a condemnation of all attempts to improve writing
beyond the first-draft capabilities of the author.

> The way to improve genuinely bad writing is not with rules and tools -- it's
> with lots of reading, a little mentorship, and lots and lots and lots of
> practice.

Agreed, Proselint is not the right tool to improve genuinely bad writing.
Reading great authors and sweating through drafts is what we'd recommend to
get better at the craft, too.

------
rosser
I can see a lot of value for this sort of tool, and might even play with it
myself, for the sake of evaluating whether or not to incorporate its _suggestions_
into my writing. At the same time, however, I have some wariness that its
widespread use could actually have a shaping, and specifically _homogenizing_,
effect on language. For me, a large part of the beauty of language is how
facile it is, how judiciously breaking its rules can create a more artful and
compelling means of expression than linted — if you will, "prosaic" — prose
seems likely to offer.

~~~
lauritz
I agree!

But still, it corrects incorrect things that my spell checker doesn't see,
like inconsistent spacing and 'goofy approximations' like (R) for ®. (Depends
on your definition of incorrect, but I personally would not mind at all if
these things were homogenized for everyone, it would not take any richness out
of the English language).

What I'd like (--help doesn't list such an option) would be to be able to
enable some checks with a flag while disabling other parts (the ones that
contain suggestions you can elect to break).

~~~
nether
That's cool but it sounds like this tool is way oversold. It namedrops DFW and
other great authors then shows examples of it correcting spacing and "brb."
This isn't stylistic revising that takes you closer to those writers, it's
just simple corrections.

------
dcw303
This sounds promising, but I think a lot of potential users would be deterred
by the lack of examples.

This positively screams for an online interface to test drive.

~~~
train_robber
[http://proselint.com/write/](http://proselint.com/write/)

~~~
biturd
Are you claiming you can paste in your own copy, and it will run against it? I
see no text area in Chrome or Safari, what am I missing?

~~~
Piskvorrr
You're missing a stupid CSS trick; apparently everything should be flat flat
flat nowadays, even if that means throwing UX out of the window. The sample
text is editable, even though it looks as if it's not.

~~~
biturd
Thanks, I feel dumb and I feel it is dumb design. I should've known, and did
figure it out eventually, but that won't pass the grandma test, and yes, it is
a CLI tool, but so what. If a wannabe developer can't figure it out, the CLI
tool may have just as many different conventions from other CLI tools as well.

------
pron
Probably a stupid nitpick, but this bothers me:

> detecting grammatical errors is _AI-complete_ , requiring human-level
> intelligence to get things right.

(emphasis mine)

First, there's a problem of usage. When in CS we say that a problem is
_class_-complete (like NP-complete), we mean that the problem belongs to the class
(which in this case is true, because human-level intelligence can check
grammar), but also that it is _class_-hard, which informally means "at least
as hard as the hardest problems in _class_", and more formally means that any
other problem in _class_ can be cheaply reduced to the problem, and so finding
a suitable solution to the problem amounts to finding a suitable solution
to all other problems in _class_. Not only is checking grammar not known to be
"AI-complete" then, we don't even know that human-level intelligence is
necessary to solve it.

But the reason this bothers me, even though I fully understand the statement
was made informally, is a little deeper than that: we don't even know what
"human-level intelligence" (or intelligence in general) is, let alone what AI
means. That people refer to AI as if it's a thing rather than a very vague
notion clouds how people think of AI research as well as intelligence. I
would have simply said "we don't know of good algorithms to dependably check
grammar, and this appears to be a very hard problem that may require
intelligence".

------
MichaelBurge
If you're on Ubuntu, you want to run 'pip3 install proselint' rather than 'pip
install proselint'.

I ran it on a couple of 800-word emails and it didn't catch anything except me
using 2 spaces instead of 1 in one place. I also ran it on my city's sidewalk
maintenance ordinance, and it didn't report anything.

~~~
mdpacer
One of the goals of proselint is to minimize the number of false positives
that traditionally clutter the results of style checkers, resulting in users
ignoring the changes when they see them. We want to be reasonably certain
before raising an alarm. You can read more about the precise metric[^fn1] we
use here: [http://proselint.com/lintscore/](http://proselint.com/lintscore/).

And yes, `python3` for the win. :)

[^fn1]: If you wanted to be truly precise, it's a parametric family of
metrics.

------
czechdeveloper
Does anyone know of a similar tool for scientific papers? Specifically to
help non-native English speakers write high-quality scientific papers?

~~~
rodion
Something along these lines:

[http://matt.might.net/articles/shell-scripts-for-passive-
voi...](http://matt.might.net/articles/shell-scripts-for-passive-voice-weasel-
words-duplicates/)

[https://github.com/bnbeckwith/writegood-
mode](https://github.com/bnbeckwith/writegood-mode)

------
MatthewWilkes
While the idea is interesting, I do worry about the proliferation of linting
to prose, especially the hint about being authoritative near the end of the article.
Linters turn guidelines into steadfast rules in programming, removing all
ability to use judgement if you want your PR merged. I personally want less of
that, not more.

~~~
pablasso
How is standardization a bad thing in programming? In prose I can see the
argument, but in programming you should always aim for standardization for
code maintenance.

~~~
MatthewWilkes
For example, PEP 8 (the Python style guide) recommends 2 blank lines around
top-level functions and classes and 1 around methods. Linters enforce this. However, this can be a
detriment to readability in some cases, such as closures or classes that have
no body, only superclasses.

Some might say you can mark lines as not being linted, but that then makes the
change vulnerable to bikeshedding. For some people, being able to force the
conversation to not happen because the linter is authoritative might be good,
personally I prefer to follow the guidelines but be aware of the fact that
they are there to aid in understanding for future coders not to adhere to a
standard.

------
kbenson
Ah, another part of my brain I can offload to an external source. It will be
interesting when we get to "social-lint", so those of us that are no good at
social interactions (through lack of ability or lack of willingness to spend
the effort to combat that) or that feel they spend far too much
brainpower on social interactions to make up for lack of natural ability can
benefit.

------
yitchelle
Can someone explain in layman's terms how this is any better than an app like
the Hemingway Editor [0]? Both analyse the text and make suggestions to
make it better.

[0]- [http://www.hemingwayapp.com/](http://www.hemingwayapp.com/)

~~~
hk__2
Hemingway is an _editor_ while Proselint is a _tool_. The latter can be
integrated in any editor. That’s the main reason I ditched Hemingway (the
editor) because I couldn’t just copy/paste text in it to get some suggestions.

~~~
banach
In what way were you not able to copy/paste into Hemingway to get suggestions?

~~~
hk__2
I was; it was just tedious.

------
squimmy
I question how useful a tool like this is for a skilled writer.

Prose isn't code.

Many key elements of good writing are based around the idea of knowing the
rules, and then _carefully breaking them_.

~~~
routerl
A linter doesn't _prevent_ breaking its rules, it just _notifies_ the writer
of which rules are being broken.

I was writing some C earlier and my linter warned me about "incrementing a
void pointer". However, I understood the context better than my linter, knew
that I'd be compiling with gcc (which allows void pointer arithmetic), so I
ignored the warning and carried on. My code compiled and ran nicely.

When it comes to static analysis, I think (creative) writers, like
programmers, wouldn't care about warnings. This is already true of spell-
checkers (e.g. my letter-writing character is English, but my text-editor's
yelling about "colour").

~~~
chei0aiV
Sounds like your system is in US English rather than the variant of English
you are used to?

[http://grammarist.com/spelling/color-
colour/](http://grammarist.com/spelling/color-colour/)

~~~
routerl
Sorry, I guess that example was too terse.

I was referring to a hypothetical American creative writer, writing a scene in
which a British character writes a letter. In this hypothetical work, written
in US English, there would then be a section of text that used UK English
spellings. The naive spell-checker would not understand the context, and would
flag these as misspellings.

This was meant to be analogous to my "incrementing a void pointer" example;
the static analysis tool produces warnings which the author knows to ignore.
In the C programming case, my function was passed the size of the objects
comprising the array pointed to by the void pointer, so the linter was wrong
to tell me I was making a mistake. Similarly, the spell-checker was wrong to
say "change this instance of 'colour' to 'color'".

Similar considerations apply to prose linters.

Polonius would be a lesser character if shed of cliches, and a good writer
would know to ignore the linter's opinions on the matter.

------
vpontis
Can someone who has tried this share their experience?

It sounds really awesome but it's very hard to tell if it's going to be more
annoying or more useful. Maybe it would be useful to have some example linting
errors on the homepage.

Either way, I really love the idea!

~~~
vpontis
Hmm, I tried it out. Doesn't seem too useful yet and there is some polishing
to be done so hopefully this continues to go through further development!

One needed improvement: display the offending line on errors. Then you don't
have to toggle between file and console to contextualize the errors.
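
Roughly what I have in mind, as a Python sketch. The header format is loosely modelled on the sample output quoted elsewhere in this thread; `format_with_context` and the 1-based line/column convention are assumptions of this sketch, not how proselint reports positions.

```python
def format_with_context(path, text, line_no, column, check_id, message):
    """Render one warning followed by the offending line and a caret marker.

    line_no and column are 1-based here (an assumption of this sketch).
    """
    lines = text.splitlines()
    offending = lines[line_no - 1]
    # Pad with spaces so the caret sits under the flagged column.
    caret = " " * (column - 1) + "^"
    header = f"{path}:{line_no}:{column}: {check_id} {message}"
    return "\n".join([header, "    " + offending, "    " + caret])


if __name__ == "__main__":
    sample = "This is very unique.\n"
    print(format_with_context("text.md", sample, 1, 9,
                              "demo.uncomparables", "'unique' can not be compared."))
```

The caret alignment assumes the line contains no tabs; a real implementation would have to expand or account for them.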

------
stared
Is it already in Atom or Sublime Text?

EDIT: I must be blind - they mention an ST plugin (although they don't link to
it). [https://packagecontrol.io/packages/SublimeLinter-contrib-
pro...](https://packagecontrol.io/packages/SublimeLinter-contrib-proselint)

~~~
vikeri
"There’s a plugin for Sublime Text." Didn't see anything about Atom though.

------
synthmeat
Here's a suggestion...

Have copy on web site be intentionally incorrect, red-underlined with (small
modals? tooltips?) that show what's been corrected/suggested by the tool.

~~~
aroberge
Like [http://proselint.com/write/](http://proselint.com/write/) ? ... which is
also editable

------
gepoch
See also write-good: [https://github.com/btford/write-
good](https://github.com/btford/write-good)

~~~
ayushgta
Gitbook has open sourced their proofreader at
[https://github.com/GitbookIO/rousseau](https://github.com/GitbookIO/rousseau)

------
nmstoker
Looks really interesting. I'd done some preliminary investigation into whether
this kind of concept might work for the style guide at my company, but I never
got time to take it further.

Is there any word on business model / the intentions of the developers? Is it
something that's being open sourced and then integration assistance would be
commercialised?

------
kmfrk
This is very cool and needed, thank you.

Could you include a sample .proselintrc? rc files tend to have very different
opinions on how to be formatted: dictionaries, JSON, bash-argument syntax, and
so on. (EDIT: Ah, found one:
[https://github.com/amperser/proselint/blob/cd428bb0ecc5530c1...](https://github.com/amperser/proselint/blob/cd428bb0ecc5530c1e2b269e993f1a57d1e8ff21/.proselintrc).
Can’t quite get it to ignore butterick, though.)

I find it a little curious that you use a Markdown example and lint for curly
quotes and unicode ellipses by default (butterick), since Markdown discourages
such pre-formatting in its syntax, but that’s just hairsplitting, and I can
tell by your swelling Issues count that you have plenty of that as it is. :)

Looking forward to some formatting/syntax highlighting in the CLI output, but
I know you have your hands full as it is.

------
joncp
Tried it with "I'm better then you" and it didn't complain.

Nice idea, but you need to catch homophone errors.
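
For this particular case a crude pattern would already do. A minimal Python sketch, assuming a small hand-picked list of comparatives; `COMPARATIVES` and `find_then_for_than` are illustrative names, not anything proselint ships.

```python
import re

# Illustrative, hand-picked list; a real rule set would need to be far larger.
COMPARATIVES = ["better", "worse", "more", "less", "rather", "bigger", "smaller"]

# A comparative followed directly by "then" is usually a slip for "than".
THEN_FOR_THAN = re.compile(
    r"\b(" + "|".join(COMPARATIVES) + r")\s+then\b", re.IGNORECASE
)


def find_then_for_than(text):
    """Return (offset, matched_text, suggestion) for each suspicious phrase."""
    hits = []
    for m in THEN_FOR_THAN.finditer(text):
        hits.append((m.start(), m.group(0), f"{m.group(1)} than"))
    return hits
```

It will misfire on sentences where "then" really is meant (e.g. "we had more then"), so it only makes sense as a warning, not an automatic fix.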

------
raphman_
Are there any plans to support rules for texts written in other languages
(e.g., German)? Would a set of such rules fit within the scope of this project
or is proselint purposely or inherently limited to English prose? (@suchow)

~~~
suchow
It's out of scope for now, but only because we don't have any native speakers
of other languages helping us out with the project, and this stuff is hard
enough to get write in your native tongue; otherwise it's on the table.
Interested?

~~~
Singletoned
> this stuff is hard enough to get _write_ in your native tongue

Was that deliberate?

~~~
Tepix
"get write" is an error proselint could easily catch.

------
segphault
The main problem with a tool like this is that it needs to understand sentence
structure in order to find a lot of common anti-patterns. Without some natural
language processing, it's just going to be able to scan for word usage and
simple things that you can catch with a regex. You could probably build
something a lot more sophisticated on top of something like Apple's
NSLinguisticTagger and related APIs.

After testing this against a dozen of my blog posts, I'm not terribly
impressed with the output. I get more immediate value out of MarkedApp's
keyword drawer and word repetition visualization.

~~~
suchow
You're right, but the problem is much worse than that. Examining 200 entries
from Garner's Modern American Usage at random reveals that half of them are
easy to implement, the kind of thing that could be assigned as a homework
problem (e.g., recognizing that “$10 USD” is redundant, that “very unique” is
comparing an uncomparable adjective, or that people from Michigan are called
“Michiganders”, not “Michiganites”). Thirty percent are moderately
challenging, requiring a week’s effort. Fifteen percent are hard — they are
entire projects, requiring advances in AI. And the remaining advice (around
five percent), the best kind, is AI-complete. Consider, e.g., "John hit Peter
only in the nose". Does this mean that, of all Peter's body parts that could
have been hit, John hit only Peter's nose? Or is it a grammatical error that
was supposed to convey that, of all the people John could have hit, it was only
Peter who he did hit?
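
To give a feel for the "homework problem" half, here is a minimal Python sketch of the "$10 USD" case. The regex and `find_redundant_currency` are assumptions made for illustration and are not the actual proselint check.

```python
import re

# A dollar sign, an amount (with optional thousands separators and decimals),
# then "USD": the currency is stated twice.
REDUNDANT_USD = re.compile(r"\$\s?\d[\d,]*(?:\.\d+)?\s?USD\b")


def find_redundant_currency(text):
    """Return every '$<amount> USD' phrase found in the text."""
    return [m.group(0) for m in REDUNDANT_USD.finditer(text)]


if __name__ == "__main__":
    print(find_redundant_currency("It costs $10 USD, or USD 10, or $1,250.50 USD."))
```

Deciding whether the redundancy is actually a problem in a given context is the part a regex cannot do.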

We're interested in incorporating deeper NLP. In particular, we've been eyeing
[https://github.com/spacy-io/spaCy](https://github.com/spacy-io/spaCy).

~~~
techdragon
Furthering the complexity of this topic...

While "$10 USD" may be redundant in a newspaper published in the USA, it's
immensely useful and arguably preferable when writing blog posts, emails and
other text destined for the "World Wide" Web. While USD is commonly used as a
"common denominator" when pricing something on the Internet, and many are
comfortable with that use, it's still very important to be clear about "what
dollars do you mean" in this context.

~~~
ghshephard
If you are going to specify a currency, write USD 10 (though spoken, it's 10
USD).

If the context is explicitly local (such as a local newspaper, menu), then $10
is sufficient in the United States.

~~~
techdragon
I used to do "10 USD" or "USD 10" until I got sick of hearing responses like

"USD 10 looks weird, why did you do that?" or "that on the pricing page looks
funny, can you fix it up a bit?"

It seems $ (or the equivalent currency symbol for other currencies) has a
place in many people's minds, implying that the number it is next to is
currency, and they seem to find it weird when things involving currency are
'written correctly' without the symbol that tells them the numbers mean currency.

------
gansai
Will this be used by automated content creators? For example, lots of articles
on some news websites (including Wikipedia) are written by bots. So the bot
would write an article, invoke proselint and correct, if required?

------
kaeluka
Related: artbollocks-mode [https://github.com/sachac/artbollocks-
mode](https://github.com/sachac/artbollocks-mode)

------
vortico
I was skeptical that it would only detect obvious issues, but the sheer number
of built-in checks is surprising. I'll try this on the next large text I
write.

------
jake-low
I've been interested in linters and style checkers for English prose for a
while, and I'm excited to try this out!

To the author(s): Your website, as far as I could tell, doesn't tell me how to
install it; I had to go to GitHub to realize it was pip-installable. You
should consider adding that to the main page.

~~~
chei0aiV
The authors probably aren't reading HN, best submit a PR.

~~~
suchow
We are. Even so, opening issues on Github and submitting PRs is appreciated.

------
kylemathews
Nice idea.

Bug report — it told me I had too many exclamation marks in a Markdown file
with a number of images in it.

~~~
Tepix
Sounds like a feature request ("recognize and support markdown"). Open an
issue at
[https://github.com/amperser/proselint/issues/new](https://github.com/amperser/proselint/issues/new)
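
One possible shape for such a fix, as a Python sketch: drop Markdown image syntax before counting. The regex only covers the inline `![alt](url)` form, and `count_exclamations` is a hypothetical helper, not the project's actual check.

```python
import re

# Inline Markdown images look like ![alt text](url); the leading "!" is markup,
# not punctuation. Reference-style images are not handled in this sketch.
MARKDOWN_IMAGE = re.compile(r"!\[[^\]]*\]\([^)]*\)")


def count_exclamations(text, strip_markdown_images=True):
    """Count exclamation marks, optionally ignoring inline Markdown images."""
    if strip_markdown_images:
        text = MARKDOWN_IMAGE.sub("", text)
    return text.count("!")
```

Removing the images shifts character offsets, so a check that reports positions would need to blank the markup out rather than delete it.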

------
timlyo
Going through the example, it comes up with:

> Get that off of me before I catch on fire!
> Needless variant. 'catch fire' is the preferred form

I don't think I've ever heard anyone say "catch fire" rather than "catch on
fire".

From the UK if that changes anything.

~~~
reikonomusha
"To catch fire" is a relatively common term, at least in the USA. "To catch on
fire" probably equally so.

------
vram22
Ha ha, slightly related fun snippet I wrote:

[http://jugad2.blogspot.in/2015/07/cut-crap-absolutely-
essent...](http://jugad2.blogspot.in/2015/07/cut-crap-absolutely-essential-
tool-for.html)

------
edwinyzh
Very interesting, and I'm looking into integrating it into
[http://WritingOutliner.com](http://WritingOutliner.com) (or as a separate
Word addin) :)

------
Dowwie
Thank you for working on this project and sharing it.

One of the more challenging sections in the GMAT entails sentence correction.
A proselint-enabled GMAT prep for sentence correction would be very valuable.

------
amelius
What kind of NLP techniques does this system use?

Is it possible to specify new rules in a high-level way?

Can it learn from examples?

Does it work on a sentence-by-sentence basis only, or does it "grasp" complete
paragraphs?

~~~
raphman_
Rules are defined in Python scripts which can have arbitrary complexity.
However, it seems like most rules are just string or regex matching:

[https://github.com/amperser/proselint/blob/master/proselint/...](https://github.com/amperser/proselint/blob/master/proselint/checks/lilienfeld/terms_to_avoid.py)

[https://github.com/amperser/proselint/blob/master/proselint/...](https://github.com/amperser/proselint/blob/master/proselint/checks/misc/credit_card.py)
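
To illustrate what "just regex matching" can look like, here is a self-contained Python sketch of an uncomparables-style rule. The word lists, the `demo.uncomparables` id, and the `(start, end, id, message)` return shape are my assumptions for the example and not necessarily the project's real interface.

```python
import re

# Illustrative word lists only.
UNCOMPARABLES = ["unique", "perfect", "infinite"]
COMPARATORS = ["very", "more", "most", "less", "least"]

PATTERN = re.compile(
    r"\b(" + "|".join(COMPARATORS) + r")\s+(" + "|".join(UNCOMPARABLES) + r")\b",
    re.IGNORECASE,
)


def check_uncomparables(text):
    """Return (start, end, check_id, message) for each comparator + uncomparable."""
    results = []
    for m in PATTERN.finditer(text):
        msg = f"Comparison of an uncomparable: '{m.group(2)}' can not be compared."
        results.append((m.start(), m.end(), "demo.uncomparables", msg))
    return results
```

A rule like this sees only adjacent words, which is why it cannot tell a deliberate, stylistic "more unique" from a careless one.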

------
jcoffland
It would be interesting to run this against campaign speeches as an unbiased
way of judging the quality of prose. Surely content is more important but
still it would be fun.

------
brudgers
Github:
[https://github.com/amperser/proselint/](https://github.com/amperser/proselint/)

------
willvarfar
It's a Python module? I'm looking forward to making a Pelican plugin so my mate
can start checking his blog for glaring errors before he posts! :)

------
true_religion
I'm curious: is this just a grammar checker? Or does it do spell checking too
like aspell?

~~~
oneeyedpigeon
No. Yes, and no.
[http://proselint.com/approach/](http://proselint.com/approach/)

------
zimpenfish
Most important question - How many linguists are on the team developing this?

------
erubin
Can I use this with latex?

~~~
jake-low
Just tried it; you can. Seems like it strips markup characters so it should
work well with most markup languages.

------
biturd
FYI, seems to work perfectly fine in Safari on Mac OS X Desktop.

------
stared
What is wrong with "very smart"? (line 86)

~~~
busyant
I worked w/ a guy who was good at editing my manuscripts. His opinion (which I
agree with) was that the word "very" was almost always superfluous. You can
delete it without affecting your message.

~~~
conceit
Took me a while to see that very comes from veritas and doesn't mean _much_.
At first I wrongly thought, I knew what the word means. Now I do know verily.

------
blt
Microsoft Word had something like this round about 1999

~~~
Piskvorrr
Yeah, there's the squiggly line; same thing, right?

Similarly, where Tesla Model S is concerned: Ford Motor Company had something
like this round about 1908. (Where "something like this" is "has four wheels
and no horses")

