

Tell HN: College students creating machine to summarize from abstraction - marcomassaro

HN,<p>I would like to share some insight on a stealth-mode project started by Memsparx, a company founded by college kids, and the revolutionary implications it entails.<p>Memsparx is creating a machine named WINSTON that can understand human language, reason, and create summaries from abstraction. They are applying this technology to the Media/News industry. This team is doing the impossible - many features from IBM's WATSON machine are in this machine, including natural language processing, reasoning, information retrieval, and so forth. And they are doing it on top of creating a news summarizing company. The description goes as follows.<p>The problem: Text summarization is a super hard problem. IBM and only a handful of other companies provide this software (at a high price, to financial/medical/etc. firms, to shorten reports and index them). The current technology takes words, retrieves the definition of each word, and then creates summaries through statistics and other methods. This software gets the job done but it's not good enough. The summaries are not quite proprietary and the machines will typically form awkward sentences out of context. The problem is making a computer (software) understand the CONTEXT that a word is used in (e.g. "fly" can mean a ton of different things, such as | fly through the air | fly through a book | the insect "Fly" etc.).<p>Solution: Memsparx is creating WINSTON. Winston is super smart software that can not only read (parse) an article but truly understand what it is reading or has read (as well as a computer can!). It can understand what human language, in all of its complexities and nuances, is trying to communicate. This is rooted in a number of text mining applications, but specifically NATURAL LANGUAGE PROCESSING, INFORMATION RETRIEVAL, REASONING ALGORITHMS, and so forth. Kind of like the popular movie Terminator's Skynet...
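
For readers unfamiliar with the "summaries through statistics" approach described above, here is a minimal sketch of a word-frequency extractive summarizer. It is only an illustration of the general idea, not Memsparx's or IBM's actual method; the function names and the tiny stopword list are assumptions made up for this example. Note that it counts every occurrence of a word like "fly" the same way regardless of sense, which is exactly the context problem the post points at.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "and", "the", "is", "are", "of", "to", "in", "it", "on"}


def tokenize(text):
    """Lowercase the text and return its non-stopword tokens."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, max_sentences=1):
    """Return the highest-scoring sentences, kept in their original order.

    A sentence's score is the average document-wide frequency of its
    non-stopword tokens. No word meaning or context is used at all.
    """
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        if not tokens:
            return 0.0
        return sum(freq[w] for w in tokens) / len(tokens)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


if __name__ == "__main__":
    demo = "Cats chase mice. Cats sleep. The weather is mild today."
    print(summarize(demo, max_sentences=2))
```

Because the summary is built by copying whole sentences, this kind of method can produce the "awkward sentences out of context" mentioned above: a selected sentence may refer back to something that was left out.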
======
sorbus
This post reads like a press release, which left a bad taste in my mouth.
Looking at the website, though, if the summaries on the front-page are indeed
produced by Winston, then I'm quite impressed. As _logjam_ says, though, being
able to have arbitrary text summarized would be a true demonstration of your
technology - or at least let users submit articles to be summarized (make an
API and charge for it. If the summaries are cheaper than having a human
summarize an article and work without human intervention, then there's
probably a market for it).

Also, the registration system doesn't support the foo+bar@foobar.com format of
emails. An extremely minor nitpick, of course.

~~~
walkwalk12
I'll give Sorbus a demo in 45 days (not perfect yet...). Until then be
skeptical!

------
dstein
Unless I can interact with it, there's not much for me to get excited
about. Why not keep it stealth until you have a demo that people can use?

~~~
marcomassaro
Thanks for your comment. Our website basically shows the proof of concept. We
are still building, testing and creating Winston, which will be available for
commercial applications, not public usage.

------
marcomassaro
clickable: <http://memsparx.com>

------
logjam
Pardon my skepticism about this, but your website doesn't show much at all
besides a bunch of apparently hastily written buzzwords.

Where do you provide (as a demonstration) the ability for a user to upload
text, have it summarized, and show that "Winston" reasons about it?

~~~
walkwalk12
We own this (Phil W. here from mems)... not supposed to be out on the web
yet.. leak?

1\. "Web doesn't show much" - logjam -> Thank you for the compliment. We aim
to make the news experience as de-cluttered and clean as possible. Right
now we find you the top news across the board and summarize it for you. We
will expand as the tech is perfected.

2\. "Hastily Written words" - logjam -> Appreciate the honesty. Journalists
aim to create these "hastily written buzzwords", and Winston has proven again
and again to model the human method. But not perfect yet!

3\. "Where do you provide ...a user to upload text" - logjam -> Many hackers
are all about creating tech like this and giving it to everyone for
exploration, etc. We are forming a company around this with many applications
to it (one is news/media, but that's just the niche we found scalable and
appealing). Call us selfish, but we would like to make a few pennies off of our
40+ hour work sprints :)

Thanks for the question and I tried to follow the guidelines.

