Are there actual results of the system for evaluation, or even any system at all? All "demos" on the site appear prewritten; for all we know they were written by hand, and in all probability they were.
This is an interesting interface for a sophisticated machine learning model. The user input is a moderately structured prompt (a line-delimited list of facts) and the output is text crafted for a particular medium (email).
This could be extended to support output for other media, such as blog posts or social media updates.
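For concreteness, here's a minimal sketch of how such a facts-to-text interface could sit on top of a large language model: wrap the user's line-delimited facts in a few-shot completion prompt, with the target medium as a parameter. Everything here is hypothetical; the template wording, the example pair, and the function names are invented for illustration and are not Flowrite's actual prompt.

```python
# Hypothetical sketch: turn a line-delimited fact list into a completion
# prompt for a text-generation model. The few-shot example and template
# are invented; a real product would tune these carefully.

FEW_SHOT_EXAMPLE = (
    "Facts:\n"
    "- launched new pricing page\n"
    "- feedback welcome\n"
    "Email:\n"
    "Hi team,\n\n"
    "I'm excited to share that our new pricing page is live. "
    "I'd love to hear any feedback you have.\n\n"
    "Best,\n"
)

def build_prompt(facts, medium="Email"):
    """Wrap user facts in a few-shot prompt.

    `medium` could be "Email", "Blog post", "Social media update",
    etc., matching the extension discussed above.
    """
    fact_lines = "\n".join(f"- {f.strip()}" for f in facts)
    return f"{FEW_SHOT_EXAMPLE}\nFacts:\n{fact_lines}\n{medium}:\n"

prompt = build_prompt(["finished quarterly report", "meeting Friday"])
# `prompt` would then be sent to the model for completion.
```

The model's completion after the final `Email:` line becomes the drafted message; switching `medium` to another label is all it would take to target a different output format.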
It does drive me slightly crazy that this system JUST INJECTS NOISE. You provide it with concise, readable information. It dilutes that information with additional words and scaffolds it over complex sentence structure. The resulting email has a flourish, but takes more effort to read. It's an inverse document summarizer!
Perhaps there are benefits to this. Sometimes you want to make your reader invest a bit of effort in your message. I believe that can improve information retention. The email is also a generally accepted human API. People might be more receptive to information presented as a letter than as a list.
The biggest benefit is probably on the author's side. This forces you to actually think about what you want to say. The input is basically an outline!
Paradoxically, I think Flowrite could improve communication by making authors and readers do MORE cognitive work rather than less.
> Paradoxically, I think Flowrite could improve communication by making authors and readers do MORE cognitive work rather than less.
I would love it if I could get recipients to actually devote cognitive work to my communications. Unfortunately, increasing cognitive load is a great way to ensure fewer people read what you write.
Every recipient of a message thinks differently than the writer. For communication to be effective, it must be designed for the recipients, to trigger the right sequence of ideas that will get the message across. I’ve written up the exact same content in 3 or more ways to target specific teams - bullet lists, email, slide deck, etc. It’s always worth my time. Before I did this, I would get several elaborate responses of “why didn’t you write this in X format”, arguing that X was a superior format. But everyone provided different versions of X. You may think conciseness is paramount; not only is it not, the definition of what is concise varies from person to person.
> It does drive me slightly crazy that this system JUST INJECTS NOISE.
This is actually an important service. What is noise to the sender is generally not noise to the reader. It would be awesome if at some point in the next few decades we only had to provide the information we need to think about, and the computer could find a way to compile it into what the reader is looking for. In other words, if the computer could do what almost every secretary was doing in the 1960s.
If the example is correct, it's injecting more than fluff and flourish. It injected "After weeks of hard work", which doesn't seem supported by anything in the input, and probably sets the wrong tone. Certainly it was at least many months of work prior to launch.
The additional examples seem to integrate external knowledge. How does that work?
It expands "follow up" to "in person or via Skype / phone call" -- where did those platform preferences come from?
When asked "How about a Zoom call on the following Tuesday at 2PM?" it replies "I'll be in a Zoom meeting on Tuesday at 2PM, so let me know if that works for you." Does this mean Flowrite looked at a calendar and detected a conflicting Zoom meeting? Or is it trying to accept the proposed meeting time & venue?
I do really like the Blog Outline example, where it expands a general question ("How to align your Marketing, Sales and Customer Success teams to maximize revenue?") to a list of six other questions to dig into the problem more. That might be a powerful way to push people to think more deeply and question assumptions.
Isn’t this just GPT-3 under the hood? Other similar things do exist (eg: copy.ai).
Not sure whether there’s much of a difference between all these GPT-3-powered services when all that distinguishes you from the competition is some (slick) UI and the 500-1000 extra words of “training” you give to GPT-3.
Gwern [1], who has spent quite some time with GPT-3 and previous models, seems to think that coming up with the right 500-1000 words can be a subtle business.
Now, what's the prompt that will get GPT-3 to generate good prompts for us?
https://arxiv.org/abs/2102.07350 already calls it a "metaprompt" :) I gave it a quick stab a while ago, but prompt programming is too new, and you can't easily cram demonstrations into an existing prompt, for it to really work well. It's more promising to train models on examples of tasks written as instructions, or to directly optimize prompts for a goal (https://arxiv.org/abs/2101.00190) - it's a differentiable model and a whitebox, so use that power!
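The metaprompt idea can be sketched as a prompt whose completion is itself meant to be used as a prompt. The wording below is invented for illustration (the paper explores many variants), and the function is just string assembly; the model call is left out.

```python
# Hypothetical sketch of a "metaprompt" in the spirit of
# arxiv.org/abs/2102.07350: ask the model to write a task prompt,
# then use the model's completion as the prompt for the task itself.

def metaprompt(task_description):
    """Build a prompt whose completion should itself be a usable prompt.

    The wording here is invented; real metaprompts need iteration.
    """
    return (
        "Write an instructive prompt, with two worked examples, that "
        "would get a language model to perform this task well:\n"
        f"Task: {task_description}\n"
        "Prompt:\n"
    )

mp = metaprompt("expand a bullet list of facts into a polite email")
# `mp` would be sent to the model; its completion becomes the task prompt.
```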
There was a pretty good sci-fi story in (IIRC) Omni, wherein someone figures out how to make a computer that can scan in a document and produce a clear, simple summary.
The main character is a friend of the inventor, who can't seem to make any sales. Later, they run into each other again, and the inventor is HUGELY rich.
Turns out he realized that the original idea was a flop, reversed the process, and sold it to law firms.
I think that'd be a much harder thing to do well. You'd have a lot of false negatives. Missing nuances in complex sentences can change the meaning significantly, whereas when adding extra words to a list of facts, as long as the words are pretty neutral meaning-wise, you'll be fine.
It looks like Axios is launching something like this. I asked, and they want six figures a month - way too much for us, so I haven't demoed it. Maybe it's not AI but just simple suggestions.
Yes, this is incredibly useful for customer service chats and emails. Many times people feel shortchanged if you send a concise email with just the relevant facts; I guess not everyone is an engineer. Adding warmth and extra reassuring words will go a long way toward keeping customers happy.
I agree with this and will add that a similar thing can be said for non-technical colleagues. The humanness is critical for maintaining good relationships.
The fact that AI performs so well at "fluff" creation says far more about the lack of creativity shown by humans who usually create fluff than it does about the creativity of the AI.
Alternative view: "Soft" skills like writing are actually critical to the success of companies, but companies' hiring and retention practices are so terribly broken, particularly in tech, that most can't seem to identify or retain people who have these skills.
So now they're resorting to an AI to try to replace non-technical skills because they've utterly failed at identifying value in human beings.
The example seems to indicate it could create issues.
It conjures, for example, "after weeks of hard work" from thin air. There's nothing in the input to suggest that phrase makes contextual sense. Though perhaps the example isn't a real-world one.
Yeah, but what if I'm sending an email to say that I finished a big project the recipient asked me to work on a few days ago? I don't want my email to say I've been working for weeks when the reader knows that's not true.
Maybe I'm missing the intended use case for this tool?
I don't know; I get recruiter emails both as a hiring manager and as a potential hire. The majority of them try to be familiar, and each seems to fail on a different detail. Some guess my location incorrectly, or talk about a product launch from a competitor (attributing it to us), or get our industry wrong, or present skill sets we don't need in a way that describes them as a core part of our architecture, or congratulate us on vague recent successes that could apply to anyone. I had one last week that congratulated me on having a recent child (not the case), from, I presume, some badly matched social media profile of someone sharing a similar name. I get 2-10 of these a day, and I really don't see many that take the time to actually do any research. It's clear most are form-fills, based on some kind of automated data collection or just tossing a wide net.
Yikes! It seems to add "fluff" but it actually adds context. Where does it get the context? How is the context validated against ground truth?
How does it know it took "weeks" of work to do something?
How does it know to schedule a video call instead of, I dunno, an in-person meeting or a lunch date?
Does it learn if the human user corrects what it comes up with?
Somebody needs to rig this up like the old emacs "psychoanalyze-pinhead" hack and see whether two instances of this stuff can talk each other into some rhetorically weird corner.
I always get nervous when an AI/ML demo doesn’t let you try your own input. It is fairly easy to curate a set of examples that work really well for any ML model. However, a lot of them quickly fall apart when faced with diverse, real-world data.
This does feel like a big part of the future. An AI coach, which expands on your text and tailors it to your style and to the p̵r̵e̵f̵e̵r̵e̵n̵c̵e̵s̵ optimized for persuasion of the recipient.
Does feedback fine tune or customise the model behind this?
While it's not plagiarism in the sense of "this sentence is lifted from someone else's writing," it's also very much not "this writing is my own work": this seems like plagiarism without the copyright issues.
Do they have a solution to the (seemingly intractable) problem of people who only read up to the first question in an email, respond to that, and ignore the rest?
I find formatting can help. List each of your questions at the bottom of your emails, so they can go through them and answer them one by one. It's also harder for them to justify answering one question and ignoring the rest when you lay it out for them in one place.
Depending on the questions and how much the answers matter, it may be better to ask each one after they reply to the previous one, so they don't feel overwhelmed being asked all of it at once and put more thought into answering each one.
A bit premature for HN?