Elicit (https://elicit.com/careers) | Oakland, CA and Hybrid | Frontend, AI, and Full-stack software engineering roles.
Elicit is automating high-quality reasoning so that we can help the world make more breakthroughs in every domain: from climate change to the gut microbiome to longevity and economic policy.
We’ve scaled to over 200,000 monthly users purely by word of mouth and recently crossed $1.5MM in annual revenue, 7 months after launching subscriptions.
We’re now building out our software engineering team, and hiring across several technical roles.
Our main focus is actually a little different from SciSummary's. We're focussed on understanding researchers' broader workflows and providing a research assistant, rather than a narrow tool for summarisation or search.
The workflows we're most excited about at the moment are literature and systematic reviews: we think we can make these orders of magnitude faster and higher quality.
(Elicit, the company and app, was spun out of Ought, the research lab.)
We do joke internally about the homophone (Elicit/illicit). In fact, IIRC we played a little joke on our CEO by rebranding for his birthday in 2022. But I'm sorry to report that we're all careful, ethical, and well-behaved people :(
This is a good point! (Hopefully) obviously, if we knew a particular claim was fishy, we wouldn't make it in the app in the first place.
However, we do do a couple of things which go towards addressing your concern:
1. We can be more or less confident in the answers we're giving in the app, and if that confidence dips below a threshold we mark that cell in the results table with a red warning icon, which encourages caution and user verification. This confidence level isn't perfectly calibrated, of course, but we are trying to engender a healthy, active wariness in our users so that they don't take Elicit results as gospel.
2. We provide sources for all of the claims made in the app. You can see these by clicking on any cell in the results table. We encourage users to check (or at least spot-check) the results they're updating on. This verification is generally much faster than generating the answer was in the first place. (There's a rough sketch of both mechanisms after this list.)
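To make the shape of this concrete, here's a minimal sketch in TypeScript. To be clear, this isn't our actual code: the types, the `needsWarning`/`sourcesFor` names, and the threshold value are all illustrative.

```typescript
// Hypothetical sketch of per-cell confidence flagging; not Elicit's actual code.
interface Source {
  paperTitle: string;
  url: string;
  quote: string; // the supporting passage the claim was drawn from
}

interface Cell {
  answer: string;
  confidence: number; // 0..1, model-derived and imperfectly calibrated
  sources: Source[];
}

const CONFIDENCE_THRESHOLD = 0.7; // illustrative value, not our real threshold

// Decide whether to render the red warning icon next to a result cell.
function needsWarning(cell: Cell): boolean {
  return cell.confidence < CONFIDENCE_THRESHOLD;
}

// Clicking a cell surfaces its sources so users can spot-check the claim.
function sourcesFor(cell: Cell): Source[] {
  return cell.sources;
}
```

The design point is just that the confidence score and the sources travel with every cell, so the UI can prompt verification exactly where it's needed.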
I'm not sure I agree that those rule-of-thumb statistics are "arbitrary" or "fictional"… I guess it depends on what you mean by that. I can say that on our part they're a good faith attempt to help users calibrate how best to use the tool, using evaluations of Elicit based on real usage.
Definitely accept that the tool can work better or worse depending on your domain or workflow though!
One way we do try to distinguish ourselves from vanilla LLMs is that we provide sources for all of the claims made. I mention this because we hope our users can approach the falsification process you mention for Google: we want to show people where particular claims come from so that we earn their trust.
Walking citation trails and verifying transitive claims is something we've talked about but need more people to implement! (https://elicit.com/careers)
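If you're curious what that might involve, here's a rough sketch of the simplest version. It's entirely hypothetical (this isn't a shipped feature), and the in-memory map stands in for a real citation-graph or metadata API:

```typescript
// Hypothetical breadth-first walk over a citation graph; not a shipped Elicit feature.
type PaperId = string;

// In-memory stand-in for a citation lookup (a real version would hit a metadata API).
const referencesOf = new Map<PaperId, PaperId[]>([
  ["paper-A", ["paper-B", "paper-C"]],
  ["paper-B", ["paper-D"]],
]);

// Collect every paper reachable within `maxDepth` citation hops of the root,
// so a transitive claim can be traced back toward its original source.
function walkCitations(root: PaperId, maxDepth: number): Set<PaperId> {
  const seen = new Set<PaperId>([root]);
  let frontier: PaperId[] = [root];
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: PaperId[] = [];
    for (const id of frontier) {
      for (const ref of referencesOf.get(id) ?? []) {
        if (!seen.has(ref)) {
          seen.add(ref);
          next.push(ref);
        }
      }
    }
    frontier = next;
  }
  return seen;
}

// walkCitations("paper-A", 2) → Set {"paper-A", "paper-B", "paper-C", "paper-D"}
```

A real version would also need to verify, at each hop, that the citing paper actually supports the claim it attributes to its source, which is the genuinely hard part.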
Accuracy and supportedness of the claims made in Elicit are two of the most central things we focus on—it's a shame it didn't work as well as we'd like in this case.
I'd appreciate knowing more about the specifics so we can understand and improve.
Our work splits into two core problems:
1. Finding papers / claims / data across an academic literature which is ballooning in size.
2. Using these raw materials to answer questions in a reliable manner.
#2 is where the bulk of the tricky ML work is, and where vanilla language models often fall short because of limited context windows and hallucination.
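For readers unfamiliar with the general pattern here, this is a generic sketch of retrieval-grounded answering. It illustrates the technique, not Elicit's actual pipeline, and `searchPassages`/`askModel` are hypothetical stubs:

```typescript
// Generic retrieval-augmented answering sketch; hypothetical, not Elicit's pipeline.
interface Passage {
  paperUrl: string;
  text: string;
}

// Hypothetical stub for a search index over paper full texts.
async function searchPassages(question: string): Promise<Passage[]> {
  return []; // a real version would query a paper index
}

// Hypothetical stub for a language-model call.
async function askModel(prompt: string): Promise<string> {
  return ""; // a real version would call an LLM
}

// Answer only from retrieved excerpts, and return those excerpts alongside
// the answer so users can verify claims instead of trusting free generation.
async function answerWithSources(question: string) {
  const passages = await searchPassages(question);
  const context = passages.map((p, i) => `[${i + 1}] ${p.text}`).join("\n");
  const answer = await askModel(
    `Answer using only the numbered excerpts below, citing them by number.\n\n${context}\n\nQ: ${question}`
  );
  return { answer, sources: passages.map((p) => p.paperUrl) };
}
```

Grounding the model in retrieved excerpts, and returning those excerpts with the answer, is the standard response to the context-window and hallucination problems above; the hard part is doing it reliably at the scale of a full literature review.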
We're also working to expand Elicit to help academics with other parts of their research, like surfacing critiques, suggesting related prior work, brainstorming related research questions, identifying risks of bias, …
Elicit is an AI research assistant that uses language models to help researchers figure out what’s true and make better decisions. We've scaled to >$2MM annual revenue and 400k MAU with our small team. We're hiring for multiple roles and are primarily interested in people with early-stage company experience, who are comfortable in high-agency, fast-paced teams.
Front-end engineer: https://elicit.com/careers?ashby_jid=b5e218b8-8730-4254-b026...
Machine learning engineer: https://elicit.com/careers?ashby_jid=913a03d5-bd26-4c64-8346...
Data engineer: https://elicit.com/careers?ashby_jid=4617f630-f971-4716-b753...