matt1's comments

Hey all, Matt here, Emergent Mind's founder.

Yes, Emergent Mind is 100% focused on AI/ML papers from arXiv. I think it makes more sense to focus on a niche, because you can tailor everything to it, versus building a general research-paper site that won't end up speaking to any audience well.

For anyone curious about Emergent Mind: it surfaces trending AI/ML papers by monitoring social media (HackerNews, Reddit, X, YouTube, and GitHub) for discussions about papers, then ranks them based on the amount of engagement they're getting (similar to how HackerNews uses upvotes). Then, for all trending papers, it automatically summarizes them using GPT-4o and links to relevant discussions so you can learn more.

We're working on a bunch of new capabilities that we'll announce soon too.

Feedback welcome: matt@emergentmind.com


If you're a fan of tldr-ai, you might also like my site, EmergentMind.com, which does something similar: it surfaces trending AI papers based on social media engagement (including HackerNews upvotes!), then summarizes those papers using GPT-4 (a bullet point summary + detailed writeup based on the actual content of the paper), and highlights discussions on HN, Reddit, YouTube, GitHub, and X about that paper.

I don't want to hijack this launch post (we definitely need more tools in this space!), just wanted to share my tool for anyone interested since it's related. Feedback welcome: matt@emergentmind.com.


It does feed each paper through GPT-4 to generate an overview of it, including the main conclusions. Here's a demo I shared on X which walks through it: https://twitter.com/mhmazur/status/1747990900771287097


Hey all - I'd like to invite you all to check out Emergent Mind, a website I built to make it easier to stay informed about important new AI/ML research.

It works by monitoring HN, X, Reddit, YouTube, and GitHub for mentions of new arXiv AI/ML papers, and then surfacing trending papers for you. It also runs each paper through GPT-4 to generate an overview (including defining all technical terms) and links to those social media discussions and other resources (GitHub, YouTube videos, references, and related papers) so you can learn more. A lot of the UI and features are inspired by my experience browsing HN for many years.

I shared a 4-minute demo video on X for anyone interested in learning more: https://twitter.com/mhmazur/status/1747990900771287097

Questions/feedback welcome!


The site hangs when browser script blockers are on. Works fine without any blockers, though.

Cool concept, and the summary has good detail. It goes a level deeper than the abstract.


Which ad blockers are you using? I'll take a look to figure out what's up.


Great work, Matt. This is very interesting. I'm very curious: how do you get related tweets? Does X have a free API for that?


X has an API that costs $100/month: https://help.twitter.com/en/rules-and-policies/x-api

You can use it to fetch tweets linking to arXiv papers, but be mindful of their 10k tweet reads/month limit.
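In case it's useful, here's a rough sketch of what that fetch could look like against X's v2 recent-search endpoint. The `url:` and `-is:retweet` operators come from X's v2 search docs, but treat the exact field names and quota behavior as assumptions to verify against the current documentation:

```python
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"

def build_search_params(max_results=100):
    """Query for recent tweets that link to arXiv, excluding retweets."""
    return {
        "query": 'url:"arxiv.org" -is:retweet',
        "max_results": str(max_results),  # the API caps this at 100 per page
        "tweet.fields": "public_metrics,entities",
    }

def fetch_arxiv_tweets(bearer_token):
    # Every page fetched draws down the monthly tweet-read quota,
    # so poll sparingly and cache aggressively.
    url = SEARCH_URL + "?" + urllib.parse.urlencode(build_search_params())
    req = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {bearer_token}"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp).get("data", [])
```

At 10k reads/month that's roughly 100 full pages, so it pays to filter hard in the query itself rather than client-side.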


OP here with a shameless plug: for anyone interested, I'm working on a site called Emergent Mind that surfaces trending AI/ML papers. This TinyLlama paper/repo is trending #1 right now and likely will be for a while due to how much attention it's getting across social media: https://www.emergentmind.com/papers/2401.02385. Emergent Mind also looks for and links to relevant discussions/resources on Reddit, X, HackerNews, GitHub, and YouTube for every new arXiv AI/ML paper. Feedback welcome!


I visit your site every day. Thank you for creating it and evolving it past simple summaries to show paper details!

I recall you were looking to sell it at some point. Was wondering what that process looked like, and why you ended up holding on to the site.


Hey, thanks for the kind words.

To answer your question: an earlier version of the site focused on surfacing AI news, but that space is super competitive and I don't think Emergent Mind did a better job than the other resources out there. I tried selling it instead of just shutting it down, but ultimately decided to keep it. I recently pivoted to covering arXiv papers, which is a much better fit than AI news. I think there's an opportunity not only to surface trending papers, but also to educate people about them using AI (the GPT-4 summaries are just a start). A lot of the future work will focus in that direction, but I'd also love any feedback folks have on what I could add to make it more useful.


Thank you for the detailed response!

Pivoting into arXiv is a good idea. It helps you have focused prompts and templates.

A natural progression is aggregation, categorization, and related paper suggestions. Since arXiv has HTML versions of papers now, you can also consider allowing deeplinked citations directly from the LLM summaries.

A GPT-curated comments section for papers would also be nice, automatically filtering out any spam that gets past the regular Disqus filters, then scoring/hiding comments based on usefulness or insight.


I am new to this space. Is it hard to fine tune this model?


For anyone interested in staying informed about important new AI/ML papers on arXiv, check out https://www.emergentmind.com, a site I'm building that should help.

Emergent Mind works by checking social media for arXiv paper mentions (HackerNews, Reddit, X, YouTube, and GitHub), then ranks the papers based on how much social media activity there has been and how long since the paper was published (similar to how HN and Reddit work, except using social media activity, not upvotes, for the ranking). Then, for each paper, it summarizes it using GPT-4, links to the social media discussions, paper references, and related papers.
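For a feel of the general approach (not Emergent Mind's actual formula, which isn't public), an HN-style score divides activity by a power of age; the 1.8 gravity constant and the example numbers below are purely illustrative:

```python
def trending_score(mentions, hours_old, gravity=1.8):
    # Newer papers with more social-media mentions rank higher;
    # gravity controls how quickly older papers fall off the front page.
    # The +2 keeps brand-new papers from dividing by ~zero.
    return mentions / (hours_old + 2) ** gravity

# Hypothetical (mentions, hours since published) per arXiv id
papers = {"2401.02385": (120, 6), "2312.11444": (200, 48)}
ranked = sorted(papers, key=lambda p: trending_score(*papers[p]), reverse=True)
```

With these numbers the 6-hour-old paper outranks the 48-hour-old one despite fewer mentions, which is the behavior you want from a "trending" feed.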

It's a fairly new site and I haven't shared it much yet. Would love any feedback or requests you all have for improving it.


This is exactly what I was using HN for. But, yeah, it kinda sucked compared to yours. Another thing I tried to build was an NN model that used the Semantic Scholar h-index of authors along with the abstract text and T5 to estimate one-year-out citations. Just for personal use, though. That whole thing fell apart because Semantic Scholar is kinda crap at associating author links with the same author. I frequently ended up with the wrong professors, which I'd think would be easy for them to fix.


Just a note to say that factoring authors into the ranking system is high on my todo list. v1 won't be too fancy - just a hardcoded list of prominent authors whose papers warrant extra visibility. A future version will likely automate it to avoid the hardcoded list.

Also, soon-ish I'm going to add the ability for users to follow specific authors, so you can get notified when they publish new papers.


> Also, soon-ish I'm going to add the ability for users to follow specific authors, so you can get notified when they publish new papers.

If you could do it, this would be a dream. My original intent was to be able to look through only papers citing a popular one and filtering the results for ones having at least one author with a set minimum h-index. Using Google Scholar data required using SerpAPI, which has some annoying limitations.

The core goal is obviously just not to miss out on a paper that will very likely be influential while not having to comb through the mountain of irrelevant papers.

What's funny is that Microsoft Academic was the best suited, but was retired in 2021.


I did that (used other features). This is how new papers are ranked here:

https://trendingpapers.com


Great site, thanks for sharing. Can you explain how you're determining how many times a paper is cited? Obviously papers include a list of references, but extracting them accurately from the PDF is difficult in my experience (two column formats, ugh) - though the new HTML versions help. And even if you have a list, many authors just mention arXiv paper titles, not their ids, making identifying specific references tricky.


Difficult, yes… but not impossible :)

I just extract the titles and look for their respective ids.

The real challenge was doing that at scale. In CS alone there are well over half a million papers.
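The title-to-id matching step could be sketched as fuzzy matching against a local {normalized title -> arXiv id} index; this is just an illustration, not trendingpapers.com's actual pipeline (the 0.9 cutoff and whitespace normalization are guesses):

```python
import difflib

def match_title_to_id(extracted_title, title_index, cutoff=0.9):
    """Match a reference title scraped from a PDF against a local
    {normalized title: arXiv id} index, tolerating extraction noise."""
    # Collapse whitespace and case, since PDF extraction mangles both.
    norm = " ".join(extracted_title.lower().split())
    hits = difflib.get_close_matches(norm, title_index, n=1, cutoff=cutoff)
    return title_index[hits[0]] if hits else None

index = {
    "tinyllama: an open-source small language model": "2401.02385",
}
match_title_to_id("TinyLlama:  An Open-Source Small  Language Model", index)
```

The upside of a local index is you avoid rate-limited API lookups for the half-million-paper scale mentioned above; the downside is keeping it fresh.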


FYI, I started embedding the HTML pages in an iframe on Emergent Mind when an HTML version is available: https://www.emergentmind.com/papers/2312.11444. It should make it even easier to stay informed about trending papers.


I've got a somewhat related question:

Is there a site that lists and rates the various LLM models on huggingface.co alongside their various applications?


That looks great. No real feedback yet, but it's the kind of thing I've always been looking for as a better alternative to Twitter.


Thanks! I've got a lot more planned for it too. If anyone has any feedback that doesn't make sense to share here, or if you're a researcher who is open to some questions about how you currently follow arXiv papers, drop me a note at matt@emergentmind.com.


Love the clean design of the website! Looks amazing on mobile.


Thanks! If you ever run into any issues or have any suggestions for improving the site, drop me a note: matt@emergentmind.com.


Would love to see a comments feature at the bottom there, Reddit/HN style.

Love the concept though. Added it to my Home Screen on iOS


Thanks for the kind words, it's appreciated.

I might add comments down the road if there's enough interest and traffic to warrant it. I don't want to add them just yet, have zero comments on everything, and have it look like a ghost town.

Keep the suggestions coming though as you use it more: matt@emergentmind.com.


Great site. Bookmarked it.

Would be nice if I could change the timeframe: top this week, month, year, all time.


I'm slowly adding older papers as I work out the kinks in the site. Down the road when the database is more comprehensive, this should definitely be possible.


Works in Chrome, but does not seem to work in Firefox.


Can you (or anyone experiencing similar issues) share any details about what's not working in Firefox? I tested it and all is well for me, though it's definitely possible there's an issue with some other version of it.


Love to see Emergent Mind continuing to innovate!


Hey all!

Emergent Mind is an AI news aggregator that I'm working on to help me and others stay more informed about the latest AI news. Today I launched this new feature that lets you adjust how the news is explained to you with the click of a button. It currently supports 5 explanation styles, though I'll likely add more in the future:

- Explain it Normally

- Like I'm 5 (ELI5)

- In 10 Words or Fewer

- Like an AI Influencer

- As a poem

My hope with Emergent Mind and this feature in general is to make it easier (and potentially more entertaining, depending on the style) to not just follow, but get educated about, what's happening in the world of AI.

Feedback/suggestions/questions welcome!


There's some risk here, but there's also risk in relying on any third-party service for key parts of your business. If OpenAI increased pricing significantly or eliminated access (both of which seem unlikely), there would likely be plenty of alternatives to fall back on. I wouldn't avoid creating an AI product because of this remote possibility.


> there will likely be plenty of alternatives soon to fall back on

Are there? OpenAI is currently the only provider. Big tech companies (Google, MS, Meta, maybe Apple) may soon have similar models too, but they don't have much incentive to sell access to them. It feels like all these products are really at the mercy of OpenAI.

There has been an explosion of effort around LLama, but Meta did the hard part (the base model), and who knows if they will do that again.


Claude from Anthropic already has an API, and it's likely that Bard will have one as well. It's just the most basic way to monetize your model.


Good question! For now, it's not automatic. If I notice multiple links about the same news story, I manually hide the duplicates. There are ways to automate this process, it just hasn't been a priority yet.


Which automatic approaches do you think are promising? Off the top of my head, I would consider reducing each article to a summary, then comparing semantic similarity, both done with LLMs.
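One embedding-based sketch of that idea: embed each headline, then drop any new story whose vector is too similar to one already kept. The bag-of-words `embed` below is a stand-in for a real embedding model, and the 0.8 threshold is an arbitrary guess:

```python
import math
from collections import Counter

def embed(text):
    # Stand-in for a real embedding model (e.g. an LLM embeddings API);
    # a bag-of-words vector is enough to demonstrate the dedup logic.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def dedupe(headlines, threshold=0.8):
    kept, vecs = [], []
    for h in headlines:
        v = embed(h)
        # Keep the headline only if it isn't near-duplicate of one we kept.
        if all(cosine(v, seen) < threshold for seen in vecs):
            kept.append(h)
            vecs.append(v)
    return kept
```

Summarizing first, as suggested above, would normalize wording differences before the similarity comparison, at the cost of an extra LLM call per article.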


This also seems like it was generated by an LLM, hah.

