Show HN: Linen – Make your Slack community Google-searchable
177 points by cheeseblubber on April 26, 2022 | 57 comments
Hi HN! Kam here. I’m the founder of Linen.dev https://linen.dev, a website that makes your public Slack community Google searchable. Linen syncs your Slack threads and makes them SEO friendly so your community can find Slack content that was previously hidden.

Previously I worked on a popular open source project which had a sizable Slack community. Slack was great for engaging with community members and for early sales. However, as a community scales, Slack becomes a black hole where context gets lost. Most public communities can’t afford to pay for several hundred or several thousand members, so they are limited to the 10,000 most recent messages on the free plan. You run into the problem of people asking repeat questions and not searching in Slack. It also doesn’t help that the Slack UX encourages posting and not searching. We experimented with GitHub Discussions and Discourse but didn’t want another channel to maintain and to split the community across.

With Linen I wanted to build a tool that is very low maintenance without changing my current workflow. By making it search engine friendly and putting it on a website the community members can find answers to repeat questions before ever getting into your Slack channel. Linen is the first result that comes up on Google if you search for “seeing a weird issue with flyte” https://www.google.com/search?q=seeing+a+weird+issue+with+fl... or “replace beast http with proxygen” https://www.google.com/search?q=replace+beast+http+with+prox....

As a side effect of syncing conversations to a website you end up with a very long tail of unique and relevant content for your community. Linen is free to use and set up, but I offer a paid version (I am still figuring out the pricing model for it) where the content is redirected to your own subdomain so that your domain gets all the SEO benefits.

Linen is built with Next.js, Node, TypeScript, React, Prisma for the ORM, and AWS Aurora for the Postgres DB. I chose Next.js for the server side rendering capabilities and wanted to share types between the client and the server with TypeScript. I’ve also enjoyed working with Prisma as the ORM since you don’t have to write as much boilerplate as with other ORMs. I've also been pretty happy with Vercel and Next.js, especially with the server side rendering and client side caching they provide.

Here are a few communities on Linen right now:

https://osquery.fleetdm.com/

https://discuss.flyte.io/

https://calendso.linen.dev/

https://community-chat.infracost.io/

https://community-chat.signoz.io/

https://airbytehq.linen.dev/

The product is very simple right now but I want to add features like related questions detection with semantic similarity, integrating with Github to notify the thread when it is finished, auto thread detection for conversations that aren’t in thread form.

You can sign up for free today at https://www.linen.dev. I am doing manual onboarding at the moment to get better feedback and to manually walk through some of the less polished parts of the onboarding flow.

p.s. I’m actively working on supporting Discord on Linen so would love to hear from anyone that is interested




I love stuff like this. With so many communities congregating on Discord, Slack etc. we lose a lot of valuable information that would previously have been on something publicly searchable like StackOverflow. A lot of OSS has fallen into this pattern and it can be truly infuriating at times.


It's a bit of a negative comment but I sure wish these communities would rediscover forums. They're not all perfect but I am convinced that Discord and Slack are where valuable information goes to get mixed with gifs and memes then die.

I guess that won't happen until someone starts hosting discourse for free.

Awesome that Linen is trying to build a different future though.


I agree. I really dislike Slack and Discord compared to forums. Being able to keep a topic in a confined thread and then bump it to the top when there is activity is so useful.

Slack does have threads but you never know about updates unless you're already chatting in the thread.

Slack just feels like email before threads got grouped together.

Don't get me wrong, Slack is a great chat tool for small groups and it's a bonus that others can search it. But forums are far superior for large groups and long lived conversations.


You can subscribe to Slack threads like you would a forum's. Discord threads are a bit weird but they have the advantage that they appear on the sidebar so they don't get lost in a sea of messages.


I was curious if anyone had used a typical web forum for work instead of slack or others.


Whoever can solve applying topic change-point detection (e.g. https://aclanthology.org/C18-1212/) to wide-ranging Discord and Slack channels, and put together a novel UI for discovering probable thread-intentions there (for both closed and open communities) would unlock an incredible repository of human knowledge. There's a massive opportunity here.


Gitter is searchable. I wonder why it isn't popular.


Well they did join Element[0]... So it's not like they weren't popular.

That said, I'd worry about a chat solution being too focused on a small niche like developers (who often aren't that interested in paying for tools, even if they're high quality).

[0]: https://blog.gitter.im/2020/09/30/gitter-element-acquisition...


They are real time forums and way more. Just without search, like the other poster mentioned.


Discourse is an absolute nightmare for searching though, it's painful and slow.


Well I'd argue that it's at the very least a lot more accessible to search engines without creating new software to support it (again, awesome that Linen is doing it!).


Yes! This is the exact problem I was hoping to solve with Linen. I felt frustrated that there was so much good information being hidden in Slack and Discord communities. I was also annoyed at Slack links in GitHub issues that lead nowhere because the messages have since been archived.


I don't know if this makes any sense but have you thought about a way to migrate from Slack to a forum?


And not only is Slack harder to search, free Slack instances limit history to 10,000 messages, so older messages effectively disappear.


They also badger community members who have no say in the matter to upgrade.


How does this not breach Slack’s TOS? I thought they specifically prohibited tools like this?

We used slackarchive before but it got disabled because of the ToS. The 10k message limit is specifically imposed by slack to get people to upgrade so they very much don’t want it subverted.


How does this handle privacy? Assuming these Slack conversations were originally in private communities, I'm scratching my head a bit that they are now appearing in Google results.

I'm guessing that essentially it is up to the Slack admin to ensure all participants have agreed to their conversations being made available to the world?


All of the communities on Linen right now are open to the public to join through a public URL. Linen isn't designed for private communities right now unless it evolves to be more like a productivity tool that helps with related questions, summarization and search.

At the moment most people who come in to these Slack channels kind of treat it like Stackoverflow or customer support for technical tools.


Thanks for the reply. I understand that the join link is public, but it's essentially a gate, the community behind which is private.

> At the moment most people who come in to these Slack channels kind of treat it like Stackoverflow or customer support for technical tools

For the most part they will assume a conversation limited to members of the community they have joined.


I see what you mean. Right now it is up to the communities/admin to explain that these conversations are public. I plan on building some tooling like a Slack bot that sends an alert to users who join a community letting them know the conversations are public.

Other solutions I have considered are syncing only specific channels and making it explicit that Linen is making these conversations public. Definitely a gray area right now and would love any suggestions around this.


would a self-host option make sense for some communities that wanted to remain "closed" but still take advantage of the search/archive neatness?


Can you see yourself offering a migration path off of Slack in the future?

Your interface is much more accessible not just to Google, but also to this human. I would love to have an option to migrate off Slack with continuity.


> I would love to have an option to migrate off Slack with continuity.

This sounds like something that may interest you: https://zulip.com/help/web-public-streams


Thanks! I'm not sure about a migration path off of Slack, at least in the near term, since there is a lot of pull toward Slack: most people in professional settings are already familiar with the UX. I do think there is something there for a layer on top of Slack or Discord where information gets moved.

I want to explore making this tool available for internal teams once I have more features around productivity like related questions, summarization and confluence/docs integration. These seem useful for internal teams but need to assess whether people actually want this.


I'd be interested in chatting about this at some point. What you are considering re. meaningful aggregation and summarization is connected to my thesis here: https://medium.com/@aviv/when-we-change-the-efficiency-of-kn... (contact info from my profile/website)

Also, this is very useful! Thanks for building it. (Ideally it's bundled with clear info that everything people are writing will be publicly searchable. Perhaps an invisible bot message every month or so.)


Speaking as somebody who somehow gets roped into doing internal knowledge management work, something like this would be great, particularly with the features you're suggesting. I would actively campaign for us to use Linen depending on the price. It's not what I would use if I were building things from the ground up, but going into an org with years of Slack messages + an ongoing Slack culture, it would be fantastic.


If you can share, how do you retrieve data from the Slack group? Scraping as an invited user? A public slack app? For some communities, that could be a massive amount of webhook traffic!

Neat idea :)


Sure! Right now a community admin adds Linen to their Slack workspace. I have webhooks listening for message events, which get mirrored and saved to the Aurora Postgres DB. On install I also have a cron job that hits the Slack API and pulls the old Slack conversations.

As for scaling, I host the web app on Vercel, where it is backed by Lambda functions. I am also using AWS Aurora for Postgres, which in theory should help a ton with scaling the DB. I've done Heroku DB hosting before but found that AWS tends to be much cheaper.
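
To make the webhook flow above concrete, here is a minimal sketch in TypeScript. It is not Linen's actual code. It assumes the Slack Events API payload shapes (a `url_verification` handshake and `event_callback` envelopes carrying `message` events); `MessageStore`, `StoredMessage` and `handleSlackPayload` are hypothetical names, and the in-memory map stands in for the Postgres tables.

```typescript
// Only the fields this sketch reads from a Slack Events API payload.
interface SlackMessageEvent {
  type: "message";
  channel: string;
  ts: string;          // message timestamp, unique within a channel
  thread_ts?: string;  // set on replies; points at the thread's root message
  user?: string;
  text?: string;
  subtype?: string;    // present on edits, joins, bot messages, etc.
}

type SlackPayload =
  | { type: "url_verification"; challenge: string }
  | { type: "event_callback"; event: SlackMessageEvent };

// Hypothetical row shape; a real schema would differ.
interface StoredMessage {
  channel: string;
  threadTs: string;
  ts: string;
  user: string;
  text: string;
}

// Hypothetical in-memory stand-in for the database.
class MessageStore {
  private byKey = new Map<string, StoredMessage>();

  // Keyed on channel + ts, so a redelivered event overwrites instead of duplicating.
  upsert(m: StoredMessage): void {
    this.byKey.set(`${m.channel}:${m.ts}`, m);
  }

  thread(channel: string, threadTs: string): StoredMessage[] {
    return Array.from(this.byKey.values())
      .filter((m) => m.channel === channel && m.threadTs === threadTs)
      // String comparison; assumes timestamps have the same width.
      .sort((a, b) => (a.ts < b.ts ? -1 : a.ts > b.ts ? 1 : 0));
  }
}

function handleSlackPayload(
  payload: SlackPayload,
  store: MessageStore
): { status: number; body: string } {
  // Slack sends this once when the webhook URL is registered.
  if (payload.type === "url_verification") {
    return { status: 200, body: payload.challenge };
  }
  const ev = payload.event;
  // This sketch mirrors plain messages only and ignores subtypes such as edits.
  if (!ev.subtype) {
    store.upsert({
      channel: ev.channel,
      threadTs: ev.thread_ts ?? ev.ts, // a root message is its own thread
      ts: ev.ts,
      user: ev.user ?? "unknown",
      text: ev.text ?? "",
    });
  }
  return { status: 200, body: "" };
}
```

Keying on channel plus timestamp keeps the handler idempotent, which matters because webhook deliveries can be retried. A production handler would also verify the request signature before trusting the payload.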


Since your data is append-only have you considered something like just sticking all of the data in s3?

Might be awkward to deal with things like updating threads, but just editing the s3 file containing the json response for a given thread might work, and then you can skip having a database entirely.


Oh interesting. I think like you said updates might be a little bit tricky as well as search. I'm using full text search from Postgres so that would be tough on S3.

I'm doing some aggressive caching with Vercel and Nextjs which essentially turns in to putting the data in S3.
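
For readers wondering what full text search in Postgres looks like, here is a minimal sketch. The table `messages` and its columns `id` and `body` are made up for illustration, and `buildThreadSearchQuery` is a hypothetical helper; only `to_tsvector`, `plainto_tsquery`, `ts_rank` and the `@@` match operator are real Postgres features. The returned `{ text, values }` object is the parameterized-query shape that clients such as node-postgres accept.

```typescript
interface SqlQuery {
  text: string;      // SQL with $1, $2 placeholders
  values: unknown[]; // parameters, passed separately to avoid SQL injection
}

// Hypothetical helper: builds a ranked full text search over a made-up messages table.
function buildThreadSearchQuery(term: string, limit: number = 20): SqlQuery {
  return {
    text: `
      SELECT id, body,
             ts_rank(to_tsvector('english', body), plainto_tsquery('english', $1)) AS rank
      FROM messages
      WHERE to_tsvector('english', body) @@ plainto_tsquery('english', $1)
      ORDER BY rank DESC
      LIMIT $2`,
    values: [term, limit],
  };
}
```

Computing `to_tsvector` per row is fine for a sketch; at any real volume you would typically store the vector in its own column and put a GIN index on it. This is also the part that is hard to replicate with flat JSON files in S3.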


You can do text search and other SQL queries in S3 data by plugging AWS Athena to it.

It isn't very scalable, though. Last time I checked it would handle a few dozen concurrent queries at most.

But for fulltext search, what really makes sense is something like OpenSearch.


Ah, didn't think about search, makes sense. Yeah, agree the benefits of caching are roughly equivalent to storing in S3.


Is there a how to or any documentation you followed to do this? I wanted to do something where I consumed all the messages but didn't really see how in my brief documentation reading.



Like @charcircuit said, those were the meat of it. The main part is that you have to iterate through the conversations.history API. The syncing portion is definitely the most complicated part of the code base.
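
A minimal sketch of that iteration, for anyone trying the same thing. It assumes the cursor-based pagination that `conversations.history` uses (`response_metadata.next_cursor` in each response). `fetchFullHistory` and the `FetchPage` callback are hypothetical: the callback stands in for the real HTTP call so the loop can be shown and exercised without a network or a token.

```typescript
// Only the response fields this sketch reads.
interface HistoryMessage {
  ts: string;
  text?: string;
  thread_ts?: string;
}

interface HistoryPage {
  ok: boolean;
  messages: HistoryMessage[];
  has_more?: boolean;
  response_metadata?: { next_cursor?: string };
}

// Hypothetical callback: performs one conversations.history request for the given cursor.
type FetchPage = (channel: string, cursor?: string) => Promise<HistoryPage>;

async function fetchFullHistory(
  channel: string,
  fetchPage: FetchPage
): Promise<HistoryMessage[]> {
  const all: HistoryMessage[] = [];
  let cursor: string | undefined;
  do {
    const page = await fetchPage(channel, cursor);
    if (!page.ok) {
      throw new Error(`history request failed for channel ${channel}`);
    }
    all.push(...page.messages);
    // An empty or missing next_cursor means this was the last page.
    cursor = page.response_metadata?.next_cursor || undefined;
  } while (cursor);
  return all;
}
```

Note that `conversations.history` returns the messages posted in the channel itself; replies inside a thread are fetched separately with `conversations.replies`, using the root message's `ts`. A real backfill would also need to respect rate limits, which this sketch leaves out.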


Has anyone considered building something like this for Matrix?

There are lots of open-source communities there, I imagine it would be helpful for debugging (and discovery of Matrix itself).


Ugh, this is a guaranteed way to drive me out of any service that uses this. If it’s your intent to create an unsafe environment, you’re succeeding admirably. Google indexing is the biggest contributor to making the web unsafe for communities, and I’ve been dreading someone coming up with this for years.


It would be good if you explained your stance.


Congratulations on the launch! It looks helpful and easy to use.

Two bits of feedback:

1. The threading seems off in places. I looked at the linked osquery site, and there were some top-level items that seemed like replies. Maybe that's just a result of people replying in a channel instead of a thread, though.

2. I see character entities in code sections, instead of their corresponding characters. For example, an XML snippet has tag names surrounded by &lt; and &gt; instead of < and > (example: https://osquery.fleetdm.com/t/5378/Anyone-can-point-me-to-so...).


1. Yes it's because someone posted the reply in the channel instead of a thread. I only sync threads at the moment. I have some plans on tweaking the UX to fix that issue and potentially some heuristics to figure out if it is talking about the same subject.

2. Oh good catch. We just rewrote our message rendering so there are some bugs we have to fix.
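
For context on the second point: Slack escapes three characters in message text (`&`, `<` and `>` become `&amp;`, `&lt;` and `&gt;`), so a renderer has to decode them before display. A minimal sketch of that decoding step follows; `unescapeSlackText` is a hypothetical name and this is not the fix Linen shipped.

```typescript
// Decode the three entities Slack uses for control characters in message text.
function unescapeSlackText(text: string): string {
  return text
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    // Decode "&amp;" last so that a literal "&amp;lt;" ends up as "&lt;", not "<".
    .replace(/&amp;/g, "&");
}
```

For example, `unescapeSlackText("&lt;config&gt;a &amp;&amp; b&lt;/config&gt;")` yields `<config>a && b</config>`. The decoded string should then be rendered as text, not injected as raw HTML; React escapes text children on its own, which is likely how the entities ended up escaped twice.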


Thanks for letting us know! I'll fix it today.


Supporting Discord would be really neat.


Is there anything similar for Discord?


Yes, I am actively working on it and will have a version ready sometime this Friday. Discord has been taking longer than Slack since their API isn't as intuitive and their threading model is a bit trickier, since not everyone uses Discord threads. If you sign up on Linen.dev, I'm happy to reach out once Discord is ready to get some feedback.


Congrats on the launch, Kam! Linen is awesome. We'd love to get this setup for Fig's Discord community :)


Really like the idea, do you have any worry of Slack blocking your API access if they see you as a competitor?


I truly hope you succeed in getting as many communities as possible. This adds a lot of value to everyone.


If you could add GitHub-like Notifications, where the user has to explicitly mark a message as read, that would be golden.

I can't count how many messages I have missed on Slack because Slack assumed that I read them when in fact I did not.


This is a nice looking product and looks valuable. It pains me to watch all these communities exist in such ephemeral places like slack or discord though. Forums still work and are awesome!


This is fantastic! I'd absolutely love it if Discord were also supported, so many communities there that are non-googleable


This is absolutely incredible. Would definitely love to see Mattermost support too.


I really hope you succeed, but I think this breaches the Slack ToS


I have a use case you may be interested in. Can we connect?


Sure, what's the best way to reach you? If you sign up on Linen I can also reach out via email.


ngl this seems really cool. Does this mean we can access chats a year old?


Is this GDPR compliant? Both moving user data to what I assume are American servers and preserving users' ability to delete their personal data seem problematic if Slack channels are being scraped by this. Honestly it also looks like this just breaks Slack's TOS.


This is genius.



