Why? Because open source communities are on the free plan, which limits search once you have 10k messages. I've had experiences where I wanted to revisit a question I had asked in a Slack channel the previous week, and was unable to find it.
As a result, everyone burns out faster because the same questions get asked and answered over and over.
Couple this with the fact that channels are not indexed by Google and you get a black box where valuable Q&A content and discussion go to die.
Just use IRC. It's practically impossible to avoid Slack at any startup now, but I'd love to be able to avoid it in FOSS.
I can finally have a single platform for communication. Voice chat, text chat, group chats, friends list, async communication, unlimited logs (no 10k max msg nonsense), webhooks/integrations that let me do far more than IRC bots ever did. All of it under one account. Oh and the client doesn't suck, unlike Slack's. It's fast. The voice quality is superb.
As far as productivity goes, I get far more done with it than I ever did with IRC. The addition of being able to hop on voice very quickly is insanely good. Screensharing and video chat coming this year as well, I'm pretty excited.
It's to the point that I bought Discord Nitro (their premium offering) the day it was released, for no other reason than to give them money.
I hope the question of protocol openness gets resolved; until then, IRC just doesn't cut it for me anymore. IRCCloud.com helps, but their interface is super slow with lots of channels and IRC itself simply has no support for the thousands of improvements that have been made in communications the past 30-something years.
Short of video calls though, Discord is essentially a drop-in replacement for Slack. We've been using it at my company, and it works so damn well. I moved to it for our open source community too. I use Matterbridge for a three-way mirror with IRC and Gitter as well: https://github.com/42wim/matterbridge/
You can use Discord any way you want. If you join some popular Discord server, odds are it'll be full of spam and Internet humor, but obviously you can do whatever you like on your own server(s).
> I'd be forced to use their awful client
Who forces you? You can use Slack on the web, can't you? You don't need to have yet another browser engine running on your computer.
> Just use IRC.
Please don't … IRC is the opposite of user friendly: it has no good web interface, so the casual user won't come in because they don't want to install and learn new software (an IRC client). But Slack isn't the only option here, and it's not even the best option by far: Gitter, Mattermost and Discord are alternatives to IRC which aren't Slack.
Mattermost: https://about.mattermost.com/, they don't provide chat hosting but several organisations do host Mattermost servers (Framasoft for instance https://framateam.org/)
Discord: https://discordapp.com/ targeted at gamers, which is a good sign of quality, but fine for general purpose use.
Slack's web client is still slow. Their "native" client is just a Chrome wrapper over their web app, with some glue code to hook up notifications.
> it has no good web interface so the casual user won't come in
https://webchat.freenode.net works fine. If you need or want help with a project and can't be bothered to spend a few moments opening up an IRC client (native or web), then I don't know what to tell you.
> Gitter, Mattermost, and Discord
Gitter and Discord are both closed source, proprietary systems made by companies who want to make money. Open source projects shouldn't rely on them if possible.
Mattermost is OK, but requires running your own server. IRC is free, and there are public servers specifically made for open source projects.
Curious if this statement is about the most recent experience post improvements mentioned in tfa?
But we still direct people to Stack Overflow since the Q&A is more discoverable there.
Which maybe should be a challenge to anybody looking to build the next generation of knowledge repositories.
I've repeatedly considered writing a bot that would—on a regular schedule—poke this page with a headless browser to generate a dump, download said dump, and ingest it into ElasticSearch (which I'd then expose through a web search, or maybe just spit out batched archive pages into a static-site S3 bucket and let Google index them.) Such a bot would be a good companion to https://github.com/rauchg/slackin for FOSS teams.
But I haven't done any of that yet, because I get the feeling that putting enough attention on this little "feature" would get it quickly locked down.
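For anyone who wants to picture the "batched archive pages" half of that idea, here is a minimal sketch in Python. It assumes the dump has already been parsed into a list of dicts with `ts`, `user`, and `text` keys; that shape is hypothetical and is not Slack's actual export format. It only builds the page contents and leaves the upload to a static bucket out.

```python
import html
from collections import defaultdict
from datetime import datetime, timezone


def render_archive_pages(messages):
    """Group messages by UTC day and return {filename: html_text}.

    `messages` is an assumed shape: dicts with "ts" (Unix seconds),
    "user", and "text". It is not any real chat export format.
    """
    by_day = defaultdict(list)
    for msg in messages:
        day = datetime.fromtimestamp(msg["ts"], tz=timezone.utc).strftime("%Y-%m-%d")
        by_day[day].append(msg)

    pages = {}
    for day, day_msgs in sorted(by_day.items()):
        # Oldest first within a page, so a conversation reads top to bottom.
        ordered = sorted(day_msgs, key=lambda m: m["ts"])
        rows = "\n".join(
            "<p><b>{}</b>: {}</p>".format(html.escape(m["user"]), html.escape(m["text"]))
            for m in ordered
        )
        pages[day + ".html"] = "<!doctype html>\n<title>Archive {}</title>\n{}\n".format(day, rows)
    return pages
```

One page per day keeps each URL stable once the day is over, which is probably what a crawler would want, but any batching key would do.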
GitHub and Slack provide a huge amount of utility. But they also feel hollow to me. It feels harder and harder to opt-out of using closed tools.
Someone in your project can manage to set up a logbot that dumps logs onto a webserver, which will be indexed by Google. I suspect there are services that will do it for you, so you might not even have to set up the bot yourself. If there isn't one I'd have half a mind to build one, if it gets more projects using IRC.
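To give a rough idea of how little such a logbot needs, here is a sketch of just the formatting step in Python. It assumes the bot already receives raw lines from the server and only handles ordinary channel messages (`PRIVMSG` to a `#channel`); the function name and log format are made up for illustration, and connecting, joining, and writing files are left out.

```python
import re
from datetime import datetime, timezone

# Raw IRC channel message shape: ":nick!user@host PRIVMSG #channel :message text"
PRIVMSG_RE = re.compile(
    r"^:(?P<nick>[^!\s]+)!\S+ PRIVMSG (?P<target>#\S+) :(?P<text>.*)$"
)


def format_log_line(raw_line, when):
    """Return a plain-text log line for a channel message, or None for anything else.

    Hypothetical helper: the output format is an arbitrary choice, not a standard.
    """
    match = PRIVMSG_RE.match(raw_line.rstrip("\r\n"))
    if match is None:
        return None
    stamp = when.strftime("%Y-%m-%d %H:%M:%S")
    return "[{}] {} <{}> {}".format(stamp, match["target"], match["nick"], match["text"])


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(format_log_line(":alice!a@example.org PRIVMSG #proj :how do I build?", now))
```

Appending those lines to one text file per channel per day and serving the directory is enough for a search engine to pick it up.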
Usually when I'm searching, I'm looking for a particular message, possibly even one I read earlier that day, and I may know a few things about it, like who sent it and that it had an important link, but I still can't necessarily find it! The results are also presented in a giant cartoony way that makes me page through many pages. Tokenizing my search into "keywords" means that even if I know a substring of what I'm looking for, it doesn't come up as relevant, or the tokenizer tokenized the text differently. This is also why GitHub search can't find a lot of things.
What I would want in a search experience is the equivalent of Control-F over the list of messages I've actually seen.
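The difference I mean can be shown in a few lines of Python. This is a toy comparison, not how Slack or GitHub actually tokenize: `token_match` stands in for keyword search by requiring every query word to equal a whole word of the message, and `substring_match` is the Control-F behavior.

```python
import re


def token_match(query, message):
    """Keyword-style match: every query token must equal a whole message token."""
    tokens = set(re.findall(r"\w+", message.lower()))
    return all(tok in tokens for tok in re.findall(r"\w+", query.lower()))


def substring_match(query, message):
    """Control-F style match: the query appears anywhere, case-insensitively."""
    return query.lower() in message.lower()


if __name__ == "__main__":
    msg = "Try setting max_connections in the config"
    # "connect" is only part of the token "max_connections", so the
    # keyword-style match misses it while the substring match finds it.
    print(token_match("connect", msg), substring_match("connect", msg))
```

Real engines add stemming and n-grams to soften this, but the failure mode is the same whenever the tokenizer splits the text differently from how you remember it.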
As for the Control-F thing... stay tuned on that too :)
It'd be cool to highlight a piece of information and insert it into a wiki-style site.
Feel free to email me with any questions too - email@example.com
Which was very helpful to me.
Any book where you learn at least one thing new is always worth it, so I do not regret having this book in my library.
For those that want to read more: http://opensourceconnections.com/about-us/doug-turnbull/ or really anything at http://opensourceconnections.com/blog/
I tried to see your talk at last year's Elastic{ON}, but it was packed due to the small room. Not surprising, since Elasticsearch tends to downplay search in favor of analytics/logging. So few talks regarding pure search.
Lucene/Solr/Elasticsearch are nice, but they need competition, especially outside the Java world.
I get the competition part, but none of the above are exactly stagnant, so I'm wondering what you'd like to see more competition achieve.
Not trying to be difficult, just curious in case I missed something from your comment :)
There's a lot that competition can help improve, for example in the areas of performance, robustness, and also in functionality, e.g. better NLP for better understanding of queries and translating them into results, image/audio search, etc. And competition can also come up with surprising new features that we can't even think of right now.
This, plus it is (imho) quite weird that we have only one source of code for one of the basic branches of Computer Science.
In the end, ES really proved to be the least bad search server out there. The real crux isn't search, it's language. And as Lucene is made by technical linguists (a really rather special bunch) and Java is still universities' darling, it's unlikely their effort can be redone in a non-JVM language anytime soon.
A suggestion: When I search, what I want 99% of the time to happen is that the current window I'm looking at quickly gets filtered to my search query. Ranking doesn't matter, just show exact matches ranked by time.
1% of the time, I want something else.
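In code terms the 99% case is tiny. A sketch in Python, assuming the window's messages are dicts with `ts` and `text` keys (a made-up shape) and reading "ranked by time" as newest first, which is my assumption:

```python
def filter_window(messages, query):
    """Keep only messages containing the exact query text, newest first.

    No relevance scoring at all: a message either contains the text or it doesn't.
    """
    needle = query.casefold()
    hits = [m for m in messages if needle in m["text"].casefold()]
    return sorted(hits, key=lambda m: m["ts"], reverse=True)
```

Everything else (fuzzy matching, relevance ranking) could live behind a separate mode for the remaining 1%.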
This is like complaining that Ford sells cars that don't let you fly.
If you want something that flies, buy a plane.
I hadn't set it up to sync to GitHub as it was just internal development, but I've started the sync and it will show up at https://github.com/wikimedia/search-ltr soon.
It's got a bit more of the integration with Elasticsearch put together, including storing models in cluster state and a REST interface for managing them. It's a bit more of a direct port of the Solr plugin rather than a rewrite from the ground up, so there are also some oddities that don't yet make sense. Refactors will certainly be done. It's also tied a little less directly to RankLib, such that I can convert and load in MART models trained by LightGBM or XGBoost, which have done pretty well in my offline tests and are able to utilize resources on my training machine much more efficiently than RankLib's LambdaMART (although in terms of results, the RankLib implementation is pretty good).
We store models as a custom scripting language which takes care of distributing the model around the cluster, caching and basic CRUD operations. This was the hard thing to figure out. At first we looked at a REST plugin, but it seemed cumbersome and hard to integrate with the query DSL. But I'm curious how you guys got around those pain points :)
Most of the time, I search for information because I don't know anything about that part of the software. I never found this kind of information in Slack.
Parse the company docs, or our rep, and now we're talking.