Hi srameshc, the core of BentoML is still considered MLOps. A lot of our customers are pretty much MLOps users. However, LLMOps seems like a natural progression of the product, given that a lot of our users now want to experiment/build with LLM-based services.
You can pretty much make the same argument about Docker. Docker abstracts away runc, runc abstracts away cgroups.
I don't think calling it an abstraction is correct. OneDiffusion is designed to be opinionated and to help users run diffusion models easily. It includes best practices, default options, and optimisations baked in, so that developers can move faster.
You can also make that argument about debugging any AI application in production. If you were just writing a simple web server to run this model behind an API, there is a plethora of things to think about: scaling, resource utilisation, load balancing, batching, etc. Hence the argument for creating a standardized interface for these types of serving problems.
We haven't even touched k8s, throughput, latency, or serving optimisation, but you get the idea of building this from scratch versus using a library such as OneDiffusion.
Of course we are also working to improve the library and include more features, so looking forward to hearing more constructive feedback!
I also think Docker is an excellent comparison, but the way I see it, it actually reinforces my point.
Docker does not change the API. I can take any simple program and run it with Docker, no modifications needed. That also implies that Docker is fully optional. If I ever need to, I can just run the program outside of Docker. Or I could run gdb inside the Docker container, so it adds no complexity to debugging.
OneDiffusion is exactly the opposite. To use it, I need to rewrite my app for its API. Once done, I'm 100% locked in and it won't work without the framework anymore. And if there are any issues, I always have to check all OneDiffusion source code, too, because it is impossible to know a priori if the issue was caused by my code or by the framework.
Just imagine if Docker required you to recompile your OS from scratch for every update: only hardcore Gentoo fans would use it. But that's the level of commitment that OneDiffusion asks of me.
Hi all, I'm the main maintainer from the OpenLLM team here. I'm actively developing the fine-tuning feature and will release a PR soon enough. Stay tuned. In the meantime, the best way to track the development workflow is on our Discord, so feel free to join!!
Thanks for the great project! Any chance your team might consider a more open platform than Discord for posting updates? I personally find Discord hard to use, and there’s no way to have a sensible subscription (like RSS). Discord is usually muted.
Discord is a black hole where information goes to die. Its search and scrollback are awful. It's awful at being an archive, as finding anything that was asked more than a day or two ago is impractical.
To use Discord in good faith and with open eyes, you have to prioritize communication in the present, and give up hope of archiving anything that was said for people who might need the information in the future.
Discord is just a rich IRC replacement. You can log and search in IRC too, but nobody seriously tries to archive information for research later. And the big difference is that it's all closed and operated by one entity that can change conditions at will. Don't even try to use it for anything other than real-time chat.
That's only half true. Yes, Discord does allow a "rich" chat experience, with channels and servers, but there the similarities end.
IRC is based on an open protocol, with many open source clients available for it, and a decentralized server infrastructure.
Discord is closed and centralized, with only a single client available for it.
You can easily log IRC channels, but there is no easy way to do that on Discord, if it can be done at all.
I've logged every channel I've ever visited on IRC, and I can use powerful text tools to regex search through all of my conversations on IRC and have the results appear instantly. Nothing remotely like that is possible with Discord.
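For anyone curious what "regex search through all of my conversations" can look like in practice, here is a minimal sketch. It assumes a hypothetical layout of one plain-text `*.log` file per channel in a single directory; the function name `search_logs` and that layout are illustrative only, and real IRC clients each have their own log formats.

```python
import re
from pathlib import Path
from typing import Iterator, Tuple


def search_logs(log_dir: str, pattern: str) -> Iterator[Tuple[str, int, str]]:
    """Yield (file name, line number, line) for every log line matching pattern."""
    regex = re.compile(pattern)
    # Assumption: one plain-text *.log file per channel directly under log_dir.
    for path in sorted(Path(log_dir).glob("*.log")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield path.name, number, line.rstrip("\n")
```

Calling `list(search_logs("logs", r"GPU|CPU"))` would return every matching line with its file and line number, which is roughly what `grep -n` over the same directory gives you.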
Paging through IRC logs is virtually instant on a modern terminal, while Discord makes you wait a long time between every other page load, so if you need to look through more than a handful of pages it's incredibly slow and painful.
Some IRC channels have their logs published on the web, making them fully searchable through web search engines, but to my knowledge no Discord channels do that.
For gaming communities (where you'd use voice chat), Discord was great. Easy to set up, free as in beer, runs in the cloud. The alternatives back in the day did not have these features. They were either expensive (Ventrilo), or bad quality (Ventrilo and Skype latency/quality), or proprietary (only Mumble wasn't; TeamSpeak, Ventrilo, etc. were), or lacked community features (Ventrilo), or had very archaic ones (TeamSpeak, Mumble), or you'd have to self-host (all but Ventrilo). It was also before GDPR existed. So Discord happily used and abused that unique position.
It's a shame it's being used for general communities that don't use or need the voice chat feature. Especially when it's an official community for a place, given their stance on third-party clients and privacy issues.
If you don't need voice chat, Zulip, Mattermost, Revolt, Discourse, and many others would suffice (Linen recently got featured on HN). If you do, I think even Signal would suffice these days.
For Discord search, Answer Overflow was recently featured on HN [1].
They stem words aggressively, so searching for "repeater", which is a less common, specific term, gives you results including "repeat", a commonly used word.
And there's no way to do an exact word search.
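To make the stemming complaint concrete, here is a small sketch contrasting an exact whole-word search with a stemmed one. The `crude_stem` function is a deliberately naive, hypothetical stemmer written for this illustration; it is not the algorithm Discord or any real search engine uses.

```python
import re
from typing import List


def exact_word_matches(messages: List[str], word: str) -> List[str]:
    """Return messages that contain `word` as a whole word (case-insensitive)."""
    # \b anchors at word boundaries, so "repeater" does not match "repeat".
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [m for m in messages if pattern.search(m)]


def crude_stem(word: str) -> str:
    """Hypothetical stemmer: strip one common suffix if enough of the word remains."""
    for suffix in ("ers", "er", "ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def stemmed_matches(messages: List[str], word: str) -> List[str]:
    """Return messages containing any token that shares a stem with `word`."""
    target = crude_stem(word.lower())
    return [
        m for m in messages
        if any(crude_stem(t) == target for t in re.findall(r"[a-z]+", m.lower()))
    ]
```

With the messages "Can you repeat that?" and "The repeater is offline.", the stemmed search for "repeater" returns both, while the exact search returns only the second.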
FYI, specifying the nfpr=1 query string parameter will disable Google's idiot attempt to be helpful by searching for something other than what you meant to link to.
When I click this, Google gives me results for “how to use openlm”, a commercial product. They literally change your search term if there’s a product that fits.
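A minimal sketch of building such a link follows. The helper name `google_search_url` and the example query are assumptions for illustration; the `nfpr=1` parameter is taken from the comment above, and its behaviour is whatever Google currently implements.

```python
from urllib.parse import urlencode


def google_search_url(query: str, no_rewrite: bool = True) -> str:
    """Build a Google search URL, optionally adding nfpr=1 as described above."""
    params = {"q": query}
    if no_rewrite:
        # nfpr=1 is the parameter named in the comment; not verified here.
        params["nfpr"] = "1"
    return "https://www.google.com/search?" + urlencode(params)
```

For example, `google_search_url("how to use openllm")` produces a link with both `q` and `nfpr` set, which can be pasted into a comment instead of the plain search URL.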
Related: As an operator/mod/admin it's fairly straightforward to bridge a Discord channel to Matrix (and, if one so desires, from there to IRC), allowing users not on Discord to participate. Conservative mods concerned about spam can start with an allowlist for which servers can join.
What an insulting interview question, I hope it was just in jest or at the end looking to pad the time
However, it did make me realize hidden therein is an actual interesting interview question, similar to the "describe what happens when you type an address into the browser's URL bar and hit enter": describe what happens after you type `:s/foo/bar` and hit enter. Followup version: what about `:%s/foo/bar`? The kind of thing that can be interesting to watch them reason through even if they don't know the answer, or even know what those syntaxes do.
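For readers who don't know the two commands: with default settings, `:s/foo/bar` replaces the first match on the current line only, while `:%s/foo/bar` does the same on every line of the buffer (neither replaces later matches on a line without the `g` flag). The sketch below models just that range behaviour in Python; the function name and the 0-based `current` index are illustrative, and Python's regex syntax is not the same as Vim's.

```python
import re
from typing import List


def substitute(lines: List[str], pattern: str, replacement: str,
               current: int, whole_buffer: bool = False) -> List[str]:
    """Model :s (current line only) and :%s (every line), without the g flag."""
    targets = range(len(lines)) if whole_buffer else [current]
    result = list(lines)
    for index in targets:
        # count=1 mirrors the missing g flag: only the first match per line changes.
        result[index] = re.sub(pattern, replacement, result[index], count=1)
    return result
```

Given the buffer `["foo foo", "foo"]` with the cursor on the first line, the `:s` form yields `["bar foo", "foo"]` and the `:%s` form yields `["bar foo", "bar"]`.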
I used to play EVE Online a fair bit, and always thought it interesting how some of the groups used Discord but only for text communications. Voice was still done over TeamSpeak or Mumble.
When I played EVE, Mumble was the de facto choice for voice comms, since it supported hundreds of pilots at once, which happened many times during joint ops. XMPP was used for text chat and pings.
My understanding from asking several people, since I hate Discord and want to know why people insist on using it, is that it’s a free alternative to Slack. Simple as that.
But it’s crazy, people are aggressive about Discord for some reason. I maintain an OpenAI SDK package for .NET, and I had some random person decide they wanted it to have a Discord community, so they created a Discord server claiming it was the official community Discord for my library, and submitted a PR updating my README to say that it’s my project’s official Discord. They also replied to several issues and pull requests telling people to discuss them on that Discord. If Discord isn’t paying this person in some guerrilla marketing tactic, they should be...
People didn't put documentation in IRC channels because they didn't want to answer the same questions over and over. Info went into a wiki, and you would get flamed for asking a question on IRC that was answered on the wiki. Discord is not a good place to stash documentation.
In my experience, I don’t think I’ve ever seen an IRC log in a search result.
#haskell on Libera is publicly logged, but I couldn’t get Google to return a quoted phrase from a message a few weeks ago.
Many people on IRC don’t enjoy being in logged channels. I’ve also heard that there are GDPR implications to publicly logging people’s messages without their consent.
Discussion of the difficulty and downsides of IRC logging, from a couple of years ago:
No, it's not. If you work on an open-source / open-development project, it totally makes sense to avoid walled gardens for the community chat/forum (a few years ago it was public Slack instances, nowadays it's Discord servers).
Can you show me how to access the archives of the ask-for-help channel on the OpenLLM Discord server? Right now they're discussing "loading models on CPU vs GPU". No matter how explicit I got, Google did not find the discussion.
Isn't there something really nice about it though?
It seems to me that most every community gradually evolves into one where every new message from a new-ish member is answered by something like "Duplicate, please search first!". And this in turn makes those newcomers either go away, become passive lurkers, or become part of the "hive-mind" (as only likeminded questions get answered).
On the other hand, if people have to actually converse to get an answer to their questions (like back in the real world), newcomers can more rapidly become part of the community, and help make it more diverse.
LLMs could indeed address the first part, but not the second, of bringing the newcomers in via actual conversation with the older members.
The only good solution I have encountered is having one or more members (preferably not too experienced) actively take it upon themselves to welcome newcomers and answer their questions, whether in an official or unofficial capacity.
This to me is the real way through this "Eternal September", where in every "cohort" of newcomers, one or more choose to stay close to the doorway to welcome and guide the next cohort.
The best of both worlds - a friendly community that welcomes newbies, with a searchable archive - is possible. Limiting to only chat-based support means that support is bottle-necked by the folks who are available and engaged at the time of the question, and that knowledge will "drop out" of the community as people forget it.
Apologies for my skepticism, but is it just "possible", or do you actually have an example of a long-lived community that remained fully welcoming to newbies while utilizing a searchable archive?
In any case, I'm not arguing that it's impossible, but rather that the more comprehensive the archive, the less welcoming the community would tend to be, all other things being equal. To take it to the extreme, I'll posit the following law: "A well-curated archive is the grave of a community"
Hard disagree. If anything you'll find that the most knowledgeable members get burned out answering the same questions over and over again, so they begin to simplify their answers until they just become copy pasta.
You can still have channels open to welcoming new people while at the same time having a large archive of answered questions so that over time a reservoir gets built.
Saying that the same questions getting asked over and over again by new people is somehow a more welcoming community, is like saying that there's any meaningful interaction happening when two people say "What's up?" followed by the response "not much". It's a handshake protocol equivalent without actual depth.
A very reasonable question, and I'll admit that I'm not deeply-entrenched in enough technical communities to give you an actual example. But yeah, intuitively I do agree with the sibling commenter - a well-curated archive is a tool of technology which allows skilled respondents to preserve their time and energy for new and interesting questions. A pointer to search is not necessarily dismissive - there is a world of difference between the following _technically_ equivalent responses:
* FFS, read the fuckin' archive noob and stop wasting our time
* Hey there, thanks for asking! This is actually a pretty common question, and we have guides written up for just this case. Try entering some of your search terms here [link], and come back with a follow-up question if that doesn't help you!
But yes, in fairness, I'll certainly agree that a community which _chooses_ to respond as the former will stagnate and die.
short answer: because it's one of the options with least friction to get running
a lot of people who are into tech stuff already have a discord account making joining the community a one click process, the instant nature of it seems to appeal to younger users more than async forums, it's a fairly mature platform so it has a bunch of moderation/customization/integration features you might want, etc.
> are discord conversations persisted and indexed on search engines ?
I've used IRC for a long time and still do, but I do think Discord has a nicer UX for most use cases. In particular, building communities around clusters of channels ("servers") and support for rich media (yes, some old people might call that a downside) increase the appeal for most people. It's also a lot more work to have a persistent connection on IRC (bouncers).
My main problem with Discord is that it's someone else's centralized, for-profit company and has no apparent barriers to enshittification[0]. As Reddit recently demonstrated, it's probably a mistake to build communities on top of something like that.
Matrix is a good candidate for a modern successor to IRC. It's not quite as slick a UX as Discord, but it addresses the main advantages Discord has over IRC.
Practically all of my friends grew up with IRC, we are in our late 30s, 40s, early 50s.
We might reminisce about IRC, but we all prefer Discord.
Even the searchability of indexed IRC has been surpassed by other knowledge sites. It would have to be something extremely niche these days for the only source of info to be an IRC chat log.
IRC doesn't even have history, one of the most basic requirements for a modern rudimentary chat app. It's ridiculous to suggest using it in 2023 when it doesn't have features a freshman homework assignment chat app has.
Very cool. BTW, it's not mentioned in the README, so I assume it's only for running full-precision models. Or do quantized GGML/GPTQ/etc. models also work with it?
GPTQ support would be amazing (AutoGPTQ is an easy way to integrate GPTQ support - it's basically just importing autogptq and switching out 1 line in the model loading code).
BentoML is an amazing tool that allows you to quickly and easily deploy your machine learning models as APIs. It is extremely user-friendly and easy to use, and the results are amazing. I highly recommend BentoML for anyone who wants to deploy their machine learning models quickly and easily.