McAtNite's comments

One of my favorite people I ever worked with was like this. We could disagree on an approach, I’d call him a moron, he’d tell me to go fuck myself, then we’d laugh and go back to our work.

It was a very different style of work, where politeness, ego, and professionalism weren’t factors. The only focus was the tech and ensuring it was moving as efficiently as possible. You could really push things in whatever direction you needed as long as you had the metrics to show it was better.


That sounds like a toxic relationship and work environment.


Different strokes, I guess. I personally appreciate the straightforwardness of this style versus the normal corporate environment.


You can be straightforward without being crass, rude, or aggressive. That’s a desirable professional skill.


Or you can judge how to approach each interaction individually instead of applying a global veneer of fake pleasantness that US work culture labels as "being professional".


>US work culture labels as "being professional".

You guys realize this person wrote a blog post about not being able to find a job, right?

"This toxic personality is so refreshing? But why can't he find a job?"

What a mystery.


I think, much like the author’s writing, it’s a very polarizing thing: it either clicks with people or it doesn’t. For those it doesn’t click with, I can understand the viewpoint that it’s extremely negative.

I disagree that it’s inherently rude or aggressive. You can absolutely tell someone they’re a moron in a very polite and joking way, but it clearly won’t translate into writing very well.

As for crass, yeah definitely ¯\_(ツ)_/¯


The general gist I’m getting from this comment section is a lack of awareness of DLP (data loss prevention) and security. The things half these comments are complaining about exist for very good reasons.


I’m struggling to understand the point of this. It appears to be a simpler way of getting a local LLM running on your machine, but I expect less technically inclined users would default to using the AI built into Windows while the more technical users will leverage llama.cpp to run whatever models they are interested in.

Who is the target audience for this solution?


This is a tech demo for TensorRT-LLM, which is meant to greatly speed up inference for compatible models.


> the more technical users will leverage llama.cpp to run whatever models they are interested in.

Llama.cpp is much slower, and does not have built-in RAG.

TRT-LLM is a finicky, deployment-grade framework, and TBH having it packaged into a one-click install with LlamaIndex is very cool. The RAG in particular is beyond what most local LLM UIs do out of the box.
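
For anyone curious what driving TRT-LLM directly looks like, here's a rough sketch using the high-level Python API in recent tensorrt_llm releases (older versions made you build engines by hand with trtllm-build; the model name is just an example):

    # Sketch only: assumes a recent tensorrt_llm with the high-level LLM API.
    from tensorrt_llm import LLM, SamplingParams

    # Builds/loads a TensorRT engine for the model on first use.
    llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

    for out in llm.generate(["Why is RAG useful?"], SamplingParams(max_tokens=64)):
        print(out.outputs[0].text)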


>It appears to be a simpler way of getting a local LLM running on your machine

No, it answers questions from the documents you provide. Off-the-shelf local LLMs don't do this by default. You need a RAG stack on top, or to fine-tune on your own content.
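
For reference, a minimal sketch of the kind of RAG stack meant here, using llama-index (import paths vary by version, and by default it calls OpenAI unless Settings.llm / Settings.embed_model point at local models; "docs/" is a hypothetical folder):

    # Sketch only: index a folder of documents, then answer questions from them.
    from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

    docs = SimpleDirectoryReader("docs").load_data()   # ingest local files
    index = VectorStoreIndex.from_documents(docs)      # embed and index them
    engine = index.as_query_engine()                   # retrieval + generation

    print(engine.query("What do these documents say about X?"))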


From "Artificial intelligence is ineffective and potentially harmful for fact checking" (2023) https://news.ycombinator.com/item?id=37226233 : pdfgpt, knowledge_gpt, elasticsearch :

> Are LLM tools better or worse than e.g. meilisearch or elasticsearch for searching with snippets over a set of document resources?

> How does search compare to generating things with citations?

pdfGPT: https://github.com/bhaskatripathi/pdfGPT :

> PDF GPT allows you to chat with the contents of your PDF file by using GPT capabilities.

GH "pdfgpt" topic: https://github.com/topics/pdfgpt

knowledge_gpt: https://github.com/mmz-001/knowledge_gpt

From https://news.ycombinator.com/item?id=39112014 : paperai

neuml/paperai: https://github.com/neuml/paperai :

> Semantic search and workflows for medical/scientific papers

RAG: https://news.ycombinator.com/item?id=38370452

Google Desktop (2004-2011): https://en.wikipedia.org/wiki/Google_Desktop :

> Google Desktop was a computer program with desktop search capabilities, created by Google for Linux, Apple Mac OS X, and Microsoft Windows systems. It allowed text searches of a user's email messages, computer files, music, photos, chats, Web pages viewed, and the ability to display "Google Gadgets" on the user's desktop in a Sidebar

GNOME/tracker-miners: https://gitlab.gnome.org/GNOME/tracker-miners

src/miners/fs: https://gitlab.gnome.org/GNOME/tracker-miners/-/tree/master/...

SPARQL + SQLite: https://gitlab.gnome.org/GNOME/tracker-miners/-/blob/master/...

https://news.ycombinator.com/item?id=38355385 : LocalAI, braintrust-proxy; promptfoo, chainforge, mixtral


> Are LLM tools better or worse than e.g. meilisearch or elasticsearch for searching with snippets over a set of document resources?

Absolutely worse; LLMs are not made for it at all.


It seems really clear to me! I downloaded it, pointed it to my documents folder, and started running it. It's nothing like the "AI built into Windows" and it's much easier than dealing with rolling my own.


This lets you run Mistral or Llama 2, so whoever has an RTX card and wants to run either of those models?

And perhaps they will add more models in the future?


I don't think your comment answers the question. Basically, those who bother to learn the underlying model's name can already run that model without this tool from Nvidia?


It will run a lot faster by using the tensor cores rather than just the standard CUDA cores.


I suppose I’m just struggling to see the value add. Ollama already makes it dead simple to get a local LLM running, and this appears to be a more limited, vendor-locked equivalent.

From my point of view the only people likely to use this are the small slice who are willing to purchase an expensive GPU, know enough about LLMs to not want to use Copilot, but don’t know enough about them to be aware of the existing solutions.


With all due respect, this comment has fairly strong (and infamous) HN Dropbox thread vibes.

It's an Nvidia "product", published and promoted via their usual channels. This is co-sign/official support from Nvidia vs "Here's an obscure name from a dizzying array of indistinguishable implementations pointing to some random open source project website and GitHub repo where your eyes will glaze over in seconds".

Completely different but wider and significantly less sophisticated audience. The story link is on The Verge and because this is Nvidia it will also get immediately featured in every other tech publication, website, subreddit, forum, twitter account, youtube channel, etc.

This will get more installs and usage in the next 72 hours than the entire Llama/open LLM ecosystem has had in its history.


Unfortunately I’m not aware of the reference to the HN Dropbox thread.

I suppose my counterpoint is only that the user base that relies on simplified solutions is largely already addressed by the wide number of cloud offerings from OpenAI, Microsoft, Google, and whatever other random companies have popped up. Realistically, I don’t know if the people who don’t want to use those, but also don’t want to look at GitHub pages, are really that wide an audience.

You could be right though. I could be out of touch with reality on this one, and people will rush to use the latest software packaged by a well known vendor.


It is probably the most famous HN comment ever made and comes up often. It was a dismissive response to Dropbox years ago:

https://news.ycombinator.com/item?id=9224


Thanks for the explanation. I guess my only hope for not looking like I had a bad opinion is people’s inertia about moving beyond Copilot.


> the user base that relies on simplified solutions is largely already addressed

There is a wide spectrum of users for which a more white-labelled locally-runnable solution might be exactly what they're looking for. There's much more than just the two camps of "doesn't know what they're doing" and "technically inclined and knows exactly what to do" with LLMs.


Anyone who bothers to distinguish a product from Microsoft/Nvidia/Meta/someone else already knows what they are doing.

Most users don't care where the model runs, online or local. They go to ChatGPT or Bing/Copilot to get answers, as long as they are free. Well, if it becomes a (mandatory) subscription, they are more likely to pay for it than to figure out how to run a local LLM.

Sounds like you are the one who's not getting the message.

So basically the only people who run a local LLM are those who are interested enough in this. And why would brand name matter? What matters is whether a model is good, whether it can run on a specific machine, how fast it is, etc., and there are objective measures for those. People who run local LLMs don't automatically choose Nvidia's product over something else just because Nvidia is famous.


I'll try again.

Have you ever tried to use ChatGPT alone to work with documents? In terms of the free, ready-to-use product, it's very painful. Give it a URL to a PDF (or something) and, assuming it can load it (it often can't), you can "chat" with it. One document at a time...

This is for the (BIG) world of Nvidia Windows desktop users (most of whom are fanboys who will install anything Nvidia announces that sounds cool) who don't know what an LLM is. They certainly wouldn't know/have the inclination to wander into /r/LocalLLaMA or some place to try to sort through a bunch of random projects with obscure names that are peppered with jargon and references to various models they've also never heard of or know the difference between. Then the next issue is figuring out the RAG aspects, which is an entirely different challenge.

This is a Windows desktop installer that picks one of two models automatically depending on how much VRAM you have, loads it to run on your GPU using one of the fastest engines out there, and then allows you to load your own local content and interact with it in a UI that just pops up after you double-click the installer. It's green and peppered with Nvidia branding everywhere. They love it.

What the Nvidia Windows desktop users will be able to understand is "WOW, look it's using my own GPU for everything according to my process manager. I just made my own ChatGPT and can even chat with my own local documents. Nvidia is amazing!"

> why would brand name matter?

Do you know anything about humans? Brands make a HUGE difference.

> People who run local LLM don't automatically choose Nvidia's product over something just because nvidia is famous.

/r/LocalLLaMA is currently filled with people ranting and raving about this even though it's inferior (other than ease of use and brand halo) to much of the technology that has been discussed there since forever.

Again - humans spend many billions and billions of dollars choosing products that are inferior solely because of the name/brand.


I have no idea what you're talking about and am waiting for an answer to OP's question. Downloading text-generation-webui takes a minute, lets you use any model, and gets you going. I don't really understand what this Nvidia thing adds. It seems even more complicated than the open-source offerings.

I don't really care how many installs it gets, does it do anything differently or better?


> Downloading text-generation-webui takes a minute, lets you use any model, and gets you going.

What you're missing here is you're already in this area deep enough to know what ooogoababagababa text-generation-webui is. Let's back out to the "average Windows desktop user who knows they have an Nvidia card" level. Assuming they even know how to find it:

1) Go to https://github.com/oobabooga/text-generation-webui?tab=readm...

2) See a bunch of instructions opening a terminal window and running random batch/powershell scripts. Powershell, etc will likely prompt you with a scary warning. Then you start wondering who ooobabagagagaba is...

3) Assuming you get this far (many users won't even get to step 1) you're greeted with a web interface[0] FILLED to the brim with technical jargon and extremely overwhelming options just to get a model loaded, which is another mind warp because you get to try to select between a bunch of random models with no clear meaning and nonsensical/joke-sounding names from someone called "TheBloke". Ok... Oh yeah, what's a "model"? GGUF? GPTQ? AWQ? Exllama? Prompt format? Transformers? Tokens? Temperature? Repeat for dozens of things that are familiar to you but meaningless to them.

Let's say you somehow braved this gauntlet and get this far now you get to chat with it. Ok, what about my local documents? text-generation-webui itself has nothing for that. Repeat this process over the 10 random open source projects from a bunch of names you've never heard of in an attempt to accomplish that.

This is "I saw this thing from Nvidia explode all over media, twitter, youtube, etc. I downloaded it from Nvidia, double-clicked, pointed it at a folder with documents, and it works".

That's the difference and it's very significant.

[0] - https://raw.githubusercontent.com/oobabooga/screenshots/main...


It's a different inference engine with different capabilities. It should be a lot faster on Nvidia cards. I don't have comparative benchmarks against llama.cpp, but if you find some, compare them to this.

https://nvidia.github.io/TensorRT-LLM/performance.html https://github.com/lapp0/lm-inference-engines/


It brings more authority than "oh just use <string of gibberish from the frontpage of hn>"


That tells you how it might affect people's perception of it, not whether it's better in any way.


Sure, it's just disingenuous to pretend that authority doesn't matter.


Disingenuous to what? I'm asking what it brings to someone who can already use an open-source solution. I feel like you're just trying to argue for the sake of it.


I just looked up Ollama and it doesn't look like it supports Windows. (At least not yet)


Oh, my apologies for the wild goose chase. I thought they had already added support for Windows. It should be possible to run it through WSL, but I suppose that’s a solid point for Nvidia in this discussion.


I think there's a market of users who are not very computer savvy but at least understand how to use LLMs, and who would potentially run a chat model on their GPU, especially if it's just a few clicks to turn on.


There are developers who fail to install Ollama/CUDA/Python/create-venv/download-models on their computer after many hours of trying.

You think a regular user has any chance?


Not really. I expect those users will just use Copilot.


You are forgetting about developers who may want to build on top of something stable and with long-term support. That's a big market.


Would they not prefer to develop for Copilot? In comparison this seems niche.


>people who are willing to purchase an expensive GPU,

A codeword for people who have hardware specialized for and suited to AI.


Gamers who bought an expensive card and see this advertised to them in Nvidia's GeForce app?


Does Windows use the PC's GPU, or just the CPU, or the cloud?


If they are talking about the Bing AI, it's just using whatever OpenAI has in the cloud.


I’m referring to Copilot, which for your average non-technical user, who doesn’t care whether something is local or not, has the huge benefit of not requiring the purchase of an expensive GPU.


Never underestimate people's interest in running something which lets them generate crass jokes about their friends or smutty conversation, when hosted solutions like Copilot could never allow such non-puritan morals. If this delivers on being the easiest way to run local models quickly, then many people will be interested.


The immediate value prop here is the ability to load up documents on the fly for the model to draw on. Six months ago I was looking for a tool to do exactly this and ended up deciding to wait. Amazing how fast this wave of innovation is happening.


Windows users who haven't bought an Nvidia card yet


I’ve used it in the past and the results were pretty decent. I strongly recommend hardwiring everything to avoid stutters. I tried it over Wi-Fi a couple of times, and it worked well for more moderate games, but heavy particle effects and the like would cause the image to turn garbled for a few seconds. Then again, Wi-Fi 6E and the like didn’t exist when I used it, so if your access point has the throughput it might not be a problem anymore.


Those are two very different approaches to Linux. Fedora is far more bleeding edge than Debian, so the question really comes down to which you prefer.

I absolutely love Fedora and run it on my main desktop. That being said, I also cut my teeth on CentOS, so I’ve always had a soft spot for the RHEL approach to Linux. As for the bleeding-edge aspect, I’ve rarely encountered issues with the latest updates. I had to do a bit of troubleshooting with PipeWire and my HDMI output, and once a GNOME update caused the refresh-rate setting on my monitor to bug out and cause a black screen.

None of it was really traumatizing though, and I have my system set to run a DNF update automatically on every login since I like to run as up to date as possible, and generally trust the packages won’t be overly buggy by the time they get pushed live.
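
For anyone wanting something similar, a minimal sketch of one way to wire up a login-time update; this is hypothetical (a small script run from a GNOME autostart entry or a systemd user unit), not a recommendation:

    #!/usr/bin/env python3
    # Hypothetical login hook: refresh metadata and apply updates
    # non-interactively. Assumes pkexec is available to escalate.
    import subprocess

    subprocess.run(["pkexec", "dnf", "upgrade", "--refresh", "-y"], check=False)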

—————————

After looking at your profile I realize you probably already are familiar with the philosophical differences of the two. Leaving the above for the potential benefit of anyone else who isn’t familiar.


This line of thinking only makes sense if you actively use OneDrive. If you don’t, and have no interest in using it, why leave it running? It’s bad enough that they automatically bundle it into Windows, but intrusively forcing a user to explain why they are using their own hardware in a certain way is silly.

I’m going to guess that the set of people who actively close unneeded background programs and the set of users willing to explore alternatives overlap quite a bit. This seems like a really shortsighted decision on Microsoft’s part.


Is it even possible to run Windows anymore without a Microsoft account?

If you only have a local account, OneDrive has nothing to sync to, right?


I’ve seen some guides on arduous ways to get around the Microsoft account requirement, but I can’t speak from experience. I abandoned Windows when the hardware requirements meant I couldn’t upgrade my personal computer to Win11.


> Is it even possible to run Windows anymore without a Microsoft account?

Sure. On Pro you choose Organization/Domain setup and create the local account. Home is harder and the process keeps shifting. There are guides.

> If you only have a local account, OneDrive has nothing to sync to, right?

Mostly correct except if you have a local account and an MS login for Store/Office/etc. It isn't syncing automatically (so far) but could be triggered accidentally.

Another first today: I saw a dot next to my username (Start -> username) that was a naggy notification prodding me to convert to an MS account.


Either install Windows without being connected to the internet, or log in using username: test, pass: fuckmicrosoft. It will fail to log in and offer to create a local account. PS: that password is optional, any password will do, but over time I noticed it is good for mental health.


I’ve had the unpleasant task of regularly interacting with auditors from the SEC in a heavily regulated industry. Trying to get any sort of real, factual guidelines from them is a fool’s errand.

Their auditors’ views and interpretations vary wildly year to year, and they will not provide any written guidelines for their positions. Instead they insist it’s up to the discretion of the individual auditor. I’ve seen fairly reserved people in screaming matches by the end of it all, and concessions eventually get made, which seems strange for a regulatory body.

All that is to say: they won’t backtrack on anything. They’ll simply say that those were the views of those people as individuals, and they are not reflective of the official stance of the regulatory body.


> they will not provide any written guidelines for their positions

They legally can't. That's rulemaking. There are formal processes for government agencies issuing binding guidance.

There is a real problem with ambiguous laws. But asking the SEC to be your lawyer is a bit of a fool's errand. Coinbase absolutely knew they were breaking the law when they set out; they, and the rest of crypto, just hoped they could change it before they got caught.


While I have no experience dealing with the SEC, I do in dealing with EASA and SOX audits. One common theme: both give you the bare-bones rules as written, and it is up to you to figure out how to follow them. In the case of EASA, you define your own processes, and auditors sign off on those. You can hire external consultants to help you, but in the end it is your responsibility.

In the case of SOX, you are required to have one company doing the prep work (processes, controls, internal "audits") with you, while another company does the formal audits and signs off on the balance sheets and financial results. The latter won't give you any guidance either when it comes to how compliance can be achieved.

And in both cases, there are always people who refuse to accept the well-established guardrails and limits, even when those are explained to them. Excuses range from "inconvenient" to "I do not want to" to "but what about innovation". Which is just bonkers, because both SOX and EASA basically allow you to write your own internal rule book, and still people are not happy. It seems following rules is just beneath some people and their egos. Not that auditors care, though.


I'm not doubting anything you're saying, but are there examples of such problematic behaviors in the public record?


Public record being official statements from business or government entities? I’m not aware of any personally.

Really, I feel like I should clarify the above since I wrote it pretty late at night, and I don’t think I really summed the experience up well. Another commenter mentioned their experience with other regulatory bodies that cuts closer to what I intended to say:

“While I have no experience in dealing with the SEC, I do in dealing with EASA and SOX audits. One common theme, both give you the barebone rules as written, and it is up to you to figure out how to follow them.”

That really is my entire complaint. The grey area of what “counts” according to an auditor will vary over time, and what might count one year will suddenly be inadequate the next or vice versa.


The entire cryptocurrency space is filled with genuine grifters and conmen.

It’s absolutely reasonable to doubt anyone making unsubstantiated and unverifiable claims in this space.


When I said problematic behavior, I was referring to this:

> Their auditors views and interpretations vary wildly year to year, and they will not provide any written guidelines for their positions.

That's why I started with the disclaimer that "yes, I know crypto is a scam", but the regulators also seem to be making it difficult to stay compliant.


Ha! So true. And there would be huge savings if they would both simplify things and focus much harder on substance.


Assuming that the regulated entities wouldn’t just move to the hinterlands and edge cases of that newly simplified/clarified/substantiated law in order to skirt it.

Which, of course they would, and would go so far as to create new forms of currency to try to do it.


It absolutely will. I had toyed with the idea of something like this before, but wasn’t sure what the behavior would look like until I found the cryptocurrency subreddit.

They have a system that rewards a monthly amount of crypto based on the number of upvotes you receive, and it surprisingly has actual value you can sell for. It’s largely led to a race to the bottom, where comments and posts largely ignore long-form discussion or accuracy in favor of majority appeal.


This is subtly different though. $1 gets split between all the upvotes a person made. So instead of a post's value being directly proportional to the upvotes it received, each upvote is weighted by how rarely that user upvotes.

In theory I think this would encourage higher-quality posts that attract those who upvote rarely.
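
A toy illustration with made-up numbers, assuming each user's $1 is divided evenly across everything they upvoted that month:

    # Hypothetical: a vote from a rare upvoter is worth far more
    # than a vote from someone who upvotes everything.
    upvoters_of_post = {
        "alice": 2,    # alice cast 2 upvotes this month -> $0.50 each
        "bob": 100,    # bob cast 100 upvotes -> $0.01 each
    }

    payout = sum(1.00 / total_votes for total_votes in upvoters_of_post.values())
    print(f"${payout:.2f}")  # $0.51, with alice's vote carrying most of the value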


I may be wrong, but I believe this is how the subreddit already works. They set a total amount of crypto to be released, and it’s distributed in proportion to the total number of upvotes you receive compared to others in a given time window. I think the approaches are largely identical, with the exception of crypto vs direct fiat.


The difference is that the non.io model rewards more for upvotes from people who have a high bar for upvoting and upvote rarely.


To be clear, this is an entirely different model with a different incentive structure from the other one under discussion. It will not have the same issues. You don't want majority appeal; rather, you want niche appeal. The more your upvoters like primarily your post and nothing else, the more valuable their upvotes become.


I often use AirDrop to transfer small, one-off files like PDFs or images. For larger files I transfer through my NAS. I’ve never had any issues using AirDrop; I wonder if the complaints come from people trying to transfer gigs of data with it.


Thanks for this link. I was actually chatting with someone today about good VPN providers and went to PrivacyTools. I was very confused about why the top recommended VPN was Nord, with an affiliate link.

Your link cleared up my confusion. It’s a shame that the project had to be forked.

