They seize your phone and pull photographs, text messages, messenger logs, maps data, etc. The messages are inculpatory.
What do you want to say? That the extraction method is unreliable, that the Police have incorrect logs? Imply that the text messages have been extracted with errors, somehow?
That simply isn't the case. The data extracted is both reliable and probative. This is a copy and paste.
Also, I couldn't help but be amused by cheekiness:
"By a truly unbelievable coincidence, I was recently out for a walk when I saw a small package fall off a truck ahead of me. As I got closer, the dull enterprise typeface slowly came into focus: Cellebrite. Inside, we found the latest versions of the Cellebrite software, a hardware dongle designed to prevent piracy (tells you something about their customers I guess!), and a bizarrely large number of cable adapters."
sure, but if the phone company says there was an sms, when you open the phone it looks like there is an sms, and cellebrite extracts one, which you then present in a nice power point to the court...
then your accusation of tampering with the evidence, and where in the chain of custody it occurred, had better be good, or all you are going to do is annoy the court (speaking as someone with dozens, maybe a hundred-plus, Cellebrite reports in various court records)
This is what armchair lawyers tend to forget, and coders especially. The law is not code. The judge is not a compiler. When the law does not represent reality (this guy sells drugs and uses his phone to do it being a common reality), then it becomes newsworthy.
But normally people just look at the reality and go "oh yes, this tool extracts the stuff on a phone and turns it into a pdf/html, how convenient". 99.99% of the time, a drug dealer alleging he has no knowledge of the hundreds of deals on his phone is about as realistic as your 5-year-old nephew with cake smeared on his face denying he ate the last bit of cake... and is treated as such.
Should the act of selling drugs be a crime? Completely different topic.
We have imprisoned 600 people in Britain because electronic transaction records contained errors.
You cannot assume a random company's tech works reliably without any proof when someone's life is at stake. If they have cloud upload shenanigans, they could be mixing up records of different people.
Evidence from a network provider is a completely different matter, but if SMS records were enough you would not need this crap.
I don’t believe that’s the case at all. Cellebrite uses shady techniques kept in a black box. I see no reason to believe that it’s reliable at all. They’ve given zero reasons to.
Try your hand with this next time you're arrested and have your phone dumped, I guess. Tons of police departments across the US use them and so many courts have accepted data extracted from devices.
People have been let off for crimes they obviously committed because the rock-solid evidence against them was illegally gathered or handled. I assume that's the sort of argument that should be leveled here: your evidence is illegal, you can't include that.
And I believe that's what the argument against secrecy with these systems is. How can you know whether legal lines have been crossed if the system is shrouded in secrecy?
ChatGPT is the killer app. It’s a Google killer. It is better than the SEO listicle garbage filling the internet. Even if it’s not always accurate it’s still better in a lot of circumstances. There’s a reason Sundar rushed Bard out of the gate even though it is clearly inferior.
Step 1. Humans write copy for humans to buy their garbage, humans counter by tuning out and switching channels
Step 2. Humans write SEO copy for machines to rank them higher.
Step 3. LLM writes copy for machines to rank them higher.
Step 4. Human uses LLM to try to distill the LLM generated SEO spam for any remaining signal.
Also to your point:
> SEO listicle garbage filling the internet.
the feeling that the LLM is better than what you described is going to be very temporary; then the mountains of LLM-generated bullshit are going to overwhelm even an LLM's ability to make meaningful sense of it.
You're missing the point. If we want to know something, we won't even have to google it; we will just ask an LLM. There will be no market for websites full of it because we can just directly ask it to answer our questions.
The only "if" to all this is if we will destroy the LLMs by feeding them their own diarrhea. I expect a sort of natural selection here to play out, especially in the open source space. Ones that are trained on LLM generated blogspam will probably, I expect, get outperformed by ones that are trained on genuine information, or at the very least ones made using new techniques that adequately filter noise.
> If we want to know something, we won't even have to google it; we will just ask an LLM. There will be no market for websites full of it because we can just directly ask it to answer our questions.
> Ones that are trained on LLM generated blogspam will probably, I expect, get outperformed by ones that are trained on genuine information, or at the very least ones made using new techniques that adequately filter noise.
Yes, humans are notorious for only seeking out high quality, accurate data, especially when it conflicts with our priors.
To say nothing of our ability to assess the accuracy or truthiness of information in the first place (look at how many people take, on faith, that Chat GPT isn’t wrong as often as it is right).
That's also true of a web search engine; but an LLM can (in principle, not saying it's there yet) be able to spot inconsistencies in the source data, to notice disagreement.
It isn’t though. Like I said, if the model gets worse OpenAI can simply not release a new version.
You also have to consider the money angle. As using ChatGPT and other chatbots becomes more popular, people will stop producing garbage internet articles because they will be less popular and therefore less profitable. Bloggers who enjoy writing will continue to do so because it was never about the money, they just enjoy writing.
Further, the internet is only one small portion of information available to train on. There’s a lot of other data out there, including real-world conversations.
You don’t update models to add new information. That’s extremely inefficient and susceptible to catastrophic forgetting. If you want the model to have new information you update an offline knowledge base. So yes you can simply not update the model.
Huh? You won't update the model you'll just give it new information? The exact concern is that the new information will be garbage aimed at pushing the model to produce certain output. Much like SEO spammers do to manipulate Google search results.
"Just don't update the model, only feed it new information" is exactly how to get to the outcome of concern in this thread.
Great, so you've updated your knowledge base, it's got garbage targeted to make it attractive to the model, and now your model is outputting garbage. It's the exact same problem Google has fighting the SEO spammers. Now the model is significantly less useful, exactly as suggested.
We've already seen exactly this happen with search. There's no reason to believe that LLMs are immune.
I understand what you are saying but to me it sounds very handwavy and (not to be disrespectful) naive.
How would LLM upstarts be able to counter the massive commercial interests? As with Google, they will also come to prefer money over usefulness, at the latest once they have a wide user base.
Distinguishing spam from signal with LLMs is an even less proven approach than doing so with search.
And not updating a model means it will be stuck in the COVID-19 era forever.
I’ll push back on this, at least in its current iteration. I just asked it to list some restaurants near my apartment (major intersection in San Jose) and it wasn’t particularly close. While there are several restaurants less than half a mile from the intersection, ChatGPT listed restaurants from several miles away.
Given the “weights in a matrix” architecture of ChatGPT, I’m not sure it’s possible to store enough data to make the query practical to answer. Say there are a couple hundred intersections in my city. You have to store the association “restaurant name” “close to” “intersection” for each intersection. I don’t know the size of Google’s Maps DB, but I would guess it’s several gigabytes per city. From my understanding of the theory, you would need to store BOTH the LLM weights AND the Maps data for ChatGPT to have a shot at generating good answers for that type of query.
I’m happy to be wrong here. If I’m misunderstanding something, please let me know.
Well you’re right since ChatGPT isn’t hooked up to the internet, so certain queries aren’t good use cases. Adding maps info to a language model would be a pretty bad idea (even if it didn’t hallucinate) since it can change at any time, which would require more (expensive) training.
What Bing does is to use your query to search the web and use the top N search results in the context window for the chat.
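The retrieve-then-read pattern described above can be sketched roughly as follows. This is a toy illustration under my own assumptions, not Bing's actual pipeline: keyword overlap stands in for a real search backend, and the final prompt would be sent to a real LLM.

```python
# Toy sketch of retrieve-then-read: rank "search results" by keyword
# overlap with the query, then pack the top N into the prompt's context.

def score(query: str, doc: str) -> int:
    """Count how many query words appear in the document (toy relevance)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def build_prompt(query: str, corpus: list[str], top_n: int = 2) -> str:
    """Select the top_n most relevant docs and prepend them as context."""
    ranked = sorted(corpus, key=lambda d: score(query, d), reverse=True)
    context = "\n".join(f"- {d}" for d in ranked[:top_n])
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "Venice is a city in northeastern Italy built on a lagoon.",
    "ChatGPT is a chatbot released by OpenAI in 2022.",
    "The Post Office Horizon scandal led to hundreds of wrongful convictions.",
]

prompt = build_prompt("When was ChatGPT released and by whom?", corpus)
print(prompt)
```

The key property is that the model's weights never change: fresher answers come purely from fresher retrieved context.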
However I’ll push back on your pushback. ChatGPT doesn’t need to be perfect to be a killer app. It is highly flawed. Maybe it was a bit too strident to say ChatGPT will kill Google search, but it’s strictly better for a lot of squishy queries that don’t have a factual basis.
“How can I convince my boss to give me a raise?” gets you a listicle on Google and a highly specific response on ChatGPT. And if some of the advice doesn’t apply, you can continue directing the conversation. It’s an idea generator, even if some of the ideas are bad or don’t make sense.
Tangentially related, I had GPT-4 plan the sightseeing on my latest holiday.
It both picked out the interesting places of note, and then I asked it to plan them in such a way that made sense walking-wise (so I wasn't backtracking) and it did so without a hiccup.
You're not wrong at all, it doesn't know everything.
But it does know a lot of things and can be super useful. Personally I think a search engine is a terrible use case, unless you use the Bing-enabled version, or Bing Chat.
I've used it to write pretty complicated scripts where I had no idea what I was doing, rebuild crusty httpd configs from first principles, explain disassembled code, explain regular code, explain configs, read dmidecode and lspci for me and make a pcie slot report... It's bloody brilliant.
Other: read and translated my blood tests. Accurately!
It is not a direct replacement for search engines, but it will seriously dent their market share.
If you are looking for a location on the internet, use a search engine. LLMs do not memorise the data sources verbatim.
If you want to know how to do something, it will normally give you a better answer than you would find by googling around multiple blogs. No location on the internet needed.
Even though it is just a bare kernel (LLM), GPT-4 is a better teacher than any I've ever had. Who can I ask at 3AM on a Saturday night to explain ancient Greek philosophy using dwarf fortress mechanics? And iterate with infinite patience and focus on any follow up questions?
This is one of the most compelling use cases. It’s dramatically reduced research time on certain topics for me. If I have a “question” I don’t go to google first anymore.
I think we're just scratching the surface of apps. As we figure out how to integrate this technology in novel ways (not just "here's my app + AI!!!!"), it will open new doors.
Shameless self-promotion, I'm trying to build some of those intermediary pieces. I have authored an open source library[1] that lets businesses externalize LLMs to their users, so that users can use natural language to query their data in your database. The goal is to try to simplify UIs to have more natural language components, without needing to send your data to an LLM.
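One common shape for that kind of library is: send only the schema plus the user's question to the model, get SQL back, and execute it locally, so the rows themselves never leave your database. The sketch below is my own assumption about the pattern, not the linked library's actual API, and the LLM call is stubbed out with a canned response.

```python
import sqlite3

# Sketch of "externalizing" an LLM for natural-language data queries:
# only the schema and the question would go to the model; data stays local.

SCHEMA = "CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)"

def generate_sql(schema: str, question: str) -> str:
    """Stub standing in for an LLM call that turns (schema, question) into SQL."""
    # A real implementation would send `schema` and `question` to a model
    # and validate the SQL it returns before executing it.
    return "SELECT customer, total FROM orders WHERE total > 100"

def answer(db: sqlite3.Connection, question: str) -> list[tuple]:
    sql = generate_sql(SCHEMA, question)  # rows never leave the process
    return db.execute(sql).fetchall()

db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
db.executemany("INSERT INTO orders VALUES (?, ?, ?)",
               [(1, "alice", 250.0), (2, "bob", 40.0)])

print(answer(db, "Which customers spent more than $100?"))
# → [('alice', 250.0)]
```

In practice the generated SQL would also need to be sandboxed (read-only connection, allow-listed tables) before execution.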
Adobe’s Firefly AI and other Adobe AI is the result of billions of dollars of investment over multiple years. Lousy source (sorry): See the number of Adobe authored 2 minute papers (YouTube channel) about AI based graphics over the years.
The rest of the examples shared are mostly just direct integrations with the GPT-4 API, which should be trivial for almost any startup to do. I.e. it’s very likely not going to be game-changing for a CRUD company like OP’s case.
There is a reason most creative professionals are screaming to the nine high heavens of hell for dear mercy, because they are in the best position to see the writing on the wall.
The current deconstruction of intellectual property rights is damning and must be rectified, but even putting that aside "AI" is still going to eliminate the vast majority of creative occupations because a supercomputer is still cheaper than a human on the payroll or invoice.
To me the potential killer app is search: in many verticals you can already find better answers faster by asking ChatGPT than by going through the tons of SEO spam on Google.
Health and medical queries are a big one for me. The top results in Google are the same cookie cutter responses from the same 5-10 domains. There is no expertise there - just an article ghost written by a freelance writer with a random doctor’s name attached to it.
With the right prompt, chatGPT gives me far better insight and can even point out academic papers to back its claims.
There are no health issues that anyone should ever be using Chat GPT for - the hallucinations are very real, and often severe. Not that you should be googling your symptoms either - there’s a reason we put medical folks through more than a few minutes of training.
LLMs are absolutely worthless for medical information.
I’ve gone to multiple doctors for my problems and most of them have only made it worse. Most doctors I’ve met get dumbfounded when the problem is anything more complex than a straightforward case.
And I say this as someone married into a family of doctors.
> LLMs are absolutely worthless for medical information.
ChatGPT gives you an initial hint/direction, and then you can research the specifics by finding actual information in a search engine and see whether it was hallucinated or not.
Seriously, one of the true wonders of the world. It is amazing it was not fully ruined in the 20th century with cars and advertisements and "modern" architecture. I feel like all of Western civilisation should be spending money on saving Venice, considering its historic influence as well.
I agree 100%. If every city had vehicle traffic fully grade-separated from pedestrians the way Venice does, they would be much more livable and safe.
We have enough debt and unpaid obligations to go around already. Let those who want to save it spend their own personal money by donating it to dedicated funds. And do not saddle everyone with a stupid waste of money.
And why pick only it to save? Doesn't every dying city built in the 1900s deserve to be saved too?
It's a modern problem. Below the current city are the remains of several other cities that previously sank into the muck. The problem only became difficult to manage when we stopped rebuilding over sunken structures and started continually investing in extremely complex and heavy structures that we are hesitant to write off.
I’m sure if we compare the impact of governments and corporations with that of individuals, it’s not 50/50. Governments and corporations have a much bigger chance of changing large-scale factors, quite obviously. Even individual behaviour is in the hands of governments – see France banning short flights as an example.
The same could be said of the power of the people. With inflation, drought, heat waves, and pollution, mindsets change slowly but surely; it will become shameful to have noisy and polluting vehicles. I can see an environmental revolution in the future: environmental migrants, the environment becoming problem number one, far ahead of minor things like covid. Governments often follow and react. I'm French, and this law is a bit of an experiment, very minor: https://www.nytimes.com/2023/05/24/world/europe/france-short... I live near Nice: a plane landing every minute, at least six plane trails visible in the sky when there's not too much pollution, and the government is still subsidising airlines, airports, etc. It's not enough; we need quotas of a few flights in our lives, no more.
People already have unhealthy parasocial relationships with influencers.
It seems clear that people (lonely/depressed people especially) will overdose on this sort of thing once it is developed, commercialised, and less bleeding edge.
It's vapour filling the place of human connection. It's stevia. It's not going to give you cancer, but it's still unhealthy and will certainly exceed the parameters of entertainment.
> People already have unhealthy parasocial relationships with influencers.
At least if they switch from parasocial relationships with influencers to parasocial relationships with open source bots they won't be financially exploited by the influencers. GPT doesn't have anything to sell us.
> they won't be financially exploited by the influencers. GPT doesn't have anything to sell us.
Except the vast majority of people aren't able to host such a bot themselves, so it seems inevitable that paid hosting services for such bots will arise. Then there's just the potential for financial exploitation at greater scale.
Well then catch the opportunity by the balls and start offering incels AI girlfriends written in such a way as to deradicalize them and emulate real interaction with a woman. They will subsequently get less aggressive and more socialized, and will also pay for it. Win-win situation.
I believe this was the issue with replika, which encouraged people to develop emotional attachments with their 'AI partner' and then first put romantic chat responses behind a pay wall before removing them entirely a year or so later.
From the outside this could be seen as a good thing, but for someone involved in the relationship, someone who may struggle with a traditional relationship and may see this as the only available option, I'm led to understand the event was remarkably traumatic.
> It seems clear that people (lonely/depressed people especially) will overdose on this sort of thing once it is developed, commercialised, and less bleeding edge.
Sounds much less exploitative and unhealthy than the streamer/influencer parasocial relationships these people are probably currently invested in.
Had HN been around when personal computing dropped, I imagine that we could have made some attempt at a steel man argument justifying such a purpose. Even if that argument was 'hobbyists', there was something to be said.
There is something like this in 'A Fire Upon the Deep' by Vernor Vinge; his intergalactic societies translate their alien languages - and incompatible methods of expression - using an application similar to this one.
I encourage you not to be embarrassed and to simply open-source it. To err is human; anyone giving you grief because of bugs doesn't deserve the effort you've put in. And opening it now could actually bring assistance in getting those bugs fixed, while simultaneously benefitting everyone who wants to do something similar but isn't sure where to start.
I won't speak for anyone else, but sometimes "bugs" is more about process than code. I have a similar project to the GP's and am not currently interested in open-sourcing it because there's a lot of bespoke elements and manual setup. I don't want to have to write a README describing all the process steps that make my code actually useful.
For me, on my hardware, on my network, I've got a process that works. It's a non-zero amount of effort to generalize the description of that process.
Your content-chatbot repo was very useful when I was figuring out how to achieve this sort of thing with langchain. I was able to knock together a chatbot for a client's documentation site in an afternoon. But I guess the real value for SiteGPT is the ease-of-use and the client-side chat interface.
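The retrieval half of that kind of docs chatbot can be built from scratch in a few lines. This sketch uses term-frequency vectors and cosine similarity in place of real embeddings, and simple fixed-size word chunks in place of a proper text splitter; it is an illustration of the recipe, not langchain's actual API.

```python
from collections import Counter
from math import sqrt

# Sketch of the retrieval step of a docs chatbot: split pages into chunks,
# vectorize them (term frequencies standing in for real embeddings), and
# return the chunk closest to the question by cosine similarity.

def chunk(page: str, size: int = 12) -> list[str]:
    """Split a page into non-overlapping chunks of `size` words."""
    words = page.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def vectorize(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def best_chunk(question: str, pages: list[str]) -> str:
    """Return the single most relevant chunk across all pages."""
    chunks = [c for p in pages for c in chunk(p)]
    qv = vectorize(question)
    return max(chunks, key=lambda c: cosine(qv, vectorize(c)))

pages = [
    "To install the CLI run pip install ourtool and then run ourtool init once.",
    "Billing is monthly. You can cancel anytime from the account settings page.",
]
print(best_chunk("How do I cancel my subscription?", pages))
```

The winning chunk would then be placed in the prompt alongside the user's question, as in any retrieval-augmented setup.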
Same here. I know it hurts to offer a free trial for something that already costs you money to serve (those API calls won't be free), but it's really hard to sign up without trying it first.
Yes, I understand. I will give you the same option as the previous person. Please give me a sample webpage, just one web page and I will create a chatbot for that webpage, and post the chatbot link here.
I understand. In that case, the only way is to subscribe to the $19/month plan and try it out for yourself. You can cancel anytime you like with the click of a button.
Yeah, I didn't know how much that is going to cost me if I added a free trial. But I can create a demo for you if you like. Give me an example web page link for which you want the chatbot.
I will create one and post the link here. Just a single page url.
I don’t mean to be harsh, but if running a free trial would bankrupt you, you shouldn’t be trying to start a company.
It also doesn’t inspire much confidence in your early users, there’s been a lot of these GPT API cashgrabs popping up all over so if you want to differentiate yourself you might need to actually incur some risk.
I think I'm qualified to give this advice, seeing as some of the biggest brands in the world have trusted my advice on digital marketing.
It has nothing to do about whether I personally can risk the $19, I'm not even in the target audience for this – the question is what percentage of the target audience is going to be ready to pay $19 for something they don't know is going to work for their site, and how much bigger would that pie be if the site owner spent a tiny amount on offering trials.
Just making people get their card out is going to make a huge percentage of leads drop off, especially when there's almost no content/demos or an actual working trial in the site (even the screenshot is just a static screenshot instead of a live demo).
If you want to know more, you can reach out via email and I'd be happy to help (though it might cost you a bit more than $19)
There are various inhalation resistance trainers on Amazon for as little as $20. I'm doubtful the more expensive devices offer any benefits over the less expensive models. In fact the simpler less expensive models look much easier to take apart and clean. As long as you are breathing against resistance, you're getting a benefit.