What sort of effort would it take to make an LLM training honeypot that results in LLMs reliably spewing nonsense? Similar to the way the "santorum" Google-bombing campaign once redefined that search term?
Not saying anything about the current storm, but a storm as strong as the Carrington Event[1] would make modern-day life on Earth pretty unpleasant. Some of the potential impacts are documented in Severe Space Weather Events: Understanding Societal and Economic Impacts [2].
I researched duct cleaners recently and found a similar set of dubious businesses. There are several businesses named "American Air Duct Cleaning of [insert city name]", and each listing has 70-120 five-star reviews on Google Maps. The reviews all say things like "David and Oscar did a great job"... from Florida to Texas to other states, it's always David and/or Oscar at each place! All the reviews were posted in the last 6 months or so, each one is relatively unique, and the reviewers are not new Google accounts... there's definitely some sort of weird Google Review scam going on here. All the websites say "Family/Locally Owned & Operated", yet they're identical web pages.
I'm surprised by how unsophisticated the approach appears. Interesting to see the problem is not limited to duct cleaning businesses.
Good question! This does seem like an embarrassingly parallel problem. Whenever I've used the big HPC centers, the secret sauce has been a fast, low-latency network interconnect. The fast interconnect is useful for PDE solvers, which need a lot of processor-to-processor communication (i.e., for periodically sending data back and forth along the boundaries of the grid cells you're solving for); a sketch of that exchange is below.
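For a concrete picture, here's a minimal sketch of that boundary ("halo") exchange using mpi4py. The grid size, stencil, and step count are illustrative assumptions, not any particular solver:

    # 1-D halo exchange sketch (mpi4py). Each rank owns a chunk of the
    # grid plus one ghost cell on each side; every step it swaps edge
    # values with its neighbors. That swap is the processor-to-processor
    # traffic the low-latency interconnect is there to speed up.
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    n_local = 1000              # interior cells owned by this rank (assumed)
    u = np.zeros(n_local + 2)   # +2 ghost cells at u[0] and u[-1]
    u[1:-1] = rank              # dummy initial data

    left = rank - 1 if rank > 0 else MPI.PROC_NULL
    right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

    for step in range(100):
        # My edge values fill the neighbors' ghost cells, and vice versa.
        comm.Sendrecv(sendbuf=u[1:2], dest=left, recvbuf=u[-1:], source=right)
        comm.Sendrecv(sendbuf=u[-2:-1], dest=right, recvbuf=u[0:1], source=left)
        # Simple explicit stencil update (1-D heat equation, illustrative).
        u[1:-1] += 0.1 * (u[:-2] - 2.0 * u[1:-1] + u[2:])

Run it with something like `mpiexec -n 4 python halo.py`. The point is that every rank blocks on its neighbors every step, so interconnect latency, not raw compute, often sets the pace.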
I'm surprised the article didn't call out what happened to Digg during their redesign and the subsequent flood of users leaving for Reddit. I figured that's why Reddit thought it was worth the money to hire 20 designers.
https://en.wikipedia.org/wiki/Campaign_for_the_neologism_%22...
Given the huge corpus of data LLMs are trained on, would it even be possible for a single entity to do this?
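For scale, here's a rough back-of-envelope; every number below is an assumption, not a measurement:

    # How much text would one actor need to plant to be 0.1% of the corpus?
    corpus_tokens = 10e12        # assume a ~10-trillion-token training corpus
    target_fraction = 0.001      # assume 0.1% is enough to move the model
    tokens_needed = corpus_tokens * target_fraction
    pages = tokens_needed / 500  # assume ~500 tokens per web page
    print(f"{tokens_needed:.1e} tokens, roughly {pages:.1e} pages")
    # -> 1.0e+10 tokens, roughly 2.0e+07 pages

Tens of millions of pages is a lot for a single entity to get crawled, and nobody really knows what fraction it takes before a model "reliably" repeats the planted material.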