Ask HN: What will become of current population of AI's GPUs as they age?

ganzuul · on Jan 2, 2024

Guessing they will be used to generate synthetic data from simulation to feed the next gen of AI. No copyright issues then.

There might be room for clubs of AI hobbyists who curate datasets and build simulations.

samstave · on Jan 2, 2024

>synthetic data from simulation to feed the next gen of AI

Oh, damn, hadn't thought of that - is this already happening? Or, where can I read more about what people think of having AI feeders come up with synthetic data to pump into newer models?

@Ganzuul - So the next AI trope might be "Yeah, but can it play Crysis?"

Imagine fighting 'Terminators' who have been trained on billions of hours of AIs playing FPS+Stealth games.

However, the real threat will be 'Death From Above' - I am sure all the drone data from all the recent wars will be used to make tiny killer robots.

I've trained in Martial Arts for 30+ years.

Watching Russian soldiers being taken out with flying FPS grenade drones is so disheartening, and really makes one's 'Martial Arts' ego die quickly.

Luckily - I was in a deeply spiritual and meditative art - not just Bikini Brawlers such as the UFC produces.

AI, drones, Human War is FN scary as heck for the future.

ganzuul · on Jan 2, 2024

Datasets generated out of GPT-4 used to train open source models have been referred to as synthetic.

Before LLMs there was a trend of playing classic video games with deep learning, so perhaps the next datasets will come out of more modern games.

samstave · on Jan 2, 2024

The use of GPT scrapers is going to get to "Zoom | ENHANCE" trope levels:

-

"Find me a product family of the same [PRODUCT SPEC/MODEL/TYPE] {such as a web cam} - which is online. Build a model for" <<---- Such an idea, using a feeder-AI to slurp such and then feed it as a module for "Open WebCams of [MNFR],[MODEL],[FIRMWARE]) which is connected to [ISP] (via ipv4 allocations (AS#s))" == CVEs available

(Maybe I have a lib/ability to have a bot continually CVE-MAP (nmap) for those that match my exploit, but be able to do it from a diffuse perspective, as NLP requested - so my CVE can even hit patterns that are not obvious, over a long time....

Let's say I CVE all webcams in [AREA AROUND TARGET] - then let it run with lots of models, and scan for traffic (humans walking, as well as cars driving) - but now I can tag that:

Show me every time 'that car' has driven near here in N period.

Give me a list of frequency of license plates, anomalies, move that to plotly - show me who always drives by, and whoever is driving in this area with a different, but consistent frequency over a short period of time"

------

This power is finally in the hands of anyone who can connect visual tagged feeds to llms which are currently avail... and a f-ton of folks....

SO....

--

This is why we need the ROBOTS.TXT version of AI.txt on INDIVIDUAL LINES OF ANYTHING readable by any AI bot.

Like a singular UNICODE char, at beginning|end of {ELEMENT} to EXCLUDE that from your model training?

a [REDACTED] Function, mayhaps?
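A rough sketch of what such a per-line opt-out could look like on the consumer side, in Python. Everything here is hypothetical: the marker character (U+2060 WORD JOINER), the `EXCLUDE_MARK` name and both helper functions are assumptions made for illustration, not an existing standard or library. Like robots.txt, it would only work if the crawler chooses to honor it.

```python
# Hypothetical opt-out marker: U+2060 WORD JOINER, an invisible character.
# This is an assumption for illustration, not an agreed convention.
EXCLUDE_MARK = "\u2060"


def redact(line):
    """Tag a line as 'do not train on this' by prefixing the marker."""
    return EXCLUDE_MARK + line


def strip_excluded(lines):
    """Return only the lines that neither start nor end with the marker.

    A cooperating crawler would call this before adding text to a corpus.
    """
    return [
        ln for ln in lines
        if not (ln.startswith(EXCLUDE_MARK) or ln.endswith(EXCLUDE_MARK))
    ]


corpus = ["public line", redact("secret pricing logic"), "another public line"]
print(strip_excluded(corpus))
```

The tagged line still renders normally for human readers, since the marker is zero-width; the filtering is purely voluntary on the crawler's part.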

---

I mean something more /enforceable/ as a gitignore?

----

Does this already exist, as a function in all languages - such that you can have this integrated to any repo easily?

(UNDERSTAND our business model based on [WHATEVER])

(REDACT anything from REPO which refers to function X, look at the outcomes for function X and recommend other areas of REDACTION which relate to our business model. (even in schemas/challs/variables that may not be trained on?))

((I am not sure what this level of code "obfuscation from observation by AI" falls into?))

Let's say we want to identify useful data analysis ARCHETYPE logic - without revealing business logic.

Manufacturer

Country

IP-AS#/ISP

Firmware version

blah blah

you get the idea.

Code crawl security. Open

------

An interesting dataset would be any open DARPA/DOD/[WHATEVER] contract.

Look at the similarities in contracts for accepted archetypes that get funded, and who submitted, budget, success, news stories, lawsuits, congress-folk-and-circle connected...

Yeah - .gov corruption should be afraid of datasets developed by just simple SaaS' corpus available.

=======

So develop an active scanning GPT-corporate-espionage-bot?

But that easily delves into pico-NSAs (corporate GPTs searching for sensitive [stuff]) (AI CyberSecurity is the FUTURE)

However, protect your GPTs, which one may publish on a GPT store: - the logic can be used to reverse engineer your own business logic. It will show what your corp security GhostShell is paying attention to - thus, exactly what your surface area is?

The really crappy secure model is toxic injection. So one must look REALLY FUCKING HARD at indexing the internet on their models with each result as a hash - AND THEN

Tracking that result hash. For trust.

Look at a publication over time. Alert on edits - but the AI, can check on remote/foreign reposts for a hash...

Even a sentence - so tokenize a sentence, against a URL, such that the sequence of words in a particular body of [electronic text] (body/paragraph/whatever of text) is given an original hash at creation?

So you literally type something in and get your uniqueness, which allows for signatures.
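A minimal sketch of the hash-per-sentence idea in Python, using only the standard library. The function names, the snapshot layout and the naive sentence splitter are all assumptions for illustration. Note that a plain SHA-256 only lets you detect that text changed; a real signature would also need a key.

```python
import hashlib
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def fingerprint(sentence):
    """SHA-256 of the sentence after lowercasing and collapsing whitespace."""
    normalized = " ".join(sentence.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def snapshot(url, text):
    """Record the sentence hashes seen at a URL at one point in time."""
    return {"url": url, "hashes": [fingerprint(s) for s in split_sentences(text)]}


def edited_sentences(old_snapshot, new_text):
    """Return sentences in new_text whose hash was not in the old snapshot."""
    known = set(old_snapshot["hashes"])
    return [s for s in split_sentences(new_text) if fingerprint(s) not in known]


old = snapshot("https://example.com/post", "The GPU is old. It still works.")
print(edited_sentences(old, "The GPU is old. It barely works."))
```

Because the hash is computed on the sentence alone, the same fingerprints can be compared against a repost at a different URL; the URL is only stored alongside as provenance.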

Done with stream.

ganzuul · on Jan 3, 2024

Information warfare, measuring and predicting everything leads to a lifestyle without peace or satisfaction. It's a corporate lifestyle which we should defend against. It does offer some quick thrills but they are not sustainable. It is power without beauty.

Seasons, holidays and, if you can stomach it, spiritual observations are important for not losing touch with our humanity. So are arts and crafts, animal care, community and family. These things are hard to measure and predict but vital for our well-being. Beauty is abundant while power is loosely coupled.

In my opinion, this is the background against which we define ourselves with our choice of ethics.

gregjor · on Jan 2, 2024

They will start forgetting their keys, then the names of their grandchildren.