Any links to the actual construction of the LLM? I'm told the underlying code isn't all that complicated. Hence a lot of people here posting that 'everyone' will catch up to OpenAI eventually.
Edit:
Guess I hadn't read far enough.
Links in the parent to some LLM's with source available.
Has anyone tried using one of the models for cyber-security? I'm concerned that censored models are going to make that difficult, since I want to describe attacking systems.
Yes, we work with security analysts and other folks doing data-intensive investigations & analysis in Louie.ai (think being able to use natural language to talk to you Splunk, OpenSearch, Databricks, etc. logs/news/.., and many Python / Pandas viz + wrangling + AI tools, to investigate more easily & further, build autonomous hunts, smarter monitors & detections, etc)... and LLMs work amazingly here.
Blue team is a lot easier than red team, and as part of that, code tasks are easier than social engineering. It rarely comes up in investigations, but you do need to set prompt persona for simulating attacks.
At the same time, fine-tuning a model is a pretty clear way to undo needing to do even that.
We benefit from models having business-neutral alignment by default -- the typical case suffers if the default persona was a racist forum troll with extreme politics. So it's more about when you want to turn that off for specific tasks. It's more work than we'd like, and gotchas like changing when models update.
Got it, thanks. Alright that makes sense. I haven't really tried much at this point, though I have had ChatGPT give me a few "I can't do that" and I've had to be like "no, seriously, I'm doing this because it's a good thing" in the past (3.5 days).
Ex: Outbound - Interactive chat sessions trigger outside push update, such as to a database for data to index, new findings & enrichments, etc
Ex: Inbound - API calls trigger internal runs... which in turn can also trigger external workflows
We're experimenting with APIs for embedding in your own apps, headless APIs for automations, and all sorts of integrations. Getting security and modular/compositional interfaces right has been a fascinating challenge!
Automated alerts - SRE as a service. Anomaly detection with some good suggestions. SRE intern as a service? Something like that - notifying you of something vs human requesting.
If they are censored I think you could find someone who has gotten around it . Truly censoring a model is tough to get to high parameter models that are performant.
Fine-tuning mistral for tool use like metasploit is effective, but even default mistral with a basic system prompt is very capable and doesn’t often say it can’t do things. ChatGPT obviously needs a lot of coaxing, “my job depends on this” is a hilarious way to get it to be helpful here. But for cyber security tooling I think we’ll see things more akin to David Shapiro style swarms with small models that are domain specific coordinating with each other (very basic discovery focused models communicating with a more complex reasoning model to validate findings, then remediation)
The tricky part here is (football metaphor) so far I’m having to train the “strikers” before I can effectively train the “goalie”. Which feels bad for AI safety. I think this is why we’re not seeing a lot of work in the open here.
But we’re planning to open source the goalie, which will look more like Markov/monte-Carlo traditional ML on specific bits, like infrastructure as code.
If you want to work on this stuff, especially in EU, DM me; we’re hiring ;)
Haha, I have done the "my job depends on this" kind of thing. I think I did something like "someone's life depends on this". I just feel like that's flaky.
Sounds very cool :D I'm US based and currently taking a long time off, but I wish you the best of luck.
Any links to the actual construction of the LLM? I'm told the underlying code isn't all that complicated. Hence a lot of people here posting that 'everyone' will catch up to OpenAI eventually.
Edit: Guess I hadn't read far enough.
Links in the parent to some LLM's with source available.
https://github.blog/2023-10-05-a-developers-guide-to-open-so...