I played a lot with LLMs over the last year and built a multitude of products with them, and it's always the same bottom-line outcome:
- be specific
- keep it small
- be precise when adding context
- don't expect magic; shift deterministic requirements into deterministic code execution layers (rough sketch of what I mean right below)
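Roughly what I mean by that last point, as a minimal sketch: call_llm is just a placeholder for whatever client you use, and the label set is made up. The point is that the deterministic requirement lives in plain code, not in the prompt.

```python
import json

ALLOWED_LABELS = {"invoice", "contract", "report"}  # hypothetical closed set

def classify(text: str, call_llm) -> str:
    # The model only proposes a value; the hard constraint is enforced here.
    raw = call_llm(
        f"Classify the text into one of {sorted(ALLOWED_LABELS)}. "
        f'Reply as JSON: {{"label": "..."}}\n\n{text}'
    )
    label = json.loads(raw)["label"]
    if label not in ALLOWED_LABELS:  # deterministic gate, no magic expected
        raise ValueError(f"model returned unknown label: {label!r}")
    return label
```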
All this is awfully painful to manage with current frameworks and SDKs: somehow a weird mix of over-engineered stuff that still misses the actual point of making things traceable and easily changeable once it gets complex (my unpopular personal opinion, sorry). So I built something out of my own need and started offering it (quite successfully so far) to family & friends to get a handle on it. Have a look: https://llm-flow-designer.com
While working on a new startup I had exactly the same challenges and issues. As soon as things get complex, the data gets bigger, and the number of tools increases, it becomes tremendously difficult to control agents, networks of agents, or anything LLM-related. Add specific domains like legal or finance, where most of the time there is just 0 or 1 and nothing in between (metaphorically speaking), and it becomes a nightmare of code and side effects.
So I started to actually build something to solve most of my own problems reliably, pushing for deterministic outputs with the help of LLMs (e.g. imagine finding the right columns/sheets in massive spreadsheets to create tool execution flows, and fine-tuning how a range of data sources is found).
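A rough sketch of that spreadsheet case, assuming a placeholder for the LLM call that picks columns: the model only proposes header names, and pandas does the deterministic part.

```python
import pandas as pd

def pick_columns(headers: list[str], question: str) -> list[str]:
    # Placeholder: ask your LLM which of the given headers answer the question
    # and parse its JSON list reply; any client/prompt works here.
    raise NotImplementedError("wire up your LLM client here")

def extract_relevant(xlsx_path: str, sheet: str, question: str) -> pd.DataFrame:
    df = pd.read_excel(xlsx_path, sheet_name=sheet)
    wanted = pick_columns(list(df.columns), question)
    missing = [c for c in wanted if c not in df.columns]  # refuse hallucinated headers
    if missing:
        raise KeyError(f"LLM proposed unknown columns: {missing}")
    return df[wanted]
```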
My idea and solution, which has so far helped not only me but also quite a few business folks to fix and test agents, is to visualize flows, connect and extract data visually, and test and deploy changes in real time, while staying very close to static types and predictable output (unlike e.g. llama flow).
Just find a hoster with low egress traffic cost, reverse proxy normal traffic to Cloudflare, and reply with 2 GB files for the bot. They annoy you and cost you money, so make them pay.
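Something along these lines, as a minimal sketch: the user-agent marker is a placeholder for whatever fingerprint the bot actually has, and the reverse-proxying of normal traffic to Cloudflare is left out.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

BOT_MARKER = "ExampleBot"    # placeholder fingerprint; match on whatever identifies the bot
JUNK_SIZE = 2 * 1024 ** 3    # 2 GB per request
CHUNK = 1024 * 1024          # stream in 1 MB chunks

class JunkHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if BOT_MARKER not in self.headers.get("User-Agent", ""):
            self.send_response(404)  # normal traffic is handled elsewhere (e.g. via Cloudflare)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(JUNK_SIZE))
        self.end_headers()
        sent = 0
        while sent < JUNK_SIZE:
            try:
                self.wfile.write(os.urandom(CHUNK))
            except BrokenPipeError:
                break  # bot gave up early
            sent += CHUNK

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), JunkHandler).serve_forever()
```

If the bot actually downloads the body, every request costs them the full 2 GB of transfer; if it bails early, the handler just moves on.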
Isn't ingress free at AWS? You'd have to find a way to generate absurd amounts of egress traffic - absurd enough to be noticed compared to billions of HTTP requests. 2B requests at 1 KB/request is 2 TB/month, so they're likely paying a double-digit dollar amount just for the traffic they're sending to you (wtf - where does that money come from?).
But since AWS considers this fine, I'd absolutely take the "redirect the entirety of the traffic to the AWS abuse report page" approach. If they consider it abuse - great, they can go turn it off then. The bot could behave differently, but at least curl won't add a Referer header or similar when it is redirected, so the obvious target would be their instance hosting the bot, not you.
Actually, I would find the biggest file I can that is hosted by Amazon itself (not another AWS customer) and redirect them to it. I bet they're hosting Linux images somewhere. Besides being more annoying (and thus hopefully attention-getting) for Amazon, it should keep the bot busy for longer, reducing the amount of traffic hitting you.
If the bot doesn't eat files over a certain size, try to find something smaller or something that doesn't report the size in response to a HEAD request.
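A quick way to vet candidate targets is a HEAD request to see whether they report a Content-Length at all, and how big it is. The URLs below are placeholders, not actual Amazon-hosted files.

```python
import urllib.request

CANDIDATES = [
    "https://example.amazonaws.com/big-image-a.iso",    # placeholder
    "https://example.amazonaws.com/big-image-b.qcow2",  # placeholder
]

def head_size(url: str) -> int | None:
    """Return the advertised Content-Length in bytes, or None if not reported."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req, timeout=10) as resp:
        length = resp.headers.get("Content-Length")
        return int(length) if length is not None else None

for url in CANDIDATES:
    size = head_size(url)
    print(url, "no size reported" if size is None else f"{size / 1024 ** 3:.1f} GiB")
```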
I'd be surprised to see a mass-scraping bot behind a NAT gateway. They're probably using public lambdas where they can't even control the egress IPs (unless something has changed in the last 6 months since I last looked) and sending results to a queue or bucket somewhere.
What I'd do is block the AWS AP range at the edge (unless there's something else there that needs access to your site) - you can get regularly updated JSON-formatted lists around the internet - or have something match its fingerprint and send it heaps of garbage, like the zip-bombs others have suggested. It could be a recursive "you're abusing my site - go away" or what-have-you. You could also do some kind of grey-listing, where you limit the speed to a crawl so that each connection just consumes crawler resources and gets little content. If they are tracking this, they'll see the performance issues and maybe adjust.
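AWS publishes its current ranges at https://ip-ranges.amazonaws.com/ip-ranges.json, so the matching itself is simple. A sketch (how you plug it into your edge, and what you do with the matches, is up to you):

```python
import ipaddress
import json
import urllib.request

RANGES_URL = "https://ip-ranges.amazonaws.com/ip-ranges.json"

def load_ap_networks():
    """Fetch AWS's published IPv4 prefixes and keep only the ap-* regions."""
    with urllib.request.urlopen(RANGES_URL, timeout=10) as resp:
        data = json.load(resp)
    return [
        ipaddress.ip_network(p["ip_prefix"])
        for p in data["prefixes"]
        if p["region"].startswith("ap-")
    ]

AP_NETWORKS = load_ap_networks()

def is_aws_ap(client_ip: str) -> bool:
    ip = ipaddress.ip_address(client_ip)
    return any(ip in net for net in AP_NETWORKS)

# In your middleware/edge rule: drop, tarpit, or serve garbage when is_aws_ap() is True.
```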
It would be good to have an indicator of whether it's available with your distro by default, or which package you'll need to install it, since all tools are only as useful as they are available...
+1. I dropped out of school relatively early (I was extremely bored, and the style of education was certainly somewhere close to the Stone Age). I did an apprenticeship as a software engineer, with some (extremely useless) school component. Most of the time in the late '90s was trial and error, for me the master and the master of masters: playing around with Linux, turning it into ISDN routers with servers for websites built in HTML, Perl and PHP. This was devops (before it got hyped) and real engineering, figuring stuff out with almost no documentation, a lot of crazy creativity, and pushing the boundaries of what's possible. And it reminds me just a little of today's world with AI and vibe coding, just on a completely different level and with significantly more pressure... fun times :-).
I input text and preferably output JSON, but it doesn't matter much as long as it's somewhat structured.
Ultimately I'd like to extract information like date ranges and specific indications of tool usage (e.g. I have a bunch of data APIs, each with its own data and semantic meaning, which need to be picked, and then a combination of tools to transform the data).
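Roughly what I'm after, as a sketch: the tool catalog and the complete() client are placeholders, and the deterministic checks keep the downstream tool execution predictable.

```python
import json
from datetime import date

TOOL_CATALOG = {"sales_api", "weather_api", "fx_rates_api"}  # hypothetical data APIs

PROMPT = """Extract from the text below:
  - "start" and "end" as ISO dates
  - "tools": which of {tools} are needed
Reply with JSON only.

Text: {text}"""

def extract(text: str, complete) -> dict:
    raw = complete(PROMPT.format(tools=sorted(TOOL_CATALOG), text=text))
    data = json.loads(raw)
    # Deterministic checks so downstream tool execution never sees junk.
    data["start"] = date.fromisoformat(data["start"]).isoformat()
    data["end"] = date.fromisoformat(data["end"]).isoformat()
    unknown = set(data["tools"]) - TOOL_CATALOG
    if unknown:
        raise ValueError(f"model proposed unknown tools: {unknown}")
    return data
```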
I am creating something along these lines: https://github.com/zero-day-ai. It's meant for security testing, but it probably has most of the functionality you need (and you can write plugins fairly easily if not). You can create a prompt repository, defined by a schema, that is organized by domains (again, security-testing domains, but they can be expanded). If you have any features you'd like to see, or have an ideal workflow, feel free to ping me: anthony@zero-day.ai
In my case it's a consistently poor web experience. Chrome scrolling and rendering lag so hard that it's hard to believe. The same page on Safari works flawlessly.
I switched as well, and since then I've actually been using the AI assistant primarily. It's awesome to connect search directly with AI; I almost always get what I want immediately.
I'm curious, how is the AI assistant experience different from Perplexity or even ChatGPT's search feature? Is it just the convenience of having several models there, or are the outputs inherently better because the results come from Kagi's engine instead of Google?