Show HN: Nous – Open-Source Agent Framework with Autonomous, SWE Agents, WebUI (github.com/trafficguard)
155 points by campers 35 days ago | 37 comments
Hello HN! The day has finally come to stop adding features and start sharing what I've been building the last 5-6 months.

It's a bit of CrewAI, OpenDevin, LangFuse/Cloud all in one, giving devs who prefer TypeScript an integrated framework that provides a lot out of the box for experimenting with and building agents.

It started after peeking at the LangChain docs a few times and never liking the example code. I began experimenting with automating a simple Jira request from the engineering team to add an index to one of our Google Spanner databases (for context I'm the DevOps/SRE lead for an AdTech company).

It includes the tooling we're building out to automate processes from a DevOps/SRE perspective, which initially includes a configurable GitLab merge request AI reviewer.

The initial layer above Aider (https://aider.chat/) grew into a coding agent and an autonomous agent with LLM-independent function calling and auto-generated function schemas.
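
For readers wondering what LLM-independent function calling can look like, here is a minimal sketch under stated assumptions. It is not the Nous implementation: every name below (`FunctionSchema`, `renderSchemas`, `parseFunctionCall`, `dispatch`, the `<function_call>` tag) is hypothetical, and the schema is written by hand rather than auto-generated. The idea is that the schemas are rendered as plain prompt text and the call is parsed back out of the model's text reply, so no provider-specific tool-calling API is needed.

```typescript
// Hypothetical types for illustration; not the framework's actual API.
interface FunctionSchema {
  name: string;
  description: string;
  parameters: { name: string; type: "string" | "number"; description: string }[];
}

interface RegisteredFunction {
  schema: FunctionSchema;
  run: (args: Record<string, unknown>) => string;
}

// Render the schemas as plain text that can go into any model's prompt.
function renderSchemas(functions: RegisteredFunction[]): string {
  return functions
    .map(({ schema }) => {
      const params = schema.parameters.map((p) => `${p.name}: ${p.type}`).join(", ");
      return `${schema.name}(${params}) - ${schema.description}`;
    })
    .join("\n");
}

// Extract a call such as
//   <function_call>{"name":"add","args":{"a":1,"b":2}}</function_call>
// from free-form model output. Returns null when no call is present.
function parseFunctionCall(
  text: string,
): { name: string; args: Record<string, unknown> } | null {
  const match = text.match(/<function_call>([\s\S]*?)<\/function_call>/);
  if (!match) return null;
  const body = match[1];
  if (body === undefined) return null;
  const parsed = JSON.parse(body) as { name: string; args?: Record<string, unknown> };
  return { name: parsed.name, args: parsed.args ?? {} };
}

// Look up the requested function by name and run it with the parsed arguments.
function dispatch(functions: RegisteredFunction[], text: string): string | null {
  const call = parseFunctionCall(text);
  if (!call) return null;
  const fn = functions.find((f) => f.schema.name === call.name);
  if (!fn) throw new Error(`Unknown function: ${call.name}`);
  return fn.run(call.args);
}

// A toy tool used only to exercise the sketch.
const tools: RegisteredFunction[] = [
  {
    schema: {
      name: "add",
      description: "Adds two numbers",
      parameters: [
        { name: "a", type: "number", description: "first operand" },
        { name: "b", type: "number", description: "second operand" },
      ],
    },
    run: (args) => String(Number(args["a"]) + Number(args["b"])),
  },
];
```

Calling `renderSchemas(tools)` gives the text to place in the prompt, and `dispatch(tools, reply)` runs whichever function the model's reply asks for, or returns `null` if the reply contains no call.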

And as testing via the CLI became unwieldy, it soon grew database persistence, tracing, a Web UI and human-in-the-loop functionality.

One of the more interesting additions is the new autonomous agent which generates Python code that can call the available functions. Using the pyodide library the tool objects are proxied into the Python scope and executed in a WebAssembly sandbox.

As it's able to perform multiple calls and validation logic in a single control loop, it can reduce cost and latency, getting the most out of calls to the frontier LLMs with better reasoning.

Benchmark runners for the autonomous agent and coding benchmarks are in the works to get some numbers on the capabilities so far. I'm looking forward to getting back to implementing all the ideas around improving the code and autonomous agents from a metacognitive perspective after spending time on docs, refactorings and tidying up recently.

Check it out at https://github.com/trafficguard/nous




This looks fantastic! I've been using aider and had my own scripts to automate some things with it, but this looks next level and beyond.

I wanted to try this out (specifically the web UI), so I configured the env file, adjusted the docker compose file, ran `docker compose up` and it "just works".

It would be great if there was a basic agent example or two pre-configured, so you can set this up and instantly get a better sense of how everything works from a more hands-on perspective.


Updating the Dockerfile and docker-compose.yml was the last change I made, so glad to hear that worked for you! What change did you make to the docker compose file?

The CLI scripts under src/cli would be the best examples to look at currently for running an autonomous agent and the fixed workflows (e.g. code.ts).


`environment` cannot be an empty object, which it is by default currently. And I commented out the Google Cloud line (thanks for that code comment).


If this isn't by Nous Research, you may want to consider renaming (https://x.com/NousResearch, https://nousresearch.com/)


I can confirm this is not a project related to Nous Research in any way, just an unfortunate naming collision.


Nous is the French word for "us". Haven't heard of NousResearch.


But they explain it is from the Greek nous, which fits better for AI.


I was surprised when I learned about the Greek derivation. In the UK it's slang for common sense ("use your nous, mate"). I have to wonder how a bit of Ancient Greek ended up as UK slang.


Where in the UK is this a thing?


Probably less common today but honestly I'd be surprised if someone didn't know it at all. Try asking someone over fifty in England what "use your nous" means.

A quick dip into Google Books throws back this from the works of George Garrett, a Liverpool-born working class writer who wrote about being a merchant seaman during World War I: "An older fireman stopped them. 'Use your nous,' he sang out. 'You can't pile into another room and waken all hands for the sake of an individual.'"

Or this from The Spectator in the 80s: "Use your nous, you silly cow"

https://www.quora.com/What-does-the-British-slang-word-nous-...


Eton probably.


Actually, when I was bouncing ideas off Claude it was suggested with the alternative spelling of noos. Then I can keep the concept and only have one letter to change.


And there is https://nous.technology/, known for their smart plugs.


And if it is by Nous Research, there definitely needs to be clearer branding, as this is very confusing.

If OP is not Nous Research (which I suspect to be the case) then a name change is a must as they're already a fairly well established company in the LLM space (surprised OP isn't aware of the name collision already). It's a bit similar to creating a new library with the "Smiling Face with Open Hands emoji"[0] as your logo

0. https://emojipedia.org/hugging-face


When I first picked the name, after a chat with Claude, I hadn't come across Nous Research back then, and they didn't show up Googling for just nous.

I see a bit of reuse of words in various other LLM-related projects.

Langchain/langfuse/langflow

Llama/ollama/llamaindex

so I hadn't been too worried about it when I became aware of them.

That's what Show HN is for, getting feedback, and a name change now would be easy before I post it around more.


Never heard of you


I'm not affiliated with Nous Research in any way, but I do work in the LLM space, and at least in this community it's a fairly well-known org. Since this project is also in that space, I was just adding support for parent's observation.


Yeah, but in the LLM/AI space, everyone who isn't a total newbie knows Nous Research

And this framework kinda does fall within that space


I'm not entirely sure what this does? The initial paragraph goes into history and what other platforms do, but it doesn't say what problem this will solve for me. Then it continues with some features and screenshots, but I still don't know how to use this or why.


This looks too good. I have a B2B AI product, the features that exist in Nous easily outclass anything I could make in a reasonable timeline.

Maybe I should rewrite my app using Nous...


Thanks! I've spent a lot more time on the computer than I would like over the last few months building it.

If you think you might want to, feel free to get in touch.


I'm having a hard time figuring out how much logic lives in Nous and how much in Aider for code changes - could you say some more about it?

Playing with the code agents so far, I've found Aider to make many silly mistakes and revert its own changes in the next commit of the same task. On the other hand, Plandex is more consistent but can get into a loop of splitting the task into way too small pieces and burning money. I'm interested to see other approaches coming up.


I have a few steps so far in the code editing at https://github.com/TrafficGuard/nous/blob/main/src/swe/codeE... There is a first pass I initially created when I was re-running a partially completed task and it would sometimes duplicate what already had been done. This helps Aider focus on what to do.

  <files>${fileContents}</files>
  <requirements>${requirements}</requirements>
  You are a senior software engineer. Your task is to review the provided user requirements against the code provided and produce an implementation design specification to give to a developer to implement the changes in the files.
  Do not provide any details of verification commands etc as the CI/CD build will run integration tests. Only detail the changes required in the files for the pull request.
  Check if any of the requirements have already been correctly implemented in the code as to not duplicate work.
  Look at the existing style of the code when producing the requirements.

Then there is a compile/lint/test loop which feeds back in the error messages, and in the case of compile errors the diff since the last compiling commit. Aider added some similar functionality recently.
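
A rough sketch of such a feedback loop, assuming the editing tool and the build steps are passed in as plain functions. The names here (`CheckResult`, `EditLoopDeps`, `editUntilPassing`, `firstFailure`) are made up for illustration and are not taken from the Nous code. It is synchronous to keep it short, and it leaves out the part where the diff since the last compiling commit is included for compile errors.

```typescript
// Hypothetical shapes for illustration only.
interface CheckResult {
  ok: boolean;
  output: string; // compiler, linter or test runner output
}

interface EditLoopDeps {
  // Hands instructions to a code-editing tool (e.g. a wrapper around Aider).
  edit: (instructions: string) => void;
  // Build steps in the order they should run, e.g. compile, lint, test.
  checks: { name: string; run: () => CheckResult }[];
}

// Run the checks in order and report the first one that fails, if any.
function firstFailure(
  checks: EditLoopDeps["checks"],
): { name: string; output: string } | null {
  for (const check of checks) {
    const result = check.run();
    if (!result.ok) return { name: check.name, output: result.output };
  }
  return null;
}

// Edit, run the checks, and feed any error output into the next edit attempt.
function editUntilPassing(
  requirements: string,
  deps: EditLoopDeps,
  maxAttempts = 3,
): { passed: boolean; attempts: number } {
  let instructions = requirements;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    deps.edit(instructions);
    const failure = firstFailure(deps.checks);
    if (!failure) return { passed: true, attempts: attempt };
    // Keep the original requirements and append the error for the next pass.
    instructions =
      `${requirements}\n\nThe ${failure.name} step failed with:\n` +
      `${failure.output}\nFix these errors.`;
  }
  return { passed: false, attempts: maxAttempts };
}
```

Capping the attempts matters in practice, since an uncapped loop on an unfixable error keeps spending on LLM calls.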

Then finally there's a review step which asks:

  Do the changes in the diff satisfy the requirements, and explain why? Are there any redundant changes in the diff? Was any code removed in the changes which should not have been? Review the style of the code changes in the diff carefully against the original code. Do the changes follow all the style conventions of the original code?

This helps catch issues that Aider inadvertently introduced, or missed.

I have some ideas around implementing workflows that mimic what we do. For example, if you have a tricky bug, add a .only to the relevant describe/it tests (or create tests if they don't exist), add lots of logging and assertions to pinpoint the fix required, then undo the .only and extra logging. That's what's going to enable higher overall success rates. You can see the progress in the SWE-bench lite leaderboard: simple RAG implementations had up to ~4% success rate with Opus, while the agentic solutions are reaching 43% pass rate on the full suite.


Cool project!

Just FYI your chosen name collides with Nous Research, which has been a prominent player in open weights AI the past year.


Thanks! I posted a reply to another comment about the name clash. I thought I could add another word to differentiate, but Nous Agents doesn't really roll off the tongue. New name ideas welcome!


Pick a proper noun, you will soar above the plethora of startups re-using common nouns for no good reasons (eg: "plane" having nothing to do with aeronautics).


Noosphere might be cool


That's promising! Congratulations on the launch. Consider adding it to the specialized directory for AI agents and frameworks to build them. Let me know if you need help.

https://aiagentsdirectory.com/


Which definition of "agent" are you using for this project?


Good question. At first I only called the fully autonomous agents "agents", as to me that's what having agency is. I didn't like when other projects had "multi-agent" when it's just a bunch of LLM calls.

Initially the coding and software dev agents were called workflows, but to make it more agenty I was ok with it being called an agent if the result of an LLM call affected the control flow.


So an agent here is the combination of a system prompt and a configured set of tools, kind of like an OpenAI "GPT"?


No, a chat bot using tools (e.g. GPTs) is an "assistant."

An LLM agent is not a chat bot, unlike an assistant. It is a primarily or fully autonomous LLM driven application which "chats" primarily with itself and/or other agents.

In other words, assistants primarily interact with humans while agents primarily interact with themselves and other agents.


I'm going to make the availability of the requestFeedback function a boolean flag, so it can be disabled when running benchmark suites etc. Whether it's an assistant or an agent by that definition is really just a parameter value.
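
Something along these lines, as a sketch only. `requestFeedback` is the function name mentioned above; `RunOptions`, `allowRequestFeedback` and `resolveFunctions` are hypothetical names picked for the example, not existing options.

```typescript
// Hypothetical option shape; only the requestFeedback name comes from the thread.
interface RunOptions {
  functionNames: string[];
  allowRequestFeedback: boolean;
}

const REQUEST_FEEDBACK = "requestFeedback";

// Decide which functions the agent may call for this run.
function resolveFunctions(options: RunOptions): string[] {
  // Drop any existing entry first so the flag is the single source of truth.
  const withoutFeedback = options.functionNames.filter(
    (name) => name !== REQUEST_FEEDBACK,
  );
  return options.allowRequestFeedback
    ? [...withoutFeedback, REQUEST_FEEDBACK]
    : withoutFeedback;
}

// Interactive run: a human can be asked for input.
const interactive = resolveFunctions({
  functionNames: ["readFile", "runTests"],
  allowRequestFeedback: true,
});

// Benchmark run: the agent has to finish on its own.
const benchmark = resolveFunctions({
  functionNames: ["readFile", "runTests", "requestFeedback"],
  allowRequestFeedback: false,
});
```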


That trace UI is nice


I can't take credit for that particular screen, it's the Trace UI in Google Cloud. I did look at LangSmith for tracing, but for now I wanted to stick with standard OpenTelemetry tracing, so you could export the spans to Honeycomb etc.


How much does it cost to run?


Having it deployed costs nothing with the Cloud Run and Firestore free tiers.

As for LLM costs, that really depends on what you're trying to do with it. Fortunately that cost is always coming down. When I was first building it with Claude Opus the costs did add up, but 100 days later we have 3.5 Sonnet at a fraction of the cost.

The Aider benchmarks are good for seeing how different LLMs perform for coding/patch generation. Sonnet 3.5 is best if it's in the budget. DeepSeek Coder V2 gives the best bang for buck https://aider.chat/2024/07/25/new-models.html



