Hacker News new | past | comments | ask | show | jobs | submit login
Show HN: Thread – AI-powered Jupyter Notebook built using React (github.com/squaredtechnologies)
161 points by alishobeiri 8 days ago | hide | past | favorite | 44 comments
Hey HN, we're building Thread (https://thread.dev/) an open-source Jupyter Notebook that has a bunch of AI features built in. The easiest way to think of Thread is if the chat interface of OpenAI code interpreter was fused into a Jupyter Notebook development environment where you could still edit code or re-run cells. To check it out, you can see a video demo here: https://www.youtube.com/watch?v=Jq1_eoO6w-c

We initially got the idea when building Vizly (https://vizly.fyi/) a tool that lets non-technical users ask questions from their data. While Vizly is powerful at performing data transformations, as engineers, we often felt that natural language didn't give us enough freedom to edit the code that was generated or to explore the data further for ourselves. That is what gave us the inspiration to start Thread.

We made Thread a pip package (`pip install thread-dev`) because we wanted to make Thread as easily accessible as possible. While there are a lot of notebooks that improve on the notebook development experience, they are often cloud hosted tools that are hard to access as an individual contributor unless your company has signed an enterprise agreement.

With Thread, we are hoping to bring the power of LLMs to the local notebook development environment while blending the editing experience that you can get in a cloud hosted notebook. We have many ideas on the roadmap but instead of building in a vacuum (which we have made the mistake of before) our hope was to get some initial feedback to see if others are as interested in a tool like this as we are.

Would love to hear your feedback and see what you think!

Uncalled landing page roast: The landing page needs some serious overhaul. No one cares if it's written in react. And AI in and of itself is not a feature. Tell me what I can do with it.

The demo is pretty nifty! I have the suspicion that for more complex things it will stumble, but I'll give it a try and fine-tune layout ML with a custom dataset or something that's more complex than survivors in the titanic dataset.

Oh and the API key/proxy thingy sounds a bit annoying.

Slightly off topic feedback: find a different name. There are too many products named Thread or Threads out there, it's impossible to Google and doesn't convey much information about your particular tool.

I'm very interested in this. I'm a Software Engineer who's been doing some Data Science on the side and been looking for something like this.

My current set up is running Jupyter on an EC2 instance and using inside PyCharm. One feature I actually really value is being able to use it directly in PyCharm as I can have my IDE on one side of split screen and my browser on the other. Not sure how feasible it is to integrate something like this into an IDE, VSCode would work

But a real killer feature that could get me to switch to a browser based would be the ability to load custom context about the data I'm working with. So I have all my datasets and descriptions of all their columns in my own database and would love a way to load that into the LLM so that it has a greater understanding of the data I'm working with in the notebook.

I store all my data in objects called `distributions` [1] and have a `get_context()` function that will return a text blob of things like dataset description, column description, types, etc.

The issue with all these auto-code AI tools is they don't really have a good grasp of the actual data domain and I want to inject my pre-made context into an LLM thats also integrated in my notebook.

[1] https://www.w3.org/TR/vocab-dcat-3/#version-history

Following up: A reason I really like using Jupyter in PyCharm is because Github CoPilot works in it which helps a lot.

how do you use pycharm with your own jupyter instance?

You can easily to remote Jupyter server in PyCharm [1]. It requires PyCharm Professional though.

[1] https://www.jetbrains.com/help/pycharm/configuring-jupyter-n...

This seems cool! Is there a way to try it locally with an open LLM? If you provide a way to set the OpenAI server URL and other parameters, that would be enough. Is the API_URL server documented, so a mock a local one can be created?


Definitely, it is something we are super focused on as it seems to be a use case that is important for folks. Opening up the proxy server and adding local LLM support is my main focus for today and will hopefully update on this comment when it is done :)

I just added the ability to run the proxy locally: https://github.com/squaredtechnologies/thread/commit/7575b99...

Will update once I add Ollama support too!

Ollama support would be amazing. There's a stack of people in organizations (data rich places) who would likely love something like this, but who cannot get to OpenAI due to organizational policies.

Awesome, thank you! I'll check it out.

From https://news.ycombinator.com/item?id=39363115 :

> https://news.ycombinator.com/item?id=38355385 : LocalAI, braintrust-proxy; [and promptfoo, chainforge]


From "Show HN: IPython-GPT, a Jupyter/IPython Interface to Chat GPT" https://news.ycombinator.com/item?id=35580959#35584069

Can we stop calling things Thread? I can't think of a single more overused name. Impossible to Google as well.

Hahaha I take the point, I thought I was slick because I got the thread.dev domain and so didn't consider SEO as much. One of the alternate titles we considered was `Show HN: Thread.dev` instead of just `Show HN: Thread` to minimize overlap but last second opted against it. Anyways appreciate the feedback.

It goes beyond mere CEO. When you say "Hey I built thread!" people will think you're talking about one of:

- threads.com: Another tech startup (recently acquired)

- threads.net: Instagram (must I say more?)

- the thread protocol: https://threadgroup.org/

I don't think naming will hinder adoption of your product, but it will cause needless confusion for your (potential) users. Up to you whether you care or not, though!

>(must I say more?)


It's a twitter competitor from facebook.

Are you thinking Thread would be an open-source alternative to Hex (https://hex.tech)?

I was thinking of doing something like this last year, but I couldn't figure out a good business model. Google Colab is cheap (free, $10 per month) and Hex isn't that expensive (considering the compute cost they need to cover).

If you focus on local, you're going against VS Code and Jupyter. Both are free and very good.

It's something we are considering, I think Hex provides a lot of features that aren't available in existing local notebooks (SQL, reactive cell execution) that we hope to integrate for sure. I think both Jupyter and VS Code are really strong players in the space so one of the concerns we have was around whether the feature set would be compelling enough to get people to switch. (Which is why we wanted to post to test the initial reaction :))

The reason we wanted to focus on running things locally is that we were both engineers at big companies in the past, and we didn't have access to tools like Hex but we could use local tools. Our initial thesis is to bring the best development experience local and see if there is an opportunity to build a business model around collaboration features.

That sounds like a good plan. I think you're right to focus on the local experience. Good luck!

Congrats on launching! What is your ideal target user? What is the hardest use case that is solved with thread.dev that cant be solved with existing tools? What is the architecture of thread? Frontend is in react and you start jupyter server in background?

What's the benefit of using this instead of GitHub copilot + notebooks inside of VScode?

I think being able to use your own LLM is def a plus.

This is very interesting!

One thing I really want but missing in Jupyter is a straightforward auto-completion integrated with something like Copilot. I'm spoiled by the "just-mashing-Tab development", where I just type a few words and let auto-complete do the rest.

The lack of auto-completion is the main reason I prefer using VS Code or Neovim recently over Jupyter even for experiments.

> Best of all, Thread runs locally, and can be used for free with your own API key.

That doesn't sound very local...

What are the benefits of running the notebook infrastructure locally when your data is being processed in the cloud? Can it be isolated to just code? Can I point this at a local db of customer information to workshop some SQL?

From the code, it seems to send information to https://vizly-notebook-server.onrender.com/ when in "production". Not so local (src: https://github.com/squaredtechnologies/thread/blob/d450bbf2a...)

Yeah I think point is very fair, the reason we host it on a proxy server was that we wanted to enable first time users to be able to use it without putting in an API key, the only way we could think of doing that without exposing the key was to host the tool on our own server. We looked to a tool like Cursor that users might be okay with that.

I the think the feedback is very fair that when an API key is present that all the calls should happen locally, and that is something I will take as an action item on us to improve.

Plus APGL. Not gonna use. Want local, private and FOSS.

I think that's another really good point, it was the first time we have open sourced a project or a library so I wasn't really sure how to best position it with the license. The reason I opted for AGPL was mainly because of the worry that someone bigger could come along and take what we've built and we'd kinda be stuck without an option.

If you'll humour me for a second, just for my own knowledge, I would love to learn a bit more about how you think about the license when deciding whether to use a tool like this or not?

AGPL still means someone can take what you've built and you'd be stuck "without an option(?)". Do you have a lawyer on retainer to pursue copyright claims? If no, and you don't want people to use your stuff, keep it closed source.

Unfortunately, these responses aren't clearing much up for me. I may be dumb, so here's two dumb questions:

1. Am I correct in assuming the "API key" here is an OpenAI API key?

2. Can this tool be pointed at local models?

Sorry that is my mistake, to clear it up: 1. Yes the API key here is the OpenAI key 2. The tool at the moment cannot be pointed to local models, though I am adding support for that. My focus right now is the following, in order of priority: - Self-hosting the proxy server (priority) - Adding support for local models

Ideally having the self-hosting would also allow people to switch things to use their own models (hosted) as well if need be.

That clears things right up, so I appreciate the follow-up!

It sounds like you're not quite there yet, but quickly and diligently heading in the direction I would have hoped!

I like this tool!

Please consider changing the name to something that is more searchable.

I feel this suggestion may get traction if posted on a certain twitter alternative or a certain slack alternative.

Hahaha I was not aware of the Slack alternative too hahah point taken

The title sounds like a mish mash of hacker news buzzwords.

Are you YC backed? https://www.ycombinator.com/companies/vizly Why you didnt do post with Launch HN?

This is very cool!

AI-Powered is starting to get boring...

The GIFs look good and I like how it lets the user diff between original code and AI-generated ones. But still, I would like to quote from an article on Thread (well, this Thread is a network stack) [1]:

> No, seriously. Can we please not name new things using terms that are already widely used? I hate that I have to specify whether I’m talking about sewing, screwing, parallel computing, a social network from Meta, or a networking stack. Stop it.

[1] https://overengineer.dev/blog/2024/05/10/thread/#user-conten...

Hahahah point taken, in all fairness, I only knew about Threads (the social media app) when I posted and so I thought it wouldn't overlap. I was worried the word `thread` might overlap in a parallelism sense though wasn't aware of the networking stack. Anyways I appreciate the feedback :)

This is a correctness nightmare.

Can you please review the Show HN guidelines (https://news.ycombinator.com/showhn.html) and the site guidelines (https://news.ycombinator.com/newsguidelines.html)?

Criticism is fine but shallow dismissals aren't. If you'd like to explain the point in more detail, that would of course be ok.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact