Hacker News | MrTravisB's comments

We completely agree. Framework fatigue is real, and getting locked into a rigid loop is frustrating.

Choices are great, and our goal is to let you piece together a setup to your own liking. We want Pilo to work with your existing tools, not against them. If you just want to rip out our accessibility tree compression pipeline and use it as a standalone skill in your own custom framework, we consider that a massive win.

That is exactly why we are open sourcing it. We want to see what others can do with it.

If there is a framework or tool this could work with but does not currently, we would love to hear about it.


I use ADK, which has many points where third parties can plug in. I'm also involved in the development (from the outside). I will look more into Pilo and how this could work. It would save me a bunch of effort!

I'll open an issue for tracking

---

said issue: https://github.com/mozilla/pilo/issues/318


We actually do have an MCP server! Right now, it’s bundled as a package within our TypeScript SDK rather than being published as a standalone entity.

Just a heads-up though: it is currently in beta and hasn't been put through the wringer with exhaustive testing yet. We’d love for you to try it out, and we’re very open to feedback. If you run into any issues or have ideas for improvement, please let us know!

Link: https://github.com/Mozilla-Ocho/tabstack-typescript/tree/nex...


Mozilla (Tabstack) | Founding GTM Lead | Remote (US/Canada) | Full-time

Mozilla’s New Products organization is an internal incubator for high-potential ventures. We are building Tabstack (https://tabstack.ai), the browser automation stack for AI agents. Our goal is to make it dead-simple for developers to integrate fast, reliable web interactions into agentic applications.

We are looking for a Founding GTM Lead to own the 0→1 journey. This isn't a "standard" marketing role; it’s for someone with a founder’s hustle who can speak the language of AI engineers and bridge the gap between technical APIs and commercial traction.

The Role:

- Drive early customer acquisition through scrappy, high-signal experiments.

- Own messaging/positioning for a technical audience (LangChain, Playwright, LLM orchestration).

- Translate customer friction into product roadmap priorities.

- Secure our first commercial wins and design partners.

Requirements:

- Technical Fluency: You understand browser automation mechanics and the current LLM landscape.

- 5+ years experience: A mix of DevRel, Technical Sales, PMM, or Founder experience.

- Founder Hustle: You can generate leads and write technical docs without a massive marketing budget.

- Values: You align with Mozilla’s mission of a healthy, open internet.

We don’t have a public job post yet, so please reach out directly to https://linkedin.com/in/tbeauvais.


Regarding the browser instances: While VM boot times have definitely improved, accessing a site through a full browser render isn't always the most efficient way to retrieve information. Our goal is to get the most up-to-date information as fast as possible.

For example, something we may consider for the future is balancing when to implement direct API access versus browser rendering. If a website offers the same information via an API, that would almost always be faster and lighter than spinning up a headless browser, regardless of how fast the VM boots. While we don't support that hybrid approach yet, it illustrates why we are optimizing for the best tool for the job rather than just defaulting to a full browser every time.

Regarding robots.txt: We agree. Not all potential customers are going to want a service that respects robots.txt or other content-owner-friendly policies. As I alluded to in another comment, we have a difficult task ahead of us to do our best by both the content owners and the developers trying to access that content.

As part of Mozilla, we have certain values that we work by and will remain true to. If that ultimately means some number of potential customers choose a competitor, that is a trade-off we are comfortable with.


thank you so much, great to hear the thinking behind these considerations :)


This is a valid perspective. Since this is an emerging space, we are still figuring out how to show up in a healthy way for the open web.

We recognize that the balance between content owners and the users or developers accessing that content is delicate. Because of that, our initial stance is to default to respecting websites as much as possible.

That said, to be clear on our implementation: we currently only respond to explicit blocks directed at the Tabstack user agent. You can read more about how this works here: https://docs.tabstack.ai/trust/controlling-access
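To make the "explicit blocks only" behavior concrete, here is a minimal sketch of what such a check could look like. It is not Tabstack's actual implementation: the `tabstack` user-agent token, the function name, and the simplified matching (plain path prefixes, no `*` or `$` wildcards) are all assumptions for illustration. The linked docs are the authority on the real token and rules.

```python
def is_explicitly_blocked(robots_txt: str, path: str, agent: str = "tabstack") -> bool:
    """Return True only if a group naming `agent` disallows `path`.

    Groups addressed to the wildcard user agent (*) are ignored on purpose.
    The `agent` default is a hypothetical token, not a confirmed one.
    """
    agents: list[str] = []  # user agents named by the group being read
    in_rules = False        # True once the current group has started listing rules
    blocked = False
    best_len = -1           # length of the longest matching rule seen so far

    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()

        if field == "user-agent":
            if in_rules:  # a user-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            # An empty value matches nothing; otherwise use simple prefix matching.
            if agent in agents and value and path.startswith(value):
                # Longest match wins; on a tie, allow beats disallow.
                if len(value) > best_len or (len(value) == best_len and field == "allow"):
                    best_len = len(value)
                    blocked = field == "disallow"
    return blocked


EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /

User-agent: Tabstack
Disallow: /private/
Allow: /private/press/
"""
```

With the example file above, `/private/data` counts as blocked, while `/blog` does not, even though the wildcard group disallows everything. A conventional robots.txt parser would typically fall back to the `*` group when no group names the crawler, which is the difference being described here.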


This tension is so close to a fundamental question we’re all dealing with, I think: “Who is the web for? Humans or machines?”

I think too often people fall completely on one side of this question or the other. It’s really complicated and deserves a lot of nuance. I think it mostly comes down to having a right to exert control over how our data is used, and most of that is currently shaped by Section 230.

Generally speaking, platforms consider data to be owned by the platform. GDPR and CCPA/CPRA try to be the counter to that, but those are also too crude a tool.

Let’s take an example: Reddit. Let’s say a user is asking for help and I post a solution that I’m proud of. In that act, I’m generally expecting to help the original person who asked the question, and since I’m aware that the post is public, I’m expecting it to help whoever comes next with the same question.

Now (correct me if I’m wrong, but) GDPR considers my public post to be my data. I’m allowed to request that Reddit return it to me or remove it from the website. But then with Reddit’s recent API policies, that data is also Reddit’s product. They’re selling access to it for … whatever purposes they outline in the use policy there. That’s pretty far outside what a user is thinking when they post on Reddit. And the other side of it as well — was my answer used to train a model that benefits from my writing and converts it into money for a model maker? (To name just an example).

I think ultimately, platforms have too much control, and users have too little specificity in declaring who should be allowed to use their content and for what purposes.


Thanks for the feedback. We are definitely not trying to hide it. We actually do have pricing listed in the API section regarding the different operations, but we could definitely work on making this clearer and easier to parse.

We are simply in an early stage and still finalizing our long-term subscription tiers. Currently, we use a simple credit model: $1 per 10,000 credits. Every account also receives 50,000 free credits every month (a $5 value). We will have a dedicated public pricing page up as soon as our monthly plans are finalized.
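As a rough illustration of the credit arithmetic above, here is a small sketch. The rate and the free allowance come from this comment. The function name is made up, and the idea that free credits are consumed first with any overage billed at the same rate is an assumption, not a stated billing policy.

```python
from decimal import Decimal

CREDITS_PER_DOLLAR = 10_000      # from the comment: $1 per 10,000 credits
FREE_CREDITS_PER_MONTH = 50_000  # from the comment: free monthly allowance


def monthly_cost_usd(credits_used: int) -> Decimal:
    """Dollar cost of one month's usage after the free allowance.

    Assumes free credits are used up first and the rest is billed at the
    flat rate; the real billing rules may differ.
    """
    if credits_used < 0:
        raise ValueError("credits_used must be non-negative")
    billable = max(0, credits_used - FREE_CREDITS_PER_MONTH)
    return Decimal(billable) / Decimal(CREDITS_PER_DOLLAR)


# Value of the free allowance at the listed rate: 50,000 / 10,000 = $5.
FREE_ALLOWANCE_VALUE_USD = Decimal(FREE_CREDITS_PER_MONTH) / Decimal(CREDITS_PER_DOLLAR)
```

Under these assumptions, a month with 120,000 credits of usage would leave 70,000 billable credits, or $7. How many credits each operation consumes is listed in the API section and is not modeled here.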

Regarding semantic data, our JSON extraction endpoint is designed to extract any data on the page. That said, we would love to know your specific use cases for those ontologies to see if we can further improve our support for them.

