The devs of Windmill seem to have taken the winning advice of "do one thing and do it well" and done the exact opposite. Going over windmill.dev I have no idea what specifically the software should be used for. Is it a competitor to Retool? Airflow? Temporal? Apparently there's a no-code workflow builder? A drag and drop UI builder? An online IDE? Dozens of integrations? What on earth is going on??
The criticism is more than fair and our positioning is not very clear.
We are building a developer platform for enterprises to build internal software, including APIs, workflows, background jobs, and UIs, using code but only where it matters. We happen to be quite decent in all aspects, with a focus on performance, and we have an active community of users and customers who are mostly developers and hence have a plethora of feedback and feature requests, so we're happy to oblige.
By sitting at the right level and exposing code as much as possible, we can be great generalists without sacrificing quality. Python, TypeScript, Go, Bash, and all the query languages are where the complexity lives and where the capabilities come from. They are well-developed languages with great ecosystems. We merely provide a way to run those languages in a system fit for production at enterprise scale.
> We are doing a developer platform for enterprise to build internal software
> for enterprise
I mentioned this product to my enterprise boss and his enterprise answer was "we can't easily get this approved by the tech-selection committee for no good reason" (and that would have just been for an off-to-the-side proof-of-concept instance or two running in our sandbox k8s cluster, let alone actually bringing it onboard as a paid service).
How do enterprise users not already have the functionality you are offering covered one way or another and why would they migrate off of what they have to what you are offering? What kind of business today doesn't have something like ActiveBatch, AWS/Azure functionality with massive cloud support contracts, etc. already?
You can't win every contract, but the market we're after is immense, so even a small percentage of enterprises being more daring is sufficient.
But that argument is true for any enterprise software: you cannot be small and sell to enterprise (or you must have amazing product-market fit). That's where open-source comes in.
We have overlap with many existing products, but we are the only one to provide such a product with an emphasis on DX and excellent performance and scalability, and to be open-source. We currently have a few big enterprise customers and many enterprise open-source users. It's a lot easier to get approval since they get it for free and it's fully self-hostable and air-gappable. Once they have tried it on a few non-essential workflows and seen the benefits, they are more motivated to make a case internally. Wide net, some catches, until it becomes ubiquitous.
I happen to be a really strong believer in open-source, so it's not just an adoption strategy; I think it also happens to be the best strategy for infrastructure-level software like this.
Whether enterprise or not, the criticism is still fair: the value proposition needs to be super clear. I’m missing this.
Your comments make it a bit more clear, but the big question (as an Airflow user) I have is: why would I want to migrate?
A big question for an enterprise customer is typically: will I save money with this? In developer productivity, in resource costs, or something else? Can you unlock new things that were previously not possible?
The question is: what are you using Airflow for? My experience with Airflow has been in data ETL, and if that's your use case, you are not the target for something like windmill.
The target would most likely be automating HR, finance, and IT workflows and tearing down the shadow-IT web of crazy integrations that exists at every larger organization I've ever seen.
We’re talking “new hire” workflow for example, which at my current employer is about 25 activities in a workflow.
All assets have to be lifecycle-managed in an enterprise, and automated workflows will help you scale that. Far too many enterprises have a lot of people shuffling Excel files and emails around to fulfill processes and workflows.
Airflow is a beast imho and usually not used in the same niche IME.
Just guessing with no background experience on windmill, maybe to support reactive workflows. Airflow does not have a good proposition on that front. Sensors are a workaround and not performant.
A problem is that you're trading 'Compete by offering one thing this company needs' with 'Compete by offering 5 things this company needs, 4 of which are in various stages of entrenchment at the organization already'.
You'll be up against incumbent software and the teams of workers it supports at the company, who have a vested interest in their livelihoods not being taken away and the backing of whatever commercial organization wants to keep their business. All to provide that +1 value-add you can't even focus on as much as you should, because your efforts are split competing with multiple other products.
Do one thing well, FIRST. Make it easy to integrate with and build integrations from. THEN expand into the supporting ecosystem.
I’ve been the person in a position to recommend enterprise solutions and have those recommendations taken.
I’ve also been the person on the other side of that relationship, helping potential customers make a case that my product is worthy of their software/services gatekeeper’s consideration.
There is often high motivation to bring in a smaller/newer tool, because the existing solutions are not scalable, or are missing a critical feature, or require a team of specialized people to make it work, or has onerous licensing costs, etc.
“Shadow IT” is also a very real thing. Someone’s boss is frustrated with the bureaucracy and timeframe for bringing something in and so they just throw it on a corp card or install it themselves and ask for forgiveness later. This happens everywhere and is often the precursor to forcing the product to become officially blessed because by now it’s supporting production workloads and has proven its worth.
This particular space is still ripe for innovation. Very few of the products that target this kind of tool building approach are close to finished and each has its quirks.
> There is often high motivation to bring in a smaller/newer tool
I must be misunderstanding the definition of "enterprise" here. I can't picture any of the 3 enterprise companies I've ever worked for adopting any sort of product like this.
I’ve worked with many of the largest companies across quite a few verticals in a product management capacity. Many large enterprises (think 200K+ employees) still have pockets of tool building and automation springing up everywhere. Big names you’ve heard of.
I’d be willing to bet money that this was happening where you were, but you may not have been exposed to it. It often shocks IT management what they find when turning on software auto-discovery and inventory tools.
These tools are often employed in operations and other non-core-to-the-business departments to simplify/automate busy work happening there.
I’ve worked for years implementing tools like this in enterprises.
My fondest memory was introducing a workflow platform at an enterprise with a fully outsourced IT-ops department; that setup was ruining everyone else in terms of cost, speed, and quality.
The security dept (this was a large bank) was gridlocked in this setup and wanted the ability to automate their way out of the sourcing mess.
I spent roughly three months building a few "hot path" workflows important to them, which enabled them to take back ownership of the processes and save an incredible amount of time and money.
Encapsulating these integrations as workflows makes them observable and measurable. In the first quarter after deployment, the customer had tens of thousands of runs, and average time to completion went from 2 weeks to 2 days. It also cut out a rather expensive middleman.
And this is not the worst enterprise customer I've worked with. One had 4,000 Windows servers manually provisioned and managed.
There’s low hanging fruit out there!
You basically trade competence for agility and quality; unfortunately, a lot of enterprise IT shops are not willing or able to make that trade.
They have “not only one way or the other” - they have “all the ways” and this is exactly the problem you want a solution like this to fill.
We’re talking backoffice/cost-center workflows in IT, finance and HR - absolutely not business profit processes mind you.
A tool such as this can help an IT department take ownership of integrations and workflows, building them in a framework that can improve speed and quality. It will help with organizational scalability.
The alternative is in my experience a giant mess of integrations lacking ownership and observability.
Every large enterprise needs “something” like windmill, they might just not know it.
>A workflow engine is a software application that manages business processes
>Enterprise resource planning (ERP) is the integrated management of main business processes, often in real-time and mediated by software and technology. ERP is usually referred to as a category of business management software—typically a suite of integrated applications—that an organization can use to collect, store, manage and interpret data from many business activities.
What is the difference between a workflow engine and an ERP system?
Why not call it an ERP system if you're targeting the broader enterprise market? As someone who is far from SV, the business people I know probably don't know what a workflow engine is, but they definitely know what an ERP system is.
Happy to expand on this. Only where it matters, meaning what you would consider to be not boilerplate, aka your business logic. I hate low-code that forces me to use a rigid system for the core of my tools, but I love the velocity it brings. What if we could have high velocity for all the boilerplate (permissioning, queuing, frontend, secret management, etc.) but code for the business logic? That's what windmill is. In a way, this is what serverless was meant to be: write your business logic, servers autoscale, and you can hook a gazillion managed services on top.
We bring that vision in a fully open-source and consistent platform that you can host wherever you please.
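To make the "code only where it matters" idea concrete, here is a minimal sketch of what a script on such a platform looks like, as I understand it: you write a plain typed main() function, and the platform derives the input form and handles queuing, permissions, and secret storage. The variable path below is made up for illustration; check the docs for the exact client API.

```python
# Minimal sketch: business logic as a plain function, boilerplate handled
# by the platform. The typed signature is what the auto-generated UI form
# is derived from; the variable path is illustrative.
import wmill

def main(customer_email: str, amount: float, dry_run: bool = True) -> dict:
    # Secrets come from the platform's secret store instead of .env files.
    api_key = wmill.get_variable("u/admin/billing_api_key")

    if dry_run:
        return {"status": "skipped", "customer": customer_email}

    # ... actual business logic here, e.g. a billing API call using api_key ...
    return {"status": "charged", "customer": customer_email, "amount": amount}
```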
I suggest you work with a copywriter and come up with a better way of presenting windmill.dev. How you're presenting it makes sense to you, because you built it. You know what you mean when you say words like "boilerplate" and "business logic". That has a specific meaning in your head. But it's different for every person in the software industry. To a customer, it's confusing.
For investors:
> We are a developer platform for enterprise, offering a performance-focused solution for building internal software using code. Our open-source platform allows businesses to focus on their unique business logic while we handle the boilerplate tasks, resulting in high-velocity development and the ability to scale at enterprise levels.
For devs:
> We are a developer platform for enterprise, providing a system for running code written in Python, Typescript, Go, Bash, and query languages at scale. Our platform focuses on performance and offers extensive capabilities to build internal software, including APIs, workflows, background jobs, and UIs. With a fully open-source and flexible architecture, developers can concentrate on their business logic while leveraging our robust ecosystem for efficient development and hosting options.
Generic:
> Windmill is a developer platform designed for enterprises to build internal software efficiently. It combines the speed of low-code solutions with the flexibility of coding, allowing developers to focus on unique business logic rather than boilerplate tasks like permissioning, queuing, and front-end development. Our platform is fully open-source, offering high performance and hosting versatility.
> Our active community of developer users provides constant feedback, driving our platform's growth. By emphasizing commonly used languages such as Python, TypeScript, Go, and Bash, Windmill serves as a generalist tool without compromising quality, enabling complex functionalities within a production-ready, enterprise-scale environment.
I think there is a disconnect in your marketing of the product as well, Ruben. From what I can gather, I think you are better suited to the RPA (robotic process automation) market. Also, there is a current global push to aggressively retool and re-train personnel in AI (e.g., the State Dept. prioritizing an AI-ready workforce). I think you can bridge the gap between the world needing to retool workers NOW... and AI achieving AGI in the near future. Once AGI is achieved, human workers will just become robot (AI) handlers/overseers. Best of luck to you.
I had the same thought when it popped up on my radar a year ago. Now that I've been using it for a few weeks, it's difficult to go back to the exact tools you're naming. I didn't realize how large an impediment it is to move back and forth between all of them. Windmill is the thing I never knew I needed. It's changed how I think about delivering data products/solutions.
Doing one thing well is not universally winning advice. You can do multiple things; it really depends on the execution and other factors. They are doing one thing, workflows, and doing it well.
Also, they are using Postgres for everything. I think that is a good example of picking one thing that you know very well and using it to its limits.
But I do agree that windmill is doing so many things that the elevator pitch needs some work. But a lot of companies are trying to do the same, so I do not think people will think of workflow engine + online IDE + app builder as a natural grouping in 5 years.
> Is it a competitor to Retool? Airflow? Temporal?
If it makes you feel any better, I'm pretty sure I personally couldn't tell you what any of those specifically do in terms of "the one thing they do well" / why anybody would ever use one instead of the other / why you wouldn't need all 3 to accomplish the same thing in slightly different ways.
The intersection of all those areas listed is often an inordinate amount of work and integration and maintenance.
So for people who want an easily grokkable description: windmill seems to be a super-platform that covers enough of the important bits that come up.
I'm also very confused about the pricing. It's "open-source" but with SSO user limits? The paid version comes with Prometheus metrics, so does that mean the open-source free tier doesn't? So many questions.
For our purposes it looks like Windmill might take the place of MWAA (Airflow) that we were planning for our ELT pipelines and could take the place of Rundeck as well. I'll need to look at it and play with it (especially from the ELT pipeline perspective for running transformers), but I suspect that we will be able to be up and running both faster and cheaper with either self-hosted Windmill or cloud.
Many others have done the same thing (I mean, something like Airtable is now "everything" as well, as are many others). It's hard to do positioning like this though, I must admit; for new users it's almost impossible to say what exactly it is. The "what can I do with it?" -> "everything" type of thing.
Does being fast beyond certain threshold really matter for a workflow engine, though, especially given that many workflows have long-running tasks? What I find matters is multi-tenancy. That is, can a workflow system support as many jobs as a user desires, yet each job gets scheduled and executed as if it is the only one on the workflow engine. For instance, Amazon EC2 and EBS probably have millions of jobs and job queues running at any moment, yet each job has pretty predictable and stable startup delay and execution time. In contrast, each of Nomad and k8s has a global queue that sometimes gets stuck because of bad actors or too many pending jobs (I understand that an orchestrator is far more than just a workflow engine. I'm just contrasting the user experience).
So I agree that for most jobs already taking more than 300ms, 50ms of overhead per job will not be perceivable. What matters is the DX, and even though I'm very proud of our performance, a lot of our work is focused on providing a consistently great developer experience in all aspects (VS Code extension, instant deployment, etc.).
However, there is one big benefit to being performant, which is that you can use the same stack for performance-sensitive jobs such as event-streaming use cases. That removes duplicated infrastructure. The other benefit is that it improves the overall developer experience with fast previews.
Yes. Just think of how annoying it is when you have to wait 5 seconds instead of 0.5s for a 2FA message, then multiply that by everything that you ever do with your workflow engine. That's not even to speak of cases where running the workflow (e.g. acquiring data) faster is a competitive advantage, although this thing is still probably too slow for truly HFT-level tasks.
Yes, it's too slow for HFT, but anything would be too slow for HFT unless it's custom-built. There you likely want a streaming engine, or even to hand-optimize all your event handlers to shave off nanoseconds.
A workflow engine is indistinguishable from a regular interpreter or runtime, except for the fact that it's suspendable (hence can run long-lived tasks).
Ideally a workflow engine being fast means it can be just the platform you write everything in. The artificial separation between "normal" code and "workflow engines" is completely unnecessary. Our platforms are extremely fragmented, pointlessly so. Remote vs local, statically vs dynamically typed, async vs sync, short running vs long running...
If an engine is written without these arbitrary limitations, it can be more.
Is this what I need? We have ad hoc business processes like:
1. Sell service to client
2. Email client with scheduling info
3. Agree to scheduling date with client
4. Send client email with documentation and request their data
5. Remind client to send data
6. Alert admins data hasn't been received
7. Beat data out of client
8. Prepare data
9. QA and checkoff service
10. Email client to review and approve
11. Publish/run service
12. Send client stats/results
And I want to move that from spreadsheets, personal emails and admins keeping in their head where everything is at, to web forms and uploads, automated emails and dashboards. I looked at airtable, smartsheet, budibase and many others but they seem to be very project-based, where you are inventing a unique workflow for each project and using it to keep managers on top of where their team is at with the various steps. Project Management vs process management. None of them seem to have decent calendar integration. Few of them seem to be great about emails or scheduled scripts.
I have APIs for my data, or I can code them if needed. I'd prefer a low-code to a no-code approach, where managers have a spreadsheet view and can do some of the UI work, and programmers can make it do stuff and handle integrations.
Yes, windmill would likely be a great fit for that as long as you're willing to do a little bit of coding.
For instance, we do not have a calendar integration per se, but we can manage your OAuth tokens for any service (including Google Calendar) and let you write code using their official SDKs.
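Roughly what that pattern looks like in a Python script, for the Google Calendar case; the resource path and its shape are assumptions for illustration, while the client calls are the official Google SDK's:

```python
# Sketch: the platform stores/refreshes the OAuth token; the script uses
# the official Google SDK with it. Resource path and shape are illustrative.
import wmill
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build

def main() -> list:
    gcal = wmill.get_resource("u/me/gcal")  # assumed to hold {"token": "ya29..."}
    service = build("calendar", "v3", credentials=Credentials(token=gcal["token"]))
    events = service.events().list(
        calendarId="primary", maxResults=10, singleEvents=True
    ).execute()
    return events.get("items", [])
```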
For spreadsheets, you can build a similar interface in windmill using aggrid.
It seems you also need approval steps and to have your workflows wait for events, and that's built into our workflows.
If you want to explore this, I'd recommend taking an approach I've seen with scripts that's very successful:
The first bash script simply tells a person what to do and has them hit enter when they've done it.
Then the easiest to automate parts are automated, as and when it's valuable to do so.
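A sketch of what that looks like in practice; nothing tool-specific, just a script where every step starts as a prompt and gets swapped for real automation one at a time:

```python
# Each process step starts as a manual prompt; automate steps individually
# as it becomes worth the effort.
def manual(instruction: str) -> None:
    input(f"TODO: {instruction}\nPress Enter when done... ")

def send_docs_email(client: str) -> None:
    # This step has since been automated; originally it was just manual(...).
    print(f"(automated) emailing documentation and data request to {client}")

def run(client: str) -> None:
    manual(f"Agree on a scheduling date with {client}")
    send_docs_email(client)
    manual(f"Check whether {client} sent their data; remind them if not")

if __name__ == "__main__":
    run("ACME Corp")
```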
Windmill appears to have manual approval steps - so you could try modelling things just with those. Then automate the most annoying/costly/easy to automate steps one by one.
This should be easier to create, and actually solves an initial problem of knowing where everything is up to. If it doesn't help, or the process is too inflexible when reality hits it, you won't have spent too long automating things.
I don't think that'll achieve your goals, it's far too clunky and inflexible for actually running a business end-to-end. (No knock on Windmill, they're all like this in my experience as the way they model the world is how a programmer thinks about things, not a business owner. BPMN is straight up cancer—avoid.)
I would suggest looking at workflow systems within the VFX space, which they call "pipelines". They are human-driven and maintained, but support very high levels of process automation.
VFX pipelines are written in Python, but the key thing is, they're very good at a few things simultaneously:
(1) Highly irregular work…
(2) That nevertheless requires lots of process, including of the ad hoc variety…
(3) With high human touch (human in the loop), so calendars, task assignments, timelines, approvals, producer dashboards, etc.
(4) That's repeatable on the next project/opportunity, with as much customization as needed.
As a bonus, pipelines handle billions of dollars of work annually under high pressure, with tens of thousands of users, so there's a lot of hard won experience built into them.
I personally have found that there are three levels of VFX pipeline tech that need to be combined and customized to do everything. This is the most recent stack I've set up:
* Kitsu for the producer/manager level, human task assignments. Mainly for managers, has the calendar functionality.
* Prism Pipeline for software integration (i.e. keeping the humans following the process). This is what people actually do.
* Kabaret for compute work. People usually kick off these jobs, but they're handled by the farm. You'll be responsible for everything that happens here. [0]
Don't try to get by with just one or two of the three, even though feature-wise it looks like you might be able to. Incorporating computer-driven workflows is tricky and requires purpose-built libraries in my experience. Just use the right library for the job, and accept that they all have somewhat overlapping functionality.
For most non-VFX work, you'll need to script access to the browser, since that's where people do stuff today (i.e. with the Prism Pipeline library). Launch Chrome, Firefox, or Edge with a debug port and use Playwright as your "plugin" mechanism, with Prism/Python as the driver. You can make your pages do whatever you want and script literally any web-based process, filling in fields from your database as needed, adding data to the page, etc.
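A minimal sketch of that browser-driving setup, assuming you've started Chrome yourself with --remote-debugging-port=9222 (the URL and selectors are made up):

```python
# Attach to an already-running, logged-in browser over CDP and script the
# pages people normally click through, filling fields from your database.
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.connect_over_cdp("http://localhost:9222")
    page = browser.contexts[0].new_page()  # reuse the existing session/cookies
    page.goto("https://portal.example.com/orders/new")
    page.fill("#customer", "ACME Corp")
    page.click("text=Submit")
```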
For the last six years, I've run a business that does around $5M of annual revenue using this approach, with 4 FTE people (including me). We handle around 40 million compute tasks a day, but also have all of the "daily work" where humans are in the loop. We do all of the tasks of the type you've listed.
HTH!
[0] If you need fully automated task generation (so, no human in the loop), write "pump" scripts in plain Python that push jobs into Kitsu or Kabaret at a particular interval. I personally run them under a god monitor, and have them exit after, say, 15 minutes (or whatever interval makes sense) to get cron-like functionality. This is useful because you can pause execution of any pump process easily, change the interval, restart them, etc. plus you get easy access to the logs for debugging. The scripts themselves are usually less than 100 LOC. I have around 60 of these that run the automatic parts of our business.
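A guess at the shape of such a pump script; fetch_pending_jobs/submit_job stand in for whatever Kitsu/Kabaret calls apply:

```python
# A plain loop that pushes pending work into the task system, then exits so
# the process monitor restarts it: cron-like behavior with easy pausing,
# interval changes, and log access.
import time

RUN_FOR_SECONDS = 15 * 60  # exit after ~15 minutes; the monitor restarts us

def fetch_pending_jobs() -> list:
    return []  # query your tracker/database for work that needs pushing

def submit_job(job) -> None:
    pass  # push the job into Kitsu/Kabaret

def main() -> None:
    deadline = time.monotonic() + RUN_FOR_SECONDS
    while time.monotonic() < deadline:
        for job in fetch_pending_jobs():
            submit_job(job)
        time.sleep(30)  # polling interval; tune per pump

if __name__ == "__main__":
    main()
```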
Whenever I need custom, temporary behavior in this part of the business, I copy the pump script, edit the original to avoid the special case, and have the new script only handle the special case (with whatever new behavior I need). Done, takes very little time, and is easy to run temporarily (e.g. for the next week, a common occurrence because some downstream vendor has caused a problem we need to work around).
Another approach to trigger tasks/workflows is to have a server that gets pinged by 3rd parties (e.g. Webhooks) and does the same whenever the endpoint is invoked. We have that too, but it's literally just a few scripts. (Most workflow systems seem to believe this is damn near 100% of the stuff that needs to be "automated.")
Searching for "Vfx pipeline" just gave me visual effects stuff. Quite sure that was not what you talked about. Could you explain a bit more what ypu mean by it and point to a resource. Keen to learn more.
I think windmill could have helped with several of the tasks and automated some. It's very easy to create dashboards and tables with statuses. But it's definitely not a process tool, if that was what he was after.
All of these projects are targeted at VFX/animation, but there's nothing about them that ties them to those things, it's just why they were developed. You can use them to run any kind of business IMO.
I posted about it because (a) they handle the commenter's use case, and (b) most HN people are probably unaware they exist and would be discouraged into thinking they are only useful for VFX/animation when in fact they are general approaches to running a business that requires high interaction between humans, processes, and background jobs in a very dynamic environment.
Jobber looks like it is more in the "project management" space. All of my processes are post-sales. We're not inventing new projects that we need to organize. We have a short list of services we sell and each of them has a process with a lot of dependent steps that require a lot of communication and coordination.
> We have a short list of services we sell and each of them has a process with a lot of dependent steps that require a lot of communication and coordination.
Something that comes to mind: In Jira my administrators have locked it down so I can't move certain tickets to certain statuses
It always amazes me that people spend this much time and effort on writing articles like this and don't run their text through a spell checker even once.
It also amazes me that people still use text editors that don't spellcheck by default. Anyone know which editors those could be? Seems pretty crazy in 2023!
If this is an engineer, they probably use whatever tool they're most familiar with, and that tool is probably set up for mostly engineering (code) work, in which case in my experience spell checking is a major distraction with an outsized ratio of false positives, because my identifiers aren't chosen for their validity in a wordlist. Even in markdown, it's more annoying than helpful most of the time, so I tend to leave it as a manual step that I mostly ignore. People know what I mean 99.99% of the time. So, that'd be my guess. Any sort of editor that's been tuned to an engineer's preferences which includes only manual spell checking.
It's pretty good imo. It's super easy to add words too, easier in fact than the spell checker in iOS or macOS. The one in iOS is the worst imo, as it adds stuff automatically based on common mistakes you make because "machine learning" and you can't even see the list of the "learned" words to remove things it learned that aren't actually words!
Yeah, but this sort of article is what you get at the end of the day. Plus there are several ways of getting rid of false positives and as GP said: at least do it before posting.
You're right, I apologize. Truth is, I wrote everything in VS Code markdown and really wanted to finish the article before going out for dinner, so I ended up skipping the spell check. We've fixed everything.
The Code Spell Checker extension is great. It has proper handling for camelCase and it's fast to add words to the dictionary (cmd + .). Catches many typos when coding.
Probably not best for the last line of defence for public articles, but probably good enough.
I turn off spell check in any text editor I touch because I can't handle the visual noise. Agreed on running spell check. The few grammar issues in the article made my brain glitch for sure.
I've been following Windmill since their HN launch (my first script was edited 540 days ago, I just checked). Less than a year ago I started using it more heavily and it's been a breeze. Their Discord server is very active and Ruben usually replies and fixes bugs in minutes, even on weekends.
Daily puzzle generation, daily PDF updates/creation (if the original content is changed, it triggers a new generation), triggering and editing of Remotion videos.
It says open source, then it says limit of 10 SSO users. IANAL so this confuses me. Can you explain what this means?
Actually, that sounds rude, excuse the ‘tism, I’m sorry. I’m just not super familiar with the license stuff; I generally only touch MIT-licensed code when it’s work related, but I thought open source generally allows modification of the code. So how do you enforce a limit of 10? I skimmed the code a bit because of this confusion and I see stuff about enforcing a license in there. So couldn’t anyone just remove that license-check code if it’s open-source?
If that can’t be modified, then it’s just “source available”, isn’t it?
Which to me is also fine, I just feel like I’m being misled, if you get what I’m saying?
Basically I think this project is pretty cool, so I wanted to bring it up to my boss because I think we have a need for this, and I had previously brought up Airflow, but I don’t know how to explain this to her.
Sorry for the stream of consciousness. Typing on mobile while answering Teams messages and tickets.
Some people will argue we are open-core, because almost everything is open-source (AGPL) but our SSO and enterprise plugins are under a source-available proprietary license.
The reason we didn't split them out of the codebase is that it's harder to maintain and would require dynamically loading the plugins. We didn't want to waste time on that when we had so much to build.
oh ok I see. Just the SSO part is like that. That makes more sense.
So I guess I can just self-host for our team of 7 without SSO and we can finally organize these messy cron jobs we’ve accumulated. We would not be reselling at all. Just organizing some annoying aspects of our day to day.
Thanks for the clarification. Much appreciated. If we end up expanding our usage down the road, I’ll see if I can convince her to consider shelling out funds for the SSO stuff too. I’d love to be able to support this project. Seems pretty cool!
You can even self-host with SSO for 7 people. SSO is free for up to 10 people.
You will never have to pay if you do not want to.
We believe there are many reasons to start using Windmill, and most of them are not worth monetizing by themselves, but we strive to build software so great that you will move more and more stuff onto it, and at that point you will want our enterprise plugins.
Windmill is great. Definitely don't stray from the self-hostable + DX mission! I haven't had need to use it professionally, but I use it on a home server to run several small web crawlers and yt-dlp jobs. Really fun piece of tooling.
I hate cron for running small personal scripts and jobs. I hate ad-hoc handling of logging / observability / success monitoring. I want a web UI for all of it. I hate trying to figure out how to handle event based job triggers that are more fancy than the 10 line bash script they end up running. I hate trying to wrap up solutions for all those into infrastructure as code — I will end up running an unpatched Ubuntu 20 for the next decade if I can't just nuke the box and reload my config + scripts. I hate the idea of relying on the continued existence of a hosted service, as well as them being able to see that I'm fine tuning an LLM for erotic stories (or whatever).
Windmill solved all that for me. I write a short little Python script, paste it into the web UI and boom it's running with good answers for all the above issues. That's great DX.
If workflow engines try to solve every problem they can devolve into plugin and complexity hell. Jenkins issues have caused a lot of headache at my day job lately. I don't want Windmill to fall into the same trap.
I would love to begin using this system, but the licensing is giving me serious pause.
While most of the software is under the AGPLv3, the Commercial License section of the README [0] implies that the company takes on a fairly broad interpretation of the AGPL.
In particular, the line "[...] to build a feature on top of Windmill, to comply with AGPLv3 your product must be AGPLv3 [...]" seems to imply the company aligns with the stance taken by Google and other companies: that even calling the application via API is enough to trigger copyleft [1].
This implies that if I were to build a sign-up form that triggers a Windmill workflow in the backend, my entire application would either need to be AGPLv3 or I would need a commercial license.
That's perfectly reasonable, as it means any non-AGPL use will have to contribute back to Windmill via a commercial license. However, it does mean positioning this as a "fully open-source" alternative to Airflow is only technically correct. In practice this is much closer to "source available" than to what most developers would think of as "open source".
If this isn't how Windmill wants their license interpreted, I highly encourage clarifying things.
Got it! That's great to hear. It'd be great to explicitly clarify that in the README :)
Out of curiosity: Have you considered alternative licenses like SSPL or the Elastic license, which make a clear(er) delineation between whitelabelling/hosting vs simply using the application? I'm sure I'm not the first person to have written off Windmill because of the AGPL.
Those licenses are not open-source and are likely to be more restrictive than what we currently offer. It was important to me to be truly open-source.
If you are using windmill as a whole, you're free to use it however you please under the AGPL, but for customers with doubts, we do sell an enterprise/commercial license.
I think the workflow engine is the _best_ part of it, but together with the script editor (an online code editor with LSP support) and the drag-n-drop app builder, which is aimed at developers so you "drop down" to code for much of the logic and connect it to a script or workflow, it's a really nice combo.
(You see a lot of other companies trying to span out into workflow + script + app builder.)
It's really made for developers, or at least people who want code as a "first-class citizen" of the platform. So it has a CLI to sync to/from a VCS, and a VS Code extension. But at the same time, I have gotten a tech-savvy person on the customer success team to create a workflow where they connect to our API and actually create meaningful services way faster than the enterprise developer team would have done.
Scripts and flows automatically get endpoints to connect to, but every request is saved to the db and picked up by a worker that has to save the result to disk before the server picks the result up again and sends it to the client. So I would not use it for customer-facing endpoints. I have used it for a lot of internal tools for employees, since they can wait 1-2 seconds, because you can teach them that it works and it provides a lot of value.
I am using Rundeck now to schedule project jobs, get notified on errors, provide a UI for PowerShell/bash scripts, and have decent project and user management. It uses Quartz and can handle hundreds of jobs per minute without a problem. It can connect to a git repository to keep job definitions there. I also use the local file system a lot (on Windows or Linux), as I don't like to edit my scripts in the browser, given that they are mini programs with lots of separate includes.
I would like to try windmill instead. Can it cover all those cases, particularly the part where scripts live on the file system? Can I use windmill's automatic GUI creation for PowerShell scripts in that case? Rundeck passes GUI values as environment variables when starting a script.
Not only can it cover those cases, you will likely be pleased with the advantages of such a system (the automatic GUI creation is just one of them).
The binary itself doesn't sandbox your processes unless you're in the nsjail mode. Hence, if you either use the binary raw (without docker) or mount the filesystem to be available to your container, it can run anything that is available on the filesystem.
I played a bit on the cloud version now and was able to run things in PowerShell, create a custom GUI form, connect a button to a script, and show results in the text area.
I didn't succeed in returning all output from the PowerShell script, just the last value. The only way I got it was by using the "job log" control. That one, however, has a custom windmill header (job guid etc.), so is there any way to show a raw log?
I generally do not like no-code tools, but my first impression of this mix of no-code and code is intriguing, and the app feels very nice overall, with a great UI and easy-to-understand features. Good job.
So bash and PowerShell are a bit special, since they are the only ones not implemented around a main function and thus have no result to return.
However, we had that feedback a few times, so there is a trick: if you write your result to ./result.json, we will process it as the result of your PowerShell script.
Am I able to install it on Windows? I need Windows-specific things, such as PowerShell remoting. One of the good things about Rundeck is that the server acts as a worker, and it runs on Windows without problems (installing it is as simple as choco install rundeck). I can then use all Windows tools, run applications (including GUI ones) etc. just the same as if I ran them outside Rundeck, using OTB PowerShell.
Yes, the binary can be compiled for Windows and run there as a worker, but it will need a connection to the main database. Not sure how Rundeck currently does it?
Rundeck is a Java web app, using (by default) an embedded database or the file system for configuration and data (you could use mainstream databases, but I never needed that).
Regarding "Connection to the main database" as far as my use case is, all the settings could live in the file system json files or sqlite if that makes installation trivial for the use case of being very good job scheduler with options to create great jobs UI, logs and management of small number of users (which is a fairly common scenario).
I looked into this a little when Windmill was last discussed here, and the gap you may bump into (as a replacement for Rundeck) is that AFAICT Windmill has no support for executing scripts remotely over (e.g.) SSH and passing the GUI values along as environment variables.
It seems you need a full-blown install of Windmill on each host where actions are to be run.
It's a good question, and deserves a more detailed answer than I'm able to give without more first-hand experience trying to use Windmill this way. But my sense is that there's likely a chunk of legwork involved in passing parameters and credentials over to the remote host and capturing both stdout and stderr from the remote side (things Rundeck handles pretty much seamlessly), plus a concern that when using Windmill like this, the question of which host the job executes on isn't a first-class concept like it is in Rundeck.
It's likely that in time I will evaluate Windmill to get a more detailed understanding of this; there are definitely aspects of it that are very substantial improvements on what can be done in Rundeck.
But inevitably, the question of whether to learn & install & maintain two separate systems side by side, with some degree of overlap in their functionality, vs. just picking one system to do the whole job and living with the shortcomings of whatever it doesn't do so well - it's always going to be a difficult one to weigh up.
Interesting that Windmill's backend seems to be implemented in Rust, but the docs talk about supporting jobs written in Python, TypeScript and Go. I understand there is generic Docker support, but does anyone know if you can run Rust jobs natively?
We could add support for Rust, and it would work the same as Go: do a build step first, then cache the binary. If we have enough demand for it, we will add it (we are a small team and can only prioritize what's popular, unfortunately).
On a personal note, I write a lot of Rust (and love it), but using Rust for writing non-backend type of stuff is not something I would do myself. I'm more of a believer in hybrids like Polars that optimize the compute in Rust but provide a compatible SDK in Python.
One balance I've yet to find is how to handle the split between code-in-database (workflows stored in a database and edited via web IDE) vs code-in-git (workflows checked into the repo and only changed via the normal development+peer-review process).
It looks like Windmill is primarily the former (code-in-database) but provides an API to sync things from a git repo. Is there a mechanism to enforce the rule that certain scripts/functionality/secrets are locked behind workflows sourced from a designated repository?
I think most of windmill's enterprise customers do it the code-in-git way.
I have chosen another path where I deploy and then it's automatically pushed to git. But windmill provides you with 3-4 really good ways of deploying, from one cowboy just doing what he wants, up to the normal enterprise way of doing software, meaning code-in-git where people only have read access to production and cannot modify things. You can have separate instances running, or just separate workspaces, and promote between them.
A little late, but after reading this thread I moved an Airflow instance to windmill, and I much, much prefer windmill. Airflow was clunky and heavy for my simple workflow that consists of running a few Python scripts.
Also, the auto-generated UI is great for playing with the arguments, as are the cron UI and the next-start-date view. And since it's written in Rust, resource usage is much lower than Airflow, which consistently peaked the CPU while doing nothing.
I did not realize at first glance that you can write a script and then trigger it through HTTP (with an API key; not sure if you can make a public one or if that's a security nightmare) to basically have your own self-hosted AWS Lambda, the whole "serverless HTTP triggers/functions" craze.
Yes, absolutely. We're even faster than Lambda with dedicated workers, since the design is simpler (they while-loop over jobs from the queue) while still keeping all the error handling, observability, and easy deployment.
Is there a workaround I'm missing where the URLs are exposed without needing an API key? (Something I'm pretty sure Lambda allows if you configure it properly)
The API keys can be generated to be webhook-specific (that's what they are by default when you generate them in the script's UX) and hence can be made public; there is no risk in exposing them publicly, since they are similar to the UUID of a Lambda (not a key per se, but impossible to guess).
We require an API key because every request is permissioned, and hence we need to know who the script is executed on behalf of. For instance, scripts can fetch secrets/variables, but those calls are permissioned with the permissions of the caller, which in this case we get from the webhook-specific token.
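For anyone curious, triggering a script over HTTP then looks roughly like this; the URL format is from memory, so verify it against the docs, and the workspace, path, and token are placeholders:

```python
# POST the script's arguments as JSON with a Bearer token (here a
# webhook-specific one); the response is the script's return value.
import requests

TOKEN = "..."  # webhook-specific token generated in the script's UI
URL = "https://app.windmill.dev/api/w/myworkspace/jobs/run_wait_result/p/u/me/my_script"

resp = requests.post(URL, headers={"Authorization": f"Bearer {TOKEN}"},
                     json={"name": "ACME Corp"})  # becomes main()'s arguments
resp.raise_for_status()
print(resp.json())
```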
While appreciating that they’re not direct substitutes, does anyone have experience comparing Windmill with Dagster for managing dependency resolution in complex data pipelines?
I briefly tried windmill for a project that involved creating custom workflows on-demand from a configuration file. I can't recommend windmill for data pipelines; it is meant to be more of a low-code internal app platform like Retool or Budibase. It is meant for relatively static workflows that require human intervention, like a simple business process involving some API calls and human approvals in the loop. For complex (and potentially reconfigurable) data pipelines, Dagster is a much better choice.
When you tried it, we didn't have S3 integrations or restartable flows. We will present all of this on day 5 of our launch week, so it might be a good time to revisit.
I agree we were not a good fit before. I think we would now compare favorably, as we will offer excellent ergonomics for data processing, leveraging Polars, DuckDB, and other OLAP libraries to their full extent.
It was several months ago, so that is entirely possible. Back then I got the feeling that windmill tried to be more of a low-code business/internal tool platform than a data/ETL workflow tool. I especially missed an expressive way to define workflows programmatically (I think you had a JSON schema, but nothing as powerful as Dagster, where you can define a whole workflow in pure Python).
YAML is not real code, so the part that I consider to be real code is that each step is its own file in Python or TypeScript that you can edit in your code editor, with your plugins and testing frameworks working. The steps are normal functions that you can run locally.
It would be possible for us to do what dagster/prefect/airflow do, which is use a macro-processing step/decorators to build the graph dynamically; that graph is what our YAML is in the end, a 1:1 encoding of our DAG spec called OpenFlow: https://docs.windmill.dev/docs/openflow/
We haven't done it yet because in most cases the decorators are a lot like YAML: a fairly rigid way of declaring that some functions are nodes you can chain in limited ways. On the other hand, not providing that mode allows us to put more effort into the low-code graph builder for now.
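For comparison, this is roughly what the decorator style looks like in Dagster; the decorated functions are only node declarations that get wired into a graph, which is the rigidity mentioned above:

```python
# Dagster-style declaration: @op marks functions as DAG nodes, and calls
# inside the @job body are recorded to build the graph, not executed eagerly.
from dagster import job, op

@op
def extract() -> list:
    return [1, 2, 3]

@op
def transform(rows: list) -> list:
    return [r * 2 for r in rows]

@job
def my_pipeline():
    transform(extract())
```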
But, as someone who loves compilers and has built a few, I'm very eager for us to provide such a mode, so it's probably a few months away :)
We'd love to have your input on the DX for data/ETL once we present it Friday so feel free to join our discord or shoot me an email at ruben@windmill.dev
Thank you for the extensive reply! The first part you mentioned is exactly what we were missing back then: we wanted to dynamically generate workflows starting from a configuration selected by the user. This wasn't really possible unless we generated the YAML OpenFlow specification ourselves.
In the end we gave up and rolled our own simple tool that just does what little we need. That said, it is cool that you are considering offering a more code-friendly way to define workflows. I still think this doesn't offer the same level of dynamism as Dagster, where you can easily design branching/conditional workflows. I suppose your considerations regarding the decorators/compilers go exactly in that direction.
You can use that to generate client SDKs in any language and build your own DAG with it. That's what one of our customers did, building a reactflow-to-openflow library: https://github.com/Devessier/reactflow-to-windmill
It's not as good as the decorator way, but we move fast, and if you still have interest in it we could prioritize it (and ask you for feedback :))
This looks great! Are there any plans to include Ruby as a supported scripting language along with Python, TypeScript, Go, Bash, and SQL?
I currently use my own crude workflow runner for processes that are mostly sql, but with some ruby code. I’ve stuck with it because it works and we currently use Ruby for almost everything, so it’s not worth it to add another language to the mix so we could use a better tool.
So that is the self-hosted enterprise version. You can self-host for free.
It does not phone home. But if you download all the resource types from the windmill hub, you will send an HTTP request to the hub.
I do not believe the EE version phones home either, as of now. But I think the plan is to report usage back, or I'll need to report it manually.
Hi Windmill devs -- why build this on relational Postgres at all? I've been toying with the idea of forking Airflow and trying to swap out the backend for something document-based, on Mongo or something else.
The reason being that db migrations are a pain in the relational, strongly-typed schema world vs. the more forgiving paradigm of documents.
Because 95% of the tables you need for this are relational (queue, completed_job, users, etc.), and the few things that are unstructured you can represent well in PG with JSONB (which is what we do for inputs and outputs).
That wasn't the concern being presented. The concern is "I want to store mostly unknown structured data, so why don't I use a DB already set up to do that."
Only if you deem SSO to be of an essential nature to such products. To me it's all semantics at this point.
But it's complex software that requires software engineers with salaries, and we do not want to rely on donations, so we do have to give people reasons to pay. SSO and audit logs are good ways to segment enterprises to support the project.
I could use this as a bridge between developers and operations, but in my case the lack of localization is annoying for the staff on the ops side. I don't have time to contribute something like this right now, but I will keep an eye on it.
Did you guys consider existing standards for representing workflow definitions before choosing OpenFlow? For example, the Common Workflow Language.
We did, and started with an existing standard, but trying to fit the standard quickly became more complex than rolling our own.
Agreed, we just created yet one more standard, but things like input transforms being full JavaScript expressions, or the way we encode suspend steps, were impossible to retrofit.
You can always run compiled programs from bash as binaries (but yeah, it's not as great an experience as being able to run interpreted languages directly).
Articles generally can't be show hn's either so it doesn't matter much for the purposes of Show HN. The full thing is "Off topic: blog posts, sign-up pages, newsletters, lists, and other reading material. Those can't be tried out, so can't be Show HNs. Make a regular submission instead."
I would argue that implementing your internal queues inside your DB is sound system design. Your backups and system restores are as simple as it gets. You should have a specific reason to do anything more complicated. And no, "deploy at scale" is puffery, not a good reason.
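For reference, the core of a Postgres-backed queue is small; here's a minimal sketch using SELECT ... FOR UPDATE SKIP LOCKED so multiple workers can claim jobs without stepping on each other (table and columns are illustrative):

```python
# Claim one queued job atomically; SKIP LOCKED lets concurrent workers
# each grab a different row instead of blocking on the same one.
import psycopg2

CLAIM_JOB = """
    UPDATE jobs SET status = 'running', started_at = now()
    WHERE id = (
        SELECT id FROM jobs
        WHERE status = 'queued'
        ORDER BY created_at
        FOR UPDATE SKIP LOCKED
        LIMIT 1
    )
    RETURNING id, payload;
"""

with psycopg2.connect("dbname=app") as conn:
    with conn.cursor() as cur:
        cur.execute(CLAIM_JOB)
        row = cur.fetchone()  # None when the queue is empty
        if row:
            job_id, payload = row
            print(f"claimed job {job_id}: {payload}")
    # the with-block commits on success, making the claim durable
```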
Postgres is having a moment with excellent queueing libraries right now, not sure what GP is referring to. We use Oban[1] at Mux for some workloads and love it, and a former colleague just built something similar in Go that was on the front page yesterday (River)[2].
Connection limits, locking issues, concurrency issues, multiple consumer/producer issues, dead-letter, record expiration, performance limits, scalability limits, lack of gateway, lack of interoperable standards, tightly-coupled applications, general architectural limits.
Can you "work around" all that? Sure, it's technically possible (well actually it's not possible to solve all of the above, but most of it). Just like it's also possible to take a mini-van and convert it into a hot rod, so it can pick up groceries after it's entered in a drag race. It will just cost you a lot of extra time and money, and do both jobs poorly.
It's NIH syndrome plain and simple. Hipster nerds who want to invent something for fun, rather than using something off the shelf. People decrying "complexity", who then poorly implement complexity themselves, in the form of functionality shoved into a system not built for it. Software architecture by HN meme.