
Show HN: Gaia – Build powerful pipelines in any programming language - michelvocks
https://github.com/gaia-pipeline/gaia
======
myWindoonn
"Any programming language" but only Go is currently available and it's not
clear whether other languages need anything besides gRPC support in order to
be eligible.

Cool idea, early code, we'll see whether it goes anywhere.

~~~
masukomi
The screenshot has Go, Java, C++, and Node.js as choosable languages...

~~~
myWindoonn
From the first few paragraphs of the README, "Develop pipelines with the help
of SDKs (currently only Go) and simply check-in your code into a git
repository." If they've added those others in the intervening time, then
great!

~~~
detaro
[https://github.com/gaia-pipeline/gaia/issues/15](https://github.com/gaia-pipeline/gaia/issues/15)
2 days ago

> _Sadly, gaia is in alpha phase and we currently only support pipelines
> written in go._

------
jsd1982
The language used in the README is extremely odd. What do you mean by
pipeline? The term 'priority' is confusing because what you describe is not a
classical interpretation of priority. That sounds more like job ordering with
a limited DAG form of dependencies (sequential, fan out, fan in). How does
data flow from one job to the next and how does it get combined for a fan-in?
What if I have two jobs that get fanned in to? More documentation coverage of
the basics of what you're talking about would help.

~~~
michelvocks
Hey jsd1982. Sorry for the late reply. You are right, I didn't describe the
term "pipeline" explicitly. A pipeline is a compilation of functions which do
"something". We used/use "priority" to determine the order of execution of
these functions. We already noticed that priority is probably a bad name and
we actually need something different
([https://github.com/gaia-pipeline/gaia/issues/19](https://github.com/gaia-pipeline/gaia/issues/19)).
We are currently working hard on the documentation part
([https://gaia-pipeline.io](https://gaia-pipeline.io)). The project is really early. I hope
we can improve within the next weeks/month a lot! :-)

------
faizshah
There's a pretty interesting Apache project called NiFi that's kind of
similar; it's for data flow across different software systems. Apparently it
was developed and is still used at the NSA.

[https://nifi.apache.org/docs/nifi-docs/html/overview.html](https://nifi.apache.org/docs/nifi-docs/html/overview.html)

~~~
snthpy
Very interesting. The provenance tracking is a nice feature that I haven't
seen a lot of in other systems.

Any examples of this being used in the wild?

------
smarx007
Wait, didn't we have BPMN more than a decade ago? BTW, if "any" programming
language necessarily means Go for you, here is one BPMN implementation in Go:
[https://zeebe.io](https://zeebe.io)

~~~
brightball
BPMN is the most underappreciated technology in our field IMO.

~~~
StefanHoutzager
Agreed. And the ideal language to build a backend interpreter for BPMN - I am
doing so - is Elixir imho (a language to build massively scalable soft
real-time systems with requirements on high availability, see e.g.
ds.cs.ut.ee/courses/course-files/To303nis%20Pool%20.pdf ).

------
regnerba
Can anyone give an example of what this would be useful for? I don't really
understand its use case and the one sort of example in an image, of creating a
k8s namespace and deployment, is covered by our CI/CD system. If this is meant
to replace that and be part of the CI/CD I don't see anything talking about
how to trigger the pipelines.

I feel like I am missing a small but critical piece to understand how this
should be used and where.

~~~
michelvocks
Hey regnerba. Sorry for the late reply. I don't know which CI/CD system you
use but please let me explain my current experience. :-) At my company, we
often had the requirement to do complex stuff besides the deployments. For
example, we wanted to offer developers a service that lets them deploy our big
monolith into Kubernetes from their code fork via "One-Button-Click". To
get this working, we had to use Spinnaker and Jenkins. We had a really poor
experience with Spinnaker because everything needs to be configured manually
and there is no way to actually "code" it. Jenkins supports this via Jenkins
Pipelines. Those pipelines are actually nice but they force you to write them
in Groovy. Gaia solves this problem. Any programming language can
potentially be used to develop such pipelines.

~~~
danielodio
@michelvocks -- the Armory team would love to hear more about the challenges
you had with Spinnaker. (We're commercializing an enterprise version of OSS
Spinnaker).

Also, we've created a 'pipelines as code' feature called Dinghy that may have
helped. And our installer & configurator provides a much smoother install &
configure experience. Details at www.Armory.io

Hit us up at hello@armory.io if you have more specific feedback (our exec team
reads emails to that addy).

DROdio

------
Karunamon
Is this basically a CI tool? It looks very similar to Gitlab CI or Concourse
or Jenkins Flow. If so, odd to not see that initialism anywhere; "pipeline" is
pretty generic.

~~~
rhencke
'Jenkins Flow' is now called 'Jenkins Pipeline'. Gitlab CI also uses pipelines
in its terminology
([https://docs.gitlab.com/ee/ci/pipelines.html](https://docs.gitlab.com/ee/ci/pipelines.html)).
I think the use is appropriate here.

~~~
rahimnathwani
Thank you for that link. Gaia's README assumes readers are familiar with what
they mean by 'pipeline' but, because I'd never seen the word used in this way,
I was scratching my head wondering whether they were talking about stages of
data processing or similar.

~~~
michelvocks
Hey rahimnathwani. Sorry for the late reply. You are absolutely right, this is
my fault. I should have explained the term "pipeline" a bit more. I will work
on this!

------
empath75
I’d prefer a dependency graph to a priority system, personally. It’s
much easier to reason about and maintain.

~~~
michelvocks
Hey empath75. Sorry for the late reply. We also discussed this
([https://github.com/gaia-pipeline/gaia/issues/19](https://github.com/gaia-pipeline/gaia/issues/19))
and you are right. We will switch soon to a
dependency system instead of a priority system. :-)

------
cjhanks
This looks nice enough. It may be worth mentioning that the real challenge for
creating a generalized pipeline is rarely the control flow. Usually
performance suffers because of poor intermediate data locality and poor
balancing of system utilization.

Unless pipeline code has the ability to match estimated job utility to device
capabilities, it won't be useful in many non-trivial cases.

Unless there is an automated way to store intermediate assets such that data
locality between stages is (at least somewhat) optimal, a significant share of
the total time will be spent in process migration.

------
jacques_chester
I can't give an unbiased review because I love Concourse. As I see it, the
pros are the availability of the type system and testing ecosystem.

I don't mind YAML myself, but could I use something like Enaml[0] and get
the same benefits as you see them?

[0] [https://github.com/enaml-ops/enaml](https://github.com/enaml-ops/enaml)

~~~
ramoz
any references/comparisons for Concourse?

~~~
jacques_chester
Could you elaborate on what you're after?

The canonical reference is the main website:
[https://concourse-ci.org/](https://concourse-ci.org/)

------
stevekemp
Looks like there is no way to download the results of any pipeline that has
executed. For example if I wanted to write a pipeline that would:

* Compile a program.
* Run some tests.
* Create a Debian binary package.

It looks like I'd have to write a final task to upload to a staging repository
in the pipeline itself.

~~~
michelvocks
Hey stevekemp. Sorry for the late reply. You are right, Gaia is alpha so this
feature is currently missing. I hope we can provide this soon. :-)

------
rs86
The design on the product screenshot is too complex. Why that fat curvy font
on the page title? We need more brutalism

~~~
michelvocks
Hey rs86. Sorry for the late reply. I actually like the design a lot. I think
many developers enjoy a fresh modern design, too :-)

------
oldgun
Thanks. Looks like something I've been looking for. Interesting idea! Will
evaluate that.

~~~
michelvocks
Hey oldgun. Sorry for the late reply. Good to hear that. Let me know if you
have any questions! :-)

------
meitham
It would be nice to see this mature as an open-source alternative to
Informatica PowerCenter.

