
Yml Coding
https://cloud.google.com/workflows/docs/reference/syntax
======
xiaq
Saying this is reinventing BASIC in YAML seems to miss the point here; the
programming model here is essentially a finite state machine with optional
data flowing between nodes, and AFAIK there isn't really a widespread language
that targets this model.

And a restricted model is not there just so that non-programmers can use it.
Restricting what you can do _in_ the program means that there are more things
you can do _with_ the program. I haven't checked if GCP's Workflows supports
everything below, but here are some things you can do in principle:

* You can visualize the entire program as a flowchart, and visualize the state of the program simply by pointing at a node in the flowchart. This is not possible with a general purpose language since there could be arbitrary levels of call frames.

* You can implement retry policies for each step entirely transparently, and possibly other things like authentication. Aspect-oriented programming is more practical when the programming model is restricted.

* You can schedule the steps onto different hosts, possibly in parallel.

* You can suspend and resume the workflow, since its entire state is just which step is being executed, plus a handful of variables (which presumably are always serializable).

Re the problem of extension: the idea seems to be that you put all "smartness"
inside HTTP services that are written in real languages and only use this as a
dumb glue language.
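
To make the suspend/resume point concrete, here is a minimal sketch in Python. Everything in it is hypothetical (the `STEPS` table, the step names, the `run` helper); it is not how GCP Workflows is implemented, only an illustration that "current step plus a few variables" is enough state to serialize and pick up later.

```python
import json

# Hypothetical three-step workflow. Each entry maps the current variables to
# new variables and names its successor (None ends the workflow).
STEPS = {
    "fetch": (lambda v: {**v, "raw": "41"}, "parse"),
    "parse": (lambda v: {**v, "n": int(v["raw"])}, "bump"),
    "bump": (lambda v: {**v, "n": v["n"] + 1}, None),
}


def run(state, max_steps=None):
    """Advance the workflow; stop after max_steps to simulate a suspend."""
    done = 0
    while state["step"] is not None and (max_steps is None or done < max_steps):
        action, next_step = STEPS[state["step"]]
        state = {"step": next_step, "vars": action(state["vars"])}
        done += 1
    return state


start = {"step": "fetch", "vars": {}}
suspended = json.dumps(run(start, max_steps=1))  # the entire state as JSON
resumed = run(json.loads(suspended))             # could happen later, elsewhere
print(suspended)
print(resumed)
```

Because the state never contains call frames or closures, the same trick would not work for arbitrary code in a general purpose language without extra machinery.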

~~~
mfateev
Look at my project temporal.io. It does the same thing using general purpose
programming languages. Currently Java and Go are supported with Python, Ruby
and C# under development. This is absolutely valid workflow code:

    
    
         public void execute(String customerId) {
           activities.onboardToFreeTrial(customerId);
           try {
             Workflow.sleep(Duration.ofDays(60));
             activities.upgradeFromTrialToPaid(customerId);
             while (true) {
               Workflow.sleep(Duration.ofDays(30));
               activities.chargeMonthlyFee(customerId);
             }
           } catch (CancellationException e) {
             activities.processSubscriptionCancellation(customerId);
           }
         }

~~~
xiaq
But what this does under the hood is still sending instructions to a workflow
engine that does the actual work, as opposed to this code doing the actual
work directly, right?

If that's the case, then what you could do is still restricted by the protocol
of the workflow engine - building the instructions in code gives you some
dynamism, but not a whole world of difference, and complicates things that
are easier done statically. It is definitely a valid approach, but it doesn't
invalidate the approach of writing out the workflow definition statically,
especially if the paradigm is "put all smartness inside HTTP service and only
use the workflow as a glue".

~~~
mfateev
It is a workflow engine. So the actual work happens inside the activities. So
from this point of view it is the same. The difference is that the
orchestration code has the full power of a programming language including
threads and OO techniques. And all the tools like IDEs, debuggers, unit
testing frameworks work out of the box and don't need to be reinvented for a
YAML-based reimplementation of Java.

There are a lot of advantages of using a general purpose programming language
for implementing workflows. An incomplete list in no particular order:

    
    
       * Strongly typed for languages that support it
       * No need to learn a new programming language
       * Practically unlimited complexity of the code 
       * IDE support which includes code analysis and refactoring
       * Debuggers
       * Reuse of existing libraries and data structures. For example, can a YAML-based definition support ordered maps or priority lists without any modification?
       * Standard error handling. In Java, for example, exceptions are used.
       * Easy to implement handling of asynchronous events
       * Standard toolchains just work. For example Gradle for Java and modules for Go.
       * Standard logging and context propagation can be supported
    

And so on. Any new language has to have a ton of tools, libraries and
frameworks to be useful. And using an existing language allows you to benefit from
the existing ecosystem out of the box.

~~~
xiaq
Yes, the orchestration code can be developed and debugged as normal code. I
agree that it can be a pain to write and debug YAML "code".

But what's usually more interesting when it comes to workflows is inspecting
and debugging the workflows themselves, and you'd still need custom tooling
for that, regardless of how the workflow is built.

The actual appeal of using YAML here is that the "code" is amenable to static
analysis. YAML is irrelevant; the DSL embedded in YAML is. For example, you
can easily count how many steps there are, how many edges there are between
steps, etc. If you build the workflow in code, in general you can only know
these after the orchestration code has executed.
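
A rough sketch of that kind of static analysis, in Python. The `WORKFLOW` list is the loop example posted elsewhere in this thread, written as the data a YAML parser would hand back. The `edges` helper is hypothetical, and the fall-through rule (a step without `next` or `return` continues to the following step) is my assumption about the semantics.

```python
# The thread's loop example as already-parsed data (no YAML library needed).
WORKFLOW = [
    {"define": {"assign": [{"array": ["foo", "ba", "r"]}, {"result": ""}, {"i": 0}]}},
    {"check_condition": {
        "switch": [{"condition": "${len(array) > i}", "next": "iterate"}],
        "next": "exit_loop",
    }},
    {"iterate": {
        "assign": [{"result": "${result + array[i]}"}, {"i": "${i+1}"}],
        "next": "check_condition",
    }},
    {"exit_loop": {"return": {"concat_result": "${result}"}}},
]


def edges(workflow):
    """Return (source, target) pairs without executing a single step."""
    names = [next(iter(step)) for step in workflow]
    found = []
    for index, step in enumerate(workflow):
        name, body = names[index], step[names[index]]
        targets = [case["next"] for case in body.get("switch", []) if "next" in case]
        if "next" in body:
            targets.append(body["next"])
        elif "return" not in body and index + 1 < len(workflow):
            targets.append(names[index + 1])  # assumed implicit fall-through
        found.extend((name, target) for target in targets)
    return found


print(len(WORKFLOW), "steps")
print(edges(WORKFLOW))
```

Nothing here evaluates the `${...}` expressions; the graph is read straight off the definition.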

~~~
mfateev
I'm not sure how useful all these counts are. If the static analysis that you
described was that important we would write most of the software in YAML
instead of C/C++/Java/Go/Python, etc. Linux kernel in YAML would be really
cool :).

In my experience, no developer ever asked for this information, especially if
the price is writing code in a Turing-complete YAML/XML/JSON-based language.

~~~
xiaq
Well, that was a contrived example. Something more useful would be e.g.
enumerating all the HTTP endpoints being depended on, or calculating the
maximum resource consumption.

I think our disagreement really boils down to different approaches towards
workflow configuration. Static configuration has its place in a system where
all the "smartness" can easily fit somewhere else. This is often the case
when you are building a workflow that glues many in-house components together,
which tend to have uniform behavior and you can easily extend them. On the
other hand, if you are working with many heterogeneous components that are
clumsy to extend, having a smarter, more dynamic workflow configuration API
definitely makes more sense.
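
As a sketch of the endpoint enumeration mentioned above, in Python: the workflow below is made up (step names and `example.com` URLs are placeholders), and it assumes HTTP steps look like `call: http.get` with the URL under `args.url`. The `http_endpoints` helper is hypothetical.

```python
# Hypothetical workflow definition, as already-parsed data.
WORKFLOW = [
    {"get_user": {"call": "http.get", "args": {"url": "https://users.example.com/v1/me"}}},
    {"charge": {"call": "http.post", "args": {"url": "https://billing.example.com/v1/charge"}}},
    {"done": {"return": "ok"}},
]


def http_endpoints(workflow):
    """List every URL the workflow can call, without running it."""
    urls = set()
    for step in workflow:
        for body in step.values():
            if str(body.get("call", "")).startswith("http."):
                urls.add(body["args"]["url"])
    return sorted(urls)


print(http_endpoints(WORKFLOW))
```

This only works as long as the URLs are literals in the definition; once they are computed at runtime, the analysis degrades to "it depends".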

~~~
mfateev
In my opinion, there are two classes of workflow definition languages: domain-
specific (DSL) and general-purpose.

Configuration-based languages are awesome for domain-specific use cases. AWS
CloudFormation and the HashiCorp Terraform configuration language are good
examples of domain-specific workflow definition languages. They solve
just one specific problem that allows them to be mostly declarative and omit
most procedural complexity. Even in this case, I'm pretty sure that Pulumi
folks would not be 100% in agreement.

The general-purpose workflow definition languages are procedural. And I
believe that procedural code in YAML/XML/JSON is a bad idea. It looks ugly,
doesn't add much value, and never matches any of the general-purpose languages
in expressiveness and tooling. Such configuration languages work in limited
situations, but developers quickly hit their boundaries in most real use cases
and have to look for real solutions.

BTW Temporal and its predecessor Cadence are perfect platforms for supporting
custom DSL workflow definitions. Many production use cases run custom DSLs on
top of them.

------
bonestormii_
Data is data. Code is logic. So the thought process goes...

...what if I could store my _configuration_ (data! right?!) in a way that is
nicely separated from the logic?! No more scripting for me!

Oh wait. Some of the configuration can't be generalized about universally.
Configurations fundamentally contain logic, I guess.

Oh, well then why don't I just represent the _logic in the data_?! That will
be _much better_ than representing the _data in the logic!_

....but now, you are back where you started, only instead of using something
nice, standard, and powerful like Python, you have to use this... language...
thing. This YAML convention you cooked up.

Separate the data from logic. Read the data into the logic. Make a nice
organized place to call custom logic from... like, you know, a file directory
full of scripts, which are called according to some scheme. It could be
another data file. Stop there.

Like this:

    
    
        $ ls -Ra
        ./config/do_something.yaml
        ./config/do_something_else.yaml
        ./config/config.yaml
        ./logs/ping.log
        ./scripts/do_something.py
        ./scripts/do_something_else.py
    

\---

    
    
        $ cat ./config/config.yaml
    
        do_something:
            target: my.stupid.server
            exec_frequency: daily
            ...
        do_something_else:
            target: my.stupid.server
            exec_frequency: monthly
            ...

\---

    
    
        $ cat ./config/do_something.yaml
    
        ping: true
        output: 
           directory: ./logs/ping.log
           mode: append
    

\---

    
    
        $ cat ./config/do_something_else.yaml
    
        delete: ./logs/ping.log
    

\---

    
    
        $ cat ./scripts/do_something.py
    
        from stupid.library.task import Task, subscribe
    
        class DoSomething(Task):
    
            @subscribe
            def ping(self, target):
                return super().ping(target)
    

Then you code the program that collects all of the task methods, puts them in
an ordered list, and runs them if the conditions in config.yaml are met.

Every other thing is done in a task method in python, or something like it. I
don't think we _need_ more abstraction than that.
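
A minimal sketch of such a runner, assuming stand-ins for the imaginary `stupid.library.task` module (`Task`, `subscribe`, and `run_due` are all hypothetical), with config.yaml shown as the dict it would parse to:

```python
# Hypothetical stand-ins for the imaginary `stupid.library.task` module.
def subscribe(method):
    """Mark a method as a task the runner should pick up."""
    method.subscribed = True
    return method


class Task:
    def ping(self, target):
        return f"pinged {target}"  # placeholder for real work


class DoSomething(Task):
    @subscribe
    def ping(self, target):
        return super().ping(target)


# What config.yaml would parse to; the YAML stays pure data.
CONFIG = {"do_something": {"target": "my.stupid.server", "exec_frequency": "daily"}}
TASKS = {"do_something": DoSomething}


def run_due(config, tasks, frequency):
    """Run every subscribed method of each task whose exec_frequency matches."""
    results = []
    for name, settings in config.items():
        if settings["exec_frequency"] != frequency:
            continue
        task = tasks[name]()
        for attr in sorted(dir(task)):
            method = getattr(task, attr)
            if callable(method) and getattr(method, "subscribed", False):
                results.append(method(settings["target"]))
    return results


print(run_due(CONFIG, TASKS, "daily"))
```

All the branching lives in `run_due` and the task methods; the config never needs a conditional.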

~~~
pydry
This is more or less the structure I followed with this testing framework:
[https://github.com/hitchdev/hitchstory](https://github.com/hitchdev/hitchstory)

The whole idea being that you don't want the _story_ to be Turing complete
(there are no loops or conditionals within a story), but the code that
_executes_ it needs to be Turing complete.

~~~
bonestormii_
Right, and it's not like it's ground breaking thinking here. It's just a
matter of some personal restraint on your part to say, "No, I will not turn
python into an interpreter for some disgusting yaml-based language."

Everyone agrees we hate this, until one day it's us who is dreaming it up, and
we just can't resist! There should be no concept of "flow of execution",
except for _maybe_ expansion of previously declared variables just so you can
do something like concatenating two lines together. Otherwise, the lack of
logical operations will make even that unnecessary.

~~~
pydry
Well, knowing when the flexibility of a Turing-complete language is required,
when the simplicity of a non-Turing-complete declarative language is required,
and _where_ to draw the line between them is often pretty hard.

This is why we get YAML based languages that edge towards turing completeness
with conditionals and loops and ugly templating hacks. Mostly people get it
wrong. _I_ got it wrong several times when creating two earlier versions of
that framework.

------
abeppu
I get the appeal of something like this. It's locked down enough that it's
perhaps challenging to do something really unsafe with it. It's maybe usable
by someone who doesn't feel comfortable with a "real" programming language.

But it seems like this falls apart as soon as one service you need to interact
with creates a requirement not anticipated by this very constrained tool-set.
You need to query service A, extract something with a regex, base64 encode
something else before you post to service B? Well we didn't include regexes, a
module/import system, or the ability to introduce UDFs in a different
language.

And if you had the resources to make all your services play into the
expectations of this workflow system, you might not need to use this workflow
system.

~~~
rightbyte
I don't see how writing a kinda abstract syntax tree with the opacity of
Enterprise JavaBeans is easier than say writing the workflow in some Basic
dialect with only GOTOs and IFs.

------
perfunctory
And here we come. Google, in the 21st century, is trying to sell us BASIC

    
    
      - define:
          assign:
            - array: ["foo", "ba", "r"]
            - result: ""
            - i: 0
      - check_condition:
          switch:
            - condition: ${len(array) > i}
              next: iterate
          next: exit_loop
      - iterate:
          assign:
            - result: ${result + array[i]}
            - i: ${i+1}
          next: check_condition
      - exit_loop:
          return:
            concat_result: ${result}
    

edit: it's not April 1st yet, is it?
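
For comparison, a rough Python equivalent of what the steps above compute. The function name `concat` is made up, and the comments map each line back to the step I assume it corresponds to.

```python
def concat(array):
    result = ""                      # define
    i = 0
    while len(array) > i:            # check_condition
        result = result + array[i]   # iterate
        i = i + 1
    return {"concat_result": result}  # exit_loop


print(concat(["foo", "ba", "r"]))
```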

~~~
baq
i've seen this before, as has my father and probably my grandfather, except it
was thought out better and was called lisp instead of yaml.

~~~
mahmoudimus
I completely agree with this statement and made a similar realization as well.
Basically, modern devops and application development boils down to writing
Lisp in YAML.

Config languages for Ansible, K8S, etc. are just basically a bad implementation
of a Lisp-ish language.

~~~
Jtsummers
It’s not a lisp. It’s similar to s-expressions, but it is not lisp. It’s a
tree based language and that’s the entire similarity. There is nothing else
here that is reminiscent of lisp.

~~~
timgilbert
Yeah, among other things Lisp isn't big on labels and GOTO (but BASIC is).

~~~
Jtsummers
Well, Common Lisp isn’t _opposed_ to it, it does have labels and gotos. But
it’s not used in your average code (at least from what I’ve seen they’re
typically constrained to code needing high performance or macros).

But this is missing some of the key things of lisps. It’s not homoiconic (it’s
a tree-based syntax, yes, but with no tree structure unless you count nested
arrays). There are no closures or lambdas. You can’t pass steps/sub-workflows
around as parameters to variables (at least not what is shown) so you can’t do
something like:

    
    
      - a_step:
        - assign:
          - some_task: a_step
      - b_step:
        - call: $some_task
    

I mean, that’d be a basic element for passing function-like things around and
getting the functional capabilities of lisp and let you have higher-order
functions.

This is a weird, basic, imperative language that’s aiming for deliberately
limited capabilities. They seem to have chosen YAML only because it’s already
used as a configuration language, but not because it actually provides any
real value or novelty. We are very much approaching the 00s infatuation with
XMLifying everything here.
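
To illustrate what the hypothetical `call: $some_task` would buy you, here is a tiny Python sketch in which a variable holds a reference to a step and another step calls whatever it points at. `STEPS`, `greet`, and `call` are invented for the example.

```python
def greet(variables):
    return f"hello {variables['name']}"


# Hypothetical step table: step name -> function taking the variables.
STEPS = {"greet": greet}


def call(step_ref, variables):
    """Resolve the step named by a variable, then run it."""
    return STEPS[variables[step_ref]](variables)


variables = {"name": "world", "some_task": "greet"}
print(call("some_task", variables))
```

Once steps are values like this, higher-order steps (map a step over a list, retry any step) fall out naturally, which is the part the YAML syntax shown does not offer.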

------
tuankiet65
Trying to turn YAML into a programming language reminds me of Ansible
playbooks.

~~~
dhosek
We went to great lengths to be able to do parameterized YAML using jinja2
templates. I designed a system that let us effectively create a new yaml out
of two other yamls which let us handle repetitive configuration tasks
reasonably well but there were occasional unexpected challenges thanks to the
syntactically significant white space of yaml.

~~~
hbogert
sincerely, why do this in the first place?

~~~
dhosek
We were targeting configurations of Concourse for deploying software. We
would have to do essentially the same deployment to multiple environments
(dev, qa, staging, performance, prod) and Concourse doesn't provide a
sufficiently rich configuration in its use of yaml. For a single app cut and
paste is acceptable but once you get into multiple microservices, it becomes
more efficient to do it this way.
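
The general shape of "one shared config plus a small per-environment overlay" can be sketched in Python as a deep merge. This is not the jinja2-based system described above, only an illustration of the idea; `deep_merge`, `BASE`, and `PROD` are hypothetical names and values.

```python
import copy


def deep_merge(base, overlay):
    """Return a new dict where overlay values win, merging nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# What a shared YAML file and a prod-only YAML file might parse to.
BASE = {"deploy": {"image": "app:1.2.3", "replicas": 1, "env": {"LOG_LEVEL": "debug"}}}
PROD = {"deploy": {"replicas": 4, "env": {"LOG_LEVEL": "warn"}}}

print(deep_merge(BASE, PROD))
```

Merging parsed data instead of templating text sidesteps the whitespace problems, at the cost of needing a script in the pipeline.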

~~~
q3k
Have you looked into Jsonnet/CUE/Dhall? They attempt to solve this problem,
but give you an actual programming language instead. I've been using Jsonnet
to successfully escape the 'yaml-templating-yaml' hell for years now.

But yeah, Concourse configuration files are probably the worst YAML verbosity
offenders, even worse than k8s manifests.

------
osmarks
This seems like some sort of horrible inner platform effect. It would be nice
if they would use an actual embeddable programming language, like Lua.

~~~
baq
turing complete configuration? no thanks. i like it when my configs are
guaranteed to halt.

~~~
perfunctory
what makes you think this yml language is not turing complete?

~~~
osmarks
Based on skimming the docs there, you _can_ do infinite loops but may have
finite storage (strings are apparently <=64KB and there seems to not be a way
to append to arrays). It does have subworkflows, so maybe you could store data
in the callstack in some horrendous way, but it's probably also limited in
size.

------
swiley
IMO: most of these weird DSLs come from wanting a data structure but getting
confused and writing a bad programming language instead.

------
thangalin
Tangentially related is my YAML pre-processor that performs string
interpolation:

[https://bitbucket.org/djarvis/yamlp/](https://bitbucket.org/djarvis/yamlp/)

Along with Pandoc, it allows common prose from Markdown to be de-duplicated,
as described in my Typesetting Markdown series:

[https://dave.autonoma.ca/blog/2019/07/06/typesetting-markdow...](https://dave.autonoma.ca/blog/2019/07/06/typesetting-markdown-part-5/#interpolated)

Obviously it's laborious to insert YAML keys everywhere, so I'm developing an
editor that integrates YAML and plain text document formats (such as
Markdown):

[https://www.youtube.com/watch?v=u_dFd6UhdV8](https://www.youtube.com/watch?v=u_dFd6UhdV8)

------
jpxw
Even writing CI configuration is onerous enough in YAML, let alone a program

------
orf
I like this. Go "har har har yaml" all you like, but this reads a damn lot
better than AWS step functions and seems way more powerful.

There are a number of tasks that benefit from something like this, and there
are a huge number of advantages from being able to encode steps in domain
specific languages such as YAML and have them execute with truly "no code".

~~~
baq
yaml is not a language. it's a serialization format.

this thing here is a language that serializes into yaml. there are zero
benefits of using it except perhaps 'it is yaml so i don't have to compile my
configuration'.

note that xml with xslt would have handled it better.

~~~
orf
> there are zero benefits of using it

There are many, many benefits from having steps defined in a restrictive,
tightly-controlled language that doesn't require sandboxing.

There are also many benefits to not using XML and XSLT, not least of all user
experience.

------
jpalomaki
Do they have a graphical tool for defining these workflows? For me this looks
like the business process management tools from the past. With those the idea
was not to use them to write program logic, but connect the dots on very high
level. See for example the use case pictures at [1].

[1] [https://cloud.google.com/workflows](https://cloud.google.com/workflows)

------
cordite
How about we use resumable lua scripting

------
dpc_pw
Peak Yaml?

~~~
throwaway4889
Yaml is the new xml

~~~
IshKebab
I dunno, XML may have been verbose and overused but at least it had a fairly
solid design. YAML is terrible.

~~~
Spivak
YAML “won” for some definition because it’s really natural to write. Mix
colons and brackets, multiline strings that aren’t weird, type annotations
when desired, add your own types, basic but functional-ish DRY support that’s
transparent to the reader, dict merging. Throw in some value templating a la
Ansible and you have a really solid language to express annoying data shapes.

------
fmakunbound
Good god, it’s stuff like this that makes me contemplate leaving professional
programming.

------
wickedOne
despite the readability of the yaml syntax, for me the big disadvantage is the
inability to use meta / type info.

for a straightforward collection of strings: probably good enough, for
anything else xml is more likely to cover the use case better

~~~
Spivak
YAML has really good typing support with type annotations being a first class
citizen. But you’re dependent on the hosting software adding or
understanding the types.

For example

    
    
        start: !date 2020-09-12
    

could actually map to a native date object.

------
taylorlapeyre
I am surprised at how well this actually reads!

