
Citrine: Localized Programming Language - MindGods
http://citrine-lang.org
======
olodus
I don't get it at all. Nothing stops you from writing your code in your own
language in most other prog langs (as long as it has unicode support). The
reason a lot of us chose to write in English is because we want to share our
code and English is the most universal lang we have right now. As a non-native
English speaker I most definitely see the problem in this but it is also an
imperfect world and communication is one of those problems which will never
have a perfect solution.

Programming languages should imo be steps ahead of this from the outset. They
have simple rules that can be parsed even without understanding the words used. I
could probably understand some Rust code written in Mandarin after studying it
for a while. In our line of work, clear communication is important.
Spoken languages are not a good way of accomplishing that.

------
lifthrasiir
I've said too much about this subject in the past [1], but my litmus test for
"localized" programming languages is Korean support (both because I speak it
natively and because it is very much different from most Indo-European
languages). It spectacularly fails. At the very least it is evident that
Citrine only ever cares about languages with prepositions (e.g. `x on:
'greet:' do: { ... }`).

[1]
[https://news.ycombinator.com/item?id=21352775](https://news.ycombinator.com/item?id=21352775)

~~~
hhas01
“Citrine only ever cares about languages with prepositions”

You’re confusing keyword/identifier localization for natural language. I agree
with everyone who says NL is the wrong problem to solve, but that’s not what
Citrine is attempting to do.

Look at it this way: Citrine code is NOT natural English either. At most, it
is a sort of “pidgin English” where individual words can be readily understood
but the grammar is wholly artificial.

For instance, a Chinese or Japanese dialect should continue to use whitespace
to separate identifiers. It’s a programming language, not a natural one, and
that difference is the point. Trying to “fake it” leads to dead-end mistakes
like AppleScript. Trying to do it for real creates exactly the sort of
ambiguities and imprecision that a programming language is meant to eliminate.

It’s about finding the right compromise between human accessibility and
machine precision, and anything that breaks down barriers between the
traditional US English-dominated programming world and the billions of humans
who speak something else has got to be a step in the right direction.

~~~
lifthrasiir
I definitely agree that localized languages can only reflect a _subset_ of
human languages (and it's of course true even for English-based languages),
but that doesn't make my point moot. If the author did really care about other
languages and wanted to pursue the fixed syntax, every identifier should have
been strictly nouns or verbs and nothing else; for example (say)
`animate(source: X, target: Y)` instead of `animate(from: X, to: Y)`.

~~~
hhas01
“every identifier should have been strictly nouns or verbs and nothing else”

Totally agree. TBH I don’t think the authors’ enthusiasm, while admirable, is
quite matched by their knowledge and experience. A bit more up-front learning
could probably save them disappearing down some obviously wrong paths.

As you say, they should focus on verbs and nouns, and getting those to
machine-translate precisely. Nailing that would be a big deal just in itself.
If they later want to experiment with adding prepositions to make code read
“more naturally”, make the system add those on a per-language basis according
to per-language rules laid down by native-language speakers.

Thus in English the method signature might present as `animate(from: source,
to: target)`, because that’s what English readers expect and like. But that
doesn’t mean that `from` and `to` should be translated to every other
language, because that’s just an exercise in cutesy-clever nonsense, burying
what is significant under what is not, and software already suffers from more
than enough of that.
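
To make the idea above concrete, here is a minimal Python sketch. It assumes the canonical signature stores only noun-like parameter names and that each locale may optionally supply display labels. `CANONICAL_SIGNATURE`, `DISPLAY_LABELS`, `render_signature` and the `demo` locale are all hypothetical names for illustration; nothing here is taken from Citrine.

```python
# Hypothetical sketch: language-neutral parameter names plus optional
# per-locale display labels. None of these names come from Citrine.

CANONICAL_SIGNATURE = ("animate", ["source", "target"])

# Labels would be chosen per language by native speakers. A locale that has
# no natural preposition simply supplies nothing and keeps the plain noun.
DISPLAY_LABELS = {
    "en": {"source": "from", "target": "to"},
    "demo": {},  # placeholder locale without preposition labels
}

def render_signature(signature, locale):
    """Render a signature the way a reader of `locale` would see it."""
    name, params = signature
    labels = DISPLAY_LABELS.get(locale, {})
    parts = [f"{labels.get(param, param)}: {param}" for param in params]
    return f"{name}({', '.join(parts)})"

print(render_signature(CANONICAL_SIGNATURE, "en"))
# animate(from: source, to: target)
print(render_signature(CANONICAL_SIGNATURE, "demo"))
# animate(source: source, target: target)
```

The point of the split is that only the nouns need to be translated precisely; the prepositions are presentation and can differ, or be absent, per language.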

------
Someone
The site isn’t clear on it, but from downloading the translation package
([http://citrine-lang.org/downloads/headers.tar.gz](http://citrine-lang.org/downloads/headers.tar.gz)),
it seems translation is a simple string
replacement. I don’t think that’s enough to cover all languages.

[https://en.wikipedia.org/wiki/Non-English-based_programming_...](https://en.wikipedia.org/wiki/Non-English-based_programming_languages#Modifiable_parser_syntax)
has a list of languages that do at least as well (I think AppleScript tried to
do better, but I can’t find examples).

Of course, switching language for the programming language itself won’t
translate identifiers, so multi-language teams still will have to decide on
the language(s) used for those.

~~~
hhas01
“it seems translation is a simple string replacement. I don’t think that’s
enough to cover all languages.”

Honestly, simple string replacement isn’t good enough to cover _any_
languages. Homonyms and synonyms, anyone? I suspect machine translation will
be harder with code simply because there’s a lot less contextual information
around individual words compared to ordinary prose with which to make a best
guess as to meaning.

“I think AppleScript tried to do better, but can’t find examples”

Dr Cook’s HOPL paper on AppleScript gives an example of French and Japanese
dialects (p20; can’t paste it here as it’s an image):

www.cs.utexas.edu/~wcook/Drafts/2006/ashopl.pdf

Not really the same thing as Citrine as it relied on manual localization of an
application’s resources (custom keywords defined in AETE terminology, vs
labels and tooltips on GUI controls). And, as you say, it did not do anything
to localize user-defined words.

In any case, AppleScript is a really good example of how NOT to design an
accessible language syntax. All the rigidity and tolerance of a machine
language, with all the complexity and ambiguity of a natural one faked on top.
(Plus all the fun that comes with arbitrary keyword injection—argh.)

This is why I say _artificiality_ is good. Humans aren’t after Shakespeare,
they’re after high-level understanding of program code.

~~~
Someone
_“Homonyms and synonyms, anyone?”_

I think, for this case, that’s mostly solvable by being careful to not make
assumptions. For example, you should create separate translation strings for
the ‘for’ in “for…in” and in “for…each”, and for the ‘define’ in “define
function” and “define procedure”.

Also, when (not if; if your language is successful, it will happen)
translators report problems, you should ‘simply’ not hesitate to introduce new
‘clones’ of to-be-translated words.

Even ignoring languages with different word order, that wouldn’t make things
perfect. For example, there will be languages where the correct way to say
“define” depends on the (perceived) plurality or gender of the function name.

I think the correct way to handle this is by letting the translator produce a
grammar for your language that produces the same tokens as the ‘original’.
That probably is beyond many would-be translators, though.
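
A rough Python sketch of the "separate translation strings" idea, assuming keywords are looked up by construct and role rather than by bare word. The table layout, the `translate_construct` helper and the `xx_...` spellings are placeholders I made up, not real translations and not how Citrine's header files are organised.

```python
# Hypothetical sketch: translation strings keyed by construct and role, so
# the "for" of "for…in" and the "for" of "for…each" are translated separately.
# All "xx_..." spellings are placeholders, not real translations.

CONTEXTUAL = {
    "for_in.for": "xx_for_a",
    "for_in.in": "xx_in",
    "for_each.for": "xx_for_b",
    "for_each.each": "xx_each",
}

def translate_construct(construct, words, table):
    """Translate the keywords of one construct using role-specific keys."""
    return [table[f"{construct}.{word}"] for word in words]

print(translate_construct("for_in", ["for", "in"], CONTEXTUAL))
# ['xx_for_a', 'xx_in']
print(translate_construct("for_each", ["for", "each"], CONTEXTUAL))
# ['xx_for_b', 'xx_each']
```

Adding a new "clone" when translators report a problem is then just a new key. It does nothing for word order or agreement, which is where a per-language grammar would be needed.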

------
oneplane
This is just silly. The whole point of having an abstraction like a higher
level programming language is so that we can all work together on tasks
instead of spending time on language details.

Unless you are working on an isolated system in an isolated setting that will
be destroyed rather than merged, none of this helps. It's the same as
localised programming some people do today where you simply can't work
together because you lost the most important common denominator.

------
Jenz
> Citrine is one of the first∗ embeddable∗∗, general purpose, localized
> programming languages (...) designed to allow every man to write code in his
> mother tongue

This concentration of asterisks had me laugh out loud. Anyhow, this reminds me
of so-called “auxiliary” constructed languages, which are
supposed to be easy to learn for as many people as possible
all over the world. In reality though, they’re almost without exception wholly
Eurocentric, built mostly on Germanic and Romance languages, which barely
cover _one_ of the main language families just in Europe... Likewise,
Citrine promises grossly more than it delivers.

Though we shouldn’t stop at that; if we look away from what it claims to be,
what it _is_ seems pretty neat. For education, maybe this is great.

------
ridaj
I once wrote an internationalized parser for another (more established)
language. As long as you can output an AST, this works.

User feedback was interesting. Some non-native speakers of English objected to
this, on the basis that it was a lot easier for them to distinguish language
keywords (in English) from user symbols (typically in the user's language);
language acted as a kind of syntax highlighter for them.

Also, they felt it disorienting to express in their own native language
programming constructs that they'd learned in English. For example when one
translates "a `while` loop", the keyword `while` is often left untranslated as
there is no equivalent domain-appropriate word in the target language, and the
generic translations don't "sound" right.
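
For illustration only, a minimal Python sketch of why "as long as you can output an AST" is enough: if the lexer maps each localized keyword spelling to the same canonical token kind, everything downstream is language-independent. The `KEYWORDS` table, the `demo` locale and its spellings are invented for this example and are not from the parser described above.

```python
import re

# Hypothetical keyword tables: surface spelling -> canonical token kind.
# The "demo" spellings are illustrative, not from any real translation.
KEYWORDS = {
    "en": {"while": "WHILE", "end": "END"},
    "demo": {"tandis": "WHILE", "fin": "END"},
}

# A word (letters, digits, underscore) or a single punctuation character.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")

def tokenize(source, locale):
    """Return (kind, value) tokens; keywords carry no locale-specific text."""
    table = KEYWORDS[locale]
    tokens = []
    for text in TOKEN_RE.findall(source):
        if text in table:
            tokens.append((table[text], None))
        elif text.isdigit():
            tokens.append(("NUMBER", int(text)))
        elif text[0].isalpha() or text[0] == "_":
            tokens.append(("NAME", text))
        else:
            tokens.append(("OP", text))
    return tokens

english = tokenize("while x < 10 end", "en")
localized = tokenize("tandis x < 10 fin", "demo")
print(english == localized)  # True: a parser sees identical token streams
```

Note that user symbols (`x` here) pass through untouched, which matches the observation that keywords and user names end up in different languages unless identifiers are translated too.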

~~~
hhas01
“they felt it disorienting to express in their own native language programming
constructs that they'd learned in English”

Trust me, going from natural-language English to programming-language English
is just as discombobulating. (Been there, done that; made a deliberate point
of remembering that experience.)

If you’re only testing it with existing programmers, you’re bound to skew your
results because anything new and different to what they’re already used to is
a disruption to their established flow. As in any scientific research,
designing the right control for your trial is critical to avoiding GIGO.

A better test would be taking groups of English and non-English
_non_-programmers, and teaching both of them entirely from scratch. (This assumes,
of course, that each group’s teaching materials have been human-localized to
the same standard, and all trainers follow the same script.) That way you
avoid polluting your results with preconceptions and existing biases.

Out of interest, is there any more public information of your work here?
iris-script[1], my own end-user-friendly language project and obvious anagram, has
a way to go before I can begin to explore localizability, but it’s on the TODO
list so I am collecting links to relevant material. Ta.

\--

[1] [https://github.com/hhas/iris-script](https://github.com/hhas/iris-script)

~~~
ridaj
> A better test would be taking groups of English and non-English
> non-programmers, and teaching both of them entirely from scratch.

In a way, we already have this test today. Most of the world's programmers are
non-native speakers and, by and large, they learn programming languages where
keywords are based on English. I've heard people say that language
localization would be a cool thing, but I haven't heard anyone _complain_
about it not being available.

> Out of interest, is there any more public information of your work here?

No, my comment is all the public info there is about it. My $.02,
localizability wouldn't be what I would worry about for a proof-of-concept
language.

------
raptium
The Chinese version looks like just bad machine translation.

`Object` is translated to `宾语` which means the grammar component _object_ in
_subject, verb, object, etc_.

The `power` operator is translated to `功率` which means _A measure of the
effectiveness that a force producing a physical effect has over time._

`Ceil` is translated to `细胞` which means _Cell_ ???

:-(

~~~
dragonwriter
> `Object` is translated to `宾语` which means the grammar component object in
> subject, verb, object, etc.

Sounds right (though OOP treats objects artificially as grammatical subjects,
they are the things on which functions operate, and thus more like what would
normally be grammatical objects; they are the patients rather than the agents
of actions.)

~~~
ridaj
There is an established, domain-specific translation for this and many other
terms of art (in this case, 对象)

~~~
hhas01
Ouch. Sounds like they’re using a general-purpose, not domain-specific,
dictionary for their translations. That might suffice for an initial private
proof-of-concept, but not for a public audience.

Imagine translating a medical or legal textbook without knowing the proper
professional medical/legal terminology. Those target audiences will tear it a
new one, and quite right too.

+1 for authors need to do their homework better.

------
gabordemooij
Hi, I am the creator of the language, my name is Gabor. You can ask questions
if you like.

To answer some:

\- Yes, we use machine translations, they serve as an example, they are far
from perfect. Some language files are translated by native speakers. I think
the website needs to be more clear about this.

\- I use gendered language because coding is a men's job, women belong in the
kitchen! ;-). No, just joking. Women are also welcome to become Citrine users.
I just think the opening sentence is beautiful, it combines the concepts of
male and female in a lovely, natural way ignoring today's PC-bullshit.

\- No, Emoji-language is not allowed in the core. I only support natural
languages. Endangered languages (EGIDS6 and higher) are also welcome. There is
no limit.

I understand that there will be a lot of hate because of this language. I even
received death threats over it. When a young developer I worked with brought
up the idea I even laughed at him. However as I thought it over, the idea
began to grow on me and I longed for a purely Dutch programming language (I
had created one as a child for the C64 by just overriding the BASIC tokens). I
figured that, if I longed for such a thing, maybe others do as well. I decided
to share my code after some years just to give anyone interested some kind of
basis or just discuss it.

It is important to realize that Citrine is trying to strike a balance.
Programs will never read like a book. However, having a programming language
using your own words and grammar just feels better and makes me more
productive; I also tend to make fewer mistakes. The problem with just mixing
Dutch with English programming languages is that it is extremely ugly. Also, you
never know when it's justified to use Dutch or English, especially when
interacting with established English conventions, 3rd party software libraries
or embedded languages in code (like shell or SQL). The other solution,
translating everything into English, is just horrible. I have encountered so
many bugs that stemmed from miscommunication caused by translating into
English that I believe this will become a dead end eventually. One technique I
am working on that might improve readability even further is
simple macro processing, so you can say 'create a new Object' instead of
'Object new'.

Anyway, if you have any questions let me know, always happy to answer ;-)

~~~
hhas01
“I use gendered language because coding is a men's job, women belong in the
kitchen! ;-). No, just joking. Women are also welcome to become Citrine users.
I just think the opening sentence is beautiful, it combines the concepts of
male and female in a lovely, natural way ignoring today's PC-bullshit.”

Dude, really. If you gave a crap about effective communication you would not
just have said that.

Hell, you would not even have _thought_ of saying that; never mind typing it,
reading it back to yourself, and then hitting Post cos you still think it’s a
good idea. SMH

Your enthusiasm is admirable but your limited expertise is clearly showing.
Instead of saying that you’re here to answer our questions, you should be the
one who’s listening to our criticisms and then asking searching questions of
us. A bit more humility and a lot less hubris. You and your product will be a
lot better for it.

~~~
gabordemooij
If you have technical criticisms please share.

~~~
hhas01
Go read the entire thread then, because that’s all you’ll be getting out of me
and probably quite a few others now thanks to your awful attitude.

Programmers like you are the reason I finally taught myself how to code, so I
would never have to depend on your sort for anything. You’re a smug,
condescending martinet with a grossly inflated sense of your own specialness,
and the sooner you grow up/the world kicks you to the curb, the better.

So here’s me expressing _my_ freedom of thought and expression by having
nothing more to do with you.

------
k__
The grammars of languages are so different, I can't imagine how this should work in
practice.

Sadly the site is dead (503).

~~~
qayxc
Works fine for me (the site, that is).

Spoiler alert: it doesn't work at all in practice.

Grammar is one thing, but the "translation" even fails with single words.

It's basically complete garbage.

------
polm23
This is comical. In Japanese nil is translated to ナイル, like Nile River. It
looks like their machine translation software aggressively corrects "errors"
in input.

------
deepsun
In Java (among others), you can use any unicode names. Yes, reserved words
stay the same ("if", "public", etc), but most of the code still reads like
native.
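
The same holds for Python 3, which accepts non-ASCII identifiers (PEP 3131) while its keywords stay English. A small sketch with German names chosen purely as an example (`fläche` = area, `breite` = width, `höhe` = height):

```python
# Python 3 allows non-ASCII identifiers; `def` and `return` remain English.
def fläche(breite, höhe):
    """Return the area of a rectangle (names are German, for illustration)."""
    return breite * höhe

print(fläche(3, 4))  # 12
```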

Also I've seen business ERP languages, completely localized. Those who program
in them usually laugh at it; they say it's not hard at all to get used to
English keywords, business logic is what's hard. Even better, the fact that a
word is English signals that it's built-in.

------
HALtheWise
Is anyone aware of tooling or languages in existence that allow for
maintaining codebases with multiple supported languages? I'm imagining
something similar to a git hook that performs translation at an AST level,
where a library can provide a "strings database" that maps its public members
into arbitrarily many languages. I could imagine that for multinational companies
it might even be worth hiring dedicated translators to maintain manual
translations as part of such a system, although it would definitely be
desirable for it to fall back to machine translation.

That way, anyone could check out a version of the code that's localized to
their native language, including language primitives, standard library calls,
variable names, comments, etc. without fracturing the whole library ecosystem
by language boundaries.
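
I don't know of an existing tool, but the AST-level renaming part can be sketched in a few lines of Python with the standard `ast` module. Everything named here (`STRINGS_DB`, the `demo` locale and its words, `RenameIdentifiers`, `localize`) is hypothetical, and this covers identifiers only, not keywords, and needs Python 3.9+ for `ast.unparse`.

```python
import ast

# Hypothetical "strings database": canonical identifier -> localized name.
# The "demo" locale and its words are placeholders for illustration.
STRINGS_DB = {
    "demo": {"total": "somme", "items": "elements", "item": "element"},
}

class RenameIdentifiers(ast.NodeTransformer):
    """Rename function names, parameters and variables using a mapping."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_FunctionDef(self, node):
        node.name = self.mapping.get(node.name, node.name)
        self.generic_visit(node)  # continue into parameters and body
        return node

    def visit_arg(self, node):
        node.arg = self.mapping.get(node.arg, node.arg)
        self.generic_visit(node)
        return node

    def visit_Name(self, node):
        node.id = self.mapping.get(node.id, node.id)
        return node

def localize(source, locale):
    """Parse `source`, rename known identifiers, and emit source again."""
    tree = RenameIdentifiers(STRINGS_DB[locale]).visit(ast.parse(source))
    return ast.unparse(tree)  # Python 3.9+

SOURCE = (
    "def total(items):\n"
    "    result = 0\n"
    "    for item in items:\n"
    "        result += item\n"
    "    return result\n"
)
print(localize(SOURCE, "demo"))
```

One limitation shows up immediately: `ast.parse` discards comments and original formatting, so a real tool would need a syntax tree that preserves them, plus a separate channel for translating the comments themselves.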

~~~
dTal
>variable names

Ouch. Naming things is, famously, one of the two Hard Problems of computer
science. How much harder if you have to name everything, down to the last loop
iterator, thousands of times - once for each language that humans speak?

------
stepstop
This is cool! Has anyone here taken a computer science course in another
language?

I have wondered if non-English-speaking universities have to teach their CS
courses in 25-50% English, just due to the syntax of the programming languages
generally being in English

~~~
LukaszWiktor
I took a CS course in Polish and had no problem at all with programming
languages syntax in English.

------
1f60c
I think a localized programming language like this could be useful as a
stepping stone while teaching gifted kids, for instance (before moving on to a
“real” programming language like Python or Java).

------
lnyan
The grammar of a localized programming language should be specifically designed.
Only replacing the keywords and operators seems strange in some sense.

Besides, the example in my first language is unreadable.

------
haolez
This actually looks cool! Besides the "localized" stuff, it borrows some good
parts from Smalltalk, JavaScript and Lisp. I'm curious now.

~~~
vincent-manis
They lost me at dynamic scope.

------
woofie11
I'd love to have this sort of i18n for Python or JavaScript, and a good kid-
friendly environment. Scratch does i18n well. Text-based programming doesn't.
There's a gap in age between programming and English fluency.

What's odd to me is the page isn't internationalized. Their explanation: "As a
developer, you have to know some English. Nobody is going to change that
anytime soon."

------
jejones3141
The first? Algol 68 allowed implementations to have language-specific versions
of keywords. I remember seeing a French language Algol 68 book in the stacks
at the OU library that had listings with French keywords. I don't remember
what they did with the portmanteau words that Algol 68 (and much more so in
the Revised Report) favored, like "ouse" and "elif".

~~~
vincent-manis
I remember using Université de Grenoble ALGOL 60 circa 1968 on an IBM 7044.
All of the keywords were in French. I don't remember if anyone hacked it to
add English equivalents. It was kind of fun to say 'DEBUT' for _begin_.

------
pmontra
If my memory doesn't fail me, Microsoft Office Basic in the 90s had localized
keywords. Word, Access, Excel had English keywords in the US edition and
Italian keywords in the Italian one. Probably the same in all the other
countries. I don't remember if programs were portable across languages. VBA is
100% English now and has been for a long time.

------
ejanus
This is the first language and open source project I have contributed to. I
would never have believed that one day my name would land on a page of a
programming language site. Thanks to open source, thanks to Gabor, and to all
that contributed to making me reach this level.

------
sivizius
I first wondered why there is no English version of this language,
because there was no “en”… apparently Citrine thinks “us” is a language…

------
dang
If curious see also

2016
[https://news.ycombinator.com/item?id=10965259](https://news.ycombinator.com/item?id=10965259)

------
sivizius
I just looked on the translations to my native language and it is hilarious.
Need more popcorn.

------
steviedotboston
Weird how a language that exists to be more welcoming and accessible uses
needlessly gendered language on its website.

"Designed to allow every man to write code in his mother tongue" could easily
be written as "designed to allow every person to write code in the language they
are most comfortable with."

~~~
PixyMisa
It did say that, before they ran the site through Citrine.

~~~
gabordemooij
that's actually quite funny ;-)

------
scoopertrooper
> designed to allow every man to write code in his mother tongue. Hopefully,
> by doing so, Citrine will make coding accessible to a wider audience.

One day that audience might extend to women.

~~~
charles_f
My thoughts exactly! But it looks like OP is Dutch, and Google Translate tells
me that in Dutch "mens" can mean "person", so hopefully it is just an ESL
thing.

~~~
herbstein
Which is just a perfect example of why anything "universal" is quite hard in
practice when these misunderstandings happen even between English and Dutch.

~~~
zokier
Not really, it is more like a counter-example; if the author had been
writing in his native language this might not have been an issue. So it supports
the claim that allowing people to work in their native languages would
reduce mistakes.

Of course this also implies that the translations should be done by
professional translators, instead of the authors themselves.

~~~
dTal
Which I'm sure the authors would agree with:

> The Citrine community is working hard to provide translation files. We use
> machine translations if we can't find a translator yet. We appreciate any
> help to improve language support!

------
kanox
Sorry to be dismissive, but it's almost always essential to allow international
contributors, and English is by far the most common international language.

