Show HN: Bato – A Filipino Programming Language (github.com)
159 points by jjuliano on Jan 12, 2018 | 123 comments

Awesome stuff - I think these kinds of languages are more important than most people (especially those with a strong English school background) assume. I could see this as an educational language for kids.

Related to this: in the 80s/90s I saw a German BASIC variant floating around. Unfortunately I couldn't find it online.

But… if you are into alternative-language programming languages, I highly recommend reading up on Plankalkül[1] - written by Zuse (after whom SuseLinux is named) - which inspired ALGOL.

Also worth knowing about is obviously GERMAN[2], a programming language much like brainfuck that is reduced to only the most important German words.

Example code

More links: https://en.wikipedia.org/wiki/Non-English-based_programming_...

[1]: https://en.wikipedia.org/wiki/Plankalk%C3%BCl

[2]: https://esolangs.org/wiki/German

Word Basic (this was around '96, '97) used to translate the tokens, and I can not express how much I hated it, probably because I spent far too much time dealing with a codebase that had been written in Danish Word Basic and had been transferred to Norwegian Word Basic by someone who seems to have cut and pasted the code, and so ensured that the "automagical" translation of the tokenized keywords didn't happen.

This was pretty much the worst combination of languages, because Norwegian and Danish are similar enough that the code looked completely sane - it read pretty much as valid Norwegian in most instances, but the keywords and function names deviated in all kinds of subtle ways.

While I understand why some would like to code in their native language, the big persistent problem is that it is not clear that it is a net improvement in communication - so much of our communication about code is international, and as not nearly all of our tools are context sensitive enough to understand when we are talking about code that would need to be translated in a very specific, and (programming-)language specific way, we're creating some very tough challenges.

> I could see this as educational language for kids.

In Greece we had (and maybe still have, but it has been almost two decades since I was there) programming lessons in high school. For the most part they were laughable (and TBH almost everyone ignored them) but what I personally found interesting was that someone went and designed a Pascal+BASIC hybrid where everything is in Greek. There wasn't any official implementation since this was supposed to be theory only (one of the laughable parts - how are you going to learn programming without actually doing it on a computer?), but some people decided to make their own implementations. I was one of them, but mine never went beyond a simple command-line interpreter which I gave to some of my classmates.

However there was a professor who went the extra mile and built an entire interactive IDE with interpreter, debugger and a full reference (the name of the language is simply "LANGUAGE" and the Greek word for that is the same as for tongue so the icon is a tongue :-P). Apparently he even got some certification from the government about it as being usable for actual learning in schools. You can see screenshots from it at [1]. It is in Greek of course, but it has a fairly simple UI (it actually looks a bit like how QBasic would look if it was a Windows program around the late 90s/early 2000s :-P). The main window is an editor, at the bottom there is an "Execution" panel where you see the program output and interact with it, a watches panel for the debugger and at the right you can see a panel with all available commands, a panel with the variables (when in the debugger) that you can inspect and edit and a panel with an input file (IIRC the language doesn't have I/O support beyond some predefined files - it wasn't needed for the educational purpose after all).

There was also a Greek compiler for another Pascal-like language during the late 80s or early 90s (this was unrelated to the "LANGUAGE" I mention above which AFAIK was designed in the very late 90s). It was made for DOS and it came with an IDE that looked a lot like Turbo Pascal 5.x (but with a monochrome UI). This one generated actual standalone executables. I found it on a coverdisk much later but I don't know what happened to that and I can't find it today (thanks to the "LANGUAGE" having such a generic name all searches in Google give me that instead of what I want :-P).

[1] http://alkisg.mysch.gr/screenshots/

I built http://www.pseudoglossa.gr/ (you need Flash), a web interpreter, IDE and debugger with cloud support, around 10 years ago when such software was a rare commodity.

It contributed a lot to getting the students and the teachers in the lab to get some hands-on experience with programming.

It also spawned the creation of other applications that embedded it and provided classroom management and automatic correction.

Although I am not a teacher anymore I keep it online and a lot of people still use it regularly. Even if you don't know English you can start programming in your native language at a young age.

Never got support from everyone though but it was fun and challenging to make it.

Everything from the editing component to the interpreter and the execution environment had to be made from scratch.

The code is on GitHub: https://github.com/sstergou/pseudoglossa.gr

It strikes me that in a language like Common Lisp, where the identifiers are not strings but symbols with a name, you could give the symbols a map from language->name and have the input language be nearly irrelevant: as long as it was standard practice to distribute source as serialized symbols rather than as text files.

Or, maybe, you could add another segment analogous to the package segment to each read symbol, e.g.:

    (defpackage :foo
      (:use :cl)
      (:source-language :en))
    (in-package :foo)

    (defun show (:translations :de schau) ()

EDIT (forgot): and then, if you needed to access the symbol directly by some particular language's representation, do something like foo:en:show
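A rough Python sketch of the first idea, where a symbol is an object carrying a map from language to name instead of being a single string. The `Symbol` and `Package` classes and their methods are made up for illustration; nothing here reflects how Common Lisp packages actually work.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Symbol:
    """A symbol whose identity is the object itself, not any one spelling."""
    names: dict = field(default_factory=dict)  # language code -> spelling

    def render(self, lang, fallback="en"):
        # Show the symbol in the reader's language, or fall back to another one.
        return self.names.get(lang, self.names[fallback])


class Package:
    """Hypothetical symbol table that can be searched in any language."""

    def __init__(self):
        self._by_name = {}  # (language code, spelling) -> Symbol

    def intern(self, **names):
        sym = Symbol(dict(names))
        for lang, spelling in names.items():
            self._by_name[(lang, spelling)] = sym
        return sym

    def find(self, lang, spelling):
        return self._by_name[(lang, spelling)]


foo = Package()
show = foo.intern(en="show", de="schau")

# Both spellings resolve to the very same symbol object.
print(foo.find("en", "show") is foo.find("de", "schau"))
print(show.render("de"))
```

The lookup key `(lang, spelling)` plays roughly the role of the `foo:en:show` form from the EDIT above.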

Neat! Now I just need to develop an Ancient Greek dialect of this ΓΛΩΣΣΑ, and then maybe Italian Liceo Classico high schools will finally have some coding classes...

Someone made a pseudo-parser-thing of Perl in Latin: http://search.cpan.org/~dconway/Lingua-Romana-Perligata-0.50...

Well, starting around grade 3 kids in the Philippines are taught primarily in English, so that's probably not the biggest hurdle there.

Thanks for the German link, made my morning :)

> written by Zuse (after whom SuseLinux is named)

I was under the impression that SuSE (now SUSE) stood for "Software und System Entwicklung" (Software and System Development, translated from German). Are we referring to the same Linux distribution?

Interesting - I tried to look up whether I am correct and couldn't find any result.

I always assumed it was named as a nod to Zuse, the famous German computer pioneer.

But you might be right - it might just be a coincidence.

Perhaps it should have been "BIER" instead of "BEER".

It's a considerate effort to make the language more readable to foreigners

One thing I'm completely ignorant of is how people from countries with non-Latin alphabets write code. E.g. if you're Chinese, Arabic or even Russian, do you have to do everything in English and with English characters?

I'd hope there is a Unicode mapping into other characters - is there any language that supports this? Even for libraries would be nice.

Russian reporting in. I think everyone here who writes code knows the Latin alphabet. It's just a basic requirement - it's a given that if you want to code you'd better learn English, let alone the alphabet.

In schools, kids learn to code using an Algol-like language with identifiers in Cyrillic (essentially, Russian) abbreviations, e.g. "нц для i от 1 до n" rather than "for i := 1 until n do", but even here you see the Latin letter "i" ;) However, it's not uncommon to just go with e.g. BASIC or Pascal.

There is also de-facto standard accounting & ERP software from 1C which uses a Visual Basic-like language with Russian identifiers like "Константа" for "Const", etc. It's widespread in its domain, but not used anywhere else. 1C programmers are sometimes (half-jokingly) considered to be of a different caste. Sort of like COBOL programmers, maybe, or - probably closer - SAP developers.

I don't think anyone has any serious issues with basic English language for identifiers. Literature, documentation, discussion - here, language barriers do matter (a lot!), but not for programming languages themselves. That is, unless someone wants to show off as a true hardcore Russophile, of course.

Code comments, identifier names and e.g. log messages are a different matter. I don't think there is any common practice for this - more like a matter of personal preference. Some try to stick with English, some use Russian or transliterated Russian (mixing it with some English) liberally so it's not uncommon to see something like `log.Error("Дом не найден %s, %s: %s", ulica, nomerDoma, responseText) /* TODO: Добавить поиск лучшего совпадения */` in their code.

> Code comments, identifier names and e.g. log messages are a different matter.

There was (is still?) a project to translate all the German comments in Open/Libre Office into English, a legacy of the German company that originally wrote it.

> some use Russian or transliterated Russian (mixing it with some English) liberally so it's not uncommon to see something like `log.Error("Дом не найден %s, %s: %s", ulica, nomerDoma, responseText) /* TODO: Добавить поиск лучшего совпадения */` in their code.

This gives me nightmares of dealing with an old patched Joomla install, where the developers had managed to bring in plugins etc. that used multiple inconsistent encodings in comments, so you couldn't get all of the code to render correctly with any single terminal setting.

> old patched Joomla

Been there, saw that. The quoted part alone is way more than enough to give nightmares. I bet encoding issues are just the icing on the cake. ;)

I'd like to add that the Latin alphabet is definitely studied in Russian schools early on, both for mathematics and for the required study of a foreign language (invariably one using the Latin alphabet).

The number of keywords in a modern programming language is not very large, and most are short, so even a beginner with no prior knowledge of English would not have a very hard time using something like Python (or Java, or Pascal, or whatever they use at their school).

In Japanese companies that I've worked at, the convention is using English words for programming concepts, using Japanese words transliterated into English for business concepts except where the English word is easy and obvious, and using a pedantic level of Japanese code comments to signpost for those developers who are less confident in their English ability.


    // 教科リストを取得する。
    String[] kyoukaList = getKyoukaList();

(Japanese university backend systems have two concepts of "a subject", with kyouka being a subject like math is a subject and kamoku being a subject like Algebra II is a subject.)

At least here in Brazil it's relatively common to mix English defaults and Portuguese variable names. So a Django class declaration would look something like:

    class Pessoa(models.Model):
        idade = models.IntegerField()
        nome = models.CharField()

Then some things wind up having mixed-language variable names:

    class PessoaForm(forms.ModelForm):

Portuguese is rooted in Latin so it uses the Latin alphabet; the parent was asking about other alphabets.

Of course there are the accented characters but those are just transliterated to ASCII. Even when the language accepts Unicode identifiers (like Python or JavaScript) they are not transliterated to ASCII so "função" is not the same as "funcao". Everybody just sticks with ASCII for identifiers (occasionally people throw the Greek letter for pi or lambda in a formula but not on my watch).
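To make that concrete, here is a small sketch assuming Python 3, which accepts accented letters in identifiers. The accented and unaccented spellings are two separate variables. The `to_ascii` helper is a hypothetical name showing one way to do the accent-dropping that people otherwise do by hand.

```python
import unicodedata

# Two different variables: the interpreter does not fold accents away.
função = "accented name"
funcao = "ASCII name"
print(função, "|", funcao)


def to_ascii(name):
    """Drop combining accents, the way identifiers are usually hand-transliterated."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(to_ascii("função"))  # funcao
```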

The only thing more awful than these "Portunglish" codebases are Brazilian projects trying to stick with English names only. These are often full of misleading translations - for example using "Budget" instead of "Quote" because in Brazilian Portuguese the word "Orçamento" means both. Very confusing.

I am from a Slavic country and writing code in the native language looks amateurish (that's the general thinking).

Same in Spanish. At my company we use only English for code, even comments. And we are trying to push for Jira tickets to be written in English only as well. At least among the developers. Can't keep managers from using Spanish.

It's not a matter of alphabet because people still learn programming languages from English books.

Pretty much nails it: https://temochka.com/blog/posts/2017/06/28/the-language-of-p...

Thanks for the link

قلب is a Scheme-like programming language that is written entirely in Arabic.



Wow! Right-to-left languages don't just have to alter the glyphs, they also face the questions of different indentation, different argument order (does `a->b` still express the same meaning?), etc.

Can't recall exactly where I saw it, but one of the major language designers (I think for Perl) was talking about how there was a plan to abstract some things like quotes such that a character used to quote a string in any spoken language would be a valid way to declare a string. This didn't seem to go so far as to also allow this for keywords, but for symbols the thought was that you could simplify things a little bit for others by not requiring only symbols common to an English keyboard.

In Perl 6 you can quote strings in quite a few ways!


"Quoting" is all defined inside of a sub-language called "Q"

Ruby has pretty much the same flexibility. "%" followed by almost any non-alphabetic character starts a quoted string, and that character then becomes the quote character. If the quote character is one of <, (, [ or {, then the end-quote character is the corresponding >, ), ] or }. These are all equivalent:





Unfortunately the rules are loose enough that even space is a valid quote character. So this too is equivalent to the above:

% x

("% x ")

No it doesn't have the same flexibility.

Perl 6's string quoting sub language has various feature flags that you can turn on or off.

Here are a few of the basic ones shown in valid code (comments and newlines included)

  # start quoting with no features enabled
  # (not even backslashing the delimiter is enabled)

    :scalar    # enable $foo
    :array     # enable @foo[]
    :hash      # enable %foo{}
    :closure   # enable { 12/3 }
    :function  # enable &foo('bar')
    :single    # turns on backslashing the delimiter
    :backslash # enable \n
    :!exec     # turn off executing it (redundant)
    :exec(0)   # another syntax for turning off a feature
  {…}          # various delimiters are allowed (generally punctuation)

There are shorter variants of each of the feature flags. :c => :closure

I would like to point out that the parsing of the :closure and :function part of the sub language is reentrant. (Actually most of them add some form of reentrancy, it is just harder to show it for the others)

  "a b &foo( "c d &bar( "baz" )" )"

Note that this reuses the base Perl 6 parser, and that is why it is reentrant.

There are shortcuts for regularly used forms

  「」 =:= Q[]
  '' =:=  q[] =:= Q:q[]  =:= Q:single[]
  "" =:= qq[] =:= Q:qq[] =:= Q:double[]
     =:= Q:b:s:a:h:c:f[]
     =:= Q:backslash:scalar:array:hash:closure:function[]

The ability to turn off features can be useful

  qq :!c [a {\n  b\n} c] # { and } don't form a closure here

I would like to point out that the :foo syntax is used everywhere in the language for named parameters to routines and operators (most operators are implemented as subroutines)

  :foo =:= :foo( True ) =:= foo => True
  :!bar =:= :bar( False ) =:= bar => False
  :$baz =:= :baz( $baz ) =:= baz => $baz

If the delimiters are paired () <> {} 「」 «» “” ‘’ you can double up on them.

  q<FOO> =:= q<<FOO>> =:= q<<<<<FOO>>>>>

This is useful to avoid having to backslash a delimiter within the string, or trying to find a delimiter that isn't in the string.

  q<< <a> >> =:= q' <a> '

One more equivalent expression: ?x

A question mark followed by any other single character is treated as a string literal containing that character.

It probably varies based on country, team, and nature of the project. For Chinese OSS projects on Github, I notice that functions/vars/etc. are in English, but comments are in Chinese.

I'm the only English person working in a German team at the moment, and their codebase takes letters like "ö" and makes it "oe". I think there are a few like that.

Personally I prefer it when they use English obviously (I don't speak German), but it also means you don't get ridiculously long identifiers.
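For anyone curious, the "ö" to "oe" convention is easy to mechanise. A minimal sketch in Python follows; the table name `GERMAN_ASCII`, the helper `ascii_identifier` and the sample word are my own choices for illustration, not anything taken from that team's codebase.

```python
# Conventional ASCII spellings for German umlauts and sharp s.
GERMAN_ASCII = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
})


def ascii_identifier(name):
    """Rewrite a German word so it only uses ASCII letters."""
    return name.translate(GERMAN_ASCII)


print(ascii_identifier("größe"))  # groesse
```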

don't you as a programmer feel familiar with longidentifiers. Would you prefer in dent a-gogo ers? Yes, my latin is lacking.

I looked for some Chinese developer on GitHub. This is the most Chinese source file I found https://github.com/hzxie/city-picker/blob/master/city-picker...

Variables in English, strings in Chinese when they have to (it manages a list of city names). I could work on that code and I don't know Chinese.

This is more or less what I see in code here in Italy. Sometimes somebody slips a variable name in Italian but among pro developers it's limited at cases when there is really no obvious English counterpart. Example: legal terms, they tend to be very specific and you'd need a lawyer to translate them correctly. Accounting too.

Chinese is hard to transliterate to English, because there's no nice way to express the tone, and even with tones, there are a number of homophones. To express a precise meaning, you either use hanzi, or proper English words.

Other languages, like Russian or Japanese, can be transliterated pretty unequivocally, so transliterated identifiers are often found in source code, especially in business-related identifiers where a precise translation may not even exist.

Maybe using single characters (e.g. Chinese characters) instead of words for keywords is more efficient to type and read for programmers. Not sure how typing compares but it could give code more structure.

Does anyone know how governments in China, Japan or Korea handle this? Do they try to use non-English programming languages?

Single Chinese characters are inputted with more than one key press on the keyboard, so it won't be much different.

You can test this by changing your input method to Japanese or Chinese. At least in Japanese, the sound of the character is typed out using English letters (romaji) and various options are presented for autocompletion.

You are right that reading characters is probably faster, though.

> At least in Japanese, the sound of the character is typed out using English letters

Technically speaking, Japanese has a kana-based keyboard layout, where a single mora is a single keypress rather than two. However, there is still an extra (second) keystroke for dakuten/handakuten (か -> が), which is not required with romaji input (where it's just "ka" vs "ga"), so it's not a 100% speed increase. So, typing is somewhat more efficient than with romaji, but I think I've heard no one uses that, except for, possibly, professional typists.

Just as an anecdote, I know of at least one person in my company who uses it, without being anywhere near a professional typist. I suspect older generations may be much more likely to use it.

In any case, even though mostly everyone prefers the IME on their computers, I haven’t seen any Japanese people who prefer romaji input on their smartphones.

Pinyin input for Chinese is similarly phonetic. Supposedly there is a method popular in Taiwan based on strokes.

Reading characters is a bit faster, but only because you had to memorize 10,000 of them already. People reading English will similarly chunk many of the words they are familiar with so that they are not so much read as they are recognized.

There is also a stroke system (Wubi) that's more popular and faster in China too. Maybe it's a related system to the one in Taiwan. Pinyin makes sense only if you're already familiar with the Latin alphabet and how its letters sound. Those are literally foreign concepts to the Chinese.

Pinyin IMEs are still much more popular than Wubi IMEs because of the latter's steep learning curve. Pinyin was designed by Chinese for Chinese learners, and is only incidentally used by westerners. It is based on the Latin alphabet yes, but it's how all the school kids in China learn standard pronunciation of Mandarin words (indeed, this is pinyin's main purpose). So pinyin isn't really a foreign concept to any Chinese who has gone to school since the 60s.

There's also a Taiwanese system called bopomofo or zhuyin, which is basically a Chinese version of kana.

The result seems like it would be very concise code. Identifiers could be very descriptive and still only be a few characters.

Though, I wonder if the varying size of English words makes it easier to distinguish patterns just from the shape of the code.

In my experience, no. Comments can be Chinese for ease of reading, but Chinese developers work with the usual programming languages out there: C, C++, C#, .NET, Java/Scala, Python, JavaScript. Honestly there is no reason to use a Chinese programming language (there are such). If you work for an international company, then English is the preferred language for comments.

Back in the 2000s, text editors often did not default to UTF-8, so if you are from Taiwan for example, your file might be in Big5 encoding, and would require a conversion if opened by an editor expecting UTF-8. Now this is rare.
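A quick way to see the mismatch, sketched in Python with its built-in `big5` codec. The sample text is just an arbitrary two-character string chosen for the demonstration.

```python
text = "中文"
big5_bytes = text.encode("big5")  # what an old Big5 editor would have saved

# Reading those bytes as UTF-8 fails outright instead of showing the text.
try:
    big5_bytes.decode("utf-8")
    readable_as_utf8 = True
except UnicodeDecodeError:
    readable_as_utf8 = False

print(readable_as_utf8)

# Decoding with the encoding the file was written in recovers the text.
print(big5_bytes.decode("big5"))
```

Not every wrong guess fails this loudly; some byte sequences decode without error into the wrong characters, which is how mojibake ends up in comments.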

Worked at a Chinese company in Shanghai before. Everyone on my team could read/write but not speak English. It is most common that people would write code in English but use Chinese for comments and documentation (both internal and external). We all used Sogou which makes it easy to make a hotkey to easily change between Chinese pinyin input and English.

I am led to believe this usage of English for code and Chinese for comments is quite common given the code I've read from public repositories and other companies we worked with.

At my Korean company, all code is written in English and comments are in Korean/English.

On osx, capslock is used to quickly switch between languages.

Capslock with custom bindings?

The default way to switch is control + space.

It's not a custom binding. It's a default option offered by osx https://i.imgur.com/uJo5XUj.png

Strange that I've never seen that before. Too bad it only toggles between two languages (I have 3 set up)

Yes but it's not comfortable for writing in multiple languages. I use Capslock both on Windows (Punto Switcher) and Mac (Seil / Karabiner Elements).

I'd like to know how you do that with capslock, too :D

It's a default option offered by osx https://i.imgur.com/uJo5XUj.png

Keyboards switch between the two easily. I've worked with a couple Russian guys and sometimes they'd just write Cyrillic gibberish in Slack because they forgot to switch keyboard :) One of them is a really big contributor in the Ruby open source scene, he has no problems switching back and forth.

Another natural language programming language (in Finnish): https://github.com/fergusq/tampio

It's technically somewhat amusing as it runs a morphological parser in order to deal with the Finnish word inflection system, which is then utilized as an integral part of the language grammar. As a consequence, since this allows relaxing constraints e.g. on word order, the resulting code does actually read like natural language.

Finn here: it reads like language that's tediously formal to the point of being awkward, kind of like an old translation of the Bible ("And verily X did beget Y").

"Natural" spoken Finnish abbreviates a lot and dispenses with many conjugations. Maybe I should fork this and build a Helsinki slang version...

It's really interesting to see non-English languages being used as programming languages.

I recall Aheui (아희) programming language that was posted on HN a while back. It's truly exotic.


> 밤밣따빠밣밟따뿌 빠맣파빨받밤뚜뭏 돋밬탕빠맣붏두붇 볻뫃박발뚷투뭏붖 뫃도뫃희멓뭏뭏붘 뫃봌토범더벌뿌뚜 뽑뽀멓멓더벓뻐뚠 뽀덩벐멓뻐덕더벅

will print hello world according to the example. Reading it phonetically in Korean makes no sense at all.

If I knew how to design programming languages I would do it like this:

> ㅍㄾㄴHello Worldㄱ

This is the equivalent of the acronym for 프린트 (ㅍㄹㅌ) or print (the English word phonetically typed in Korean), using the "N" and "G" letters of the Korean alphabet to depict square brackets.

> prt(Hello World)

인쇄(印刷) would be the more formal word, in which case the acronym simplifies further.

> ㅇㅅㄴHello Worldㄱ

I feel like using 니은 and 기윽 for your quotations limits you in terms of keywords quite a bit. AFAIK, in Korean, they use <텍스트> for their quotations.

ㅋㅋㅋ perhaps in the capitalist counterparts. I'm envisioning a new Korean alphabet only programming language, staying true to the Juche philosophy of self-reliance!

I would replace all operators with Korean characters:

* is ㅆ

|| is ㅣㅣ

&& is ㅃ

+ is ㅏ

= is ㄹ

- is ㅡ

== is ㅌ

> is ㅋ

< is ㄷ

I've had ideas about abstracting the syntax of a programming language in such a way that there would be multiple 'views' of the code.

What if you could run a program that would translate Bato programs into Ruby and vice versa? The variables would be a hard part, so maybe it would suggest a translation for them or you could add a hint yourself.
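A toy version of such a two-way translator, sketched in Python. The keyword pairs in `TO_RUBY` are placeholders picked for illustration and are not taken from Bato's source; `translate` and `TOKEN` are likewise hypothetical names. Only bare words are swapped, and double-quoted strings are passed through untouched.

```python
import re

# Illustrative table only: these pairs are NOT Bato's real keyword list.
TO_RUBY = {"kung": "if", "habang": "while", "wakas": "end"}
TO_LOCAL = {ruby: local for local, ruby in TO_RUBY.items()}

# Match either a double-quoted string (kept verbatim) or a bare word.
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|\w+')


def translate(source, table):
    """Swap every bare word found in `table`; leave everything else alone."""
    def swap(match):
        token = match.group(0)
        return table.get(token, token)
    return TOKEN.sub(swap, source)


local_code = 'kung x > 1\n  puts "kung"\nwakas'
ruby_code = translate(local_code, TO_RUBY)
print(ruby_code)

# Going back restores the original, as long as no identifier collides
# with a keyword in the other language.
print(translate(ruby_code, TO_LOCAL) == local_code)
```

The collision case is exactly the "variables would be a hard part" problem: a variable that happens to be spelled like a keyword of the target language needs renaming or a hint.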

Or what if you had a language that had a JavaScript-like syntax and a Lisp-like syntax? You would run a program that would translate it from one syntax to the other. You could add or subtract syntactic sugar.

I read about isomorf [1] yesterday... it might be kinda like what you're looking for?

Possibly relevant blog post from Brian McKenna, too: https://brianmckenna.org/blog/polymorphic_programming

[1] https://isomorf.io/

This sort of translation is definitely part of our syntactic sugar mission. If anyone wants to beta test particular languages for us just reach out.

My experiment with a multi-view HTML editor which combines a few textual formats, a (horrible) GUI and a preview for an interactive editing experience: https://www.youtube.com/watch?v=CNryKyBPfws

Some thoughts about that by Steve: http://futureofcoding.org/journal.html#niko-autios-microedit...

This idea has been around for ages. Check out this old video on intentional programming by Microsoft: https://www.youtube.com/watch?v=tSnnfUj1XCQ

A few months ago, I was daydreaming about creating a new programming language and started thinking about language internationalization. I found this page, with some interesting comments: https://blogs.msdn.microsoft.com/alfredth/2011/07/21/why-are...

As an aside, this one stuck out to me: "One of the difficulties in allowing the keyword to be switched on the fly to a different language is that keywords have to be 'reserved' such that they cannot be used as identifiers"...

I find the concept of reserved keywords kind of strange. It seems like a "smarter" language/parser should be able to handle a keyword being used as an identifier.

It is always good to not have possible ambiguity in a language though.

A classic example is a C++ line such as:

    a ** b

Before you know if a and b are variable names or type names, you do not know whether this line is a variable declaration or a multiplication or a syntax error.

My point is that some keywords could be used as identifiers in some situations, but this might not always be the case.

One option would be to not have any keywords, just operators. Or maybe a special keyword decoration.

Programming languages already use contextual parsers. C# and Swift both allow you to re-use some keywords as identifiers, function names, etc and use context to disambiguate.

Both inherited the restricted list from C so for legacy reasons they prohibit those keywords from being used as identifiers, but newly introduced keywords are always contextual due to compatibility concerns.

> Or maybe a special keyword decoration.

What you're asking for is called stropping [0], and it's actually a very old concept going back to the early days of Algol.

Algol didn't have any concept of reserved words. Instead, keywords were typographically distinguished from identifiers. In publications, keywords were typically rendered in bold and/or underline, and identifiers were not. Representing this in a computer encoding proved problematic, so people came up with a number of ways to distinguish keywords from identifiers. The most common way was to enclose keywords in 'apostrophes', and from there 'strop' originated as an abbreviation for 'apostrophe', giving rise to the term 'stropping'.

But despite the etymology, stropping doesn't have to use apostrophes. When Algol 68 came out, the spec defined three acceptable styles, called 'stropping regimes'. 'Quote' stropping was the traditional form using 'apostrophes', .point stropping would prefix .keywords with a .decimal .point, UPPER stropping would render keywords in ALL CAPS (which forces identifiers to be all-lowercase, but that's ok because Algol 68 offers something much better than CamelCase). There was actually a fourth stropping regime, res stropping, which is different because it would treat keywords as reserved words and thus restrict how you can use identifiers.

a68g, the only modern-day Algol 68 interpreter I know of, can be configured to use either UPPER or res stropping. UPPER stropping is preferred, as res stropping changes the way the language works (for the worse, IMO).

The really cool thing about stropping is that it enables you to put spaces in your identifiers, since something other than whitespace is used to separate your identifier names from keywords [2] [3].

[0] https://en.wikipedia.org/wiki/Stropping_(syntax)

[1] Example from an HTML-ized copy of the spec: http://www.masswerk.at/algol60/report.htm

[2] Here's an example from Algol 68 of a function called days in month (yes, with spaces), taken from Wikipedia. I've rendered the keywords using UPPER stropping for the benefit of anyone wanting to try it in a68g:

    PROC days in month = (INT year, month)INT:
      CASE month IN
        31,
        IF year MOD 4 EQ 0 AND year MOD 100 NE 0  OR  year MOD 400 EQ 0 THEN 29 ELSE 28 FI,
        31, 30, 31, 30, 31, 31, 30, 31, 30, 31
      ESAC;

[3] Another Wikipedia example of a variable declaration, showing how stropping enables both spaces in identifiers and identifiers that share words with keywords:

    INT a real int = 3;
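To show why UPPER stropping makes this unambiguous, here is a rough tokenizer sketch in Python. It is a simplification written for illustration, not how a68g works: it treats any run of capitals as a keyword and any run of lowercase words separated by spaces as one identifier. The names `TOKEN` and `tokenize` are my own.

```python
import re

# Order matters: stropped keyword, then identifier (lowercase words that may
# be separated by spaces), then number, then any single punctuation symbol.
TOKEN = re.compile(r"""
    (?P<keyword>[A-Z]+)
  | (?P<ident>[a-z][a-z0-9]*(?:\ +[a-z][a-z0-9]*)*)
  | (?P<number>\d+)
  | (?P<symbol>[^\sA-Za-z0-9])
""", re.VERBOSE)


def tokenize(source):
    """Return (kind, text) pairs; whitespace between tokens is skipped."""
    return [(m.lastgroup, m.group(0)) for m in TOKEN.finditer(source)]


for token in tokenize("INT a real int = 3;"):
    print(token)
```

Because case alone decides what is a keyword, `a real int` comes out as a single identifier even though it contains the words "real" and "int".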

It feels good that somebody thought about the trouble that spaces in identifiers can bring.

I am currently using the robot framework language which combines the worst of all worlds. It supports spaces in identifiers, but it does so by using “more than one white space” as a separator for everything. No parentheses, no commas, and of course identifiers are case insensitive.

>>The really cool thing about stropping is that it enables you to put spaces in your identifiers.

I think that this is a recipe for confusion and misunderstanding, both for humans and for developer tools. I do not understand how it can be "cool".

A "smarter" language/parser has implications, though. It means your text editors & IDEs need to be more elaborate to handle it. It means new users have more to take in before they can be productive. It's more complexity and thus more effort to maintain. All in all, making a smarter language/parser means making a more expensive language/parser, and that means you're making a tradeoff that isn't always justifiable.

There seem to be a few ways around this issue:

1. You could create a programming language where every keyword is a symbol: http://ccsenet.org/journal/index.php/cis/article/view/23904. This is what mathematics does.

2. You could also create a programming language where every reserved keyword is a special type and then expose that type to the programmer, thereby allowing them to reuse that word with a different type than the reserved keyword. This would come with a lot of complexity costs, however, because now you don't know if your generated AST is valid until the typechecker runs.

BASIC on the ZX80 and its variants encoded keywords as bytes in memory but displayed them as English-like words; there's no reason those keywords couldn't be displayed in another language, regardless of their internal representation.
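
A minimal Ruby sketch of that idea, assuming a made-up token table: the byte values and the German translations below are invented for illustration and do not match the real ZX80 character set. The stored program is the same byte sequence either way; only the listing changes.

```ruby
# Hypothetical token table: byte values and translations are made up for
# illustration and do not match the real ZX80 character set.
KEYWORDS = {
  0xE0 => { en: "PRINT", de: "DRUCKE" },
  0xE1 => { en: "GOTO",  de: "GEHEZU" },
  0xE2 => { en: "IF",    de: "WENN" },
  0xE3 => { en: "THEN",  de: "DANN" }
}.freeze

# Render a stored program line: keyword bytes are looked up in the chosen
# language; every other byte is shown as its plain character.
def list_line(bytes, lang)
  text = bytes.map { |b| KEYWORDS.key?(b) ? " #{KEYWORDS[b][lang]} " : b.chr }.join
  text.squeeze(" ").strip
end

line = [0xE2, *"X=1".bytes, 0xE3, 0xE0, *"X".bytes]
puts list_line(line, :en)
puts list_line(line, :de)
```

Because the translation happens only at display time, a program saved by one user would list in another user's language without any conversion step.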

An advantage of reserved keywords is that users cannot write code like this PL/I snippet.


Clojure seems to be very conducive to this by importing a library of words in another language:


Elango Cheran gave a talk about this: https://www.youtube.com/watch?v=MqjMZNwnYCY

TIL That Ruby lets you modify keyword parsing at runtime: https://github.com/jjuliano/bato/blob/master/lib/bato/ruby_p...

This reminds me of http://nas.sr/%D9%82%D9%84%D8%A8/ (Lisp in Arabic(?))

Yes, that's Arabic, and it does indeed look like an Arabic Lisp!

Is it just Ruby with Filipino words?

Tagalog speaker here. Yes, it is.


Bato -> stone.

Ang 'Bato Programming Language' ay isang scripting language sa wikang Filipino. -> The Bato programming language is a scripting language in the Filipino tongue (T/N: or more specifically, Tagalog).


More interestingly in the following section: Bakit Bato? -> Why call it Bato?

Ang 'bato' ay hango sa Ruby Programming Language na may Filipino sintaks. Ang kadahilanang ginamit ang pangalang 'bato' ay dahil ang Ruby ay isang uri ng bato. -> Bato is made with Ruby using Filipino syntax. The reason to use the word "Bato" as a name is because Ruby is a kind of stone.

Even if it is, this might be the first time I've seen a programming language that isn't based in English (aside from the more esoteric Brainfuck-esque languages for code golfing).

In my school days in the 1980s we used to write programs in the RAPIRA (Russian: РАПИРА) programming language on the AGAT microcomputer. RAPIRA was an original programming language with Russian keywords. https://en.wikipedia.org/wiki/Rapira

I've seen a couple of Spanish versions of Python out there, just novelties, never seen them move beyond a proof of concept. There's a whole bunch of esoteric programming languages that don't use English, just like Brainfuck. [0]

[0] https://en.m.wikipedia.org/wiki/Esoteric_programming_languag...

Back in the 80's there were a few variants of Basic and Pascal in Portuguese.

In the first BASIC programming book I ever read, there was a program to change all the keywords from English to Tagalog, to help a Filipino student who didn't know English.

Esperanto would be nice

There was a programming language with Esperanto syntax. I have to leave for work but I'll respond later with the name if no one else posts it.

I searched yesterday but wasn't able to find the reference.

In the Philippines here. I'm a native English speaker and my Tagalog is pretty bad, but it looks like it's Ruby with Filipino syntax.

It is just Ruby.

Everything is just alias-ed:

e.g. https://github.com/jjuliano/bato/blob/master/lib/bato/core_e...
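
For anyone who hasn't seen the technique, aliasing in Ruby looks roughly like this. The Tagalog-flavoured names and their glosses are my own rough picks for illustration and are not copied from Bato's source, which may well differ.

```ruby
# Illustrative only: these names are made up for this sketch and are not
# taken from Bato. The glosses in the comments are approximate.
class String
  alias_method :haba, :length        # roughly "length"
  alias_method :baligtarin, :reverse # roughly "to reverse"
end

module Kernel
  alias_method :ipakita, :puts       # roughly "show"
end

ipakita "bato".haba
ipakita "bato".baligtarin
```

The original English names keep working, so this adds a second vocabulary on top of Ruby rather than replacing anything.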

Indeed. The submission's title should then be changed perhaps to "Bato - the Ruby programming language in Filipino/Tagalog," or perhaps "Bato - Ruby made Filipino/Tagalog."

A language with Internationalized keywords would be pretty dope.

That's what they do with formulas in Excel. There it sucks HARD. Try googling for documentation on StackOverflow for example. Anything you want to copy paste doesn't work because Excel is expecting the commands in a different language.

That sounds like there's a demand for a code-help service that is syntactically aware enough to perform translations into the viewer's native language. I'm sure it would be harder to implement than a basic text-only + syntax highlighting site, but imagine how nice it would be to have your user's language preference (or, if not logged in, with a dropdown "pick $human_language for this $programming_language snippet"). You could also optionally integrate Google Translate or similar for the non-programming-language parts (obviously with less accuracy, but still better than nothing).

Searchability would be tricky, since google etc. might detect the very-similar-but-for-code-snippets links as duplicate/spam and bury them, but there are hopefully ways around that--up to and including asking google to somehow improve their algorithm; it seems like they might be sympathetic (because they're programmers and because there's money to be made) to a request to accommodate such search patterns.

Not only would that solve the problem you posed (googling for stuff on SO is tricky for you as a (presumably) English speaker if it's in another language), but it would also address a far wider-reaching problem with big implications: the relative inaccessibility of programming, especially at the early-beginning stages, to people who are not fluent English speakers.
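
A very rough Ruby sketch of what the syntax-aware part could look like for Excel formulas, assuming a German-to-English direction and a tiny sample of function names. The `translate_formula` helper is hypothetical, only handles plain A-Z function names, and ignores things a real service would need, such as decimal separators.

```ruby
# Tiny German -> English map of Excel function names (a small sample only).
DE_TO_EN = {
  "SUMME" => "SUM",
  "WENN" => "IF",
  "WURZEL" => "SQRT",
  "MITTELWERT" => "AVERAGE"
}.freeze

# Translate a German-locale formula to the English form.
# String literals ("...") are copied untouched; elsewhere, names followed
# by "(" are mapped and ";" argument separators become ",".
def translate_formula(formula)
  formula.gsub(/"[^"]*"|[A-Z]+(?=\()|;/) do |m|
    if m.start_with?('"')
      m
    elsif m == ";"
      ","
    else
      DE_TO_EN.fetch(m, m) # unknown names pass through unchanged
    end
  end
end

puts translate_formula('=WENN(A1>0;WURZEL(A1);"n/a; WENN(")')
```

Matching string literals first is the point of the exercise: a plain find-and-replace would also rewrite the text inside the quotes, which is exactly the sort of mistake a text-only site can't avoid.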

Sure, you can solve almost any problem without a budget...

If you google in your native language you'll get results in that language. I agree that it takes a bit to relearn all formulas (and I find it too confusing to use different versions at the same time) but when I was still using a non-English version I had no problems finding solutions.

I think it was exactly the right thing to do. Excel formulas are often used by people with little knowledge of programming who aren't aware that "if" is a common keyword. Using the local variant makes formulas much easier to understand (and gets people closer to actual programming).

I agree it's a good move from Microsoft to make Excel more accessible. It's just not something I was expecting the first time I stumbled upon the problem.

I have a programming background so I'm used to searching everything in English, even though that's not my native language.

I inserted sqrt() in my non-English Excel and it didn't work. I thought I had a completely broken version of the program. Took a while to figure out it was a language problem!

And now you need to maintain hundreds of variants of each SO answer.

It only sucks for those of us who are used to English as the "programming language".

Many Office users only speak their native tongue, are quite happy with the translations and don't even know that StackOverflow exists.

They would rather buy books: https://www.fca.pt/en/catalogue/computer-science/office/page...

I don't know. For me, as a non-native speaker, the words in a programming language are just memorised tokens; they could just as well be symbols or emojis. The bigger deal is libraries and documentation. For instance, Microsoft is attempting to localise all their SDK documentation, and at least for German it's a nightmare: the translations are done either by non-technical people or by algorithms, and sometimes the information is twisted in a way that reads like correct German but doesn't contain any useful information, or is even plain wrong.

> they could just as well be symbols or emojis

I'm sure they will...

You can already use emoji in JavaScript.

There's also http://www.emojicode.org

You just described Excel, and I can assure you it is anything but "pretty dope". It is in fact a monstrous pain in the ass.

What would be the point though? I feel like that would only increase misunderstandings, for example when searching online for documentation or help on a function.

I can't find anything about this on the web, but didn't AppleScript originally do this?

I must say (speaking as someone who came from the Philippines and knows the language by heart) that the effort put into this is quite impressive.

But for someone who is used to coding in the English language and syntax, they may find some of the wordings harder to understand and apply.

There are already way too many languages on this planet being kept alive by all those pathetic governments limiting their citizens to their own (little) country. English should already long be the World's 1st language for every human being IMAO. I'm Dutch and it took me ages of struggling to get to my current level of English and I'm still learning..

So, this feels like a huge step back and a great way for Filipino developers and companies to completely limit and restrict themselves.

It's a fun coding experiment, but my negative response is because you shouldn't want something like this to become popular.

I agree that the world should already have had a lingua franca, but I must say that English is far from the best candidate. The language is huge, and the grammar is full of irregularities. Pronunciation must be learnt case by case (why is "women" pronounced "wee-men"?).

(also recall this cute poem: http://www.ling.upenn.edu/~beatrice/humor/english-lesson.htm... )

> English is far from the best candidate.

It's also the only candidate. English is the most commonly learned foreign language by far[1], and I don't foresee any major economic movement to change that.

English has a momentum no other language has ever had. There is no language that has a larger body of knowledge online than English. Virtually all cross-border hobbyist communities, expert platforms, etc (eg Stack Overflow) are in English. People all over the world use English to communicate, even when nobody in the group has English as their native language. I was a member of a European student network called BEST, which had 0 member universities in English-speaking places and still the entire thing was in (bad) English. The entire EU is running on English informally, and I bet formally too a few decades from now (and not because we like the Irish so much). My startup has some customers in the UAE, in India and in Pakistan - their sites' main (and sometimes only) language is English. The list goes on and on.

Chinese would've had a chance if they weren't so isolationist. Call me when the first Chinese language TV show gets popular outside China and Taiwan. I bet English language TV shows will be popular in China long before then (if they aren't already - I honestly don't know).

Like it or not, English has won.

[1] There's a bunch of sources, I liked this Duolingo post: http://making.duolingo.com/which-countries-study-which-langu...

This is a common complaint, and one the purist in me is sympathetic to. These days, I have a more nuanced perspective on English.

Yes, English is huge and complex. However, the things we are trying to express are themselves huge and complex. A big language gives us many many colors to paint with. I mean different things when I say I am "sad", "down", "blue", "disconsolate", or "woebegone". There is a learning burden on English speakers in loading such a large language into their heads, of course. But, in return, we get a higher-bandwidth communication protocol.

The complaints about pronunciation are also common and valid in many cases. Things like the Great Vowel Shift lead to words that no longer sound like they are written for little useful reason. But in some cases, pronunciation varies between similarly spelled words because their etymology is wildly different.

A word's spelling carries both pronunciation and meaning. When the pronunciation varies, it is often because the underlying morphemes are different even though they are spelled the same. This means the spelling doesn't teach you to pronounce it as well as you like, but once you can pronounce it, the pronunciation helps you know what it means.

And, as a lingua franca for the world, I think English is great. English is riddled with loanwords and is constantly assimilating terms from other languages that can't already be expressed well. In many senses, it's a global language because it is the set union of them.

One of my favorite quotes from James Nicoll:

"The problem with defending the purity of the English language is that English is about as pure as a cribhouse whore. We don't just borrow words; on occasion, English has pursued other languages down alleyways to beat them unconscious and rifle their pockets for new vocabulary."

It's not pronounced wee-men. It's a shorter sound than that, but I guess I'm proving your point!

English is THE candidate. Lingua franca is not decided by merits, but by adoption rate, which makes English the only choice on the table. It is the British empire+US hegemony, since WWII/Cold War that makes English lingua franca as it is today.

> It is the British empire+US hegemony, since WWII/Cold War that makes English lingua franca as it is today.

To be fair, it's also centuries of important contributions from English-speaking countries in science, technology, industry, aviation, literature, etc, etc, as well as, since WWII at least, music, TV, film and other forms of popular entertainment.

By the number of speakers, some form of Chinese would probably win hands down. In any case, Hindi and Spanish (the latter with arguably much more regular grammar and spelling) would still overtake English.

There's more to what makes a great language than regular grammar and spelling but, that aside, English is already gradually being adopted as the world language. The OP is complaining about parochialism slowing the spread. Whether you agree English is best or not, you might still agree that a worse-is-better move to some common language improves things overall.

Meanwhile, Duolingo still doesn't support Tagalog, much to my annoyance–since my wife is half-Filipino but doesn't know it–and we thought it'd be a fun challenge to learn it together.

Alternatively, you could skip the hegemony debate and just abandon natural languages altogether...


At first I thought this meant "kidney"; then, upon reading the GitHub page, it hit me that the meaning is "stone", à la Ruby.

Thanks for sharing.

Guess "boto" was already taken.
