

What Lua scripting means for Wikimedia and open source - kibwen
https://blog.wikimedia.org/2013/03/14/what-lua-scripting-means-wikimedia-open-source/

======
cs702
This is a big deal: Wikipedia is allowing _contributors_ to extend the
functionality of Wikipedia with new code (written in a powerful language),
with minimal or no supervision. It's hard to predict what kinds of
improvements we will see over time. Every Wikipedia article is now,
effectively, _an interactive application_.

As far as I know, something like this has not been tried by anyone else at
this scale before.

Quoting from the original post: _"Anyone can write a chunk of code to be
included in an article that will be seen by millions of people, often without
much review. We are taking our 'anyone can edit' maxim one big step forward."_

~~~
StavrosK
The good thing about Lua is that you can sandbox it entirely and set hard
limits on memory/CPU time, after which the process exits. You can just give an
error message if RAM grows over a limit or if it takes longer than a few secs
to execute. It's pretty safe.
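
To make the "hard limit, then stop" idea concrete, here is a minimal sketch in Python rather than Lua. It assumes a step budget enforced through a trace hook stands in for a CPU limit; the names `run_with_step_budget` and `BudgetExceeded` are hypothetical, and nothing here is taken from how Wikimedia actually runs its sandbox.

```python
import sys


class BudgetExceeded(Exception):
    """Raised when the guest code uses more steps than it was allowed."""


def run_with_step_budget(func, max_steps):
    """Call func(), aborting once more than max_steps line events have fired."""
    steps = 0

    def tracer(frame, event, arg):
        nonlocal steps
        if event == "line":
            steps += 1
            if steps > max_steps:
                # Raising here propagates into the traced code and stops it.
                raise BudgetExceeded(f"exceeded {max_steps} steps")
        return tracer  # keep tracing each new frame line by line

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        return func()
    finally:
        sys.settrace(previous)  # always restore whatever hook was there before


def runaway():
    n = 0
    while True:  # never terminates on its own
        n += 1


def well_behaved():
    return sum(range(10))
```

Calling `run_with_step_budget(runaway, 1000)` raises `BudgetExceeded` instead of hanging, while `well_behaved` returns normally. Lua's `debug.sethook` has a comparable count hook; a real sandbox would also cap memory and restrict which libraries the guest can reach, which this sketch does not attempt.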

~~~
pekk
Is it possible to run a program in 'steps,' between which the state is paused?

~~~
StavrosK
I don't think it's possible with the stock interpreter, but I'm guessing it's
pretty simple to modify the interpreter to do it. Why do you ask?

~~~
gngeal
Why not? Isn't that the whole purpose of coroutines in Lua?

~~~
fab13n
No, quite the contrary: coroutines only yield explicitly, due to a call to
coroutine.yield(). They let you control your scheduling deterministically
(although multitasking is far from being their only use)
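
The distinction is easier to see in code. Below is a rough analogue using a Python generator instead of a Lua coroutine (an assumption made purely for illustration; `summing_task` and `run_steps` are made-up names). The task can be run in steps with its state kept in between, but only at the points where it explicitly yields.

```python
def summing_task(limit):
    """A task that pauses only where it says `yield`."""
    total = 0
    for i in range(1, limit + 1):
        total += i
        yield total  # explicit pause point; i and total are preserved


def run_steps(task, max_steps):
    """Resume the task up to max_steps times and collect what it yields."""
    produced = []
    for _ in range(max_steps):
        try:
            produced.append(next(task))
        except StopIteration:
            break  # the task finished before using all its steps
    return produced
```

A caller can run three steps, do something else, then resume the same task later. Code that never yields cannot be paused this way, which is why preempting arbitrary scripts needs help from the interpreter.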

~~~
gngeal
Note that GP didn't mention the means by which the yielding was to be invoked.

------
sigil
While more efficient templating sounds like a good thing and is clearly a pain
point, I'm not convinced Wikimedia is really thinking this one through.

1\. Every page is now a "real" program. Will separation of code and content
result from this? The article hints that structured data is on the way, but I
have a hard time imagining the average contributor will restrain themselves
given this awesome new power. Instead I foresee bigger and badder messes.

2\. Does this get us any closer to a parseable format for Wikipedia content?
This is the world's knowledge. The fact that it's currently trapped in an ad
hoc, PHP-inspired template "language" (there _is_ no formal grammar for it,
it's whatever some PHP code accepts as input) should be very concerning.

3\. Why Lua? Javascript seemed like the obvious choice. Rendering templates
server-side at Wikipedia's scale sounds incredibly challenging; it boggles the
mind really. Surely most page views are from js-enabled browsers -- why not
offload all that rendering to the client?

~~~
neilk
It would be impossible to move rendering to the client.

Transforming wikitext into HTML is surprisingly IO-bound, since most Wikipedia
pages transclude many other pages for templating and logic. It's very slow
now, and you need to be close to the database for this to be remotely
practical.

Lua gives the community a way to replace certain horrible templates which
perverted transclusion and macro-expansion to do logic. But most Wikipedia
pages will still transclude a lot of other pages.

Also, web browsers are not the only consumer of Wikipedia pages; you can also
download PDFs and so on.

~~~
gngeal
"Transforming wikitext into HTML is surprisingly IO-bound, since most
Wikipedia pages transclude many other pages for templating and logic."

Sounds like a case for in memory caching. Or is it that any random Wikipedia
page requires logic from a random subset of a large pool of other pages?
Somehow I doubt that, with Zipf's law coming to my mind.

Also remember that an HTML5 front-end would be able to cache a substantial
amount of this (a few megs at least) and Lua can be quite effectively compiled
into JS.
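
A quick way to test the Zipf intuition is a toy simulation. The sketch below assumes template requests follow a Zipf distribution with exponent 1 and that the cache is a plain LRU; the pool size, cache size, and request count are arbitrary illustration values, not measurements of Wikipedia.

```python
import random
from collections import OrderedDict


def zipf_weights(n, s=1.0):
    """Unnormalised Zipf weights: the item at rank r gets weight 1 / r**s."""
    return [1.0 / (rank ** s) for rank in range(1, n + 1)]


def lru_hit_rate(requests, capacity):
    """Replay requests through an LRU cache and return the fraction of hits."""
    cache = OrderedDict()
    hits = 0
    for key in requests:
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used
    return hits / len(requests)


def simulate(num_templates=10_000, cache_size=500, num_requests=50_000, seed=0):
    """Return (hit rate under Zipf requests, hit rate under uniform requests)."""
    rng = random.Random(seed)
    ids = range(num_templates)
    zipf_reqs = rng.choices(ids, weights=zipf_weights(num_templates), k=num_requests)
    uniform_reqs = rng.choices(ids, k=num_requests)
    return lru_hit_rate(zipf_reqs, cache_size), lru_hit_rate(uniform_reqs, cache_size)
```

Under these assumptions a cache holding 5% of the pool serves far more than 5% of Zipf-distributed requests, while uniform requests hit at roughly the cache-to-pool ratio. Whether real transclusion traffic is that skewed is exactly the open question.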

~~~
neilk
MediaWiki uses lots of caching on the server, but the question is how hard it
would be to move rendering to the client.

For a sense of how hard this would be, try using the Special:Export page on
Wikipedia.

<http://en.wikipedia.org/wiki/Special:Export>

If you download transcluded templates, the article for Barack Obama is 781K.

My experience of Special:Export is that it has some flaws that cause it to
miss some things it needs to export, so the real total may be much larger.

And that's just the data - one would also have to download a lot of related
code, which might balloon that up to a megabyte or more.

Wikitext is particularly ornery (because it's just based on grinding regular
expressions against each line, it is not easy to describe with regular
grammars) so you'd have to download a very large parser, with various plugins
as each MediaWiki install uses them to warp how Wikitext is processed. This is
assuming some optimistic scenario where MediaWiki's rendering, and all related
plugins, are entirely ported to JavaScript compatible with all desired
browsers.
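
To give a feel for what "grinding regular expressions against each line" looks like, here is a toy sketch in Python. It is not MediaWiki's parser; the three rules and the `render_line` helper are invented for illustration and cover only bold, italic, and plain links.

```python
import re

# Ordered rewrite rules: the bold rule must run before the italic rule,
# because ''' also contains ''.
RULES = [
    (re.compile(r"'''(.+?)'''"), r"<b>\1</b>"),
    (re.compile(r"''(.+?)''"), r"<i>\1</i>"),
    (re.compile(r"\[\[([^\]|]+)\]\]"), r'<a href="/wiki/\1">\1</a>'),
]


def render_line(line, rules=RULES):
    """Apply each regex rule in order to a single line of markup."""
    for pattern, replacement in rules:
        line = pattern.sub(replacement, line)
    return line
```

The output depends on rule order: running the italic rule first mangles `'''Lua'''`. Behavior like that is defined by the sequence of substitutions rather than by a grammar, which is what makes a faithful port to another language so laborious.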

I'm not denying that, if you wanted to create a new Wikipedia from scratch
today, based on JavaScript, you could probably move a lot of rendering to the
browser. You would choose more browser-friendly formats, like JSON or XML,
rather than making up some random text-based format, just because it was easy
to type into a textarea. You would make transformational operations work in
JavaScript, or be exportable to JavaScript. You could definitely get it to the
point where it would be practical for quick previews while editing.

For the Wikipedia that we have today, it's really hard.

------
kibwen
Also of note is that (AFAICT) the ultimate decision came down to a choice
between Lua and Javascript:

<http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/57769/>

------
greggman
This makes me both happy and sad. I'm happy that Wikimedia is adding a code
path. I'm sad that it's not JavaScript.

Wikimedia needs to move beyond 1990s internet and allow interactive articles.
Articles on 3D that have interactive 3D. Articles on Physics that have
interactive simulations. If you asked me in 1980 what an encyclopedia would
look like in 2013 I would not have guessed static text + static pictures.

I know it's not an "either or" thing. They can have both. But I think once
they get around to adding JavaScript they'll regret having added Lua now and
having 2 code bases with 2 sets of libraries and code to synchronize the 2
when they need to communicate with each other.

~~~
bsg75
What would JavaScript bring that Lua does not?

~~~
greggman
As pointed out, you can't run Lua in the browser. It would be nice to have
articles like this in Wikipedia.

<http://tests.web2py.com/physics2d/default/code/10>

That points out that arguably Wikipedia will eventually have to allow
JavaScript in pages it serves if it wants to progress.

When that happens there will now be 2 code bases. The Lua code that generates
pages and the JavaScript code that does interactive examples for articles.

So the question is, why have 2 languages, which requires twice as many
libraries, twice as much knowledge, twice as much expertise, and various glue
to get them to interact with each other, vs having 1 language used for both?

~~~
bsg75
> So the question is, why have 2 languages

One perspective: Because choosing a language because it is conveniently in
place should not be the primary reason. Many people are comfortable with
JavaScript, but there are many server side code bases in other languages
because they are preferred for various reasons.

There is a large amount of PHP code deployed, but there does not seem to be
much drive or concern that it does not run in browsers. Both JS and PHP earn a
large amount of negativity because of their shortcomings (which we don't need
to go into here), and they are obviously not the tool for every job.

I would simply make the point that choosing a language to avoid having to
learn more than one is not an approach for improvement. Otherwise as I said in
another post, we would all be using IE with ActiveX controls.

~~~
greggman
Many companies limit the number of languages used internally. There are both
positives and negatives of course. The biggest positive though is that it
makes it easier for programmers to understand code written by other
programmers. It also means code written by one programmer will be usable by
other programmers.

If the code is in a different language then both of those are often false. The
2 programmers don't speak the same language and even if they both happen to
understand both languages they can't share code.

------
acqq
Performance change after some Wikipedia templates were rewritten in Lua:

<https://en.wikipedia.org/wiki/User:Dragons_flight/Lua_performance>

------
scribu
Templates were already pretty complex a few years ago when I was a casual
contributor. I am so glad they did this!

Now, if we could only go back in time and replace Vimscript with Lua or Lisp
or anything...

------
Zash
Video presentation and demo: <http://www.youtube.com/watch?v=PrhzAtC8fCc>

~~~
TazeTSchnitzel
Some of the examples were shocking. The inefficiency of str len astounds me.

------
DannoHung
To people grousing about it not being in JS: The hell is wrong with you? JS
being the only language your browser will run is the problem, not people
choosing to use something besides JS for something they do on the web.

------
simonw
Can anyone link to a good example of a simple Lua template that's live on
Wikipedia right now? The linked article wasn't great for examples.

~~~
acqq
<https://en.wikipedia.org/wiki/Module:Citation/CS1>

------
gngeal
"But, because we’d never planned for wikitext to become a programming
language, these templates were terribly inefficient and hacky — they didn’t
even have recursion or loops — and were terrible for performance."

That sounds like the life story of PHP! And many other similar solutions.

------
ksec
Finally, I hope this makes Lua much more widespread.

------
reirob
This news made me look closer into Lua. I discovered that it actually does not
provide any support for Unicode. I am wondering now how this will impact the
goals for Lua as a scripting language? Or is there a special module for Unicode
support that will be used?
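
For anyone wondering what "no Unicode support" means in practice: as far as I know, stock Lua treats a string as a sequence of bytes, so length and slicing count bytes rather than characters. The sketch below shows the same distinction in Python, using UTF-8 bytes to stand in for a Lua string; the helper names are hypothetical.

```python
def byte_length(text):
    """Length as a byte-oriented string library would report it (UTF-8 bytes)."""
    return len(text.encode("utf-8"))


def code_point_length(text):
    """Length in Unicode code points, which is what most users expect."""
    return len(text)


def naive_byte_truncate(text, max_bytes):
    """Cut the raw bytes at max_bytes; this can split a multi-byte character."""
    return text.encode("utf-8")[:max_bytes]
```

With `"caf\u00e9"` the two lengths disagree (5 bytes versus 4 code points), and truncating to 4 bytes leaves half of the accented character behind, so the result is no longer valid UTF-8. Any template code that measures or slices text in non-ASCII languages would need a Unicode-aware string library to avoid this.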

------
youngerdryas
From the comments:

"People who have programming skills should enjoy better productivity and lower
frustration while editing Wikipedia. Perhaps most exciting is how many people
will have a gentle and practical introduction to programming because of this."

A lot of people on HN get down on Wikipedia because of deletionists etc. and
support a forked version but the fact is Wikipedia is _the_ most salient
expression of the spirit of the internet. Google is great but is a business
with all the downsides that entails, project abandonment etc. Maybe I am
gushing but Wikipedia is without peer.

