I am a MUMPS programmer – Ask me anything
81 points by quink on Sept 1, 2013 | 68 comments
The vast majority of my work involves maintaining a system written in MUMPS, running on InterSystems Caché.

This may be the Stockholm syndrome speaking, but it's pretty alright. And I say this while also working extensively with, among other things, Python and JavaScript. My educational background is an MSc in Physics, so I'm also familiar with everything from MATLAB to LabVIEW to Assembler to PHP.

Ask me anything!

Just to give everyone here a rough idea, the most annoying thing I found recently is that the function:

$ZCONVERT(stringVar, "O", "JS")

Which escapes stringVar into a valid JavaScript string without quotation marks, doesn't escape the line separator or paragraph separator (U+2028 and U+2029), when it should.

This was also a bit of a problem in browsers, JSONP and the JSON spec a while ago. Life in the MUMPS world isn't as bad as you'd think. Except for a lack of nice libraries. You do not want to know when regexes made it into the language. Last year. But there has been something similar - pattern matching - which alleviated the need for them a bit. And calling out to DLLs is fairly easy. It's really driven more by the healthcare industry than anything else.
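The U+2028/U+2029 gap described above can be illustrated outside Caché. What follows is a minimal Python sketch, not a reimplementation of $ZCONVERT: the helper name js_escape is hypothetical, and it assumes the result will be placed between double quotes in the JavaScript source.

```python
import json


def js_escape(text: str) -> str:
    """Escape text for use inside a double-quoted JavaScript string literal.

    Hypothetical helper for illustration only; returns the body without
    the surrounding quotation marks.
    """
    # json.dumps takes care of quotes, backslashes and control characters.
    quoted = json.dumps(text, ensure_ascii=False)
    body = quoted[1:-1]  # drop the surrounding double quotes
    # U+2028 and U+2029 are valid raw inside JSON strings, but JavaScript
    # engines of the time treated them as line terminators, so they need
    # to be escaped explicitly.
    return body.replace("\u2028", "\\u2028").replace("\u2029", "\\u2029")
```

Calling json.dumps with its default ensure_ascii=True would escape these two characters as well, at the cost of escaping every other non-ASCII character too.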

Also, here's a short FizzBuzz I wrote:

  f i=1:1:100 w ! w:'(i#3) "Fizz" w:'(i#5) "Buzz" w:'$x i
Written in pseudocode:

  for i=1:1:100 {
    write newline
    write if not i%3 "Fizz"
    write if not i%5 "Buzz"
    write if not cursorposx i
  }
Shorter than any other FizzBuzz I've seen other than Perl, yet probably more readable.
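For readers who don't know MUMPS, here is roughly the same logic as a Python sketch. The function name fizzbuzz_line is made up for illustration, and the "nothing written yet" check stands in for the '$x test, which in the one-liner asks whether the cursor is still in column 0.

```python
def fizzbuzz_line(i: int) -> str:
    """Build one line of FizzBuzz output the way the MUMPS one-liner does."""
    out = ""
    if not i % 3:          # w:'(i#3) "Fizz"
        out += "Fizz"
    if not i % 5:          # w:'(i#5) "Buzz"
        out += "Buzz"
    if not out:            # w:'$x i  (nothing written on this line so far)
        out += str(i)
    return out


if __name__ == "__main__":
    for i in range(1, 101):
        print(fizzbuzz_line(i))
```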

CoffeeScript is pretty cool

for i in [1..100]
  console.log(['Fizz' if i % 3 is 0] + ['Buzz' if i % 5 is 0] or i)

MUMPS is pretty cool:


I've actually decoded it. Turns out that the thing most useful for this, by far, is an ASCII table :P

"everything from MATLAB to LabVIEW to Assembler to PHP."

This is kind of a litany of relatively-unstructured programming languages and sounds like a relatively one-dimensional view of computer program organization techniques. Of those listed, PHP is the one with the most advanced code-organization and object model, but its object model is hardly renowned.

Have you ever considered learning something like Ruby in depth and messing around with intensive object-oriented techniques and write-your-own-DSL metaprogramming and the like, so you can know how the other half of the world gets to program?

(ed) oh, and here's me getting -1'd. wonder what that's about. probably someone with thin skin thinking that asking about highly structured programming implies a put-down on the other kind. maybe I could put in some words of praise for Matlab's awesome matrix handling and it'd help? :P

My languages of choice are JavaScript, SQL and Python and I've gone quite deep in all of them. I'd consider ObjectScript compiling down to efficiently work through SQL queries or to form objects and other things a metalanguage, so Caché isn't as one-dimensionally pure imperative as you'd think.

I don't have any formal background in computer science, so thank you for pointing this out, and it's true. In a nutshell, you're telling me to work through SICP, right? :P

Edit: Yes, you need to mention MATLAB's matrix handling. Probably also say something about NumPy and PyPy, I'm sure that'd help too.

Okay! Since you HAVE done something like Python that's good to know; knowledge of your experience colors my interpretation of your interpretations. :)

Are you looking for work? Because my company is hiring good MUMPS programmers -- we are nearly always on the lookout for qualified people.

I bring it up, not because I think you're likely to be looking for work, but because I thought the others reading this discussion thread might be interested in the fact that it is difficult to hire qualified MUMPS programmers.

I work at a bank, and our banking system (the system that keeps track of the balances in the accounts) is written in MUMPS -- probably because it dates back to the time when that was the shiny new programming language.

Yikes. I've not done mumps in a long while, but I've heard from old co-workers that Cache has had a number of significant security vulnerabilities in recent years.

There was a thing a little while ago, but haven't seen anything big in many years apart from that. Mostly something about database corruption on VMS or ECP or similarly obscure things not really relevant to us.

The first thing I would do if I had to work with something like MUMPS is to write, like, a MUMPS LLVM backend or something that takes a saner language and emits MUMPS, thereby abstracting away the crazy. Why don't you guys do that? Or maybe it's already been done?

It has been done. The most popular form of MUMPS out there, Caché ObjectScript, is a superset of MUMPS, so all existing MUMPS code will run on Caché.

But it adds a big bunch of things from error handling to better variable scope to classes to, I wish I was kidding, proper 'if'.

Something that's also been added (to GT.M as well, afaik), because it is so tightly integrated with its database, are tcommit, trollback and tstart as commands, which are strictly speaking database commands and not the kind of thing you'd expect to find in a programming language.

proper 'if'

What (if anything) does MUMPS proper have for 'if'?


There are two different types of if constructs, and it's awful; the older one uses a dot syntax.

What's a dot syntax, you may ask? Well, for each level of the if you just prefix a dot:


Makes you want to stab your eyes out. Note also that the comments need to have a dot prefixed as well. That bit me before.

>This may be the Stockholm syndrome speaking, but it's pretty alright.

>Makes you want to stab your eyes out.

I dunno...

Came here to plug the GT.M version of MUMPS, which is really great. It uses the underlying UNIX system as much as possible (so, for example, your routines are not stored in the database!)


It's easy to put a CGI interface on top of GT.M - performance is quite good.

Personally I am working on a utility that wraps GT.M in an "environment" similar to a Python virtualenv, but I'm not sure I'm ready to show my baby to the world yet...

Ewww, CGI.

InterSystems Caché ships with Apache as an administrative web server for its Management Portal, through which you can also run all applications.

It ships with modules for Apache and IIS (ISAPI), and probably others. These come with a little ini file that's meant to sit in the same directory.

I appear to be an inferior version of you; you've described my job, I'm also familiar w/ MATLAB (loved the absolute pants off that language in class), python, and javascript, but I only have a B.S. in Physics. I also think that MUMPS is unfairly maligned.

Do you live in a place that rhymes with 'Radisson'?

Nope, there are in fact MUMPS programmers overseas too - I'm in Australia.

Also, if it wasn't for the sheer number of libraries PHP ships with, I'd without a moment's hesitation say Caché + Caché ObjectScript >>> PHP + MySQL.

is it possible to use Cache ObjectScript on GTM?

The only experience I've had with GT.M is playing with it a little bit on Debian - it's only an apt-get away.

As far as I know, no, apart from the shared ANSI M featureset. They're pretty divergent.

what type of job do you have? something in the medical field?

Nope, not in the medical field.

No, Cache is proprietary.

I know that; I was wondering if there was some type of Cache ObjectScript port to GT.M, like something called GT.M ObjectScript, etc.

I think we may work for the same company. Does it rhyme with 'Shmeppic'?

Well it sure doesn't rhyme with 'Shmerner'.

Non-tech question, but with your background, you could have migrated to any field. Anything stand out as a motivation to move in the direction you did? LabVIEW/MATLAB were my early intros into CS. I haven't really touched MUMPS yet, and I should. Thanks for the recommendations.

After hundreds of applications sent out - in a vast variety of fields, considering that degree - and an interview with Red Hat I didn't succeed in ultimately, this was the next best thing. In retrospect, probably better. The employment market for scientists here in Australia is pretty much complete crapness unless you do a PhD, which I didn't go for. Plus, I've had the IT experience and interest and our system is a domain I'm really interested in. And the company is pretty awesome too, especially my coworkers who are all equally enthusiastic about both the product and our customers.

Sorry if I'm not going into that much detail :P

Yeh, did not expect much work detail, but that makes sense. Most of my fellows voice op-eds of frustration, so it's nice to see someone defend it a bit ;-)

How does it feel to be using NoSQL so old it came back into fashion? :-P

We have dozens, if not low hundreds, of SQL tables.

So, while we do have a huge pile of legacy code not using SQL, you can map your NoSQL data structures to SQL tables, and you can also later on convert to a more efficient format that gives you bitmap indices and so on, while still using the global storage backend.

So, it may have been NoSQL until some point in the nineties and after that it was really NotOnlySQL.

There's something a bit therapeutic about seeing the indices, including bitmap indices, and all the data on disk in a format that's intuitive and usable for humans that you can use without going through SQL, but either by accessing it directly or through the built-in ORM system. You don't get that with PostgreSQL or MySQL, or conversely, MongoDB or CouchDB. It's both worlds. Sure, there's a lot of stuff from PostgreSQL that I would kill for - any volunteers? - but as a compromise between the two worlds it works quite well indeed.

Edit: Here's more info:


The awesomest points are the %ID pseudo-column, implicit joins, embedded SQL (which compiles SQL down to native MUMPS code, including cursors and all), near enough complete SQL-92 and DDL compliance. And the ORM stuff.
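To make the idea of mapping NoSQL data structures to SQL tables a bit more concrete, here is a rough Python sketch using the standard sqlite3 module. It is only an analogy: the data, the person table and the rows_from_global helper are all hypothetical, and unlike Caché, which per the comment above maps the existing storage, this sketch copies the data into a separate in-memory database.

```python
import sqlite3

# Hypothetical "global"-style data: (table, row id, column) -> value.
people_global = {
    ("person", 1, "name"): "Ada",
    ("person", 1, "city"): "Perth",
    ("person", 2, "name"): "Bob",  # sparse: no city stored for this row
}


def rows_from_global(data, table, columns):
    """Project the hierarchical keys of one table onto flat rows."""
    ids = sorted({rid for (tbl, rid, _col) in data if tbl == table})
    return [
        (rid, *(data.get((table, rid, col)) for col in columns))
        for rid in ids
    ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    rows_from_global(people_global, "person", ["name", "city"]),
)

# Missing nodes surface as NULL once the data is viewed as a table.
names_without_city = [
    row[0] for row in conn.execute("SELECT name FROM person WHERE city IS NULL")
]
```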

The obvious question would be why? I don't know Caché except what I read on wikipedia, but having worked with plenty of languages over the years it doesn't sound like anything you couldn't easily do in python with 100x readability improvement. I see little reason to go backwards with languages when we (the CS field) have made such awesome improvements over the years. Now, rather than worry about the bs stuff we can focus on algorithms and whether or not an idea is actually useful when working. I build things in Python all the time, throw away most, but the ones that look promising are productized.

A lot of systems — legacy and current — are written in MUMPS. Historically MUMPS has been very popular in health care systems (where it originated), and I believe it's still huge there; it is used and supported by a number of niche companies for things like patient data.

In other words, MUMPS is a platform and an ecosystem as much as a language. Think of Java or Ruby — for a lot of companies, including MUMPS shops, staying with a specific "sub-ecosystem" is simply the most rational choice because they have so much invested in it already.

If you look beyond tech that is currently considered "bleeding edge" — Go, JavaScript, Ruby and so forth — you will find a lot of companies who rely on what you may consider weird or even legacy software. For example, Delphi (a descendant of Borland's Turbo Pascal which is still based on ObjectPascal) is still very popular. In finance, languages like K are still popular. I believe finance still has a ton of stuff based on object databases such as Objectivity/DB, Versant, Matisse and GemStone (Smalltalk), which actually look a lot like today's document-oriented databases. InterSystems Caché, which is based on MUMPS, is a hybrid SQL/OODBMS. In other words, the software market has a lot of aging technology that is still working superbly for the parties involved. Old code is usually proven code.

InterSystems Caché is more like UNIX than it is like, let's say, MongoDB. Make the bottom of it efficient - that's where the runtime and the B-tree storage operate - and you can build a world on top. SQL from tables to views to indices, all the ORM and things like classes and MVC are implemented mostly as macros. And it works pretty well.

Sure. I didn't mean to include Caché when I referred to newer document-oriented databases. Caché has a different architecture. It's more similar in design to K and Kdb [1], I suppose, which is also heavily based around vector operations on persistent arrays.

[1] https://en.wikipedia.org/wiki/K_(programming_language)

You are right about healthcare; for example, the entire US Veterans Administration runs on MUMPS, and I think Epic Systems also uses Caché. This area is ripe for disruption; they've been stuck with the same legacy stuff for 30 years.

We use Python too.

This system was originally put together in the 80s and while a complete reimplementation from scratch in Python is possible, it wouldn't really give us that many benefits that we don't already have. We don't have to worry about algorithms all that much - we may just write an SQL query which takes care of things for us. One of the more complex things we do in new code is maybe two levels of $order, the equivalent of your for x in y loop.
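For anyone unfamiliar with $order: it returns the next subscript after the one you hand it, and an empty string once there are no more. A rough Python sketch of two nested levels follows. The order and walk helpers are hypothetical, and the sketch assumes non-empty string keys sorted the way Python sorts strings, which is not the same as MUMPS collation.

```python
import bisect


def order(sorted_keys, key=""):
    """Return the key that follows `key` in sorted_keys, or "" if none is left."""
    idx = bisect.bisect_right(sorted_keys, key)
    return sorted_keys[idx] if idx < len(sorted_keys) else ""


def walk(tree):
    """Visit a two-level dict in key order using nothing but order()."""
    visited = []
    outer_keys = sorted(tree)
    outer = order(outer_keys)                 # "" means: start at the beginning
    while outer != "":
        inner_keys = sorted(tree[outer])
        inner = order(inner_keys)
        while inner != "":
            visited.append((outer, inner, tree[outer][inner]))
            inner = order(inner_keys, inner)  # next inner subscript
        outer = order(outer_keys, outer)      # next outer subscript
    return visited
```

In everyday Python you would just write two nested for loops over sorted keys; the explicit version is only there to show the "give me the next key" shape of the loop.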

We're programming on a level quite a bit higher than C, nor are we exposed to the sheer verbosity of Java, thank god. You think Caché ObjectScript is awful? I think Java is awful.

What Caché gives us is tight integration between the language and the database system, so in that way the choice isn't really between Caché ObjectScript and Python, it's more between PL/pgsql or PL/Python and Caché ObjectScript. Once you go there you'll realise that the code that's written in ObjectScript isn't really affected that much by the choice of programming language anyway. And PL/Python isn't really something you'd want to write a production system in anyway. Between that and built-in ORM, things really aren't that bad.

And 100x readability improvement is just wrong. Sorry, it is. Sure, that may apply to some old MUMPS code that survived the 70s when disk space was scarce but even this is fairly straightforward to expand into something quite readable. We do have syntax highlighting, function calls look the same, and assigning a value to an object is 'set object.Property = "blah"' instead of 'object.Property = "blah"', a difference that's quite trivial.

That's not to say the lack of libraries isn't annoying, but everything really important is there in the right places. I've written a decompressor for tar.gz in about 34 quite readable lines, with error handling and all. gzip is built in these days so it's really mostly .tar I needed to worry about. Alternatively, calling out is just a matter of $zf(-1, cmdLine). Similarly, I've just put chosen (https://github.com/harvesthq/chosen) into our web application, based on Caché. It was easy enough to do.
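To give a feel for why the .tar part is small once gzip is taken care of, here is a minimal Python sketch of the same idea. It is not the ObjectScript code mentioned above: list_tar_gz is a hypothetical name, and the sketch assumes plain ustar-style headers with short names and octal sizes, so long-name extensions and very large files are not handled.

```python
import gzip

BLOCK = 512  # tar archives are a sequence of 512-byte blocks


def list_tar_gz(raw: bytes):
    """Return [(name, content)] for the regular files in a .tar.gz byte string."""
    data = gzip.decompress(raw)
    entries = []
    offset = 0
    while offset + BLOCK <= len(data):
        header = data[offset:offset + BLOCK]
        if header == bytes(BLOCK):  # an all-zero block marks the end of the archive
            break
        name = header[0:100].split(b"\0", 1)[0].decode("utf-8")
        size = int(header[124:136].strip(b"\0 ") or b"0", 8)  # octal text field
        typeflag = header[156:157]
        offset += BLOCK
        if typeflag in (b"0", b"\0"):  # regular file
            entries.append((name, data[offset:offset + size]))
        # File data is padded up to the next 512-byte boundary.
        offset += (size + BLOCK - 1) // BLOCK * BLOCK
    return entries
```

In real Python code the tarfile module does all of this already; the manual parsing is only there to show how little the format asks for.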

And it's a pretty top-notch fast SQL implementation with a nice built-in language and a nice ORM and lots of other bonuses like full-text search and bitmap indices and OLAP cubes (yes, it speaks MDX even) if nothing else.

For one thing, there's not much of an incentive to port hundreds of thousands of lines of working legacy code to a different language just to make it more readable. Its age isn't a relevant issue either - sure, MUMPS debuted in 1966, but InterSystems Caché and GT.M are still being actively developed. Other databases are pretty old, too - the first version of Oracle was written in 1978.

If Caché had no strengths, I would agree with you, but as a non-relational database, it's pretty good. Sparse associative arrays are the default data structure, so it's very popular in, e.g., medical applications, where you would want to be able to store thousands of different things, but any given patient will only need a few of them.
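A rough Python analogy for the sparse layout described above: only the values that actually exist are stored, and anything else reads back as a default. The field names and helpers are hypothetical, and unlike a MUMPS global this dictionary lives only in memory.

```python
from collections import defaultdict

# patient id -> {field name -> value}; fields that were never set take no space.
patients = defaultdict(dict)


def set_value(patient_id, field, value):
    """Store one observation for one patient."""
    patients[patient_id][field] = value


def get_value(patient_id, field, default=""):
    """Read a field, falling back to a default when it was never stored."""
    return patients.get(patient_id, {}).get(field, default)


set_value(1001, "blood_type", "O+")
set_value(1001, "allergy", "penicillin")
set_value(1002, "blood_type", "A-")
```

Out of the thousands of fields a schema might allow, patient 1002 here occupies exactly one entry.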

> Sparse associative arrays are the default data structure, so it's very popular in, e.g., medical applications, where you would want to be able to store thousands of different things, but any given patient will only need a few of them.

Caché has really been moving away from that though... CacheStorage stores data in a big list in the *D globals.

I think the power in CacheStorage really comes from the indices and - very relevantly to the healthcare industry - all the relationships. They've got a very complex schema they need to support and Caché continues to be pretty good at that kind of thing - see the implicit joins in their SQL variety for example or Zen.

Unfortunately at my company we don't get to use the fancier Caché features. We have to stick to ANSI M.

We do make a lot of use of indices, though. Finding records can be incredibly fast.

> We have to stick to ANSI M.

I'm so sorry. Nothing more fun than manually keeping indices in MUMPS :|

It's all legacy stuff; the incentive not to port is that it's too hard to do so. Most of the InterSystems customers I have worked with want to move away to something more modern but can't.

> Most of the InterSystems customers I have worked with want to move away to something more modern but can't.

We're pretty happy with it. If, theoretically, we needed to start from scratch, it would definitely be in our top very very few choices. Not least of all because of things like DeepSee.

I could certainly think of things much worse than Caché.

Don't really have anything to ask, just wanted to say MUMPS was always neat when I worked for a medical software company and did conversions from older versions of MUMPS to a Caché server.

I liked the one-letter verb abbreviations, even if it made the code feel somewhat write-only. Data right next to your front-end language was neat too.

The one letter abbreviations are neat.

There's no substantial change in readability in going from:

if condition: print "Hello World"

Or: if (condition) { console.log("Hello, World\n"); };

To:

w:condition "Hello, World",!

I never type 'write', but just 'w' instead. Here's a list of commands: http://docs.intersystems.com/cache20131/csp/docbook/DocBook....

> Data right next to your front-end language was neat too.

Don't know about neat; it made it very hard to separate your model from your logic, and while convenient at the time, that's also quite a bit of a pain.

I guess to me the data thing was neat because you didn't have to bust into another shell just to get at data.

Here's how to get a string from your persistent on disk configuration, completely from scratch:

set foo=^config("foo")

The '^' means it's a persistent variable.

Now do this in the same number of characters, either by reading in a flat file or doing an SQL query in any language. I don't think you'll succeed.

And, yes, it is possible to parameterise this ^config, like so:

set location="^config"

set foo=@location@("foo")
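For comparison, the closest thing in the Python standard library is probably shelve, which gives you a dictionary that persists on disk. Here is a minimal sketch, with hypothetical helper names and a temporary directory standing in for wherever the store would really live. It covers only the "persistent variable" part, with none of the concurrency, caching or networking a global gets from Caché.

```python
import os
import shelve
import tempfile


def write_setting(location, key, value):
    """Persist one value under `key` in the on-disk store at `location`."""
    with shelve.open(location) as store:
        store[key] = value


def read_setting(location, key):
    """Read one value back from the on-disk store at `location`."""
    with shelve.open(location) as store:
        return store[key]


# Passing the store around by name is the rough counterpart of @location@.
location = os.path.join(tempfile.mkdtemp(), "config")
write_setting(location, "foo", "bar")
foo = read_setting(location, "foo")
```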

I don't get it. That seems like a rather trivial library thing to add in most languages, at least ones that let you define operators. Otherwise you'd just define top-level functions "configp" or whatnot.

Just like "one letter abbreviations". Again, just "let w = printf" if that's what you're into. I do that kind of stuff all the time, with limited scope.

The fact that easy persistent on-disk storage, other than through file streams or sqlite and the like, is still not a built-in feature or a commonly used library in languages in 2013, when MUMPS did this in 1960-something, is more a commentary on the state of things outside MUMPS than the other way round.

Sure, you can do import json; config = json.load(open('config.json')); config['foo'] and I have code like that in production right now, but put pickle in comparison to the above and I know which one seems nicer. Not least of all, changing ^config("foo") is fully concurrent, caching, network-traversable, auditable, with Caché supporting these things like an operating system should, but in a built-in way.

I could write a library in Scala that allows almost this exact syntax. Say...

val fooConfig = "config" ^ "foo"

Of course, there would be some config required. There are various ways to handle that, but it could be as minimal as a single line of code. I don't want to be critical of your work, but my own preference is for languages which are DSL-friendly. Although there are some disadvantages (possible lack of fluidity, mildly more verbose), I feel the advantages (lack of vendor lock in, composable with the rest of the language) are worth the trade-off.

Admittedly I've never had experience with MUMPS, but I have used PHP. Until pretty recently, that was a language which attempted to break down the barriers between its syntax and the runtime environment at the expense of the language. Reams have been written on the PHP argument, and I won't contribute further here; just noting that my preference for languages with a flexible syntax and a "do it in a library" attitude is based on having experience with the opposite.

Have you had experience with any of the embeddable-DSL languages (Ruby, Clojure, Scala, Haskell)?

With the new string interpolation syntax you could define config"key": Option[Result] and even config"base/userdata/$user/$key": Option[Result].

Do you have any recommendations for books/learning materials/websites for people who want to learn MUMPS?

I'd recommend the documentation that InterSystems Caché ships with, which is also accessible at http://docs.intersystems.com/

It's not traditional MUMPS, but it adds things like macros, error handling, sane variable scope, regular expressions and so on.


Would you actually recommend anyone learn MUMPS for any reason besides job reasons?

I wouldn't. It's not a terrible language to work with on the job, but I don't think I've gotten anything out of it in the same way that I did from learning OCaml or Lisp.

Ditto, there's not a whole lot of innovation of ideas in MUMPS at all. It's just a random programming language that's good enough to get things done with and the database that's next to it is nicer than most and supports both NoSQL and SQL. Spend the time on learning Python, JavaScript and SQL I'd say :P

For a modern view on Mumps as a database technology and the future (or not) of its language, see my blog at http://robtweed.wordpress.com

Also see: http://www.mgateway.com/docs/universalNoSQL.pdf

I've worked with this ... MUMPS is a dead language. Cache is a proprietary implementation of mumps that costs big bucks; cache is interpreted and pretty slow compared to any other language since 1989.

I wouldn't bother.

I think the most relevant bit is that it's much cheaper than Oracle. And I've seen Caché do quite complex things quite quickly; as with all database systems, it's up to your indices more than anything else.

The entire VA (VistA) and the DOD (AHLTA, CHCS) systems are still implemented in Mumps and will be for the foreseeable future. Far from a dead language.

How well are you paid? You don't have to give specific numbers, just like in comparison with the average. I've heard rumors of people getting paid a shit ton of money to maintain these sorts of systems but those might be just rumors.

Sorry to disappoint you, but my job isn't as such the maintenance of a legacy system. We write new code in Caché, which includes things used by a wider world like JavaScript and SQL as well.

Ask me again in two or three decades, but our codebase is continuously being touched in all places and there is an ongoing drive to weed out legacy code all the time. But just on the side I've been able to get rid of about 30% of all the legacy code here without spending that much time on it at all mostly with the help of grep, in part because our system has now moved to being 100% web from a desktop client.

As for salary, never enough :P, but considering my age, the economy, and my lifestyle I'm pretty happy.

Kind of off-topic question but: Could you provide any option to contact you? I have a couple of questions with regards to MUMPS and I'd like to - if possible - drop you an e-mail.

Sure, PM me on reddit - same username.

I was a key programmer for a distributor of the MSM database in several countries, before they were bought by InterSystems. In my personal experience, MUMPS is better suited to DB-related systems only, not fancy stuff.


Nope, sorry. We're a Caché user who is not in the healthcare industry, unusual as it is.

Do you work for Epic Systems?

Nope, not in the healthcare industry at all.
