Most immediately, I need to turn this file


into this file


Longer story:

I'm a doctor in the military, and we use IT systems derived from the VA's original VistA project. If you've ever heard of MUMPS, this is it. The manuals date back to the early 80s. I am trying to build out some infrastructure so we can sensibly work with external research institutions in the San Diego area. I have figured out how to get some decent text out, which then needs to be parsed for specific things that may vary depending on the project.

Think of this as a startup within the largest of large organizations. The downside is there's a huge bureaucracy. The upside is we're operating far enough up that everybody understands the bureaucracy problems, and some key people are willing to help facilitate things along the way.

So my minimum viable product is a research paper published using 55 cases. Trivial, right? Maybe if you're at Stanford. Here, not so much. The hard part is getting enough code written that I can automate the parsing, inject this (and some other similar files) into postgresql, and then pull data back out and do data analysis with it. I have struggled to find help. Joe has been very generous with helping me wrap my head around postgresql. I have been through a lot of python tutorials, but I have never sat down with someone who knows python and seen how they solve a problem. Go to a desert island and teach yourself python with nothing but the internet. Can it be done? Yes, but it is very hard.

So I'm hoping to build a relationship with someone who could help with practical matters, like finishing this first step of a small parser. Paul McGuire, the author of pyparsing, has been very helpful over Stack Exchange, but crafting a well-formed question on Stack Exchange is very difficult when, again, one learned to code on a desert island.

Here's my effort so far


Which I'm sure anyone here will look at and say: dude, you could have finished this by the time you wrote this comment! Except it's really hard to wrap your head around something you've never done, never seen anyone else actually do, and can only spend 10-15% of your time on.

But I need to automate some of this work if we are going to undertake larger projects in the future, which is the goal.

Unfortunately I don't know python; I'm ruby-rails-mysql for my day job.

If you're interested in going down the ruby route (with mysql, or I can probably get back up to speed on postgres, though I'm a bit rusty), I can get involved.

If you want, the needed input cases and their desired output forms are in the repo. I would love to see how it's done in Ruby.


I'm not completely wed to postgresql, but I would hate to give up Joe, who has been tremendously helpful. And most of my admin experience is with postgresql (blogs, wikis, etc).

If the sequence of the last 3 numbers is not important (it does not appear to be), then this appears to be relatively trivial. Unfortunately, like the OP, I can only do this in ruby. I'll attempt to do it tonight, and reply to you on here when it's done.

Sent you a mail with my attempt.

