I understand that now I have two problems. Actually, since I'm trying to use these regexes through AppleScript, it appears that I have three.
I've tried using Smile from Satimage, but I can't seem to establish an environment in which I can run a regex on the text I have (which is in QuarkXPress). I can test that my expressions work, but I can't manage to have them applied to the text in Quark, much less change the text.
Structurally, I feel I understand what I need to happen: activate Quark, select the document, find the offending text (given the regex parameters), replace the text (if it meets certain conditions), else ignore the text.
My goal is to script the editorial style guide of the magazine I work for, thus side-stepping a lot of formatting/spell checking that we do, so I can focus on fact checking.
I actually have a script that already does the formatting, but I wanted to rewrite it using regex so that it might run faster and fulfill more goals. For example, we switch all manner of words spelled with the letter 'z' to a spelling with the letter 's', as is customary in British spelling: "analyze" to "analyse", "capitalization" to "capitalisation", you get the idea. I've had trouble with my script thus far because it introduces certain errors (e.g. "size" to "sise"). I thought regex could prevent this, and that if I could learn it, I could go on to solving other, more complex, problems. But at this stage, I can't even get a script with regex in it to launch and work on the text.
Ideas?
You are absolutely not looking for a regular expression (as the grammatical rules and their exceptions are not all that easy to reimplement in regexp syntax), you are looking for a dictionary (a set of key-value pairs, i.e. "American zpelling" => "British spelling" ;)). Then iterate through each word in the document, see if it's a dictionary key, if so, replace with the relevant value. No regex needed.
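To make the idea concrete, here is a rough sketch in Python (not AppleScript, and not wired up to Quark). The dictionary entries are just the two examples from the question, and the names `BRITISH_SPELLINGS`, `match_case` and `britishize` are made up for illustration. It assumes a "word" is simply a run of letters, and it only handles simple capitalisation (all caps or an initial capital), so treat it as a starting point rather than a finished tool.

```python
from itertools import groupby

# Hypothetical sample entries; a real style guide would supply the full list.
BRITISH_SPELLINGS = {
    "analyze": "analyse",
    "capitalization": "capitalisation",
}


def match_case(original, replacement):
    """Copy simple capitalisation (ALL CAPS or Initial cap) from the original word."""
    if original.isupper():
        return replacement.upper()
    if original[0].isupper():
        return replacement.capitalize()
    return replacement


def britishize(text, spellings=BRITISH_SPELLINGS):
    """Replace whole words found in the dictionary; leave everything else alone."""
    pieces = []
    # groupby splits the text into alternating runs of letters and non-letters,
    # so punctuation and spacing pass through unchanged.
    for is_word, run in groupby(text, key=str.isalpha):
        chunk = "".join(run)
        if is_word:
            replacement = spellings.get(chunk.lower())
            if replacement is not None:
                chunk = match_case(chunk, replacement)
        pieces.append(chunk)
    return "".join(pieces)


if __name__ == "__main__":
    print(britishize("Analyze the size of the capitalization."))
```

Because only whole words that appear as keys are touched, "size" is never a candidate for replacement, which is exactly the class of error the blanket z-to-s rule was introducing. The same lookup loop should carry over to AppleScript once you can iterate over the words of the Quark text.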