This would be a cool add-on for a Unity C# game: easily spin up some procedural dialog for NPCs, have some rhyming characters spit out 'clues' in their rhymes, then save those as text variables and store them in the environment's puzzle as solution pieces. Kind of dehumanizing for the game though, like sucking out the soul.
I made a random sentence generator in Python about 10 years ago. It doesn't have any kind of pattern control (totally random), but it has different lists of nouns, verbs, adjectives, etc, and outputs grammatically correct sentences. I might steal some ideas from this, if I ever get around to working on it again.
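For anyone curious what that kind of generator looks like, here is a minimal sketch, not the original code. The word lists, the fixed "The ADJ NOUN VERB the ADJ NOUN." template, and the name random_sentence are all made up for illustration.

```python
import random

# Hypothetical word lists; the original generator's lists are not shown.
NOUNS = ["cat", "wizard", "river", "engine"]
VERBS = ["chases", "paints", "follows", "repairs"]
ADJECTIVES = ["quiet", "rusty", "clever", "green"]


def random_sentence(rng=random):
    """Fill a fixed 'The ADJ NOUN VERB the ADJ NOUN.' template at random."""
    subject = f"the {rng.choice(ADJECTIVES)} {rng.choice(NOUNS)}"
    obj = f"the {rng.choice(ADJECTIVES)} {rng.choice(NOUNS)}"
    sentence = f"{subject} {rng.choice(VERBS)} {obj}."
    # Capitalize only the first letter so the rest of the sentence is untouched.
    return sentence[0].upper() + sentence[1:]


if __name__ == "__main__":
    for _ in range(3):
        print(random_sentence())
```

The sentences come out grammatical only because the template is fixed; there is no pattern control beyond that, which matches the "totally random" description.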
I've looked for this and never found the right search term. Nice!
Likewise, I can't identify the search terms to find a library for extracting data from a text corpus - e.g. to identify addresses, dates, names, phone numbers. Does one exist?
I've smashed together a few regular expressions to brute-force it, but the accuracy is very low.
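To make the brute-force approach concrete, a rough sketch might look like the following. The two patterns (US-style phone numbers and ISO dates) and the extract helper are assumptions for illustration, not the poster's actual expressions, and they show why accuracy suffers: anything outside these exact formats is missed.

```python
import re

# Deliberately narrow, hypothetical patterns: they only cover
# 3-3-4 digit phone numbers and YYYY-MM-DD dates.
PATTERNS = {
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}


def extract(text):
    """Return every match of each pattern, keyed by entity type."""
    return {name: pattern.findall(text) for name, pattern in PATTERNS.items()}


if __name__ == "__main__":
    sample = "Call 555-123-4567 before 2021-03-04 or 555.987.6543."
    print(extract(sample))
```

Names and street addresses have no such regular shape, which is where regexes alone tend to break down.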
Loosely you're talking about named entity recognition/extraction (strictly you might not include addresses etc. under this definition, but who cares). You might get some mileage out of tools like:
Stanford NER is very powerful. If you don't want to mess around with Java, there are also Python libraries that do this very effectively and are easier to get started with (e.g. NLTK or spaCy). A high level intro for both with code snippets - https://towardsdatascience.com/named-entity-recognition-with...
I used this a while back when I was researching a way to help people with macular degeneration. I'd created some visual distortions which could be applied to a VR headset (if you had eye-tracking information) which I hoped would improve reading ability in MD patients. I used Rant to generate some random text in Unity and then measured how reading speed varied.
Hi, most of this is shorthand for either several functions or functions with longer names.
For example, [rs:10;\s] is the same as [rep:10][sep:\s].
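For readers who don't know Rant, the idea of "repeat N times with a separator" can be sketched in Python. This is only an analogy under my reading of the two functions, not how Rant implements them; repeat_with_separator and the word list are hypothetical.

```python
import random


def repeat_with_separator(make_item, count, separator):
    """Call make_item `count` times and join the results with `separator`."""
    return separator.join(make_item() for _ in range(count))


if __name__ == "__main__":
    words = ["red", "green", "blue"]
    # Rough analogue of repeating a block 10 times, separated by a space.
    print(repeat_with_separator(lambda: random.choice(words), 10, " "))
```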
The last example is used to synchronize results between dictionary queries. This particular one is short for "output the same word as any other query to this table with the match ID 'a'."
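The synchronization idea can also be sketched outside Rant. The Table class, its query method, and the match_id parameter below are hypothetical names for illustration and are not Rant's API: the first query with a given match ID picks a word, and later queries with the same ID reuse it.

```python
import random


class Table:
    """Toy word table; queries sharing a match ID return the same word."""

    def __init__(self, words, rng=None):
        self.words = list(words)
        self.rng = rng or random.Random()
        self._matches = {}  # match ID -> word chosen for that ID

    def query(self, match_id=None):
        # Without a match ID, every query is an independent random pick.
        if match_id is None:
            return self.rng.choice(self.words)
        # With a match ID, pick once and reuse the result afterwards.
        if match_id not in self._matches:
            self._matches[match_id] = self.rng.choice(self.words)
        return self._matches[match_id]


if __name__ == "__main__":
    nouns = Table(["sword", "lamp", "key"])
    print(f"You found a {nouns.query('a')}. The {nouns.query('a')} glows.")
```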
In retrospect it was a poor decision to use shorthand forms in prominent example snippets, and I'll consider revising them.