Not to forget the middle dot that was often used to separate words in Latin inscriptions.
Linguists like Geoffrey Nunberg and Ted Briscoe have written about the linguistics of punctuation and its role in automatic syntactic processing of text ("Parsing (with) Punctuation").
For Mandarin, detecting the word boundaries in the absence of white space between word-indicating characters or character sequences has been a dedicated sub-task of NLP pre-processing, often applying machine learning in the process:
https://www.semanticscholar.org/paper/Chinese-Word-Boundarie...
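To make the task concrete, here is a minimal sketch of the classic dictionary-based baseline, forward maximum matching. It is not the machine-learning approach from the linked paper, and the function name `segment`, the `max_len` parameter, and the toy lexicon are all made up for illustration. Real segmenters use large lexicons plus statistical or neural models to resolve ambiguity.

```python
def segment(text, lexicon, max_len=4):
    """Greedy forward maximum matching (illustrative baseline only).

    At each position, take the longest substring (up to max_len characters)
    that appears in the lexicon; fall back to a single character otherwise.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted so the loop makes progress.
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy lexicon, assumed for this example only.
TOY_LEXICON = {"我", "喜欢", "学习", "中文"}

print(segment("我喜欢学习中文", TOY_LEXICON))
# ['我', '喜欢', '学习', '中文']
```

The greedy choice is exactly where this baseline breaks down: when two overlapping dictionary words are both possible, it always commits to the longer one on the left, which is one reason the task ended up with learned models.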
> For Mandarin, detecting the word boundaries in the absence of white space between word-indicating characters or character sequences has been a dedicated sub-task of NLP pre-processing
This is also really hard (at least for me) for second-language learners.