I wonder how the decisions for inclusion of languages were made, as there are some very odd ones. For example, Osmanya is a script created for the Somali language that was hardly ever used (Somali literacy only became widespread after the Latin alphabet was adopted; previously Arabic was commonly used). The population of actual users of this script is pretty indisputably 0. 100,000 would be a wildly ambitious estimate of the number of people who have ever even seen the script.
On the other hand, Oriya, which has over 33 million native speakers, including 80% of India's Odisha state, does not appear to be supported.
In their defense, when you click India and scroll down, it does say, "not supported yet". Which leads me to believe they picked both languages with few characters (or straightforward to render?) and those most common, and they'll get to the rest shortly. :)
Meanwhile, I wonder if this means we'll see OCR and ePubs for all kinds of scripts now; or if this will help enable Google Translate in more languages? ;-)
Also, maybe this was a 20% time thing, and the programmer who started it just wanted to do those languages (probably because they couldn't be found elsewhere).
It's probably just a matter of whether or not there's somebody in the relevant team(s) who is familiar with, or at least has heard of, any given script.
I wouldn't be surprised if there happens to be an Osmanya geek at Google, but none of his teammates has ever heard of Oriya. For the same reason, I wouldn't be surprised if they added a bunch of geeky fictional languages before actual ones.
And the fun thing is, there are two OLD Persian scripts available (Pahlavi and Old Persian, both dead for almost 1,500 years), while current Persian is not supported :)))
Perhaps even more curious is the inclusion of Deseret, an alphabet developed by the Mormons in the mid-1800s. It never caught on, and few books other than the Book of Mormon were ever printed in it.
Good pick, but I also see the value in preservation — maybe 500 years from now, the Noto fonts will be the best or only representation of many dead or forgotten scripts and languages.
There are many more glyphs than there are codepoints -- a font contains a ton of information needed to reproduce a script that is not present in the Unicode tables.
This is particularly true for non-European languages: ligatures are a minor feature for most of the languages covered by Latin-1, but there are quite a few scripts which depend on complex, multi-letter combinations that are required for the text to be comprehensible.
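One small way to see the glyph/codepoint distinction from the standard library (a sketch using Python's `unicodedata`; the Arabic lam-alef ligature happens to be one of the few ligatures Unicode also encodes directly, as a legacy "presentation form"):

```python
import unicodedata

# U+FEFB is a single legacy codepoint for the mandatory Arabic
# lam-alef ligature. In well-formed text you store the two letters
# separately; compatibility normalization recovers them.
ligature = "\uFEFB"
letters = unicodedata.normalize("NFKC", ligature)

print(unicodedata.name(ligature))
# ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM
print([unicodedata.name(c) for c in letters])
# ['ARABIC LETTER LAM', 'ARABIC LETTER ALEF']
```

For almost every other required ligature there is no codepoint at all: the font's own shaping tables (OpenType GSUB and friends), not the Unicode data, are what say the two stored letters must be drawn as one combined glyph.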
Hi, fellow Oriya speaker. Google has very bad support for Oriya, since the IT and R&D scene in Odisha is not that strong. I work at IIIT Hyderabad, which is the leading NLP lab in India, and I don't see anything in Oriya.