The source of the diacritical mess in Polish is that it is a Slavic language rendered in a Latin orthography. Cyrillic is a good start for Slavic languages. Latin is a good start for Romance languages. Arabic is a good start for Semitic languages. When you start from the orthographic base of a related language, you don't need as many of these shenanigans. Persian, by the way, is in the same boat as Polish: a language with a large consonant inventory rendered in an orthography that is relatively consonant-poor.
However, the Turkic languages don't have a good start in any of these three, yet they're all written in one of them with a large number of diacritics.
From a linguistic point of view, I posit that your primary concern in designing an orthography is: does this line up nicely with the phonology? If it does, congrats, your literacy rate just went up. Every digraph you add is another exception you have to explain, every sound with two letters is another exception you have to learn. My son is in kindergarten. He wrote "KUMIN" on a piece of paper and hung it on the door the other day. I asked him what it said, and he told me "It says 'come in'". Was that obvious to you? It wasn't obvious to me.
So, in the grand scheme of things, I know a new Turkic orthography is probably a long shot. But wait, there is already a Turkic language with similar phonology: Turkish. Did they take the Turkish orthography and modify that? Not really. They had to come up with yet another g diacritic.
I hope they do normalize the spelling and make it phonetic. Then at least they may improve literacy, assuming it wasn't as phonetic as it could be under the old alphabet. Because otherwise, what are they getting for spending this huge amount of money on replacing books, retraining teachers, fixing software, etc.?
> Every digraph you add is another exception you have to explain
In phonetic languages with digraphs there are rules to the exceptions. For example, "sz" is related to "s" the same way "cz" is related to "c", and "i" makes the previous letter softer.
And there are Slavic languages with Latin script and almost no digraphs. Polish is kinda old school in that regard as it preserved sz and cz (funny thing - English name for Czech Republic uses Polish/Old Czech spelling).
English is one of the worst examples of using Latin script there is, and it could be drastically improved if it used regular digraphs instead of ad-hoc spelling. For comparison German from the same language family is much more regular and easier to pronounce.
Absolutely not. Czech orthography is beautiful. Latin letters, clear featural marking of palatal consonants (č, š, ž, ď, ť) and simple marking of vowel length (á). The spelling is even morphophonemic!
As an outsider with casual knowledge of Czech, I totally agree that the Czech orthography is logical and consistent (which is beautiful :). It makes it so much easier to learn to pronounce words, because the spelling tells you how - like it should. There are (almost) no surprises like in English, where the language is a hybrid of rules/exceptions depending on the etymology of each word. The reason for orthographic consistency in Czech could be that they were able to standardize it more recently? I also find Czech typography wonderful.
It's not really a mess for us Poles, and that's my point: someone who hasn't extensively used a language that butchers an alphabet shouldn't judge some other language's writing system just because it added a diacritic or a few. If done well, it's a non-problem.
The script split in Slavic languages also follows the West/East (historical and contemporary) and Catholic/Orthodox splits more closely than anything else (with caveats: Czechs have largely turned atheist from Catholicism, while Poles, Russians, etc. mostly remain in their religions). West Slavs have Latin and a prevalence of Catholicism, East Slavs have Cyrillic and a prevalence of Orthodoxy, and South Slavs are a mix of the two, and it shows: Croatia is more Catholic and more Western and uses the Latin alphabet, Serbia has Orthodoxy and both scripts, Bulgaria has Orthodoxy and Cyrillic, etc.
Technically, even Cyrillic, which should fit Russian perfectly, has things like И and Й for two related sounds, because it really makes sense for a related sound to have a similar letter with just a diacritic mark.
Even the Japanese hiragana and katakana have diacritics and digraphs, even though it's a fully bespoke system that evolved in Japan (an island nation), got a few official revisions to clear it up and is specific to the Japanese language (and Okinawan and Ainu, but that's secondary, due to Japanese presence, and didn't affect its design). Yes, it's based on Chinese characters and some Indian Buddhist scripts, but those were modified and adapted very extensively over a thousand years ago and barely look like the originals they evolved from.
It's similar with Polish, for example ź is like z but with sort of wheezing the air under your tongue and through lower teeth instead of on top of it (sorry for the bad explanation, you can compare Polish letters and their bases on Google Translate or something and you'll hear the similarity of the sound).
There were attempts to Cyrillize Polish[0] and Latinize Belarusian[1], but they were shoddy at best and done by occupiers, so that didn't go well.
The only funky thing in Polish that is better done in Cyrillic is digraphs (but then Russian has ть where we have a single letter ć, which is visible in the base form of many verbs, like Russian делать and Polish robić). Polish has cz and sz, while Cyrillic has real letters for those sounds: Ч and Ш (Cyrillic also has Щ, which sounds like sz and cz chained together as in szczęście, but we don't consider that combination to be one letter or a tetragraph or anything like that). The digraphs also aren't a problem because the two letters never appear next to each other as separate sounds (IIRC; maybe there are some words and I can't think of them because I'm a native speaker): if you get s before a z, it's a sz sound. Because of this it's not really an exception but a rule. If you gave a Pole a piece of paper with just SZ or CZ or RZ on it, they'd make that one sound, not try to pronounce two letters.
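To make the "it's a rule, not an exception" point concrete, here is a rough sketch in Python of how that reading rule could be written down: go left to right and grab a digraph whenever one matches. The function name `split_graphemes` and the `DIGRAPHS` list are made up for illustration, and the digraph inventory is my assumption, not an official list. It also assumes the letters really never occur as two separate sounds, which, as said above, may not hold for every word (marznąć is one that usually gets cited, so treat this as a simplification).

```python
# Multi-letter graphemes treated as a single sound.
# Assumption: this inventory is illustrative, not an authoritative list.
DIGRAPHS = ("ch", "cz", "dz", "dź", "dż", "rz", "sz")


def split_graphemes(word: str) -> list[str]:
    """Greedy left-to-right split: take a digraph whenever one matches."""
    out = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair.lower() in DIGRAPHS:
            out.append(pair)      # two letters, one sound
            i += 2
        else:
            out.append(word[i])   # ordinary single letter
            i += 1
    return out


print(split_graphemes("szczęście"))  # ['sz', 'cz', 'ę', 'ś', 'c', 'i', 'e']
print(split_graphemes("chrząszcz"))  # ['ch', 'rz', 'ą', 'sz', 'cz']
```

No lookahead or word list is needed, which is roughly what "if you get s before a z, it's a sz sound" amounts to.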
Digraphs (in addition to j, w and y, leading to wafelek jagodowy by Ashens) might actually be one of the hardest things for learners, because people try to hack their way through Polish using their own language and slur the two letters of a digraph together, when the digraph is a different sound altogether from the letters that make it up, so it sounds very off. It depends on the digraph: rz sounds like ż, while sz and cz are like a soft or swishing s or c with nothing of a z in them. Sz and cz actually sound like sh in shoot and ch in chain from English, and those have nothing to do with that h either (then again, English h is sometimes silent, like in honor, and sometimes pronounced, like in holy). They are also not that easy to pronounce: one Polish tongue twister is "w Szczebrzeszynie chrząszcz brzmi w trzcinie", and I once heard an African immigrant to Poland say that when he first came here he felt like everyone was rustling and swishing at him all the time (which is actually accurate with regard to some diacritics). Then again, his Polish was really good (he could pass for a native if he wanted, IMO), so it's possible.
If not for digraphs then you could probably get away with learning pronunciation of each letter and then gluing/slurring those sounds together into a word. It's much easier than English where spelling is often only tangentially related to pronunciation and adding or taking away a letter somewhere else in the word can change pronunciation of other letters. There may be words that have something weird going on but I can't think of any right now (it might be my native bias though so be careful when hacking Polish that way).
Lots of people say that a spelling contest wouldn't work in their language, and that's true in Polish too, except for ó vs u, rz vs ż, ch vs h, and some cases where it's hard to say whether you heard ę or en, ą or om/on, or c or dz at the end of a word (and maybe something else I'm forgetting now). If you say something in Polish, a Pole can usually write it down on the first try without even thinking about it. E.g. there are a few niche jokes about two fake useless devices called bulbulator and przyczłapnik; these two words are never used except in those jokes, but their spelling makes 100% sense, and someone hearing the joke for the first time could write them down too, no problem.