
Windows10 Insider Preview Build 17035 Supports UTF-8 as ANSI - matarillo
https://twitter.com/matarillo/status/931050853347110912
======
captainmuon
There is a new checkbox in the legacy control panel under Region /
Administrative / Change system locale which says something like: "Beta: use
UTF-8 to support global languages" (I have the German version so I'm not sure
what the English label is.)

I really wonder what that does. This can't affect the Win32 wide API; e.g.
GetWindowTextW will still return wide characters (UTF-16 / sometimes UCS-2).
Probably this sets the system codepage by default to 65001, so that if you
request a string "in codepage format" it will return UTF-8.
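
To make the wide-versus-codepage distinction concrete, here is a minimal Python sketch (plain string encoding, not actual Win32 calls). It shows the same text as the UTF-16LE bytes a W function deals in and as the UTF-8 bytes a conversion to codepage 65001 would presumably yield. The sample string and variable names are made up for illustration.

```python
# Illustration only: the sample text and the names below are hypothetical.
text = "Grüße"

# What a wide (W) API works with: UTF-16 code units, little-endian on Windows.
wide_bytes = text.encode("utf-16-le")

# What a conversion to codepage 65001 (UTF-8) would be expected to produce.
utf8_bytes = text.encode("utf-8")

print(len(wide_bytes), wide_bytes.hex())
print(len(utf8_bytes), utf8_bytes.hex())
```

Both byte strings decode back to the same text; only the in-memory representation differs, which is why a codepage setting alone would not change what the W functions return.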

~~~
badsectoracula
I hope that this means that Microsoft finally made A functions (like
GetWindowTextA) be able to work directly with UTF-8. If this is the case and
it is possible for a process to say "give me UTF-8 regardless of global
codepage settings" then this can help a lot with portability since every other
C/C++ GUI uses UTF-8.

It is kind of way too late, but better late than never, I suppose.

~~~
ygra
> every other C/C++ GUI uses UTF-8

Qt uses UTF-16, too, AFAIK.

------
rossy
> _The return value of GetACP() is also 65001_

Wow. I think that implies all the ANSI Windows APIs will be able to use UTF-8
as well. Native UTF-8 support is something that Windows developers have wanted
for years. I thought it was impossible because of some API limitation where it
assumed multibyte characters could only be three bytes long, but maybe this
assumption has been removed.
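
Whatever the exact limit was, a small Python sketch shows why a three-byte cap would matter: characters outside the Basic Multilingual Plane need four bytes in UTF-8 (and a surrogate pair in UTF-16). The helper names are hypothetical and this only measures encoded lengths; it says nothing about how Windows handles them internally.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes the single character ch occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def utf16_units(ch: str) -> int:
    """Number of 16-bit code units ch occupies in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2


# ASCII, Latin-1 range, CJK (still in the BMP), and an emoji outside the BMP.
for ch in ["A", "\u00fc", "\u65e5", "\U0001F600"]:
    print(f"U+{ord(ch):04X}: {utf8_len(ch)} UTF-8 bytes, "
          f"{utf16_units(ch)} UTF-16 units")
```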

EDIT: So I just tested this. It does seem to affect the ANSI Windows APIs:
[https://0x0.st/siCS.png](https://0x0.st/siCS.png)

Unfortunately, this doesn't mean the average Win32 program will be able to
start using UTF-8 internally, since the ANSI codepage is a system-wide
setting, so you won't be able to opt in to UTF-8 per process (at least, I
don't think this is possible).

------
bni
Will notepad.exe now also support unix line endings?

~~~
porfirium
This would be really neat. Thing is, as far as I remember, notepad.exe uses a
standard Windows control to display text, so that change would have to be
system-wide as well, which could break backwards compatibility.

------
daemin
This looks like just Notepad.exe supporting UTF-8 without having a byte order
mark at the start of the file.
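
For reference, the UTF-8 byte order mark is the three bytes EF BB BF at the start of a file. A rough Python sketch of reading UTF-8 text with or without it (function names are hypothetical, and this is not how Notepad itself detects encodings):

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """True if data starts with the UTF-8 byte order mark (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def decode_utf8(data: bytes) -> str:
    """Decode UTF-8 bytes, dropping a leading BOM if one is present."""
    # The "utf-8-sig" codec strips a leading BOM and otherwise acts like "utf-8".
    return data.decode("utf-8-sig")


payload = "h\u00e9llo".encode("utf-8")
print(has_utf8_bom(codecs.BOM_UTF8 + payload), has_utf8_bom(payload))
print(decode_utf8(codecs.BOM_UTF8 + payload) == decode_utf8(payload))
```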

~~~
rossy
It should affect more than just Notepad. One of the replies links to more
juicy details:
[https://srad.jp/story/17/11/14/0640253/](https://srad.jp/story/17/11/14/0640253/)

------
gigatexal
so will things like SQL Server output logs in UTF-8 instead of UTF-16 and
UTF-16LE?

~~~
ygra
Probably not. It might just let applications that never bothered to use the
Unicode Windows APIs support Unicode. And there are still a lot of them. In
the past you could use CP65001 in some places to get sort-of UTF-8 support,
e.g. in batch files. However, there have been quite a few problems and bugs
with that approach. Maybe they've just fixed those and added an option to use
CP65001 as the legacy codepage.
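
A rough Python sketch of what that would buy such applications: text pushed through a single-byte legacy codepage loses anything the codepage can't represent, while the same round trip through UTF-8 is lossless. The `roundtrip` helper and the sample string are hypothetical, and the `?` substitution is only an approximation of how lossy codepage conversions tend to behave.

```python
def roundtrip(text: str, codepage: str) -> str:
    """Encode text to the given codepage and decode it back.

    Characters the codepage cannot represent are replaced with '?',
    as an assumption about typical lossy-conversion behavior.
    """
    return text.encode(codepage, errors="replace").decode(codepage)


sample = "na\u00efve \u65e5\u672c\u8a9e"  # Latin text plus three kanji

print(roundtrip(sample, "cp1252"))  # kanji are lost in Windows-1252
print(roundtrip(sample, "utf-8"))   # everything survives
```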

------
yuhong
Of course, this will probably break VB6 apps for example and maybe even VBA.
Ideally this should have been done when NT 3.5 was released in 1994, but then
it might have been limited to 3 byte UTF-8.

