
CSS for Internationalisation - polm23
https://www.chenhuijing.com/blog/css-for-i18n/
======
strogonoff
Glad to see vertical writing covered. Wish it covered Ruby styling[0][1] and
markup[2] as well.

[0] MDN reference: https://developer.mozilla.org/en-US/docs/Web/CSS/ruby-position

[1] W3C in-depth article:
https://w3c.github.io/i18n-drafts/articles/ruby/styling.en.html#ja_variants

[2] https://www.w3.org/International/articles/ruby/markup.en

~~~
simongray
The author covers it in another article (2016):
https://www.chenhuijing.com/blog/html-ruby/#%F0%9F%9A%B2

~~~
strogonoff
Thanks for linking! Vertical writing on the web is of huge interest to me.

------
chewxy
Sorry to be nitpicky, but this is wrong:

> Have you ever wondered how Chrome knows to ask you if you’d like a web
> page’s content to be translated? No? Okay, maybe it’s just me then. But it’s
> because of the lang attribute on the <html> element.

The lang attribute is only a signal to whatever's inside Chrome that detects
language. Here's a simple counterexample:
https://twitter.com/chewxy/status/1253076770066010112

The image is an SVG generated by graphviz. With no lang attribute (or even an
HTML tag), Chrome still thinks it's Luxembourgish.

Nonetheless, I appreciate the tip on the :lang() pseudo-class.

~~~
saagarjha
Does Chrome ever _ignore_ the lang attribute?

~~~
bryanrasmussen
I don't know if it is still the case, but I seem to recall that in the old
days they would check, and if their model strongly predicted a language
different from the one you were claiming, they would go with the language
they predicted.

Even if they no longer do this, I think it makes total sense and is what I
would do. Programmers make mistakes, and if your page claims to be English
but the text has a high chance of being Armenian and a low chance of being
English, I would treat it as Armenian.

~~~
qilo
Google's crawler apparently ignores the "lang" attribute [0]:

> Google uses the visible content of your page to determine its language. We
> don’t use any code-level language information such as lang attributes, or
> the URL.

But what about the Chrome browser?

[0] https://support.google.com/webmasters/answer/182192?hl=en

------
jzer0cool
I learned a lot from the content, in case I need to do more localization/i18n
in the future.

Nice Easter egg: first time I've seen an emoji used in a URL fragment. It also
loads a new one.

Edit: _As an idea_ to share, I wonder, given the topic, whether different
emoji can be localized too. An emoji can mean something different depending
on the country.

------
robin_reala
One thing that’s been omitted is the :dir() pseudo-class[1], which has been
inexplicably ignored by WebKit and Blink. Yeah, it can sort of be replicated
by a descendant selector, but it’d be so much more obvious and self-contained
to select an element based on its own calculated text direction, something
that’s currently only possible in Firefox.
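
To illustrate, a rough sketch (the `.notice` class and the padding value are made up for the example):

```css
/* Approximation that works in every browser: match on an explicit dir
   attribute, either on the element itself or on an ancestor. It misses
   dir="auto" and wrongly matches a .notice inside a nested dir="ltr" block. */
.notice[dir="rtl"],
[dir="rtl"] .notice {
  padding-right: 1em;
}

/* With :dir(), the selector tests the element's own resolved directionality. */
.notice:dir(rtl) {
  padding-right: 1em;
}
```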

[1] https://www.w3.org/TR/selectors/#the-dir-pseudo

------
memco
The article touches on having properties like borders and margins that
accommodate the language, but all the examples are manually calculated. I
recently saw a talk on YouTube that mentioned that there is (or is coming?)
support for a margin-start/end kind of syntax that will let browsers handle
re-orientation of box properties depending on the language. Sadly, I can't
find the talk again. It was by a pair of people from the Chrome and Edge teams
giving an update on new features. Browser support and awareness of these
patterns will take time to catch up, but they should remove the need for many
of these manual considerations, which means that support for
internationalization should improve in the coming years.

~~~
chrismorgan
I’m confused. The feature you describe is logical properties, which is exactly
what the section you’re talking about is about. I don’t know what you mean by
“manually calculated”.
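
For reference, a minimal sketch of what logical properties look like (the `.card` selector and the values are just for illustration):

```css
.card {
  /* Space before the start of a line: the left side in English,
     the right side in Arabic or Hebrew. */
  margin-inline-start: 1rem;
  /* Space on both block-axis edges: top and bottom in horizontal writing. */
  padding-block: 0.5rem;
}
```

The browser resolves these to physical sides from the element's `direction` and `writing-mode`, so nothing has to be recalculated per language.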

~~~
memco
Indeed! I guess I was confused by the organization of the example[0] for that
section. The physical section being first obscures all the examples of the
logical block style so all you see is variations of:

    border-top-color: tomato;
    border-right-color: limegreen;
    border-bottom-color: dodgerblue;
    border-left-color: gold;

Looking back, my comment came from my own inability to see the simplicity of
the logical properties as presented, compared with other demonstrations I've
seen, and from realizing that I had lost track of a resource that I think
would be a helpful reference for myself as this new method becomes
commonplace.
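
For my own future reference, here is how I understand the physical colors above would map to logical properties, assuming the default horizontal, left-to-right writing mode (the `.box` selector is a placeholder):

```css
.box {
  border-block-start-color: tomato;     /* top in horizontal-tb */
  border-inline-end-color: limegreen;   /* right in LTR */
  border-block-end-color: dodgerblue;   /* bottom in horizontal-tb */
  border-inline-start-color: gold;      /* left in LTR */
}
```

Switch the element to `direction: rtl` or a vertical `writing-mode` and the colors follow the text flow instead of staying pinned to physical sides.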

[0]: https://codepen.io/huijing/pen/XWmKByZ

------
jbverschoor
I was actually expecting to see i18n translations with lang. Something like:

    .i18n-hello:lang(en) {
      content: "Hello";
    }

    .i18n-hello:lang(nl) {
      content: "Hallo";
    }

    <span class="i18n-hello">hello label</span>

That doesn't work, so you'd have to use ::after, but that requires some more
verbose styling.
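
A minimal sketch of the ::after variant, assuming the `lang` attribute is set somewhere up the tree and the span is left empty (same class name as in the snippet above):

```css
/* :lang() takes a bare language tag and matches on the inherited language. */
.i18n-hello:lang(en)::after {
  content: "Hello";
}

.i18n-hello:lang(nl)::after {
  content: "Hallo";
}
```

With `<html lang="nl">` and `<span class="i18n-hello"></span>`, the span should render "Hallo".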

~~~
angrais
But this mixes content and style, so it would be difficult to update (for
non-technical users etc.). Besides, it's fairly bad practice and won't scale
well to sentences.

------
seanwilson
> Have you ever wondered how Chrome knows to ask you if you’d like a web
> page’s content to be translated?

How difficult is it to automatically detect the language? Probably naive but
how far can you get by counting how many common words on the page are from a
particular language (e.g. "the of then because I" for English and "je pour des
les" for French)?

~~~
mjlee
Google Translate already has an auto-detect language feature.

I suspect it would be reasonably simple to add a "Translate this" button
somewhere in Chrome (perhaps it's already there, I don't have Chrome installed
on this machine.)

Perhaps automatically doing it for every site would be a little bit intrusive
on the privacy front.

~~~
seanwilson
> Google Translate already has an auto-detect language feature.

Yep, I'm asking how hard it is to build a quick and simple auto-detect
feature. Is it harder than it looks?

~~~
mjlee
It kind of depends what you mean by build. From scratch - I have no idea. For
Google Chrome? It's just an API call away:
https://cloud.google.com/translate/docs/basic/detecting-language

------
amelius
I just wish CSS offered a simple way to fix the aspect ratio of an arbitrary
element.

~~~
frosted-flakes
That's coming! It's not implemented in any browsers yet, but it should be
soon. It was introduced in "CSS Intrinsic & Extrinsic Sizing Module Level 4".

It will use the property "aspect-ratio" with values like "16/9" or "1/1".
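
Going by the draft, usage would look something like this (the `.video-wrapper` class is hypothetical, and the syntax may still change before browsers ship it):

```css
.video-wrapper {
  width: 100%;
  /* Height is derived from the width to keep a 16:9 box. */
  aspect-ratio: 16 / 9;
}
```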

MDN: https://developer.mozilla.org/en-US/docs/Web/CSS/aspect-ratio

Article about it by Rachel Andrew in Smashing Magazine:
https://www.smashingmagazine.com/2019/03/aspect-ratio-unit-css/

~~~
Tomte
You can set the width and height attributes (HTML attributes, not CSS
properties) today.

Recent Chrome and Firefox use that to determine aspect ratio for jank-free
loading.

~~~
frosted-flakes
For images and iframes, but I don't think that works for any element. In any
case, hardcoding that kind of information in the HTML is not ideal.

------
cryptonector
Very good writing, and very informative. I especially enjoyed the gif of
traditional vs. simplified Han.

------
jzer0cool
In modern browsers, whether or not lang is provided, the browser is smart
enough to pick the correct encoding and glyphs.

In the old days, browsers were not that smart (i.e. they would still pick the
wrong charset). You may have come across, for example, an Asian site
(Japanese, Korean, Chinese) and seen a lot of text rendered as "?"s and "□"s,
like so: ??? □□□. This was common in the '90s and still through the early
2000s. The charset options were readily available in the browser's menu for
you to match by hand, but I'm glad you no longer have to spend time doing
that.

I think the lang attribute is useful here so you can tell the browser
explicitly. And if you are designing the page, it gives you more control over
localization, since you can target content by its language code.

~~~
Flimm
I think you are confusing language (English, French, Arabic, etc) with
encoding (ASCII, UTF-8, UTF-16, Latin1, etc).

You do sometimes see mojibake in web pages (question marks and □□ in place of
the real text). It is caused by incorrect encodings. The web server tells
the browser which encoding to use with the HTTP Content-Type header or with
the <meta charset="UTF-8"> HTML element. You should always set the encoding
rather than relying on the browser to guess.

~~~
jzer0cool
I was referencing Asian languages as one set of examples for encoding. Of
course setting the encoding is important, whether in HTML or in a program
dealing with strings. (I've had my share of hard bugs that turned out to be
exactly that, but that's another story!)

Here I was just sharing my experience of browsers rendering ?? and □□
marks because they did not know the encoding. The browser had an "Auto
Detect" charset mode, so you always had to toggle it (and remember to revert
when viewing another page). That is exactly why, as you say, you "should
always set" the encoding, but in those days (and maybe even today) it was not
always set.

Again, just sharing a memory.

