Value of the HTML5 lang attribute
- by user359650
I'm working on a website which will offer localized content following the language+region approach as described on this W3.org page (e.g. fr-CA for Canadian French content, and fr-FR for "French French" content). As we consider content for each language+region to be unique, it is crucial to us that search engines properly identify and serve the content accordingly.
From what I found online (e.g. this question), most people recommend using an ISO 639 language code in the HTML lang attribute to describe the content language. Following this recommendation, we would end up using <html lang="fr">, which would not differentiate between the aforementioned language+region combinations.
When reviewing the HTML4 specification, it seems that using language+region as a language code would be perfectly OK, as the en-US example is given as one possible value. However, I couldn't find any confirmation of this in the HTML5 specification, which doesn't seem to provide any examples of allowed values.
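To make the question concrete, here is a sketch of the markup we would like to use for the Canadian French pages. It assumes HTML5 accepts a language+region value in lang the way the HTML4 en-US example suggests, which is exactly what I have not been able to confirm. The title and body content are placeholders.

```html
<!-- Hypothetical markup for the Canadian French version of a page. -->
<!-- The "French French" version would be identical except for lang="fr-FR". -->
<!DOCTYPE html>
<html lang="fr-CA">
  <head>
    <meta charset="utf-8">
    <title>Exemple</title>
  </head>
  <body>
    <!-- Region-specific wording: "courriel" in fr-CA -->
    <p>courriel</p>
  </body>
</html>
```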
From there I tried to get a de facto answer by looking at what the web giants are doing. I looked at Facebook: they offer Canadian French and French French versions of their website with (slightly) different content, whilst the HTML lang value remains the same:
fr-CA
URL: http://fr-ca.facebook.com
HTML lang attribute: <html lang="fr">
translation of the word 'email': courriel
fr-FR
URL: http://fr-fr.facebook.com/
HTML lang attribute: <html lang="fr">
translation of the word 'email': Adresse électronique
Q: What is the recommended/standard way of describing content that was localized using the language+region approach in HTML5?