Linux: setlocale: LC_ALL: cannot change locale (en_US.utf8) and Cyrillic symbols

By | 02/04/2021
 

A locale is a set of environment variables that determine how to display dates and times (for example, the first day of the week), which character encoding to use (for example, how to display Cyrillic symbols), the default order of files in the ls output, and so on.

Those variables are:

  • LANG: Determines the default locale in the absence of other locale related environment variables
  • LANGUAGE: List of fallback message translation languages
  • LC_CTYPE: Character classification and case conversion
  • LC_NUMERIC: Numeric formatting
  • LC_TIME: Date and time formats
  • LC_COLLATE: Collation (sort) order
  • LC_MONETARY: Monetary formatting
  • LC_MESSAGES: Format of interactive words and responses
  • LC_PAPER: Default paper size for region
  • LC_NAME: Name formats
  • LC_ADDRESS: Convention used for formatting of street or postal addresses
  • LC_TELEPHONE: Conventions used for representation of telephone numbers
  • LC_MEASUREMENT: Default measurement system used within the region
  • LC_IDENTIFICATION: Metadata about the locale information
  • LC_RESPONSE: Determines how responses (such as Yes and No) appear in the local language (used by Ubuntu, but not by Debian GNU/Linux)
  • LC_ALL: Overrides all other locale variables (except LANGUAGE)
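The precedence between these variables can be sketched as a small shell function. This is only an illustration of the usual glibc lookup order (LC_ALL, then the category variable, then LANG, then the built-in C locale); effective_locale is a hypothetical helper name, not a standard tool, and LANGUAGE is left out because it only affects message translations.

```shell
# effective_locale CATEGORY
# Print the locale name that would be used for CATEGORY (e.g. LC_TIME),
# assuming the usual lookup order: LC_ALL > LC_<category> > LANG > "C".
# Empty values are treated the same as unset ones.
effective_locale() {
    category=$1
    # Read the variable whose name is stored in $category.
    eval "specific=\${$category:-}"
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "$specific" ]; then
        printf '%s\n' "$specific"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf 'C\n'
    fi
}
```

For example, effective_locale LC_TIME shows which value wins for date and time formatting in the current shell; the locale command without arguments prints the same information for all categories at once.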

Locale and Cyrillic symbols

For example, when running the vifm file manager in KDE Konsole on my Arch Linux, Cyrillic symbols are not displayed correctly:

And after exiting vifm, the terminal prints a warning:

[simterm]

$ vifm
/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.utf8)

[/simterm]

Generate Locale

Check the /usr/lib/locale/ directory, where generated locales are stored. It is empty now:

[simterm]

$ ll /usr/lib/locale/
total 0

[/simterm]

Or run locale -a to list the locales that are currently available on the system:

[simterm]

$ locale -a        
C
POSIX

[/simterm]
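Note that the warning spells the locale as en_US.utf8, while configuration files usually spell it en_US.UTF-8; glibc treats both as the same name. The check above can be scripted with a sketch like the following. The helper names normalize_locale_name and locale_is_available are hypothetical, and the normalization (drop non-alphanumeric characters from the codeset and lowercase it) only approximates what glibc does internally.

```shell
# normalize_locale_name NAME
# Rewrite the codeset part of a locale name the way `locale -a` prints it:
# en_US.UTF-8 -> en_US.utf8. Names without a codeset are printed unchanged.
normalize_locale_name() {
    name=$1
    case $name in
        *.*) ;;                                  # has a codeset, handled below
        *) printf '%s\n' "$name"; return ;;
    esac
    base=${name%%.*}                             # e.g. en_US
    rest=${name#*.}                              # e.g. UTF-8 or UTF-8@modifier
    modifier=
    case $rest in
        *@*) modifier=@${rest#*@}; rest=${rest%%@*} ;;
    esac
    codeset=$(printf '%s' "$rest" | tr -d -c '[:alnum:]' | tr '[:upper:]' '[:lower:]')
    printf '%s.%s%s\n' "$base" "$codeset" "$modifier"
}

# locale_is_available NAME
# Read a list of locale names on stdin (one per line, e.g. from `locale -a`)
# and succeed if NAME is among them after normalization.
locale_is_available() {
    wanted=$(normalize_locale_name "$1")
    while IFS= read -r line; do
        [ "$(normalize_locale_name "$line")" = "$wanted" ] && return 0
    done
    return 1
}

# Usage:
#   locale -a | locale_is_available en_US.UTF-8 || echo "en_US.UTF-8 is not generated"
```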

Now, to add a new locale, edit the /etc/locale.gen file. Find en_US.UTF-8 there:

[simterm]

$ cat /etc/locale.gen | grep en_US.UTF-8
#  en_US.UTF-8 UTF-8
#en_US.UTF-8 UTF-8

[/simterm]

Uncomment the en_US.UTF-8 UTF-8 line and run the locale generator:

[simterm]

$ sudo locale-gen
/bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.utf8)
Generating locales...
  en_US.UTF-8... done
Generation complete.

[/simterm]
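If you prefer not to edit the file by hand, the uncommenting step can be scripted. The following is a minimal sketch: enable_locale is a hypothetical helper, and it assumes the entry is commented out with a single # and no space after it, as in the #en_US.UTF-8 UTF-8 line above. Lines that merely mention the locale in the file's explanatory header (such as "#  en_US.UTF-8 UTF-8") are left alone.

```shell
# enable_locale FILE ENTRY
# Uncomment the line "#ENTRY" in FILE (trailing whitespace is ignored).
# All other lines are copied unchanged.
enable_locale() {
    file=$1
    entry=$2
    tmp=$(mktemp) || return 1
    while IFS= read -r line || [ -n "$line" ]; do
        # Strip trailing whitespace before comparing.
        trimmed=${line%"${line##*[![:space:]]}"}
        if [ "$trimmed" = "#$entry" ]; then
            printf '%s\n' "$entry"
        else
            printf '%s\n' "$line"
        fi
    done < "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage (as root):
#   enable_locale /etc/locale.gen 'en_US.UTF-8 UTF-8' && locale-gen
```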

Check the directory again:

[simterm]

$ file /usr/lib/locale/locale-archive 
/usr/lib/locale/locale-archive: locale archive 11 strings

[/simterm]

And locale -a:

[simterm]

$ locale -a        
C
en_US.utf8
POSIX

[/simterm]

Or by using the localedef utility:

[simterm]

$ localedef --list-archive
en_US.utf8

[/simterm]

And run vifm again:

Locale parameters

You can check the parameters of a specific locale category by calling locale -k:

[simterm]

$ locale -k LC_TIME 
abday="Sun;Mon;Tue;Wed;Thu;Fri;Sat"
day="Sunday;Monday;Tuesday;Wednesday;Thursday;Friday;Saturday"
...
first_weekday=1
first_workday=2

[/simterm]

Each locale is described in a corresponding file in the /usr/share/i18n/locales/ directory, which locale-gen uses as its source. For example, the LC_TIME section of en_US is defined as:

LC_TIME
abday   "Sun";"Mon";"Tue";"Wed";"Thu";"Fri";"Sat"
day     "Sunday";/
        "Monday";/
        "Tuesday";/
        "Wednesday";/
        "Thursday";/
        "Friday";/
        "Saturday"

week 7;19971130;1
abmon   "Jan";"Feb";/
        "Mar";"Apr";/
        "May";"Jun";/
        "Jul";"Aug";/
        "Sep";"Oct";/
        "Nov";"Dec"
mon     "January";/
        "February";/
        "March";/
        "April";/
        "May";/
        "June";/
        "July";/
        "August";/
        "September";/
        "October";/
        "November";/
        "December"
% Appropriate date and time representation (%c)
d_t_fmt "%a %d %b %Y %r %Z"
%
% Appropriate date representation (%x)
d_fmt   "%m//%d//%Y"
%
% Appropriate time representation (%X)
t_fmt   "%r"
%
% Appropriate AM/PM time representation (%r)
t_fmt_ampm "%I:%M:%S %p"
%
% Appropriate date and time representation for date(1).  This is
% different from d_t_fmt for historical reasons and has been different
% since 2000 when date_fmt was added as a GNU extension.  At the end
% of 2018 it was adjusted to use 12H time (bug 24046) instead of 24H.
date_fmt "%a %b %e %r %Z %Y"
%
% Strings for AM/PM
%
am_pm   "AM";"PM"
END LC_TIME

And the LC_TIME section of ru_RU has the first_weekday option set (see man 5 locale), which specifies the first day of the week: Monday, which is second in the day list:

...
week 7;19971130;1
first_weekday 2
END LC_TIME
...
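To see which day a given first_weekday value points to, you can combine it with the day list from the locale -k LC_TIME output. The function below is a small sketch of that lookup; first_weekday_name is a hypothetical helper, and it assumes the output contains day="..." and first_weekday=N lines in the format shown earlier.

```shell
# first_weekday_name
# Read `locale -k LC_TIME` output on stdin and print the name of the day
# that first_weekday refers to (1 = first entry of the day list).
first_weekday_name() {
    days=
    index=
    while IFS= read -r line; do
        case $line in
            day=*)
                days=${line#day=}
                days=${days#\"}          # drop the surrounding quotes
                days=${days%\"}
                ;;
            first_weekday=*)
                index=${line#first_weekday=}
                ;;
        esac
    done
    [ -n "$days" ] && [ -n "$index" ] || return 1
    # The day list is separated by semicolons; pick the index-th field.
    printf '%s\n' "$days" | cut -d ';' -f "$index"
}

# Usage (requires the ru_RU.UTF-8 locale to be generated):
#   LC_TIME=ru_RU.UTF-8 locale -k LC_TIME | first_weekday_name
```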

To set Monday as the first day of the week, first uncomment ru_RU.UTF-8 in /etc/locale.gen and run locale-gen, then set LC_TIME to ru_RU.UTF-8 in the /etc/locale.conf file:

LANG=en_US.UTF-8
LC_TIME=ru_RU.UTF-8

Or by using the localectl tool:

[simterm]

$ localectl set-locale LC_TIME=ru_RU.UTF-8

[/simterm]
