Friday, April 20, 2012

Fun with UTF-8

I finally decided to configure my computers for UTF-8, mostly due to the frustration of attempting to determine which Finnish character should have been displayed (and the probable annoyance of Finns having their names mangled by my mail client.)
  • /etc/locale.gen is useful for generating only the locales you actually need. I selected English (US and UK), Finnish, and Swedish as UTF-8, ISO-8859-1, and ISO-8859-15 (where appropriate.)
  • .bashrc: export LANG=en_US.UTF-8
  • Setup .Xdefaults:
XTerm*Background: Black
XTerm*Foreground: White
XTerm*UTF8: 2
XTerm*utf8Fonts: 2
Actually the UTF-8 options are not required due to the way I am launching terminals from my window manager, but it does not cause any damage. Black terminal background is mandatory!

The fun part was figuring out why my xterm, specified in .ratpoisonrc as bind c exec xterm -u8 did not launch a UTF-8 terminal. I then realized that Ratpoison was launching xterm from a bare-metal shell and the LANG environment variable was never set; quickly changing the binding to bind c exec xterm +lc -u8 (switch off automatic selection of encoding and respect my -u8 option) resolved the final issue.

Now I can happily read the ä's, ö's, and Å's from a language I do not understand (excluding a few keywords and phrases.) Cool!

No comments:

Post a Comment