It’s unfortunate but true that getting universal support for the UTF-8 encoding on Linux is a bit of a pain. Here are some notes on how to get most of the way there.
- Ensure that the locale data that comes with GNU libc is installed. This comes as part of the glibc-common package on Red Hat or Fedora and the locales package on Debian.
- Ensure that your locale uses the UTF-8 character set. This can be done system-wide by setting the LANG parameter in /etc/sysconfig/i18n on Red Hat and derivatives. An alternative is to set the LANG and LANGUAGE environment variables in /etc/profile, /etc/csh.cshrc, /etc/zshenv and so on.
- Set the appropriate encoding in any display manager if need be, e.g. /etc/gdm/gdm.conf.
- For a user-specific setting, set the environment variables LANG and LANGUAGE to something like en_US.UTF-8 in your .profile, .bash_profile, .cshrc, .zshrc or similar. (LANG is honored by glibc-based applications in general, while LANGUAGE is a GNU gettext extension that GNOME and other gettext-based applications consult when choosing message translations.)
- If you’re stuck on a system where the administrator hasn’t installed locale information, obtain it from a machine where it is installed; look under /usr/lib/locale and /usr/share/locale. Copy these items to, say, ${HOME}/locale/lib/locale and ${HOME}/locale/share/locale respectively, and then set and export the environment variables LOCPATH=${HOME}/locale/lib/locale and NLSPATH=${HOME}/locale/share/locale/%L/LC_MESSAGES/%N. Keep in mind that login-only startup files are not read by the non-login shells that most terminal emulators start, so you will probably want to place these definitions in .bashrc rather than .bash_profile, .cshrc rather than .login, or .zshrc rather than .zlogin.
- To switch the Linux console over to UTF-8 mode, place the command unicode_start in /etc/rc.d/rc.local or /etc/rc.local or an initscript in /etc/rcS.d or similar.
- To switch xterm over to UTF-8, place XTerm*locale: UTF-8 in your ${HOME}/.Xresources or ${HOME}/.Xdefaults or similar.
- gnome-terminal and konsole both have settings in their configuration dialog boxes to set them for UTF-8 operation.
- For screen, you need to set its default encoding to UTF-8 and also tell it that the terminal is capable of UTF-8. Normally this is done by running it as screen -U. For convenience, you can instead add the lines below to your .screenrc, but keep in mind that if you later start screen from a non-Unicode terminal, screen will still assume the terminal is Unicode-capable, with poor results.
defencoding utf8
defutf8 on
screen
utf8 on on
encoding utf8
- For irssi, you can /set term_type utf8 and then /save. You will need to restart irssi to get Unicode support.
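After working through the list above, it can be handy to check what locale a shell actually ended up with. The following is only a rough sketch: the function names names_utf8_codeset and effective_ctype are made up for illustration, and the check looks at the locale name alone, so it says nothing about whether the locale data is really installed.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a locale name such as en_US.UTF-8
# carries a UTF-8 codeset suffix. This is purely a string check; it
# does not verify that the locale is installed.
names_utf8_codeset() {
    case "$1" in
        *.[Uu][Tt][Ff]-8|*.[Uu][Tt][Ff]8) return 0 ;;
        *.[Uu][Tt][Ff]-8@*|*.[Uu][Tt][Ff]8@*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print the locale name that governs character
# handling. LC_ALL overrides LC_CTYPE, which overrides LANG; an empty
# value counts as unset.
effective_ctype() {
    printf '%s\n' "${LC_ALL:-${LC_CTYPE:-${LANG:-C}}}"
}

if names_utf8_codeset "$(effective_ctype)"; then
    echo "Locale $(effective_ctype) names a UTF-8 codeset."
else
    echo "Locale $(effective_ctype) does not name a UTF-8 codeset."
fi

# The same test could guard the screen -U invocation mentioned above,
# so that screen is only told the terminal is UTF-8 capable when the
# locale says so (left as a comment so the sketch launches nothing):
#   if names_utf8_codeset "$(effective_ctype)"; then screen -U; else screen; fi
```

Running locale charmap is a more direct check on systems where the locale utility is available, since it reports the codeset the C library actually selected.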
Beyond this point, settings are mostly application-specific.
I like my shells to do tab completion in a particular way that I’ve only seen a few shells get right (tcsh and zsh come to mind). In particular, I’ve always found bash’s readline-based completion incredibly annoying because its default responses are inconsistent. If there is a single completion available, it completes as much as possible and then stops. If there are multiple completions, it does nothing but beep (annoying in itself, since you were expecting completions and got nothing useful), and you need to press Tab a second time before you get useful information.
The way to get the sane behavior back is to create an .inputrc:
set completion-ignore-case On
set match-hidden-files On
set show-all-if-ambiguous On
set show-all-if-unmodified On
set bell-style visible
This not only silences the ever-annoying bell, but also makes it so that a single tab will complete as much as it can; if there’s no more to complete, it’ll list the available completions instead of sitting there annoying you, waiting for a second tab.
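If you’d rather not edit the file by hand on every new machine, the settings above can be appended by a small script. This is a sketch only: ensure_line and apply_inputrc_settings are names invented here, and the script assumes the target file, if it exists, ends with a newline.

```shell
#!/bin/sh
# Hypothetical helper: append one exact line to a file unless it is
# already present, so rerunning the script does not duplicate it.
ensure_line() {
    file=$1
    line=$2
    [ -e "$file" ] || : > "$file"
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Hypothetical helper: write the readline settings listed above into
# the inputrc file given as the first argument.
apply_inputrc_settings() {
    for setting in \
        'set completion-ignore-case On' \
        'set match-hidden-files On' \
        'set show-all-if-ambiguous On' \
        'set show-all-if-unmodified On' \
        'set bell-style visible'
    do
        ensure_line "$1" "$setting"
    done
}

# Example (commented out so that running the sketch changes nothing):
#   apply_inputrc_settings "${INPUTRC:-${HOME}/.inputrc}"
```

Readline consults the INPUTRC environment variable before falling back to ~/.inputrc, which is why the example uses that default. In a running bash, bind -f ~/.inputrc should pick up the new settings without starting a new shell.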
Additionally, I like a few more settings in the .bashrc:
# Make it so that users outside the group to which your user belongs
# won't be able to open your files.
umask 027
# A color scheme for GNU ls I like that actually doesn't look like complete crap
# on a dark background (yellow instead of blue for directories, for instance).
# Double quotes are needed so that each backslash-newline joins the lines.
LS_COLORS="no=00:fi=00:di=33:ln=36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=\
01;05;37;41:mi=01;05;37;41:ex=01;32:*.cmd=01;32:*.exe=01;32:*.com=01;32:*.btm=01\
;32:*.bat=01;32:*.sh=01;32:*.csh=01;32:*.tar=35:*.tgz=32:*.arj=32:*.taz=32:*.lzh\
=32:*.zip=32:*.z=32:*.Z=32:*.gz=32:*.bz2=32:*.bz=32:*.tz=32:*.rpm=32:*.cpio=32:*\
.jpg=35:*.gif=35:*.bmp=35:*.xbm=35:*.xpm=35:*.png=35:*.tif=35:*.jpeg=35:"
export LS_COLORS
if [ "${PS1}" ]; then
.
.
.
# This enables some spell-checking when doing a cd
shopt -s cdspell
# This makes bash check that a hashed command still exists before running it
shopt -s checkhash
# This improves the capabilities of filename globbing
shopt -s extglob
# This will complete hostnames(!) after a @. Useful for ssh and scp.
shopt -s hostcomplete
# If you don't want to leave a trail of commands you've typed, do this.
# It'll prevent .bash_history from being updated.
unset HISTFILE
# I like to have a large history of commands I've typed available.
HISTSIZE=4096
# This bit of logic will set a prompt. It'll see if the terminal is
# one that can have the titlebar set; if so, it'll set the prompt
# to update the titlebar with the current hostname and username.
# Each PS1 assignment must stay on a single line.
PS1="[ \[\e[36;1m\]\u\[\e[m\] @ \[\e[32;1;4m\]\h\[\e[m\] | \[\e[33;1m\]\w\[\e[m\] ] "
case "${TERM}" in
xterm|xterm-color|rxvt|screen)
# Prepend an escape sequence that sets the titlebar to "hostname (username)".
PS1="\[\e]0;\h (\u)\a\]${PS1}"
;;
esac
fi
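The titlebar trick in the prompt logic above relies on an escape sequence that xterm-style terminals interpret as “set the window title”: ESC ] 0 ; followed by the text and a BEL character. Here is a minimal sketch of the same idea outside of PS1; the function names term_has_titlebar and set_title are made up for illustration, and the list of terminal types is simply the one used in the prompt.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the terminal type is one of those
# the prompt logic treats as able to set a titlebar.
term_has_titlebar() {
    case "$1" in
        xterm|xterm-color|rxvt|screen) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: emit ESC ] 0 ; <text> BEL, which asks the
# terminal to use <text> as its icon name and window title.
set_title() {
    printf '\033]0;%s\007' "$1"
}

# Mirror what the prompt does: "hostname (username)" in the titlebar,
# but only on terminals from the list above.
if term_has_titlebar "${TERM:-dumb}"; then
    set_title "$(uname -n) ($(id -un))"
fi
```

Inside PS1 the same sequence is wrapped in \[ and \] so that bash does not count its bytes when working out the width of the prompt.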
In all, it makes the bash experience a lot more palatable.
I recently moved to a two-bedroom place that’s a whole lot better than where I’d been staying. However, I haven’t got any furniture yet, save for a desk and a chair, so I’ve taken to sleeping on my comforter for now. Hopefully the situation will be resolved sometime this weekend. I also need to get myself a fridge and probably a dining table with a few chairs, and I need to get my phone line transferred over so I have Internet access at home. There’s also a poster on the bulletin board on the ground floor for a cable modem service that’s actually much cheaper than what I pay now for telephone service plus dialup; I might just drop the phone line and get that instead if the service proves to be decent. (“Broadband” here is synonymous with “always on”; it usually doesn’t say anything about the bandwidth available. Most “broadband” here happens to be 64 kbit/s service, which is hardly an improvement over dialup. And yes, it’s also metered, usually by quantity transferred and frequently by time spent connected.)
We’ll see how things go in the coming weeks. I’m considering applying to immigrate to Canada. Evidently it’s a bit of a pain, but it’s certainly easier than the procedure in other countries. One of the things I need to do is get no-objection certificates from the police in every jurisdiction where I’ve stayed for longer than six months over the past ten years. This means I need certificates from at least Dubai Police, Bangalore Police, Pittsburgh Police and Rockville Police.