Archives for linux

Restoring the old Mu4e message view look and feel

October 30th, 2022

I’ve used mu4e as an email client for several years now and by and large I’m happy with it. However one big change with the 1.8 release was the switch from the “old” message view to the new one based on Gnus article mode. Functionally it’s fine but I did prefer the look and feel of the old message view. Despair not, as Gnus is sufficiently customisable that we can tweak it to look almost the same.

First we do the easy bit which is just setting the font-lock highlighting to match the old message view:

(use-package gnus
  :custom-face
  (gnus-signature ((t (:inherit font-lock-comment-face))))
  (gnus-header-name ((t (:inherit message-header-name :weight bold))))
  (gnus-header-from ((t (:inherit font-lock-variable-name-face))))
  (gnus-header-subject ((t (:inherit font-lock-type-face))))
  (gnus-header-content ((t (:inherit font-lock-type-face))))
  (gnus-cite-attribution ((t (:inherit default))))
  ...)

(I’m using the excellent use-package.)

These aren’t strictly the same as the original message view, but I prefer them to the Gnus default of multiple subtly different shades of blue:

(gnus-cite-1 ((t (:foreground "light salmon"))))
(gnus-cite-2 ((t (:foreground "turquoise"))))
(gnus-cite-3 ((t (:foreground "light goldenrod"))))
(gnus-cite-4 ((t (:foreground "chartreuse2"))))

Mu4e used to highlight some of the header fields in different faces:

:config
(add-to-list 'gnus-header-face-alist '("To" nil font-lock-variable-name-face))
(add-to-list 'gnus-header-face-alist '("Reply-To" nil font-lock-variable-name-face))
(add-to-list 'gnus-header-face-alist '("Cc" nil font-lock-variable-name-face))

Restore the previous sort order for header fields:

(setq gnus-sorted-header-list '("^From:" "^To:" "^Reply-To:" "^Cc:" "^Subject:"
                                "^Flags:" "^Summary:" "^Keywords:" "^Newsgroups:"
                                "^Followup-To:" "^Date:" "^Organization:"))

Highlight the >, >>, etc. at the start of quote lines in addition to the quoted text itself (this really bothered me for some reason):

(defun filter-gnus-cite-args (args)
  "Replace PREFIX argument with the empty string."
  (setf (cadr args) "")
  args)
 
(advice-add 'gnus-cite-add-face :filter-args 'filter-gnus-cite-args)

Filed in linux - Comments closed

MG: a simple lightweight Emacs clone

February 27th, 2022

For a while now I’ve been searching for a simple lightweight text editor to use when editing configuration files as root and when SSH-ing to a remote machine. My normal editor is GNU Emacs with 15+ years of accumulated baggage, which makes it pretty slow to start up. I’ve tried the various workarounds like the Emacs daemon and TRAMP but it still feels like a lot of faff compared to just SSH and starting an editor. I’ve also tried nano and vi, but my Emacs muscle memory makes them too annoying to use.

Recently I’ve settled on mg, a venerable Emacs clone that’s maintained as part of the OpenBSD base system and also available on Linux.

It starts up instantly and with the minor configuration tweaks below is pretty ergonomic.

global-set-key "\^z" undo
global-set-key "\^?" delete-backward-char
global-set-key "\^h" delete-backward-char
global-set-key "\e[1;3C" forward-word
global-set-key "\e[1;3D" backward-word
global-set-key "\e[1;5C" forward-word
global-set-key "\e[1;5D" backward-word
 
auto-execute *.c c-mode
auto-execute *.h c-mode
 
make-backup-files 0
set-default-mode indent
set-fill-column 72

How I Accidentally Discovered Gnome’s Emoji Keyboard

October 31st, 2021

After I updated Gnome recently I was alarmed to discover my usual C-. key binding in Emacs was broken and was now stealing the keyboard input. It turns out this is Gnome/IBus’s emoji input method, which was previously bound by default to C-Shift-e and which I was completely unaware of. It’s quite fun! (You can change the shortcut key to whatever you want in ibus-setup.)

Generating perf maps with OpenJDK 17

February 28th, 2021

Linux perf is a fantastically useful tool for all sorts of profiling tasks. It’s a statistical profiler that works by capturing the program counter value when a particular performance event occurs. This event is typically generated by a timer (e.g. 1kHz) but can be any event supported by your processor’s PMU (e.g. cache miss, branch mispredict, etc.). Try perf list to see the events available on your system.

A table of raw program counter addresses isn’t particularly useful to humans so perf needs to associate each address with the symbol (function) that contains it. For ahead-of-time compiled programs and shared libraries perf can look this up in the ELF symbol table on disk, but for JIT-compiled languages like Java this information isn’t available as the code is generated on-the-fly in memory.

Let’s look at what perf top reports while running a CPU-intensive Java program:

Samples: 136K of event 'cycles', 4000 Hz, Event count (approx.): 57070116973 lost: 0/0 drop: 0/0
Overhead  Shared Object                       Symbol
  16.33%  [JIT] tid 41266                     [.] 0x00007fd733e40ec0
  16.15%  [JIT] tid 41266                     [.] 0x00007fd733e40e3b
  16.14%  [JIT] tid 41266                     [.] 0x00007fd733e40df1
  16.14%  [JIT] tid 41266                     [.] 0x00007fd733e40e81
   2.80%  [JIT] tid 41266                     [.] 0x00007fd733e40df5
   2.62%  [JIT] tid 41266                     [.] 0x00007fd733e40e41
   2.45%  [JIT] tid 41266                     [.] 0x00007fd733e40ec4
   2.43%  [JIT] tid 41266                     [.] 0x00007fd733e40e87

Perf marks these locations as [JIT] because the addresses are in part of the process’s address map not backed by a file. Because the addresses are all very similar we might guess they’re in the same method, but perf has no way to group them and shows each unique address separately. Apart from that it’s not very helpful for figuring out which method is consuming all the cycles.

As an aside, it’s worth briefly comparing perf’s approach, which samples the exact hardware instruction being executed when a PMU event occurs, with a traditional Java profiler like VisualVM which samples at the JVM level (i.e. bytecodes). A JVM profiler needs to interrupt the thread, then record the current method, bytecode index, and stack trace, and finally resume the thread. Obviously this has larger overhead but there is a deeper problem: JIT-ed code cannot be interrupted at arbitrary points because the runtime may not be able to accurately reconstruct the VM state at that point. For example one cannot inspect the VM state halfway through executing a bytecode. So at the very least the JIT-ed code needs to continue executing to the end of the current bytecode. But requiring the VM to be in a consistent state at the end of each bytecode places too many restrictions on an optimising JIT. Therefore the optimised code can typically only be interrupted at special “safepoints” inserted by the JIT – in Hotspot this is at method return and loop back-edges. That means that a JVM profiler can only see the thread stopped at one of these safepoints which may deviate from the actual hot parts of the code, sometimes wildly. This problem is known as safepoint bias.

So a hardware profiler can give us better accuracy, but how to translate the JIT-ed code addresses to Java method names? Currently there are at least two tools to do this, both of which are implemented as JVMTI plugins that load into a JVM process and then dump some metadata for perf to use.

The first is the “jitdump” plugin that is part of the Linux perf tree. After being loaded into a JVM, the plugin writes out all the machine code and metadata for each method that is JIT compiled. Later this file can be combined with recorded profile data using perf inject --jit to produce an annotated data file with the correct symbol names, as well as a separate ELF shared object for each method allowing perf to display the assembly code. I use this often at work when I need to do some detailed JIT profiling, but the offline annotation step is cumbersome and the data files can be 100s of MB for large programs. The plugin itself is complex and historically buggy. I’ve fixed several of those issues myself but wouldn’t be surprised if there are more lurking. This tool is mostly overkill for typical Java developers.

The second tool that I’m aware of is perf-map-agent. This generates a perf “map” file which is a simple text file listing start address, length, and symbol name for each JIT compiled method. Perf will load this file automatically if it finds one in /tmp. As the map file doesn’t contain the actual instructions it’s much smaller than the jitdump file, and doesn’t require an extra step to annotate the profile data so it can be used for live profiling (i.e. perf top). The downsides are that profile data is aggregated by method so you can’t drill down to individual machine instructions, and the map can become stale with a long-running VM as methods can be unloaded or recompiled. You also need to compile the plugin yourself as it’s not packaged in any distro and many people would rightly be wary of loading untrusted third-party code which has full access to the VM. So it would be much more convenient if the VM could just write this map file itself.
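As an aside, the map format described above is simple enough to parse by hand. The following is a minimal C sketch, not perf's own code: the names map_entry, parse_map_line and entry_contains are made up for illustration, and it assumes each line is a hexadecimal start address, a hexadecimal size, and then a symbol name that runs to the end of the line.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One line of a perf map file: START SIZE SYMBOL (hypothetical type). */
struct map_entry {
    uint64_t start;      /* address of the first instruction */
    uint64_t size;       /* length of the method in bytes */
    char name[256];      /* symbol name, may contain spaces */
};

/* Parse one map line into ENTRY.  Returns true on success. */
bool parse_map_line(const char *line, struct map_entry *entry)
{
    unsigned long long start, size;
    int consumed = 0;

    /* %llx accepts an optional 0x prefix; %n records where the name begins. */
    if (sscanf(line, "%llx %llx %n", &start, &size, &consumed) != 2)
        return false;
    if (consumed == 0 || line[consumed] == '\0')
        return false;    /* no symbol name after the two numbers */

    entry->start = start;
    entry->size = size;
    snprintf(entry->name, sizeof(entry->name), "%s", line + consumed);
    entry->name[strcspn(entry->name, "\n")] = '\0';   /* drop trailing newline */
    return true;
}

/* True if ADDR falls inside the half-open range [start, start + size). */
bool entry_contains(const struct map_entry *entry, uint64_t addr)
{
    return addr >= entry->start && addr - entry->start < entry->size;
}
```

Symbolising a sampled program counter then amounts to finding the entry for which entry_contains returns true. A real tool would presumably keep the entries sorted and use a binary search rather than scanning every line.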

OpenJDK 17, which should be released early next month, has a minor new feature contributed by yours truly to do just this: either send the diagnostic command Compiler.perfmap to a running VM with jcmd, or run java with -XX:+UnlockDiagnosticVMOptions -XX:+DumpPerfMapAtExit, and the VM will write a perf map file to /tmp that can be used to symbolise JIT-ed code.

$ jps
40885 Jps
40846 PiCalculator
$ jcmd 40846 Compiler.perfmap
40846:
Command executed successfully
$ head /tmp/perf-40846.map 
0x00007ff6dbe401a0 0x0000000000000238 void java.util.Arrays.fill(int[], int)
0x00007ff6dbe406a0 0x0000000000000338 void PiSpigout.calc()
0x00007ff6dbe40cc0 0x0000000000000468 void PiSpigout.calc()
0x00007ff6dbe41520 0x0000000000000138 int PiSpigout.invalidDigitsControl(int, int)
0x00007ff6d49091a0 0x0000000000000110 void java.lang.Object.<init>()
0x00007ff6d4909560 0x0000000000000350 int java.lang.String.hashCode()
0x00007ff6d4909b20 0x0000000000000130 byte java.lang.String.coder()
0x00007ff6d4909ea0 0x0000000000000170 boolean java.lang.String.isLatin1()

Here we can see for example that the String.hashCode() method starts at address 0x00007ff6d4909560 and is 0x350 bytes long. Let’s run perf top again:

Samples: 206K of event 'cycles', 4000 Hz, Event count (approx.): 94449755711 lost: 0/0 drop: 0/0
Overhead  Shared Object                         Symbol
  78.78%  [JIT] tid 40846                       [.] void PiSpigout.calc()
   0.56%  libicuuc.so.67.1                      [.] icu_67::RuleBasedBreakIterator::handleNext
   0.34%  ld-2.31.so                            [.] do_lookup_x
   0.30%  libglib-2.0.so.0.6600.3               [.] g_source_ref
   0.24%  libglib-2.0.so.0.6600.3               [.] g_hash_table_lookup
   0.24%  libc-2.31.so                          [.] __memmove_avx_unaligned_erms
   0.20%  libicuuc.so.67.1                      [.] umtx_lock_67

This is much better: now we can see that 79% of system-wide cycles are spent in one Java method PiSpigout.calc().

Alternatively we can do offline profiling with perf record:

$ perf record java -XX:+UnlockDiagnosticVMOptions -XX:+DumpPerfMapAtExit PiCalculator 50
3.1415926535897932384626433832795028841971693993750
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.030 MB perf.data (230 samples) ]
$ perf report
Samples: 340  of event 'cycles', Event count (approx.): 71930116
Overhead  Command          Shared Object       Symbol
  11.83%  java             [JIT] tid 44300     [.] Interpreter
   4.19%  java             [JIT] tid 44300     [.] flush_icache_stub
   3.87%  java             [kernel.kallsyms]   [k] mem_cgroup_from_task
...

Here the program didn’t run long enough for any method to be JIT compiled and we see most of the cycles are spent in the interpreter. (By the way, the interpreter still looks like JIT-ed code to perf because its machine code is generated dynamically by Hotspot at startup.)

I’d be interested to hear from anyone who tries this new feature and finds it useful, or otherwise!

YPbPr Mode vs RGB

February 26th, 2021

I was fiddling with my monitor settings today (Dell U2415) and noticed the “Input Color Format” was set to “YPbPr” instead of “RGB”. This is a compressed colour space where the chroma channel has half the resolution of the luminance channel. Normally this would be used for TVs or video encoding rather than a PC monitor. That said I’ve been using it this way for two years without noticing…

The problem is that Dell monitors advertise this mode along with RGB in their HDMI EDID. The driver for my AMD graphics card sees this and prefers it over RGB, with no way to override the selection. There is one creative solution I found which involves patching a local copy of the EDID and telling the driver to load that from disk rather than reading it from the monitor. I took the simpler option of spending a few quid on a DisplayPort cable which only supports RGB.

The result? Fonts look a bit sharper… maybe… but it’s hard to tell.

Best Shell Prompt Colour Scheme

December 19th, 2020

It can be agonizing to pick a good colour scheme for your shell prompt. Especially when you have 256 or more colours to pick from. So rather than waste my time I decided to embrace serendipity and have my shell pick a random colour when it starts. The results are rather pleasing, as you can see below, and if I don’t like a particular colour then it will only last as long as that particular shell.

It also helps to visually distinguish different windows that are being used for different tasks, and root shells are coloured an alarming shade of red. Just pop the following in your .bashrc.

PS1=${SSH_CLIENT:+$(hostname -s):}'\w \$ '
case "$TERM" in
  *-256color)
    if [ "$UID" = 0 ]; then
      color=196   # Red
    else
      color=$((16+(36*(1+RANDOM%5))+(6*(1+RANDOM%5))+(1+RANDOM%5)))
    fi
    PS1='\[\033[1m\033[38;5;'$color'm\]'$PS1'\[\033[00m\]'
    ;;
  *-color)
    if [ "$UID" = 0 ]; then
      color=31   # Red
    else
      color=$((31+RANDOM%7))   # 31..37; 38 is not a plain colour code
    fi
    PS1='\[\033[1m\033['$color'm\]'$PS1'\[\033[00m\]'
    ;;
esac
unset color

For *-256color terminals the codes from 16 to 231 are a 6x6x6 RGB colour cube. This script avoids darker colours by keeping each channel at 1 or above, but you can tweak it to your liking. Most modern terminals also support a true colour escape sequence giving full 24-bit colour, but 125 different shades is surely enough for anyone.
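The arithmetic in the script is the usual colour cube indexing: palette index = 16 + 36*r + 6*g + b, with each of r, g and b in the range 0 to 5. Here is the same calculation as a small C sketch; the function names cube_index and foreground_escape are just for illustration.

```c
#include <stdio.h>

/* Map cube coordinates (each 0..5) to a 256-colour palette index.
   Returns -1 if any coordinate is out of range. */
int cube_index(int r, int g, int b)
{
    if (r < 0 || r > 5 || g < 0 || g > 5 || b < 0 || b > 5)
        return -1;
    return 16 + 36 * r + 6 * g + b;
}

/* Write the escape sequence selecting foreground colour INDEX into BUF.
   Returns the number of characters written, as snprintf does. */
int foreground_escape(char *buf, size_t len, int index)
{
    return snprintf(buf, len, "\033[38;5;%dm", index);
}
```

With this, cube_index(5, 0, 0) is 196, the pure red used for root shells in the script, and the random branch corresponds to picking r, g and b independently from 1 to 5.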

SIGPIPE and how to ignore it

September 23rd, 2020

I recently found myself trying to port a program that uses Boost Asio to run on OpenBSD. Everything compiled OK but while running it would occasionally exit with an unhandled SIGPIPE signal. This doesn’t happen on Linux. What’s going on here?

SIGPIPE is a synchronous signal that’s sent to a process (thread in POSIX.1-2004) which attempts to write data to a socket or pipe that has been closed by the reading end. Importantly it’s not an asynchronous signal that notifies you when the reading end has been closed: it’s delivered only when you attempt to write data. In fact it’s generated precisely when the system call (write(2), sendmsg(2), etc.) would fail with EPIPE and doesn’t give any additional information.

So what’s the point then? The default action for SIGPIPE is to terminate the process without a core dump (just like SIGINT or SIGTERM). This simplifies error handling in programs that are meant to run as part of a shell pipeline: reading input, transforming it, and then writing it to another process. SIGPIPE allows the program to skip error handling and blindly write data until it’s killed.

For programs that handle write errors it doesn’t seem to be useful and is best avoided. But unfortunately there are several different ways to do that.

Ignore the signal globally

This is the easiest if you are in complete control of the program (i.e. not writing a library). Just set the signal to SIG_IGN and forget about it.

signal(SIGPIPE, SIG_IGN);
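To see the effect, here is a small self-contained sketch (the function name write_to_closed_pipe is made up for this example). It ignores SIGPIPE, writes to a pipe whose reading end has already been closed, and reports the error from write(2) instead of dying.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Write one byte to a pipe whose reading end is already closed, with
   SIGPIPE ignored.  Returns the errno from write(2), 0 if the write
   unexpectedly succeeded, or -1 if the pipe could not be created.
   Note that this changes the SIGPIPE disposition for the whole process. */
int write_to_closed_pipe(void)
{
    int fds[2];
    int err = 0;

    signal(SIGPIPE, SIG_IGN);     /* without this the write would kill us */

    if (pipe(fds) == -1)
        return -1;
    close(fds[0]);                /* the reader goes away */

    if (write(fds[1], "x", 1) == -1)
        err = errno;              /* expected to be EPIPE */

    close(fds[1]);
    return err;
}
```

With the signal ignored the only trace of the closed reader is the EPIPE error, which the program can handle like any other write failure.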

Use MSG_NOSIGNAL

If you are writing to a socket, and not an actual pipe, pass the MSG_NOSIGNAL flag to send(2) or sendmsg(2). This has been in Linux for ages and was standardised in POSIX.1-2008 so it’s available almost anywhere.
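A minimal sketch of this, using a local socketpair(2) to stand in for a real connection (the function name send_to_closed_peer is hypothetical):

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send one byte on a stream socket whose peer has been closed, passing
   MSG_NOSIGNAL so that no SIGPIPE is raised.  Returns the errno from
   send(2), 0 if the send unexpectedly succeeded, or -1 if the socket
   pair could not be created. */
int send_to_closed_peer(void)
{
    int sv[2];
    int err = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    close(sv[1]);                 /* the peer goes away */

    if (send(sv[0], "x", 1, MSG_NOSIGNAL) == -1)
        err = errno;              /* EPIPE, reported as an ordinary error */

    close(sv[0]);
    return err;
}
```

Unlike ignoring the signal globally, this leaves the process's SIGPIPE disposition untouched, which makes it suitable for library code.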

Set SO_NOSIGPIPE socket option

This is a bit niche as it only exists on FreeBSD and OS X. Use setsockopt(2) to set this option on a socket and all subsequent send(2) calls will behave as if MSG_NOSIGNAL was set.

int on = 1;
setsockopt(s, SOL_SOCKET, SO_NOSIGPIPE, &on, sizeof(on));

This seems to be of limited utility as calling write(2) on the socket will still generate SIGPIPE. The only use I can think of is if you need to pass the socket to a library or some other code you don’t control.

Temporarily mask the signal on the current thread

The most general solution, for when you are not in full control of the program’s signal handling and want to write data to an actual pipe or use write(2) on a socket, is to first mask the signal for the current thread with pthread_sigmask(3), write the data, drain any pending signal with sigtimedwait(2) and a zero timeout, and then finally unmask SIGPIPE. This technique is described in more detail here. Note that some systems such as OpenBSD do not have sigtimedwait(2) in which case you need to use sigpending(2) to check for pending signals and then call the blocking sigwait(2).
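The steps above can be sketched roughly as follows. This is an illustration under a few assumptions rather than a drop-in implementation: write_nosigpipe is a made-up name, the system is assumed to provide sigtimedwait(2), and most error checking is omitted. It also checks whether a SIGPIPE was already pending before the write, so that it does not swallow a signal it did not cause.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>

/* Like write(2), but a SIGPIPE caused by this write is never delivered.
   The caller sees -1 with errno set to EPIPE instead. */
ssize_t write_nosigpipe(int fd, const void *buf, size_t len)
{
    sigset_t sigpipe, oldmask, pending;
    int already_pending, saved_errno;
    ssize_t n;

    sigemptyset(&sigpipe);
    sigaddset(&sigpipe, SIGPIPE);

    /* A SIGPIPE that is already pending is not ours to consume. */
    sigpending(&pending);
    already_pending = sigismember(&pending, SIGPIPE);

    /* Block SIGPIPE on the current thread only. */
    pthread_sigmask(SIG_BLOCK, &sigpipe, &oldmask);

    n = write(fd, buf, len);
    saved_errno = errno;

    /* If our write raised SIGPIPE, remove it from the pending set
       using a zero timeout so that we never block here. */
    if (n == -1 && saved_errno == EPIPE && !already_pending) {
        const struct timespec zero = { 0, 0 };
        while (sigtimedwait(&sigpipe, NULL, &zero) == -1 && errno == EINTR)
            continue;
    }

    /* Restore the caller's signal mask and the errno from write(2). */
    pthread_sigmask(SIG_SETMASK, &oldmask, NULL);
    errno = saved_errno;
    return n;
}
```

On a system without sigtimedwait(2) the drain step would need to be replaced with the sigpending(2) and sigwait(2) combination mentioned above.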

Anyway, back to the original problem. Asio hides SIGPIPE from the programmer by either setting the SO_NOSIGPIPE socket option on systems that support it, or on Linux by passing MSG_NOSIGNAL to sendmsg(2). Neither of these applies to OpenBSD, which is why we get the SIGPIPE. I submitted a pull request to pass MSG_NOSIGNAL on OpenBSD as well. But I don’t know when or if that will be merged so I’m also trying to get the same fix added to the ports tree.

UPDATE: a patch is now in the ports tree.

OpenSMTPD: use SSL client certificate when relaying outgoing mail

September 13th, 2020

I recently set up OpenSMTPD as the MTA on my local machine. I want to relay outgoing mail through another mail server on my VPS which is configured to only accept SSL connections with valid client certificates.

It’s not clear from the documentation how to configure this in smtpd.conf. However I eventually found from the source code that the “relay” action accepts a “pki” option to specify a certificate and key file.

action "outbound" relay host smtps://user@mail.example.org \
	auth <secrets> pki host.example.org mail-from "@example.org"

My mail server requires a username and password in addition to the client certificate so a “secrets” table should also be configured:

table secrets file:/etc/mail/secrets

And finally add a “pki” stanza for host.example.org to associate the X.509 certificate and private key:

pki host.example.org cert "/etc/ssl/example.crt"
pki host.example.org key "/etc/ssl/private/example.key"

UPDATE: this is documented in the man page now. :D

My first Linux “kernel” patches

June 7th, 2020

OK well not really kernel patches, but they’re in the Linux tree so I guess it counts?

Was so excited when I got the automatic notification they’d been merged for the 5.8 release. Hopefully someone out there using perf to profile Java finds them useful.

Wlroots and Phosh on Samsung S7

April 19th, 2020

A few weekends ago I left my Samsung S7 running Gnome on software-rendered X11. This kind of works as a demo but it’s slow and clunky so I followed that by attempting to get Phosh running. Phosh is a gnome-shell replacement for Purism’s Librem5. Instead of Mutter it uses phoc as its Wayland compositor, and phoc in turn is based on wlroots, the compositor-as-a-library component of Sway.

It should be much easier to add a hwcomposer backend to wlroots than to Mutter, and in fact someone already started: NotKit/wlroots. I took this and rebased it on the latest upstream wlroots tag, hacked the code around until it compiled with the new interface, ran the example app and… the screen flashed green for a second and then the kernel panicked and rebooted. Ouch.

<0>[ 7300.959344] I[0:      swapper/0:    0] Kernel panic - not syncing: Unrecoverable System MMU Fault!!
<0>[ 7300.959382] I[0:      swapper/0:    0] Kernel loaded at: 0x8013c000, offset from compile-time address bc000
<3>[ 7300.959438] I[0:      swapper/0:    0] exynos_check_hardlockup_reason: smc_lockup virt: 0xffffffc879980000 phys: 0x00000008f9980000 size: 4096.
<0>[ 7300.959489] I[0:      swapper/0:    0] exynos_check_hardlockup_reason: SMC_CMD_GET_LOCKUP_REASON returns 0x1. fail to get the information.
<0>[ 7300.959534] I[0:      swapper/0:    0] exynos_ss_prepare_panic: no core got stucks in EL3 monitor.

The panic log is not very helpful: there’s no user stack trace.

After a painful few hours debugging by adding prints and sleeps and comparing against the working test_hwcomposer from libhybris I managed to fix it. I’ve pushed a hwcomposer-0.10.1 branch here.

Then I built phoc and phosh, linking against the modified wlroots and libhybris. To my surprise it Just Worked, with the exception of touch input. Input requires enabling the libinput backend of wlroots, and that in turn requires an active “session”. Session in the systemd world means being associated with a “seat” in systemd-logind. We can do that by starting phoc inside a systemd service and associating it with a TTY. I copied phosh.service from the Librem5 package and edited it for my system.

Unfortunately phoc then hangs at startup inside the wlroots libinput backend polling for sd_seat_can_graphical(..) to return true. Logind seems to make some people very angry, but debugging it with the source code and loginctl wasn’t too bad.

$ loginctl 
SESSION  UID USER SEAT  TTY  
    132 1000 nick            
    162 1000 nick seat0 tty7     <---------------
      4 1000 nick            
      6 1000 nick            
      7 1000 nick            
     c2    0 root       pts/4
 
6 sessions listed.

Here phoc is running on seat0 which is attached to /dev/tty7.

$ loginctl show-seat seat0 
Id=seat0
ActiveSession=162
CanMultiSession=yes
CanTTY=yes
CanGraphical=no    <-----------
Sessions=162
IdleHint=yes
IdleSinceHint=1587287027759986
IdleSinceHintMonotonic=18371557550

From reading the logind source, CanGraphical is true if there is a device attached to the seat that has the udev TAG attribute with value "master-of-seat". Normally this attribute is added to graphics devices by the udev rules systemd ships in /lib/udev/rules.d/71-seat.rules. The S7 has a special “decon” graphics driver so none of the standard rules match. But it’s easy to add a custom rule:

SUBSYSTEM=="graphics", KERNEL=="fb0", DRIVERS=="decon", TAG+="master-of-seat"

After reloading and retriggering the udev rules, the framebuffer device now has this tag:

$ udevadm info /dev/fb0
P: /devices/13960000.decon_f/graphics/fb0
N: fb0
L: 0
S: graphics/fb0
E: DEVPATH=/devices/13960000.decon_f/graphics/fb0
E: DEVNAME=/dev/fb0
E: MAJOR=29
E: MINOR=0
E: SUBSYSTEM=graphics
E: USEC_INITIALIZED=42601393
E: ID_PATH=platform-13960000.decon_f
E: ID_PATH_TAG=platform-13960000_decon_f
E: ID_FOR_SEAT=graphics-platform-13960000_decon_f
E: ID_SEAT=seat0
E: DEVLINKS=/dev/graphics/fb0
E: TAGS=:seat0:seat:master-of-seat:      <-----------

And after restarting phoc/phosh touch is working! :D

From left: Gnome calculator, app drawer, Gnome terminal and squeekboard onscreen keyboard
