
rinsmiles´ Guide to the Void

Version 2021.02



Power Saving and Performance


Configuration in this section is presented to be applied statically, not dependent on the system’s power source or a global ‘power mode’, while the “Power profiles” subsection describes how it can be applied dynamically and files for such purpose are provided in the Appendix. Each item is succinctly described—the user is expected to read further about them elsewhere if required.

An excellent aid in the configuration of power saving is the tool powertop, which can generate system power reports and show real time power draw data. While being primarily a diagnostics tool, it also has an “auto-tune” option which can automatically set a good deal of the tunables below—however, note that such automatic configuration is not persistent and not meant to replace a custom-tailored one.

More broadly, I suggest avoiding tools that aim to provide automatic power configuration. Understanding and tailoring the configuration to one’s hardware and its use will result in a more reliable, more efficient and better-performing system. It can also be quite simple, as today most power management aspects are already handled smartly by your firmware, drivers, kernel and distro.


Laptop mode

If you are using a laptop and running the OS from a hard drive (i.e., not an SSD or similar), you can use the “laptop mode” kernel feature as a power saving technique, which aims to keep the OS’s hard drive suspended as much as possible. An implementation of this feature can be found in the Appendix. It requires that you disable access timestamps as described below and, because it uses hdparm for drive suspension, that you do not enable Block Layer RPM on your main hard drive.


Lockup detectors

Turning off the kernel’s lockup detectors can decrease power consumption, as these are active high-priority tasks and, particularly, the hardlockup detector may generate a high number of interrupts on some systems. These detectors are used for debugging, and are generally dispensable to PC users. To do this, add the “nowatchdog” option to your kernel parameters in your efibootmgr hook or bootloader configuration.
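After rebooting, you may want to confirm that the parameter actually reached the kernel. The following is a minimal sketch; has_kernel_param is a hypothetical helper name, and its optional second argument exists only so it can be tried against a sample file instead of /proc/cmdline.

```shell
# has_kernel_param PARAM [CMDLINE_FILE]
# Succeeds if PARAM appears as a whole word on the kernel command line.
# (Hypothetical helper, not part of any package.)
has_kernel_param() {
	tr -s ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Example use after rebooting:
#   has_kernel_param nowatchdog && echo "nowatchdog is set"
#   cat /proc/sys/kernel/watchdog    # should read 0 when the detectors are off
```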


PCI and SCSI bus devices

PCI RPM: PCI devices can be put in a low-power state when idle through automatic Runtime Power Management to save power. The caveat is that some devices may not become active again after entering such state. To handle this, identify the Vendor and Device IDs of any device that fails to wake up with "lspci -nn" and add them to a blacklist, so that those remain active throughout.

Block Layer RPM: Similarly, block devices may use runtime power management to access low-power states. This time a timeout needs to be set, at which point the device will flush the cache and go into suspension. Note that using it with frequently accessed hard drives could result in high latencies and component wear-down if the power state transitions are over-frequent. A whitelist may thus be preferred, where drives can be identified by their World Wide Identifier (see /sys/bus/scsi/devices/*:*:*:*/wwid) to be selected and assigned specific timeouts.
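To build such a whitelist you first need the identifiers. As a rough sketch, the loop below prints the SCSI address and WWID of each device found under the sysfs path mentioned above; list_wwids is a hypothetical helper name, and the optional argument only serves to point it at another directory for testing.

```shell
# list_wwids [SYSFS_SCSI_DIR]
# Prints "<scsi address> <wwid>" for every SCSI device exposing a wwid file.
# (Hypothetical helper; defaults to the path given in the text.)
list_wwids() {
	base=${1:-/sys/bus/scsi/devices}
	for f in "$base"/*:*:*:*/wwid; do
		[ -r "$f" ] || continue
		dev=${f%/wwid}
		printf '%s %s\n' "${dev##*/}" "$(cat "$f")"
	done
}

# Example use:
#   list_wwids
```

The WWID must be copied in full, spaces included, into the rule that matches it.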

SATA LPM: Further power savings can be achieved by using a medium power SATA Link Power Management setting, allowing devices to enter lower power states. The recommended option for modern systems is “med_power_with_dipm”, which is regular “medium_power” plus cooperation with the devices’ own (device-initiated) power management. The default is “max_performance”.
Note: While reportedly rare, using medium LPM settings (with some older drives) could result in data corruption. There is a “min_power” mode as well, but using it is strongly discouraged due to the risk of data loss, and using Device Initiated PM already results in similar power savings.

Readahead: Increasing file data readahead for hard drive devices can substantially increase I/O throughput with minimal latency penalties, and it can also benefit solid state drives to a more moderate degree. The default (software) readahead is 128kB, which can be safely doubled on modern systems, while more memory or I/O constrained systems should keep the default value.

Scheduler: I/O performance can be increased by using the appropriate I/O scheduler for your hardware and workloads. Generally, you can improve performance and reduce latency by using the "mq-deadline" scheduler on SSDs and the (default low latency) "bfq" scheduler on HDDs. However, NVMe drive performance benefits from setting no scheduler at all, i.e. “none”.

To apply these settings, use a set of udev rules that assign the desired values, for example:

/etc/udev/rules.d/91-power-and-performance.rules
# schedulers
SUBSYSTEM=="block", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", \
	ATTR{queue/scheduler}="mq-deadline"
SUBSYSTEM=="block", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", \
	ATTR{queue/scheduler}="bfq"
# readahead
SUBSYSTEM=="block", KERNEL=="sd[a-z]", ATTR{queue/read_ahead_kb}="256"
# SATA LPM
SUBSYSTEM=="scsi_host", KERNEL=="host*", \
	ATTR{link_power_management_policy}="med_power_with_dipm"
# Block Layer RPM - using whitelist
SUBSYSTEM=="scsi", ATTR{wwid}=="exact very long world wide identifier", \
	ATTR{power/autosuspend_delay_ms}="300000", ATTR{power/control}="auto"
# PCI RPM - using blacklist
SUBSYSTEM=="pci", ATTR{vendor}=="0x0a1b", ATTR{device}=="0x2c3d", \
	GOTO="pci_rpm_end"
SUBSYSTEM=="pci", ATTR{power/control}="auto"
LABEL="pci_rpm_end"
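Once the rules file is in place, it can be useful to reload udev and check that a value was actually applied. The sketch below reads back the active I/O scheduler, which the kernel marks with square brackets in the scheduler file; active_scheduler is a hypothetical helper name, and /dev/sda is only an example device.

```shell
# active_scheduler SCHEDULER_FILE
# Prints the scheduler currently in use, i.e. the bracketed entry in a line
# such as "mq-deadline kyber [bfq] none". (Hypothetical helper.)
active_scheduler() {
	sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

# Example use (as root) after editing the rules file:
#   udevadm control --reload
#   udevadm trigger --subsystem-match=block
#   active_scheduler /sys/block/sda/queue/scheduler
```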

HDD Advanced Power Management

Among other parameters, you can set a hard drive’s Advanced Power Management (APM) feature with the tool hdparm. Generally, it is best to set high APM levels to avoid excessive head parking and undesired spindown (which can cause heightened component wear when too frequent) while also not compromising the drive’s much-needed performance. Power savings and wear reduction can instead be achieved by manually managing drive suspension, which may be done in different ways, e.g. the above-mentioned Block Layer RPM or tools like hdparm.

Note that many drives do allow spindown with APM levels above 127, while they may ignore any manually set timeout with levels below 128. If unsure, try first levels 192, 128, 127, in that order, and add hdparm commands to /etc/rc.local to apply them on system startup and to a script in /etc/zzz.d/resume/ to retain them after a system suspend.
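As an illustration of what such commands might look like, the sketch below wraps hdparm's -B option (which sets the APM level) in a small function that validates the level first. The name set_apm and the HDPARM override variable are assumptions made for this example, as is /dev/sda.

```shell
# set_apm LEVEL DEVICE...
# Applies an APM level (1-255; hdparm treats 255 as "disable APM") to each
# given drive. Set HDPARM="echo hdparm" for a dry run. (Hypothetical helper.)
set_apm() {
	level=$1
	shift
	case $level in
		''|*[!0-9]*) echo "set_apm: invalid level '$level'" >&2; return 1 ;;
	esac
	if [ "$level" -lt 1 ] || [ "$level" -gt 255 ]; then
		echo "set_apm: level must be within 1-255" >&2
		return 1
	fi
	for dev in "$@"; do
		${HDPARM:-hdparm} -B "$level" "$dev" || return 1
	done
}

# In /etc/rc.local, and in an executable script under /etc/zzz.d/resume/:
#   set_apm 192 /dev/sda
```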


Filesystems and virtual memory

Access timestamps: Using the noatime mount option disables inode access time updates on a filesystem, increasing performance and reducing drive access. You can set this in /etc/fstab for your root and ESP partitions, and for any other mount point where you judge it pertinent.

Data flushing: To further reduce drive access you can increase the intervals after which the kernel will write old data to disk. This is by default done every 5 seconds for both data and journaling (on journaling filesystems like ext4) — you can tune it with the following:
 i) Journaling: Add the “commit=❬seconds❭” option to your (ext4 or btrfs) filesystem's mount options in /etc/fstab, e.g. 15 seconds.
 ii) Data: Add the “vm.dirty_writeback_centisecs = ❬centiseconds❭” setting in a .conf file located in /etc/sysctl.d/ (see further below), e.g. 1500 centiseconds.

/etc/fstab
# <file system>    <dir>  	<type>		<options> 			<dump><pass>
# /dev/sda2
UUID=a1b2(...)	    /		ext4		defaults,noatime,commit=15         0    1
# /dev/sda1
UUID=9Z8Y(...)	    /boot	vfat		defaults,noatime,umask=022,utf8	   0    2
(...)

Paging and cached data: There are two main tuning knobs to consider here, swappiness and virtual filesystem cache pressure. The first one defines the rough relative I/O cost of swapping and filesystem paging, the other controls the tendency of the kernel to reclaim the memory which is used for caching of directory and inode objects.

The default value of swappiness is 60; decreasing it makes the kernel less inclined to swap. The default value of cache pressure is 100; decreasing it makes the kernel more inclined to retain cached directory and inode data. On systems with plenty of physical memory, you can use lower swappiness and cache pressure values, e.g. 15 and 65 respectively, by setting “vm.swappiness” and “vm.vfs_cache_pressure” in a .conf file located in /etc/sysctl.d/:

/etc/sysctl.d/91-df-vm.conf
vm.dirty_writeback_centisecs = 1500
vm.swappiness = 15
vm.vfs_cache_pressure = 65
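These settings can be loaded and checked without rebooting. Each sysctl key corresponds to a file under /proc/sys, with the dots replaced by slashes; the sketch below shows that mapping for simple keys like the ones used here. sysctl_path is a hypothetical helper name.

```shell
# sysctl_path KEY
# Maps a sysctl key such as "vm.swappiness" to its file under /proc/sys.
# (Hypothetical helper; assumes the key contains no literal slashes.)
sysctl_path() {
	printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Example use:
#   sysctl --system                       # as root: reload the sysctl.d files
#   cat "$(sysctl_path vm.swappiness)"    # read back the active value
```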

Audio power save

Audio devices using the "snd_hda_intel" or "snd_ac97_codec" drivers can be set to power down the codec once the device has been idle for a certain amount of time. Check which drivers your audio devices use with “lspci -k”, and set the “power_save” option, a timeout in seconds, for those drivers in a .conf file located in /etc/modprobe.d:

/etc/modprobe.d/91-audio-power-save.conf
#options snd_ac97_codec power_save=5
options snd_hda_intel power_save=5

CPU frequency scaling

CPU frequencies are managed by the active governor for each core. Scaling governors set frequencies dynamically, providing great performance while not sacrificing efficiency. In some situations, however, non-scaling governors may be preferred, like (cpufreq's) ‘powersave’ governor, which sets frequencies statically to their lowest point for maximum power savings.

Within cpufreq, the “schedutil” scaling governor is gradually becoming the default, “best-of-all-worlds” one for general use, being closely integrated with the kernel. To set it (or any other governor you wish) at startup, add the following to your rc.local file:

/etc/rc.local
(...)
for d in /sys/devices/system/cpu/cpufreq/policy[0-9]* ; do
	echo 'schedutil' > "$d/scaling_governor"
done
(...)
Note that, on modern Intel CPUs, you can use the Intel P-State driver, which provides its own scaling governors when active, called ‘performance’ and ‘powersave’ (named like cpufreq's non-scaling governors). Broadly, they still tend to perform better than “schedutil”, especially when raw performance is prioritized over power draw. If you wish to use them, write “active” to the status file in /sys/devices/system/cpu/intel_pstate/ and then set a governor as above.
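Since the set of available governors depends on the active driver, it may be worth checking before writing one. The following is a sketch of the rc.local loop with that check added; set_governor is a hypothetical helper name, and its optional second argument only exists so the function can be tried against a scratch directory.

```shell
# set_governor GOVERNOR [CPUFREQ_DIR]
# Writes GOVERNOR to every policy that lists it as available and reports
# the policies that do not. (Hypothetical helper.)
set_governor() {
	gov=$1
	base=${2:-/sys/devices/system/cpu/cpufreq}
	rc=0
	for d in "$base"/policy[0-9]*; do
		[ -r "$d/scaling_available_governors" ] || continue
		if grep -qw -- "$gov" "$d/scaling_available_governors"; then
			echo "$gov" > "$d/scaling_governor"
		else
			echo "set_governor: '$gov' not available for ${d##*/}" >&2
			rc=1
		fi
	done
	return $rc
}

# Example use (as root):
#   set_governor schedutil
```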

Initial images

You can generate smaller initramfs boot images by directing dracut to install only what is needed for booting the local host, which can speed up startup times. This is usually safe, but you should always test it first, keeping a generic boot image in case some module turns out to be missing. Additionally, you can use a different compressor or no compression at all for further tuning, e.g.:

/etc/dracut.conf.d/90-boot-image-tuning.conf
# generate host-specific boot image
hostonly="yes"
# we set kernel parameters elsewhere
hostonly_cmdline="no"
# having installed lz4, we can use it as a fast (de)compression alternative
#compress="lz4"

Power profiles

Some power and performance configuration needs to be dynamic: Laptop users will require that the system enter a power saving mode when running on battery, and desktop users may wish to trigger a (high power) minimal latency mode for timing-critical tasks, among other scenarios. Such cases can be easily managed with acpid, by creating an event file that matches the event, like a power source transition or some sort of turbo button press, to a script that will set the desired configuration.

When changing the configuration dynamically, kernel parameters should be set with the sysctl command, and parameters set above through udev are (typically) accessible as writable files under the /sys folder.

Laptop users should set up power profiles dependent on the computer’s power source, usually involving HDD power management if running the OS from one, CPU frequency scaling, and any relevant GPU settings that their driver provides. All but the last one are tackled in the Appendix files, and the latter can easily be added to them. Do not forget you can check powertop's live data while setting it up, to identify any further configuration that may be needed.
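To give an idea of how a handler script might decide which profile to apply, here is a minimal sketch that inspects the power supplies exposed in sysfs. It is not the Appendix implementation; on_ac_power and select_profile are hypothetical names, and it assumes the mains adapter reports type "Mains" (some USB-C chargers report a different type).

```shell
# on_ac_power [POWER_SUPPLY_DIR]
# Succeeds if a supply of type "Mains" reports online=1, or if no mains
# supply is listed at all (typical desktop). (Hypothetical helper.)
on_ac_power() {
	base=${1:-/sys/class/power_supply}
	seen_mains=0
	for s in "$base"/*; do
		[ -r "$s/type" ] || continue
		[ "$(cat "$s/type")" = "Mains" ] || continue
		seen_mains=1
		[ "$(cat "$s/online" 2>/dev/null)" = "1" ] && return 0
	done
	[ "$seen_mains" -eq 0 ]
}

# select_profile [POWER_SUPPLY_DIR]
# Prints the name of the profile a handler script would then apply.
select_profile() {
	if on_ac_power "$@"; then echo ac; else echo battery; fi
}

# An acpid event file could then call a script built around this, e.g.:
#   event=ac_adapter.*
#   action=/etc/acpi/power-profile.sh    (hypothetical script path)
```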


What about...?

Wake-on-LAN?

WOL appears to be disabled by default on Void. If you wish to set it, use the commands that powertop provides to toggle it, or use a udev rule that, for example, matches the "net" subsystem and changes the wakeup power setting on the devices (enabled/disabled).

USB autosuspend?

I find that its unreliability outweighs the meager benefit. Furthermore, a power-sensitive system like a laptop would be set to suspend before USB autosuspend could make an appreciable difference, as any connected USB devices will likely be used when the computer is active.

AHCI host controller PM?

Quite bug-prone on both my laptop and desktop, so I omitted it out of caution. Perhaps in a future version of this guide, if I can solve the problems, it will be featured. If you want to try it, use the commands that powertop provides and test, checking dmesg logs, after having saved and synced your data.

Active State Power Management?

This is normally handled by your BIOS.

Vulnerability mitigations?

Turning off certain vulnerability mitigations can indeed improve performance on most systems, especially those running on older hardware. Knowledge of the issues, affected hardware and their consequences is paramount and better sought in dedicated literature.