There are subtle bugs in the Debian boot and shutdown sequences. They are hard to find, as they normally only affect rare combinations of packages. They are harder to fix, as they normally require the combined work of several maintainers and changes in several packages. This talk is about the release goal for Lenny to solve them, and gain a few advantages on the way.
Note, this is the stuff going on after the initrd part is done. The very early boot is done before hard drive partitions are mounted.
Note that because the Linux kernel is becoming more and more event based, the boot sequence is no longer sequential.
Note that switching to runlevel S will not run the scripts in /etc/rcS.d/. To get a similar effect after boot, switch to runlevel 1. It will (should) kill all services and prepare the machine for maintenance.
This is roughly equivalent to switching to runlevel 0 (halt) or 6 (reboot).
Minor exception: all scripts (both start and stop) are executed with the stop argument, ignoring their start and stop settings and confusing script writers.
Only stop scripts for services started in the previous runlevel are executed.
Script ordering is vital for this to work. And how are the scripts ordered? By numbers 01-99!
And the numbers are picked using skills, knowledge and negotiation. Getting it right is often hard.
The current Debian default is wrong. The stop sequence should by default be the reverse of the start sequence. It isn't. The default uses '20' for both.
Reordering is hard and sometimes requires cooperation between maintainers of all packages involved.
Given two packages with two scripts inserted with the default settings in Debian:
Package A: script_a sequence 20 (start and stop)
Package B: script_b sequence 20 (start and stop)
Along comes script_c from package C, which should run before script_a and after script_b. The current solution is to change packages A and C, or packages B and C, to get something like this:
Package A: script_a start seq. 22, stop seq. 18
Package B: script_b sequence 20 (start and stop)
Package C: script_c start seq 21, stop seq 19
If other scripts depend on the old order of script_a, they will have to change their sequence number too. The only way to discover this is by a lot of testing, or documenting script dependencies.
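To make the effect of the numbers concrete, here is a small sketch of how the sequence numbers from the example end up in link names. It assumes the usual sysv-rc layout, where start links are named SNNname and stop links KNNname in /etc/rcN.d/, and the default runlevels (start in 2 3 4 5, stop in 0 1 6). The rc_links helper is hypothetical and only prints names.

```shell
# Print the rc?.d link names a script would get for a given start and
# stop sequence number.  rc_links is a hypothetical helper; it only
# prints names and does not touch /etc.
rc_links() {
    name=$1 start=$2 stop=$3
    for level in 2 3 4 5; do
        printf '/etc/rc%s.d/S%02d%s\n' "$level" "$start" "$name"
    done
    for level in 0 1 6; do
        printf '/etc/rc%s.d/K%02d%s\n' "$level" "$stop" "$name"
    done
}

# The three scripts from the example, then the start order in runlevel 2.
all_links=$( { rc_links script_a 22 18
               rc_links script_b 20 20
               rc_links script_c 21 19; } )
printf '%s\n' "$all_links" | grep '/etc/rc2.d/' | sort
```

The scripts in a runlevel directory are run in the sorted order of their link names, which is why a single changed number can silently reorder everything that relied on the old value.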
Let each script document its dependencies, and generate sequence numbers using this dependency information. Example:
Package A: script_a depends on nothing
Package B: script_b depends on nothing
Package C: script_c depends on script_b and is a dependency of script_a
Generated sequence:
script_b start seq 1, stop seq 3
script_c start seq 2, stop seq 2
script_a start seq 3, stop seq 1
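The generated sequence above can be reproduced with a topological sort. The following is a minimal sketch using the standard tsort utility as a stand-in; it is not how insserv is implemented, and the variable names are made up for the illustration. Each input pair "X Y" means X must start before Y, and the stop number is simply the start order reversed.

```shell
# Ordering constraints from the example: script_b before script_c,
# and script_c before script_a.
deps='script_b script_c
script_c script_a'

# tsort prints the names in an order that satisfies every pair.
order=$(printf '%s\n' "$deps" | tsort)

# Number the scripts: start seq counts up, stop seq counts down.
seq_table=$(printf '%s\n' "$order" | awk '
    { name[NR] = $1 }
    END {
        for (i = 1; i <= NR; i++)
            printf "%s start seq %d, stop seq %d\n", name[i], i, NR - i + 1
    }')
printf '%s\n' "$seq_table"
```

With this approach a new script only has to state what it depends on; nobody has to renumber packages A and B by hand.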
An implementation of this system is the dependency based boot sequencing, provided by the insserv package. It uses the format specified in the Linux Standard Base to document init.d script dependencies.
It will refuse to enable when obsolete init.d scripts, loops, duplicate provides etc. are detected.
# aptitude install insserv
# dpkg-reconfigure insserv
info: Checking if it is safe to convert to dependency based boot.
info: Backing up existing boot scripts in \
  /var/lib/insserv/bootscripts-20080223T0742.tar.gz
info: Reordering boot system, log to \
  /var/lib/insserv/run-20080223T0742.log
info: Recording new boot sequence in \
  /var/lib/insserv/bootscripts-20080223T0742-after.list
info: Use '/usr/sbin/update-bootsystem-insserv \
  restore' to restore the old boot sequence.
Adding `diversion of /usr/sbin/update-rc.d to \
  /usr/sbin/update-rc.d.distrib by insserv'
success: Boot system successfully converted
# /var/lib/insserv/insserv-seq-changes \
  /var/lib/insserv/bootscripts-20080223T0742.tar.gz
[...]
#
update-rc.d refuses to insert scripts which create a loop.
update-rc.d requires scripts to be inserted in dependency order.
Incorrect dependencies give a wrong but predictable and stable (as in the same every time) boot and shutdown order.
Every insserv upload is checked using a test suite to make sure the generated boot sequence is correct, and that previously detected bugs do not show up again.
It is possible to enable concurrent booting, running boot scripts in parallel (CONCURRENCY=startpar in /etc/default/rcS)
Might even speed up boot and shutdown (initial benchmark shows speedup by just reordering based on dependencies).
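The loop check mentioned above can be illustrated with a small sketch. It uses tsort as a stand-in for insserv's own loop detection and assumes that tsort writes a diagnostic to stderr when the input contains a loop, as the GNU coreutils version does. The has_loop helper is hypothetical.

```shell
# has_loop reads "before after" pairs on stdin and succeeds (exit 0)
# if tsort complains about a loop.  Hypothetical helper, not part of
# insserv or update-rc.d.
has_loop() {
    # Capture only the diagnostics: stderr goes to the command
    # substitution, the sorted output on stdout is thrown away.
    [ -n "$(tsort 2>&1 >/dev/null)" ]
}

# script_a wants to run before script_b and script_b before script_a.
if printf '%s\n' 'script_a script_b' 'script_b script_a' | has_loop; then
    echo "loop detected, refusing to insert"
fi
```

Refusing the insert at this point keeps the existing, working boot order intact instead of producing an arbitrary one.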
Run dpkg-reconfigure insserv and disable it.
It is always possible to disable it just after it was enabled, before any new packages are installed.
When disabling it, a backup of the old boot sequence is restored if no changes have been made to the boot sequence since it was enabled.
If restore is not possible, all postinst scripts for packages with init.d scripts will be executed again to make them call update-rc.d and add the boot scripts again.
This is guaranteed to work if no packages have been added since it was enabled, and most often works if packages have been added. So if you change your mind, do it quickly.
If your package used the default update-rc.d settings before, this is the header for you:
### BEGIN INIT INFO
# Provides:          scriptname
# Required-Start:    $remote_fs $syslog
# Required-Stop:     $remote_fs $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
### END INIT INFO
$remote_fs is needed by all scripts using files in /usr/. $syslog is needed only by scripts starting services logging to syslog.
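To check what a script actually declares, the header fields can be pulled out with sed. This is only a sketch: lsb_field is a hypothetical helper, the script is a throwaway temporary file, and the parsing assumes each field sits on a single "# Name: value" line between the BEGIN and END markers.

```shell
# lsb_field FILE FIELD prints the value of one header field, looking
# only between the BEGIN and END markers.  Hypothetical helper.
lsb_field() {
    sed -n '/^### BEGIN INIT INFO/,/^### END INIT INFO/p' "$1" |
        sed -n "s/^# $2:[[:space:]]*//p"
}

# A throwaway script carrying the default header.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/sh
### BEGIN INIT INFO
# Provides:          scriptname
# Required-Start:    $remote_fs $syslog
# Required-Stop:     $remote_fs $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
### END INIT INFO
EOF

required_start=$(lsb_field "$script" Required-Start)
default_stop=$(lsb_field "$script" Default-Stop)
echo "Required-Start: $required_start"
echo "Default-Stop:   $default_stop"
rm -f "$script"
```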
Linux Standard Base version 3.2 defines these virtual facilities: $local_fs, $network, $named, $portmap, $remote_fs, $syslog and $time.
All of these represent points in time during boot and shutdown. Virtual facilities are defined in /etc/insserv.conf and /etc/insserv.conf.d/
Normally, the start and stop dependencies are the same.
Virtual dependencies are preferred over specific dependencies.
When using specific dependencies, use the string listed in the provides header of the scripts you depend on.
Scripts started in rcS.d/ need extra care.
All scripts not started in rcS.d/ should depend on $remote_fs. This makes sure /usr/ is available during start, and that the script is stopped before sendsigs kills all processes during shutdown.
A sysadmin can provide overrides in /etc/insserv/overrides/scriptname if the script settings are wrong. The insserv package provides overrides in /usr/share/insserv/overrides/ for packages currently missing headers.
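The $remote_fs rule above lends itself to a mechanical check. The sketch below is a hypothetical lint, not a tool shipped with insserv. It assumes the header values are separated by single spaces and that a script started in rcS.d/ lists S in its Default-Start field.

```shell
# needs_remote_fs FILE succeeds if the script is started outside
# rcS.d (no "S" in Default-Start) but does not list $remote_fs in
# Required-Start.  Hypothetical helper.
needs_remote_fs() {
    starts=$(sed -n 's/^# Default-Start:[[:space:]]*//p' "$1")
    required=$(sed -n 's/^# Required-Start:[[:space:]]*//p' "$1")
    case " $starts " in
        *" S "*) return 1 ;;            # rcS.d script, rule does not apply
    esac
    case " $required " in
        *' $remote_fs '*) return 1 ;;   # dependency already present
    esac
    return 0
}

# A throwaway header that forgot $remote_fs.
script=$(mktemp)
printf '%s\n' '# Required-Start:    $syslog' \
              '# Default-Start:     2 3 4 5' > "$script"
if needs_remote_fs "$script"; then
    echo "warning: add \$remote_fs to Required-Start and Required-Stop"
fi
rm -f "$script"
```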
Not quite tested yet.
Should perhaps be handled using virtual facilities in /etc/insserv.conf.d/.
Do not list identical provides in several scripts. It breaks installation.
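Duplicate provides are easy to spot before they break an installation. The following sketch scans a directory of scripts and prints every facility name that appears in more than one Provides line. The dup_provides helper and the two sample scripts are made up for the illustration; on a real system the directory would be /etc/init.d.

```shell
# dup_provides DIR prints every facility name that is provided by more
# than one script in DIR.  Hypothetical helper for illustration.
dup_provides() {
    for f in "$1"/*; do
        # One provided name per line for this script.
        sed -n 's/^# Provides:[[:space:]]*//p' "$f" | tr -s ' \t' '\n\n'
    done | sort | uniq -d
}

# Two throwaway scripts that both claim to provide "foo".
dir=$(mktemp -d)
printf '%s\n' '### BEGIN INIT INFO' '# Provides: foo' \
              '### END INIT INFO' > "$dir/script_a"
printf '%s\n' '### BEGIN INIT INFO' '# Provides: foo bar' \
              '### END INIT INFO' > "$dir/script_b"

duplicates=$(dup_provides "$dir")
echo "provided more than once: $duplicates"
rm -rf "$dir"
```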
Release goal for Lenny.
76% of packages got LSB headers.
Unsolved in BTS: ~85
Without BTS reports: ~150
Last package will be fixed 2008-06-15 at the current rate.
Needs better documentation for maintainers.
Should update Debian policy to reflect dependency based boot sequencing.
Two insserv bugs left to fix:
The current status is tracked in the svn repository for insserv. Run
debian/rules missing-overrides or
debian/rules missing-by-popcon
in the source directory to see the current status.
See also wiki pages with documentation and status:
There are also slides from my talk at Debconf 7.
More manpower is needed to file BTS reports, NMU packages and discover dependency errors. Please test the system.
http://www.hungry.com/~pere/mypapers/200802-bootsequence/200802-bootsequence.html