Bug#901289: New upstream home?

Theodore Y. Ts'o tytso at mit.edu
Thu Dec 27 16:30:53 GMT 2018


Regarding in-kernel recovery "being good enough".  The reason why some
file systems and system administrators prefer to run fsck at boot,
even when there is "in-kernel recovery", is that journal/log replay
only handles the case of an unclean shutdown.

However, sometimes there can be file system inconsistencies
caused by hardware problems, software bugs (e.g., an Nvidia
binary-only driver dereferencing a wild pointer and causing random
memory corruption leading to file system damage), etc.  The question
is what to do then?  Some file systems will just fail the boot,
leaving the server down until a system administrator can look at
things.  Other file systems will try to automatically repair "obvious"
problems for which there is only one thing a human would have done
anyway, so the system can be brought back to life much faster.  In
those cases, it is useful if the repairs that were done to fix the
file system can be logged automatically.  "logsave" was an
initial attempt at doing this.

Programs like "logsave" are also really useful if you are trying to
run the repair from an initial ramdisk, since you want to save the
output of the boot log someplace useful, before the root file system
has been mounted.  Of course, systemd's journald daemon also serves
this need, although logsave predated systemd by decades.

For people who aren't worried about large numbers of servers in
production, perhaps logsave really isn't that necessary.  This is
especially true if they are choosing to use a file system that does
not attempt automated recovery in the presence of hardware or software
failures, not just automated log replays; those systems will just stop
the boot dead in the water if there is any kind of unexpected file
system corruption, so logsave won't buy those sysadmins anything,
anyway.

So my recommendation for sysvinit is to make logsave optional; if
logsave is not installed, it's not critical for the functioning of
sysvinit to save the output somewhere.  I'd also suggest that sysvinit
might want to consider a mode where, if logsave *is* available, it is
used to save the output of the full init.d boot sequence, and not just
the fsck output.  This would give sysvinit roughly the same
debuggability as journald, which is something that system
administrators could also find extremely useful.
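
A minimal sketch of the "optional logsave" idea, written in Python
purely for illustration (sysvinit's own init scripts are shell).  The
function name and the log path in the comment are hypothetical; the
only thing assumed about logsave is its documented command line,
"logsave [-asv] logfile cmd_prog [...]".

```python
import shutil


def wrap_with_logsave(cmd, logfile, which=shutil.which):
    """Prefix cmd with logsave when it is installed; otherwise return cmd as-is.

    `which` is injectable so the lookup can be faked; by default the
    real PATH is searched.
    """
    logsave = which("logsave")
    if logsave is None:
        # logsave is optional: fall back to running the command directly,
        # with its output going wherever the caller's stdout/stderr go.
        return list(cmd)
    # -s keeps output bracketed by Ctrl-A/Ctrl-B (e.g. fsck progress
    # information) out of the log file.
    return [logsave, "-s", logfile, *cmd]


# Hypothetical usage (the log path is an assumption, not taken from sysvinit):
#   argv = wrap_with_logsave(["fsck", "-A", "-p"], "/var/log/fsck/checkfs")
#   subprocess.run(argv)
```

The same wrapper could be applied to any step of the boot sequence,
not just fsck, which is the "full init.d" mode suggested above.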

If sysvinit *really* wants to take over logsave, I won't really
object.  For one thing, if you only care about ext4, we actually now
have a much more sophisticated way of saving fsck logs.  See the
LOGGING section of the e2fsck.conf man page.  Logsave was designed
back when I was worried about enterprise-grade Reliability,
Availability, and Serviceability, and I worked at a company that cared
about such things at scales up to a handful of mainframes.

What we now have built into fsck.ext4 (aka e2fsck) was designed after
I started working for a company that has to deal with several orders
of magnitude more servers and file systems in data centers all over
the world, with a very small staff of Site Reliability Engineers to
take care of them all.  :-)

However, I really don't think migrating logsave between packages so
it's provided by sysvinit is worth it.  Just make it optional; for
most people running a desktop or a handful of servers, especially if
they are using file systems that don't try to do automated recovery,
it's not going to buy them much anyway.

Cheers,

					- Ted




More information about the Debian-init-diversity mailing list