applychanges, applylog, compactdb, cphist, updatedb - simple client-server replica management


replica/compactdb db
replica/updatedb [ -cl ] [ -p proto ] [ -r root ] [ -t now n ] [ -u uid ] [ -x path ] ... db
replica/applylog [ -nuv ] [ -c name ] ... [ -s name ] ... clientdb clientroot serverroot [ path ... ]
replica/applychanges [ -nuv ] [ -p proto ] [ -x path ] ... clientdb clientroot serverroot [ path ... ]
replica/cphist [ -evn ] [ -l lineno ] clientroot serverroot oldserverroot


These five tools collectively provide simple log-based client-server replica management. The shell scripts described in replica(1) provide a more polished interface.
Both client and server maintain textual databases of file system metadata. Each line is of the form
path mode uid gid mtime length
Later entries for a path supersede previous ones. A line with the string REMOVED in the mode field annuls all previous entries for that path. The entries in a file are typically kept sorted by path but need not be. These properties facilitate updating the database atomically by appending to it. Compactdb reads in a database and writes out an equivalent one, sorted by path and without outdated or annulled records.
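The supersede and annul rules above can be illustrated with a short Python sketch. This is not the compactdb source; the function name compact is hypothetical, and the sketch assumes whitespace-separated fields and paths without embedded spaces.

```python
REMOVED = "REMOVED"

def compact(lines):
    """Return an equivalent database: one live record per path, sorted by path.

    Assumption of this sketch: fields are whitespace-separated and paths
    contain no spaces. Malformed lines are skipped.
    """
    latest = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue                      # blank or malformed line
        path, mode = fields[0], fields[1]
        if mode == REMOVED:
            latest.pop(path, None)        # annuls all previous entries for path
        elif len(fields) == 6:
            latest[path] = " ".join(fields)  # a later entry supersedes earlier ones
    return [latest[p] for p in sorted(latest)]
```

Because only the last record for each path matters, appending a line is enough to update or delete an entry, which is why the database can be updated atomically.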
A replica is further described on the server by a textual log listing creation and deletion of files and changes to file contents and metadata. Each line is of the form:
time gen verb path serverpath mode uid gid mtime length
The time and gen fields are both decimal numbers, providing an ordering for log entries so that incremental tools need not process the whole log each time they are run. The verb, a single character, describes the event: addition of a file (a), deletion of a file (d), a change to a file's contents (c), or a change to a file's metadata (m). Path is the file path on the client; serverpath is the path on the server (these are different when the optional fifth field in a proto file line is given; see proto(2)). Mode, uid, gid, mtime, and length are the file's metadata as in the Dir structure (see stat(5)). For deletion events, the metadata is that of the deleted file. For other events, the metadata is that after the event.
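As an illustration of the log format, the following Python sketch parses a log line and selects the entries that come after a given (time, gen) position, which is how an incremental tool could skip what it has already processed. The names LogEntry, parse_log_line, and entries_after are hypothetical; the sketch assumes whitespace-separated fields, paths without spaces, and decimal mtime and length.

```python
from typing import NamedTuple

VERBS = {"a", "d", "c", "m"}   # add, delete, contents change, metadata change

class LogEntry(NamedTuple):
    time: int
    gen: int
    verb: str
    path: str
    serverpath: str
    mode: str        # kept as text; this sketch does not interpret it
    uid: str
    gid: str
    mtime: int
    length: int

def parse_log_line(line):
    """Split one log line into its ten fields."""
    f = line.split()
    if len(f) != 10:
        raise ValueError("expected 10 fields, got %d" % len(f))
    if f[2] not in VERBS:
        raise ValueError("unknown verb %r" % f[2])
    return LogEntry(int(f[0]), int(f[1]), f[2], f[3], f[4],
                    f[5], f[6], f[7], int(f[8]), int(f[9]))

def entries_after(lines, time, gen):
    """Return the entries ordered strictly after (time, gen)."""
    entries = [parse_log_line(l) for l in lines]
    return [e for e in entries if (e.time, e.gen) > (time, gen)]
```

Comparing the (time, gen) pair as a tuple orders entries first by time and then by gen within the same time.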
Updatedb scans the file system rooted at root for changes not present in db, noting them by appending new entries to the database and by writing log events to standard output. The -c option causes updatedb to consider only file and metadata changes, ignoring file additions and deletions. By default, the log events have time set to the current system time and use incrementing gen numbers starting at 0. The -t option can be used to specify a different time (now) and starting number (n). If the -u option is given, all database entries and log events will use uid rather than the actual uids. The -x option (which may be specified multiple times) excludes the named path and all its children from the scan. If the -l option is given, the database is not changed and the time and gen fields are omitted from the log events; the resulting output is intended to be a human-readable summary of file system activity since the last scan.
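The kind of comparison such a scan performs can be sketched in Python as follows. This is an assumption-laden illustration, not the updatedb implementation: the function diff_scan and its dictionary inputs are hypothetical, and a contents change is inferred from mtime and length alone. The changes_only argument mimics the effect described for the -c option.

```python
def diff_scan(db, scan, changes_only=False):
    """Compare recorded metadata with a fresh scan and return log events.

    db and scan map path -> (mode, uid, gid, mtime, length).
    Each event is (verb, path, metadata).
    """
    events = []
    for path in sorted(set(db) | set(scan)):
        old, new = db.get(path), scan.get(path)
        if old is None:
            if not changes_only:
                events.append(("a", path, new))   # file added
        elif new is None:
            if not changes_only:
                events.append(("d", path, old))   # deleted: metadata of the deleted file
        elif (old[3], old[4]) != (new[3], new[4]):
            events.append(("c", path, new))       # mtime or length differs: contents changed
        elif old[:3] != new[:3]:
            events.append(("m", path, new))       # only mode, uid, or gid differs
    return events
```

For deletions the event carries the old metadata, and for all other events the new metadata, matching the log description.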
Applylog is used to propagate changes from server to client. It applies the changes listed in a log (read from standard input) to the file system rooted at clientroot, copying files when necessary from the file system rooted at serverroot. By default, applylog does not attempt to set the uid on files; the -u flag enables this. Applylog will not overwrite local changes made to replicated files. When it detects such conflicts, by default it prints an error describing the conflict and takes no action. If the -c flag is given, applylog still takes no action for files beginning with the given names, but does so silently and will not report the conflicts in the future. (The conflict is resolved in favor of the client.) The -s flag is similar but causes applylog to overwrite the local changes. (The conflict is resolved in favor of the server.)
Applychanges is, in some sense, the opposite of applylog; it scans the client file system for changes, and applies those changes to the server file system. Applychanges will not overwrite remote changes made to replicated files. For example, if a file is copied from server to client and subsequently changed on both server and client, applychanges will not copy the client's new version to the server, because the server also has a new version. Applychanges and applylog detect the same conflicts; to resolve conflicts reported by applychanges, invoke applylog with the -c or -s flags.
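One way to picture the conflict rule shared by the two tools is the three-way comparison sketched below in Python. It is a simplification under stated assumptions rather than the tools' actual logic: the function classify is hypothetical, each state is reduced to an (mtime, length) pair or None for an absent file, and identical changes on both sides are treated as already in sync.

```python
def classify(recorded, client, server):
    """Decide what to do with one file.

    recorded is the state noted in the client database at the last sync;
    client and server are the current states on each side.
    """
    if client == server:
        return "in sync"              # nothing to copy (assumption of this sketch)
    client_changed = client != recorded
    server_changed = server != recorded
    if client_changed and server_changed:
        return "conflict"             # neither tool overwrites the other side
    if server_changed:
        return "apply to client"      # the direction applylog handles
    return "apply to server"          # the direction applychanges handles
```

In these terms, the -c flag of applylog settles a conflict by keeping the client's state and -s by taking the server's.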
Cphist was designed to copy a dump faithfully from one file server to the next. Unlike other replica tools, it takes special precautions to ensure that whenever files on the server and oldserver share a qid.path, the client's corresponding file will be modified, and whenever they do not, the client's file will be deleted and a copy made from server. Thus the qid relationship between the file on server and oldserver will be the same as for the current file on the client and that file in the client's most recent dump. Also, muid is set to preserve the output of history(1). The -e flag toggles exiting on first error; the default is to exit on any error. The -l flag sets the first line of the log (the gen field) to be processed, -v enables verbose output, and -n prints what cphist would do without doing it.


One might keep a client kfs file system up-to-date against a server file system using these tools. First, connect to a CPU server with a high-speed network connection to the file server and scan the server file system, updating the server database and log:
9fs $fs
replica/updatedb -p $proto -r /n/$fs -x $repl $db >>$log
replica/compactdb $db >/tmp/a && mv /tmp/a $db
Then, update the client file system:
9fs $fs
9fs kfs
replica/applylog $db /n/kfs /n/$fs <$log
replica/compactdb $db >/tmp/a && mv /tmp/a $db
The $repl directory is excluded from the sync so that multiple clients can each have their own local database. The shell scripts in /rc/bin/replica are essentially a further development of this example.
The Plan 9 distribution update program operates similarly, but omits the first scan; it is assumed that the Plan 9 developers run scans manually when the distribution file system changes. The manual page replica(1) describes this in full.




These tools (excepting cphist) assume that mtime combined with length is a good indicator of changes to a file's contents.
Perhaps cphist should inspect the source files directly rather than using a replica log, as it picks through the oldserver to discover muid and other bits of information not in the log.