Add mkfile to keep indices up to date.

Remove man pages for things we don't provide.
rsc 2005-01-03 06:41:38 +00:00
parent 058b0118a5
commit a19c44b83b
18 changed files with 45 additions and 2321 deletions

@@ -1,232 +0,0 @@
.TH VENTI 8
.SH NAME
venti \- an archival block storage server
.SH SYNOPSIS
.B venti/venti
[
.B -dsw
]
[
.B -a
.I ventiaddress
]
[
.B -B
.I blockcachesize
]
[
.B -c
.I config
]
[
.B -C
.I cachesize
]
[
.B -h
.I httpaddress
]
[
.B -I
.I icachesize
]
.PP
.B venti/sync
[
.B -h
.I host
]
.SH DESCRIPTION
.I Venti
is a block storage server intended for archival data.
In a Venti server,
the SHA1 hash of a block's contents acts as the block
identifier for read and write operations.
This approach enforces a write-once policy, preventing accidental or
malicious destruction of data. In addition, duplicate copies of a
block are coalesced, reducing the consumption of storage and
simplifying the implementation of clients.
.PP
Storage for
.I venti
consists of a data log and an index, both of which
can be spread across multiple files.
The files containing the data log are themselves divided into self-contained sections called arenas.
Each arena contains a large number of data blocks and is sized to
facilitate operations such as copying to removable media.
The index provides a mapping between a SHA1 fingerprint and
the location of the corresponding block in the data log.
.PP
The index and data log are typically stored on raw disk partitions.
To improve robustness, the data log should be stored on
a device that provides RAID functionality. The index does
not require such protection since, if necessary, it
can be regenerated from the data log.
The performance of
.I venti
is typically limited by the random access performance
of the index. This performance can be improved by spreading the
index across multiple disks.
.PP
The storage for
.I venti
is initialized using
.IR fmtarenas ,
.IR fmtisect ,
and
.I fmtindex
(see
.IR ventiaux (8)).
A configuration file,
.IR venti.conf (6),
ties the index sections and data arenas together.
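.PP
As a rough sketch of that sequence, assume a hypothetical index partition
.B /dev/sdC0/isect
and an existing
.B venti.conf
that already names both partitions; neither is prescribed by this page.
.IP
.EX
# hypothetical partitions and names; adjust to the local disks
venti/fmtarenas arena. /dev/sdC0/arenas
venti/fmtisect isect0 /dev/sdC0/isect
# tie the formatted sections together, then start the server
venti/fmtindex venti.conf
venti/venti -w -c venti.conf
.EE
.LP
The names
.B arena.
and
.B isect0
are illustrative only; see
.IR ventiaux (8)
for the options each command accepts.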
.PP
A Venti
server is accessed via an undocumented network protocol.
Two client applications are included in this distribution:
.IR vac (1)
and
.IR vacfs (4).
.I Vac
copies files from a Plan 9 file system to Venti, creating an
archive and returning the fingerprint of the root.
This archive can be mounted in Plan 9 using
.IR vacfs .
These two commands enable a rudimentary backup system.
A future release will include a Plan 9 file system that uses
Venti as a replacement for the WORM device of
.IR fs (4).
.PP
The
.I venti
server provides rudimentary status information via
a built-in http server, which serves the following files:
.TP
.B stats
Various internal statistics.
.TP
.B index
An enumeration of the index sections and all non-empty arenas, including various statistics.
.TP
.B storage
A summary of the state of the data log.
.TP
.B xindex
An enumeration of the index sections and all non-empty arenas, in XML format.
.PP
Several auxiliary utilities (see
.IR ventiaux (8))
aid in maintaining the storage for Venti.
With the exception of
.IR rdarena ,
these utilities should generally be run after killing the
.I venti
server.
The utilities are:
.TP
.I checkarenas
Check the integrity of Venti arenas and optionally fix them.
.TP
.I checkindex
Check the integrity of a Venti index and optionally fix it.
.TP
.I buildindex
Rebuild a Venti index from scratch.
.TP
.I rdarena
Extract a Venti arena and write to standard output.
.PD
.PP
Options to
.I venti
are:
.TP
.BI -a " ventiaddress
The network address on which the server listens for incoming connections.
The default is
.LR tcp!*!venti .
.TP
.BI -B " blockcachesize
The size, in bytes, of memory allocated to caching raw disk blocks.
.TP
.BI -c " config
Specifies the
Venti
configuration file.
Defaults to
.LR venti.conf .
.TP
.BI -C " cachesize
The size, in bytes, of memory allocated to caching
Venti
blocks.
.TP
.B -d
Produce various debugging information on standard error.
.TP
.BI -h " httpaddress
The network address of Venti's built-in
http
server.
The default is
.LR tcp!*!http .
.TP
.BI -I " icachesize
The size, in bytes, of memory allocated to caching the index mapping fingerprints
to locations in
.IR venti 's
data log.
.TP
.B -s
Do not run in the background.
Normally,
the foreground process will exit once the Venti server
is initialized and ready for connections.
.TP
.B -w
Enable write buffering. This option increases the performance of writes to
.I venti
at the cost of returning success to the client application before the
data has been written to disk.
The server implements a
.I sync
rpc that waits for completion of all the writes buffered at the time
the rpc was received.
Applications such as
.IR vac (1)
and the
.I sync
command described below
use this rpc to make sure that the data is correctly written to disk.
Use of this option is recommended.
.PD
.PP
The units for the various cache sizes above can be specified by appending a
.LR k ,
.LR m ,
or
.LR g
to indicate kilobytes, megabytes, or gigabytes respectively.
The command line options override options found in the
.IR venti.conf (6)
file.
.PP
.I Sync
connects to a running Venti server and executes a sync rpc
(described with the
.B -w
option above).
If sync exits successfully, it means that all writes buffered at the
time the command was issued are now on disk.
.SH SOURCE
.B /sys/src/cmd/venti
.SH "SEE ALSO"
.IR venti.conf (6),
.IR ventiaux (8),
.IR vac (1),
.IR vacfs (4).
.br
Sean Quinlan and Sean Dorward,
``Venti: a new approach to archival storage'',
.IR "Usenix Conference on File and Storage Technologies" ,
2002.

@@ -1,504 +0,0 @@
.TH VENTIAUX 8
.SH NAME
buildindex,
checkarenas,
checkindex,
conf,
copy,
fmtarenas,
fmtindex,
fmtisect,
rdarena,
rdarenablocks,
read,
wrarena,
wrarenablocks,
write \- Venti maintenance and debugging commands
.SH SYNOPSIS
.B venti/buildindex
[
.B -B
.I blockcachesize
]
[
.B -Z
]
.I venti.config
.I tmp
.PP
.B venti/checkarenas
[
.B -afv
]
.I file
.PP
.B venti/checkindex
[
.B -f
]
[
.B -B
.I blockcachesize
]
.I venti.config
.I tmp
.PP
.B venti/conf
[
.B -w
]
.I partition
[
.I configfile
]
.PP
.B venti/copy
[
.B -f
]
.I src
.I dst
.I score
[
.I type
]
.PP
.B venti/fmtarenas
[
.B -Z
]
[
.B -a
.I arenasize
]
[
.B -b
.I blocksize
]
.I name
.I file
.PP
.B venti/fmtindex
[
.B -a
]
.I venti.config
.PP
.B venti/fmtisect
[
.B -Z
]
[
.B -b
.I blocksize
]
.I name
.I file
.PP
.B venti/rdarena
[
.B -v
]
.I arenapart
.I arenaname
.PP
.B venti/read
[
.B -h
.I host
]
.I score
[
.I type
]
.PP
.B venti/wrarena
[
.B -o
.I fileoffset
]
[
.B -h
.I host
]
.I arenafile
[
.I clumpoffset
]
.PP
.B venti/write
[
.B -h
.I host
]
[
.B -t
.I type
]
[
.B -z
]
.SH DESCRIPTION
These commands aid in the setup, maintenance, and debugging of
Venti servers.
See
.IR venti (8)
and
.IR venti.conf (6)
for an overview of the data structures stored by Venti.
.PP
Note that the units for the various sizes in the following
commands can be specified by appending
.LR k ,
.LR m ,
or
.LR g
to indicate kilobytes, megabytes, or gigabytes respectively.
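.PP
As an illustration only, the suffix arithmetic might look like the following
.IR awk (1)
sketch.
It assumes that a
.LR k
suffix multiplies by 1024, which matches the 8k default block size and the
8192-byte block in the
.I wrarena
example below; this page does not state the multiplier explicitly.
.IP
.EX
# hypothetical helper, not part of venti: expand 8k into bytes
# (assumes k = 1024, m = 1024 k, g = 1024 m)
echo 8k | awk '{
	n = $1 + 0                  # numeric prefix; the suffix is ignored
	u = substr($1, length($1))  # last character of the argument
	m = 1
	if (u == "k") m = 1024
	if (u == "m") m = 1024 * 1024
	if (u == "g") m = 1024 * 1024 * 1024
	print n * m
}'
.EE
.LP
With the input shown, this prints 8192.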
.PP
.I Buildindex
populates the index for the Venti system described in
.IR venti.config .
The index must have previously been formatted using
.IR fmtindex .
This command is typically used to build a new index for a Venti
system when the old index becomes too small, or to rebuild
an index after media failure.
Small errors in an index can usually be fixed with
.IR checkindex .
.PP
The
.I tmp
file, usually a disk partition, must be large enough to store a copy of the index.
This temporary space is used to perform a merge sort of index entries
generated by reading the arenas.
.PP
Options to
.I buildindex
are:
.TP
.BI -B " blockcachesize
The amount of memory, in bytes, to use for caching raw disk accesses while running
.IR buildindex .
(This is not a property of the created index.)
The default is 8k.
.TP
.B -Z
Do not zero the index.
This option should only be used when it is known that the index was already zeroed.
.PD
.PP
.I Checkarenas
examines the Venti arenas contained in the given
.IR file .
The program detects various error conditions, and optionally attempts
to fix any errors that are found.
.PP
Options to
.I checkarenas
are:
.TP
.B -a
For each arena, scan the entire data section.
If this option is omitted, only the end section of
the arena is examined.
.TP
.B -f
Attempt to fix any errors that are found.
.TP
.B -v
Increase the verbosity of output.
.PD
.PP
.I Checkindex
examines the Venti index described in
.IR venti.config .
The program detects various error conditions including:
blocks that are not indexed, index entries for blocks that do not exist,
and duplicate index entries.
If requested, an attempt can be made to fix errors that are found.
.PP
The
.I tmp
file, usually a disk partition, must be large enough to store a copy of the index.
This temporary space is used to perform a merge sort of index entries
generated by reading the arenas.
.PP
Options to
.I checkindex
are:
.TP
.BI -B " blockcachesize
The amount of memory, in bytes, to use for caching raw disk accesses while running
.IR checkindex .
The default is 8k.
.TP
.B -f
Attempt to fix any errors that are found.
.PD
.PP
.I Fmtarenas
formats the given
.IR file ,
typically a disk partition, into a number of
Venti
arenas.
The arenas are given names of the form
.IR name%d ,
where
.I %d
is replaced with a sequential number starting at 0.
.PP
Options to
.I fmtarenas
are:
.TP
.BI -a " arenasize
The arenas are of
.I arenasize
bytes. The default is 512 megabytes, which was selected to provide a balance
between the number of arenas and the ability to copy an arena to external
media such as recordable CDs and tapes.
.TP
.BI -b " blocksize
The size, in bytes, for read and write operations to the file.
The size is recorded in the file, and is used by applications that access the arenas.
The default is 8k.
.TP
.B -Z
Do not zero the data sections of the arenas.
Using this option reduces the formatting time
but should only be used when it is known that the file was already zeroed.
.PD
.PP
.I Fmtindex
takes the
.IR venti.conf (6)
file
.I venti.config
and initializes the index sections to form a usable index structure.
The arena files and index sections must have previously been formatted
using
.I fmtarenas
and
.I fmtisect
respectively.
.PP
The function of a Venti index is to map a SHA1 fingerprint to a location
in the data section of one of the arenas. The index is composed of
blocks, each of which contains the mapping for a fixed range of possible
fingerprint values.
.I Fmtindex
determines the mapping between SHA1 values and the blocks
of the collection of index sections. Once this mapping has been determined,
it cannot be changed without rebuilding the index.
The basic assumption in the current implementation is that the index
structure is sufficiently empty that individual blocks of the index will rarely
overflow. The total size of the index should be about 2% to 10% of
the total size of the arenas, but the exact size depends on both the index block size
and the compressed size of the blocks stored in Venti.
.PP
.I Fmtindex
also computes a mapping between a linear address space and
the data section of the collection of arenas. The
.B -a
option can be used to add additional arenas to an index.
To use this feature,
add the new arenas to
.I venti.config
after the existing arenas and then run
.I fmtindex
.BR -a .
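.PP
For instance, a minimal sketch of growing a server, assuming a hypothetical new partition
.B /dev/sdC1/arenas
and a hypothetical arena name prefix
.BR arena1. :
.IP
.EX
# hypothetical partition and names; stop the venti server first
venti/fmtarenas arena1. /dev/sdC1/arenas
# edit venti.conf to list the new partition after the existing arenas
venti/fmtindex -a venti.conf
.EE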
.PP
A copy of the above mappings is stored in the header for each of the index sections.
These copies enable
.I buildindex
to restore a single index section without rebuilding the entire index.
.PP
.I Fmtisect
formats the given
.IR file ,
typically a disk partition, as a Venti index section with the specified
.IR name .
One or more formatted index sections are combined into a Venti
index using
.IR fmtindex .
Each of the index sections within an index must have a unique name.
.PP
Options to
.I fmtisect
are:
.TP
.BI -b " blocksize
The size, in bytes, for read and write operations to the file.
All the index sections within an index must have the same block size.
The default is 8k.
.TP
.B -Z
Do not zero the index.
Using this option reduces the formatting time
but should only be used when it is known that the file was already zeroed.
.PD
.PP
.I Rdarena
extracts the named
.I arenaname
from the arena partition
.I arenapart
and writes this arena to standard output.
This command is typically used to back up an arena to external media.
The
.B -v
option generates more verbose output on standard error.
.PP
.I Wrarena
writes the blocks contained in the arena
.I arenafile
(typically, the output of
.IR rdarena )
to a Venti server.
It is typically used to reinitialize a Venti server from backups of the arenas.
For example,
.IP
.EX
venti/rdarena /dev/sdC0/arenas arena.0 >external.media
venti/wrarena -h venti2 external.media
.EE
.LP
writes the blocks contained in
.B arena.0
to the Venti server
.B venti2
(typically not the one using
.BR /dev/sdC0/arenas ).
.PP
The
.B -o
option specifies that the arena starts at byte
.I fileoffset
(default
.BR 0 )
in
.IR arenafile .
This is useful for reading directly from
the Venti arena partition:
.IP
.EX
venti/wrarena -h venti2 -o 335872 /dev/sdC0/arenas
.EE
.LP
(In this example, 335872 is the offset shown in the Venti
server's index list (344064) minus one block (8192).
You will need to substitute your own arena offsets
and block size.)
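.LP
The subtraction can be checked with
.IR awk (1);
the two numbers are the ones from the example above and must be replaced with your own:
.IP
.EX
# offset from the index list minus one block size
awk 'BEGIN { print 344064 - 8192 }'
.EE
.LP
This prints 335872, the value passed to
.B -o
above.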
.PP
Finally, the optional
.I clumpoffset
argument specifies that the writing should begin with the
clump starting at
.I clumpoffset
within the arena.
.I Wrarena
prints the offset it stopped at (because there were no more data blocks).
This could be used to incrementally back up a Venti server
to another Venti server:
.IP
.EX
last=`{cat last}
venti/wrarena -h venti2 -o 335872 /dev/sdC0/arenas $last >output
awk '/^end offset/ { print $3 }' output >last
.EE
.LP
Of course, one would need to add wrapper code to keep track
of which arenas have been processed.
See
.B /sys/src/cmd/venti/backup.example
for a version that does this.
.PP
.I Read
and
.I write
read and write blocks from a running Venti server.
They are intended to ease debugging of the server.
The default
.I host
is taken from the environment variable
.B $venti
or, if that is not set, from the network metaname
.BR $venti .
The
.I type
is the decimal type of block to be read or written.
If no
.I type
is specified for
.IR read ,
all types are tried, and a command line is printed to
show the type that eventually worked.
If no
.I type
is specified for
.IR write ,
.B VtDataType
(13)
is used.
.I Read
reads the block named by
.I score
(a SHA1 hash)
from the Venti server and writes it to standard output.
.I Write
reads a block from standard input and attempts to write
it to the Venti server.
If successful, it prints the score of the block on the server.
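.PP
A round trip through a server could be sketched as follows, assuming a server named
.B venti2
(as in the examples above) and assuming that
.I write
prints nothing but the score on standard output:
.IP
.EX
# hypothetical host; store one block, then fetch it back by score
score=`{echo hello | venti/write -h venti2}
venti/read -h venti2 $score
.EE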
.PP
.I Copy
walks the entire tree of blocks rooted at
.IR score ,
copying all the blocks visited during the walk from
the Venti server at network address
.I src
to the Venti server at network address
.IR dst .
If
.I type
(a decimal block type for
.IR score )
is omitted, all types will be tried in sequence
until one is found that works.
The
.B -f
flag runs the copy in ``fast'' mode: if a block is already on
.IR dst ,
the walk does not descend below it, on the assumption that all its
children are also already on
.IR dst .
Without this flag, the copy often transfers many times more
data than necessary.
.PP
To make it easier to bootstrap servers, the configuration
file can be stored at the beginning of any Venti partition using
.IR conf .
A partition so branded with a configuration file can
be used in place of a configuration file when invoking any
of the venti commands.
By default,
.I conf
prints the configuration stored in
.IR partition .
When invoked with the
.B -w
flag,
.I conf
reads a configuration file from
.I configfile
(or else standard input)
and stores it in
.IR partition .
.SH SOURCE
.B /sys/src/cmd/venti
.SH "SEE ALSO"
.IR venti (8),
.IR venti.conf (6)
.SH BUGS
.I Buildindex
should allow an individual index section to be rebuilt.
The merge sort could be performed in the space used to store the
index rather than requiring a temporary file.