The CellServDB file contains a database of known cells, and is loaded by the cache manager on startup. On machines running the standard computing environment, this file is located in /etc/openafs, and is generated automatically from a number of sources (listed below). To update the CellServDB file and notify the running cache manager of the changes, run /usr/adm/fac/bin/cellservdb.
| Filename | Source |
|---|---|
| /etc/openafs/CellServDB.Transarc | worldwide database from IBM (defunct) |
| /etc/openafs/CellServDB.GCO | worldwide database from GRAND.CENTRAL.ORG |
| /etc/openafs/CellServDB.CMUCS | site-global, platform-independent entries |
| /etc/openafs/CellServDB.group | project-global entries |
| /etc/openafs/CellServDB.local | entries local to this machine |
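For example, to make the cache manager aware of a cell that is not listed in any of the global sources, an entry can be added to CellServDB.local and the update script re-run. The cell name, addresses, and server hostnames below are made up; the entry follows the usual CellServDB syntax of a >cellname line followed by one line per database server:

```
>example.org            #Example cell, added locally
192.0.2.10              #afsdb1.example.org
192.0.2.11              #afsdb2.example.org
```

After editing the file, run /usr/adm/fac/bin/cellservdb to regenerate /etc/openafs/CellServDB and notify the running cache manager.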
The CellAlias file contains cell name aliases, which appear as symbolic links in the dynamically-generated AFS root volume when the dynamic root feature is enabled (see the next section). On machines running the standard computing environment, this file is located in /etc/openafs, and is generated automatically from a number of sources (listed below). To update the CellAlias file and notify the running cache manager of the changes, run /usr/adm/fac/bin/cellservdb.
| Filename | Source |
|---|---|
| /etc/openafs/CellAlias.global | site-global, platform-independent aliases |
| /etc/openafs/CellAlias.group | project-global aliases |
| /etc/openafs/CellAlias.local | aliases local to this machine |
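For example, a local alias can be added to CellAlias.local; assuming the usual CellAlias syntax, each line contains a full cell name followed by the alias under which it should appear in /afs (the cell name and alias below are made up):

```
example.org     example
```

As with CellServDB, run /usr/adm/fac/bin/cellservdb afterwards to regenerate the file and notify the running cache manager.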
On workstations running the standard SCS computing environment, options related to
AFS configuration can be set as Quirks. Quirk values may be set by creating
or editing the file /etc/quirk.local, which should contain lines of the form
name=value. The quirks described here take effect only during AFS startup,
which normally happens while the system is booting. The following AFS-related
quirks are currently available:
There are six cache tuning parameters which may be included in the afs_cache_parms quirk. At present the default values for these are rather poor, because they were designed for the workstations, networks, and fileservers of the 1980s rather than for those of today. That is expected to change in OpenAFS 1.4; however, there may still be cases where it is necessary to tune the cache manually for unusual situations. This section describes the six parameters and the best current practice (as of this writing) for selecting their values. At the end of this section is a brief summary of formulas which can be used to choose cache parameters for a general-purpose system.
-blocks (Cache Blocks) This parameter specifies the total number of 1K blocks in the cache. It is normally set in the third field of /etc/openafs/cacheinfo, and should not be overridden via the afs_cache_parms quirk. In our environment, the AFS cache is normally on its own filesystem; in such configurations, the value of this parameter should not exceed 90% of the usable size of the cache filesystem. The correct value is normally computed during installation.
-chunksize (Cache Chunk Size) This parameter controls the maximum size of a cache chunk, and also the size of read requests made to the fileserver. A tradeoff is required here; larger chunks allow for more efficient data transfer of large files, but may impact performance when accessing files for which there is heavy contention. In addition, on a system with a very small cache, setting the chunk size too large may allow only a few files to fill the entire cache. Current thinking puts an appropriate setting at somewhere between 256K and 1MB, with smaller chunk sizes for machines with unusually small caches.
For a general-purpose system, set the chunk size to 1MB, but never to more than 1/1000 the number of blocks in the cache or, if the number of cache files is configured manually, more than blocks/files, whichever of those two limits is larger.
For a special-purpose system where the properties of the working set are well known, it may be desirable to set the chunk size based on those properties. For example, if nearly all of the files in the working set are very large, change infrequently, and are not heavily contended, then a larger chunk size may be appropriate. If in doubt, use the general-purpose rule described above.
The chunk size is specified in bytes as a power of 2, via the -chunksize parameter. A setting like '-chunksize 20' indicates a 1MB chunk size.
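The sketch below (Python, purely illustrative and not part of any shipped tooling) applies the general-purpose rule above to pick a -chunksize exponent; the 8K floor comes from the analysis reproduced at the end of this page.

```python
import math

def pick_chunksize(blocks, files=None):
    # Aim for 1MB (2^20) chunks, but cap at the larger of blocks/1000 and
    # blocks/files (both in 1K blocks), and never go below 8K (2^13).
    cap_kb = blocks / 1000.0
    if files:
        cap_kb = max(cap_kb, blocks / float(files))
    exp = int(math.log(cap_kb * 1024, 2))    # cap expressed as a power-of-2 exponent
    return max(13, min(20, exp))

print(pick_chunksize(1843200))   # ~1.8M-block cache -> 20 (1MB chunks)
print(pick_chunksize(51200))     # 50MB cache -> 15 (32K chunks)
```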
-files (Number of Cache Chunks) This parameter controls the maximum number of distinct chunks which can be in the cache at one time. This is the same as the number of Vnnnn files in the cache directory; each such file holds a single chunk. Since the previous two parameters (blocks and chunksize) are merely maximums, there can be, and usually are, more files than would be strictly required to utilize the entire cache if every chunk were full. This is because cache chunks are often not full; a chunk may hold data from a file smaller than the maximum chunk size, or the tail end of a large file.
Because chunk files are initialized during startup and more files cannot be created once AFS is running, it is important to have enough files to fully utilize the cache; otherwise, space is wasted.
There cannot be simultaneous read/write operations on more AFS files than there are cache chunks; in the unlikely event of sustained attempts to exceed this limit, the system will eventually panic with the message "getdcache".
In order to ensure there are enough cache files, three rules have been developed, with the intent that the largest resulting value be used. Note that the constants used in these rules are a matter of active discussion among experts in this area; the values shown here represent best current practice:
Because each chunk occupies kernel memory, it is important not to have too many chunks, either. While this issue is still being studied, recent discussion on the openafs-devel mailing lists suggests that the number of chunk files be limited to not more than about one million. Except on machines with extremely large caches, the number of files suggested by the above rules should be well below this number already.
For a general-purpose machine, simply apply the rules above.
For a special-purpose machine where the size of the working set in chunks (or whole files, for large files) is predictable, the number of files should be set to the working set size, plus a certain amount of overhead. Note that the cache manager always caches chunk-sized blocks of data, not necessarily whole files. In no event should the number of files be set to less than the constant recommended by the first rule above. Because the cache controlled by this parameter is used for on-disk storage of file data retrieved over the network, the working set size should be computed over a relatively long period of time (say, at least a day).
The number of chunk files is specified via the -files parameter.
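An illustrative sketch (Python) of the file-count rules, using the constants proposed in the analysis reproduced at the end of this page: one file per 32 cache blocks, enough files for the cache to be full of chunks 2/3 of the maximum size, a floor of 1000 files, and a ceiling of about one million.

```python
def pick_files(blocks, chunksize_exp):
    chunk_kb = (2 ** chunksize_exp) / 1024.0        # maximum chunk size in KB
    rule_a = blocks // 32                           # one file per 32 cache blocks
    rule_b = int(blocks / (chunk_kb * 2.0 / 3.0))   # cache full of 2/3-size chunks
    rule_c = 1000                                   # absolute minimum
    return min(max(rule_a, rule_b, rule_c), 1000000)

print(pick_files(1843200, 20))   # -> 57600; the blocks/32 rule dominates with 1MB chunks
```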
-dcache (Number of Data Cache Entries) This parameter controls the number of in-memory dcache entries; this is essentially a cache of the on-disk index of the contents of cache chunks. It is pointless to set this parameter larger than the number of cache files, since each dcache entry always holds data about a distinct cache file.
For a general-purpose machine, this value should be set to one half of the number of cache files, but not less than 2000 or more than 10000. The exception is that if the total number of cache files is less than 2000, then this parameter should be set to the number of cache files.
For a special-purpose machine where the working set size in chunks is known, this parameter should be based on the size of the working set. Because the cache controlled by this parameter is of a relatively large number of relatively small items stored on the local disk, the working set size can be computed over a fairly short period of time (say, several minutes).
The number of dcache entries is specified via the -dcache parameter.
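An illustrative sketch (Python) of the general-purpose dcache rule just described:

```python
def pick_dcache(files):
    if files < 2000:
        return files                          # never more dcache entries than cache files
    return min(max(files // 2, 2000), 10000)  # files/2, clamped to 2000..10000

print(pick_dcache(57600))   # -> 10000
print(pick_dcache(1500))    # -> 1500
```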
-stat (Number of Vnode Stat Cache Entries) This parameter controls the size of the cache of AFS file metadata. This cache is independent of the data cache; there can be vcache entries for which there are no chunks cached, and there can be cached chunks for which there is currently no vcache entry. If this value is set too large, then kernel memory resources will be wasted. If it is set too small, then the workstation will have to keep fetching file metadata from the fileserver, which impacts performance and increases fileserver load.
There cannot be simultaneous operations on more AFS files than there are vcache entries; if this limit is exceeded, the system will panic with the message "Exceeded pool of AFS vnodes(VLRU cycle?)".
For general-purpose machines, the best current practice is to assume that the number of files a client will be interested in correlates well with the number of chunks, and to set the number of stat cache entries relative to the dcache size, scaled according to the chunk size, as follows:
| if chunksize is... | set stat cache size to... |
|---|---|
| 2^13 or less | dcache / 4 |
| 2^14 or 2^15 | dcache |
| 2^16 or more | dcache * 1.5 |
For special-purpose machines where the working set size in AFS files is predictable, this parameter should be set based on the working set size, plus a certain amount of overhead. Because this cache contains metadata which must be fetched from the fileserver but is small and may change fairly frequently (strictly more frequently than cached data), the working set size can be averaged over a relatively small period of time (say, around an hour).
The number of vnode stat cache entries is specified via the -stat parameter.
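An illustrative sketch (Python) of the stat cache rule from the table above:

```python
def pick_stat(dcache, chunksize_exp):
    if chunksize_exp <= 13:        # 8K chunks or smaller
        return dcache // 4
    if chunksize_exp <= 15:        # 16K or 32K chunks
        return dcache
    return int(dcache * 1.5)       # 64K chunks or larger

print(pick_stat(10000, 20))   # -> 15000
```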
-volumes (Number of Volume Cache Entries) This parameter controls the size of the cache of volume location information. As usual, a certain amount of balance is required; too many entries will waste kernel memory storage, while too few will result in too-frequent requests to the VLDB.
There cannot be simultaneous operations involving more AFS volumes than there are volume cache entries. If this limit is exceeded, the system will panic with the message "getvolslot none".
For general-purpose single-user machines, a value of 200 is considered sufficient.
For special-purpose machines where the working set size of volumes is known, this parameter should be set based on that value, plus a certain amount of overhead. Volume location information rarely changes, but is never cached for more than two hours. Thus, the working set size should be computed over a period of two hours.
The number of volume location cache entries is specified via the -volumes parameter.
| Parameter | Suggested Value | Notes |
|---|---|---|
| -blocks | 90% of cache partition | set in cacheinfo file |
| -chunksize | 1MB | specified as a power of 2 |
| -files | blocks/32 | not more than 1000000 |
| -dcache | files/2 | 2000 <= dcache <= 10000 |
| -stat | dcache * 1.5 | |
| -volumes | 200 | |
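As a worked example, consider a hypothetical machine whose cacheinfo file specifies 1,843,200 cache blocks (a bit under 2GB of cache). Applying the table gives -chunksize 20, -files 57600, -dcache 10000, -stat 15000, and -volumes 200. On an SCS workstation these could be supplied through the afs_cache_parms quirk in /etc/quirk.local; the quoting shown here is illustrative and may not be exactly what the quirk machinery expects:

```
afs_cache_parms="-chunksize 20 -files 57600 -dcache 10000 -stat 15000 -volumes 200"
```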
Much of the above advice on cache tuning comes from a message which I posted to the openafs-devel mailing list on May 12, 2005. I have reproduced below the analysis contained in that message.
0) The cacheBlocks is always configured, though on some platforms the
startup scripts attempt to select an appropriate value based on the
amount of available disk space. I see no reason to change this.
1) The chunkSize is the maximum size of any given cache chunk, and is
also the size of data transfers done on the wire. It should probably
be set such that a substantial fraction of files will fit in a single
chunk, for efficient wire transfers (recall that you get streaming
only _within_ a FetchData RPC). However, if you have a particularly
small cache you want smaller chunks, to avoid having a few chunks use
the entire cache.
The current default chunksize is 16 (64K). I agree that this is
probably a couple of orders of magnitude too small for today.
Instead I'd suggest a default in the range of 18-20, subject to
the restriction that the chunk size should not be larger than
(cacheBlocks/1000) or (cacheBlocks/cacheFiles), whichever is larger.
Note that if cacheFiles is computed, it will satisfy this rule.
In no case should we use a computed chunk size smaller than 8K.
2) The cacheFiles is the number of distinct data chunks which can be
in the cache at once. This is both the number of VNNNN files in the
cache directory and the number of entries in the CacheItems file.
The default behavior is to compute the number of files such that
(a) There is at least one file for every 10 cache blocks.
(b) There are at least enough files for the cache to be full of
chunks which are 2/3 the maximum chunk size.
(c) There are at least 100 files.
This is then clipped to ensure that we don't have more files than
the number of blocks that will be available in the cache after the
CacheItems and VolumeItems files. The CellItems file is not currently
taken into account.
For rule (b) to have any effect, the chunk size would have to be less
than 14 (16K). The only way this can happen is if the chunk size is
hand-configured to be very small, or if the cache is very small. And
for rule (c) to have any effect, you'd have to have a cache smaller
than 1MB. So yes, for reasonable values of cacheBlocks and chunkSize,
rule (a) will dominate and you'll get cacheBlocks/10 files.
However, I'm not convinced that these rules are still right for today.
The right number of files is essentially a function of the cache size
and the expected average chunk size. Now, if we choose a large value
for chunkSize as suggested above, then most chunks will contain whole
files, and the average chunk size will be dominated by the average
file size. I think we can expect the average file size to be more or
less a constant, and this is what rule (a) is intended to accomplish.
However, I doubt the average file these days is as small as 10K. In
fact, a quick scan over the contents of my cell shows an average file
size of 50K, to the extent to which volume header data is valid.
So, I'd set the rule (a) limit to something conservative, like 32.
On the other hand, if the chunk size is a small value, then rule (b)
kicks in, making sure we have room for partially-full chunks even
when the maximum chunk size is quite small. We can probably leave
this rule alone.
Finally, I'd suggest increasing the rule (c) limit to 1000 files,
rather than only 100. I'm sorry, but 100 just seems tiny today.
Also, we should adjust the max-files computation to take into account
the expected size of the CellItems file (mine is about 4K).
3) OK, so much for the disk cache; now on to the in-memory structures.
The CacheItems file is used to store an index of all the cache files,
containing the FID, DV, offset, and size of each file in the cache,
along with some additional information. It is structured as an array,
with one entry (about 40 bytes) for each cache file. Rather than keep
all of this data in memory or keep searching it on disk, the cache
manager keeps a subset of this data in memory, in dcache entries.
The dCacheSize is the number of these entries that are kept in memory.
The default dCacheSize is currently half the number of cache files,
but not less than 300 and not more than 2000. I agree this range is
probably too small. Something in the range of 2000-10000 would seem
reasonable; however, it should _never_ be larger than cacheFiles.
The dCacheSize setting should approximate the size of the workstation's
working set of chunks. If the chunk size is large, this is close to
the number of files whose contents (not metadata) are in the working
set. If the chunk size is very small, then it's probably some multiple
of that number, though it likely gets complex.
Unfortunately, I don't know a good way to guess what the size of a
random machine's working set is going to be. So we're probably back
to using some property of the cache (cacheBlocks or cacheFiles) as an
approximation. The existing code uses cacheFiles/2, which might be
a little over-aggressive, but I suspect that cacheFiles/10 is on the
low side. Let's keep it at cacheFiles/2 for now.
4) The vcache stores metadata about files in AFS. Any time we need to
get information about a file that is not in the vcache, we must make
an RPC to the fileserver. So, you don't want the vcache to be too
small, since that would result in lots of extra RPC's and considerable
performance loss. The ideal vcache size approximates the size of the
workstation's working set of AFS files, including files for which we
only care about metadata.
It is worth noting that on most platforms, vcache entries contain
vnodes, but these are _not_ drawn from the system vnode pool. So, the
size of the system vnode pool has little bearing on the vcache size.
Even on those platforms where AFS uses vnodes from the system pool, it
is important to remember that vcache entries cache information obtained
via fileserver RPC's, and so throwing them away is somewhat costly.
When possible, such platforms should be structured such that it is
possible to have vcache entries without associated vnodes, so that we
are not obligated to limit the vcache size or tie up a substantial
fraction of the system vnode pool.
The default vcache size is set to 300, which is probably way too small.
Unfortunately, I still don't know a good way to approximate the size of
a workstation's working set. However, the problem is similar to the
problem of sizing the dcache, so I'll propose making them dependent
on each other, based on the chunk size:
- chunkSize <= 13: cacheStatEntries = dCacheSize / 4
- chunkSize 14-15: cacheStatEntries = dCacheSize
- chunkSize >= 16: cacheStatEntries = dCacheSize * 1.5
Further, if cacheStatEntries is configured and dCacheSize is not,
then perhaps we should set dCacheSize based on these formulas rather
than on cacheFiles, since the configured setting is more likely to
reflect the user's impression of the working set size and the amount
of memory available to devote to AFS.
5) The volume cache stores cached information about volumes, including
name-to-ID mappings, which volumes have RO clones, and where they are
located. The size of the volume cache should approximate the size of
the workstation's working set of volumes. Entries in this cache are
updated every 2 hours whether they need it or not, so unless you have
a busy machine accessing lots of volumes at once, a pretty small
number will probably be fine.
Even though it was set for the small memory sizes of the 1980's, the
default value of 50 is probably sufficient for single-user systems.
For a larger multi-user system, a larger value might be appropriate.
I'm going to go out on a limb here and guess that such a system ought
to have something like 3-5 volcache entries per active user, bearing
in mind that some of these will be used to cover shared volumes and
things used by the system.
It seems appropriate to make sure the default is sufficient for both
single-user workstations and multi-user systems with a small number of
active users. To that end, I'll propose bumping the default number
of volume cache entries to, say, 200.
-- JeffreyHutzelman - 06 Sep 2005