| Commit message (Collapse) | Author | Age | Files | Lines |
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-fscache: (41 commits)
NFS: Add mount options to enable local caching on NFS
NFS: Display local caching state
NFS: Store pages from an NFS inode into a local cache
NFS: Read pages from FS-Cache into an NFS inode
NFS: nfs_readpage_async() needs to be accessible as a fallback for local caching
NFS: Add read context retention for FS-Cache to call back with
NFS: FS-Cache page management
NFS: Add some new I/O counters for FS-Cache doing things for NFS
NFS: Invalidate FsCache page flags when cache removed
NFS: Use local disk inode cache
NFS: Define and create inode-level cache objects
NFS: Define and create superblock-level objects
NFS: Define and create server-level objects
NFS: Register NFS for caching and retrieve the top-level index
NFS: Permit local filesystem caching to be enabled for NFS
NFS: Add FS-Cache option bit and debug bit
NFS: Add comment banners to some NFS functions
FS-Cache: Make kAFS use FS-Cache
CacheFiles: A cache that backs onto a mounted filesystem
CacheFiles: Export things for CacheFiles
...
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
nfs_readpage_async() needs to be non-static so that it can be used as a
fallback for the local on-disk caching should an EIO crop up when reading the
cache.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add some new NFS I/O counters for FS-Cache doing things for NFS. A new line is
emitted into /proc/pid/mountstats if caching is enabled that looks like:
fsc: <rok> <rfl> <wok> <wfl> <unc>
Where <rok> is the number of pages read successfully from the cache, <rfl> is
the number of failed page reads against the cache, <wok> is the number of
successful page writes to the cache, <wfl> is the number of failed page writes
to the cache, and <unc> is the number of NFS pages that have been disconnected
from the cache.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Bind data storage objects in the local cache to NFS inodes.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Define and create superblock-level cache index objects (as managed by
nfs_server structs).
Each superblock object is created in a server level index object and is itself
an index into which inode-level objects are inserted.
Ideally there would be one superblock-level object per server, and the former
would be folded into the latter; however, since the "nosharecache" option
exists this isn't possible.
The superblock object key is a sequence consisting of:
(1) Certain superblock s_flags.
(2) Various connection parameters that serve to distinguish superblocks for
sget().
(3) The volume FSID.
(4) The security flavour.
(5) The uniquifier length.
(6) The uniquifier text. This is normally an empty string, unless the fsc=xyz
mount option was used to explicitly specify a uniquifier.
The key blob is of variable length, depending on the length of (6).
The superblock object is given no coherency data to carry in the auxiliary data
permitted by the cache. It is assumed that the superblock is always coherent.
This patch also adds uniquification handling such that two otherwise identical
superblocks, at least one of which is marked "nosharecache", won't end up
trying to share the on-disk cache. It will be possible to manually provide a
uniquifier through a mount option with a later patch to avoid the error
otherwise produced.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Define and create server-level cache index objects (as managed by nfs_client
structs).
Each server object is created in the NFS top-level index object and is itself
an index into which superblock-level objects are inserted.
Ideally there would be one superblock-level object per server, and the former
would be folded into the latter; however, since the "nosharecache" option
exists this isn't possible.
The server object key is a sequence consisting of:
(1) NFS version
(2) Server address family (eg: AF_INET or AF_INET6)
(3) Server port.
(4) Server IP address.
The key blob is of variable length, depending on the length of (4).
The server object is given no coherency data to carry in the auxiliary data
permitted by the cache.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add FS-Cache option bit to nfs_server struct. This is set to indicate local
on-disk caching is enabled for a particular superblock.
Also add debug bit for local caching operations.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add a function to install a monitor on the page lock waitqueue for a particular
page, thus allowing the page being unlocked to be detected.
This is used by CacheFiles to detect read completion on a page in the backing
filesystem so that it can then copy the data to the waiting netfs page.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Implement the data I/O part of the FS-Cache netfs API. The documentation and
API header file were added in a previous patch.
This patch implements the following functions for the netfs to call:
(*) fscache_attr_changed().
Indicate that the object has changed its attributes. The only attribute
currently recorded is the file size. Only pages within the set file size
will be stored in the cache.
This operation is submitted for asynchronous processing, and will return
immediately. It will return -ENOMEM if an out of memory error is
encountered, -ENOBUFS if the object is not actually cached, or 0 if the
operation is successfully queued.
(*) fscache_read_or_alloc_page().
(*) fscache_read_or_alloc_pages().
Request data be fetched from the disk, and allocate internal metadata to
track the netfs pages and reserve disk space for unknown pages.
These operations perform semi-asynchronous data reads. Upon returning
they will indicate which pages they think can be retrieved from disk, and
will have set in progress attempts to retrieve those pages.
These will return, in order of preference, -ENOMEM on memory allocation
error, -ERESTARTSYS if a signal interrupted proceedings, -ENODATA if one
or more requested pages are not yet cached, -ENOBUFS if the object is not
actually cached or if there isn't space for future pages to be cached on
this object, or 0 if successful.
In the case of the multipage function, the pages for which reads are set
in progress will be removed from the list and the page count decreased
appropriately.
If any read operations should fail, the completion function will be given
an error, and will also be passed contextual information to allow the
netfs to fall back to querying the server for the absent pages.
For each successful read, the page completion function will also be
called.
Any pages subsequently tracked by the cache will have PG_fscache set upon
them on return. fscache_uncache_page() must be called for such pages.
If supplied by the netfs, the mark_pages_cached() cookie op will be
invoked for any pages now tracked.
(*) fscache_alloc_page().
Allocate internal metadata to track a netfs page and reserve disk space.
This will return -ENOMEM on memory allocation error, -ERESTARTSYS on
signal, -ENOBUFS if the object isn't cached, or there isn't enough space
in the cache, or 0 if successful.
Any pages subsequently tracked by the cache will have PG_fscache set upon
them on return. fscache_uncache_page() must be called for such pages.
If supplied by the netfs, the mark_pages_cached() cookie op will be
invoked for any pages now tracked.
(*) fscache_write_page().
Request data be stored to disk. This may only be called on pages that
have been read or alloc'd by the above three functions and have not yet
been uncached.
This will return -ENOMEM on memory allocation error, -ERESTARTSYS on
signal, -ENOBUFS if the object isn't cached, or there isn't immediately
enough space in the cache, or 0 if successful.
On a successful return, this operation will have queued the page for
asynchronous writing to the cache. The page will be returned with
PG_fscache_write set until the write completes one way or another. The
caller will not be notified if the write fails due to an I/O error. If
that happens, the object will become available and all pending writes will
be aborted.
Note that the cache may batch up page writes, and so it may take a while
to get around to writing them out.
The caller must assume that until PG_fscache_write is cleared the page is
use by the cache. Any changes made to the page may be reflected on disk.
The page may even be under DMA.
(*) fscache_uncache_page().
Indicate that the cache should stop tracking a page previously read or
alloc'd from the cache. If the page was alloc'd only, but unwritten, it
will not appear on disk.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Implement the cookie management part of the FS-Cache netfs client API. The
documentation and API header file were added in a previous patch.
This patch implements the following three functions:
(1) fscache_acquire_cookie().
Acquire a cookie to represent an object to the netfs. If the object in
question is a non-index object, then that object and its parent indices
will be created on disk at this point if they don't already exist. Index
creation is deferred because an index may reside in multiple caches.
(2) fscache_relinquish_cookie().
Retire or release a cookie previously acquired. At this point, the
object on disk may be destroyed.
(3) fscache_update_cookie().
Update the in-cache representation of a cookie. This is used to update
the auxiliary data for coherency management purposes.
With this patch it is possible to have a netfs instruct a cache backend to
look up, validate and create metadata on disk and to destroy it again.
The ability to actually store and retrieve data in the objects so created is
added in later patches.
Note that these functions will never return an error. _All_ errors are
handled internally to FS-Cache.
The worst that can happen is that fscache_acquire_cookie() may return a NULL
pointer - which is considered a negative cookie pointer and can be passed back
to any function that takes a cookie without harm. A negative cookie pointer
merely suppresses caching at that level.
The stub in linux/fscache.h will detect inline the negative cookie pointer and
abort the operation as fast as possible. This means that the compiler doesn't
have to set up for a call in that case.
See the documentation in Documentation/filesystems/caching/netfs-api.txt for
more information.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add functions to register and unregister a network filesystem or other client
of the FS-Cache service. This allocates and releases the cookie representing
the top-level index for a netfs, and makes it available to the netfs.
If the FS-Cache facility is disabled, then the calls are optimised away at
compile time.
Note that whilst this patch may appear to work with FS-Cache enabled and a
netfs attempting to use it, it will leak the cookie it allocates for the netfs
as fscache_relinquish_cookie() is implemented in a later patch. This will
cause the slab code to emit a warning when the module is removed.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Implement two features of FS-Cache:
(1) The ability to request and release cache tags - names by which a cache may
be known to a netfs, and thus selected for use.
(2) An internal function by which a cache is selected by consulting the netfs,
if the netfs wishes to be consulted.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Make FS-Cache create its /proc interface and present various statistical
information through it. Also provide the functions for updating this
information.
These features are enabled by:
CONFIG_FSCACHE_PROC
CONFIG_FSCACHE_STATS
CONFIG_FSCACHE_HISTOGRAM
The /proc directory for FS-Cache is also exported so that caching modules can
add their own statistics there too.
The FS-Cache module is loadable at this point, and the statistics files can be
examined by userspace:
cat /proc/fs/fscache/stats
cat /proc/fs/fscache/histogram
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add the API for a generic facility (FS-Cache) by which caches may declare them
selves open for business, and may obtain work to be done from network
filesystems. The header file is included by:
#include <linux/fscache-cache.h>
Documentation for the API is also added to:
Documentation/filesystems/caching/backend-api.txt
This API is not usable without the implementation of the utility functions
which will be added in further patches.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add the API for a generic facility (FS-Cache) by which filesystems (such as AFS
or NFS) may call on local caching capabilities without having to know anything
about how the cache works, or even if there is a cache:
+---------+
| | +--------------+
| NFS |--+ | |
| | | +-->| CacheFS |
+---------+ | +----------+ | | /dev/hda5 |
| | | | +--------------+
+---------+ +-->| | |
| | | |--+
| AFS |----->| FS-Cache |
| | | |--+
+---------+ +-->| | |
| | | | +--------------+
+---------+ | +----------+ | | |
| | | +-->| CacheFiles |
| ISOFS |--+ | /var/cache |
| | +--------------+
+---------+
General documentation and documentation of the netfs specific API are provided
in addition to the header files.
As this patch stands, it is possible to build a filesystem against the facility
and attempt to use it. All that will happen is that all requests will be
immediately denied as if no cache is present.
Further patches will implement the core of the facility. The facility will
transfer requests from networking filesystems to appropriate caches if
possible, or else gracefully deny them.
If this facility is disabled in the kernel configuration, then all its
operations will trivially reduce to nothing during compilation.
WHY NOT I_MAPPING?
==================
I have added my own API to implement caching rather than using i_mapping to do
this for a number of reasons. These have been discussed a lot on the LKML and
CacheFS mailing lists, but to summarise the basics:
(1) Most filesystems don't do hole reportage. Holes in files are treated as
blocks of zeros and can't be distinguished otherwise, making it difficult
to distinguish blocks that have been read from the network and cached from
those that haven't.
(2) The backing inode must be fully populated before being exposed to
userspace through the main inode because the VM/VFS goes directly to the
backing inode and does not interrogate the front inode's VM ops.
Therefore:
(a) The backing inode must fit entirely within the cache.
(b) All backed files currently open must fit entirely within the cache at
the same time.
(c) A working set of files in total larger than the cache may not be
cached.
(d) A file may not grow larger than the available space in the cache.
(e) A file that's open and cached, and remotely grows larger than the
cache is potentially stuffed.
(3) Writes go to the backing filesystem, and can only be transferred to the
network when the file is closed.
(4) There's no record of what changes have been made, so the whole file must
be written back.
(5) The pages belong to the backing filesystem, and all metadata associated
with that page are relevant only to the backing filesystem, and not
anything stacked atop it.
OVERVIEW
========
FS-Cache provides (or will provide) the following facilities:
(1) Caches can be added / removed at any time, even whilst in use.
(2) Adds a facility by which tags can be used to refer to caches, even if
they're not available yet.
(3) More than one cache can be used at once. Caches can be selected
explicitly by use of tags.
(4) The netfs is provided with an interface that allows either party to
withdraw caching facilities from a file (required for (1)).
(5) A netfs may annotate cache objects that belongs to it. This permits the
storage of coherency maintenance data.
(6) Cache objects will be pinnable and space reservations will be possible.
(7) The interface to the netfs returns as few errors as possible, preferring
rather to let the netfs remain oblivious.
(8) Cookies are used to represent indices, files and other objects to the
netfs. The simplest cookie is just a NULL pointer - indicating nothing
cached there.
(9) The netfs is allowed to propose - dynamically - any index hierarchy it
desires, though it must be aware that the index search function is
recursive, stack space is limited, and indices can only be children of
indices.
(10) Indices can be used to group files together to reduce key size and to make
group invalidation easier. The use of indices may make lookup quicker,
but that's cache dependent.
(11) Data I/O is effectively done directly to and from the netfs's pages. The
netfs indicates that page A is at index B of the data-file represented by
cookie C, and that it should be read or written. The cache backend may or
may not start I/O on that page, but if it does, a netfs callback will be
invoked to indicate completion. The I/O may be either synchronous or
asynchronous.
(12) Cookies can be "retired" upon release. At this point FS-Cache will mark
them as obsolete and the index hierarchy rooted at that point will get
recycled.
(13) The netfs provides a "match" function for index searches. In addition to
saying whether a match was made or not, this can also specify that an
entry should be updated or deleted.
FS-Cache maintains a virtual index tree in which all indices, files, objects
and pages are kept. Bits of this tree may actually reside in one or more
caches.
FSDEF
|
+------------------------------------+
| |
NFS AFS
| |
+--------------------------+ +-----------+
| | | |
homedir mirror afs.org redhat.com
| | |
+------------+ +---------------+ +----------+
| | | | | |
00001 00002 00007 00125 vol00001 vol00002
| | | | |
+---+---+ +-----+ +---+ +------+------+ +-----+----+
| | | | | | | | | | | | |
PG0 PG1 PG2 PG0 XATTR PG0 PG1 DIRENT DIRENT DIRENT R/W R/O Bak
| |
PG0 +-------+
| |
00001 00003
|
+---+---+
| | |
PG0 PG1 PG2
In the example above, two netfs's can be seen to be backed: NFS and AFS. These
have different index hierarchies:
(*) The NFS primary index will probably contain per-server indices. Each
server index is indexed by NFS file handles to get data file objects.
Each data file objects can have an array of pages, but may also have
further child objects, such as extended attributes and directory entries.
Extended attribute objects themselves have page-array contents.
(*) The AFS primary index contains per-cell indices. Each cell index contains
per-logical-volume indices. Each of volume index contains up to three
indices for the read-write, read-only and backup mirrors of those volumes.
Each of these contains vnode data file objects, each of which contains an
array of pages.
The very top index is the FS-Cache master index in which individual netfs's
have entries.
Any index object may reside in more than one cache, provided it only has index
children. Any index with non-index object children will be assumed to only
reside in one cache.
The FS-Cache overview can be found in:
Documentation/filesystems/caching/fscache.txt
The netfs API to FS-Cache can be found in:
Documentation/filesystems/caching/netfs-api.txt
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Recruit a page flag to aid in cache management. The following extra flag is
defined:
(1) PG_fscache (PG_private_2)
The marked page is backed by a local cache and is pinning resources in the
cache driver.
If PG_fscache is set, then things that checked for PG_private will now also
check for that. This includes things like truncation and page invalidation.
The function page_has_private() had been added to make the checks for both
PG_private and PG_private_2 at the same time.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The attached patch causes read_cache_pages() to release page-private data on a
page for which add_to_page_cache() fails. If the filler function fails, then
the problematic page is left attached to the pagecache (with appropriate flags
set, one presumes) and the remaining to-be-attached pages are invalidated and
discarded. This permits pages with caching references associated with them to
be cleaned up.
The invalidatepage() address space op is called (indirectly) to do the honours.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Document the slow work thread pool.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Make the slow work pool configurable through /proc/sys/kernel/slow-work.
(*) /proc/sys/kernel/slow-work/min-threads
The minimum number of threads that should be in the pool as long as it is
in use. This may be anywhere between 2 and max-threads.
(*) /proc/sys/kernel/slow-work/max-threads
The maximum number of threads that should in the pool. This may be
anywhere between min-threads and 255 or NR_CPUS * 2, whichever is greater.
(*) /proc/sys/kernel/slow-work/vslow-percentage
The percentage of active threads in the pool that may be used to execute
very slow work items. This may be between 1 and 99. The resultant number
is bounded to between 1 and one fewer than the number of active threads.
This ensures there is always at least one thread that can process very
slow work items, and always at least one thread that won't.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Create a dynamically sized pool of threads for doing very slow work items, such
as invoking mkdir() or rmdir() - things that may take a long time and may
sleep, holding mutexes/semaphores and hogging a thread, and are thus unsuitable
for workqueues.
The number of threads is always at least a settable minimum, but more are
started when there's more work to do, up to a limit. Because of the nature of
the load, it's not suitable for a 1-thread-per-CPU type pool. A system with
one CPU may well want several threads.
This is used by FS-Cache to do slow caching operations in the background, such
as looking up, creating or deleting cache objects.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: (36 commits)
dm: set queue ordered mode
dm: move wait queue declaration
dm: merge pushback and deferred bio lists
dm: allow uninterruptible wait for pending io
dm: merge __flush_deferred_io into caller
dm: move bio_io_error into __split_and_process_bio
dm: rename __split_bio
dm: remove unnecessary struct dm_wq_req
dm: remove unnecessary work queue context field
dm: remove unnecessary work queue type field
dm: bio list add bio_list_add_head
dm snapshot: persistent fix dtr cleanup
dm snapshot: move status to exception store
dm snapshot: move ctr parsing to exception store
dm snapshot: use DMEMIT macro for status
dm snapshot: remove dm_snap header
dm snapshot: remove dm_snap header use
dm exception store: move cow pointer
dm exception store: move chunk_fields
dm exception store: move dm_target pointer
...
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The logging API needs an extra function to make cluster mirroring
possible. This new function allows us to check whether a mirror
region is being recovered on another machine in the cluster. This
helps us prevent simultaneous recovery I/O and process I/O to the
same locations on disk.
Cluster-aware log modules will implement this function. Single
machine log modules will not. So, there is no performance
penalty for single machine mirrors.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Acked-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Remove the 'dm_dirty_log_internal' structure. The resulting cleanup
eliminates extra memory allocations. Therefore exposing the internal
list_head to the external 'dm_dirty_log_type' structure is a worthwhile
compromise.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The tt_internal is really just a list_head to manage registered target_type
in a double linked list,
Here embed the list_head into target_type directly,
1. to avoid kmalloc/kfree;
2. then tt_internal is really unneeded;
Cc: stable@kernel.org
Signed-off-by: Cheng Renquan <crquan@gmail.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Reviewed-by: Alasdair G Kergon <agk@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Commit f4112de6b679d84bd9b9681c7504be7bdfb7c7d5 ("mm: introduce
debug_kmap_atomic") broke PPC builds with CONFIG_HIGHMEM=y:
CC init/main.o
In file included from include/linux/highmem.h:25,
from include/linux/pagemap.h:11,
from include/linux/mempolicy.h:63,
from init/main.c:53:
arch/powerpc/include/asm/highmem.h: In function 'kmap_atomic_prot':
arch/powerpc/include/asm/highmem.h:98: error: implicit declaration of function 'debug_kmap_atomic'
In file included from include/linux/pagemap.h:11,
from include/linux/mempolicy.h:63,
from init/main.c:53:
include/linux/highmem.h: At top level:
include/linux/highmem.h:196: warning: conflicting types for 'debug_kmap_atomic'
include/linux/highmem.h:196: error: static declaration of 'debug_kmap_atomic' follows non-static declaration
include/asm/highmem.h:98: error: previous implicit declaration of 'debug_kmap_atomic' was here
make[1]: *** [init/main.o] Error 1
make: *** [init] Error 2
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Acked-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: ixp4xx - Fix handling of chained sg buffers
crypto: shash - Fix unaligned calculation with short length
hwrng: timeriomem - Use phys address rather than virt
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
There is no ioremap'ing or anything in timeriomem-rng.c as I foolishly
used already remapped virtual addresses instead of passing the physical
address to be polled.
This patch fixes this flaw and lets developers do the Right Thing(tm).
Signed-off-by: Alexander Clouter <alex@digriz.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|\ \ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
* 'for-linus' of git://neil.brown.name/md: (53 commits)
md/raid5 revise rules for when to update metadata during reshape
md/raid5: minor code cleanups in make_request.
md: remove CONFIG_MD_RAID_RESHAPE config option.
md/raid5: be more careful about write ordering when reshaping.
md: don't display meaningless values in sysfs files resync_start and sync_speed
md/raid5: allow layout and chunksize to be changed on active array.
md/raid5: reshape using largest of old and new chunk size
md/raid5: prepare for allowing reshape to change layout
md/raid5: prepare for allowing reshape to change chunksize.
md/raid5: clearly differentiate 'before' and 'after' stripes during reshape.
Documentation/md.txt update
md: allow number of drives in raid5 to be reduced
md/raid5: change reshape-progress measurement to cope with reshaping backwards.
md: add explicit method to signal the end of a reshape.
md/raid5: enhance raid5_size to work correctly with negative delta_disks
md/raid5: drop qd_idx from r6_state
md/raid6: move raid6 data processing to raid6_pq.ko
md: raid5 run(): Fix max_degraded for raid level 4.
md: 'array_size' sysfs attribute
md: centralize ->array_sectors modifications
...
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Move the raid6 data processing routines into a standalone module
(raid6_pq) to prepare them to be called from async_tx wrappers and other
non-md drivers/modules. This precludes a circular dependency of raid456
needing the async modules for data processing while those modules in
turn depend on raid456 for the base level synchronous raid6 routines.
To support this move:
1/ The exportable definitions in raid6.h move to include/linux/raid/pq.h
2/ The raid6_call, recovery calls, and table symbols are exported
3/ Extra #ifdef __KERNEL__ statements to enable the userspace raid6test to
compile
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
It really is nicer to keep related code together..
Signed-off-by: NeilBrown <neilb@suse.de>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This makes the includes more explicit, and is preparation for moving
md_k.h to drivers/md/md.h
Remove include/raid/md.h as its only remaining use was to #include
other files.
Signed-off-by: NeilBrown <neilb@suse.de>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The extern function definitions are kernel-internal definitions, so
they belong in md_k.h
The MD_*_VERSION values could reasonably go in a number of places,
but md_u.h seems most reasonable.
This leaves almost nothing in md.h. It will go soon.
Signed-off-by: NeilBrown <neilb@suse.de>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
.. as they are part of the user-space interface.
Also move MdpMinorShift into there so we can remove duplication.
Lastly move mdp_major in. It is less obviously part of the user-space
interface, but do_mounts_md.c uses it, and it is acting a bit like
user-space.
Signed-off-by: NeilBrown <neilb@suse.de>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Move the headers with the local structures for the disciplines and
bitmap.h into drivers/md/ so that they are more easily grepable for
hacking and not far away. md.h is left where it is for now as there
are some uses from the outside.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: NeilBrown <neilb@suse.de>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
There are two problems with is_mddev_idle.
1/ sync_io is 'atomic_t' and hence 'int'. curr_events and all the
rest are 'long'.
So if sync_io were to wrap on a 64bit host, the value of
curr_events would go very negative suddenly, and take a very
long time to return to positive.
So do all calculations as 'int'. That gives us plenty of precision
for what we need.
2/ To initialise rdev->last_events we simply call is_mddev_idle, on
the assumption that it will make sure that last_events is in a
suitable range. It used to do this, but now it does not.
So now we need to be more explicit about initialisation.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|\ \ \ \ \
| |_|_|_|/
|/| | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/linux-hdreg-h-cleanup:
remove <linux/ata.h> include from <linux/hdreg.h>
include/linux/hdreg.h: remove unused defines
isd200: use ATA_* defines instead of *_STAT and *_ERR ones
include/linux/hdreg.h: cover WIN_* and friends with #ifndef/#endif __KERNEL__
aoe: WIN_* -> ATA_CMD_*
isd200: WIN_* -> ATA_CMD_*
include/linux/hdreg.h: cover struct hd_driveid with #ifndef/#endif __KERNEL__
xsysace: make it 'struct hd_driveid'-free
ubd_kern: make it 'struct hd_driveid'-free
isd200: make it 'struct hd_driveid'-free
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
All <linux/hdreg.h> users that need <linux/ata.h> have been fixed
to include it directly.
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
* Move HD_IRQ define to drivers/block/hd.c (only user).
* Remove unused *_STAT, *_ERR, HD_*, CD, IO, REL and TAG_MASK defines.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
|
| | | | |
| | | | |
| | | | |
| | | | | |
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
|
| | |_|/
| |/| |
| | | |
| | | | |
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
|
|\ \ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
Remove two unneeded exports and make two symbols static in fs/mpage.c
Cleanup after commit 585d3bc06f4ca57f975a5a1f698f65a45ea66225
Trim includes of fdtable.h
Don't crap into descriptor table in binfmt_som
Trim includes in binfmt_elf
Don't mess with descriptor table in load_elf_binary()
Get rid of indirect include of fs_struct.h
New helper - current_umask()
check_unsafe_exec() doesn't care about signal handlers sharing
New locking/refcounting for fs_struct
Take fs_struct handling to new file (fs/fs_struct.c)
Get rid of bumping fs_struct refcount in pivot_root(2)
Kill unsharing fs_struct in __set_personality()
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Commit 29a814d2ee0e43c2980f33f91c1311ec06c0aa35 (vfs: add hooks for
ext4's delayed allocation support) exported the following functions
mpage_bio_submit()
__mpage_writepage()
for the benefit of ext4's delayed allocation support. Since commit
a1d6cc563bfdf1bf2829d3e6ce4d8b774251796b (ext4: Rework the
ext4_da_writepages() function), these functions are not used by the
ext4 driver anymore. However, the now unnecessary exports still
remain, and this patch removes those. Moreover, these two functions
can become static again.
The issue was spotted by namespacecheck.
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
fsync_bdev() export and a bunch of stubs for !CONFIG_BLOCK case had
been left behind
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Don't pull it in sched.h; very few files actually need it and those
can include directly. sched.h itself only needs forward declaration
of struct fs_struct;
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
current->fs->umask is what most of fs_struct users are doing.
Put that into a helper function.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
* all changes of current->fs are done under task_lock and write_lock of
old fs->lock
* refcount is not atomic anymore (same protection)
* its decrements are done when removing reference from current; at the
same time we decide whether to free it.
* put_fs_struct() is gone
* new field - ->in_exec. Set by check_unsafe_exec() if we are trying to do
execve() and only subthreads share fs_struct. Cleared when finishing exec
(success and failure alike). Makes CLONE_FS fail with -EAGAIN if set.
* check_unsafe_exec() may fail with -EAGAIN if another execve() from subthread
is in progress.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Pure code move; two new helper functions for nfsd and daemonize
(unshare_fs_struct() and daemonize_fs_struct() resp.; for now -
the same code as used to be in callers). unshare_fs_struct()
exported (for nfsd, as copy_fs_struct()/exit_fs() used to be),
copy_fs_struct() and exit_fs() don't need exports anymore.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|\ \ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (21 commits)
drm/radeon: load the right microcode on rs780
drm: remove unused "can_grow" parameter from drm_crtc_helper_initial_config
drm: fix EDID backward compat check
drm: sync the mode validation for INTERLACE/DBLSCAN
drm: fix typo in edid vendor parsing.
DRM: drm_crtc_helper.h doesn't actually need i2c.h
drm: fix missing inline function on 32-bit powerpc.
drm: Use pgprot_writecombine in GEM GTT mapping to get the right bits for !PAT.
drm/i915: Add a spinlock to protect the active_list
drm/i915: Fix SDVO TV support
drm/i915: Fix SDVO CREATE_PREFERRED_INPUT_TIMING command
drm/i915: Fix error in SDVO DTD and modeline convert
drm/i915: Fix SDVO command debug function
drm/i915: fix TV mode setting in property change
drm/i915: only set TV mode when any property changed
drm/i915: clean up udelay usage
drm/i915: add VGA hotplug support for 945+
drm/i915: correctly set IGD device's gtt size for KMS.
drm/i915: avoid hanging on to a stale pointer to raw_edid.
drm/i915: check for -EINVAL from vm_insert_pfn
...
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Cleanup some leftovers from the X port.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Remove an include that isn't actually needed to prevent needless
rebuilds.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|