summaryrefslogtreecommitdiffstats
path: root/include/linux
Commit message (Collapse)AuthorAgeFilesLines
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6Linus Torvalds2008-07-215-9/+521
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (100 commits) usb-storage: revert DMA-alignment change for Wireless USB USB: use reset_resume when normal resume fails usb_gadget: composite cdc gadget fault handling usb gadget: minor USBCV fix for composite framework USB: Fix bug with byte order in isp116x-hcd.c fio write/read USB: fix double kfree in ipaq in error case USB: fix build error in cdc-acm for CONFIG_PM=n USB: remove board-specific UP2OCR configuration from pxa27x-udc USB: EHCI: Reconciling USB register differences on MPC85xx vs MPC83xx USB: Fix pointer/int cast in USB devio code usb gadget: g_cdc dependso on NET USB: Au1xxx-usb: suspend/resume support. USB: Au1xxx-usb: clean up ohci/ehci bus glue sources. usbfs: don't store bad pointers in registration usbfs: fix race between open and unregister usbfs: simplify the lookup-by-minor routines usbfs: send disconnect signals when device is unregistered USB: Force unbinding of drivers lacking reset_resume or other methods USB: ohci-pnx4008: I2C cleanups and fixes USB: debug port converter does not accept more than 8 byte packets ...
| * USB: Force unbinding of drivers lacking reset_resume or other methodsAlan Stern2008-07-211-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch (as1024) takes care of a FIXME issue: Drivers that don't have the necessary suspend, resume, reset_resume, pre_reset, or post_reset methods will be unbound and their interface reprobed when one of the unsupported events occurs. This is made slightly more difficult by the fact that bind operations won't work during a system sleep transition. So instead the code has to defer the operation until the transition ends. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
| * USB: fix usb_reset_device and usb_reset_composite_device(take 3)Ming Lei2008-07-211-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch renames the existing usb_reset_device in hub.c to usb_reset_and_verify_device and renames the existing usb_reset_composite_device to usb_reset_device. Also the new usb_reset_and_verify_device does't need to be EXPORTED . The idea of the patch is that external interface driver should warn the other interfaces' driver of the same device before and after reseting the usb device. One interface driver shoud call _old_ usb_reset_composite_device instead of _old_ usb_reset_device since it can't assume the device contains only one interface. The _old_ usb_reset_composite_device is safe for single interface device also. we rename the two functions to make the change easily. This patch is under guideline from Alan Stern. Signed-off-by: Ming Lei <tom.leiming@gmail.com>
| * USB: remove interface parameter of usb_reset_composite_deviceMing Lei2008-07-211-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | From the current implementation of usb_reset_composite_device function, the iface parameter is no longer useful. This function doesn't do something special for the iface usb_interface,compared with other interfaces in the usb_device. So remove the parameter and fix the related caller. Signed-off-by: Ming Lei <tom.leiming@gmail.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
| * USB Gadget: documentation updateAlan Stern2008-07-211-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch (as1102) clarifies two points in the USB Gadget kerneldoc: Request completion callbacks are always made with interrupts disabled; Device controllers may not support STALLing the status stage of a control transfer after the data stage is over. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
| * usb: irda: cleanup on ir-usb moduleFelipe Balbi2008-07-211-0/+151
| | | | | | | | | | | | | | | | | | | | | | | | | | General cleanup on ir-usb module. Introduced a common header that could be used also on usb gadget framework. Lot's of cleanups and now using macros from the header file. Signed-off-by: Felipe Balbi <me@felipebalbi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
| * usb gadget: composite gadget coreDavid Brownell2008-07-211-0/+338
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add <linux/usb/composite.h> interfaces for composite gadget drivers, and basic implementation support behind it: - struct usb_function ... groups one or more interfaces into a function managed as one unit within a configuration, to which it's added by usb_add_function(). - struct usb_configuration ... groups one or more such functions into a configuration managed as one unit by a driver, to which it's added by usb_add_config(). These operate at either high or full/low speeds and at a given bMaxPower. - struct usb_composite_driver ... groups one or more such configurations into a gadget driver, which may be registered or unregistered. - struct usb_composite_dev ... a usb_composite_driver manages this; it wraps the usb_gadget exposed by the controller driver. This also includes some basic kerneldoc. How to use it (the short version): provide a usb_composite_driver with a bind() that calls usb_add_config() for each of the needed configurations. The configurations in turn have bind() calls, which will usb_add_function() for each function required. Each function's bind() allocates resources needed to perform its tasks, like endpoints; sometimes configurations will allocate resources too. Separate patches will convert most gadget drivers to this infrastructure. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
| * usb gadget: descriptor copying supportDavid Brownell2008-07-211-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Define three new descriptor manipulation utilities, for use when setting up functions that may have multiple instances: usb_copy_descriptors() to copy a vector of descriptors usb_free_descriptors() to free the copy usb_find_endpoint() to find a copied version These will be used as follows. Functions will continue to have static tables of descriptors they update, now used as __initdata templates. When a function creates a new instance, it patches those tables with relevant interface and string IDs, plus endpoint assignments. Then it copies those morphed descriptors, associates the copies with the new function instance, and records the endpoint descriptors to use when activating the endpoints. When initialization is done, only the copies remain in memory. The copies are freed on driver removal. This ensures that each instance has descriptors which hold the right instance-specific data. Two instances in the same configuration will obviously never share the same interface IDs or use the same endpoints. Instances in different configurations won't do so either, which means this is slightly less memory-efficient in some cases. This also includes a bugfix to the epautoconf code that shows up with this usage model. It must replace the previous endpoint number when updating the template descriptors, not just mask in a few more bits. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
| * USB: remove CVS keywordsAdrian Bunk2008-07-211-2/+0
| | | | | | | | | | | | | | | | | | This patch removes CVS keywords that weren't updated for a long time from comments. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
| * USB: implement "soft" unbindingAlan Stern2008-07-211-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch (as1091) changes the way usbcore handles interface unbinding. If the interface's driver supports "soft" unbinding (a new flag in the driver structure) then in-flight URBs are not cancelled and endpoints are not disabled. Instead the driver is allowed to continue communicating with the device (although of course it should stop before its disconnect routine returns). The purpose of this change is to allow drivers to do a clean shutdown when they get unbound from a device that is still plugged in. Killing all the URBs and disabling the endpoints before calling the driver's disconnect method doesn't give the driver any control over what happens, and it can leave devices in indeterminate states. For example, when usb-storage unbinds it doesn't want to stop while in the middle of transmitting a SCSI command. The soft_unbind flag is added because in the past, a number of drivers have experienced problems related to ongoing I/O after their disconnect routine returned. Hence "soft" unbinding is made available only to drivers that claim to support it. The patch also replaces "interface_to_usbdev(intf)" with "udev" in a couple of places, a minor simplification. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
| * USB: handle pci_name() being constGreg Kroah-Hartman2008-07-211-1/+1
| | | | | | | | | | | | | | | | This changes usb_create_hcd() to be able to handle the fact that pci_name() has changed to a constant string. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* | Merge branch 'next' of ↵Linus Torvalds2008-07-211-3/+0
|\ \ | |/ |/| | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] cpufreq: remove CVS keywords [CPUFREQ] change cpu freq arrays to per_cpu variables
| * [CPUFREQ] cpufreq: remove CVS keywordsAdrian Bunk2008-05-191-3/+0
| | | | | | | | | | | | | | | | This patch removes CVS keywords that weren't updated for a long time from comments. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Dave Jones <davej@redhat.com>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2008-07-213-11/+8
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: netfilter: nf_conntrack_sctp: fix sparse warnings netfilter: nf_nat_sip: c= is optional for session netfilter: xt_TCPMSS: collapse tcpmss_reverse_mtu{4,6} into one function netfilter: nfnetlink_log: send complete hardware header netfilter: xt_time: fix time's time_mt()'s use of do_div() netfilter: accounting rework: ct_extend + 64bit counters (v4) netlink: add NLA_PUT_BE64 macro netfilter: nf_nat_core: eliminate useless find_appropriate_src for IP_NAT_RANGE_PROTO_RANDOM hdlcdrv: Fix CRC calculation. Revert "pkt_sched: Make default qdisc nonshared-multiqueue safe." net: In __netif_schedule() use WARN_ON instead of BUG_ON net: Improve simple_tx_hash(). pkt_sched: Remove unused variable skb in dev_deactivate_queue function. sunhme: Remove stop/wake TX queue calls in set-multicast-list handler. ucc_geth: do not touch net queue in adjust_link phylib callback gianfar: do not touch net queue in adjust_link phylib callback atl1: Do not wake queue before queue has been started.
| * | netfilter: nfnetlink_log: send complete hardware headerEric Leblond2008-07-211-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds some fields to NFLOG to be able to send the complete hardware header with all necessary informations. It sends to userspace: * the type of hardware link * the lenght of hardware header * the hardware header Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | netfilter: accounting rework: ct_extend + 64bit counters (v4)Krzysztof Piotr Oledzki2008-07-212-11/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Initially netfilter has had 64bit counters for conntrack-based accounting, but it was changed in 2.6.14 to save memory. Unfortunately in-kernel 64bit counters are still required, for example for "connbytes" extension. However, 64bit counters waste a lot of memory and it was not possible to enable/disable it runtime. This patch: - reimplements accounting with respect to the extension infrastructure, - makes one global version of seq_print_acct() instead of two seq_print_counters(), - makes it possible to enable it at boot time (for CONFIG_SYSCTL/CONFIG_SYSFS=n), - makes it possible to enable/disable it at runtime by sysctl or sysfs, - extends counters from 32bit to 64bit, - renames ip_conntrack_counter -> nf_conn_counter, - enables accounting code unconditionally (no longer depends on CONFIG_NF_CT_ACCT), - set initial accounting enable state based on CONFIG_NF_CT_ACCT - removes buggy IPCT_COUNTER_FILLING event handling. If accounting is enabled newly created connections get additional acct extend. Old connections are not changed as it is not possible to add a ct_extend area to confirmed conntrack. Accounting is performed for all connections with acct extend regardless of a current state of "net.netfilter.nf_conntrack_acct". Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge branch 'x86/for-linus' of ↵Linus Torvalds2008-07-211-0/+2
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (160 commits) x86: remove extra calling to get ext cpuid level x86: use setup_clear_cpu_cap() when disabling the lapic KVM: fix exception entry / build bug, on 64-bit x86: add unknown_nmi_panic kernel parameter x86, VisWS: turn into generic arch, eliminate leftover files x86: add ->pre_time_init to x86_quirks x86: extend and use x86_quirks to clean up NUMAQ code x86: introduce x86_quirks x86: improve debug printout: add target bootmem range in early_res_to_bootmem() Subject: devmem, x86: fix rename of CONFIG_NONPROMISC_DEVMEM x86: remove arch_get_ram_range x86: Add a debugfs interface to dump PAT memtype x86: Add a arch directory for x86 under debugfs x86: i386: reduce boot fixmap space i386/xen: add proper unwind annotations to xen_sysenter_target x86: reduce force_mwait visibility x86: reduce forbid_dac's visibility x86: fix two modpost warnings x86: check function status in EDD boot code x86_64: ia32_signal.c: remove signal number conversion ...
| | \ \
| | \ \
| | \ \
| | \ \
| | \ \
| | \ \
| | \ \
| | \ \
| | \ \
| | \ \
| *---------. \ \ Merge branches 'x86/urgent', 'x86/amd-iommu', 'x86/apic', 'x86/cleanups', ↵Ingo Molnar2008-07-211-0/+2
| |\ \ \ \ \ \ \ \ | | |_|_|_|_|_|/ / | |/| | | | | | | | | | | | | | | | 'x86/core', 'x86/cpu', 'x86/fixmap', 'x86/gart', 'x86/kprobes', 'x86/memtest', 'x86/modules', 'x86/nmi', 'x86/pat', 'x86/reboot', 'x86/setup', 'x86/step', 'x86/unify-pci', 'x86/uv', 'x86/xen' and 'xen-64bit' into x86/for-linus
| | | | | | | * | x86: Add a arch directory for x86 under debugfsvenkatesh.pallipadi@intel.com2008-07-181-0/+2
| | | | |_|_|/ / | | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a directory for x86 arch under debugfs. Can be used to accumulate all x86 specific debugfs files. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* | | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dmLinus Torvalds2008-07-212-2/+8
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: dm crypt: add merge dm table: remove merge_bvec sector restriction dm: linear add merge dm: introduce merge_bvec_fn dm snapshot: use per device mempools dm snapshot: fix race during exception creation dm snapshot: track snapshot reads dm mpath: fix test for reinstate_path dm mpath: return parameter error dm io: remove struct padding dm log: make dm_dirty_log init and exit static dm mpath: free path selector on invalid args
| * | | | | | | | dm: introduce merge_bvec_fnMilan Broz2008-07-212-2/+8
| | |/ / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a bvec merge function for device mapper devices for dynamic size restrictions. This code ensures the requested biovec lies within a single target and then calls a target-specific function to check against any constraints imposed by underlying devices. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
* | | | | | | | Merge branch 'for-linus' of git://neil.brown.name/mdLinus Torvalds2008-07-216-28/+61
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://neil.brown.name/md: (52 commits) md: Protect access to mddev->disks list using RCU md: only count actual openers as access which prevent a 'stop' md: linear: Make array_size sector-based and rename it to array_sectors. md: Make mddev->array_size sector-based. md: Make super_type->rdev_size_change() take sector-based sizes. md: Fix check for overlapping devices. md: Tidy up rdev_size_store a bit: md: Remove some unused macros. md: Turn rdev->sb_offset into a sector-based quantity. md: Make calc_dev_sboffset() return a sector count. md: Replace calc_dev_size() by calc_num_sectors(). md: Make update_size() take the number of sectors. md: Better control of when do_md_stop is allowed to stop the array. md: get_disk_info(): Don't convert between signed and unsigned and back. md: Simplify restart_array(). md: alloc_disk_sb(): Return proper error value. md: Simplify sb_equal(). md: Simplify uuid_equal(). md: sb_equal(): Fix misleading printk. md: Fix a typo in the comment to cmd_match(). ...
| * | | | | | | | md: Protect access to mddev->disks list using RCUNeilBrown2008-07-211-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All modifications and most access to the mddev->disks list are made under the reconfig_mutex lock. However there are three places where the list is walked without any locking. If a reconfig happens at this time, havoc (and oops) can ensue. So use RCU to protect these accesses: - wrap them in rcu_read_{,un}lock() - use list_for_each_entry_rcu - add to the list with list_add_rcu - delete from the list with list_del_rcu - delay the 'free' with call_rcu rather than schedule_work Note that export_rdev did a list_del_init on this list. In almost all cases the entry was not in the list anymore so it was a no-op and so safe. It is no longer safe as after list_del_rcu we may not touch the list_head. An audit shows that export_rdev is called: - after unbind_rdev_from_array, in which case the delete has already been done, - after bind_rdev_to_array fails, in which case the delete isn't needed. - before the device has been put on a list at all (e.g. in add_new_disk where reading the superblock fails). - and in autorun devices after a failure when the device is on a different list. So remove the list_del_init call from export_rdev, and add it back immediately before the called to export_rdev for that last case. Note also that ->same_set is sometimes used for lists other than mddev->list (e.g. candidates). In these cases rcu is not needed. Signed-off-by: NeilBrown <neilb@suse.de>
| * | | | | | | | md: only count actual openers as access which prevent a 'stop'NeilBrown2008-07-211-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Open isn't the only thing that increments ->active. e.g. reading /proc/mdstat will increment it briefly. So to avoid false positives in testing for concurrent access, introduce a new counter that counts just the number of times the md device it open. Signed-off-by: NeilBrown <neilb@suse.de>
| * | | | | | | | md: linear: Make array_size sector-based and rename it to array_sectors.Andre Noll2008-07-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
| * | | | | | | | md: Make mddev->array_size sector-based.Andre Noll2008-07-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch renames the array_size field of struct mddev_s to array_sectors and converts all instances to use units of 512 byte sectors instead of 1k blocks. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
| * | | | | | | | md: Remove some unused macros.Andre Noll2008-07-111-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: Neil Brown <neilb@suse.de>
| * | | | | | | | md: Turn rdev->sb_offset into a sector-based quantity.Andre Noll2008-07-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename it to sb_start to make sure all users have been converted. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: Neil Brown <neilb@suse.de>
| * | | | | | | | Merge branch 'master' into for-nextNeil Brown2008-07-112-0/+16
| |\ \ \ \ \ \ \ \
| * \ \ \ \ \ \ \ \ Merge branch 'for-neil' of ↵Neil Brown2008-07-081-1/+1
| |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/djbw/md into for-next
| | * | | | | | | | | md: resolve external metadata handling deadlock in md_allow_writeDan Williams2008-06-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | md_allow_write() marks the metadata dirty while holding mddev->lock and then waits for the write to complete. For externally managed metadata this causes a deadlock as userspace needs to take the lock to communicate that the metadata update has completed. Change md_allow_write() in the 'external' case to start the 'mark active' operation and then return -EAGAIN. The expected side effects while waiting for userspace to write 'active' to 'array_state' are holding off reshape (code currently handles -ENOMEM), cause some 'stripe_cache_size' change requests to fail, cause some GET_BITMAP_FILE ioctl requests to fall back to GFP_NOIO, and cause updates to 'raid_disks' to fail. Except for 'stripe_cache_size' changes these failures can be mitigated by coordinating with mdmon. md_write_start() still prevents writes from occurring until the metadata handler has had a chance to take action as it unconditionally waits for MD_CHANGE_CLEAN to be cleared. [neilb@suse.de: return -EAGAIN, try GFP_NOIO] Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| * | | | | | | | | | Merge branch 'master' into for-nextNeil Brown2008-07-0821-32/+50
| |\ \ \ \ \ \ \ \ \ \ | | |/ / / / / / / / / | |/| | | | | | | | |
| * | | | | | | | | | md: replace R5_WantPrexor with R5_WantDrain, add 'prexor' reconstruct_statesDan Williams2008-06-281-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From: Dan Williams <dan.j.williams@intel.com> Currently ops_run_biodrain and other locations have extra logic to determine which blocks are processed in the prexor and non-prexor cases. This can be eliminated if handle_write_operations5 flags the blocks to be processed in all cases via R5_Wantdrain. The presence of the prexor operation is tracked in sh->reconstruct_state. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>
| * | | | | | | | | | md: replace STRIPE_OP_{BIODRAIN,PREXOR,POSTXOR} with 'reconstruct_states'Dan Williams2008-06-281-8/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From: Dan Williams <dan.j.williams@intel.com> Track the state of reconstruct operations (recalculating the parity block usually due to incoming writes, or as part of array expansion) Reduces the scope of the STRIPE_OP_{BIODRAIN,PREXOR,POSTXOR} flags to only tracking whether a reconstruct operation has been requested via the ops_request field of struct stripe_head_state. This is the final step in the removal of ops.{pending,ack,complete,count}, i.e. the STRIPE_OP_{BIODRAIN,PREXOR,POSTXOR} flags only request an operation and do not track the state of the operation. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>
| * | | | | | | | | | md: replace STRIPE_OP_CHECK with 'check_states'Dan Williams2008-06-281-6/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From: Dan Williams <dan.j.williams@intel.com> The STRIPE_OP_* flags record the state of stripe operations which are performed outside the stripe lock. Their use in indicating which operations need to be run is straightforward; however, interpolating what the next state of the stripe should be based on a given combination of these flags is not straightforward, and has led to bugs. An easier to read implementation with minimal degrees of freedom is needed. Towards this goal, this patch introduces explicit states to replace what was previously interpolated from the STRIPE_OP_* flags. For now this only converts the handle_parity_checks5 path, removing a user of the ops.{pending,ack,complete,count} fields of struct stripe_operations. This conversion also found a remaining issue with the current code. There is a small window for a drive to fail between when we schedule a repair and when the parity calculation for that repair completes. When this happens we will writeback to 'failed_num' when we really want to write back to 'pd_idx'. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>
| * | | | | | | | | | md: kill STRIPE_OP_IO flagDan Williams2008-06-281-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From: Dan Williams <dan.j.williams@intel.com> The R5_Want{Read,Write} flags already gate i/o. So, this flag is superfluous and we can unconditionally call ops_run_io(). Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>
| * | | | | | | | | | md: kill STRIPE_OP_MOD_DMA in raid5 offloadDan Williams2008-06-281-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From: Dan Williams <dan.j.williams@intel.com> This micro-optimization allowed the raid code to skip a re-read of the parity block after checking parity. It took advantage of the fact that xor-offload-engines have their own internal result buffer and can check parity without writing to memory. Remove it for the following reasons: 1/ It is a layering violation for MD to need to manage the DMA and non-DMA paths within async_xor_zero_sum 2/ Bad precedent to toggle the 'ops' flags outside the lock 3/ Hard to realize a performance gain as reads will not need an updated parity block and writes will dirty it anyways. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>
| * | | | | | | | | | Make sure all changes to md/dev-XX/state are notifiedNeil Brown2008-06-281-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The important state change happens during an interrupt in md_error. So just set a flag there and call sysfs_notify later in process context. Signed-off-by: Neil Brown <neilb@suse.de>
| * | | | | | | | | | Make sure all changes to md/sync_action are notified.Neil Brown2008-06-281-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the 'resync' thread starts or stops, when we explicitly set sync_action, or when we determine that there is definitely nothing to do, we notify sync_action. To stop "sync_action" from occasionally showing the wrong value, we introduce a new flags - MD_RECOVERY_RECOVER - to say that a recovery is probably needed or happening, and we make sure that we set MD_RECOVERY_RUNNING before clearing MD_RECOVERY_NEEDED. Signed-off-by: Neil Brown <neilb@suse.de>
| * | | | | | | | | | Allow setting start point for requested check/repairNeil Brown2008-06-281-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This makes it possible to just resync a small part of an array. e.g. if a drive reports that it has questionable sectors, a 'repair' of just the region covering those sectors will cause them to be read and, if there is an error, re-written with correct data. Signed-off-by: Neil Brown <neilb@suse.de>
| * | | | | | | | | | Improve setting of "events_cleared" for write-intent bitmaps.Neil Brown2008-06-281-0/+1
| | |_|_|_|_|_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an array is degraded, bits in the write-intent bitmap are not cleared, so that if the missing device is re-added, it can be synced by only updated those parts of the device that have changed since it was removed. The enable this a 'events_cleared' value is stored. It is the event counter for the array the last time that any bits were cleared. Sometimes - if a device disappears from an array while it is 'clean' - the events_cleared value gets updated incorrectly (there are subtle ordering issues between updateing events in the main metadata and the bitmap metadata) resulting in the missing device appearing to require a full resync when it is re-added. With this patch, we update events_cleared precisely when we are about to clear a bit in the bitmap. We record events_cleared when we clear the bit internally, and copy that to the superblock which is written out before the bit on storage. This makes it more "obviously correct". We also need to update events_cleared when the event_count is going backwards (as happens on a dirty->clean transition of a non-degraded array). Thanks to Mike Snitzer for identifying this problem and testing early "fixes". Cc: "Mike Snitzer" <snitzer@gmail.com> Signed-off-by: Neil Brown <neilb@suse.de>
* | | | | | | | | | Merge master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 into nextDmitry Torokhov2008-07-21212-1753/+4577
|\ \ \ \ \ \ \ \ \ \ | | |_|_|_|/ / / / / | |/| | | | | | | |
| * | | | | | | | | Merge branch 'for-2.6.27' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2008-07-208-34/+56
| |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-2.6.27' of git://linux-nfs.org/~bfields/linux: (51 commits) nfsd: nfs4xdr.c do-while is not a compound statement nfsd: Use C99 initializers in fs/nfsd/nfs4xdr.c lockd: Pass "struct sockaddr *" to new failover-by-IP function lockd: get host reference in nlmsvc_create_block() instead of callers lockd: minor svclock.c style fixes lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_lock lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_testlock lockd: nlm_release_host() checks for NULL, caller needn't file lock: reorder struct file_lock to save space on 64 bit builds nfsd: take file and mnt write in nfs4_upgrade_open nfsd: document open share bit tracking nfsd: tabulate nfs4 xdr encoding functions nfsd: dprint operation names svcrdma: Change WR context get/put to use the kmem cache svcrdma: Create a kmem cache for the WR contexts svcrdma: Add flush_scheduled_work to module exit function svcrdma: Limit ORD based on client's advertised IRD svcrdma: Remove unused wait q from svcrdma_xprt structure svcrdma: Remove unneeded spin locks from __svc_rdma_free svcrdma: Add dma map count and WARN_ON ...
| | * | | | | | | | | lockd: Pass "struct sockaddr *" to new failover-by-IP functionChuck Lever2008-07-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pass a more generic socket address type to nlmsvc_unlock_all_by_ip() to allow for future support of IPv6. Also provide additional sanity checking in failover_unlock_ip() when constructing the server's IP address. As an added bonus, provide clean kerneldoc comments on related NLM interfaces which were recently added. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| | * | | | | | | | | lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_lockJeff Layton2008-07-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nlmsvc_lock calls nlmsvc_lookup_host to find a nlm_host struct. The callers of this function, however, call nlmsvc_retrieve_args or nlm4svc_retrieve_args, which also return a nlm_host struct. Change nlmsvc_lock to take a host arg instead of calling nlmsvc_lookup_host itself and change the callers to pass a pointer to the nlm_host they've already found. Since nlmsvc_testlock() now just uses the caller's reference, we no longer need to get or release it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| | * | | | | | | | | lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_testlockJeff Layton2008-07-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nlmsvc_testlock calls nlmsvc_lookup_host to find a nlm_host struct. The callers of this functions, however, call nlmsvc_retrieve_args or nlm4svc_retrieve_args, which also return a nlm_host struct. Change nlmsvc_testlock to take a host arg instead of calling nlmsvc_lookup_host itself and change the callers to pass a pointer to the nlm_host they've already found. We take a reference to host in the place where nlmsvc_testlock() previous did a new lookup, so the reference counting is unchanged from before. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| | * | | | | | | | | file lock: reorder struct file_lock to save space on 64 bit buildsRichard Kennedy2008-07-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reduce sizeof struct file_lock by 8 on 64 bit builds allowing +1 objects per slab in the file_lock_cache Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| | * | | | | | | | | Merge branch 'for-bfields' of git://linux-nfs.org/~tomtucker/xprt-switch-2.6 ↵J. Bruce Fields2008-07-0374-206/+338
| | |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into for-2.6.27
| | | * | | | | | | | | svcrdma: Change WR context get/put to use the kmem cacheTom Tucker2008-07-021-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change the WR context pool to be shared across mount points. This reduces the RDMA transport memory footprint significantly since idle mounts don't consume WR context memory. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
| | | * | | | | | | | | svcrdma: Remove unused wait q from svcrdma_xprt structureTom Tucker2008-07-021-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The sc_read_wait queue head is no longer used. Remove it. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>