summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* RDMA/nes: Add support for SFP+ PHYEric Schneider2008-04-295-64/+259
| | | | | | | | | This patch enables the iw_nes module for NetEffect RNICs to support additional PHYs including SFP+ (referred to as ARGUS in the code). Signed-off-by: Eric Schneider <eric.schneider@neteffect.com> Signed-off-by: Glenn Streiff <gstreiff@neteffect.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* RDMA/nes: Use LROFaisal Latif2008-04-296-12/+70
| | | | | | Signed-off-by: Faisal Latif <flatif@neteffect.com. Signed-off-by: Glenn Streiff <gstreiff@neteffect.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IPoIB: Copy child MTU from parentEli Cohen2008-04-291-0/+3
| | | | | | | | | | | | | When creating a child interface, copy the MTU information from the parent. Otherwise when the child's multicast join completes, the MTU will not be updated since the code does dev->mtu = min(priv->mcast_mtu, priv->admin_mtu); and priv->admin_mtu will be set to 0. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Avoid changing userspace ABI to handle DMA write barrier attributeRoland Dreier2008-04-293-5/+20
| | | | | | | | | | | | | | | | | | Commit cb9fbc5c ("IB: expand ib_umem_get() prototype") changed the mthca userspace ABI to provide a way for userspace to indicate which memory regions need the DMA write barrier attribute. However, it is possible to handle this without breaking existing userspace, by having the mthca kernel driver recognize whether it is talking to old or new userspace, depending on the size of the register MR structure passed in. The only potential drawback of this is that is allows old userspace (which has a bug with DMA ordering on large SGI Altix systems) to continue to run on new kernels, but the advantage of allowing old userspace to continue to work on unaffected systems seems to outweigh this, and we can print a warning to push people to upgrade their userspace. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Avoid recycling old FMR R_Keys too soonOlaf Kirch2008-04-291-13/+0
| | | | | | | | | | | | | | | | | | | | | | | When a FMR is unmapped, mthca resets the map count to 0, and clears the upper part of the R_Key which is used as the sequence counter. This poses a problem for RDS, which uses ib_fmr_unmap as a fence operation. RDS assumes that after issuing an unmap, the old R_Keys will be invalid for a "reasonable" period of time. For instance, Oracle processes uses shared memory buffers allocated from a pool of buffers. When a process dies, we want to reclaim these buffers -- but we must make sure there are no pending RDMA operations to/from those buffers. The only way to achieve that is by using unmap and sync the TPT. However, when the sequence count is reset on unmap, there is a high likelihood that a new mapping will be given the same R_Key that was issued a few milliseconds ago. To prevent this, don't reset the sequence count when unmapping a FMR. Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* mlx4_core: Avoid recycling old FMR R_Keys too soonOlaf Kirch2008-04-291-6/+0
| | | | | | | | | | | | | | | | | | | | | | | When a FMR is unmapped, mlx4 resets the map count to 0, and clears the upper part of the R_Key which is used as the sequence counter. This poses a problem for RDS, which uses ib_fmr_unmap as a fence operation. RDS assumes that after issuing an unmap, the old R_Keys will be invalid for a "reasonable" period of time. For instance, Oracle processes uses shared memory buffers allocated from a pool of buffers. When a process dies, we want to reclaim these buffers -- but we must make sure there are no pending RDMA operations to/from those buffers. The only way to achieve that is by using unmap and sync the TPT. However, when the sequence count is reset on unmap, there is a high likelihood that a new mapping will be given the same R_Key that was issued a few milliseconds ago. To prevent this, don't reset the sequence count when unmapping a FMR. Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ehca: Allocate event queue size depending on max number of CQs and QPsStefan Roscher2008-04-294-4/+74
| | | | | | | | | | | | | | If a lot of QPs fall into Error state at once and the EQ of the respective HCA is too small, it might overrun, causing the eHCA driver to stop processing completion events and calling the application's completion handlers, effectively causing traffic to stop. Fix this by limiting available QPs and CQs to a customizable max count, and determining EQ size based on these counts and a worst-case assumption. Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IPoIB: Use separate CQ for UD send completionsEli Cohen2008-04-296-40/+64
| | | | | | | | | | Use a dedicated CQ for UD send completions. Also, do not arm the UD send CQ, which reduces the number of interrupts generated. This patch farther reduces overhead by not calling poll CQ for every posted send WR -- it does polls only when there 16 or more outstanding work requests. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/iser: Count FMR alignment violations per sessionEli Dorfman2008-04-293-1/+6
| | | | | | | | Count FMR alignment violations per session as part of the iscsi statistics. Signed-off-by: Eli Dorfman <elid@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/iser: Move high-volume debug output to higher debug levelEli Dorfman2008-04-292-2/+12
| | | | | | | Add another level for debug. Signed-off-by: Eli Dorfman <elid@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ehca: handle negative return value from ibmebus_request_irq() properlyHoang-Nam Nguyen2008-04-291-18/+17
| | | | | | | | | | | | ehca_create_eq() was assigning a signed return value to an unsiged local variable and then checking if the variable was < 0, which meant that errors were always ignored. Fix this by using one variable for signed integer return values and another for u64 hcall return values. Bug originally found by Roel Kluin <12o3l@tiscali.nl>. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* RDMA/cxgb3: Support peer-2-peer connection setupSteve Wise2008-04-297-29/+137
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Open MPI, Intel MPI and other applications don't respect the iWARP requirement that the client (active) side of the connection send the first RDMA message. This class of application connection setup is called peer-to-peer. Typically once the connection is setup, _both_ sides want to send data. This patch enables supporting peer-to-peer over the chelsio RNIC by enforcing this iWARP requirement in the driver itself as part of RDMA connection setup. Connection setup is extended, when the peer2peer module option is 1, such that the MPA initiator will send a 0B Read (the RTR) just after connection setup. The MPA responder will suspend SQ processing until the RTR message is received and reply-to. In the longer term, this will be handled in a standardized way by enhancing the MPA negotiation so peers can indicate whether they want/need the RTR and what type of RTR (0B read, 0B write, or 0B send) should be sent. This will be done by standardizing a few bits of the private data in order to negotiate all this. However this patch enables peer-to-peer applications now and allows most of the required firmware and driver changes to be done and tested now. Design: - Add a module option, peer2peer, to enable this mode. - New firmware support for peer-to-peer mode: - a new bit in the rdma_init WR to tell it to do peer-2-peer and what form of RTR message to send or expect. - process _all_ preposted recvs before moving the connection into rdma mode. - passive side: defer completing the rdma_init WR until all pre-posted recvs are processed. Suspend SQ processing until the RTR is received. - active side: expect and process the 0B read WR on offload TX queue. Defer completing the rdma_init WR until all pre-posted recvs are processed. Suspend SQ processing until the 0B read WR is processed from the offload TX queue. - If peer2peer is set, driver posts 0B read request on offload TX queue just after posting the rdma_init WR to the offload TX queue. - Add CQ poll logic to ignore unsolicitied read responses. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* RDMA/cxgb3: Set the max_mr_size device attribute correctlySteve Wise2008-04-294-1/+4
| | | | | | | | cxgb3 only supports 4GB memory regions. The lustre RDMA code uses this attribute and currently has to code around our bad setting. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* RDMA/cxgb3: Correctly serialize peer abort pathSteve Wise2008-04-293-35/+72
| | | | | | | | | | | | | | | | | | | | Open MPI and other stress testing exposed a few bad bugs in handling aborts in the middle of a normal close. Fix these by: - serializing abort reply and peer abort processing with disconnect processing - warning (and ignoring) if ep timer is stopped when it wasn't running - cleaning up disconnect path to correctly deal with aborting and dead endpoints - in iwch_modify_qp(), taking a ref on the ep before releasing the qp lock if iwch_ep_disconnect() will be called. The ref is dropped after calling disconnect. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* mlx4_core: Add a way to set the "collapsed" CQ flagYevgeny Petrilin2008-04-293-3/+6
| | | | | | | | Extend the mlx4_cq_resize() API with a way to set the "collapsed" flag for the CQ being created. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* Improve queue_is_locked()Jens Axboe2008-04-291-0/+4
| | | | | | | | | spin_is_locked() doesn't work on UP without spinlock debugging. Make it safer and just return 1 on UP, so we don't get false positives. The plan is to kill this debug function during the -rc cycle. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* drivers/net/tehuti: use proper capability check for raw IO accessLinus Torvalds2008-04-291-1/+1
| | | | | | | | | Yeah, in practice they both mean "root", but Alan correctly points out that anybody who gets to do raw IO space accesses should really be using CAP_SYS_RAWIO rather than CAP_NET_ADMIN. Pointed-out-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'audit.b50' of ↵Linus Torvalds2008-04-2922-233/+346
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current * 'audit.b50' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: [PATCH] new predicate - AUDIT_FILETYPE [patch 2/2] Use find_task_by_vpid in audit code [patch 1/2] audit: let userspace fully control TTY input auditing [PATCH 2/2] audit: fix sparse shadowed variable warnings [PATCH 1/2] audit: move extern declarations to audit.h Audit: MAINTAINERS update Audit: increase the maximum length of the key field Audit: standardize string audit interfaces Audit: stop deadlock from signals under load Audit: save audit_backlog_limit audit messages in case auditd comes back Audit: collect sessionid in netlink messages Audit: end printk with newline
| * [PATCH] new predicate - AUDIT_FILETYPEAl Viro2008-04-283-0/+25
| | | | | | | | | | | | | | | | | | | | Argument is S_IF... | <index>, where index is normally 0 or 1. Triggers if chosen element of ctx->names[] is present and the mode of object in question matches the upper bits of argument. I.e. for things like "is the argument of that chmod a directory", etc. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * [patch 2/2] Use find_task_by_vpid in audit codePavel Emelyanov2008-04-281-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The pid to lookup a task by is passed inside audit code via netlink message. Thanks to Denis Lunev, netlink packets are now (since 2.6.24) _always_ processed in the context of the sending task. So this is correct to lookup the task with find_task_by_vpid() here. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * [patch 1/2] audit: let userspace fully control TTY input auditingMiloslav Trmac2008-04-283-59/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the code that automatically disables TTY input auditing in processes that open TTYs when they have no other TTY open; this heuristic was intended to automatically handle daemons, but it has false positives (e.g. with sshd) that make it impossible to control TTY input auditing from a PAM module. With this patch, TTY input auditing is controlled from user-space only. On the other hand, not even for daemons does it make sense to audit "input" from PTY masters; this data was produced by a program writing to the PTY slave, and does not represent data entered by the user. Signed-off-by: Miloslav Trmac <mitr@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * [PATCH 2/2] audit: fix sparse shadowed variable warningsHarvey Harrison2008-04-283-15/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use msglen as the identifier. kernel/audit.c:724:10: warning: symbol 'len' shadows an earlier one kernel/audit.c:575:8: originally declared here Don't use ino_f to check the inode field at the end of the functions. kernel/auditfilter.c:429:22: warning: symbol 'f' shadows an earlier one kernel/auditfilter.c:420:21: originally declared here kernel/auditfilter.c:542:22: warning: symbol 'f' shadows an earlier one kernel/auditfilter.c:529:21: originally declared here i always used as a counter for a for loop and initialized to zero before use. Eliminate the inner i variables. kernel/auditsc.c:1295:8: warning: symbol 'i' shadows an earlier one kernel/auditsc.c:1152:6: originally declared here kernel/auditsc.c:1320:7: warning: symbol 'i' shadows an earlier one kernel/auditsc.c:1152:6: originally declared here Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * [PATCH 1/2] audit: move extern declarations to audit.hHarvey Harrison2008-04-283-11/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Leave audit_sig_{uid|pid|sid} protected by #ifdef CONFIG_AUDITSYSCALL. Noticed by sparse: kernel/audit.c:73:6: warning: symbol 'audit_ever_enabled' was not declared. Should it be static? kernel/audit.c:100:8: warning: symbol 'audit_sig_uid' was not declared. Should it be static? kernel/audit.c:101:8: warning: symbol 'audit_sig_pid' was not declared. Should it be static? kernel/audit.c:102:6: warning: symbol 'audit_sig_sid' was not declared. Should it be static? kernel/audit.c:117:23: warning: symbol 'audit_ih' was not declared. Should it be static? kernel/auditfilter.c:78:18: warning: symbol 'audit_filter_list' was not declared. Should it be static? Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * Audit: MAINTAINERS updateEric Paris2008-04-281-3/+5
| | | | | | | | | | | | | | Change maintainers to include me and al viro Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * Audit: increase the maximum length of the key fieldEric Paris2008-04-281-1/+1
| | | | | | | | | | | | | | | | | | Key lengths were arbitrarily limited to 32 characters. If userspace is going to start using the single kernel key field as multiple virtual key fields (example key=key1,key2,key3,key4) we should give them enough room to work. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * Audit: standardize string audit interfacesEric Paris2008-04-285-24/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch standardized the string auditing interfaces. No userspace changes will be visible and this is all just cleanup and consistancy work. We have the following string audit interfaces to use: void audit_log_n_hex(struct audit_buffer *ab, const unsigned char *buf, size_t len); void audit_log_n_string(struct audit_buffer *ab, const char *buf, size_t n); void audit_log_string(struct audit_buffer *ab, const char *buf); void audit_log_n_untrustedstring(struct audit_buffer *ab, const char *string, size_t n); void audit_log_untrustedstring(struct audit_buffer *ab, const char *string); This may be the first step to possibly fixing some of the issues that people have with the string output from the kernel audit system. But we still don't have an agreed upon solution to that problem. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * Audit: stop deadlock from signals under loadEric Paris2008-04-281-5/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A deadlock is possible between kauditd and auditd under load if auditd receives a signal. When auditd receives a signal it sends a netlink message to the kernel asking for information about the sender of the signal. In that same context the audit system will attempt to send a netlink message back to the userspace auditd. If kauditd has already filled the socket buffer (see netlink_attachskb()) auditd will now put itself to sleep waiting for room to send the message. Since auditd is responsible for draining that socket we have a deadlock. The fix, since the response from the kernel does not need to be synchronous is to send the signal information back to auditd in a separate thread. And thus auditd can continue to drain the audit queue normally. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * Audit: save audit_backlog_limit audit messages in case auditd comes backEric Paris2008-04-281-21/+81
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch causes the kernel audit subsystem to store up to audit_backlog_limit messages for use by auditd if it ever appears sometime in the future in userspace. This is useful to collect audit messages during bootup and even when auditd is stopped. This is NOT a reliable mechanism, it does not ever call audit_panic, nor should it. audit_log_lost()/audit_panic() are called during the normal delivery mechanism. The messages are still sent to printk/syslog as usual and if too many messages appear to be queued they will be silently discarded. I liked doing it by default, but this patch only uses the queue in question if it was booted with audit=1 or if the kernel was built enabling audit by default. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * Audit: collect sessionid in netlink messagesEric Paris2008-04-2817-87/+132
| | | | | | | | | | | | | | | | | | | | Previously I added sessionid output to all audit messages where it was available but we still didn't know the sessionid of the sender of netlink messages. This patch adds that information to netlink messages so we can audit who sent netlink messages. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * Audit: end printk with newlineEric Paris2008-04-281-4/+4
| | | | | | | | | | | | | | A couple of audit printk statements did not have a newline. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | drivers/pcmcia/pcmcia_ioctl.c: fix buildakpm@linux-foundation.org2008-04-291-2/+2
| | | | | | | | | | | | | | argh. A hunk got lost from "proc: remove proc_bus" Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | dm: use unlocked variants of queue flag check/setJens Axboe2008-04-291-5/+3
| | | | | | | | | | | | | | dm.c already provides mutual exclusion through ->map_lock. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'for-linus' of ↵Linus Torvalds2008-04-2913-468/+302
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (21 commits) pciehp: fix error message about getting hotplug control pci/irq: let pci_device_shutdown to call pci_msi_shutdown v2 pci/irq: restore mask_bits in msi shutdown -v3 doc: replace yet another dev with pdev for consistency in DMA-mapping.txt PCI: don't expose struct pci_vpd to userspace doc: fix an incorrect suggestion to pass NULL for PCI like buses Consistently use pdev as the variable of type struct pci_dev *. pciehp: Fix command write shpchp: fix slot name make pciehp_acpi_get_hp_hw_control_from_firmware() pciehp: Clean up pcie_init() pciehp: Mask hotplug interrupt at controller release pciehp: Remove useless hotplug interrupt enabling pciehp: Fix wrong slot capability check pciehp: Fix wrong slot control register access pciehp: Add missing memory barrier pciehp: Fix interrupt event handlig pciehp: fix slot name Update MAINTAINERS with location of PCI tree PCI: Add Intel SCH PCI IDs ...
| * | pciehp: fix error message about getting hotplug controlKenji Kaneshige2008-04-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | People are confused by the following error message that actually is not for indicating a error. Cannot get control of hotplug hardware for pci %s This patch changes this message to debug message. Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Jesse Barnes <jbarnes@hobbes.lan>
| * | pci/irq: let pci_device_shutdown to call pci_msi_shutdown v2Yinghai Lu2008-04-293-9/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [PATCH 2/2] pci/irq: let pci_device_shutdown to call pci_msi_shutdown v2 this change | commit 23a274c8a5adafc74a66f16988776fc7dd6f6e51 | Author: Prakash, Sathya <sathya.prakash@lsi.com> | Date: Fri Mar 7 15:53:21 2008 +0530 | | [SCSI] mpt fusion: Enable MSI by default for SAS controllers | | This patch modifies the driver to enable MSI by default for all SAS chips. | | Signed-off-by: Sathya Prakash <sathya.prakash@lsi.com> | Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> | Causes the kexec of a RHEL 5.1 kernel to fail. root casue: the rhel 5.1 kernel still uses INTx emulation. and mptscsih_shutdown doesn't call pci_disable_msi to reenable INTx on kexec path So call pci_msi_shutdown in the shutdown path to do the same thing to msix Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Jesse Barnes <jbarnes@hobbes.lan>
| * | pci/irq: restore mask_bits in msi shutdown -v3Yinghai Lu2008-04-292-7/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [PATCH 1/2] pci/irq: restore mask_bits in msi shutdown -v3 Yinghai found that kexec'ing a RHEL 5.1 kernel with 2.6.25-rc3+ kernels prevents his NIC from working. He bisected to | commit 89d694b9dbe769ca1004e01db0ca43964806a611 | Author: Thomas Gleixner <tglx@linutronix.de> | Date: Mon Feb 18 18:25:17 2008 +0100 | | genirq: do not leave interupts enabled on free_irq | | The default_disable() function was changed in commit: | | 76d2160147f43f982dfe881404cfde9fd0a9da21 | genirq: do not mask interrupts by default | For MSI, default_shutdown will call mask_bit for msi device. All mask bits will left disabled after free_irq. Then in the kexec case, the next kernel can only use msi_enable bit, so all device's MSI can not be used. So lets to restore the mask bit to its pci reset defined value (enabled) when we disable the kernels use of msi to be a little friendlier to kexec'd kernels. Extend msi_set_mask_bit to msi_set_mask_bits to take mask, so we can fully restore that to 0x00 instead of 0xfe. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Jesse Barnes <jbarnes@hobbes.lan>
| * | doc: replace yet another dev with pdev for consistency in DMA-mapping.txtMatti Linnanvuori2008-04-291-1/+1
| | | | | | | | | | | | | | | | | | | | | Replace "dev" with "pdev" for consistency in DMA-mapping.txt. Signed-off-by: Matti Linnanvuori <mattilinnanvuori@yahoo.com> Signed-off-by: Jesse Barnes <jbarnes@hobbes.lan>
| * | PCI: don't expose struct pci_vpd to userspaceJesse Barnes2008-04-281-2/+2
| | | | | | | | | | | | | | | | | | | | | We just need to forward declare it for struct pci_dev, not expose it outside of __KERNEL__. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | doc: fix an incorrect suggestion to pass NULL for PCI like busesMatti Linnanvuori2008-04-281-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | Fix an incorrect suggestion to pass NULL to pci_alloc_consistent for PCI like buses where devices don't have struct pci_dev (like ISA, EISA). Signed-off-by: Matti Linnanvuori <mattilinnanvuori@yahoo.com> Acked-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@hobbes.lan>
| * | Consistently use pdev as the variable of type struct pci_dev *.Matti Linnanvuori2008-04-281-16/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update DMA mapping documentation to use 'pdev' rather than 'dev' in example code that calls routines expecting 'struct pci_device *', since 'dev' might make readers think they're passing 'struct device *' parameters. Bug 10397. Signed-off-by: Matti Linnanvuori <mattilinnanvuori@yahoo.com> Acked-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | pciehp: Fix command writeKenji Kaneshige2008-04-251-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current implementation of pciehp_write_cmd() always enables command completed interrupt. But pciehp_write_cmd() is also used for clearing command completed interrupt enable bit. In this case, we must not set the command completed interrupt enable bit. To fix this bug, this patch add the check to see if caller wants to change command complete interrupt enable bit. Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | shpchp: fix slot nameKenji Kaneshige2008-04-251-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current shpchp uses the combination of bus number and slot number as a slot name. But it is not a good idea because bus number is not a physical identifier but a logical identifier. This is against the shpc specification. So remove the bus number from the physical identifier. However, there are some platforms with the problem that it provides the same slot number. For those platforms, this patch also introduces new module option 'shpchp_slot_with_bus'. If it is specified, shpchp uses the combination of bus number and slot number as a slot name. Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | make pciehp_acpi_get_hp_hw_control_from_firmware()Adrian Bunk2008-04-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | this_patch_makes_the_needlessly_global_pciehp_acpi_get_hp_hw_control_from_firmware_static ;) Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | pciehp: Clean up pcie_init()Kenji Kaneshige2008-04-251-110/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clean up pciehp_ini(). This patch is trying to - Remove redundant capablity checks that were already done in PCIe port bus driver. - Separate the code only for debugging and make debug information easier to read. - Make the entire code easier to read and understand what it is doing. Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | pciehp: Mask hotplug interrupt at controller releaseKenji Kaneshige2008-04-251-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We must disable hotplug interrupt at controller relase time, otherwise spurious interrupts might happen if any slot events occured (e.g. MRL change) after unloading pciehp driver. Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | pciehp: Remove useless hotplug interrupt enablingKenji Kaneshige2008-04-251-46/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hotplug interrupt is enabled at initialization and nobody clears it. So we need to setup it in each command. This patch removes redundant codes about this. Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | pciehp: Fix wrong slot capability checkKenji Kaneshige2008-04-254-39/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current pciehp saves only 8bits of Slot Capability registers in ctrl->ctrlcap. But it refers more than 8bit for checking EMI capability. It is clearly a bug and EMI would never work. To fix this problem, this patch saves full Slot Capability contens in ctrl->slot_cap. It also reduce the redundant reads of Slot Capability register. And this pach also cleans up the macros to check the slot capabilitys (e.g. MRL_SENS(), and so on). Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | pciehp: Fix wrong slot control register accessKenji Kaneshige2008-04-251-112/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current pciehp implementaion clears hotplug events without waiting for command completion. Because of this, events might not be cleared properly. To prevent this problem, we must use pciehp_write_cmd() to write to Slot Control register. Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | pciehp: Add missing memory barrierKenji Kaneshige2008-04-251-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the possible race condition between pcie_isr() and pciehp_write_cmd() because of the lack of memory barrier. Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | pciehp: Fix interrupt event handligKenji Kaneshige2008-04-252-120/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current pciehp implementation disables and re-enables hotplug interrupts in its interrupt handler. This operation might be intend to guarantee that interrupts for the events newly occured during previous events are being handled will be successfully generated. But current implementaion has the following prolems. - Current interrupt service routin clears status changes without waiting command completion. Because of this, events might not be cleared properly. - Current interrupt service routine clears status changes caused by disabling or enabling hotplug interrupts itself. This will lose new events that occurs during previous interrupts are being handled. - Current implementation doesn't have any serialization mechanism between the code to wait for command completion and the interrupt handler that clears the command completion events caused by itself. There is clearly race conditions between them, and it may cause the problem that waiting for command completion doesn't work for example. To fix those problems, this patch stops disabling/re-enabling hotplug interrupts in interrupt service routine. Instead of this, this patch re-inspects Slot Status register after clearing what is presumed to be the last bending interrupt in order to guarantee that all interrupt events are serviced. Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>