summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Documentation: add sysfs entries for mtd docg3 chipsRobert Jarzmik2012-01-091-0/+34
| | | | | | | | | Add documentation for MSystems disk-on-chip docg3 chips sysfs entries, which enable and disable protection areas, giving or disabling access to the chip's memory. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: docg3: dereferencing an ERR_PTR() in docg3_probe()Dan Carpenter2012-01-091-9/+12
| | | | | | | | | | | | | | If doc_probe_device() returned an ERR_PTR, then we accidentally saved that to docg3_floors[floor] = mtd; which gets derefenced in the error handling when we call doc_release_device(). I've reworked the error handling to take care of that and hopefully make it a little simpler. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: Remove redundant spi driver bus initializationLars-Peter Clausen2012-01-093-3/+0
| | | | | | | | | | | | | | | | | | | | | | In ancient times it was necessary to manually initialize the bus field of an spi_driver to spi_bus_type. These days this is done in spi_driver_register(), so we can drop the manual assignment. The patch was generated using the following coccinelle semantic patch: // <smpl> @@ identifier _driver; @@ struct spi_driver _driver = { .driver = { - .bus = &spi_bus_type, }, }; // </smpl> Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: docg3: add protection areas sysfs accessRobert Jarzmik2012-01-092-0/+134
| | | | | | | | | | | | As each docg3 chip has 2 protection areas (DPS0 and DPS1), and because theses areas can prevent user access to the chip data, add for each floor the sysfs entries which insert the protection key into the right DPS. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Reviewed-by: Ivan Djelic <ivan.djelic@parrot.com> Reviewed-by: Mike Dunn <mikedunn@newsguy.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: docg3: add fast modeRobert Jarzmik2012-01-092-20/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Docg3 chips can work in 3 modes : normal MLC mode, fast mode and reliable mode. Normally, as docg3 is a MLC chip, it should be configured to work in normal mode. In both normal mode, each page is distinct. This means that writing to page 12 of blocks 14,15 writes only to that page, and reading from page 12 of blocks 14,15 reads only from that page. In reliable and fast modes, pages are coupled by pairs, and are clones one of each other. This means that the available capacity of the chip is halved. Pages are coupled in each block, and page of index 2*n contains the same data as page 2*n+1 of the same block. In fast mode, the reads occur a bit faster, but are a bit less reliable that in normal mode. When reading from page 2*n, the chip reads bytes from both page 2*n and page 2*n+1, makes a logical and for each byte, and returns the result. As programming a page means "clearing bits", even if a bit was not cleared on one page because the flash is worn out, the other page has the bit cleared, and the result of the "AND" gives a correct result. When writing to page 2*n, the chip writes data to both page 2*n and page 2*n+1. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Reviewed-by: Ivan Djelic <ivan.djelic@parrot.com> Reviewed-by: Mike Dunn <mikedunn@newsguy.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: docg3: add suspend and resumeRobert Jarzmik2012-01-092-1/+80
| | | | | | | | | | Add functions to powerdown and powerup from suspend, in order to save power. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Reviewed-by: Ivan Djelic <ivan.djelic@parrot.com> Reviewed-by: Mike Dunn <mikedunn@newsguy.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: docg3: add ECC correction codeRobert Jarzmik2012-01-093-23/+113
| | | | | | | | | | | | | | | | Credit for discovering the BCH algorith parameters, and bit reversing algorithm is to be give to Mike Dunn and Ivan Djelic. The BCH correction code relied upon the BCH library, where all data and ECC is bit-reversed. The BCH library works correctly when each input byte is bit-reversed, and accordingly ECC output is also bit-reversed. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Reviewed-by: Ivan Djelic <ivan.djelic@parrot.com> Reviewed-by: Mike Dunn <mikedunn@newsguy.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: docg3: map erase and write functionsRobert Jarzmik2012-01-091-10/+4
| | | | | | | | | | Map the developped write and erase functions into the mtd structure. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Reviewed-by: Ivan Djelic <ivan.djelic@parrot.com> Reviewed-by: Mike Dunn <mikedunn@newsguy.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: docg3: add erase functionsRobert Jarzmik2012-01-091-0/+90
| | | | | | | | | | | Add erase capability to the docg3 driver. The erase block is made of 2 physical blocks, as both share all 64 pages. That makes an erase block of at least 64 kBytes. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Reviewed-by: Ivan Djelic <ivan.djelic@parrot.com> Reviewed-by: Mike Dunn <mikedunn@newsguy.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: docg3: add write functionsRobert Jarzmik2012-01-091-12/+541
| | | | | | | | | | | | Add write capability to the docg3 driver. The writes are possible on a single page (512 bytes + 16 bytes), even if that page is split on 2 physical pages on 2 blocks (each on one plane). Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Reviewed-by: Ivan Djelic <ivan.djelic@parrot.com> Reviewed-by: Mike Dunn <mikedunn@newsguy.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: docg3: add OOB buffer to device structureRobert Jarzmik2012-01-091-0/+8
| | | | | | | | | | | Add OOB buffer area to store the OOB data until the actual page is written, so that it can be completed by hardware ECC generator. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Reviewed-by: Ivan Djelic <ivan.djelic@parrot.com> Reviewed-by: Mike Dunn <mikedunn@newsguy.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: docg3: add registers for erasing and writingRobert Jarzmik2012-01-091-1/+13
| | | | | | | | | | Add the required registers and commands to erase and write flash pages / blocks. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Reviewed-by: Ivan Djelic <ivan.djelic@parrot.com> Reviewed-by: Mike Dunn <mikedunn@newsguy.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: docg3: add OOB layout to mtdinfoRobert Jarzmik2012-01-091-0/+15
| | | | | | | | | | Add OOB layout description for docg3, so that userspace can use this information to setup the data for write_oob(). Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Reviewed-by: Ivan Djelic <ivan.djelic@parrot.com> Reviewed-by: Mike Dunn <mikedunn@newsguy.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: docg3: add multiple floor supportRobert Jarzmik2012-01-092-56/+126
| | | | | | | | | | | | | | Add support for multiple floors, ie. cascaded docg3 chips. There might be 4 docg3 chips cascaded, sharing the same address space, and providing up to 4 times the storage capacity of a unique chip. Each floor will be seen as an independant mtd device. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Reviewed-by: Ivan Djelic <ivan.djelic@parrot.com> Reviewed-by: Mike Dunn <mikedunn@newsguy.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: docg3: fix reading oob+data without correctionRobert Jarzmik2012-01-091-99/+95
| | | | | | | | | | | | | | | | | | | | Fix the docg3 reads to be able to cope with all possible data buffer / oob buffer / file mode combinations from docg3_read_oob(). This especially ensures that raw reads do not use ECC corrections, and AUTOOOB and PLACEOOB do use ECC correction. The approach is to empty docg3_read() and make it a wrapper to docg3_read_oob(). As docg3_read_oob() handles all the funny cases (no data buffer but oob buffer, data buffer but no oob buffer, ...), docg3_read() is just a special use of docg3_read_oob(). Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Reviewed-by: Ivan Djelic <ivan.djelic@parrot.com> Reviewed-by: Mike Dunn <mikedunn@newsguy.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: docg3: fix BCH registersRobert Jarzmik2012-01-091-1/+1
| | | | | | | | | | BCH registers are contiguous, not on every byte. Fix the register definitions. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Reviewed-by: Ivan Djelic <ivan.djelic@parrot.com> Reviewed-by: Mike Dunn <mikedunn@newsguy.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: docg3: fix protection areas readingRobert Jarzmik2012-01-091-7/+9
| | | | | | | | | | | | The protection areas boundaries were on 16bit registers, not 8bit. This is consistent with block numbers, which can extend up to 4096 on bigger chips (and is 2048 on the docg3). Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Reviewed-by: Ivan Djelic <ivan.djelic@parrot.com> Reviewed-by: Mike Dunn <mikedunn@newsguy.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: docg3: fix tracing of IO in writebRobert Jarzmik2012-01-091-1/+1
| | | | | | | | | | Writeb was incorrectly traced as a 16 bits write, instead of a 8 bits write. Fix it by tracing the correct width. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Reviewed-by: Ivan Djelic <ivan.djelic@parrot.com> Reviewed-by: Mike Dunn <mikedunn@newsguy.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: docg3: fix debug log verbosityRobert Jarzmik2012-01-091-1/+1
| | | | | | | | | | Change the NOP debug log verbosity to very verbose to unburden log analysis. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Reviewed-by: Ivan Djelic <ivan.djelic@parrot.com> Reviewed-by: Mike Dunn <mikedunn@newsguy.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: nand: Making MTD_NAND_OMAP2 depend on ARCH_OMAP2PLUSShubhrajyoti D2012-01-091-1/+1
| | | | | | | | | | Making MTD_NAND_OMAP2 depend on ARCH_OMAP2PLUS instead of oring with ARCH2/3/4. Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Shubhrajyoti D <shubhrajyoti@ti.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: cfi: Allow per-mapping CFI device endiannessAaron Sierra2012-01-094-57/+41
| | | | | | | | | | | This patch allows each CFI device map to use its own endianness. The globally defined CFI endianness (CONFIG_MTD_CFI_NOSWAP, CONFIG_MTD_CFI_BE_BYTE_SWAP or CONFIG_MTD_CFI_LE_BYTE_SWAP) becomes the default value which can be overridden by a driver for a particular device. Signed-off-by: Aaron Sierra <asierra@xes-inc.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: mtd_blkdevs: don't increase 'open' count on error pathBrian Norris2012-01-091-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some error paths in mtd_blkdevs were fixed in the following commit: commit 94735ec4044a6d318b83ad3c5794e931ed168d10 mtd: mtd_blkdevs: fix error path in blktrans_open But on these error paths, the block device's `dev->open' count is already incremented before we check for errors. This meant that, while the error path was handled correctly on the first time through blktrans_open(), the device is erroneously considered already open on the second time through. This problem can be seen, for instance, when a UBI volume is simultaneously mounted as a UBIFS partition and read through its corresponding gluebi mtdblockX device. This results in blktrans_open() passing its error checks (with `dev->open > 0') without actually having a handle on the device. Here's a summarized log of the actions and results with nandsim: # modprobe nandsim # modprobe mtdblock # modprobe gluebi # modprobe ubifs # ubiattach /dev/ubi_ctrl -m 0 ... # ubimkvol /dev/ubi0 -N test -s 16MiB ... # mount -t ubifs ubi0:test /mnt # ls /dev/mtdblock* /dev/mtdblock0 /dev/mtdblock1 # cat /dev/mtdblock1 > /dev/null cat: can't open '/dev/mtdblock4': Device or resource busy # cat /dev/mtdblock1 > /dev/null CPU 0 Unable to handle kernel paging request at virtual address fffffff0, epc == 8031536c, ra == 8031f280 Oops[#1]: ... Call Trace: [<8031536c>] ubi_leb_read+0x14/0x164 [<8031f280>] gluebi_read+0xf0/0x148 [<802edba8>] mtdblock_readsect+0x64/0x198 [<802ecfe4>] mtd_blktrans_thread+0x330/0x3f4 [<8005be98>] kthread+0x88/0x90 [<8000bc04>] kernel_thread_helper+0x10/0x18 Cc: stable@kernel.org [3.0+] Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: nand: scan 1st and 2nd page for Macronix SLCBrian Norris2012-01-091-3/+4
| | | | | | Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: nand: add 512 Mbit device code (Macronix)Brian Norris2012-01-091-1/+2
| | | | | | | | Macronix MX30LF1208AA is a 512 Mbit NAND with device code 0xF0. Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* mtd: nand: add Macronix manufacturerBrian Norris2012-01-092-0/+2
| | | | | | | | | Macronix is produing SLC NAND MX30LF1208AA, so add their manufacturer code to the manufacturer lists. Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* Linux 3.2-rc7v3.2-rc7Linus Torvalds2011-12-231-1/+1
|
* Merge branch 'for-linus' of ↵Linus Torvalds2011-12-231-4/+32
|\ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: VFS: Fix race between CPU hotplug and lglocks
| * VFS: Fix race between CPU hotplug and lglocksSrivatsa S. Bhat2011-12-221-4/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the *_global_[un]lock_online() routines are not at all synchronized with CPU hotplug. Soft-lockups detected as a consequence of this race was reported earlier at https://lkml.org/lkml/2011/8/24/185. (Thanks to Cong Meng for finding out that the root-cause of this issue is the race condition between br_write_[un]lock() and CPU hotplug, which results in the lock states getting messed up). Fixing this race by just adding {get,put}_online_cpus() at appropriate places in *_global_[un]lock_online() is not a good option, because, then suddenly br_write_[un]lock() would become blocking, whereas they have been kept as non-blocking all this time, and we would want to keep them that way. So, overall, we want to ensure 3 things: 1. br_write_lock() and br_write_unlock() must remain as non-blocking. 2. The corresponding lock and unlock of the per-cpu spinlocks must not happen for different sets of CPUs. 3. Either prevent any new CPU online operation in between this lock-unlock, or ensure that the newly onlined CPU does not proceed with its corresponding per-cpu spinlock unlocked. To achieve all this: (a) We introduce a new spinlock that is taken by the *_global_lock_online() routine and released by the *_global_unlock_online() routine. (b) We register a callback for CPU hotplug notifications, and this callback takes the same spinlock as above. (c) We maintain a bitmap which is close to the cpu_online_mask, and once it is initialized in the lock_init() code, all future updates to it are done in the callback, under the above spinlock. (d) The above bitmap is used (instead of cpu_online_mask) while locking and unlocking the per-cpu locks. The callback takes the spinlock upon the CPU_UP_PREPARE event. So, if the br_write_lock-unlock sequence is in progress, the callback keeps spinning, thus preventing the CPU online operation till the lock-unlock sequence is complete. This takes care of requirement (3). The bitmap that we maintain remains unmodified throughout the lock-unlock sequence, since all updates to it are managed by the callback, which takes the same spinlock as the one taken by the lock code and released only by the unlock routine. Combining this with (d) above, satisfies requirement (2). Overall, since we use a spinlock (mentioned in (a)) to prevent CPU hotplug operations from racing with br_write_lock-unlock, requirement (1) is also taken care of. By the way, it is to be noted that a CPU offline operation can actually run in parallel with our lock-unlock sequence, because our callback doesn't react to notifications earlier than CPU_DEAD (in order to maintain our bitmap properly). And this means, since we use our own bitmap (which is stale, on purpose) during the lock-unlock sequence, we could end up unlocking the per-cpu lock of an offline CPU (because we had locked it earlier, when the CPU was online), in order to satisfy requirement (2). But this is harmless, though it looks a bit awkward. Debugged-by: Cong Meng <mc@linux.vnet.ibm.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org
* | Merge tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linuxLinus Torvalds2011-12-232-13/+13
|\ \ | | | | | | | | | | | | | | | | | | for linus: writeback reason binary tracing format fix * tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux: writeback: show writeback reason with __print_symbolic
| * | writeback: show writeback reason with __print_symbolicWu Fengguang2011-12-182-13/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | This makes the binary trace understandable by trace-cmd. CC: Dave Chinner <david@fromorbit.com> CC: Curt Wohlgemuth <curtw@google.com> CC: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
* | | Merge branch 'rc-fixes' of ↵Linus Torvalds2011-12-231-3/+2
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild * 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: kconfig: adapt update-po-config to new UML layout
| * | | kconfig: adapt update-po-config to new UML layoutPaul Bolle2011-12-181-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 5c48b108 ("um: take arch/um/sys-x86 to arch/x86/um") broke the make target update-po-config, as its symlink trick (again) fails. (Previous breakage was fixed with commit bdc69ca4 ("kconfig: change update-po-config to reflect new layout of arch/um").) The new UML layout allows to drop the symlick trick entirely. And if, one day, another architecture supports UML too, that should now work without again breaking this make target. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Michal Marek <mmarek@suse.cz>
* | | | Merge branch 'v4l_for_linus' of ↵Linus Torvalds2011-12-232-2/+2
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] omap3isp: Fix crash caused by subdevs now having a pointer to devnodes
| * | | | [media] omap3isp: Fix crash caused by subdevs now having a pointer to devnodesLaurent Pinchart2011-12-192-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 3e0ec41c5c5ee14e27f65e28d4a616de34f59a97 ("V4L: dynamically allocate video_device nodes in subdevices") makes the embedding video_device directly. Fix accesses to the devnode accordingly. Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Sakari Ailus <sakari.ailus@iki.fi> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* | | | | Merge branch 'for-linus' of ↵Linus Torvalds2011-12-232-5/+7
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: call d_instantiate after all ops are setup Btrfs: fix worker lock misuse in find_worker
| * | | | | Btrfs: call d_instantiate after all ops are setupAl Viro2011-12-231-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This closes races where btrfs is calling d_instantiate too soon during inode creation. All of the callers of btrfs_add_nondir are updated to instantiate after the inode is fully setup in memory. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | | | | Btrfs: fix worker lock misuse in find_workerChris Mason2011-12-231-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dan Carpenter noticed that we were doing a double unlock on the worker lock, and sometimes picking a worker thread without the lock held. This fixes both errors. Signed-off-by: Chris Mason <chris.mason@oracle.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
* | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds2011-12-231-2/+2
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: Fix MSIQ HV call ordering in pci_sun4v_msiq_build_irq().
| * | | | | | sparc64: Fix MSIQ HV call ordering in pci_sun4v_msiq_build_irq().David S. Miller2011-12-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This silently was working for many years and stopped working on Niagara-T3 machines. We need to set the MSIQ to VALID before we can set it's state to IDLE. On Niagara-T3, setting the state to IDLE first was causing HV_EINVAL errors. The hypervisor documentation says, rather ambiguously, that the MSIQ must be "initialized" before one can set the state. I previously understood this to mean merely that a successful setconf() operation has been performed on the MSIQ, which we have done at this point. But it seems to also mean that it has been set VALID too. Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2011-12-2310-20/+26
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: netfilter: xt_connbytes: handle negation correctly net: relax rcvbuf limits rps: fix insufficient bounds checking in store_rps_dev_flow_table_cnt() net: introduce DST_NOPEER dst flag mqprio: Avoid panic if no options are provided bridge: provide a mtu() method for fake_dst_ops
| * \ \ \ \ \ \ Merge branch 'nf' of git://1984.lsi.us.es/netDavid S. Miller2011-12-231-3/+3
| |\ \ \ \ \ \ \
| | * | | | | | | netfilter: xt_connbytes: handle negation correctlyFlorian Westphal2011-12-231-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "! --connbytes 23:42" should match if the packet/byte count is not in range. As there is no explict "invert match" toggle in the match structure, userspace swaps the from and to arguments (i.e., as if "--connbytes 42:23" were given). However, "what <= 23 && what >= 42" will always be false. Change things so we use "||" in case "from" is larger than "to". This change may look like it breaks backwards compatibility when "to" is 0. However, older iptables binaries will refuse "connbytes 42:0", and current releases treat it to mean "! --connbytes 0:42", so we should be fine. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | | | | | net: relax rcvbuf limitsEric Dumazet2011-12-233-10/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | skb->truesize might be big even for a small packet. Its even bigger after commit 87fb4b7b533 (net: more accurate skb truesize) and big MTU. We should allow queueing at least one packet per receiver, even with a low RCVBUF setting. Reported-by: Michal Simek <monstr@monstr.eu> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | rps: fix insufficient bounds checking in store_rps_dev_flow_table_cnt()Xi Wang2011-12-221-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Setting a large rps_flow_cnt like (1 << 30) on 32-bit platform will cause a kernel oops due to insufficient bounds checking. if (count > 1<<30) { /* Enforce a limit to prevent overflow */ return -EINVAL; } count = roundup_pow_of_two(count); table = vmalloc(RPS_DEV_FLOW_TABLE_SIZE(count)); Note that the macro RPS_DEV_FLOW_TABLE_SIZE(count) is defined as: ... + (count * sizeof(struct rps_dev_flow)) where sizeof(struct rps_dev_flow) is 8. (1 << 30) * 8 will overflow 32 bits. This patch replaces the magic number (1 << 30) with a symbolic bound. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | net: introduce DST_NOPEER dst flagEric Dumazet2011-12-224-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Chris Boot reported crashes occurring in ipv6_select_ident(). [ 461.457562] RIP: 0010:[<ffffffff812dde61>] [<ffffffff812dde61>] ipv6_select_ident+0x31/0xa7 [ 461.578229] Call Trace: [ 461.580742] <IRQ> [ 461.582870] [<ffffffff812efa7f>] ? udp6_ufo_fragment+0x124/0x1a2 [ 461.589054] [<ffffffff812dbfe0>] ? ipv6_gso_segment+0xc0/0x155 [ 461.595140] [<ffffffff812700c6>] ? skb_gso_segment+0x208/0x28b [ 461.601198] [<ffffffffa03f236b>] ? ipv6_confirm+0x146/0x15e [nf_conntrack_ipv6] [ 461.608786] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.614227] [<ffffffff81271d64>] ? dev_hard_start_xmit+0x357/0x543 [ 461.620659] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.626440] [<ffffffffa0379745>] ? br_parse_ip_options+0x19a/0x19a [bridge] [ 461.633581] [<ffffffff812722ff>] ? dev_queue_xmit+0x3af/0x459 [ 461.639577] [<ffffffffa03747d2>] ? br_dev_queue_push_xmit+0x72/0x76 [bridge] [ 461.646887] [<ffffffffa03791e3>] ? br_nf_post_routing+0x17d/0x18f [bridge] [ 461.653997] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.659473] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.665485] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.671234] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.677299] [<ffffffffa0379215>] ? nf_bridge_update_protocol+0x20/0x20 [bridge] [ 461.684891] [<ffffffffa03bb0e5>] ? nf_ct_zone+0xa/0x17 [nf_conntrack] [ 461.691520] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.697572] [<ffffffffa0374812>] ? NF_HOOK.constprop.8+0x3c/0x56 [bridge] [ 461.704616] [<ffffffffa0379031>] ? nf_bridge_push_encap_header+0x1c/0x26 [bridge] [ 461.712329] [<ffffffffa037929f>] ? br_nf_forward_finish+0x8a/0x95 [bridge] [ 461.719490] [<ffffffffa037900a>] ? nf_bridge_pull_encap_header+0x1c/0x27 [bridge] [ 461.727223] [<ffffffffa0379974>] ? br_nf_forward_ip+0x1c0/0x1d4 [bridge] [ 461.734292] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.739758] [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge] [ 461.746203] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.751950] [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge] [ 461.758378] [<ffffffffa037533a>] ? NF_HOOK.constprop.4+0x56/0x56 [bridge] This is caused by bridge netfilter special dst_entry (fake_rtable), a special shared entry, where attaching an inetpeer makes no sense. Problem is present since commit 87c48fa3b46 (ipv6: make fragment identifications less predictable) Introduce DST_NOPEER dst flag and make sure ipv6_select_ident() and __ip_select_ident() fallback to the 'no peer attached' handling. Reported-by: Chris Boot <bootc@bootc.net> Tested-by: Chris Boot <bootc@bootc.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | mqprio: Avoid panic if no options are providedThomas Graf2011-12-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Userspace may not provide TCA_OPTIONS, in fact tc currently does so not do so if no arguments are specified on the command line. Return EINVAL instead of panicing. Signed-off-by: Thomas Graf <tgraf@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | bridge: provide a mtu() method for fake_dst_opsEric Dumazet2011-12-221-0/+6
| | |/ / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 618f9bc74a039da76 (net: Move mtu handling down to the protocol depended handlers) forgot the bridge netfilter case, adding a NULL dereference in ip_fragment(). Reported-by: Chris Boot <bootc@bootc.net> CC: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | Merge branch 'for-linus' of git://neil.brown.name/mdLinus Torvalds2011-12-224-10/+13
|\ \ \ \ \ \ \ \ | |/ / / / / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://neil.brown.name/md: md/bitmap: It is OK to clear bits during recovery. md: don't give up looking for spares on first failure-to-add md/raid5: ensure correct assessment of drives during degraded reshape. md/linear: fix hot-add of devices to linear arrays.
| * | | | | | | md/bitmap: It is OK to clear bits during recovery.NeilBrown2011-12-231-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit d0a4bb492772ce5c4bdfba3744a99ed6f6fb238f introduced a regression which is annoying but fairly harmless. When writing to an array that is undergoing recovery (a spare in being integrated into the array), writing to the array will set bits in the bitmap, but they will not be cleared when the write completes. For bits covering areas that have not been recovered yet this is not a problem as the recovery will clear the bits. However bits set in already-recovered region will stay set and never be cleared. This doesn't risk data integrity. The only negatives are: - next time there is a crash, more resyncing than necessary will be done. - the bitmap doesn't look clean, which is confusing. While an array is recovering we don't want to update the 'events_cleared' setting in the bitmap but we do still want to clear bits that have very recently been set - providing they were written to the recovering device. So split those two needs - which previously both depended on 'success' and always clear the bit of the write went to all devices. Signed-off-by: NeilBrown <neilb@suse.de>
| * | | | | | | md: don't give up looking for spares on first failure-to-addNeilBrown2011-12-231-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before performing a recovery we try to remove any spares that might not be working, then add any that might have become relevant. Currently we abort on the first spare that cannot be added. This is a false optimisation. It is conceivable that - depending on rules in the personality - a subsequent spare might be accepted. Also the loop does other things like count the available spares and reset the 'recovery_offset' value. If we abort early these might not happen properly. So remove the early abort. In particular if you have an array what is undergoing recovery and which has extra spares, then the recovery may not restart after as reboot as the could of 'spares' might end up as zero. Reported-by: Anssi Hannula <anssi.hannula@iki.fi> Signed-off-by: NeilBrown <neilb@suse.de>