summaryrefslogtreecommitdiffstats
path: root/include
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds2015-09-112-35/+21
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull second round of SCSI updates from James Bottomley: "There's one late arriving patch here (added today), fixing a build issue which the scsi_dh patch set in here uncovered. Other than that, everything has been incubated in -next and the checkers for a week. The major pieces of this patch are a set patches facilitating better integration between scsi and scsi_dh (the device handling layer used by multi-path; all the dm parts are acked by Mike Snitzer). This also includes driver updates for mp3sas, scsi_debug and an assortment of bug fixes" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (50 commits) scsi_dh: fix randconfig build error scsi: fix scsi_error_handler vs. scsi_host_dev_release race fcoe: Convert use of __constant_htons to htons mpt2sas: setpci reset kernel oops fix pm80xx: Don't override ts->stat on IO_OPEN_CNX_ERROR_HW_RESOURCE_BUSY lpfc: Fix possible use-after-free and double free in lpfc_mbx_cmpl_rdp_page_a2() bfa: Fix incorrect de-reference of pointer bfa: Fix indentation scsi_transport_sas: Remove check for SAS expander when querying bay/enclosure IDs. scsi_debug: resp_request: remove unused variable scsi_debug: fix REPORT LUNS Well Known LU scsi_debug: schedule_resp fix input variable check scsi_debug: make dump_sector static scsi_debug: vfree is null safe so drop the check scsi_debug: use SCSI_W_LUN_REPORT_LUNS instead of SAM2_WLUN_REPORT_LUNS; scsi_debug: define pr_fmt() for consistent logging mpt2sas: Refcount fw_events and fix unsafe list usage mpt2sas: Refcount sas_device objects and fix unsafe list usage scsi_dh: return SCSI_DH_NOTCONN in scsi_dh_activate() scsi_dh: don't allow to detach device handlers at runtime ...
| * scsi_dh: kill struct scsi_dh_dataChristoph Hellwig2015-08-282-22/+20
| | | | | | | | | | | | | | | | | | | | | | Add a ->handler and a ->handler_data field to struct scsi_device and kill this indirection. Also move struct scsi_device_handler to scsi_dh.h so that changes to it don't require rebuilding every SCSI LLDD. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Odin.com>
| * scsi_dh: move device matching to the core codeChristoph Hellwig2015-08-281-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a single list of devices that need non-ALUA device handlers to the core scsi_dh code so that we can autoload the modules for them at probe time. While this is a little ugly in terms of architecture it actually significantly simplifies the code in addition to the new autoloading functionality. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: James Bottomley <JBottomley@Odin.com>
| * scsi_dh: integrate into the core SCSI codeChristoph Hellwig2015-08-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stop building scsi_dh as a separate module and integrate it fully into the core SCSI code with explicit callouts at bus scan time. For now the callouts are placed at the same point as the old bus notifiers were called, but in the future we will be able to look at ALUA INQUIRY data earlier on. Note that this also means that the device handler modules need to be loaded by the time we scan the bus. The next patches will add support for autoloading device handlers at bus scan time to make sure they are always loaded if they are enabled in the kernel config. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: James Bottomley <JBottomley@Odin.com>
| * dm-mpath, scsi_dh: request scsi_dh modules in scsi_dh, not dm-mpathChristoph Hellwig2015-08-281-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | This way we can reused the same code any attachment method, not just those requested from dm-mpath. [jejb: fixup checkpatch error] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: James Bottomley <JBottomley@Odin.com>
| * dm-mpath, scsi_dh: don't let dm detach device handlersChristoph Hellwig2015-08-282-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | While allowing dm-mpath to attach device handlers is a functionality we need for backwards compatibility reason there is no reason to reference count them and detach them if dm-mpath stops using the device for some reason. If the device handler works for the given device it can just stay attached, and we can take the retain_hw_handler codepath. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Hannes Reinecke <hare@Suse.de> Signed-off-by: James Bottomley <JBottomley@Odin.com>
* | Merge tag 'media/v4.3-2' of ↵Linus Torvalds2015-09-112-6/+49
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media updates from Mauro Carvalho Chehab: "A series of patches that move part of the code used to allocate memory from the media subsystem to the mm subsystem" [ The mm parts have been acked by VM people, and the series was apparently in -mm for a while - Linus ] * tag 'media/v4.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] drm/exynos: Convert g2d_userptr_get_dma_addr() to use get_vaddr_frames() [media] media: vb2: Remove unused functions [media] media: vb2: Convert vb2_dc_get_userptr() to use frame vector [media] media: vb2: Convert vb2_vmalloc_get_userptr() to use frame vector [media] media: vb2: Convert vb2_dma_sg_get_userptr() to use frame vector [media] vb2: Provide helpers for mapping virtual addresses [media] media: omap_vout: Convert omap_vout_uservirt_to_phys() to use get_vaddr_pfns() [media] mm: Provide new get_vaddr_frames() helper [media] vb2: Push mmap_sem down to memops
| * | [media] media: vb2: Remove unused functionsJan Kara2015-08-161-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conversion to the use of pinned pfns made some functions unused. Remove them. Also there's no need to lock mmap_sem in __buf_prepare() anymore. Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
| * | [media] vb2: Provide helpers for mapping virtual addressesJan Kara2015-08-161-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide simple helper functions to map virtual address range into an array of pfns / pages. Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
| * | [media] mm: Provide new get_vaddr_frames() helperJan Kara2015-08-161-0/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide new function get_vaddr_frames(). This function maps virtual addresses from given start and fills given array with page frame numbers of the corresponding pages. If given start belongs to a normal vma, the function grabs reference to each of the pages to pin them in memory. If start belongs to VM_IO | VM_PFNMAP vma, we don't touch page structures. Caller must make sure pfns aren't reused for anything else while he is using them. This function is created for various drivers to simplify handling of their buffers. Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
* | | Merge branch 'next' of ↵Linus Torvalds2015-09-112-18/+14
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux Pull thermal updates from Zhang Rui: - use int instead of unsigned long to represent temperature to avoid bogus overheat detection when negative temperature reported. From Sascha Hauer. - export available thermal governors information to user space via sysfs. From Wei Ni. - introduce new thermal driver for Wildcat Point platform controller hub, which uses PCH thermal sensor and associated critical and hot trip points. From Tushar Dave. - add suuport for Intel Skylake and Denlow platforms in powerclamp driver. - some small cleanups in thermal core. * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: thermal: Add Intel PCH thermal driver thermal: Add comment explaining test for critical temperature thermal: Use IS_ENABLED instead of #ifdef thermal: remove unnecessary call to thermal_zone_device_set_polling thermal: trivial: fix typo in comment thermal: consistently use int for temperatures thermal: add available policies sysfs attribute thermal/powerclamp: add cpu id for denlow platform thermal/powerclamp: add cpu id for Skylake u/y thermal/powerclamp: add cpu id for skylake h/s
| | \ \
| | \ \
| *-. \ \ Merge branches 'thermal-core' and 'thermal-intel' of .git into nextZhang Rui2015-09-022-18/+14
| |\ \ \ \
| | * | | | thermal: consistently use int for temperaturesSascha Hauer2015-08-032-18/+14
| | |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The thermal code uses int, long and unsigned long for temperatures in different places. Using an unsigned type limits the thermal framework to positive temperatures without need. Also several drivers currently will report temperatures near UINT_MAX for temperatures below 0°C. This will probably immediately shut the machine down due to overtemperature if started below 0°C. 'long' is 64bit on several architectures. This is not needed since INT_MAX °mC is above the melting point of all known materials. Consistently use a plain 'int' for temperatures throughout the thermal code and the drivers. This only changes the places in the drivers where the temperature is passed around as pointer, when drivers internally use another type this is not changed. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Jean Delvare <jdelvare@suse.de> Reviewed-by: Lukasz Majewski <l.majewski@samsung.com> Reviewed-by: Darren Hart <dvhart@linux.intel.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Peter Feuerer <peter@piie.net> Cc: Punit Agrawal <punit.agrawal@arm.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Eduardo Valentin <edubezval@gmail.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Jean Delvare <jdelvare@suse.de> Cc: Peter Feuerer <peter@piie.net> Cc: Heiko Stuebner <heiko@sntech.de> Cc: Lukasz Majewski <l.majewski@samsung.com> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: linux-acpi@vger.kernel.org Cc: platform-driver-x86@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-omap@vger.kernel.org Cc: linux-samsung-soc@vger.kernel.org Cc: Guenter Roeck <linux@roeck-us.net> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Maxime Ripard <maxime.ripard@free-electrons.com> Cc: Darren Hart <dvhart@infradead.org> Cc: lm-sensors@lm-sensors.org Signed-off-by: Zhang Rui <rui.zhang@intel.com>
* | | | | Merge branch 'for-linus' of ↵Linus Torvalds2015-09-113-1/+9
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph update from Sage Weil: "There are a few fixes for snapshot behavior with CephFS and support for the new keepalive protocol from Zheng, a libceph fix that affects both RBD and CephFS, a few bug fixes and cleanups for RBD from Ilya, and several small fixes and cleanups from Jianpeng and others" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: improve readahead for file holes ceph: get inode size for each append write libceph: check data_len in ->alloc_msg() libceph: use keepalive2 to verify the mon session is alive rbd: plug rbd_dev->header.object_prefix memory leak rbd: fix double free on rbd_dev->header_name libceph: set 'exists' flag for newly up osd ceph: cleanup use of ceph_msg_get ceph: no need to get parent inode in ceph_open ceph: remove the useless judgement ceph: remove redundant test of head->safe and silence static analysis warnings ceph: fix queuing inode to mdsdir's snaprealm libceph: rename con_work() to ceph_con_workfn() libceph: Avoid holding the zero page on ceph_msgr_slab_init errors libceph: remove the unused macro AES_KEY_SIZE ceph: invalidate dirty pages after forced umount ceph: EIO all operations after forced umount
| * | | | | libceph: use keepalive2 to verify the mon session is aliveYan, Zheng2015-09-083-1/+9
| |/ / / / | | | | | | | | | | | | | | | | | | | | Signed-off-by: Yan, Zheng <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
* | | | | Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds2015-09-111-1/+1
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull drm fixes from Dave Airlie: "Just a bunch of fixes to squeeze in before -rc1: - three nouveau regression fixes - one qxl regression fix - a bunch of i915 fixes ... and some core displayport/atomic fixes" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/nouveau/device: enable c800 quirk for tecra w50 drm/nouveau/clk/gt215: Unbreak engine pausing for GT21x/MCP7x drm/nouveau/gr/nv04: fix big endian setting on gr context drm/qxl: validate monitors config modes drm/i915: Allow DSI dual link to be configured on any pipe drm/i915: Don't try to use DDR DVFS on CHV when disabled in the BIOS drm/i915: Fix CSR MMIO address check drm/i915: Limit the number of loops for reading a split 64bit register drm/i915: Fix broken mst get_hw_state. drm/i915: Pass hpd_status_i915[] to intel_get_hpd_pins() in pre-g4x uapi/drm/i915_drm.h: fix userspace compilation. drm/i915: Always mark the object as dirty when used by the GPU drm/dp: Add dp_aux_i2c_speed_khz module param to set the assume i2c bus speed drm/dp: Adjust i2c-over-aux retry count based on message size and i2c bus speed drm/dp: Define AUX_RETRY_INTERVAL as 500 us drm/atomic: Fix bookkeeping with TEST_ONLY, v3.
| * \ \ \ \ Merge tag 'drm-intel-next-fixes-2015-09-10' of ↵Dave Airlie2015-09-111-1/+1
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://anongit.freedesktop.org/drm-intel into drm-next Fixes headed for v4.3-rc1, including Maarten's DP MST state checker fix you requested. * tag 'drm-intel-next-fixes-2015-09-10' of git://anongit.freedesktop.org/drm-intel: drm/i915: Allow DSI dual link to be configured on any pipe drm/i915: Don't try to use DDR DVFS on CHV when disabled in the BIOS drm/i915: Fix CSR MMIO address check drm/i915: Limit the number of loops for reading a split 64bit register drm/i915: Fix broken mst get_hw_state. drm/i915: Pass hpd_status_i915[] to intel_get_hpd_pins() in pre-g4x uapi/drm/i915_drm.h: fix userspace compilation. drm/i915: Always mark the object as dirty when used by the GPU
| | * | | | | uapi/drm/i915_drm.h: fix userspace compilation.Artem Savkov2015-09-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 346add7834557b5b9628b9bf2387106d42e631d4 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Tue Jul 14 18:07:30 2015 +0200 drm/i915: Use expcitly fixed type in compat32 structs changed the type of param field in drm_i915_getparam from int to s32. This header is exported to userspace and needs to use userspace type __s32 instead. This fixes userspace compilation errors like the following: include/drm/i915_drm.h:361:2: error: unknown type name 's32' s32 param; Signed-off-by: Artem Savkov <asavkov@redhat.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
* | | | | | | Merge branch 'for-4.3/blkcg' of git://git.kernel.dk/linux-blockLinus Torvalds2015-09-105-155/+397
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull blk-cg updates from Jens Axboe: "A bit later in the cycle, but this has been in the block tree for a a while. This is basically four patchsets from Tejun, that improve our buffered cgroup writeback. It was dependent on the other cgroup changes, but they went in earlier in this cycle. Series 1 is set of 5 patches that has cgroup writeback updates: - bdi_writeback iteration fix which could lead to some wb's being skipped or repeated during e.g. sync under memory pressure. - Simplification of wb work wait mechanism. - Writeback tracepoints updated to report cgroup. Series 2 is is a set of updates for the CFQ cgroup writeback handling: cfq has always charged all async IOs to the root cgroup. It didn't have much choice as writeback didn't know about cgroups and there was no way to tell who to blame for a given writeback IO. writeback finally grew support for cgroups and now tags each writeback IO with the appropriate cgroup to charge it against. This patchset updates cfq so that it follows the blkcg each bio is tagged with. Async cfq_queues are now shared across cfq_group, which is per-cgroup, instead of per-request_queue cfq_data. This makes all IOs follow the weight based IO resource distribution implemented by cfq. - Switched from GFP_ATOMIC to GFP_NOWAIT as suggested by Jeff. - Other misc review points addressed, acks added and rebased. Series 3 is the blkcg policy cleanup patches: This patchset contains assorted cleanups for blkcg_policy methods and blk[c]g_policy_data handling. - alloc/free added for blkg_policy_data. exit dropped. - alloc/free added for blkcg_policy_data. - blk-throttle's async percpu allocation is replaced with direct allocation. - all methods now take blk[c]g_policy_data instead of blkcg_gq or blkcg. And finally, series 4 is a set of patches cleaning up the blkcg stats handling: blkcg's stats have always been somwhat of a mess. This patchset tries to improve the situation a bit. - The following patches added to consolidate blkcg entry point and blkg creation. This is in itself is an improvement and helps colllecting common stats on bio issue. - per-blkg stats now accounted on bio issue rather than request completion so that bio based and request based drivers can behave the same way. The issue was spotted by Vivek. - cfq-iosched implements custom recursive stats and blk-throttle implements custom per-cpu stats. This patchset make blkcg core support both by default. - cfq-iosched and blk-throttle keep track of the same stats multiple times. Unify them" * 'for-4.3/blkcg' of git://git.kernel.dk/linux-block: (45 commits) blkcg: use CGROUP_WEIGHT_* scale for io.weight on the unified hierarchy blkcg: s/CFQ_WEIGHT_*/CFQ_WEIGHT_LEGACY_*/ blkcg: implement interface for the unified hierarchy blkcg: misc preparations for unified hierarchy interface blkcg: separate out tg_conf_updated() from tg_set_conf() blkcg: move body parsing from blkg_conf_prep() to its callers blkcg: mark existing cftypes as legacy blkcg: rename subsystem name from blkio to io blkcg: refine error codes returned during blkcg configuration blkcg: remove unnecessary NULL checks from __cfqg_set_weight_device() blkcg: reduce stack usage of blkg_rwstat_recursive_sum() blkcg: remove cfqg_stats->sectors blkcg: move io_service_bytes and io_serviced stats into blkcg_gq blkcg: make blkg_[rw]stat_recursive_sum() to be able to index into blkcg_gq blkcg: make blkcg_[rw]stat per-cpu blkcg: add blkg_[rw]stat->aux_cnt and replace cfq_group->dead_stats with it blkcg: consolidate blkg creation in blkcg_bio_issue_check() blk-throttle: improve queue bypass handling blkcg: move root blkg lookup optimization from throtl_lookup_tg() to __blkg_lookup() blkcg: inline [__]blkg_lookup() ...
| * | | | | | | blkcg: use CGROUP_WEIGHT_* scale for io.weight on the unified hierarchyTejun Heo2015-08-181-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cgroup is trying to make interface consistent across different controllers. For weight based resource control, the knob should have the range [1, 10000] and default to 100. This patch updates cfq-iosched so that the weight range conforms. The internal calculations have enough range and the widening of the weight range shouldn't cause any problem. * blkcg_policy->cpd_bind_fn() is added. If present, this is invoked when blkcg is attached to a hierarchy. * cfq_cpd_init() is updated to use the new default value on the unified hierarchy. * cfq_cpd_bind() callback is implemented to clear per-blkg configs and apply the default config matching the hierarchy type. * cfqd->root_group->[leaf_]weight initialization in cfq_init_queue() is moved into !CONFIG_CFQ_GROUP_IOSCHED block. cfq_cpd_bind() is now responsible for initializing the initial weights when blkcg is enabled. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Arianna Avanzini <avanzini.arianna@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | blkcg: implement interface for the unified hierarchyTejun Heo2015-08-181-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | blkcg interface grew to be the biggest of all controllers and unfortunately most inconsistent too. The interface files are inconsistent with a number of cloes duplicates. Some files have recursive variants while others don't. There's distinction between normal and leaf weights which isn't intuitive and there are a lot of stat knobs which don't make much sense outside of debugging and expose too much implementation details to userland. In the unified hierarchy, everything is always hierarchical and internal nodes can't have tasks rendering the two structural issues twisting the current interface. The interface has to be updated in a significant anyway and this is a good chance to revamp it as a whole. This patch implements blkcg interface for the unified hierarchy. * (from a previous patch) blkcg is identified by "io" instead of "blkio" on the unified hierarchy. Given that the whole interface is updated anyway, the rename shouldn't carry noticeable conversion overhead. * The original interface consisted of 27 files is replaced with the following three files. blkio.stat : per-blkcg stats blkio.weight : per-cgroup and per-cgroup-queue weight settings blkio.max : per-cgroup-queue bps and iops max limits Documentation/cgroups/unified-hierarchy.txt updated accordingly. v2: blkcg_policy->dfl_cftypes wasn't removed on blkcg_policy_unregister() corrupting the cftypes list. Fixed. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | blkcg: misc preparations for unified hierarchy interfaceTejun Heo2015-08-181-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Export blkg_dev_name() * Drop unnecessary @cft from __cfq_set_weight(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | blkcg: move body parsing from blkg_conf_prep() to its callersTejun Heo2015-08-181-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, blkg_conf_prep() expects input to be of the following form MAJ:MIN NUM and reads the NUM part into blkg_conf_ctx->v. This is quite restrictive and gets in the way in implementing blkcg interface for the unified hierarchy. This patch updates blkg_conf_prep() so that it expects MAJ:MIN BODY_STR where BODY_STR is an arbitrary string. blkg_conf_ctx->v is replaced with ->body which is a char pointer pointing to the start of BODY_STR. Parsing of the body is moved to blkg_conf_prep()'s callers. To allow using, for example, strsep() on blkg_conf_ctx->val, it is a non-const pointer and to accommodate that const is dropped from @input too. This doesn't cause any behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | blkcg: mark existing cftypes as legacyTejun Heo2015-08-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | blkcg is about to grow interface for the unified hierarchy. Add legacy to existing cftypes. * blkcg_policy->cftypes -> blkcg_policy->legacy_cftypes * blk-cgroup.c:blkcg_files -> blkcg_legacy_files * cfq-iosched.c:cfq_blkcg_files -> cfq_blkcg_legacy_files * blk-throttle.c:throtl_files -> throtl_legacy_files Pure renames. No functional change. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | blkcg: rename subsystem name from blkio to ioTejun Heo2015-08-183-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | blkio interface has become messy over time and is currently the largest. In addition to the inconsistent naming scheme, it has multiple stat files which report more or less the same thing, a number of debug stat files which expose internal details which shouldn't have been part of the public interface in the first place, recursive and non-recursive stats and leaf and non-leaf knobs. Both recursive vs. non-recursive and leaf vs. non-leaf distinctions don't make any sense on the unified hierarchy as only leaf cgroups can contain processes. cgroups is going through a major interface revision with the unified hierarchy involving significant fundamental usage changes and given that a significant portion of the interface doesn't make sense anymore, it's a good time to reorganize the interface. As the first step, this patch renames the external visible subsystem name from "blkio" to "io". This is more concise, matches the other two major subsystem names, "cpu" and "memory", and better suited as blkcg will be involved in anything writeback related too whether an actual block device is involved or not. As the subsystem legacy_name is set to "blkio", the only userland visible change outside the unified hierarchy is that blkcg is reported as "io" instead of "blkio" in the subsystem initialized message during boot. On the unified hierarchy, blkcg now appears as "io". Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: cgroups@vger.kernel.org Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | blkcg: move io_service_bytes and io_serviced stats into blkcg_gqTejun Heo2015-08-181-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, both cfq-iosched and blk-throttle keep track of io_service_bytes and io_serviced stats. While keeping track of them separately may be useful during development, it doesn't make much sense otherwise. Also, blk-throttle was counting bio's as IOs while cfq-iosched request's, which is more confusing than informative. This patch adds ->stat_bytes and ->stat_ios to blkg (blkcg_gq), removes the counterparts from cfq-iosched and blk-throttle and let them print from the common blkg counters. The common counters are incremented during bio issue in blkcg_bio_issue_check(). The outputs are still filtered by whether the policy has blkg_policy_data on a given blkg, so cfq's output won't show up if it has never been used for a given blkg. The only times when the outputs would differ significantly are when policies are attached on the fly or elevators are switched back and forth. Those are quite exceptional operations and I don't think they warrant keeping separate counters. v3: Update blkio-controller.txt accordingly. v2: Account IOs during bio issues instead of request completions so that bio-based drivers can be handled the same way. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | blkcg: make blkg_[rw]stat_recursive_sum() to be able to index into blkcg_gqTejun Heo2015-08-181-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, blkg_[rw]stat_recursive_sum() assume that the target counter is located in pd (blkg_policy_data); however, some counters are planned to be moved to blkg (blkcg_gq). This patch updates blkg_[rw]stat_recursive_sum() to take blkg and blkg_policy pointers instead of pd. If policy is NULL, it indexes into blkg. If non-NULL, into the blkg's pd of the policy. The existing usages are updated to maintain the current behaviors. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | blkcg: make blkcg_[rw]stat per-cpuTejun Heo2015-08-181-53/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | blkcg_[rw]stat are used as stat counters for blkcg policies. It isn't per-cpu by itself and blk-throttle makes it per-cpu by wrapping around it. This patch makes blkcg_[rw]stat per-cpu and drop the ad-hoc per-cpu wrapping in blk-throttle. * blkg_[rw]stat->cnt is replaced with cpu_cnt which is struct percpu_counter. This makes syncp unnecessary as remote accesses are handled by percpu_counter itself. * blkg_[rw]stat_init() can now fail due to percpu allocation failure and thus are updated to return int. * percpu_counters need explicit freeing. blkg_[rw]stat_exit() added. * As blkg_rwstat->cpu_cnt[] can't be read directly anymore, reading and summing results are stored in ->aux_cnt[] instead. * Custom per-cpu stat implementation in blk-throttle is removed. This makes all blkcg stat counters per-cpu without complicating policy implmentations. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | blkcg: add blkg_[rw]stat->aux_cnt and replace cfq_group->dead_stats with itTejun Heo2015-08-181-13/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cgroup stats are local to each cgroup and doesn't propagate to ancestors by default. When recursive stats are necessary, the sum is calculated over all the descendants. This initially was for backward compatibility to support both group-local and recursive stats but this mode of operation makes general sense as stat update is much hotter thafn reporting those stats. This however ends up losing recursive stats when a child is removed. To work around this, cfq-iosched adds its stats to its parent cfq_group->dead_stats which is summed up together when calculating recursive stats. It's planned that the core stats will be moved to blkcg_gq, so we want to move the mechanism for keeping track of the stats of dead children from cfq to blkcg core. This patch adds blkg_[rw]stat->aux_cnt which are atomic64_t's keeping track of auxiliary counts which are excluded when reading local counts but included for recursive. blkg_[rw]stat_merge() which were used by cfq to implement dead_stats are replaced by blkg_[rw]stat_add_aux(), and cfq now forwards stats of a dead cgroup to the aux counts of parent->stats instead of separate ->dead_stats. This will also help making blkg_[rw]stats per-cpu. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | blkcg: consolidate blkg creation in blkcg_bio_issue_check()Tejun Heo2015-08-181-2/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | blkg (blkcg_gq) currently is created by blkcg policies invoking blkg_lookup_create() which ends up repeating about the same code in different policies. Theoretically, this can avoid the overhead of looking and/or creating blkg's if blkcg is enabled but no policy is in use; however, the cost of blkg lookup / creation is very low especially if only the root blkcg is in use which is highly likely if no blkcg policy is in active use - it boils down to a single very predictable conditional and surrounding RCU protection. This patch consolidates blkg creation to a new function blkcg_bio_issue_check() which is called during bio issue from generic_make_request_checks(). blkcg_bio_issue_check() is now the only function which tries to create missing blkg's. The subsequent policy and request_list operations just perform blkg_lookup() and if missing falls back to the root. * blk_get_rl() no longer tries to create blkg. It uses blkg_lookup() instead of blkg_lookup_create(). * blk_throtl_bio() is now called from blkcg_bio_issue_check() with rcu read locked and blkg already looked up. Both throtl_lookup_tg() and throtl_lookup_create_tg() are dropped. * cfq is similarly updated. cfq_lookup_create_cfqg() is replaced with cfq_lookup_cfqg()which uses blkg_lookup(). This consolidates blkg handling and avoids unnecessary blkg creation retries under memory pressure. In addition, this provides a common bio entry point into blkcg where things like common accounting can be performed. v2: Build fixes for !CONFIG_CFQ_GROUP_IOSCHED and !CONFIG_BLK_DEV_THROTTLING. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Arianna Avanzini <avanzini.arianna@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | blkcg: move root blkg lookup optimization from throtl_lookup_tg() to ↵Tejun Heo2015-08-181-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __blkg_lookup() Currently, both throttle and cfq policies implement their own root blkg (blkcg_gq) lookup fast path. This patch moves root blkg optimization from throtl_lookup_tg() to __blkg_lookup(). cfq-iosched currently doesn't use blkg_lookup() but will be converted and drop the optimization too. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Arianna Avanzini <avanzini.arianna@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | blkcg: inline [__]blkg_lookup()Tejun Heo2015-08-181-4/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | blkg_lookup() checks whether the target queue is bypassing and, if not, calls __blkg_lookup() which first checks the lookup hint and then performs radix tree walk. The operations upto hint checking are trivial and there are many users of this function. This patch inlines blkg_lookup() and the fast path part of __blkg_lookup(). The radix tree lookup and hint update are now in blkg_lookup_slowpath(). This will help consolidating blkg handling by easing moving root blkcg short-circuit to inlined lookup fast path. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Arianna Avanzini <avanzini.arianna@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | blkcg: replace blkcg_policy->cpd_size with ->cpd_alloc/free_fn() methodsTejun Heo2015-08-181-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Each active policy has a cpd (blkcg_policy_data) on each blkcg. The cpd's were allocated by blkcg core and each policy could request to allocate extra space at the end by setting blkcg_policy->cpd_size larger than the size of cpd. This is a bit unusual but blkg (blkcg_gq) policy data used to be handled this way too so it made sense to be consistent; however, blkg policy data switched to alloc/free callbacks. This patch makes similar changes to cpd handling. blkcg_policy->cpd_alloc/free_fn() are added to replace ->cpd_size. As cpd allocation is now done from policy side, it can simply allocate a larger area which embeds cpd at the beginning. As ->cpd_alloc_fn() may be able to perform all necessary initializations, this patch makes ->cpd_init_fn() optional. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Arianna Avanzini <avanzini.arianna@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | blkcg: minor updates around blkcg_policy_dataTejun Heo2015-08-181-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Rename blkcg->pd[] to blkcg->cpd[] so that cpd is consistently used for blkcg_policy_data. * Make blkcg_policy->cpd_init_fn() take blkcg_policy_data instead of blkcg. This makes it consistent with blkg_policy_data methods and to-be-added cpd alloc/free methods. * blkcg_policy_data->blkcg and cpd_to_blkcg() added so that cpd_init_fn() can determine the associated blkcg from blkcg_policy_data. v2: blkcg_policy_data->blkcg initializations were missing. Added. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Arianna Avanzini <avanzini.arianna@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | blkcg: make blkcg_policy methods take a pointer to blkcg_policy_dataTejun Heo2015-08-181-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The newly added ->pd_alloc_fn() and ->pd_free_fn() deal with pd (blkg_policy_data) while the older ones use blkg (blkcg_gq). As using blkg doesn't make sense for ->pd_alloc_fn() and after allocation pd can always be mapped to blkg and given that these are policy-specific methods, it makes sense to converge on pd. This patch makes all methods deal with pd instead of blkg. Most conversions are trivial. In blk-cgroup.c, a couple method invocation sites now test whether pd exists instead of policy state for consistency. This shouldn't cause any behavioral differences. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | blk-throttle: clean up blkg_policy_data alloc/init/exit/free methodsTejun Heo2015-08-181-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the recent addition of alloc and free methods, things became messier. This patch reorganizes them according to the followings. * ->pd_alloc_fn() Responsible for allocation and static initializations - the ones which can be done independent of where the pd might be attached. * ->pd_init_fn() Initializations which require the knowledge of where the pd is attached. * ->pd_free_fn() The counter part of pd_alloc_fn(). Static de-init and freeing. This leaves ->pd_exit_fn() without any users. Removed. While at it, collapse an one liner function throtl_pd_exit(), which has only one user, into its user. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | blkcg: replace blkcg_policy->pd_size with ->pd_alloc/free_fn() methodsTejun Heo2015-08-181-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A blkg (blkcg_gq) represents the relationship between a cgroup and request_queue. Each active policy has a pd (blkg_policy_data) on each blkg. The pd's were allocated by blkcg core and each policy could request to allocate extra space at the end by setting blkcg_policy->pd_size larger than the size of pd. This is a bit unusual but was done this way mostly to simplify error handling and all the existing use cases could be handled this way; however, this is becoming too restrictive now that percpu memory can be allocated without blocking. This introduces two new mandatory blkcg_policy methods - pd_alloc_fn() and pd_free_fn() - which are used to allocate and release pd for a given policy. As pd allocation is now done from policy side, it can simply allocate a larger area which embeds pd at the beginning. This change makes ->pd_size pointless. Removed. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | blkcg: restructure blkg_policy_data allocation in blkcg_activate_policy()Tejun Heo2015-08-181-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a policy gets activated, it needs to allocate and install its policy data on all existing blkg's (blkcg_gq's). Because blkg iteration is protected by a spinlock, it currently counts the total number of blkg's in the system, allocates the matching number of policy data on a list and installs them during a single iteration. This can be simplified by using speculative GFP_NOWAIT allocations while iterating and falling back to a preallocated policy data on failure. If the preallocated one has already been consumed, it releases the lock, preallocate with GFP_KERNEL and then restarts the iteration. This can be a bit more expensive than before but policy activation is a very cold path and shouldn't matter. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | blkcg: remove unnecessary request_list->blkg NULL test in blk_put_rl()Tejun Heo2015-08-181-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since ec13b1d6f0a0 ("blkcg: always create the blkcg_gq for the root blkcg"), a request_list always has its blkg associated. Drop unnecessary rl->blkg NULL test from blk_put_rl(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | writeback: update writeback tracepoints to report cgroupTejun Heo2015-08-181-39/+141
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following tracepoints are updated to report the cgroup used during cgroup writeback. * writeback_write_inode[_start] * writeback_queue * writeback_exec * writeback_start * writeback_written * writeback_wait * writeback_nowork * writeback_wake_background * wbc_writepage * writeback_queue_io * bdi_dirty_ratelimit * balance_dirty_pages * writeback_sb_inodes_requeue * writeback_single_inode[_start] Note that writeback_bdi_register is separated out from writeback_class as reporting cgroup doesn't make sense to it. Tracepoints which take bdi are updated to take bdi_writeback instead. Signed-off-by: Tejun Heo <tj@kernel.org> Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | kernfs: implement kernfs_path_len()Tejun Heo2015-08-181-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a function to determine the path length of a kernfs node. This for now will be used by writeback tracepoint updates. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | writeback: bdi_for_each_wb() iteration is memcg ID based not blkcgTejun Heo2015-08-181-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | wb's (bdi_writeback's) are currently keyed by memcg ID; however, in an earlier implementation, wb's were keyed by blkcg ID. bdi_for_each_wb() walks bdi->cgwb_tree in the ascending ID order and allows iterations to start from an arbitrary ID which is used to interrupt and resume iterations. Unfortunately, while changing wb to be keyed by memcg ID instead of blkcg, bdi_for_each_wb() was missed and is still assuming that wb's are keyed by blkcg ID. This doesn't affect iterations which don't get interrupted but bdi_split_work_to_wbs() makes use of iteration resuming on allocation failures and thus may incorrectly skip or repeat wb's. Fix it by changing bdi_for_each_wb() to take memcg IDs instead of blkcg IDs and updating bdi_split_work_to_wbs() accordingly. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | | Merge branch 'for-4.3-unified-base' of ↵Jens Axboe2015-08-182-1/+13
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup into for-4.3/blkcg
* | \ \ \ \ \ \ \ Merge branch 'akpm' (patches from Andrew)Linus Torvalds2015-09-1015-35/+341
|\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge third patch-bomb from Andrew Morton: - even more of the rest of MM - lib/ updates - checkpatch updates - small changes to a few scruffy filesystems - kmod fixes/cleanups - kexec updates - a dma-mapping cleanup series from hch * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (81 commits) dma-mapping: consolidate dma_set_mask dma-mapping: consolidate dma_supported dma-mapping: cosolidate dma_mapping_error dma-mapping: consolidate dma_{alloc,free}_noncoherent dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent} mm: use vma_is_anonymous() in create_huge_pmd() and wp_huge_pmd() mm: make sure all file VMAs have ->vm_ops set mm, mpx: add "vm_flags_t vm_flags" arg to do_mmap_pgoff() mm: mark most vm_operations_struct const namei: fix warning while make xmldocs caused by namei.c ipc: convert invalid scenarios to use WARN_ON zlib_deflate/deftree: remove bi_reverse() lib/decompress_unlzma: Do a NULL check for pointer lib/decompressors: use real out buf size for gunzip with kernel fs/affs: make root lookup from blkdev logical size sysctl: fix int -> unsigned long assignments in INT_MIN case kexec: export KERNEL_IMAGE_SIZE to vmcoreinfo kexec: align crash_notes allocation to make it be inside one physical page kexec: remove unnecessary test in kimage_alloc_crash_control_pages() kexec: split kexec_load syscall from kexec core code ...
| * | | | | | | | | dma-mapping: consolidate dma_set_maskChristoph Hellwig2015-09-101-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Almost everyone implements dma_set_mask the same way, although some time that's hidden in ->set_dma_mask methods. This patch consolidates those into a common implementation that either calls ->set_dma_mask if present or otherwise uses the default implementation. Some architectures used to only call ->set_dma_mask after the initial checks, and those instance have been fixed to do the full work. h8300 implemented dma_set_mask bogusly as a no-ops and has been fixed. Unfortunately some architectures overload unrelated semantics like changing the dma_ops into it so we still need to allow for an architecture override for now. [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | | dma-mapping: consolidate dma_supportedChristoph Hellwig2015-09-101-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most architectures just call into ->dma_supported, but some also return 1 if the method is not present, or 0 if no dma ops are present (although that should never happeb). Consolidate this more broad version into common code. Also fix h8300 which inorrectly always returned 0, which would have been a problem if it's dma_set_mask implementation wasn't a similarly buggy noop. As a few architectures have much more elaborate implementations, we still allow for arch overrides. [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | | dma-mapping: cosolidate dma_mapping_errorChristoph Hellwig2015-09-101-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently there are three valid implementations of dma_mapping_error: (1) call ->mapping_error (2) check for a hardcoded error code (3) always return 0 This patch provides a common implementation that calls ->mapping_error if present, then checks for DMA_ERROR_CODE if defined or otherwise returns 0. [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | | dma-mapping: consolidate dma_{alloc,free}_noncoherentChristoph Hellwig2015-09-101-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most architectures do not support non-coherent allocations and either define dma_{alloc,free}_noncoherent to their coherent versions or stub them out. Openrisc uses dma_{alloc,free}_attrs to implement them, and only Mips implements them directly. This patch moves the Openrisc version to common code, and handles the DMA_ATTR_NON_CONSISTENT case in the mips dma_map_ops instance. Note that actual non-coherent allocations require a dma_cache_sync implementation, so if non-coherent allocations didn't work on an architecture before this patch they still won't work after it. [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | | dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent}Christoph Hellwig2015-09-101-0/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since 2009 we have a nice asm-generic header implementing lots of DMA API functions for architectures using struct dma_map_ops, but unfortunately it's still missing a lot of APIs that all architectures still have to duplicate. This series consolidates the remaining functions, although we still need arch opt outs for two of them as a few architectures have very non-standard implementations. This patch (of 5): The coherent DMA allocator works the same over all architectures supporting dma_map operations. This patch consolidates them and converges the minor differences: - the debug_dma helpers are now called from all architectures, including those that were previously missing them - dma_alloc_from_coherent and dma_release_from_coherent are now always called from the generic alloc/free routines instead of the ops dma-mapping-common.h always includes dma-coherent.h to get the defintions for them, or the stubs if the architecture doesn't support this feature - checks for ->alloc / ->free presence are removed. There is only one magic instead of dma_map_ops without them (mic_dma_ops) and that one is x86 only anyway. Besides that only x86 needs special treatment to replace a default devices if none is passed and tweak the gfp_flags. An optional arch hook is provided for that. [linux@roeck-us.net: fix build] [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | | mm, mpx: add "vm_flags_t vm_flags" arg to do_mmap_pgoff()Oleg Nesterov2015-09-101-2/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the additional "vm_flags_t vm_flags" argument to do_mmap_pgoff(), rename it to do_mmap(), and re-introduce do_mmap_pgoff() as a simple wrapper on top of do_mmap(). Perhaps we should update the callers of do_mmap_pgoff() and kill it later. This way mpx_mmap() can simply call do_mmap(vm_flags => VM_MPX) and do not play with vm internals. After this change mmap_region() has a single user outside of mmap.c, arch/tile/mm/elf.c:arch_setup_additional_pages(). It would be nice to change arch/tile/ and unexport mmap_region(). [kirill@shutemov.name: fix build] [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Minchan Kim <minchan@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>