summaryrefslogtreecommitdiffstats
path: root/include
Commit message (Collapse)AuthorAgeFilesLines
* HID: uhid: add internal message bufferDavid Herrmann2012-06-182-0/+34
| | | | | | | | | | | | | | | | | | | | | | | When receiving messages from the HID subsystem, we need to process them and store them in an internal buffer so user-space can read() on the char device to retrieve the messages. This adds a static buffer for 32 messages to each uhid device. Each message is dynamically allocated so the uhid_device structure does not get too big. uhid_queue() adds a message to the buffer. If the buffer is full, the message is discarded. uhid_queue_event() is an helper for messages without payload. This also adds a public header: uhid.h. It contains the declarations for the user-space API. It is built around "struct uhid_event" which contains a type field which specifies the event type and each event can then add a variable-length payload. For now, there is only a dummy event but later patches will add new event types and payloads. Signed-off-by: David Herrmann <dh.herrmann@googlemail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* Merge branch 'for-linus' of ↵Linus Torvalds2012-05-223-21/+23
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid Pull HID subsystem updates from Jiri Kosina: "Apart from various driver updates and added support for a number of new devices (mostly multitouch ones, but not limited to), there is one change that is worth pointing out explicitly: creation of HID device groups and proper autoloading of hid-multitouch, implemented by Henrik Rydberg." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (50 commits) HID: wacom: fix build breakage without CONFIG_LEDS_CLASS HID: waltop: Extend barrel button fix HID: hyperv: Set the hid drvdata correctly HID: wacom: Unify speed setting HID: wacom: Add speed setting for Intuos4 WL HID: wacom: Move Graphire raport header check. HID: uclogic: Add support for UC-Logic TWHL850 HID: explain the signed/unsigned handling in hid_add_field() HID: handle logical min/max signedness properly in parser HID: logitech: read all 32 bits of report type bitfield HID: wacom: Add LED selector control for Wacom Intuos4 WL HID: hid-multitouch: fix wrong protocol detection HID: wiimote: Fix IR data parser HID: wacom: Add tilt reporting for Intuos4 WL HID: multitouch: MT interface matching for Baanto HID: hid-multitouch: Only match MT interfaces HID: Create a common generic driver HID: hid-multitouch: Switch to device groups HID: Create a generic device group HID: Allow bus wildcard matching ...
| * Merge branch 'upstream' into for-linusJiri Kosina2012-05-222-3/+3
| |\ | | | | | | | | | | | | Conflicts: drivers/hid/hid-core.c
| | * HID: fix return value of hidraw_report_event() when !CONFIG_HIDRAWJiri Kosina2012-04-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit b6787242f327 ("HID: hidraw: add proper error handling to raw event reporting") forgot to update the static inline version of hidraw_report_event() for the case when CONFIG_HIDRAW is unset. Fix that up. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
| | * HID: hidraw: add proper error handling to raw event reportingJiri Kosina2012-04-272-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If kmemdup() in hidraw_report_event() fails, we are not propagating this fact properly. Let hidraw_report_event() and hid_report_raw_event() return an error value to the caller. Reported-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
| * | HID: hid-multitouch: Switch to device groupsHenrik Rydberg2012-05-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Switch the driver over to device group handling. By adding the HID_GROUP_MULTITOUCH group to hid-core, hid-generic will no longer match multitouch devices. By adding the HID_GROUP_MULTITOUCH entry to the device list, hid-multitouch will match all unknown multitouch devices, and udev will automatically load the module. Since HID_QUIRK_MULTITOUCH never gets set, the special quirks handling can be removed. Since all HID MT devices have HID_DG_CONTACTID, they can be removed from the hid_have_special_driver list. With this patch, the unknown device ids are no longer NULL, so the code is modified to check for the generic entry instead. Signed-off-by: Henrik Rydberg <rydberg@euromail.se> Acked-by: Benjamin Tissoires <benjamin.tissoires@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
| * | HID: Create a generic device groupHenrik Rydberg2012-05-011-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Devices that do not have a special driver are handled by the generic driver. This patch does the same thing using device groups; Instead of forcing a particular driver, the appropriate driver is picked up by udev. As a consequence, one can now move a device from generic to specific handling by a simple rebind. By adding a new device id to the generic driver, the same thing can be done in reverse. Signed-off-by: Henrik Rydberg <rydberg@euromail.se> Acked-by: Benjamin Tissoires <benjamin.tissoires@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
| * | HID: Allow bus wildcard matchingHenrik Rydberg2012-05-011-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most HID drivers do not need to know what bus driver is in use. A generic group driver can drive any hid device, and the device list should not need to be duplicated for each new bus. This patch adds wildcard matching to the HID bus, simplifying device list handling for group drivers. Signed-off-by: Henrik Rydberg <rydberg@euromail.se> Acked-by: Benjamin Tissoires <benjamin.tissoires@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
| * | HID: Scan the device for group info before adding itHenrik Rydberg2012-05-011-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to allow the report descriptor to influence the hid device properties, one needs to parse the descriptor early, without reference to any driver. Scan the descriptor for group information during device add, before the device has been broadcast to userland. The device modalias will contain group information which can be used to differentiate between modules. For starters, just handle the generic group. Signed-off-by: Henrik Rydberg <rydberg@euromail.se> Acked-by: Benjamin Tissoires <benjamin.tissoires@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
| * | HID: Add device group to modaliasHenrik Rydberg2012-05-012-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | HID devices are only partially presented to userland. Hotplugged devices emit events containing a modalias based on the basic bus, vendor and product entities. However, in practise a hid device can depend on details such as a single usb interface or a particular item in a report descriptor. This patch adds a device group to the hid device id, and broadcasts it using uevent and the device modalias. The module alias generation is modified to match. As a consequence, a device with a non-zero group will be processed by the corresponding group driver instead of by the generic hid driver. Signed-off-by: Henrik Rydberg <rydberg@euromail.se> Acked-by: Benjamin Tissoires <benjamin.tissoires@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
| * | HID: Handle driver-specific device descriptor in coreHenrik Rydberg2012-05-011-10/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The low-level driver can read the report descriptor, but it cannot determine driver-specific changes to it. The hid core can fixup and parse the report descriptor during driver attach, but does not have direct access to the descriptor when doing so. To be able to handle attach/detach of hid drivers properly, a semantic change to hid_parse_report() is needed. This function has been used in two ways, both as descriptor reader in the ll drivers and as a parsor in the probe of the drivers. This patch splits the usage by introducing hid_open_report(), and modifies the hid_parse() macro to call hid_open_report() instead. The only usage of hid_parse_report() is then to read and store the device descriptor. As a consequence, we can handle the report fixups automatically inside the hid core. Signed-off-by: Henrik Rydberg <rydberg@euromail.se> Tested-by: Nikolai Kondrashov <spbnick@gmail.com> Tested-by: Benjamin Tissoires <benjamin.tissoires@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* | | Merge branch 'sched-core-for-linus' of ↵Linus Torvalds2012-05-223-92/+1
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler changes from Ingo Molnar: "The biggest change is the cleanup/simplification of the load-balancer: instead of the current practice of architectures twiddling scheduler internal data structures and providing the scheduler domains in colorfully inconsistent ways, we now have generic scheduler code in kernel/sched/core.c:sched_init_numa() that looks at the architecture's node_distance() parameters and (while not fully trusting it) deducts a NUMA topology from it. This inevitably changes balancing behavior - hopefully for the better. There are various smaller optimizations, cleanups and fixlets as well" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Taint kernel with TAINT_WARN after sleep-in-atomic bug sched: Remove stale power aware scheduling remnants and dysfunctional knobs sched/debug: Fix printing large integers on 32-bit platforms sched/fair: Improve the ->group_imb logic sched/nohz: Fix rq->cpu_load[] calculations sched/numa: Don't scale the imbalance sched/fair: Revert sched-domain iteration breakage sched/x86: Rewrite set_cpu_sibling_map() sched/numa: Fix the new NUMA topology bits sched/numa: Rewrite the CONFIG_NUMA sched domain support sched/fair: Propagate 'struct lb_env' usage into find_busiest_group sched/fair: Add some serialization to the sched_domain load-balance walk sched/fair: Let minimally loaded cpu balance the group sched: Change rq->nr_running to unsigned int x86/numa: Check for nonsensical topologies on real hw as well x86/numa: Hard partition cpu topology masks on node boundaries x86/numa: Allow specifying node_distance() for numa=fake x86/sched: Make mwait_usable() heed to "idle=" kernel parameters properly sched: Update documentation and comments sched_rt: Avoid unnecessary dequeue and enqueue of pushable tasks in set_cpus_allowed_rt()
| * | | sched: Remove stale power aware scheduling remnants and dysfunctional knobsPeter Zijlstra2012-05-173-54/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's been broken forever (i.e. it's not scheduling in a power aware fashion), as reported by Suresh and others sending patches, and nobody cares enough to fix it properly ... so remove it to make space free for something better. There's various problems with the code as it stands today, first and foremost the user interface which is bound to topology levels and has multiple values per level. This results in a state explosion which the administrator or distro needs to master and almost nobody does. Furthermore large configuration state spaces aren't good, it means the thing doesn't just work right because it's either under so many impossibe to meet constraints, or even if there's an achievable state workloads have to be aware of it precisely and can never meet it for dynamic workloads. So pushing this kind of decision to user-space was a bad idea even with a single knob - it's exponentially worse with knobs on every node of the topology. There is a proposal to replace the user interface with a single 3 state knob: sched_balance_policy := { performance, power, auto } where 'auto' would be the preferred default which looks at things like Battery/AC mode and possible cpufreq state or whatever the hw exposes to show us power use expectations - but there's been no progress on it in the past many months. Aside from that, the actual implementation of the various knobs is known to be broken. There have been sporadic attempts at fixing things but these always stop short of reaching a mergable state. Therefore this wholesale removal with the hopes of spurring people who care to come forward once again and work on a coherent replacement. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1326104915.2442.53.camel@twins Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | sched/fair: Revert sched-domain iteration breakagePeter Zijlstra2012-05-141-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patches c22402a2f ("sched/fair: Let minimally loaded cpu balance the group") and 0ce90475 ("sched/fair: Add some serialization to the sched_domain load-balance walk") are horribly broken so revert them. The problem is that while it sounds good to have the minimally loaded cpu do the pulling of more load, the way we walk the domains there is absolutely no guarantee this cpu will actually get to the domain. In fact its very likely it wont. Therefore the higher up the tree we get, the less likely it is we'll balance at all. The first of mask always walks up, while sucky in that it accumulates load on the first cpu and needs extra passes to spread it out at least guarantees a cpu gets up that far and load-balancing happens at all. Since its now always the first and idle cpus should always be able to balance so they get a task as fast as possible we can also do away with the added serialization. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-rpuhs5s56aiv1aw7khv9zkw6@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | sched/numa: Rewrite the CONFIG_NUMA sched domain supportPeter Zijlstra2012-05-091-37/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current code groups up to 16 nodes in a level and then puts an ALLNODES domain spanning the entire tree on top of that. This doesn't reflect the numa topology and esp for the smaller not-fully-connected machines out there today this might make a difference. Therefore, build a proper numa topology based on node_distance(). Since there's no fixed numa layers anymore, the static SD_NODE_INIT and SD_ALLNODES_INIT aren't usable anymore, the new code tries to construct something similar and scales some values either on the number of cpus in the domain and/or the node_distance() ratio. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Anton Blanchard <anton@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: David Howells <dhowells@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: linux-alpha@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-sh@vger.kernel.org Cc: Matt Turner <mattst88@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: sparclinux@vger.kernel.org Cc: Tony Luck <tony.luck@intel.com> Cc: x86@kernel.org Cc: Dimitri Sivanich <sivanich@sgi.com> Cc: Greg Pearson <greg.pearson@hp.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: bob.picco@oracle.com Cc: chris.mason@oracle.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-r74n3n8hhuc2ynbrnp3vt954@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | sched/fair: Add some serialization to the sched_domain load-balance walkPeter Zijlstra2012-05-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the sched_domain walk is completely unserialized (!SD_SERIALIZE) it is possible that multiple cpus in the group get elected to do the next level. Avoid this by adding some serialization. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-vqh9ai6s0ewmeakjz80w4qz6@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | sched: Update documentation and commentsHiroshi Shimamoto2012-05-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change sched_*.c to sched/*.c in documentation and comments. Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/4F795CAC.9080206@ct.jp.nec.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | Merge branch 'perf-core-for-linus' of ↵Linus Torvalds2012-05-225-31/+38
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf changes from Ingo Molnar: "Lots of changes: - (much) improved assembly annotation support in perf report, with jump visualization, searching, navigation, visual output improvements and more. - kernel support for AMD IBS PMU hardware features. Notably 'perf record -e cycles:p' and 'perf top -e cycles:p' should work without skid now, like PEBS does on the Intel side, because it takes advantage of IBS transparently. - the libtracevents library: it is the first step towards unifying tracing tooling and perf, and it also gives a tracing library for external tools like powertop to rely on. - infrastructure: various improvements and refactoring of the UI modules and related code - infrastructure: cleanup and simplification of the profiling targets code (--uid, --pid, --tid, --cpu, --all-cpus, etc.) - tons of robustness fixes all around - various ftrace updates: speedups, cleanups, robustness improvements. - typing 'make' in tools/ will now give you a menu of projects to build and a short help text to explain what each does. - ... and lots of other changes I forgot to list. The perf record make bzImage + perf report regression you reported should be fixed." * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (166 commits) tracing: Remove kernel_lock annotations tracing: Fix initial buffer_size_kb state ring-buffer: Merge separate resize loops perf evsel: Create events initially disabled -- again perf tools: Split term type into value type and term type perf hists: Fix callchain ip printf format perf target: Add uses_mmap field ftrace: Remove selecting FRAME_POINTER with FUNCTION_TRACER ftrace/x86: Have x86 ftrace use the ftrace_modify_all_code() ftrace: Make ftrace_modify_all_code() global for archs to use ftrace: Return record ip addr for ftrace_location() ftrace: Consolidate ftrace_location() and ftrace_text_reserved() ftrace: Speed up search by skipping pages by address ftrace: Remove extra helper functions ftrace: Sort all function addresses, not just per page tracing: change CPU ring buffer state from tracing_cpumask tracing: Check return value of tracing_dentry_percpu() ring-buffer: Reset head page before running self test ring-buffer: Add integrity check at end of iter read ring-buffer: Make addition of pages in ring buffer atomic ...
| * \ \ \ Merge branch 'perf/core' of ↵Ingo Molnar2012-05-2119-36/+167
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Fixes for perf/core: - Rename some perf_target methods to avoid double negation, from Namhyung Kim. - Revert change to use per task events with inheritance, from Namhyung Kim. - Events should start disabled till children starts running, from David Ahern. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * \ \ \ Merge remote-tracking branch 'tip/perf/urgent' into perf/coreArnaldo Carvalho de Melo2012-05-1819-36/+167
| | |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge reason: We are going to queue up a dependent patch: "perf tools: Move parse event automated tests to separated object" That depends on: commit e7c72d8 perf tools: Add 'G' and 'H' modifiers to event parsing Conflicts: tools/perf/builtin-stat.c Conflicted with the recent 'perf_target' patches when checking the result of perf_evsel open routines to see if a retry is needed to cope with older kernels where the exclude guest/host perf_event_attr bits were not used. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | | | | ftrace/x86: Have x86 ftrace use the ftrace_modify_all_code()Steven Rostedt2012-05-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To remove duplicate code, have the ftrace arch_ftrace_update_code() use the generic ftrace_modify_all_code(). This requires that the default ftrace_replace_code() becomes a weak function so that an arch may override it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | | | ftrace: Make ftrace_modify_all_code() global for archs to useSteven Rostedt2012-05-161-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename __ftrace_modify_code() to ftrace_modify_all_code() and make it global for all archs to use. This will remove the duplication of code, as archs that can modify code without stop_machine() can use it directly outside of the stop_machine() call. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | | | ftrace: Return record ip addr for ftrace_location()Steven Rostedt2012-05-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ftrace_location() is passed an addr, and returns 1 if the addr is on a ftrace nop (or caller to ftrace_caller), and 0 otherwise. To let kprobes know if it should move a breakpoint or not, it must return the actual addr that is the start of the ftrace nop. This way a kprobe placed on the location of a ftrace nop, can instead be placed on the instruction after the nop. Even if the probe addr is on the second or later byte of the nop, it can simply be moved forward. Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | | | ftrace: Sort all function addresses, not just per pageSteven Rostedt2012-05-161-1/+1
| |/ / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of just sorting the ip's of the functions per ftrace page, sort the entire list before adding them to the ftrace pages. This will allow the bsearch algorithm to be sped up as it can also sort by pages, not just records within a page. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | | Merge branch 'tip/perf/core' of ↵Ingo Molnar2012-05-091-2/+6
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core
| | * | | | | tracing: Prevent wasting time evaluating parameters in trace_preempt_on/offMinho Ban2012-05-081-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes spending time for evaluating parameters in trace_preempt_on/off when the tracer config is off. The patch mainly inspired by Steven Rostedt, thanks Steven. Link: http://lkml.kernel.org/r/4FA73510.7070705@samsung.com Cc: Ingo Molnar <mingo@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Turner <pjt@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Minho Ban <mhban@samsung.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | | | sched, perf: Use a single callback into the schedulerPeter Zijlstra2012-05-091-18/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can easily use a single callback for both sched-in and sched-out. This reduces the code footprint in the scheduler path as well as removes the PMU black spot otherwise present between the out and in callback. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-o56ajxp1edwqg6x9d31wb805@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | perf: Pass last sampling period to perf_sample_data_init()Robert Richter2012-05-091-1/+4
| |/ / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We always need to pass the last sample period to perf_sample_data_init(), otherwise the event distribution will be wrong. Thus, modifiyng the function interface with the required period as argument. So basically a pattern like this: perf_sample_data_init(&data, ~0ULL); data.period = event->hw.last_period; will now be like that: perf_sample_data_init(&data, ~0ULL, event->hw.last_period); Avoids unininitialized data.period and simplifies code. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-3-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | Merge branch 'tip/perf/core-4' of ↵Ingo Molnar2012-05-073-8/+17
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core
| | * | | | | ftrace/x86: Have arch x86_64 use breakpoints instead of stop machineSteven Rostedt2012-04-271-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This method changes x86 to add a breakpoint to the mcount locations instead of calling stop machine. Now that iret can be handled by NMIs, we perform the following to update code: 1) Add a breakpoint to all locations that will be modified 2) Sync all cores 3) Update all locations to be either a nop or call (except breakpoint op) 4) Sync all cores 5) Remove the breakpoint with the new code. 6) Sync all cores [ Added updates that Masami suggested: Use unlikely(modifying_ftrace_code) in int3 trap to keep kprobes efficient. Don't use NOTIFY_* in ftrace handler in int3 as it is not a notifier. ] Cc: H. Peter Anvin <hpa@zytor.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | ring-buffer: Add per_cpu ring buffer control filesVaibhav Nagarnaik2012-04-231-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a debugfs entry under per_cpu/ folder for each cpu called buffer_size_kb to control the ring buffer size for each CPU independently. If the global file buffer_size_kb is used to set size, the individual ring buffers will be adjusted to the given size. The buffer_size_kb will report the common size to maintain backward compatibility. If the buffer_size_kb file under the per_cpu/ directory is used to change buffer size for a specific CPU, only the size of the respective ring buffer is updated. When tracing/buffer_size_kb is read, it reports 'X' to indicate that sizes of per_cpu ring buffers are not equivalent. Link: http://lkml.kernel.org/r/1328212844-11889-1-git-send-email-vnagarnaik@google.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Cc: Justin Teravest <teravest@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | tracing: Add percpu buffers for trace_printk()Steven Rostedt2012-04-231-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, trace_printk() uses a single buffer to write into to calculate the size and format needed to save the trace. To do this safely in an SMP environment, a spin_lock() is taken to only allow one writer at a time to the buffer. But this could also affect what is being traced, and add synchronization that would not be there otherwise. Ideally, using percpu buffers would be useful, but since trace_printk() is only used in development, having per cpu buffers for something never used is a waste of space. Thus, the use of the trace_bprintk() format section is changed to be used for static fmts as well as dynamic ones. Then at boot up, we can check if the section that holds the trace_printk formats is non-empty, and if it does contain something, then we know a trace_printk() has been added to the kernel. At this time the trace_printk per cpu buffers are allocated. A check is also done at module load time in case a module is added that contains a trace_printk(). Once the buffers are allocated, they are never freed. If you use a trace_printk() then you should know what you are doing. A buffer is made for each type of context: normal softirq irq nmi The context is checked and the appropriate buffer is used. This allows for totally lockless usage of trace_printk(), and they no longer even disable interrupts. Requested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* | | | | | | Merge branch 'for-3.5' of ↵Linus Torvalds2012-05-224-35/+64
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "cgroup file type addition / removal is updated so that file types are added and removed instead of individual files so that dynamic file type addition / removal can be implemented by cgroup and used by controllers. blkio controller changes which will come through block tree are dependent on this. Other changes include res_counter cleanup and disallowing kthread / PF_THREAD_BOUND threads to be attached to non-root cgroups. There's a reported bug with the file type addition / removal handling which can lead to oops on cgroup umount. The issue is being looked into. It shouldn't cause problems for most setups and isn't a security concern." Fix up trivial conflict in Documentation/feature-removal-schedule.txt * 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (21 commits) res_counter: Account max_usage when calling res_counter_charge_nofail() res_counter: Merge res_counter_charge and res_counter_charge_nofail cgroups: disallow attaching kthreadd or PF_THREAD_BOUND threads cgroup: remove cgroup_subsys->populate() cgroup: get rid of populate for memcg cgroup: pass struct mem_cgroup instead of struct cgroup to socket memcg cgroup: make css->refcnt clearing on cgroup removal optional cgroup: use negative bias on css->refcnt to block css_tryget() cgroup: implement cgroup_rm_cftypes() cgroup: introduce struct cfent cgroup: relocate __d_cgrp() and __d_cft() cgroup: remove cgroup_add_file[s]() cgroup: convert memcg controller to the new cftype interface memcg: always create memsw files if CONFIG_CGROUP_MEM_RES_CTLR_SWAP cgroup: convert all non-memcg controllers to the new cftype interface cgroup: relocate cftype and cgroup_subsys definitions in controllers cgroup: merge cft_release_agent cftype array into the base files array cgroup: implement cgroup_add_cftypes() and friends cgroup: build list of all cgroups under a given cgroupfs_root cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir() ...
| * | | | | | | res_counter: Merge res_counter_charge and res_counter_charge_nofailFrederic Weisbecker2012-04-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These two functions do almost the same thing and duplicate some code. Merge their implementation into a single common function. res_counter_charge_locked() takes one more parameter but it doesn't seem to be used outside res_counter.c yet anyway. There is no (intended) change in the behaviour. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Glauber Costa <glommer@parallels.com> Acked-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: Li Zefan <lizefan@huawei.com>
| * | | | | | | cgroup: remove cgroup_subsys->populate()Tejun Heo2012-04-111-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With memcg converted, cgroup_subsys->populate() doesn't have any user left. Remove it. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com>
| * | | | | | | cgroup: pass struct mem_cgroup instead of struct cgroup to socket memcgGlauber Costa2012-04-102-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The only reason cgroup was used, was to be consistent with the populate() interface. Now that we're getting rid of it, not only we no longer need it, but we also *can't* call it this way. Since we will no longer rely on populate(), this will be called from create(). During create, the association between struct mem_cgroup and struct cgroup does not yet exist, since cgroup internals hasn't yet initialized its bookkeeping. This means we would not be able to draw the memcg pointer from the cgroup pointer in these functions, which is highly undesirable. Signed-off-by: Glauber Costa <glommer@parallels.com> Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org> CC: Li Zefan <lizefan@huawei.com> CC: Johannes Weiner <hannes@cmpxchg.org> CC: Michal Hocko <mhocko@suse.cz>
| * | | | | | | cgroup: make css->refcnt clearing on cgroup removal optionalTejun Heo2012-04-011-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, cgroup removal tries to drain all css references. If there are active css references, the removal logic waits and retries ->pre_detroy() until either all refs drop to zero or removal is cancelled. This semantics is unusual and adds non-trivial complexity to cgroup core and IMHO is fundamentally misguided in that it couples internal implementation details (references to internal data structure) with externally visible operation (rmdir). To userland, this is a behavior peculiarity which is unnecessary and difficult to expect (css refs is otherwise invisible from userland), and, to policy implementations, this is an unnecessary restriction (e.g. blkcg wants to hold css refs for caching purposes but can't as that becomes visible as rmdir hang). Unfortunately, memcg currently depends on ->pre_destroy() retrials and cgroup removal vetoing and can't be immmediately switched to the new behavior. This patch introduces the new behavior of not waiting for css refs to drain and maintains the old behavior for subsystems which have __DEPRECATED_clear_css_refs set. Once, memcg is updated, we can drop the code paths for the old behavior as proposed in the following patch. Note that the following patch is incorrect in that dput work item is in cgroup and may lose some of dputs when multiples css's are released back-to-back, and __css_put() triggers check_for_release() when refcnt reaches 0 instead of 1; however, it shows what part can be removed. http://thread.gmane.org/gmane.linux.kernel.containers/22559/focus=75251 Note that, in not-too-distant future, cgroup core will start emitting warning messages for subsys which require the old behavior, so please get moving. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
| * | | | | | | cgroup: use negative bias on css->refcnt to block css_tryget()Tejun Heo2012-04-011-8/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a cgroup is about to be removed, cgroup_clear_css_refs() is called to check and ensure that there are no active css references. This is currently achieved by dropping the refcnt to zero iff it has only the base ref. If all css refs could be dropped to zero, ref clearing is successful and CSS_REMOVED is set on all css. If not, the base ref is restored. While css ref is zero w/o CSS_REMOVED set, any css_tryget() attempt on it busy loops so that they are atomic w.r.t. the whole css ref clearing. This does work but dropping and re-instating the base ref is somewhat hairy and makes it difficult to add more logic to the put path as there are two of them - the regular css_put() and the reversible base ref clearing. This patch updates css ref clearing such that blocking new css_tryget() and putting the base ref are separate operations. CSS_DEACT_BIAS, defined as INT_MIN, is added to css->refcnt and css_tryget() busy loops while refcnt is negative. After all css refs are deactivated, if they were all one, ref clearing succeeded and CSS_REMOVED is set and the base ref is put using the regular css_put(); otherwise, CSS_DEACT_BIAS is subtracted from the refcnts and the original postive values are restored. css_refcnt() accessor which always returns the unbiased positive reference counts is added and used to simplify refcnt usages. While at it, relocate and reformat comments in cgroup_has_css_refs(). This separates css->refcnt deactivation and putting the base ref, which enables the next patch to make ref clearing optional. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
| * | | | | | | cgroup: implement cgroup_rm_cftypes()Tejun Heo2012-04-011-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement cgroup_rm_cftypes() which removes an array of cftypes from a subsystem. It can be called whether the target subsys is attached or not. cgroup core will remove the specified file from all existing cgroups. This will be used to improve sub-subsys modularity and will be helpful for unified hierarchy. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
| * | | | | | | cgroup: introduce struct cfentTejun Heo2012-04-011-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds cfent (cgroup file entry) which is the association between a cgroup and a file. This is in-cgroup representation of files under a cgroup directory. This simplifies walking walking cgroup files and thus cgroup_clear_directory(), which is now implemented in two parts - cgroup_rm_file() and a loop around it. cgroup_rm_file() will be used to implement cftype removal and cfent is scheduled to serve cgroup specific per-file data (e.g. for sysfs-like "sever" semantics). v2: - cfe was freed from cgroup_rm_file() which led to use-after-free if the file had openers at the time of removal. Moved to cgroup_diput(). - cgroup_clear_directory() triggered WARN_ON_ONCE() if d_subdirs wasn't empty after removing all files. This triggered spuriously if some files were open during directory clearing. Removed. v3: - In cgroup_diput(), WARN_ONCE(!list_empty(&cfe->node)) could be spuriously triggered for root cgroups because they don't go through cgroup_clear_directory() on unmount. Don't trigger WARN for root cgroups. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Glauber Costa <glommer@parallels.com>
| * | | | | | | cgroup: remove cgroup_add_file[s]()Tejun Heo2012-04-011-16/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No controller is using cgroup_add_files[s](). Unexport them, and convert cgroup_add_files() to handle NULL entry terminated array instead of taking count explicitly and continue creation on failure for internal use. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
| * | | | | | | cgroup: implement cgroup_add_cftypes() and friendsTejun Heo2012-04-011-2/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, cgroup directories are populated by subsys->populate() callback explicitly creating files on each cgroup creation. This level of flexibility isn't needed or desirable. It provides largely unused flexibility which call for abuses while severely limiting what the core layer can do through the lack of structure and conventions. Per each cgroup file type, the only distinction that cgroup users is making is whether a cgroup is root or not, which can easily be expressed with flags. This patch introduces cgroup_add_cftypes(). These deal with cftypes instead of individual files - controllers indicate that certain types of files exist for certain subsystem. Newly added CFTYPE_*_ON_ROOT flags indicate whether a cftype should be excluded or created only on the root cgroup. cgroup_add_cftypes() can be called any time whether the target subsystem is currently attached or not. cgroup core will create files on the existing cgroups as necessary. Also, cgroup_subsys->base_cftypes is added to ease registration of the base files for the subsystem. If non-NULL on subsys init, the cftypes pointed to by ->base_cftypes are automatically registered on subsys init / load. Further patches will convert the existing users and remove the file based interface. Note that this interface allows dynamic addition of files to an active controller. This will be used for sub-controller modularity and unified hierarchy in the longer term. This patch implements the new mechanism but doesn't apply it to any user. v2: replaced DECLARE_CGROUP_CFTYPES[_COND]() with cgroup_subsys->base_cftypes, which works better for cgroup_subsys which is loaded as module. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
| * | | | | | | cgroup: build list of all cgroups under a given cgroupfs_rootTejun Heo2012-04-011-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Build a list of all cgroups anchored at cgroupfs_root->allcg_list and going through cgroup->allcg_node. The list is protected by cgroup_mutex and will be used to improve cgroup file handling. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
* | | | | | | | Merge branch 'for-3.5' of ↵Linus Torvalds2012-05-222-56/+2
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu updates from Tejun Heo: "Contains Alex Shi's three patches to remove percpu_xxx() which overlap with this_cpu_xxx(). There shouldn't be any functional change." * 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: remove percpu_xxx() functions x86: replace percpu_xxx funcs with this_cpu_xxx net: replace percpu_xxx funcs with this_cpu_xxx or __this_cpu_xxx
| * | | | | | | | percpu: remove percpu_xxx() functionsAlex Shi2012-05-141-54/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove percpu_xxx serial functions, all of them were replaced by this_cpu_xxx or __this_cpu_xxx serial functions Signed-off-by: Alex Shi <alex.shi@intel.com> Acked-by: Christoph Lameter <cl@gentwo.org> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | | | | | | x86: replace percpu_xxx funcs with this_cpu_xxxAlex Shi2012-05-141-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since percpu_xxx() serial functions are duplicated with this_cpu_xxx(). Removing percpu_xxx() definition and replacing them by this_cpu_xxx() in code. There is no function change in this patch, just preparation for later percpu_xxx serial function removing. On x86 machine the this_cpu_xxx() serial functions are same as __this_cpu_xxx() without no unnecessary premmpt enable/disable. Thanks for Stephen Rothwell, he found and fixed a i386 build error in the patch. Also thanks for Andrew Morton, he kept updating the patchset in Linus' tree. Signed-off-by: Alex Shi <alex.shi@intel.com> Acked-by: Christoph Lameter <cl@gentwo.org> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tejun Heo <tj@kernel.org>
* | | | | | | | | Merge branch 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds2012-05-221-0/+18
|\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull workqueue changes from Tejun Heo: "Nothing exciting. Most are updates to debug stuff and related fixes. Two not-too-critical bugs are fixed - WARN_ON() triggering spurious during cpu offlining and unlikely lockdep related oops." * 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: lockdep: fix oops in processing workqueue workqueue: skip nr_running sanity check in worker_enter_idle() if trustee is active workqueue: Catch more locking problems with flush_work() workqueue: change BUG_ON() to WARN_ON() trace: Remove unused workqueue tracer
| * | | | | | | | | lockdep: fix oops in processing workqueuePeter Zijlstra2012-05-151-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Under memory load, on x86_64, with lockdep enabled, the workqueue's process_one_work() has been seen to oops in __lock_acquire(), barfing on a 0xffffffff00000000 pointer in the lockdep_map's class_cache[]. Because it's permissible to free a work_struct from its callout function, the map used is an onstack copy of the map given in the work_struct: and that copy is made without any locking. Surprisingly, gcc (4.5.1 in Hugh's case) uses "rep movsl" rather than "rep movsq" for that structure copy: which might race with a workqueue user's wait_on_work() doing lock_map_acquire() on the source of the copy, putting a pointer into the class_cache[], but only in time for the top half of that pointer to be copied to the destination map. Boom when process_one_work() subsequently does lock_map_acquire() on its onstack copy of the lockdep_map. Fix this, and a similar instance in call_timer_fn(), with a lockdep_copy_map() function which additionally NULLs the class_cache[]. Note: this oops was actually seen on 3.4-next, where flush_work() newly does the racing lock_map_acquire(); but Tejun points out that 3.4 and earlier are already vulnerable to the same through wait_on_work(). * Patch orginally from Peter. Hugh modified it a bit and wrote the description. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Reported-by: Hugh Dickins <hughd@google.com> LKML-Reference: <alpine.LSU.2.00.1205070951170.1544@eggly.anvils> Signed-off-by: Tejun Heo <tj@kernel.org>
* | | | | | | | | | Merge tag 'staging-3.5-rc1' of ↵Linus Torvalds2012-05-2219-17/+1745
|\ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging tree changes from Greg Kroah-Hartman: "Here is the big staging tree pull request for the 3.5-rc1 merge window. Loads of changes here, and we just narrowly added more lines than we added: 622 files changed, 28356 insertions(+), 26059 deletions(-) But, good news is that there is a number of subsystems that moved out of the staging tree, to their respective "real" portions of the kernel. Code that moved out was: - iio core code - mei driver - vme core and bridge drivers There was one broken network driver that moved into staging as a step before it is removed from the tree (pc300), and there was a few new drivers added to the tree: - new iio drivers - gdm72xx wimax USB driver - ipack subsystem and 2 drivers All of the movements around have acks from the various subsystem maintainers, and all of this has been in the linux-next tree for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" Fixed up various trivial conflicts, along with a non-trivial one found in -next and pointed out by Olof Johanssen: a clean - but incorrect - merge of the arch/arm/boot/dts/at91sam9g20.dtsi file. Fix up manually as per Stephen Rothwell. * tag 'staging-3.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (536 commits) Staging: bcm: Remove two unused variables from Adapter.h Staging: bcm: Removes the volatile type definition from Adapter.h Staging: bcm: Rename all "INT" to "int" in Adapter.h Staging: bcm: Fix warning: __packed vs. __attribute__((packed)) in Adapter.h Staging: bcm: Correctly format all comments in Adapter.h Staging: bcm: Fix all whitespace issues in Adapter.h Staging: bcm: Properly format braces in Adapter.h Staging: ipack/bridges/tpci200: remove unneeded casts Staging: ipack/bridges/tpci200: remove TPCI200_SHORTNAME constant Staging: ipack: remove board_name and bus_name fields from struct ipack_device Staging: ipack: improve the register of a bus and a device in the bus. staging: comedi: cleanup all the comedi_driver 'detach' functions staging: comedi: remove all 'default N' in Kconfig staging: line6/config.h: Delete unused header staging: gdm72xx depends on NET staging: gdm72xx: Set up parent link in sysfs for gdm72xx devices staging: drm/omap: initial dmabuf/prime import support staging: drm/omap: dmabuf/prime mmap support pstore/ram: Add ECC support pstore/ram: Switch to persistent_ram routines ...
| * | | | | | | | | | pstore/ram: Add ECC supportAnton Vorontsov2012-05-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is now straightforward: just introduce a module parameter and pass the needed value to persistent_ram_new(). Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Acked-by: Marco Stornelli <marco.stornelli@gmail.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>