diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2019-11-26 19:06:44 -0800 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2019-11-26 19:06:44 -0800 |
commit | 9e7a03233e02afd3ee061e373355f34d7254f1e6 (patch) | |
tree | 62e978438049432c52dddb8329aea8090a8a8243 | |
parent | c2da5bdc66a377f0b82ee959f19f5a6774706b83 (diff) | |
parent | e350b60f4e0f089f585d738e27213c8133fe9093 (diff) | |
download | linux-9e7a03233e02afd3ee061e373355f34d7254f1e6.tar.gz linux-9e7a03233e02afd3ee061e373355f34d7254f1e6.tar.bz2 linux-9e7a03233e02afd3ee061e373355f34d7254f1e6.zip |
Merge tag 'pm-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These include cpuidle changes to use nanoseconds (instead of
microseconds) as the unit of time and to simplify checks for disabled
idle states in the idle loop, some cpuidle fixes and governor updates,
assorted cpufreq updates (driver updates mostly and a few core fixes
and cleanups), devfreq updates (dominated by the tegra30 driver
changes), new CPU IDs for the RAPL power capping driver, relatively
minor updates of the generic power domains (genpd) and operation
performance points (OPP) frameworks, and assorted fixes and cleanups.
There are also two maintainer information updates: Chanwoo Choi will
be maintaining the devfreq subsystem going forward and Todd Brandt is
going to maintain the pm-graph utility (created by him).
Specifics:
- Use nanoseconds (instead of microseconds) as the unit of time in
the cpuidle core and simplify checks for disabled idle states in
the idle loop (Rafael Wysocki)
- Fix and clean up the teo cpuidle governor (Rafael Wysocki)
- Fix the cpuidle registration error code path (Zhenzhong Duan)
- Avoid excessive vmexits in the ACPI cpuidle driver (Yin Fengwei)
- Extend the idle injection infrastructure to be able to measure the
requested duration in nanoseconds and to allow an exit latency
limit for idle states to be specified (Daniel Lezcano)
- Fix cpufreq driver registration and clarify a comment in the
cpufreq core (Viresh Kumar)
- Add NULL checks to the show() and store() methods of sysfs
attributes exposed by cpufreq (Kai Shen)
- Update cpufreq drivers:
* Fix for a plain int as pointer warning from sparse in
intel_pstate (Jamal Shareef)
* Fix for a hardcoded number of CPUs and stack bloat in the
powernv driver (John Hubbard)
* Updates to the ti-cpufreq driver and DT files to support new
platforms and migrate bindings from opp-v1 to opp-v2 (Adam Ford,
H. Nikolaus Schaller)
* Merging of the arm_big_little and vexpress-spc drivers and
related cleanup (Sudeep Holla)
* Fix for imx's default speed grade value (Anson Huang)
* Minor cleanup of the s3c64xx driver (Nathan Chancellor)
* CPU speed bin detection fix for sun50i (Ondrej Jirman)
- Appoint Chanwoo Choi as the new devfreq maintainer.
- Update the devfreq core:
* Check NULL governor in available_governors_show sysfs to prevent
showing wrong governor information and fix a race condition
between devfreq_update_status() and trans_stat_show() (Leonard
Crestez)
* Add new 'interrupt-driven' flag for devfreq governors to allow
interrupt-driven governors to prevent the devfreq core from
polling devices for status (Dmitry Osipenko)
* Improve an error message in devfreq_add_device() (Matthias
Kaehlcke)
- Update devfreq drivers:
* tegra30 driver fixes and cleanups (Dmitry Osipenko)
* Removal of unused property from dt-binding documentation for the
exynos-bus driver (Kamil Konieczny)
* exynos-ppmu cleanup and DT bindings update (Lukasz Luba, Marek
Szyprowski)
- Add new CPU IDs for CometLake Mobile and Desktop to the Intel RAPL
power capping driver (Zhang Rui)
- Allow device initialization in the generic power domains (genpd)
framework to be more straightforward and clean it up (Ulf Hansson)
- Add support for adjusting OPP voltages at run time to the OPP
framework (Stephen Boyd)
- Avoid freeing memory that has never been allocated in the
hibernation core (Andy Whitcroft)
- Clean up function headers in a header file and coding style in the
wakeup IRQs handling code (Ulf Hansson, Xiaofei Tan)
- Clean up the SmartReflex adaptive voltage scaling (AVS) driver for
ARM (Ben Dooks, Geert Uytterhoeven)
- Wrap power management documentation to fit in 80 columns (Bjorn
Helgaas)
- Add pm-graph utility entry to MAINTAINERS (Todd Brandt)
- Update the cpupower utility:
* Fix the handling of set and info subcommands (Abhishek Goel)
* Fix build warnings (Nathan Chancellor)
* Improve mperf_monitor handling (Janakarajan Natarajan)"
* tag 'pm-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (83 commits)
PM: Wrap documentation to fit in 80 columns
cpuidle: Pass exit latency limit to cpuidle_use_deepest_state()
cpuidle: Allow idle injection to apply exit latency limit
cpuidle: Introduce cpuidle_driver_state_disabled() for driver quirks
cpuidle: teo: Avoid code duplication in conditionals
cpufreq: Register drivers only after CPU devices have been registered
cpuidle: teo: Avoid using "early hits" incorrectly
cpuidle: teo: Exclude cpuidle overhead from computations
PM / Domains: Convert to dev_to_genpd_safe() in genpd_syscore_switch()
mmc: tmio: Avoid boilerplate code in ->runtime_suspend()
PM / Domains: Implement the ->start() callback for genpd
PM / Domains: Introduce dev_pm_domain_start()
ARM: OMAP2+: SmartReflex: add omap_sr_pdata definition
PM / wakeirq: remove unnecessary parentheses
power: avs: smartreflex: Remove superfluous cast in debugfs_create_file() call
cpuidle: Use nanoseconds as the unit of time
PM / OPP: Support adjusting OPP voltages at runtime
PM / core: Clean up some function headers in power.h
cpufreq: Add NULL checks to show() and store() methods of cpufreq
cpufreq: intel_pstate: Fix plain int as pointer warning from sparse
...
116 files changed, 2091 insertions, 1406 deletions
diff --git a/Documentation/devicetree/bindings/arm/omap/omap.txt b/Documentation/devicetree/bindings/arm/omap/omap.txt index b301f753ed2c..e77635c5422c 100644 --- a/Documentation/devicetree/bindings/arm/omap/omap.txt +++ b/Documentation/devicetree/bindings/arm/omap/omap.txt @@ -43,7 +43,7 @@ SoC Families: - OMAP2 generic - defaults to OMAP2420 compatible = "ti,omap2" -- OMAP3 generic - defaults to OMAP3430 +- OMAP3 generic compatible = "ti,omap3" - OMAP4 generic - defaults to OMAP4430 compatible = "ti,omap4" @@ -51,6 +51,8 @@ SoC Families: compatible = "ti,omap5" - DRA7 generic - defaults to DRA742 compatible = "ti,dra7" +- AM33x generic + compatible = "ti,am33xx" - AM43x generic - defaults to AM4372 compatible = "ti,am43" @@ -63,12 +65,14 @@ SoCs: - OMAP3430 compatible = "ti,omap3430", "ti,omap3" + legacy: "ti,omap34xx" - please do not use any more - AM3517 compatible = "ti,am3517", "ti,omap3" - OMAP3630 - compatible = "ti,omap36xx", "ti,omap3" -- AM33xx - compatible = "ti,am33xx", "ti,omap3" + compatible = "ti,omap3630", "ti,omap3" + legacy: "ti,omap36xx" - please do not use any more +- AM335x + compatible = "ti,am33xx" - OMAP4430 compatible = "ti,omap4430", "ti,omap4" @@ -110,19 +114,19 @@ SoCs: - AM4372 compatible = "ti,am4372", "ti,am43" -Boards: +Boards (incomplete list of examples): - OMAP3 BeagleBoard : Low cost community board - compatible = "ti,omap3-beagle", "ti,omap3" + compatible = "ti,omap3-beagle", "ti,omap3430", "ti,omap3" - OMAP3 Tobi with Overo : Commercial expansion board with daughter board - compatible = "gumstix,omap3-overo-tobi", "gumstix,omap3-overo", "ti,omap3" + compatible = "gumstix,omap3-overo-tobi", "gumstix,omap3-overo", "ti,omap3430", "ti,omap3" - OMAP4 SDP : Software Development Board - compatible = "ti,omap4-sdp", "ti,omap4430" + compatible = "ti,omap4-sdp", "ti,omap4430", "ti,omap4" - OMAP4 PandaBoard : Low cost community board - compatible = "ti,omap4-panda", "ti,omap4430" + compatible = "ti,omap4-panda", "ti,omap4430", "ti,omap4" - OMAP4 DuoVero with Parlor : Commercial expansion board with daughter board compatible = "gumstix,omap4-duovero-parlor", "gumstix,omap4-duovero", "ti,omap4430", "ti,omap4"; @@ -134,16 +138,16 @@ Boards: compatible = "variscite,var-dvk-om44", "variscite,var-som-om44", "ti,omap4460", "ti,omap4"; - OMAP3 EVM : Software Development Board for OMAP35x, AM/DM37x - compatible = "ti,omap3-evm", "ti,omap3" + compatible = "ti,omap3-evm", "ti,omap3630", "ti,omap3" - AM335X EVM : Software Development Board for AM335x - compatible = "ti,am335x-evm", "ti,am33xx", "ti,omap3" + compatible = "ti,am335x-evm", "ti,am33xx" - AM335X Bone : Low cost community board - compatible = "ti,am335x-bone", "ti,am33xx", "ti,omap3" + compatible = "ti,am335x-bone", "ti,am33xx" - AM3359 ICEv2 : Low cost Industrial Communication Engine EVM. - compatible = "ti,am3359-icev2", "ti,am33xx", "ti,omap3" + compatible = "ti,am3359-icev2", "ti,am33xx" - AM335X OrionLXm : Substation Automation Platform compatible = "novatech,am335x-lxm", "ti,am33xx" diff --git a/Documentation/devicetree/bindings/cpufreq/ti-cpufreq.txt b/Documentation/devicetree/bindings/cpufreq/ti-cpufreq.txt index 0c38e4b8fc51..1758051798fe 100644 --- a/Documentation/devicetree/bindings/cpufreq/ti-cpufreq.txt +++ b/Documentation/devicetree/bindings/cpufreq/ti-cpufreq.txt @@ -15,12 +15,16 @@ In 'cpus' nodes: In 'operating-points-v2' table: - compatible: Should be - - 'operating-points-v2-ti-cpu' for am335x, am43xx, and dra7xx/am57xx SoCs + - 'operating-points-v2-ti-cpu' for am335x, am43xx, and dra7xx/am57xx, + omap34xx, omap36xx and am3517 SoCs - syscon: A phandle pointing to a syscon node representing the control module register space of the SoC. Optional properties: -------------------- +- "vdd-supply", "vbb-supply": to define two regulators for dra7xx +- "cpu0-supply", "vbb-supply": to define two regulators for omap36xx + For each opp entry in 'operating-points-v2' table: - opp-supported-hw: Two bitfields indicating: 1. Which revision of the SoC the OPP is supported by diff --git a/Documentation/devicetree/bindings/devfreq/event/exynos-ppmu.txt b/Documentation/devicetree/bindings/devfreq/event/exynos-ppmu.txt index 3e36c1d11386..fb46b491791c 100644 --- a/Documentation/devicetree/bindings/devfreq/event/exynos-ppmu.txt +++ b/Documentation/devicetree/bindings/devfreq/event/exynos-ppmu.txt @@ -10,14 +10,23 @@ The Exynos PPMU driver uses the devfreq-event class to provide event data to various devfreq devices. The devfreq devices would use the event data when derterming the current state of each IP. -Required properties: +Required properties for PPMU device: - compatible: Should be "samsung,exynos-ppmu" or "samsung,exynos-ppmu-v2. - reg: physical base address of each PPMU and length of memory mapped region. -Optional properties: +Optional properties for PPMU device: - clock-names : the name of clock used by the PPMU, "ppmu" - clocks : phandles for clock specified in "clock-names" property +Required properties for 'events' child node of PPMU device: +- event-name : the unique event name among PPMU device +Optional properties for 'events' child node of PPMU device: +- event-data-type : Define the type of data which shell be counted +by the counter. You can check include/dt-bindings/pmu/exynos_ppmu.h for +all possible type, i.e. count read requests, count write data in bytes, +etc. This field is optional and when it is missing, the driver code +will use default data type. + Example1 : PPMUv1 nodes in exynos3250.dtsi are listed below. ppmu_dmc0: ppmu_dmc0@106a0000 { @@ -145,3 +154,16 @@ Example3 : PPMUv2 nodes in exynos5433.dtsi are listed below. reg = <0x104d0000 0x2000>; status = "disabled"; }; + +Example4 : 'event-data-type' in exynos4412-ppmu-common.dtsi are listed below. + + &ppmu_dmc0 { + status = "okay"; + events { + ppmu_dmc0_3: ppmu-event3-dmc0 { + event-name = "ppmu-event3-dmc0"; + event-data-type = <(PPMU_RO_DATA_CNT | + PPMU_WO_DATA_CNT)>; + }; + }; + }; diff --git a/Documentation/devicetree/bindings/devfreq/exynos-bus.txt b/Documentation/devicetree/bindings/devfreq/exynos-bus.txt index f8e946471a58..e71f752cc18f 100644 --- a/Documentation/devicetree/bindings/devfreq/exynos-bus.txt +++ b/Documentation/devicetree/bindings/devfreq/exynos-bus.txt @@ -50,8 +50,6 @@ Required properties only for passive bus device: Optional properties only for parent bus device: - exynos,saturation-ratio: the percentage value which is used to calibrate the performance count against total cycle count. -- exynos,voltage-tolerance: the percentage value for bus voltage tolerance - which is used to calculate the max voltage. Detailed correlation between sub-blocks and power line according to Exynos SoC: - In case of Exynos3250, there are two power line as following: diff --git a/Documentation/power/drivers-testing.rst b/Documentation/power/drivers-testing.rst index e53f1999fc39..d77d2894f9fe 100644 --- a/Documentation/power/drivers-testing.rst +++ b/Documentation/power/drivers-testing.rst @@ -39,9 +39,10 @@ c) Compile the driver directly into the kernel and try the test modes of d) Attempt to hibernate with the driver compiled directly into the kernel in the "reboot", "shutdown" and "platform" modes. -e) Try the test modes of suspend (see: Documentation/power/basic-pm-debugging.rst, - 2). [As far as the STR tests are concerned, it should not matter whether or - not the driver is built as a module.] +e) Try the test modes of suspend (see: + Documentation/power/basic-pm-debugging.rst, 2). [As far as the STR tests are + concerned, it should not matter whether or not the driver is built as a + module.] f) Attempt to suspend to RAM using the s2ram tool with the driver loaded (see: Documentation/power/basic-pm-debugging.rst, 2). diff --git a/Documentation/power/freezing-of-tasks.rst b/Documentation/power/freezing-of-tasks.rst index ef110fe55e82..8bd693399834 100644 --- a/Documentation/power/freezing-of-tasks.rst +++ b/Documentation/power/freezing-of-tasks.rst @@ -215,30 +215,31 @@ VI. Are there any precautions to be taken to prevent freezing failures? Yes, there are. -First of all, grabbing the 'system_transition_mutex' lock to mutually exclude a piece of code -from system-wide sleep such as suspend/hibernation is not encouraged. -If possible, that piece of code must instead hook onto the suspend/hibernation -notifiers to achieve mutual exclusion. Look at the CPU-Hotplug code -(kernel/cpu.c) for an example. - -However, if that is not feasible, and grabbing 'system_transition_mutex' is deemed necessary, -it is strongly discouraged to directly call mutex_[un]lock(&system_transition_mutex) since -that could lead to freezing failures, because if the suspend/hibernate code -successfully acquired the 'system_transition_mutex' lock, and hence that other entity failed -to acquire the lock, then that task would get blocked in TASK_UNINTERRUPTIBLE -state. As a consequence, the freezer would not be able to freeze that task, -leading to freezing failure. +First of all, grabbing the 'system_transition_mutex' lock to mutually exclude a +piece of code from system-wide sleep such as suspend/hibernation is not +encouraged. If possible, that piece of code must instead hook onto the +suspend/hibernation notifiers to achieve mutual exclusion. Look at the +CPU-Hotplug code (kernel/cpu.c) for an example. + +However, if that is not feasible, and grabbing 'system_transition_mutex' is +deemed necessary, it is strongly discouraged to directly call +mutex_[un]lock(&system_transition_mutex) since that could lead to freezing +failures, because if the suspend/hibernate code successfully acquired the +'system_transition_mutex' lock, and hence that other entity failed to acquire +the lock, then that task would get blocked in TASK_UNINTERRUPTIBLE state. As a +consequence, the freezer would not be able to freeze that task, leading to +freezing failure. However, the [un]lock_system_sleep() APIs are safe to use in this scenario, since they ask the freezer to skip freezing this task, since it is anyway -"frozen enough" as it is blocked on 'system_transition_mutex', which will be released -only after the entire suspend/hibernation sequence is complete. -So, to summarize, use [un]lock_system_sleep() instead of directly using +"frozen enough" as it is blocked on 'system_transition_mutex', which will be +released only after the entire suspend/hibernation sequence is complete. So, to +summarize, use [un]lock_system_sleep() instead of directly using mutex_[un]lock(&system_transition_mutex). That would prevent freezing failures. V. Miscellaneous ================ /sys/power/pm_freeze_timeout controls how long it will cost at most to freeze -all user space processes or all freezable kernel threads, in unit of millisecond. -The default value is 20000, with range of unsigned integer. +all user space processes or all freezable kernel threads, in unit of +millisecond. The default value is 20000, with range of unsigned integer. diff --git a/Documentation/power/opp.rst b/Documentation/power/opp.rst index 209c7613f5a4..e3cc4f349ea8 100644 --- a/Documentation/power/opp.rst +++ b/Documentation/power/opp.rst @@ -73,19 +73,21 @@ factors. Example usage: Thermal management or other exceptional situations where SoC framework might choose to disable a higher frequency OPP to safely continue operations until that OPP could be re-enabled if possible. -OPP library facilitates this concept in it's implementation. The following +OPP library facilitates this concept in its implementation. The following operational functions operate only on available opps: -opp_find_freq_{ceil, floor}, dev_pm_opp_get_voltage, dev_pm_opp_get_freq, dev_pm_opp_get_opp_count +opp_find_freq_{ceil, floor}, dev_pm_opp_get_voltage, dev_pm_opp_get_freq, +dev_pm_opp_get_opp_count -dev_pm_opp_find_freq_exact is meant to be used to find the opp pointer which can then -be used for dev_pm_opp_enable/disable functions to make an opp available as required. +dev_pm_opp_find_freq_exact is meant to be used to find the opp pointer +which can then be used for dev_pm_opp_enable/disable functions to make an +opp available as required. WARNING: Users of OPP library should refresh their availability count using -get_opp_count if dev_pm_opp_enable/disable functions are invoked for a device, the -exact mechanism to trigger these or the notification mechanism to other -dependent subsystems such as cpufreq are left to the discretion of the SoC -specific framework which uses the OPP library. Similar care needs to be taken -care to refresh the cpufreq table in cases of these operations. +get_opp_count if dev_pm_opp_enable/disable functions are invoked for a +device, the exact mechanism to trigger these or the notification mechanism +to other dependent subsystems such as cpufreq are left to the discretion of +the SoC specific framework which uses the OPP library. Similar care needs +to be taken care to refresh the cpufreq table in cases of these operations. 2. Initial OPP List Registration ================================ @@ -99,11 +101,11 @@ OPPs dynamically using the dev_pm_opp_enable / disable functions. dev_pm_opp_add Add a new OPP for a specific domain represented by the device pointer. The OPP is defined using the frequency and voltage. Once added, the OPP - is assumed to be available and control of it's availability can be done - with the dev_pm_opp_enable/disable functions. OPP library internally stores - and manages this information in the opp struct. This function may be - used by SoC framework to define a optimal list as per the demands of - SoC usage environment. + is assumed to be available and control of its availability can be done + with the dev_pm_opp_enable/disable functions. OPP library + internally stores and manages this information in the opp struct. + This function may be used by SoC framework to define a optimal list + as per the demands of SoC usage environment. WARNING: Do not use this function in interrupt context. @@ -354,7 +356,7 @@ struct dev_pm_opp struct device This is used to identify a domain to the OPP layer. The - nature of the device and it's implementation is left to the user of + nature of the device and its implementation is left to the user of OPP library such as the SoC framework. Overall, in a simplistic view, the data structure operations is represented as diff --git a/Documentation/power/pci.rst b/Documentation/power/pci.rst index 0e2ef7429304..51e0a493d284 100644 --- a/Documentation/power/pci.rst +++ b/Documentation/power/pci.rst @@ -426,12 +426,12 @@ pm->runtime_idle() callback. 2.4. System-Wide Power Transitions ---------------------------------- There are a few different types of system-wide power transitions, described in -Documentation/driver-api/pm/devices.rst. Each of them requires devices to be handled -in a specific way and the PM core executes subsystem-level power management -callbacks for this purpose. They are executed in phases such that each phase -involves executing the same subsystem-level callback for every device belonging -to the given subsystem before the next phase begins. These phases always run -after tasks have been frozen. +Documentation/driver-api/pm/devices.rst. Each of them requires devices to be +handled in a specific way and the PM core executes subsystem-level power +management callbacks for this purpose. They are executed in phases such that +each phase involves executing the same subsystem-level callback for every device +belonging to the given subsystem before the next phase begins. These phases +always run after tasks have been frozen. 2.4.1. System Suspend ^^^^^^^^^^^^^^^^^^^^^ @@ -636,12 +636,12 @@ System restore requires a hibernation image to be loaded into memory and the pre-hibernation memory contents to be restored before the pre-hibernation system activity can be resumed. -As described in Documentation/driver-api/pm/devices.rst, the hibernation image is loaded -into memory by a fresh instance of the kernel, called the boot kernel, which in -turn is loaded and run by a boot loader in the usual way. After the boot kernel -has loaded the image, it needs to replace its own code and data with the code -and data of the "hibernated" kernel stored within the image, called the image -kernel. For this purpose all devices are frozen just like before creating +As described in Documentation/driver-api/pm/devices.rst, the hibernation image +is loaded into memory by a fresh instance of the kernel, called the boot kernel, +which in turn is loaded and run by a boot loader in the usual way. After the +boot kernel has loaded the image, it needs to replace its own code and data with +the code and data of the "hibernated" kernel stored within the image, called the +image kernel. For this purpose all devices are frozen just like before creating the image during hibernation, in the prepare, freeze, freeze_noirq @@ -691,8 +691,8 @@ controlling the runtime power management of their devices. At the time of this writing there are two ways to define power management callbacks for a PCI device driver, the recommended one, based on using a -dev_pm_ops structure described in Documentation/driver-api/pm/devices.rst, and the -"legacy" one, in which the .suspend(), .suspend_late(), .resume_early(), and +dev_pm_ops structure described in Documentation/driver-api/pm/devices.rst, and +the "legacy" one, in which the .suspend(), .suspend_late(), .resume_early(), and .resume() callbacks from struct pci_driver are used. The legacy approach, however, doesn't allow one to define runtime power management callbacks and is not really suitable for any new drivers. Therefore it is not covered by this diff --git a/Documentation/power/pm_qos_interface.rst b/Documentation/power/pm_qos_interface.rst index 3097694fba69..0d62d506caf0 100644 --- a/Documentation/power/pm_qos_interface.rst +++ b/Documentation/power/pm_qos_interface.rst @@ -8,8 +8,8 @@ one of the parameters. Two different PM QoS frameworks are available: 1. PM QoS classes for cpu_dma_latency -2. the per-device PM QoS framework provides the API to manage the per-device latency -constraints and PM QoS flags. +2. The per-device PM QoS framework provides the API to manage the + per-device latency constraints and PM QoS flags. Each parameters have defined units: @@ -47,14 +47,14 @@ void pm_qos_add_request(handle, param_class, target_value): pm_qos API functions. void pm_qos_update_request(handle, new_target_value): - Will update the list element pointed to by the handle with the new target value - and recompute the new aggregated target, calling the notification tree if the - target is changed. + Will update the list element pointed to by the handle with the new target + value and recompute the new aggregated target, calling the notification tree + if the target is changed. void pm_qos_remove_request(handle): - Will remove the element. After removal it will update the aggregate target and - call the notification tree if the target was changed as a result of removing - the request. + Will remove the element. After removal it will update the aggregate target + and call the notification tree if the target was changed as a result of + removing the request. int pm_qos_request(param_class): Returns the aggregated value for a given PM QoS class. @@ -167,9 +167,9 @@ int dev_pm_qos_expose_flags(device, value) change the value of the PM_QOS_FLAG_NO_POWER_OFF flag. void dev_pm_qos_hide_flags(device) - Drop the request added by dev_pm_qos_expose_flags() from the device's PM QoS list - of flags and remove sysfs attribute pm_qos_no_power_off from the device's power - directory. + Drop the request added by dev_pm_qos_expose_flags() from the device's PM QoS + list of flags and remove sysfs attribute pm_qos_no_power_off from the device's + power directory. Notification mechanisms: @@ -179,8 +179,8 @@ int dev_pm_qos_add_notifier(device, notifier, type): Adds a notification callback function for the device for a particular request type. - The callback is called when the aggregated value of the device constraints list - is changed. + The callback is called when the aggregated value of the device constraints + list is changed. int dev_pm_qos_remove_notifier(device, notifier, type): Removes the notification callback function for the device. diff --git a/Documentation/power/runtime_pm.rst b/Documentation/power/runtime_pm.rst index 2c2ec99b5088..ab8406c84254 100644 --- a/Documentation/power/runtime_pm.rst +++ b/Documentation/power/runtime_pm.rst @@ -268,8 +268,8 @@ defined in include/linux/pm.h: `unsigned int runtime_auto;` - if set, indicates that the user space has allowed the device driver to power manage the device at run time via the /sys/devices/.../power/control - `interface;` it may only be modified with the help of the pm_runtime_allow() - and pm_runtime_forbid() helper functions + `interface;` it may only be modified with the help of the + pm_runtime_allow() and pm_runtime_forbid() helper functions `unsigned int no_callbacks;` - indicates that the device does not use the runtime PM callbacks (see diff --git a/Documentation/power/suspend-and-cpuhotplug.rst b/Documentation/power/suspend-and-cpuhotplug.rst index 7ac8e1f549f4..572d968c5375 100644 --- a/Documentation/power/suspend-and-cpuhotplug.rst +++ b/Documentation/power/suspend-and-cpuhotplug.rst @@ -106,8 +106,8 @@ execution during resume): * Release system_transition_mutex lock. -It is to be noted here that the system_transition_mutex lock is acquired at the very -beginning, when we are just starting out to suspend, and then released only +It is to be noted here that the system_transition_mutex lock is acquired at the +very beginning, when we are just starting out to suspend, and then released only after the entire cycle is complete (i.e., suspend + resume). :: @@ -165,7 +165,8 @@ Important files and functions/entry points: - kernel/power/process.c : freeze_processes(), thaw_processes() - kernel/power/suspend.c : suspend_prepare(), suspend_enter(), suspend_finish() -- kernel/cpu.c: cpu_[up|down](), _cpu_[up|down](), [disable|enable]_nonboot_cpus() +- kernel/cpu.c: cpu_[up|down](), _cpu_[up|down](), + [disable|enable]_nonboot_cpus() diff --git a/Documentation/power/swsusp.rst b/Documentation/power/swsusp.rst index d000312f6965..8524f079e05c 100644 --- a/Documentation/power/swsusp.rst +++ b/Documentation/power/swsusp.rst @@ -118,7 +118,8 @@ In a really perfect world:: echo 1 > /proc/acpi/sleep # for standby echo 2 > /proc/acpi/sleep # for suspend to ram - echo 3 > /proc/acpi/sleep # for suspend to ram, but with more power conservative + echo 3 > /proc/acpi/sleep # for suspend to ram, but with more power + # conservative echo 4 > /proc/acpi/sleep # for suspend to disk echo 5 > /proc/acpi/sleep # for shutdown unfriendly the system @@ -192,8 +193,8 @@ Q: A: The freezing of tasks is a mechanism by which user space processes and some - kernel threads are controlled during hibernation or system-wide suspend (on some - architectures). See freezing-of-tasks.txt for details. + kernel threads are controlled during hibernation or system-wide suspend (on + some architectures). See freezing-of-tasks.txt for details. Q: What is the difference between "platform" and "shutdown"? @@ -282,7 +283,8 @@ A: suspend(PMSG_FREEZE): devices are frozen so that they don't interfere with state snapshot - state snapshot: copy of whole used memory is taken with interrupts disabled + state snapshot: copy of whole used memory is taken with interrupts + disabled resume(): devices are woken up so that we can write image to swap @@ -353,8 +355,8 @@ Q: A: Generally, yes, you can. However, it requires you to use the "resume=" and - "resume_offset=" kernel command line parameters, so the resume from a swap file - cannot be initiated from an initrd or initramfs image. See + "resume_offset=" kernel command line parameters, so the resume from a swap + file cannot be initiated from an initrd or initramfs image. See swsusp-and-swap-files.txt for details. Q: diff --git a/MAINTAINERS b/MAINTAINERS index 3ff146d4677e..c570f0204b48 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3548,7 +3548,7 @@ BUS FREQUENCY DRIVER FOR SAMSUNG EXYNOS M: Chanwoo Choi <cw00.choi@samsung.com> L: linux-pm@vger.kernel.org L: linux-samsung-soc@vger.kernel.org -T: git git://git.kernel.org/pub/scm/linux/kernel/git/mzx/devfreq.git +T: git git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git S: Maintained F: drivers/devfreq/exynos-bus.c F: Documentation/devicetree/bindings/devfreq/exynos-bus.txt @@ -4282,14 +4282,13 @@ F: include/linux/cpufreq.h F: include/linux/sched/cpufreq.h F: tools/testing/selftests/cpufreq/ -CPU FREQUENCY DRIVERS - ARM BIG LITTLE +CPU FREQUENCY DRIVERS - VEXPRESS SPC ARM BIG LITTLE M: Viresh Kumar <viresh.kumar@linaro.org> M: Sudeep Holla <sudeep.holla@arm.com> L: linux-pm@vger.kernel.org W: http://www.arm.com/products/processors/technologies/biglittleprocessing.php S: Maintained -F: drivers/cpufreq/arm_big_little.h -F: drivers/cpufreq/arm_big_little.c +F: drivers/cpufreq/vexpress-spc-cpufreq.c CPU POWER MONITORING SUBSYSTEM M: Thomas Renninger <trenn@suse.com> @@ -4774,9 +4773,9 @@ F: include/linux/devcoredump.h DEVICE FREQUENCY (DEVFREQ) M: MyungJoo Ham <myungjoo.ham@samsung.com> M: Kyungmin Park <kyungmin.park@samsung.com> -R: Chanwoo Choi <cw00.choi@samsung.com> +M: Chanwoo Choi <cw00.choi@samsung.com> L: linux-pm@vger.kernel.org -T: git git://git.kernel.org/pub/scm/linux/kernel/git/mzx/devfreq.git +T: git git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git S: Maintained F: drivers/devfreq/ F: include/linux/devfreq.h @@ -4786,10 +4785,11 @@ F: include/trace/events/devfreq.h DEVICE FREQUENCY EVENT (DEVFREQ-EVENT) M: Chanwoo Choi <cw00.choi@samsung.com> L: linux-pm@vger.kernel.org -T: git git://git.kernel.org/pub/scm/linux/kernel/git/mzx/devfreq.git +T: git git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git S: Supported F: drivers/devfreq/event/ F: drivers/devfreq/devfreq-event.c +F: include/dt-bindings/pmu/exynos_ppmu.h F: include/linux/devfreq-event.h F: Documentation/devicetree/bindings/devfreq/event/ @@ -13076,6 +13076,15 @@ L: linux-scsi@vger.kernel.org S: Supported F: drivers/scsi/pm8001/ +PM-GRAPH UTILITY +M: "Todd E Brandt" <todd.e.brandt@linux.intel.com> +L: linux-pm@vger.kernel.org +W: https://01.org/pm-graph +B: https://bugzilla.kernel.org/buglist.cgi?component=pm-graph&product=Tools +T: git git://github.com/intel/pm-graph +S: Supported +F: tools/power/pm-graph + PNP SUPPORT M: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> S: Maintained diff --git a/arch/arm/boot/dts/am3517.dtsi b/arch/arm/boot/dts/am3517.dtsi index bf3002009b00..76f819f4ba48 100644 --- a/arch/arm/boot/dts/am3517.dtsi +++ b/arch/arm/boot/dts/am3517.dtsi @@ -16,6 +16,37 @@ can = &hecc; }; + cpus { + cpu: cpu@0 { + /* Based on OMAP3630 variants OPP50 and OPP100 */ + operating-points-v2 = <&cpu0_opp_table>; + + clock-latency = <300000>; /* From legacy driver */ + }; + }; + + cpu0_opp_table: opp-table { + compatible = "operating-points-v2-ti-cpu"; + syscon = <&scm_conf>; + /* + * AM3517 TRM only lists 600MHz @ 1.2V, but omap36xx + * appear to operate at 300MHz as well. Since AM3517 only + * lists one operating voltage, it will remain fixed at 1.2V + */ + opp50-300000000 { + opp-hz = /bits/ 64 <300000000>; + opp-microvolt = <1200000>; + opp-supported-hw = <0xffffffff 0xffffffff>; + opp-suspend; + }; + + opp100-600000000 { + opp-hz = /bits/ 64 <600000000>; + opp-microvolt = <1200000>; + opp-supported-hw = <0xffffffff 0xffffffff>; + }; + }; + ocp@68000000 { am35x_otg_hs: am35x_otg_hs@5c040000 { compatible = "ti,omap3-musb"; diff --git a/arch/arm/boot/dts/am3517_mt_ventoux.dts b/arch/arm/boot/dts/am3517_mt_ventoux.dts index e507e4ae0d88..e7d7124a34ba 100644 --- a/arch/arm/boot/dts/am3517_mt_ventoux.dts +++ b/arch/arm/boot/dts/am3517_mt_ventoux.dts @@ -8,7 +8,7 @@ / { model = "TeeJet Mt.Ventoux"; - compatible = "teejet,mt_ventoux", "ti,omap3"; + compatible = "teejet,mt_ventoux", "ti,am3517", "ti,omap3"; memory@80000000 { device_type = "memory"; diff --git a/arch/arm/boot/dts/logicpd-som-lv-35xx-devkit.dts b/arch/arm/boot/dts/logicpd-som-lv-35xx-devkit.dts index f7a841a28865..2a0a98fe67f0 100644 --- a/arch/arm/boot/dts/logicpd-som-lv-35xx-devkit.dts +++ b/arch/arm/boot/dts/logicpd-som-lv-35xx-devkit.dts @@ -9,5 +9,5 @@ / { model = "LogicPD Zoom OMAP35xx SOM-LV Development Kit"; - compatible = "logicpd,dm3730-som-lv-devkit", "ti,omap3"; + compatible = "logicpd,dm3730-som-lv-devkit", "ti,omap3430", "ti,omap3"; }; diff --git a/arch/arm/boot/dts/logicpd-torpedo-35xx-devkit.dts b/arch/arm/boot/dts/logicpd-torpedo-35xx-devkit.dts index 7675bc3fa868..57bae2aa910e 100644 --- a/arch/arm/boot/dts/logicpd-torpedo-35xx-devkit.dts +++ b/arch/arm/boot/dts/logicpd-torpedo-35xx-devkit.dts @@ -9,5 +9,5 @@ / { model = "LogicPD Zoom OMAP35xx Torpedo Development Kit"; - compatible = "logicpd,dm3730-torpedo-devkit", "ti,omap3"; + compatible = "logicpd,dm3730-torpedo-devkit", "ti,omap3430", "ti,omap3"; }; diff --git a/arch/arm/boot/dts/omap3-beagle-xm.dts b/arch/arm/boot/dts/omap3-beagle-xm.dts index 1aa99fc1487a..125ed933ca75 100644 --- a/arch/arm/boot/dts/omap3-beagle-xm.dts +++ b/arch/arm/boot/dts/omap3-beagle-xm.dts @@ -8,7 +8,7 @@ / { model = "TI OMAP3 BeagleBoard xM"; - compatible = "ti,omap3-beagle-xm", "ti,omap36xx", "ti,omap3"; + compatible = "ti,omap3-beagle-xm", "ti,omap3630", "ti,omap36xx", "ti,omap3"; cpus { cpu@0 { diff --git a/arch/arm/boot/dts/omap3-beagle.dts b/arch/arm/boot/dts/omap3-beagle.dts index e3df3c166902..4ed3f93f5841 100644 --- a/arch/arm/boot/dts/omap3-beagle.dts +++ b/arch/arm/boot/dts/omap3-beagle.dts @@ -8,7 +8,7 @@ / { model = "TI OMAP3 BeagleBoard"; - compatible = "ti,omap3-beagle", "ti,omap3"; + compatible = "ti,omap3-beagle", "ti,omap3430", "ti,omap3"; cpus { cpu@0 { diff --git a/arch/arm/boot/dts/omap3-cm-t3530.dts b/arch/arm/boot/dts/omap3-cm-t3530.dts index 76e52c78cbb4..32dbaeaed147 100644 --- a/arch/arm/boot/dts/omap3-cm-t3530.dts +++ b/arch/arm/boot/dts/omap3-cm-t3530.dts @@ -9,7 +9,7 @@ / { model = "CompuLab CM-T3530"; - compatible = "compulab,omap3-cm-t3530", "ti,omap34xx", "ti,omap3"; + compatible = "compulab,omap3-cm-t3530", "ti,omap3430", "ti,omap34xx", "ti,omap3"; /* Regulator to trigger the reset signal of the Wifi module */ mmc2_sdio_reset: regulator-mmc2-sdio-reset { diff --git a/arch/arm/boot/dts/omap3-cm-t3730.dts b/arch/arm/boot/dts/omap3-cm-t3730.dts index 6e944dfa0f3d..683819bf0915 100644 --- a/arch/arm/boot/dts/omap3-cm-t3730.dts +++ b/arch/arm/boot/dts/omap3-cm-t3730.dts @@ -9,7 +9,7 @@ / { model = "CompuLab CM-T3730"; - compatible = "compulab,omap3-cm-t3730", "ti,omap36xx", "ti,omap3"; + compatible = "compulab,omap3-cm-t3730", "ti,omap3630", "ti,omap36xx", "ti,omap3"; wl12xx_vmmc2: wl12xx_vmmc2 { compatible = "regulator-fixed"; diff --git a/arch/arm/boot/dts/omap3-devkit8000-lcd43.dts b/arch/arm/boot/dts/omap3-devkit8000-lcd43.dts index a80fc60bc773..afed85078ad8 100644 --- a/arch/arm/boot/dts/omap3-devkit8000-lcd43.dts +++ b/arch/arm/boot/dts/omap3-devkit8000-lcd43.dts @@ -11,7 +11,7 @@ #include "omap3-devkit8000-lcd-common.dtsi" / { model = "TimLL OMAP3 Devkit8000 with 4.3'' LCD panel"; - compatible = "timll,omap3-devkit8000", "ti,omap3"; + compatible = "timll,omap3-devkit8000", "ti,omap3430", "ti,omap3"; lcd0: display { panel-timing { diff --git a/arch/arm/boot/dts/omap3-devkit8000-lcd70.dts b/arch/arm/boot/dts/omap3-devkit8000-lcd70.dts index 0753776071f8..07c51a105c0d 100644 --- a/arch/arm/boot/dts/omap3-devkit8000-lcd70.dts +++ b/arch/arm/boot/dts/omap3-devkit8000-lcd70.dts @@ -11,7 +11,7 @@ #include "omap3-devkit8000-lcd-common.dtsi" / { model = "TimLL OMAP3 Devkit8000 with 7.0'' LCD panel"; - compatible = "timll,omap3-devkit8000", "ti,omap3"; + compatible = "timll,omap3-devkit8000", "ti,omap3430", "ti,omap3"; lcd0: display { panel-timing { diff --git a/arch/arm/boot/dts/omap3-devkit8000.dts b/arch/arm/boot/dts/omap3-devkit8000.dts index faafc48d8f61..162d0726b008 100644 --- a/arch/arm/boot/dts/omap3-devkit8000.dts +++ b/arch/arm/boot/dts/omap3-devkit8000.dts @@ -7,7 +7,7 @@ #include "omap3-devkit8000-common.dtsi" / { model = "TimLL OMAP3 Devkit8000"; - compatible = "timll,omap3-devkit8000", "ti,omap3"; + compatible = "timll,omap3-devkit8000", "ti,omap3430", "ti,omap3"; aliases { display1 = &dvi0; diff --git a/arch/arm/boot/dts/omap3-gta04.dtsi b/arch/arm/boot/dts/omap3-gta04.dtsi index b6ef1a7ac8a4..409a758c99f1 100644 --- a/arch/arm/boot/dts/omap3-gta04.dtsi +++ b/arch/arm/boot/dts/omap3-gta04.dtsi @@ -11,7 +11,7 @@ / { model = "OMAP3 GTA04"; - compatible = "ti,omap3-gta04", "ti,omap36xx", "ti,omap3"; + compatible = "ti,omap3-gta04", "ti,omap3630", "ti,omap36xx", "ti,omap3"; cpus { cpu@0 { diff --git a/arch/arm/boot/dts/omap3-ha-lcd.dts b/arch/arm/boot/dts/omap3-ha-lcd.dts index badb9b3c8897..c9ecbc45c8e2 100644 --- a/arch/arm/boot/dts/omap3-ha-lcd.dts +++ b/arch/arm/boot/dts/omap3-ha-lcd.dts @@ -8,7 +8,7 @@ / { model = "TI OMAP3 HEAD acoustics LCD-baseboard with TAO3530 SOM"; - compatible = "headacoustics,omap3-ha-lcd", "technexion,omap3-tao3530", "ti,omap34xx", "ti,omap3"; + compatible = "headacoustics,omap3-ha-lcd", "technexion,omap3-tao3530", "ti,omap3430", "ti,omap34xx", "ti,omap3"; }; &omap3_pmx_core { diff --git a/arch/arm/boot/dts/omap3-ha.dts b/arch/arm/boot/dts/omap3-ha.dts index a5365252bfbe..35c4e15abeb7 100644 --- a/arch/arm/boot/dts/omap3-ha.dts +++ b/arch/arm/boot/dts/omap3-ha.dts @@ -8,7 +8,7 @@ / { model = "TI OMAP3 HEAD acoustics baseboard with TAO3530 SOM"; - compatible = "headacoustics,omap3-ha", "technexion,omap3-tao3530", "ti,omap34xx", "ti,omap3"; + compatible = "headacoustics,omap3-ha", "technexion,omap3-tao3530", "ti,omap3430", "ti,omap34xx", "ti,omap3"; }; &omap3_pmx_core { diff --git a/arch/arm/boot/dts/omap3-igep0020-rev-f.dts b/arch/arm/boot/dts/omap3-igep0020-rev-f.dts index 03dcd05fb8a0..d134ce1cffc0 100644 --- a/arch/arm/boot/dts/omap3-igep0020-rev-f.dts +++ b/arch/arm/boot/dts/omap3-igep0020-rev-f.dts @@ -10,7 +10,7 @@ / { model = "IGEPv2 Rev. F (TI OMAP AM/DM37x)"; - compatible = "isee,omap3-igep0020-rev-f", "ti,omap36xx", "ti,omap3"; + compatible = "isee,omap3-igep0020-rev-f", "ti,omap3630", "ti,omap36xx", "ti,omap3"; /* Regulator to trigger the WL_EN signal of the Wifi module */ lbep5clwmc_wlen: regulator-lbep5clwmc-wlen { diff --git a/arch/arm/boot/dts/omap3-igep0020.dts b/arch/arm/boot/dts/omap3-igep0020.dts index 6d0519e3dfd0..e341535a7162 100644 --- a/arch/arm/boot/dts/omap3-igep0020.dts +++ b/arch/arm/boot/dts/omap3-igep0020.dts @@ -10,7 +10,7 @@ / { model = "IGEPv2 Rev. C (TI OMAP AM/DM37x)"; - compatible = "isee,omap3-igep0020", "ti,omap36xx", "ti,omap3"; + compatible = "isee,omap3-igep0020", "ti,omap3630", "ti,omap36xx", "ti,omap3"; vmmcsdio_fixed: fixedregulator-mmcsdio { compatible = "regulator-fixed"; diff --git a/arch/arm/boot/dts/omap3-igep0030-rev-g.dts b/arch/arm/boot/dts/omap3-igep0030-rev-g.dts index 060acd1e803a..9ca1d0f61964 100644 --- a/arch/arm/boot/dts/omap3-igep0030-rev-g.dts +++ b/arch/arm/boot/dts/omap3-igep0030-rev-g.dts @@ -10,7 +10,7 @@ / { model = "IGEP COM MODULE Rev. G (TI OMAP AM/DM37x)"; - compatible = "isee,omap3-igep0030-rev-g", "ti,omap36xx", "ti,omap3"; + compatible = "isee,omap3-igep0030-rev-g", "ti,omap3630", "ti,omap36xx", "ti,omap3"; /* Regulator to trigger the WL_EN signal of the Wifi module */ lbep5clwmc_wlen: regulator-lbep5clwmc-wlen { diff --git a/arch/arm/boot/dts/omap3-igep0030.dts b/arch/arm/boot/dts/omap3-igep0030.dts index 25170bd3c573..32f31035daa2 100644 --- a/arch/arm/boot/dts/omap3-igep0030.dts +++ b/arch/arm/boot/dts/omap3-igep0030.dts @@ -10,7 +10,7 @@ / { model = "IGEP COM MODULE Rev. E (TI OMAP AM/DM37x)"; - compatible = "isee,omap3-igep0030", "ti,omap36xx", "ti,omap3"; + compatible = "isee,omap3-igep0030", "ti,omap3630", "ti,omap36xx", "ti,omap3"; vmmcsdio_fixed: fixedregulator-mmcsdio { compatible = "regulator-fixed"; diff --git a/arch/arm/boot/dts/omap3-ldp.dts b/arch/arm/boot/dts/omap3-ldp.dts index 9a5fde2d9bce..ec9ba04ef43b 100644 --- a/arch/arm/boot/dts/omap3-ldp.dts +++ b/arch/arm/boot/dts/omap3-ldp.dts @@ -10,7 +10,7 @@ / { model = "TI OMAP3430 LDP (Zoom1 Labrador)"; - compatible = "ti,omap3-ldp", "ti,omap3"; + compatible = "ti,omap3-ldp", "ti,omap3430", "ti,omap3"; memory@80000000 { device_type = "memory"; diff --git a/arch/arm/boot/dts/omap3-lilly-a83x.dtsi b/arch/arm/boot/dts/omap3-lilly-a83x.dtsi index c22833d4e568..73d477898ec2 100644 --- a/arch/arm/boot/dts/omap3-lilly-a83x.dtsi +++ b/arch/arm/boot/dts/omap3-lilly-a83x.dtsi @@ -7,7 +7,7 @@ / { model = "INCOstartec LILLY-A83X module (DM3730)"; - compatible = "incostartec,omap3-lilly-a83x", "ti,omap36xx", "ti,omap3"; + compatible = "incostartec,omap3-lilly-a83x", "ti,omap3630", "ti,omap36xx", "ti,omap3"; chosen { bootargs = "console=ttyO0,115200n8 vt.global_cursor_default=0 consoleblank=0"; diff --git a/arch/arm/boot/dts/omap3-lilly-dbb056.dts b/arch/arm/boot/dts/omap3-lilly-dbb056.dts index fec335400074..ecb4ef738e07 100644 --- a/arch/arm/boot/dts/omap3-lilly-dbb056.dts +++ b/arch/arm/boot/dts/omap3-lilly-dbb056.dts @@ -8,7 +8,7 @@ / { model = "INCOstartec LILLY-DBB056 (DM3730)"; - compatible = "incostartec,omap3-lilly-dbb056", "incostartec,omap3-lilly-a83x", "ti,omap36xx", "ti,omap3"; + compatible = "incostartec,omap3-lilly-dbb056", "incostartec,omap3-lilly-a83x", "ti,omap3630", "ti,omap36xx", "ti,omap3"; }; &twl { diff --git a/arch/arm/boot/dts/omap3-n9.dts b/arch/arm/boot/dts/omap3-n9.dts index 74c0ff2350d3..2495a696cec6 100644 --- a/arch/arm/boot/dts/omap3-n9.dts +++ b/arch/arm/boot/dts/omap3-n9.dts @@ -12,7 +12,7 @@ / { model = "Nokia N9"; - compatible = "nokia,omap3-n9", "ti,omap36xx", "ti,omap3"; + compatible = "nokia,omap3-n9", "ti,omap3630", "ti,omap36xx", "ti,omap3"; }; &i2c2 { diff --git a/arch/arm/boot/dts/omap3-n950-n9.dtsi b/arch/arm/boot/dts/omap3-n950-n9.dtsi index 6681d4519e97..a075b63f3087 100644 --- a/arch/arm/boot/dts/omap3-n950-n9.dtsi +++ b/arch/arm/boot/dts/omap3-n950-n9.dtsi @@ -11,13 +11,6 @@ cpus { cpu@0 { cpu0-supply = <&vcc>; - operating-points = < - /* kHz uV */ - 300000 1012500 - 600000 1200000 - 800000 1325000 - 1000000 1375000 - >; }; }; diff --git a/arch/arm/boot/dts/omap3-n950.dts b/arch/arm/boot/dts/omap3-n950.dts index 9886bf8b90ab..31d47a1fad84 100644 --- a/arch/arm/boot/dts/omap3-n950.dts +++ b/arch/arm/boot/dts/omap3-n950.dts @@ -12,7 +12,7 @@ / { model = "Nokia N950"; - compatible = "nokia,omap3-n950", "ti,omap36xx", "ti,omap3"; + compatible = "nokia,omap3-n950", "ti,omap3630", "ti,omap36xx", "ti,omap3"; keys { compatible = "gpio-keys"; diff --git a/arch/arm/boot/dts/omap3-overo-storm-alto35.dts b/arch/arm/boot/dts/omap3-overo-storm-alto35.dts index 18338576c41d..7f04dfad8203 100644 --- a/arch/arm/boot/dts/omap3-overo-storm-alto35.dts +++ b/arch/arm/boot/dts/omap3-overo-storm-alto35.dts @@ -14,5 +14,5 @@ / { model = "OMAP36xx/AM37xx/DM37xx Gumstix Overo on Alto35"; - compatible = "gumstix,omap3-overo-alto35", "gumstix,omap3-overo", "ti,omap36xx", "ti,omap3"; + compatible = "gumstix,omap3-overo-alto35", "gumstix,omap3-overo", "ti,omap3630", "ti,omap36xx", "ti,omap3"; }; diff --git a/arch/arm/boot/dts/omap3-overo-storm-chestnut43.dts b/arch/arm/boot/dts/omap3-overo-storm-chestnut43.dts index f204c8af8281..bc5a04e03336 100644 --- a/arch/arm/boot/dts/omap3-overo-storm-chestnut43.dts +++ b/arch/arm/boot/dts/omap3-overo-storm-chestnut43.dts @@ -14,7 +14,7 @@ / { model = "OMAP36xx/AM37xx/DM37xx Gumstix Overo on Chestnut43"; - compatible = "gumstix,omap3-overo-chestnut43", "gumstix,omap3-overo", "ti,omap36xx", "ti,omap3"; + compatible = "gumstix,omap3-overo-chestnut43", "gumstix,omap3-overo", "ti,omap3630", "ti,omap36xx", "ti,omap3"; }; &omap3_pmx_core2 { diff --git a/arch/arm/boot/dts/omap3-overo-storm-gallop43.dts b/arch/arm/boot/dts/omap3-overo-storm-gallop43.dts index c633f7cee68e..065c31cbf0e2 100644 --- a/arch/arm/boot/dts/omap3-overo-storm-gallop43.dts +++ b/arch/arm/boot/dts/omap3-overo-storm-gallop43.dts @@ -14,7 +14,7 @@ / { model = "OMAP36xx/AM37xx/DM37xx Gumstix Overo on Gallop43"; - compatible = "gumstix,omap3-overo-gallop43", "gumstix,omap3-overo", "ti,omap36xx", "ti,omap3"; + compatible = "gumstix,omap3-overo-gallop43", "gumstix,omap3-overo", "ti,omap3630", "ti,omap36xx", "ti,omap3"; }; &omap3_pmx_core2 { diff --git a/arch/arm/boot/dts/omap3-overo-storm-palo35.dts b/arch/arm/boot/dts/omap3-overo-storm-palo35.dts index fb88ebc9858c..e38c1c51392c 100644 --- a/arch/arm/boot/dts/omap3-overo-storm-palo35.dts +++ b/arch/arm/boot/dts/omap3-overo-storm-palo35.dts @@ -14,7 +14,7 @@ / { model = "OMAP36xx/AM37xx/DM37xx Gumstix Overo on Palo35"; - compatible = "gumstix,omap3-overo-palo35", "gumstix,omap3-overo", "ti,omap36xx", "ti,omap3"; + compatible = "gumstix,omap3-overo-palo35", "gumstix,omap3-overo", "ti,omap3630", "ti,omap36xx", "ti,omap3"; }; &omap3_pmx_core2 { diff --git a/arch/arm/boot/dts/omap3-overo-storm-palo43.dts b/arch/arm/boot/dts/omap3-overo-storm-palo43.dts index 76cca00d97b6..e6dc23159c4d 100644 --- a/arch/arm/boot/dts/omap3-overo-storm-palo43.dts +++ b/arch/arm/boot/dts/omap3-overo-storm-palo43.dts @@ -14,7 +14,7 @@ / { model = "OMAP36xx/AM37xx/DM37xx Gumstix Overo on Palo43"; - compatible = "gumstix,omap3-overo-palo43", "gumstix,omap3-overo", "ti,omap36xx", "ti,omap3"; + compatible = "gumstix,omap3-overo-palo43", "gumstix,omap3-overo", "ti,omap3630", "ti,omap36xx", "ti,omap3"; }; &omap3_pmx_core2 { diff --git a/arch/arm/boot/dts/omap3-overo-storm-summit.dts b/arch/arm/boot/dts/omap3-overo-storm-summit.dts index cc081a9e4c1e..587c08ce282d 100644 --- a/arch/arm/boot/dts/omap3-overo-storm-summit.dts +++ b/arch/arm/boot/dts/omap3-overo-storm-summit.dts @@ -14,7 +14,7 @@ / { model = "OMAP36xx/AM37xx/DM37xx Gumstix Overo on Summit"; - compatible = "gumstix,omap3-overo-summit", "gumstix,omap3-overo", "ti,omap36xx", "ti,omap3"; + compatible = "gumstix,omap3-overo-summit", "gumstix,omap3-overo", "ti,omap3630", "ti,omap36xx", "ti,omap3"; }; &omap3_pmx_core2 { diff --git a/arch/arm/boot/dts/omap3-overo-storm-tobi.dts b/arch/arm/boot/dts/omap3-overo-storm-tobi.dts index 1de41c0826e0..f57de6010994 100644 --- a/arch/arm/boot/dts/omap3-overo-storm-tobi.dts +++ b/arch/arm/boot/dts/omap3-overo-storm-tobi.dts @@ -14,6 +14,6 @@ / { model = "OMAP36xx/AM37xx/DM37xx Gumstix Overo on Tobi"; - compatible = "gumstix,omap3-overo-tobi", "gumstix,omap3-overo", "ti,omap36xx", "ti,omap3"; + compatible = "gumstix,omap3-overo-tobi", "gumstix,omap3-overo", "ti,omap3630", "ti,omap36xx", "ti,omap3"; }; diff --git a/arch/arm/boot/dts/omap3-overo-storm-tobiduo.dts b/arch/arm/boot/dts/omap3-overo-storm-tobiduo.dts index 9ed13118ed8e..281af6c113be 100644 --- a/arch/arm/boot/dts/omap3-overo-storm-tobiduo.dts +++ b/arch/arm/boot/dts/omap3-overo-storm-tobiduo.dts @@ -14,5 +14,5 @@ / { model = "OMAP36xx/AM37xx/DM37xx Gumstix Overo on TobiDuo"; - compatible = "gumstix,omap3-overo-tobiduo", "gumstix,omap3-overo", "ti,omap36xx", "ti,omap3"; + compatible = "gumstix,omap3-overo-tobiduo", "gumstix,omap3-overo", "ti,omap3630", "ti,omap36xx", "ti,omap3"; }; diff --git a/arch/arm/boot/dts/omap3-pandora-1ghz.dts b/arch/arm/boot/dts/omap3-pandora-1ghz.dts index 81b957f33c9f..ea509956d7ac 100644 --- a/arch/arm/boot/dts/omap3-pandora-1ghz.dts +++ b/arch/arm/boot/dts/omap3-pandora-1ghz.dts @@ -16,7 +16,7 @@ / { model = "Pandora Handheld Console 1GHz"; - compatible = "openpandora,omap3-pandora-1ghz", "ti,omap36xx", "ti,omap3"; + compatible = "openpandora,omap3-pandora-1ghz", "ti,omap3630", "ti,omap36xx", "ti,omap3"; }; &omap3_pmx_core2 { diff --git a/arch/arm/boot/dts/omap3-sbc-t3530.dts b/arch/arm/boot/dts/omap3-sbc-t3530.dts index ae96002abb3b..24bf3fd86641 100644 --- a/arch/arm/boot/dts/omap3-sbc-t3530.dts +++ b/arch/arm/boot/dts/omap3-sbc-t3530.dts @@ -8,7 +8,7 @@ / { model = "CompuLab SBC-T3530 with CM-T3530"; - compatible = "compulab,omap3-sbc-t3530", "compulab,omap3-cm-t3530", "ti,omap34xx", "ti,omap3"; + compatible = "compulab,omap3-sbc-t3530", "compulab,omap3-cm-t3530", "ti,omap3430", "ti,omap34xx", "ti,omap3"; aliases { display0 = &dvi0; diff --git a/arch/arm/boot/dts/omap3-sbc-t3730.dts b/arch/arm/boot/dts/omap3-sbc-t3730.dts index 7de6df16fc17..eb3893b9535e 100644 --- a/arch/arm/boot/dts/omap3-sbc-t3730.dts +++ b/arch/arm/boot/dts/omap3-sbc-t3730.dts @@ -8,7 +8,7 @@ / { model = "CompuLab SBC-T3730 with CM-T3730"; - compatible = "compulab,omap3-sbc-t3730", "compulab,omap3-cm-t3730", "ti,omap36xx", "ti,omap3"; + compatible = "compulab,omap3-sbc-t3730", "compulab,omap3-cm-t3730", "ti,omap3630", "ti,omap36xx", "ti,omap3"; aliases { display0 = &dvi0; diff --git a/arch/arm/boot/dts/omap3-sniper.dts b/arch/arm/boot/dts/omap3-sniper.dts index 40a87330e8c3..b6879cdc5c13 100644 --- a/arch/arm/boot/dts/omap3-sniper.dts +++ b/arch/arm/boot/dts/omap3-sniper.dts @@ -9,7 +9,7 @@ / { model = "LG Optimus Black"; - compatible = "lg,omap3-sniper", "ti,omap36xx", "ti,omap3"; + compatible = "lg,omap3-sniper", "ti,omap3630", "ti,omap36xx", "ti,omap3"; cpus { cpu@0 { diff --git a/arch/arm/boot/dts/omap3-thunder.dts b/arch/arm/boot/dts/omap3-thunder.dts index 6276e7079b36..64221e3b3477 100644 --- a/arch/arm/boot/dts/omap3-thunder.dts +++ b/arch/arm/boot/dts/omap3-thunder.dts @@ -8,7 +8,7 @@ / { model = "TI OMAP3 Thunder baseboard with TAO3530 SOM"; - compatible = "technexion,omap3-thunder", "technexion,omap3-tao3530", "ti,omap34xx", "ti,omap3"; + compatible = "technexion,omap3-thunder", "technexion,omap3-tao3530", "ti,omap3430", "ti,omap34xx", "ti,omap3"; }; &omap3_pmx_core { diff --git a/arch/arm/boot/dts/omap3-zoom3.dts b/arch/arm/boot/dts/omap3-zoom3.dts index db3a2fe84e99..d240e39f2151 100644 --- a/arch/arm/boot/dts/omap3-zoom3.dts +++ b/arch/arm/boot/dts/omap3-zoom3.dts @@ -9,7 +9,7 @@ / { model = "TI Zoom3"; - compatible = "ti,omap3-zoom3", "ti,omap36xx", "ti,omap3"; + compatible = "ti,omap3-zoom3", "ti,omap3630", "ti,omap36xx", "ti,omap3"; cpus { cpu@0 { diff --git a/arch/arm/boot/dts/omap3430-sdp.dts b/arch/arm/boot/dts/omap3430-sdp.dts index 0abd61108a53..7bfde8aac7ae 100644 --- a/arch/arm/boot/dts/omap3430-sdp.dts +++ b/arch/arm/boot/dts/omap3430-sdp.dts @@ -8,7 +8,7 @@ / { model = "TI OMAP3430 SDP"; - compatible = "ti,omap3430-sdp", "ti,omap3"; + compatible = "ti,omap3430-sdp", "ti,omap3430", "ti,omap3"; memory@80000000 { device_type = "memory"; diff --git a/arch/arm/boot/dts/omap34xx.dtsi b/arch/arm/boot/dts/omap34xx.dtsi index 7b09cbee8bb8..c4dd9801840d 100644 --- a/arch/arm/boot/dts/omap34xx.dtsi +++ b/arch/arm/boot/dts/omap34xx.dtsi @@ -16,19 +16,67 @@ / { cpus { cpu: cpu@0 { - /* OMAP343x/OMAP35xx variants OPP1-5 */ - operating-points = < - /* kHz uV */ - 125000 975000 - 250000 1075000 - 500000 1200000 - 550000 1270000 - 600000 1350000 - >; + /* OMAP343x/OMAP35xx variants OPP1-6 */ + operating-points-v2 = <&cpu0_opp_table>; + clock-latency = <300000>; /* From legacy driver */ }; }; + /* see Documentation/devicetree/bindings/opp/opp.txt */ + cpu0_opp_table: opp-table { + compatible = "operating-points-v2-ti-cpu"; + syscon = <&scm_conf>; + + opp1-125000000 { + opp-hz = /bits/ 64 <125000000>; + /* + * we currently only select the max voltage from table + * Table 3-3 of the omap3530 Data sheet (SPRS507F). + * Format is: <target min max> + */ + opp-microvolt = <975000 975000 975000>; + /* + * first value is silicon revision bit mask + * second one 720MHz Device Identification bit mask + */ + opp-supported-hw = <0xffffffff 3>; + }; + + opp2-250000000 { + opp-hz = /bits/ 64 <250000000>; + opp-microvolt = <1075000 1075000 1075000>; + opp-supported-hw = <0xffffffff 3>; + opp-suspend; + }; + + opp3-500000000 { + opp-hz = /bits/ 64 <500000000>; + opp-microvolt = <1200000 1200000 1200000>; + opp-supported-hw = <0xffffffff 3>; + }; + + opp4-550000000 { + opp-hz = /bits/ 64 <550000000>; + opp-microvolt = <1275000 1275000 1275000>; + opp-supported-hw = <0xffffffff 3>; + }; + + opp5-600000000 { + opp-hz = /bits/ 64 <600000000>; + opp-microvolt = <1350000 1350000 1350000>; + opp-supported-hw = <0xffffffff 3>; + }; + + opp6-720000000 { + opp-hz = /bits/ 64 <720000000>; + opp-microvolt = <1350000 1350000 1350000>; + /* only high-speed grade omap3530 devices */ + opp-supported-hw = <0xffffffff 2>; + turbo-mode; + }; + }; + ocp@68000000 { omap3_pmx_core2: pinmux@480025d8 { compatible = "ti,omap3-padconf", "pinctrl-single"; diff --git a/arch/arm/boot/dts/omap36xx.dtsi b/arch/arm/boot/dts/omap36xx.dtsi index 1e552f08f120..c618cb257d00 100644 --- a/arch/arm/boot/dts/omap36xx.dtsi +++ b/arch/arm/boot/dts/omap36xx.dtsi @@ -19,16 +19,65 @@ }; cpus { - /* OMAP3630/OMAP37xx 'standard device' variants OPP50 to OPP130 */ + /* OMAP3630/OMAP37xx variants OPP50 to OPP130 and OPP1G */ cpu: cpu@0 { - operating-points = < - /* kHz uV */ - 300000 1012500 - 600000 1200000 - 800000 1325000 - >; - clock-latency = <300000>; /* From legacy driver */ + operating-points-v2 = <&cpu0_opp_table>; + + vbb-supply = <&abb_mpu_iva>; + clock-latency = <300000>; /* From omap-cpufreq driver */ + }; + }; + + /* see Documentation/devicetree/bindings/opp/opp.txt */ + cpu0_opp_table: opp-table { + compatible = "operating-points-v2-ti-cpu"; + syscon = <&scm_conf>; + + opp50-300000000 { + opp-hz = /bits/ 64 <300000000>; + /* + * we currently only select the max voltage from table + * Table 4-19 of the DM3730 Data sheet (SPRS685B) + * Format is: cpu0-supply: <target min max> + * vbb-supply: <target min max> + */ + opp-microvolt = <1012500 1012500 1012500>, + <1012500 1012500 1012500>; + /* + * first value is silicon revision bit mask + * second one is "speed binned" bit mask + */ + opp-supported-hw = <0xffffffff 3>; + opp-suspend; + }; + + opp100-600000000 { + opp-hz = /bits/ 64 <600000000>; + opp-microvolt = <1200000 1200000 1200000>, + <1200000 1200000 1200000>; + opp-supported-hw = <0xffffffff 3>; + }; + + opp130-800000000 { + opp-hz = /bits/ 64 <800000000>; + opp-microvolt = <1325000 1325000 1325000>, + <1325000 1325000 1325000>; + opp-supported-hw = <0xffffffff 3>; }; + + opp1g-1000000000 { + opp-hz = /bits/ 64 <1000000000>; + opp-microvolt = <1375000 1375000 1375000>, + <1375000 1375000 1375000>; + /* only on am/dm37x with speed-binned bit set */ + opp-supported-hw = <0xffffffff 2>; + turbo-mode; + }; + }; + + opp_supply_mpu_iva: opp_supply { + compatible = "ti,omap-opp-supply"; + ti,absolute-max-voltage-uv = <1375000>; }; ocp@68000000 { diff --git a/arch/arm/mach-imx/cpuidle-imx6q.c b/arch/arm/mach-imx/cpuidle-imx6q.c index 39a7d9393641..24dd5bbe60e4 100644 --- a/arch/arm/mach-imx/cpuidle-imx6q.c +++ b/arch/arm/mach-imx/cpuidle-imx6q.c @@ -62,13 +62,13 @@ static struct cpuidle_driver imx6q_cpuidle_driver = { */ void imx6q_cpuidle_fec_irqs_used(void) { - imx6q_cpuidle_driver.states[1].disabled = true; + cpuidle_driver_state_disabled(&imx6q_cpuidle_driver, 1, true); } EXPORT_SYMBOL_GPL(imx6q_cpuidle_fec_irqs_used); void imx6q_cpuidle_fec_irqs_unused(void) { - imx6q_cpuidle_driver.states[1].disabled = false; + cpuidle_driver_state_disabled(&imx6q_cpuidle_driver, 1, false); } EXPORT_SYMBOL_GPL(imx6q_cpuidle_fec_irqs_unused); diff --git a/arch/arm/mach-tegra/cpuidle-tegra20.c b/arch/arm/mach-tegra/cpuidle-tegra20.c index 2447427cb4a8..69f3fa270fbe 100644 --- a/arch/arm/mach-tegra/cpuidle-tegra20.c +++ b/arch/arm/mach-tegra/cpuidle-tegra20.c @@ -203,7 +203,7 @@ void tegra20_cpuidle_pcie_irqs_in_use(void) { pr_info_once( "Disabling cpuidle LP2 state, since PCIe IRQs are in use\n"); - tegra_idle_driver.states[1].disabled = true; + cpuidle_driver_state_disabled(&tegra_idle_driver, 1, true); } int __init tegra20_cpuidle_init(void) diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index ed56c6d20b08..2ae95df2e74f 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -642,6 +642,19 @@ static int acpi_idle_bm_check(void) return bm_status; } +static void wait_for_freeze(void) +{ +#ifdef CONFIG_X86 + /* No delay is needed if we are in guest */ + if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) + return; +#endif + /* Dummy wait op - must do something useless after P_LVL2 read + because chipsets cannot guarantee that STPCLK# signal + gets asserted in time to freeze execution properly. */ + inl(acpi_gbl_FADT.xpm_timer_block.address); +} + /** * acpi_idle_do_entry - enter idle state using the appropriate method * @cx: cstate data @@ -658,10 +671,7 @@ static void __cpuidle acpi_idle_do_entry(struct acpi_processor_cx *cx) } else { /* IO port based C-state */ inb(cx->address); - /* Dummy wait op - must do something useless after P_LVL2 read - because chipsets cannot guarantee that STPCLK# signal - gets asserted in time to freeze execution properly. */ - inl(acpi_gbl_FADT.xpm_timer_block.address); + wait_for_freeze(); } } @@ -682,8 +692,7 @@ static int acpi_idle_play_dead(struct cpuidle_device *dev, int index) safe_halt(); else if (cx->entry_method == ACPI_CSTATE_SYSTEMIO) { inb(cx->address); - /* See comment in acpi_idle_do_entry() */ - inl(acpi_gbl_FADT.xpm_timer_block.address); + wait_for_freeze(); } else return -ENODEV; } diff --git a/drivers/base/power/common.c b/drivers/base/power/common.c index 8db98a1f83dc..bbddb267c2e6 100644 --- a/drivers/base/power/common.c +++ b/drivers/base/power/common.c @@ -188,6 +188,26 @@ void dev_pm_domain_detach(struct device *dev, bool power_off) EXPORT_SYMBOL_GPL(dev_pm_domain_detach); /** + * dev_pm_domain_start - Start the device through its PM domain. + * @dev: Device to start. + * + * This function should typically be called during probe by a subsystem/driver, + * when it needs to start its device from the PM domain's perspective. Note + * that, it's assumed that the PM domain is already powered on when this + * function is called. + * + * Returns 0 on success and negative error values on failures. + */ +int dev_pm_domain_start(struct device *dev) +{ + if (dev->pm_domain && dev->pm_domain->start) + return dev->pm_domain->start(dev); + + return 0; +} +EXPORT_SYMBOL_GPL(dev_pm_domain_start); + +/** * dev_pm_domain_set - Set PM domain of a device. * @dev: Device whose PM domain is to be set. * @pd: PM domain to be set, or NULL. diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c index cc85e87eaf05..8e5725b11ee8 100644 --- a/drivers/base/power/domain.c +++ b/drivers/base/power/domain.c @@ -634,6 +634,13 @@ static int genpd_power_on(struct generic_pm_domain *genpd, unsigned int depth) return ret; } +static int genpd_dev_pm_start(struct device *dev) +{ + struct generic_pm_domain *genpd = dev_to_genpd(dev); + + return genpd_start_dev(genpd, dev); +} + static int genpd_dev_pm_qos_notifier(struct notifier_block *nb, unsigned long val, void *ptr) { @@ -922,24 +929,6 @@ static int __init genpd_power_off_unused(void) } late_initcall(genpd_power_off_unused); -#if defined(CONFIG_PM_SLEEP) || defined(CONFIG_PM_GENERIC_DOMAINS_OF) - -static bool genpd_present(const struct generic_pm_domain *genpd) -{ - const struct generic_pm_domain *gpd; - - if (IS_ERR_OR_NULL(genpd)) - return false; - - list_for_each_entry(gpd, &gpd_list, gpd_list_node) - if (gpd == genpd) - return true; - - return false; -} - -#endif - #ifdef CONFIG_PM_SLEEP /** @@ -1354,8 +1343,8 @@ static void genpd_syscore_switch(struct device *dev, bool suspend) { struct generic_pm_domain *genpd; - genpd = dev_to_genpd(dev); - if (!genpd_present(genpd)) + genpd = dev_to_genpd_safe(dev); + if (!genpd) return; if (suspend) { @@ -1805,6 +1794,7 @@ int pm_genpd_init(struct generic_pm_domain *genpd, genpd->domain.ops.poweroff_noirq = genpd_poweroff_noirq; genpd->domain.ops.restore_noirq = genpd_restore_noirq; genpd->domain.ops.complete = genpd_complete; + genpd->domain.start = genpd_dev_pm_start; if (genpd->flags & GENPD_FLAG_PM_CLK) { genpd->dev_ops.stop = pm_clk_suspend; @@ -2020,6 +2010,16 @@ static int genpd_add_provider(struct device_node *np, genpd_xlate_t xlate, return 0; } +static bool genpd_present(const struct generic_pm_domain *genpd) +{ + const struct generic_pm_domain *gpd; + + list_for_each_entry(gpd, &gpd_list, gpd_list_node) + if (gpd == genpd) + return true; + return false; +} + /** * of_genpd_add_provider_simple() - Register a simple PM domain provider * @np: Device node pointer associated with the PM domain provider. diff --git a/drivers/base/power/power.h b/drivers/base/power/power.h index 39a06a0cfdaa..444f5c169a0b 100644 --- a/drivers/base/power/power.h +++ b/drivers/base/power/power.h @@ -117,6 +117,13 @@ static inline bool device_pm_initialized(struct device *dev) return dev->power.in_dpm_list; } +/* drivers/base/power/wakeup_stats.c */ +extern int wakeup_source_sysfs_add(struct device *parent, + struct wakeup_source *ws); +extern void wakeup_source_sysfs_remove(struct wakeup_source *ws); + +extern int pm_wakeup_source_sysfs_add(struct device *parent); + #else /* !CONFIG_PM_SLEEP */ static inline void device_pm_sleep_init(struct device *dev) {} @@ -141,6 +148,11 @@ static inline bool device_pm_initialized(struct device *dev) return device_is_registered(dev); } +static inline int pm_wakeup_source_sysfs_add(struct device *parent) +{ + return 0; +} + #endif /* !CONFIG_PM_SLEEP */ static inline void device_pm_init(struct device *dev) @@ -149,21 +161,3 @@ static inline void device_pm_init(struct device *dev) device_pm_sleep_init(dev); pm_runtime_init(dev); } - -#ifdef CONFIG_PM_SLEEP - -/* drivers/base/power/wakeup_stats.c */ -extern int wakeup_source_sysfs_add(struct device *parent, - struct wakeup_source *ws); -extern void wakeup_source_sysfs_remove(struct wakeup_source *ws); - -extern int pm_wakeup_source_sysfs_add(struct device *parent); - -#else /* !CONFIG_PM_SLEEP */ - -static inline int pm_wakeup_source_sysfs_add(struct device *parent) -{ - return 0; -} - -#endif /* CONFIG_PM_SLEEP */ diff --git a/drivers/base/power/wakeirq.c b/drivers/base/power/wakeirq.c index 5ce77d1ef9fc..8e021082dba8 100644 --- a/drivers/base/power/wakeirq.c +++ b/drivers/base/power/wakeirq.c @@ -272,7 +272,7 @@ void dev_pm_enable_wake_irq_check(struct device *dev, { struct wake_irq *wirq = dev->power.wakeirq; - if (!wirq || !((wirq->status & WAKE_IRQ_DEDICATED_MASK))) + if (!wirq || !(wirq->status & WAKE_IRQ_DEDICATED_MASK)) return; if (likely(wirq->status & WAKE_IRQ_DEDICATED_MANAGED)) { @@ -299,7 +299,7 @@ void dev_pm_disable_wake_irq_check(struct device *dev) { struct wake_irq *wirq = dev->power.wakeirq; - if (!wirq || !((wirq->status & WAKE_IRQ_DEDICATED_MASK))) + if (!wirq || !(wirq->status & WAKE_IRQ_DEDICATED_MASK)) return; if (wirq->status & WAKE_IRQ_DEDICATED_MANAGED) diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm index a905796f7f85..3858d86cf409 100644 --- a/drivers/cpufreq/Kconfig.arm +++ b/drivers/cpufreq/Kconfig.arm @@ -49,14 +49,6 @@ config ARM_ARMADA_8K_CPUFREQ If in doubt, say N. -# big LITTLE core layer and glue drivers -config ARM_BIG_LITTLE_CPUFREQ - tristate "Generic ARM big LITTLE CPUfreq driver" - depends on ARM_CPU_TOPOLOGY && HAVE_CLK - select PM_OPP - help - This enables the Generic CPUfreq driver for ARM big.LITTLE platforms. - config ARM_SCPI_CPUFREQ tristate "SCPI based CPUfreq driver" depends on ARM_SCPI_PROTOCOL && COMMON_CLK_SCPI @@ -69,7 +61,9 @@ config ARM_SCPI_CPUFREQ config ARM_VEXPRESS_SPC_CPUFREQ tristate "Versatile Express SPC based CPUfreq driver" - depends on ARM_BIG_LITTLE_CPUFREQ && ARCH_VEXPRESS_SPC + depends on ARM_CPU_TOPOLOGY && HAVE_CLK + depends on ARCH_VEXPRESS_SPC + select PM_OPP help This add the CPUfreq driver support for Versatile Express big.LITTLE platforms using SPC for power management. diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile index 9a9f5ccd13d9..f6670c4abbb0 100644 --- a/drivers/cpufreq/Makefile +++ b/drivers/cpufreq/Makefile @@ -47,8 +47,6 @@ obj-$(CONFIG_X86_SFI_CPUFREQ) += sfi-cpufreq.o ################################################################################## # ARM SoC drivers -obj-$(CONFIG_ARM_BIG_LITTLE_CPUFREQ) += arm_big_little.o - obj-$(CONFIG_ARM_ARMADA_37XX_CPUFREQ) += armada-37xx-cpufreq.o obj-$(CONFIG_ARM_ARMADA_8K_CPUFREQ) += armada-8k-cpufreq.o obj-$(CONFIG_ARM_BRCMSTB_AVS_CPUFREQ) += brcmstb-avs-cpufreq.o diff --git a/drivers/cpufreq/arm_big_little.c b/drivers/cpufreq/arm_big_little.c deleted file mode 100644 index 7fe52fcddcf1..000000000000 --- a/drivers/cpufreq/arm_big_little.c +++ /dev/null @@ -1,658 +0,0 @@ -/* - * ARM big.LITTLE Platforms CPUFreq support - * - * Copyright (C) 2013 ARM Ltd. - * Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> - * - * Copyright (C) 2013 Linaro. - * Viresh Kumar <viresh.kumar@linaro.org> - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - * - * This program is distributed "as is" WITHOUT ANY WARRANTY of any - * kind, whether express or implied; without even the implied warranty - * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - */ - -#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt - -#include <linux/clk.h> -#include <linux/cpu.h> -#include <linux/cpufreq.h> -#include <linux/cpumask.h> -#include <linux/cpu_cooling.h> -#include <linux/export.h> -#include <linux/module.h> -#include <linux/mutex.h> -#include <linux/of_platform.h> -#include <linux/pm_opp.h> -#include <linux/slab.h> -#include <linux/topology.h> -#include <linux/types.h> - -#include "arm_big_little.h" - -/* Currently we support only two clusters */ -#define A15_CLUSTER 0 -#define A7_CLUSTER 1 -#define MAX_CLUSTERS 2 - -#ifdef CONFIG_BL_SWITCHER -#include <asm/bL_switcher.h> -static bool bL_switching_enabled; -#define is_bL_switching_enabled() bL_switching_enabled -#define set_switching_enabled(x) (bL_switching_enabled = (x)) -#else -#define is_bL_switching_enabled() false -#define set_switching_enabled(x) do { } while (0) -#define bL_switch_request(...) do { } while (0) -#define bL_switcher_put_enabled() do { } while (0) -#define bL_switcher_get_enabled() do { } while (0) -#endif - -#define ACTUAL_FREQ(cluster, freq) ((cluster == A7_CLUSTER) ? freq << 1 : freq) -#define VIRT_FREQ(cluster, freq) ((cluster == A7_CLUSTER) ? freq >> 1 : freq) - -static struct thermal_cooling_device *cdev[MAX_CLUSTERS]; -static const struct cpufreq_arm_bL_ops *arm_bL_ops; -static struct clk *clk[MAX_CLUSTERS]; -static struct cpufreq_frequency_table *freq_table[MAX_CLUSTERS + 1]; -static atomic_t cluster_usage[MAX_CLUSTERS + 1]; - -static unsigned int clk_big_min; /* (Big) clock frequencies */ -static unsigned int clk_little_max; /* Maximum clock frequency (Little) */ - -static DEFINE_PER_CPU(unsigned int, physical_cluster); -static DEFINE_PER_CPU(unsigned int, cpu_last_req_freq); - -static struct mutex cluster_lock[MAX_CLUSTERS]; - -static inline int raw_cpu_to_cluster(int cpu) -{ - return topology_physical_package_id(cpu); -} - -static inline int cpu_to_cluster(int cpu) -{ - return is_bL_switching_enabled() ? - MAX_CLUSTERS : raw_cpu_to_cluster(cpu); -} - -static unsigned int find_cluster_maxfreq(int cluster) -{ - int j; - u32 max_freq = 0, cpu_freq; - - for_each_online_cpu(j) { - cpu_freq = per_cpu(cpu_last_req_freq, j); - - if ((cluster == per_cpu(physical_cluster, j)) && - (max_freq < cpu_freq)) - max_freq = cpu_freq; - } - - pr_debug("%s: cluster: %d, max freq: %d\n", __func__, cluster, - max_freq); - - return max_freq; -} - -static unsigned int clk_get_cpu_rate(unsigned int cpu) -{ - u32 cur_cluster = per_cpu(physical_cluster, cpu); - u32 rate = clk_get_rate(clk[cur_cluster]) / 1000; - - /* For switcher we use virtual A7 clock rates */ - if (is_bL_switching_enabled()) - rate = VIRT_FREQ(cur_cluster, rate); - - pr_debug("%s: cpu: %d, cluster: %d, freq: %u\n", __func__, cpu, - cur_cluster, rate); - - return rate; -} - -static unsigned int bL_cpufreq_get_rate(unsigned int cpu) -{ - if (is_bL_switching_enabled()) { - pr_debug("%s: freq: %d\n", __func__, per_cpu(cpu_last_req_freq, - cpu)); - - return per_cpu(cpu_last_req_freq, cpu); - } else { - return clk_get_cpu_rate(cpu); - } -} - -static unsigned int -bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate) -{ - u32 new_rate, prev_rate; - int ret; - bool bLs = is_bL_switching_enabled(); - - mutex_lock(&cluster_lock[new_cluster]); - - if (bLs) { - prev_rate = per_cpu(cpu_last_req_freq, cpu); - per_cpu(cpu_last_req_freq, cpu) = rate; - per_cpu(physical_cluster, cpu) = new_cluster; - - new_rate = find_cluster_maxfreq(new_cluster); - new_rate = ACTUAL_FREQ(new_cluster, new_rate); - } else { - new_rate = rate; - } - - pr_debug("%s: cpu: %d, old cluster: %d, new cluster: %d, freq: %d\n", - __func__, cpu, old_cluster, new_cluster, new_rate); - - ret = clk_set_rate(clk[new_cluster], new_rate * 1000); - if (!ret) { - /* - * FIXME: clk_set_rate hasn't returned an error here however it - * may be that clk_change_rate failed due to hardware or - * firmware issues and wasn't able to report that due to the - * current design of the clk core layer. To work around this - * problem we will read back the clock rate and check it is - * correct. This needs to be removed once clk core is fixed. - */ - if (clk_get_rate(clk[new_cluster]) != new_rate * 1000) - ret = -EIO; - } - - if (WARN_ON(ret)) { - pr_err("clk_set_rate failed: %d, new cluster: %d\n", ret, - new_cluster); - if (bLs) { - per_cpu(cpu_last_req_freq, cpu) = prev_rate; - per_cpu(physical_cluster, cpu) = old_cluster; - } - - mutex_unlock(&cluster_lock[new_cluster]); - - return ret; - } - - mutex_unlock(&cluster_lock[new_cluster]); - - /* Recalc freq for old cluster when switching clusters */ - if (old_cluster != new_cluster) { - pr_debug("%s: cpu: %d, old cluster: %d, new cluster: %d\n", - __func__, cpu, old_cluster, new_cluster); - - /* Switch cluster */ - bL_switch_request(cpu, new_cluster); - - mutex_lock(&cluster_lock[old_cluster]); - - /* Set freq of old cluster if there are cpus left on it */ - new_rate = find_cluster_maxfreq(old_cluster); - new_rate = ACTUAL_FREQ(old_cluster, new_rate); - - if (new_rate) { - pr_debug("%s: Updating rate of old cluster: %d, to freq: %d\n", - __func__, old_cluster, new_rate); - - if (clk_set_rate(clk[old_cluster], new_rate * 1000)) - pr_err("%s: clk_set_rate failed: %d, old cluster: %d\n", - __func__, ret, old_cluster); - } - mutex_unlock(&cluster_lock[old_cluster]); - } - - return 0; -} - -/* Set clock frequency */ -static int bL_cpufreq_set_target(struct cpufreq_policy *policy, - unsigned int index) -{ - u32 cpu = policy->cpu, cur_cluster, new_cluster, actual_cluster; - unsigned int freqs_new; - int ret; - - cur_cluster = cpu_to_cluster(cpu); - new_cluster = actual_cluster = per_cpu(physical_cluster, cpu); - - freqs_new = freq_table[cur_cluster][index].frequency; - - if (is_bL_switching_enabled()) { - if ((actual_cluster == A15_CLUSTER) && - (freqs_new < clk_big_min)) { - new_cluster = A7_CLUSTER; - } else if ((actual_cluster == A7_CLUSTER) && - (freqs_new > clk_little_max)) { - new_cluster = A15_CLUSTER; - } - } - - ret = bL_cpufreq_set_rate(cpu, actual_cluster, new_cluster, freqs_new); - - if (!ret) { - arch_set_freq_scale(policy->related_cpus, freqs_new, - policy->cpuinfo.max_freq); - } - - return ret; -} - -static inline u32 get_table_count(struct cpufreq_frequency_table *table) -{ - int count; - - for (count = 0; table[count].frequency != CPUFREQ_TABLE_END; count++) - ; - - return count; -} - -/* get the minimum frequency in the cpufreq_frequency_table */ -static inline u32 get_table_min(struct cpufreq_frequency_table *table) -{ - struct cpufreq_frequency_table *pos; - uint32_t min_freq = ~0; - cpufreq_for_each_entry(pos, table) - if (pos->frequency < min_freq) - min_freq = pos->frequency; - return min_freq; -} - -/* get the maximum frequency in the cpufreq_frequency_table */ -static inline u32 get_table_max(struct cpufreq_frequency_table *table) -{ - struct cpufreq_frequency_table *pos; - uint32_t max_freq = 0; - cpufreq_for_each_entry(pos, table) - if (pos->frequency > max_freq) - max_freq = pos->frequency; - return max_freq; -} - -static int merge_cluster_tables(void) -{ - int i, j, k = 0, count = 1; - struct cpufreq_frequency_table *table; - - for (i = 0; i < MAX_CLUSTERS; i++) - count += get_table_count(freq_table[i]); - - table = kcalloc(count, sizeof(*table), GFP_KERNEL); - if (!table) - return -ENOMEM; - - freq_table[MAX_CLUSTERS] = table; - - /* Add in reverse order to get freqs in increasing order */ - for (i = MAX_CLUSTERS - 1; i >= 0; i--) { - for (j = 0; freq_table[i][j].frequency != CPUFREQ_TABLE_END; - j++) { - table[k].frequency = VIRT_FREQ(i, - freq_table[i][j].frequency); - pr_debug("%s: index: %d, freq: %d\n", __func__, k, - table[k].frequency); - k++; - } - } - - table[k].driver_data = k; - table[k].frequency = CPUFREQ_TABLE_END; - - pr_debug("%s: End, table: %p, count: %d\n", __func__, table, k); - - return 0; -} - -static void _put_cluster_clk_and_freq_table(struct device *cpu_dev, - const struct cpumask *cpumask) -{ - u32 cluster = raw_cpu_to_cluster(cpu_dev->id); - - if (!freq_table[cluster]) - return; - - clk_put(clk[cluster]); - dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table[cluster]); - if (arm_bL_ops->free_opp_table) - arm_bL_ops->free_opp_table(cpumask); - dev_dbg(cpu_dev, "%s: cluster: %d\n", __func__, cluster); -} - -static void put_cluster_clk_and_freq_table(struct device *cpu_dev, - const struct cpumask *cpumask) -{ - u32 cluster = cpu_to_cluster(cpu_dev->id); - int i; - - if (atomic_dec_return(&cluster_usage[cluster])) - return; - - if (cluster < MAX_CLUSTERS) - return _put_cluster_clk_and_freq_table(cpu_dev, cpumask); - - for_each_present_cpu(i) { - struct device *cdev = get_cpu_device(i); - if (!cdev) { - pr_err("%s: failed to get cpu%d device\n", __func__, i); - return; - } - - _put_cluster_clk_and_freq_table(cdev, cpumask); - } - - /* free virtual table */ - kfree(freq_table[cluster]); -} - -static int _get_cluster_clk_and_freq_table(struct device *cpu_dev, - const struct cpumask *cpumask) -{ - u32 cluster = raw_cpu_to_cluster(cpu_dev->id); - int ret; - - if (freq_table[cluster]) - return 0; - - ret = arm_bL_ops->init_opp_table(cpumask); - if (ret) { - dev_err(cpu_dev, "%s: init_opp_table failed, cpu: %d, err: %d\n", - __func__, cpu_dev->id, ret); - goto out; - } - - ret = dev_pm_opp_init_cpufreq_table(cpu_dev, &freq_table[cluster]); - if (ret) { - dev_err(cpu_dev, "%s: failed to init cpufreq table, cpu: %d, err: %d\n", - __func__, cpu_dev->id, ret); - goto free_opp_table; - } - - clk[cluster] = clk_get(cpu_dev, NULL); - if (!IS_ERR(clk[cluster])) { - dev_dbg(cpu_dev, "%s: clk: %p & freq table: %p, cluster: %d\n", - __func__, clk[cluster], freq_table[cluster], - cluster); - return 0; - } - - dev_err(cpu_dev, "%s: Failed to get clk for cpu: %d, cluster: %d\n", - __func__, cpu_dev->id, cluster); - ret = PTR_ERR(clk[cluster]); - dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table[cluster]); - -free_opp_table: - if (arm_bL_ops->free_opp_table) - arm_bL_ops->free_opp_table(cpumask); -out: - dev_err(cpu_dev, "%s: Failed to get data for cluster: %d\n", __func__, - cluster); - return ret; -} - -static int get_cluster_clk_and_freq_table(struct device *cpu_dev, - const struct cpumask *cpumask) -{ - u32 cluster = cpu_to_cluster(cpu_dev->id); - int i, ret; - - if (atomic_inc_return(&cluster_usage[cluster]) != 1) - return 0; - - if (cluster < MAX_CLUSTERS) { - ret = _get_cluster_clk_and_freq_table(cpu_dev, cpumask); - if (ret) - atomic_dec(&cluster_usage[cluster]); - return ret; - } - - /* - * Get data for all clusters and fill virtual cluster with a merge of - * both - */ - for_each_present_cpu(i) { - struct device *cdev = get_cpu_device(i); - if (!cdev) { - pr_err("%s: failed to get cpu%d device\n", __func__, i); - return -ENODEV; - } - - ret = _get_cluster_clk_and_freq_table(cdev, cpumask); - if (ret) - goto put_clusters; - } - - ret = merge_cluster_tables(); - if (ret) - goto put_clusters; - - /* Assuming 2 cluster, set clk_big_min and clk_little_max */ - clk_big_min = get_table_min(freq_table[0]); - clk_little_max = VIRT_FREQ(1, get_table_max(freq_table[1])); - - pr_debug("%s: cluster: %d, clk_big_min: %d, clk_little_max: %d\n", - __func__, cluster, clk_big_min, clk_little_max); - - return 0; - -put_clusters: - for_each_present_cpu(i) { - struct device *cdev = get_cpu_device(i); - if (!cdev) { - pr_err("%s: failed to get cpu%d device\n", __func__, i); - return -ENODEV; - } - - _put_cluster_clk_and_freq_table(cdev, cpumask); - } - - atomic_dec(&cluster_usage[cluster]); - - return ret; -} - -/* Per-CPU initialization */ -static int bL_cpufreq_init(struct cpufreq_policy *policy) -{ - u32 cur_cluster = cpu_to_cluster(policy->cpu); - struct device *cpu_dev; - int ret; - - cpu_dev = get_cpu_device(policy->cpu); - if (!cpu_dev) { - pr_err("%s: failed to get cpu%d device\n", __func__, - policy->cpu); - return -ENODEV; - } - - if (cur_cluster < MAX_CLUSTERS) { - int cpu; - - cpumask_copy(policy->cpus, topology_core_cpumask(policy->cpu)); - - for_each_cpu(cpu, policy->cpus) - per_cpu(physical_cluster, cpu) = cur_cluster; - } else { - /* Assumption: during init, we are always running on A15 */ - per_cpu(physical_cluster, policy->cpu) = A15_CLUSTER; - } - - ret = get_cluster_clk_and_freq_table(cpu_dev, policy->cpus); - if (ret) - return ret; - - policy->freq_table = freq_table[cur_cluster]; - policy->cpuinfo.transition_latency = - arm_bL_ops->get_transition_latency(cpu_dev); - - dev_pm_opp_of_register_em(policy->cpus); - - if (is_bL_switching_enabled()) - per_cpu(cpu_last_req_freq, policy->cpu) = clk_get_cpu_rate(policy->cpu); - - dev_info(cpu_dev, "%s: CPU %d initialized\n", __func__, policy->cpu); - return 0; -} - -static int bL_cpufreq_exit(struct cpufreq_policy *policy) -{ - struct device *cpu_dev; - int cur_cluster = cpu_to_cluster(policy->cpu); - - if (cur_cluster < MAX_CLUSTERS) { - cpufreq_cooling_unregister(cdev[cur_cluster]); - cdev[cur_cluster] = NULL; - } - - cpu_dev = get_cpu_device(policy->cpu); - if (!cpu_dev) { - pr_err("%s: failed to get cpu%d device\n", __func__, - policy->cpu); - return -ENODEV; - } - - put_cluster_clk_and_freq_table(cpu_dev, policy->related_cpus); - dev_dbg(cpu_dev, "%s: Exited, cpu: %d\n", __func__, policy->cpu); - - return 0; -} - -static void bL_cpufreq_ready(struct cpufreq_policy *policy) -{ - int cur_cluster = cpu_to_cluster(policy->cpu); - - /* Do not register a cpu_cooling device if we are in IKS mode */ - if (cur_cluster >= MAX_CLUSTERS) - return; - - cdev[cur_cluster] = of_cpufreq_cooling_register(policy); -} - -static struct cpufreq_driver bL_cpufreq_driver = { - .name = "arm-big-little", - .flags = CPUFREQ_STICKY | - CPUFREQ_HAVE_GOVERNOR_PER_POLICY | - CPUFREQ_NEED_INITIAL_FREQ_CHECK, - .verify = cpufreq_generic_frequency_table_verify, - .target_index = bL_cpufreq_set_target, - .get = bL_cpufreq_get_rate, - .init = bL_cpufreq_init, - .exit = bL_cpufreq_exit, - .ready = bL_cpufreq_ready, - .attr = cpufreq_generic_attr, -}; - -#ifdef CONFIG_BL_SWITCHER -static int bL_cpufreq_switcher_notifier(struct notifier_block *nfb, - unsigned long action, void *_arg) -{ - pr_debug("%s: action: %ld\n", __func__, action); - - switch (action) { - case BL_NOTIFY_PRE_ENABLE: - case BL_NOTIFY_PRE_DISABLE: - cpufreq_unregister_driver(&bL_cpufreq_driver); - break; - - case BL_NOTIFY_POST_ENABLE: - set_switching_enabled(true); - cpufreq_register_driver(&bL_cpufreq_driver); - break; - - case BL_NOTIFY_POST_DISABLE: - set_switching_enabled(false); - cpufreq_register_driver(&bL_cpufreq_driver); - break; - - default: - return NOTIFY_DONE; - } - - return NOTIFY_OK; -} - -static struct notifier_block bL_switcher_notifier = { - .notifier_call = bL_cpufreq_switcher_notifier, -}; - -static int __bLs_register_notifier(void) -{ - return bL_switcher_register_notifier(&bL_switcher_notifier); -} - -static int __bLs_unregister_notifier(void) -{ - return bL_switcher_unregister_notifier(&bL_switcher_notifier); -} -#else -static int __bLs_register_notifier(void) { return 0; } -static int __bLs_unregister_notifier(void) { return 0; } -#endif - -int bL_cpufreq_register(const struct cpufreq_arm_bL_ops *ops) -{ - int ret, i; - - if (arm_bL_ops) { - pr_debug("%s: Already registered: %s, exiting\n", __func__, - arm_bL_ops->name); - return -EBUSY; - } - - if (!ops || !strlen(ops->name) || !ops->init_opp_table || - !ops->get_transition_latency) { - pr_err("%s: Invalid arm_bL_ops, exiting\n", __func__); - return -ENODEV; - } - - arm_bL_ops = ops; - - set_switching_enabled(bL_switcher_get_enabled()); - - for (i = 0; i < MAX_CLUSTERS; i++) - mutex_init(&cluster_lock[i]); - - ret = cpufreq_register_driver(&bL_cpufreq_driver); - if (ret) { - pr_info("%s: Failed registering platform driver: %s, err: %d\n", - __func__, ops->name, ret); - arm_bL_ops = NULL; - } else { - ret = __bLs_register_notifier(); - if (ret) { - cpufreq_unregister_driver(&bL_cpufreq_driver); - arm_bL_ops = NULL; - } else { - pr_info("%s: Registered platform driver: %s\n", - __func__, ops->name); - } - } - - bL_switcher_put_enabled(); - return ret; -} -EXPORT_SYMBOL_GPL(bL_cpufreq_register); - -void bL_cpufreq_unregister(const struct cpufreq_arm_bL_ops *ops) -{ - if (arm_bL_ops != ops) { - pr_err("%s: Registered with: %s, can't unregister, exiting\n", - __func__, arm_bL_ops->name); - return; - } - - bL_switcher_get_enabled(); - __bLs_unregister_notifier(); - cpufreq_unregister_driver(&bL_cpufreq_driver); - bL_switcher_put_enabled(); - pr_info("%s: Un-registered platform driver: %s\n", __func__, - arm_bL_ops->name); - arm_bL_ops = NULL; -} -EXPORT_SYMBOL_GPL(bL_cpufreq_unregister); - -MODULE_AUTHOR("Viresh Kumar <viresh.kumar@linaro.org>"); -MODULE_DESCRIPTION("Generic ARM big LITTLE cpufreq driver"); -MODULE_LICENSE("GPL v2"); diff --git a/drivers/cpufreq/arm_big_little.h b/drivers/cpufreq/arm_big_little.h deleted file mode 100644 index 88a176e466c8..000000000000 --- a/drivers/cpufreq/arm_big_little.h +++ /dev/null @@ -1,43 +0,0 @@ -/* - * ARM big.LITTLE platform's CPUFreq header file - * - * Copyright (C) 2013 ARM Ltd. - * Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> - * - * Copyright (C) 2013 Linaro. - * Viresh Kumar <viresh.kumar@linaro.org> - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - * - * This program is distributed "as is" WITHOUT ANY WARRANTY of any - * kind, whether express or implied; without even the implied warranty - * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - */ -#ifndef CPUFREQ_ARM_BIG_LITTLE_H -#define CPUFREQ_ARM_BIG_LITTLE_H - -#include <linux/cpufreq.h> -#include <linux/device.h> -#include <linux/types.h> - -struct cpufreq_arm_bL_ops { - char name[CPUFREQ_NAME_LEN]; - - /* - * This must set opp table for cpu_dev in a similar way as done by - * dev_pm_opp_of_add_table(). - */ - int (*init_opp_table)(const struct cpumask *cpumask); - - /* Optional */ - int (*get_transition_latency)(struct device *cpu_dev); - void (*free_opp_table)(const struct cpumask *cpumask); -}; - -int bL_cpufreq_register(const struct cpufreq_arm_bL_ops *ops); -void bL_cpufreq_unregister(const struct cpufreq_arm_bL_ops *ops); - -#endif /* CPUFREQ_ARM_BIG_LITTLE_H */ diff --git a/drivers/cpufreq/cpufreq-dt-platdev.c b/drivers/cpufreq/cpufreq-dt-platdev.c index bca8d1f47fd2..54bc76743b1f 100644 --- a/drivers/cpufreq/cpufreq-dt-platdev.c +++ b/drivers/cpufreq/cpufreq-dt-platdev.c @@ -86,7 +86,6 @@ static const struct of_device_id whitelist[] __initconst = { { .compatible = "st-ericsson,u9540", }, { .compatible = "ti,omap2", }, - { .compatible = "ti,omap3", }, { .compatible = "ti,omap4", }, { .compatible = "ti,omap5", }, @@ -137,6 +136,7 @@ static const struct of_device_id blacklist[] __initconst = { { .compatible = "ti,am33xx", }, { .compatible = "ti,am43", }, { .compatible = "ti,dra7", }, + { .compatible = "ti,omap3", }, { } }; diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index ee23eaf20f35..77114a3897fb 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -936,6 +936,9 @@ static ssize_t show(struct kobject *kobj, struct attribute *attr, char *buf) struct freq_attr *fattr = to_attr(attr); ssize_t ret; + if (!fattr->show) + return -EIO; + down_read(&policy->rwsem); ret = fattr->show(policy, buf); up_read(&policy->rwsem); @@ -950,6 +953,9 @@ static ssize_t store(struct kobject *kobj, struct attribute *attr, struct freq_attr *fattr = to_attr(attr); ssize_t ret = -EINVAL; + if (!fattr->store) + return -EIO; + /* * cpus_read_trylock() is used here to work around a circular lock * dependency problem with respect to the cpufreq_register_driver(). @@ -2388,7 +2394,10 @@ int cpufreq_set_policy(struct cpufreq_policy *policy, new_policy->min = freq_qos_read_value(&policy->constraints, FREQ_QOS_MIN); new_policy->max = freq_qos_read_value(&policy->constraints, FREQ_QOS_MAX); - /* verify the cpu speed can be set within this limit */ + /* + * Verify that the CPU speed can be set within these limits and make sure + * that min <= max. + */ ret = cpufreq_driver->verify(new_policy); if (ret) return ret; @@ -2631,6 +2640,13 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data) if (cpufreq_disabled()) return -ENODEV; + /* + * The cpufreq core depends heavily on the availability of device + * structure, make sure they are available before proceeding further. + */ + if (!get_cpu_device(0)) + return -EPROBE_DEFER; + if (!driver_data || !driver_data->verify || !driver_data->init || !(driver_data->setpolicy || driver_data->target_index || driver_data->target) || diff --git a/drivers/cpufreq/imx-cpufreq-dt.c b/drivers/cpufreq/imx-cpufreq-dt.c index 35db14cf3102..85a6efd6b68f 100644 --- a/drivers/cpufreq/imx-cpufreq-dt.c +++ b/drivers/cpufreq/imx-cpufreq-dt.c @@ -44,19 +44,19 @@ static int imx_cpufreq_dt_probe(struct platform_device *pdev) mkt_segment = (cell_value & OCOTP_CFG3_MKT_SEGMENT_MASK) >> OCOTP_CFG3_MKT_SEGMENT_SHIFT; /* - * Early samples without fuses written report "0 0" which means - * consumer segment and minimum speed grading. - * - * According to datasheet minimum speed grading is not supported for - * consumer parts so clamp to 1 to avoid warning for "no OPPs" + * Early samples without fuses written report "0 0" which may NOT + * match any OPP defined in DT. So clamp to minimum OPP defined in + * DT to avoid warning for "no OPPs". * * Applies to i.MX8M series SoCs. */ - if (mkt_segment == 0 && speed_grade == 0 && ( - of_machine_is_compatible("fsl,imx8mm") || - of_machine_is_compatible("fsl,imx8mn") || - of_machine_is_compatible("fsl,imx8mq"))) - speed_grade = 1; + if (mkt_segment == 0 && speed_grade == 0) { + if (of_machine_is_compatible("fsl,imx8mm") || + of_machine_is_compatible("fsl,imx8mq")) + speed_grade = 1; + if (of_machine_is_compatible("fsl,imx8mn")) + speed_grade = 0xb; + } supported_hw[0] = BIT(speed_grade); supported_hw[1] = BIT(mkt_segment); diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c index 8ab31702cf6a..d2fa3e9ccd97 100644 --- a/drivers/cpufreq/intel_pstate.c +++ b/drivers/cpufreq/intel_pstate.c @@ -2662,21 +2662,21 @@ enum { /* Hardware vendor-specific info that has its own power management modes */ static struct acpi_platform_list plat_info[] __initdata = { - {"HP ", "ProLiant", 0, ACPI_SIG_FADT, all_versions, 0, PSS}, - {"ORACLE", "X4-2 ", 0, ACPI_SIG_FADT, all_versions, 0, PPC}, - {"ORACLE", "X4-2L ", 0, ACPI_SIG_FADT, all_versions, 0, PPC}, - {"ORACLE", "X4-2B ", 0, ACPI_SIG_FADT, all_versions, 0, PPC}, - {"ORACLE", "X3-2 ", 0, ACPI_SIG_FADT, all_versions, 0, PPC}, - {"ORACLE", "X3-2L ", 0, ACPI_SIG_FADT, all_versions, 0, PPC}, - {"ORACLE", "X3-2B ", 0, ACPI_SIG_FADT, all_versions, 0, PPC}, - {"ORACLE", "X4470M2 ", 0, ACPI_SIG_FADT, all_versions, 0, PPC}, - {"ORACLE", "X4270M3 ", 0, ACPI_SIG_FADT, all_versions, 0, PPC}, - {"ORACLE", "X4270M2 ", 0, ACPI_SIG_FADT, all_versions, 0, PPC}, - {"ORACLE", "X4170M2 ", 0, ACPI_SIG_FADT, all_versions, 0, PPC}, - {"ORACLE", "X4170 M3", 0, ACPI_SIG_FADT, all_versions, 0, PPC}, - {"ORACLE", "X4275 M3", 0, ACPI_SIG_FADT, all_versions, 0, PPC}, - {"ORACLE", "X6-2 ", 0, ACPI_SIG_FADT, all_versions, 0, PPC}, - {"ORACLE", "Sudbury ", 0, ACPI_SIG_FADT, all_versions, 0, PPC}, + {"HP ", "ProLiant", 0, ACPI_SIG_FADT, all_versions, NULL, PSS}, + {"ORACLE", "X4-2 ", 0, ACPI_SIG_FADT, all_versions, NULL, PPC}, + {"ORACLE", "X4-2L ", 0, ACPI_SIG_FADT, all_versions, NULL, PPC}, + {"ORACLE", "X4-2B ", 0, ACPI_SIG_FADT, all_versions, NULL, PPC}, + {"ORACLE", "X3-2 ", 0, ACPI_SIG_FADT, all_versions, NULL, PPC}, + {"ORACLE", "X3-2L ", 0, ACPI_SIG_FADT, all_versions, NULL, PPC}, + {"ORACLE", "X3-2B ", 0, ACPI_SIG_FADT, all_versions, NULL, PPC}, + {"ORACLE", "X4470M2 ", 0, ACPI_SIG_FADT, all_versions, NULL, PPC}, + {"ORACLE", "X4270M3 ", 0, ACPI_SIG_FADT, all_versions, NULL, PPC}, + {"ORACLE", "X4270M2 ", 0, ACPI_SIG_FADT, all_versions, NULL, PPC}, + {"ORACLE", "X4170M2 ", 0, ACPI_SIG_FADT, all_versions, NULL, PPC}, + {"ORACLE", "X4170 M3", 0, ACPI_SIG_FADT, all_versions, NULL, PPC}, + {"ORACLE", "X4275 M3", 0, ACPI_SIG_FADT, all_versions, NULL, PPC}, + {"ORACLE", "X6-2 ", 0, ACPI_SIG_FADT, all_versions, NULL, PPC}, + {"ORACLE", "Sudbury ", 0, ACPI_SIG_FADT, all_versions, NULL, PPC}, { } /* End */ }; diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c index 6061850e59c9..56f4bc0d209e 100644 --- a/drivers/cpufreq/powernv-cpufreq.c +++ b/drivers/cpufreq/powernv-cpufreq.c @@ -1041,9 +1041,14 @@ static struct cpufreq_driver powernv_cpufreq_driver = { static int init_chip_info(void) { - unsigned int chip[256]; + unsigned int *chip; unsigned int cpu, i; unsigned int prev_chip_id = UINT_MAX; + int ret = 0; + + chip = kcalloc(num_possible_cpus(), sizeof(*chip), GFP_KERNEL); + if (!chip) + return -ENOMEM; for_each_possible_cpu(cpu) { unsigned int id = cpu_to_chip_id(cpu); @@ -1055,8 +1060,10 @@ static int init_chip_info(void) } chips = kcalloc(nr_chips, sizeof(struct chip), GFP_KERNEL); - if (!chips) - return -ENOMEM; + if (!chips) { + ret = -ENOMEM; + goto free_and_return; + } for (i = 0; i < nr_chips; i++) { chips[i].id = chip[i]; @@ -1066,7 +1073,9 @@ static int init_chip_info(void) per_cpu(chip_info, cpu) = &chips[i]; } - return 0; +free_and_return: + kfree(chip); + return ret; } static inline void clean_chip_info(void) diff --git a/drivers/cpufreq/s3c64xx-cpufreq.c b/drivers/cpufreq/s3c64xx-cpufreq.c index af0c00dabb22..c6bdfc308e99 100644 --- a/drivers/cpufreq/s3c64xx-cpufreq.c +++ b/drivers/cpufreq/s3c64xx-cpufreq.c @@ -19,7 +19,6 @@ static struct regulator *vddarm; static unsigned long regulator_latency; -#ifdef CONFIG_CPU_S3C6410 struct s3c64xx_dvfs { unsigned int vddarm_min; unsigned int vddarm_max; @@ -48,7 +47,6 @@ static struct cpufreq_frequency_table s3c64xx_freq_table[] = { { 0, 4, 800000 }, { 0, 0, CPUFREQ_TABLE_END }, }; -#endif static int s3c64xx_cpufreq_set_target(struct cpufreq_policy *policy, unsigned int index) @@ -149,11 +147,6 @@ static int s3c64xx_cpufreq_driver_init(struct cpufreq_policy *policy) if (policy->cpu != 0) return -EINVAL; - if (s3c64xx_freq_table == NULL) { - pr_err("No frequency information for this CPU\n"); - return -ENODEV; - } - policy->clk = clk_get(NULL, "armclk"); if (IS_ERR(policy->clk)) { pr_err("Unable to obtain ARMCLK: %ld\n", diff --git a/drivers/cpufreq/scpi-cpufreq.c b/drivers/cpufreq/scpi-cpufreq.c index 2b51e0718c9f..20d1f85d5f5a 100644 --- a/drivers/cpufreq/scpi-cpufreq.c +++ b/drivers/cpufreq/scpi-cpufreq.c @@ -1,8 +1,6 @@ /* * System Control and Power Interface (SCPI) based CPUFreq Interface driver * - * It provides necessary ops to arm_big_little cpufreq driver. - * * Copyright (C) 2015 ARM Ltd. * Sudeep Holla <sudeep.holla@arm.com> * diff --git a/drivers/cpufreq/sun50i-cpufreq-nvmem.c b/drivers/cpufreq/sun50i-cpufreq-nvmem.c index eca32e443716..9907a165135b 100644 --- a/drivers/cpufreq/sun50i-cpufreq-nvmem.c +++ b/drivers/cpufreq/sun50i-cpufreq-nvmem.c @@ -25,7 +25,7 @@ static struct platform_device *cpufreq_dt_pdev, *sun50i_cpufreq_pdev; /** - * sun50i_cpufreq_get_efuse() - Parse and return efuse value present on SoC + * sun50i_cpufreq_get_efuse() - Determine speed grade from efuse value * @versions: Set to the value parsed from efuse * * Returns 0 if success. @@ -69,21 +69,16 @@ static int sun50i_cpufreq_get_efuse(u32 *versions) return PTR_ERR(speedbin); efuse_value = (*speedbin >> NVMEM_SHIFT) & NVMEM_MASK; - switch (efuse_value) { - case 0b0001: - *versions = 1; - break; - case 0b0011: - *versions = 2; - break; - default: - /* - * For other situations, we treat it as bin0. - * This vf table can be run for any good cpu. - */ + + /* + * We treat unexpected efuse values as if the SoC was from + * the slowest bin. Expected efuse values are 1-3, slowest + * to fastest. + */ + if (efuse_value >= 1 && efuse_value <= 3) + *versions = efuse_value - 1; + else *versions = 0; - break; - } kfree(speedbin); return 0; diff --git a/drivers/cpufreq/ti-cpufreq.c b/drivers/cpufreq/ti-cpufreq.c index aeaa883a8c9d..557cb513bf7f 100644 --- a/drivers/cpufreq/ti-cpufreq.c +++ b/drivers/cpufreq/ti-cpufreq.c @@ -31,11 +31,17 @@ #define DRA7_EFUSE_OD_MPU_OPP BIT(1) #define DRA7_EFUSE_HIGH_MPU_OPP BIT(2) +#define OMAP3_CONTROL_DEVICE_STATUS 0x4800244C +#define OMAP3_CONTROL_IDCODE 0x4830A204 +#define OMAP34xx_ProdID_SKUID 0x4830A20C +#define OMAP3_SYSCON_BASE (0x48000000 + 0x2000 + 0x270) + #define VERSION_COUNT 2 struct ti_cpufreq_data; struct ti_cpufreq_soc_data { + const char * const *reg_names; unsigned long (*efuse_xlate)(struct ti_cpufreq_data *opp_data, unsigned long efuse); unsigned long efuse_fallback; @@ -85,6 +91,13 @@ static unsigned long dra7_efuse_xlate(struct ti_cpufreq_data *opp_data, return calculated_efuse; } +static unsigned long omap3_efuse_xlate(struct ti_cpufreq_data *opp_data, + unsigned long efuse) +{ + /* OPP enable bit ("Speed Binned") */ + return BIT(efuse); +} + static struct ti_cpufreq_soc_data am3x_soc_data = { .efuse_xlate = amx3_efuse_xlate, .efuse_fallback = AM33XX_800M_ARM_MPU_MAX_FREQ, @@ -112,6 +125,74 @@ static struct ti_cpufreq_soc_data dra7_soc_data = { .multi_regulator = true, }; +/* + * OMAP35x TRM (SPRUF98K): + * CONTROL_IDCODE (0x4830 A204) describes Silicon revisions. + * Control OMAP Status Register 15:0 (Address 0x4800 244C) + * to separate between omap3503, omap3515, omap3525, omap3530 + * and feature presence. + * There are encodings for versions limited to 400/266MHz + * but we ignore. + * Not clear if this also holds for omap34xx. + * some eFuse values e.g. CONTROL_FUSE_OPP1_VDD1 + * are stored in the SYSCON register range + * Register 0x4830A20C [ProdID.SKUID] [0:3] + * 0x0 for normal 600/430MHz device. + * 0x8 for 720/520MHz device. + * Not clear what omap34xx value is. + */ + +static struct ti_cpufreq_soc_data omap34xx_soc_data = { + .efuse_xlate = omap3_efuse_xlate, + .efuse_offset = OMAP34xx_ProdID_SKUID - OMAP3_SYSCON_BASE, + .efuse_shift = 3, + .efuse_mask = BIT(3), + .rev_offset = OMAP3_CONTROL_IDCODE - OMAP3_SYSCON_BASE, + .multi_regulator = false, +}; + +/* + * AM/DM37x TRM (SPRUGN4M) + * CONTROL_IDCODE (0x4830 A204) describes Silicon revisions. + * Control Device Status Register 15:0 (Address 0x4800 244C) + * to separate between am3703, am3715, dm3725, dm3730 + * and feature presence. + * Speed Binned = Bit 9 + * 0 800/600 MHz + * 1 1000/800 MHz + * some eFuse values e.g. CONTROL_FUSE_OPP 1G_VDD1 + * are stored in the SYSCON register range. + * There is no 0x4830A20C [ProdID.SKUID] register (exists but + * seems to always read as 0). + */ + +static const char * const omap3_reg_names[] = {"cpu0", "vbb"}; + +static struct ti_cpufreq_soc_data omap36xx_soc_data = { + .reg_names = omap3_reg_names, + .efuse_xlate = omap3_efuse_xlate, + .efuse_offset = OMAP3_CONTROL_DEVICE_STATUS - OMAP3_SYSCON_BASE, + .efuse_shift = 9, + .efuse_mask = BIT(9), + .rev_offset = OMAP3_CONTROL_IDCODE - OMAP3_SYSCON_BASE, + .multi_regulator = true, +}; + +/* + * AM3517 is quite similar to AM/DM37x except that it has no + * high speed grade eFuse and no abb ldo + */ + +static struct ti_cpufreq_soc_data am3517_soc_data = { + .efuse_xlate = omap3_efuse_xlate, + .efuse_offset = OMAP3_CONTROL_DEVICE_STATUS - OMAP3_SYSCON_BASE, + .efuse_shift = 0, + .efuse_mask = 0, + .rev_offset = OMAP3_CONTROL_IDCODE - OMAP3_SYSCON_BASE, + .multi_regulator = false, +}; + + /** * ti_cpufreq_get_efuse() - Parse and return efuse value present on SoC * @opp_data: pointer to ti_cpufreq_data context @@ -128,7 +209,17 @@ static int ti_cpufreq_get_efuse(struct ti_cpufreq_data *opp_data, ret = regmap_read(opp_data->syscon, opp_data->soc_data->efuse_offset, &efuse); - if (ret) { + if (ret == -EIO) { + /* not a syscon register! */ + void __iomem *regs = ioremap(OMAP3_SYSCON_BASE + + opp_data->soc_data->efuse_offset, 4); + + if (!regs) + return -ENOMEM; + efuse = readl(regs); + iounmap(regs); + } + else if (ret) { dev_err(dev, "Failed to read the efuse value from syscon: %d\n", ret); @@ -159,7 +250,17 @@ static int ti_cpufreq_get_rev(struct ti_cpufreq_data *opp_data, ret = regmap_read(opp_data->syscon, opp_data->soc_data->rev_offset, &revision); - if (ret) { + if (ret == -EIO) { + /* not a syscon register! */ + void __iomem *regs = ioremap(OMAP3_SYSCON_BASE + + opp_data->soc_data->rev_offset, 4); + + if (!regs) + return -ENOMEM; + revision = readl(regs); + iounmap(regs); + } + else if (ret) { dev_err(dev, "Failed to read the revision number from syscon: %d\n", ret); @@ -189,8 +290,14 @@ static int ti_cpufreq_setup_syscon_register(struct ti_cpufreq_data *opp_data) static const struct of_device_id ti_cpufreq_of_match[] = { { .compatible = "ti,am33xx", .data = &am3x_soc_data, }, + { .compatible = "ti,am3517", .data = &am3517_soc_data, }, { .compatible = "ti,am43", .data = &am4x_soc_data, }, { .compatible = "ti,dra7", .data = &dra7_soc_data }, + { .compatible = "ti,omap34xx", .data = &omap34xx_soc_data, }, + { .compatible = "ti,omap36xx", .data = &omap36xx_soc_data, }, + /* legacy */ + { .compatible = "ti,omap3430", .data = &omap34xx_soc_data, }, + { .compatible = "ti,omap3630", .data = &omap36xx_soc_data, }, {}, }; @@ -212,7 +319,7 @@ static int ti_cpufreq_probe(struct platform_device *pdev) const struct of_device_id *match; struct opp_table *ti_opp_table; struct ti_cpufreq_data *opp_data; - const char * const reg_names[] = {"vdd", "vbb"}; + const char * const default_reg_names[] = {"vdd", "vbb"}; int ret; match = dev_get_platdata(&pdev->dev); @@ -268,9 +375,13 @@ static int ti_cpufreq_probe(struct platform_device *pdev) opp_data->opp_table = ti_opp_table; if (opp_data->soc_data->multi_regulator) { + const char * const *reg_names = default_reg_names; + + if (opp_data->soc_data->reg_names) + reg_names = opp_data->soc_data->reg_names; ti_opp_table = dev_pm_opp_set_regulators(opp_data->cpu_dev, reg_names, - ARRAY_SIZE(reg_names)); + ARRAY_SIZE(default_reg_names)); if (IS_ERR(ti_opp_table)) { dev_pm_opp_put_supported_hw(opp_data->opp_table); ret = PTR_ERR(ti_opp_table); diff --git a/drivers/cpufreq/vexpress-spc-cpufreq.c b/drivers/cpufreq/vexpress-spc-cpufreq.c index 53237289e606..506e3f2bf53a 100644 --- a/drivers/cpufreq/vexpress-spc-cpufreq.c +++ b/drivers/cpufreq/vexpress-spc-cpufreq.c @@ -1,61 +1,592 @@ +// SPDX-License-Identifier: GPL-2.0 /* * Versatile Express SPC CPUFreq Interface driver * - * It provides necessary ops to arm_big_little cpufreq driver. + * Copyright (C) 2013 - 2019 ARM Ltd. + * Sudeep Holla <sudeep.holla@arm.com> * - * Copyright (C) 2013 ARM Ltd. - * Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - * - * This program is distributed "as is" WITHOUT ANY WARRANTY of any - * kind, whether express or implied; without even the implied warranty - * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. + * Copyright (C) 2013 Linaro. + * Viresh Kumar <viresh.kumar@linaro.org> */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt +#include <linux/clk.h> #include <linux/cpu.h> #include <linux/cpufreq.h> +#include <linux/cpumask.h> +#include <linux/cpu_cooling.h> +#include <linux/device.h> #include <linux/module.h> +#include <linux/mutex.h> +#include <linux/of_platform.h> #include <linux/platform_device.h> #include <linux/pm_opp.h> +#include <linux/slab.h> +#include <linux/topology.h> #include <linux/types.h> -#include "arm_big_little.h" +/* Currently we support only two clusters */ +#define A15_CLUSTER 0 +#define A7_CLUSTER 1 +#define MAX_CLUSTERS 2 + +#ifdef CONFIG_BL_SWITCHER +#include <asm/bL_switcher.h> +static bool bL_switching_enabled; +#define is_bL_switching_enabled() bL_switching_enabled +#define set_switching_enabled(x) (bL_switching_enabled = (x)) +#else +#define is_bL_switching_enabled() false +#define set_switching_enabled(x) do { } while (0) +#define bL_switch_request(...) do { } while (0) +#define bL_switcher_put_enabled() do { } while (0) +#define bL_switcher_get_enabled() do { } while (0) +#endif + +#define ACTUAL_FREQ(cluster, freq) ((cluster == A7_CLUSTER) ? freq << 1 : freq) +#define VIRT_FREQ(cluster, freq) ((cluster == A7_CLUSTER) ? freq >> 1 : freq) + +static struct thermal_cooling_device *cdev[MAX_CLUSTERS]; +static struct clk *clk[MAX_CLUSTERS]; +static struct cpufreq_frequency_table *freq_table[MAX_CLUSTERS + 1]; +static atomic_t cluster_usage[MAX_CLUSTERS + 1]; + +static unsigned int clk_big_min; /* (Big) clock frequencies */ +static unsigned int clk_little_max; /* Maximum clock frequency (Little) */ + +static DEFINE_PER_CPU(unsigned int, physical_cluster); +static DEFINE_PER_CPU(unsigned int, cpu_last_req_freq); + +static struct mutex cluster_lock[MAX_CLUSTERS]; + +static inline int raw_cpu_to_cluster(int cpu) +{ + return topology_physical_package_id(cpu); +} + +static inline int cpu_to_cluster(int cpu) +{ + return is_bL_switching_enabled() ? + MAX_CLUSTERS : raw_cpu_to_cluster(cpu); +} + +static unsigned int find_cluster_maxfreq(int cluster) +{ + int j; + u32 max_freq = 0, cpu_freq; + + for_each_online_cpu(j) { + cpu_freq = per_cpu(cpu_last_req_freq, j); + + if (cluster == per_cpu(physical_cluster, j) && + max_freq < cpu_freq) + max_freq = cpu_freq; + } + + return max_freq; +} + +static unsigned int clk_get_cpu_rate(unsigned int cpu) +{ + u32 cur_cluster = per_cpu(physical_cluster, cpu); + u32 rate = clk_get_rate(clk[cur_cluster]) / 1000; + + /* For switcher we use virtual A7 clock rates */ + if (is_bL_switching_enabled()) + rate = VIRT_FREQ(cur_cluster, rate); + + return rate; +} + +static unsigned int ve_spc_cpufreq_get_rate(unsigned int cpu) +{ + if (is_bL_switching_enabled()) + return per_cpu(cpu_last_req_freq, cpu); + else + return clk_get_cpu_rate(cpu); +} + +static unsigned int +ve_spc_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate) +{ + u32 new_rate, prev_rate; + int ret; + bool bLs = is_bL_switching_enabled(); + + mutex_lock(&cluster_lock[new_cluster]); + + if (bLs) { + prev_rate = per_cpu(cpu_last_req_freq, cpu); + per_cpu(cpu_last_req_freq, cpu) = rate; + per_cpu(physical_cluster, cpu) = new_cluster; + + new_rate = find_cluster_maxfreq(new_cluster); + new_rate = ACTUAL_FREQ(new_cluster, new_rate); + } else { + new_rate = rate; + } + + ret = clk_set_rate(clk[new_cluster], new_rate * 1000); + if (!ret) { + /* + * FIXME: clk_set_rate hasn't returned an error here however it + * may be that clk_change_rate failed due to hardware or + * firmware issues and wasn't able to report that due to the + * current design of the clk core layer. To work around this + * problem we will read back the clock rate and check it is + * correct. This needs to be removed once clk core is fixed. + */ + if (clk_get_rate(clk[new_cluster]) != new_rate * 1000) + ret = -EIO; + } + + if (WARN_ON(ret)) { + if (bLs) { + per_cpu(cpu_last_req_freq, cpu) = prev_rate; + per_cpu(physical_cluster, cpu) = old_cluster; + } + + mutex_unlock(&cluster_lock[new_cluster]); + + return ret; + } + + mutex_unlock(&cluster_lock[new_cluster]); + + /* Recalc freq for old cluster when switching clusters */ + if (old_cluster != new_cluster) { + /* Switch cluster */ + bL_switch_request(cpu, new_cluster); + + mutex_lock(&cluster_lock[old_cluster]); + + /* Set freq of old cluster if there are cpus left on it */ + new_rate = find_cluster_maxfreq(old_cluster); + new_rate = ACTUAL_FREQ(old_cluster, new_rate); + + if (new_rate && + clk_set_rate(clk[old_cluster], new_rate * 1000)) { + pr_err("%s: clk_set_rate failed: %d, old cluster: %d\n", + __func__, ret, old_cluster); + } + mutex_unlock(&cluster_lock[old_cluster]); + } + + return 0; +} + +/* Set clock frequency */ +static int ve_spc_cpufreq_set_target(struct cpufreq_policy *policy, + unsigned int index) +{ + u32 cpu = policy->cpu, cur_cluster, new_cluster, actual_cluster; + unsigned int freqs_new; + int ret; + + cur_cluster = cpu_to_cluster(cpu); + new_cluster = actual_cluster = per_cpu(physical_cluster, cpu); + + freqs_new = freq_table[cur_cluster][index].frequency; + + if (is_bL_switching_enabled()) { + if (actual_cluster == A15_CLUSTER && freqs_new < clk_big_min) + new_cluster = A7_CLUSTER; + else if (actual_cluster == A7_CLUSTER && + freqs_new > clk_little_max) + new_cluster = A15_CLUSTER; + } + + ret = ve_spc_cpufreq_set_rate(cpu, actual_cluster, new_cluster, + freqs_new); + + if (!ret) { + arch_set_freq_scale(policy->related_cpus, freqs_new, + policy->cpuinfo.max_freq); + } + + return ret; +} + +static inline u32 get_table_count(struct cpufreq_frequency_table *table) +{ + int count; + + for (count = 0; table[count].frequency != CPUFREQ_TABLE_END; count++) + ; + + return count; +} + +/* get the minimum frequency in the cpufreq_frequency_table */ +static inline u32 get_table_min(struct cpufreq_frequency_table *table) +{ + struct cpufreq_frequency_table *pos; + u32 min_freq = ~0; + + cpufreq_for_each_entry(pos, table) + if (pos->frequency < min_freq) + min_freq = pos->frequency; + return min_freq; +} + +/* get the maximum frequency in the cpufreq_frequency_table */ +static inline u32 get_table_max(struct cpufreq_frequency_table *table) +{ + struct cpufreq_frequency_table *pos; + u32 max_freq = 0; + + cpufreq_for_each_entry(pos, table) + if (pos->frequency > max_freq) + max_freq = pos->frequency; + return max_freq; +} + +static bool search_frequency(struct cpufreq_frequency_table *table, int size, + unsigned int freq) +{ + int count; + + for (count = 0; count < size; count++) { + if (table[count].frequency == freq) + return true; + } + + return false; +} + +static int merge_cluster_tables(void) +{ + int i, j, k = 0, count = 1; + struct cpufreq_frequency_table *table; + + for (i = 0; i < MAX_CLUSTERS; i++) + count += get_table_count(freq_table[i]); + + table = kcalloc(count, sizeof(*table), GFP_KERNEL); + if (!table) + return -ENOMEM; + + freq_table[MAX_CLUSTERS] = table; + + /* Add in reverse order to get freqs in increasing order */ + for (i = MAX_CLUSTERS - 1; i >= 0; i--, count = k) { + for (j = 0; freq_table[i][j].frequency != CPUFREQ_TABLE_END; + j++) { + if (i == A15_CLUSTER && + search_frequency(table, count, freq_table[i][j].frequency)) + continue; /* skip duplicates */ + table[k++].frequency = + VIRT_FREQ(i, freq_table[i][j].frequency); + } + } + + table[k].driver_data = k; + table[k].frequency = CPUFREQ_TABLE_END; + + return 0; +} -static int ve_spc_init_opp_table(const struct cpumask *cpumask) +static void _put_cluster_clk_and_freq_table(struct device *cpu_dev, + const struct cpumask *cpumask) { - struct device *cpu_dev = get_cpu_device(cpumask_first(cpumask)); + u32 cluster = raw_cpu_to_cluster(cpu_dev->id); + + if (!freq_table[cluster]) + return; + + clk_put(clk[cluster]); + dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table[cluster]); +} + +static void put_cluster_clk_and_freq_table(struct device *cpu_dev, + const struct cpumask *cpumask) +{ + u32 cluster = cpu_to_cluster(cpu_dev->id); + int i; + + if (atomic_dec_return(&cluster_usage[cluster])) + return; + + if (cluster < MAX_CLUSTERS) + return _put_cluster_clk_and_freq_table(cpu_dev, cpumask); + + for_each_present_cpu(i) { + struct device *cdev = get_cpu_device(i); + + if (!cdev) + return; + + _put_cluster_clk_and_freq_table(cdev, cpumask); + } + + /* free virtual table */ + kfree(freq_table[cluster]); +} + +static int _get_cluster_clk_and_freq_table(struct device *cpu_dev, + const struct cpumask *cpumask) +{ + u32 cluster = raw_cpu_to_cluster(cpu_dev->id); + int ret; + + if (freq_table[cluster]) + return 0; + /* * platform specific SPC code must initialise the opp table * so just check if the OPP count is non-zero */ - return dev_pm_opp_get_opp_count(cpu_dev) <= 0; + ret = dev_pm_opp_get_opp_count(cpu_dev) <= 0; + if (ret) + goto out; + + ret = dev_pm_opp_init_cpufreq_table(cpu_dev, &freq_table[cluster]); + if (ret) + goto out; + + clk[cluster] = clk_get(cpu_dev, NULL); + if (!IS_ERR(clk[cluster])) + return 0; + + dev_err(cpu_dev, "%s: Failed to get clk for cpu: %d, cluster: %d\n", + __func__, cpu_dev->id, cluster); + ret = PTR_ERR(clk[cluster]); + dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table[cluster]); + +out: + dev_err(cpu_dev, "%s: Failed to get data for cluster: %d\n", __func__, + cluster); + return ret; } -static int ve_spc_get_transition_latency(struct device *cpu_dev) +static int get_cluster_clk_and_freq_table(struct device *cpu_dev, + const struct cpumask *cpumask) { - return 1000000; /* 1 ms */ + u32 cluster = cpu_to_cluster(cpu_dev->id); + int i, ret; + + if (atomic_inc_return(&cluster_usage[cluster]) != 1) + return 0; + + if (cluster < MAX_CLUSTERS) { + ret = _get_cluster_clk_and_freq_table(cpu_dev, cpumask); + if (ret) + atomic_dec(&cluster_usage[cluster]); + return ret; + } + + /* + * Get data for all clusters and fill virtual cluster with a merge of + * both + */ + for_each_present_cpu(i) { + struct device *cdev = get_cpu_device(i); + + if (!cdev) + return -ENODEV; + + ret = _get_cluster_clk_and_freq_table(cdev, cpumask); + if (ret) + goto put_clusters; + } + + ret = merge_cluster_tables(); + if (ret) + goto put_clusters; + + /* Assuming 2 cluster, set clk_big_min and clk_little_max */ + clk_big_min = get_table_min(freq_table[A15_CLUSTER]); + clk_little_max = VIRT_FREQ(A7_CLUSTER, + get_table_max(freq_table[A7_CLUSTER])); + + return 0; + +put_clusters: + for_each_present_cpu(i) { + struct device *cdev = get_cpu_device(i); + + if (!cdev) + return -ENODEV; + + _put_cluster_clk_and_freq_table(cdev, cpumask); + } + + atomic_dec(&cluster_usage[cluster]); + + return ret; } -static const struct cpufreq_arm_bL_ops ve_spc_cpufreq_ops = { - .name = "vexpress-spc", - .get_transition_latency = ve_spc_get_transition_latency, - .init_opp_table = ve_spc_init_opp_table, +/* Per-CPU initialization */ +static int ve_spc_cpufreq_init(struct cpufreq_policy *policy) +{ + u32 cur_cluster = cpu_to_cluster(policy->cpu); + struct device *cpu_dev; + int ret; + + cpu_dev = get_cpu_device(policy->cpu); + if (!cpu_dev) { + pr_err("%s: failed to get cpu%d device\n", __func__, + policy->cpu); + return -ENODEV; + } + + if (cur_cluster < MAX_CLUSTERS) { + int cpu; + + cpumask_copy(policy->cpus, topology_core_cpumask(policy->cpu)); + + for_each_cpu(cpu, policy->cpus) + per_cpu(physical_cluster, cpu) = cur_cluster; + } else { + /* Assumption: during init, we are always running on A15 */ + per_cpu(physical_cluster, policy->cpu) = A15_CLUSTER; + } + + ret = get_cluster_clk_and_freq_table(cpu_dev, policy->cpus); + if (ret) + return ret; + + policy->freq_table = freq_table[cur_cluster]; + policy->cpuinfo.transition_latency = 1000000; /* 1 ms */ + + dev_pm_opp_of_register_em(policy->cpus); + + if (is_bL_switching_enabled()) + per_cpu(cpu_last_req_freq, policy->cpu) = + clk_get_cpu_rate(policy->cpu); + + dev_info(cpu_dev, "%s: CPU %d initialized\n", __func__, policy->cpu); + return 0; +} + +static int ve_spc_cpufreq_exit(struct cpufreq_policy *policy) +{ + struct device *cpu_dev; + int cur_cluster = cpu_to_cluster(policy->cpu); + + if (cur_cluster < MAX_CLUSTERS) { + cpufreq_cooling_unregister(cdev[cur_cluster]); + cdev[cur_cluster] = NULL; + } + + cpu_dev = get_cpu_device(policy->cpu); + if (!cpu_dev) { + pr_err("%s: failed to get cpu%d device\n", __func__, + policy->cpu); + return -ENODEV; + } + + put_cluster_clk_and_freq_table(cpu_dev, policy->related_cpus); + return 0; +} + +static void ve_spc_cpufreq_ready(struct cpufreq_policy *policy) +{ + int cur_cluster = cpu_to_cluster(policy->cpu); + + /* Do not register a cpu_cooling device if we are in IKS mode */ + if (cur_cluster >= MAX_CLUSTERS) + return; + + cdev[cur_cluster] = of_cpufreq_cooling_register(policy); +} + +static struct cpufreq_driver ve_spc_cpufreq_driver = { + .name = "vexpress-spc", + .flags = CPUFREQ_STICKY | + CPUFREQ_HAVE_GOVERNOR_PER_POLICY | + CPUFREQ_NEED_INITIAL_FREQ_CHECK, + .verify = cpufreq_generic_frequency_table_verify, + .target_index = ve_spc_cpufreq_set_target, + .get = ve_spc_cpufreq_get_rate, + .init = ve_spc_cpufreq_init, + .exit = ve_spc_cpufreq_exit, + .ready = ve_spc_cpufreq_ready, + .attr = cpufreq_generic_attr, }; +#ifdef CONFIG_BL_SWITCHER +static int bL_cpufreq_switcher_notifier(struct notifier_block *nfb, + unsigned long action, void *_arg) +{ + pr_debug("%s: action: %ld\n", __func__, action); + + switch (action) { + case BL_NOTIFY_PRE_ENABLE: + case BL_NOTIFY_PRE_DISABLE: + cpufreq_unregister_driver(&ve_spc_cpufreq_driver); + break; + + case BL_NOTIFY_POST_ENABLE: + set_switching_enabled(true); + cpufreq_register_driver(&ve_spc_cpufreq_driver); + break; + + case BL_NOTIFY_POST_DISABLE: + set_switching_enabled(false); + cpufreq_register_driver(&ve_spc_cpufreq_driver); + break; + + default: + return NOTIFY_DONE; + } + + return NOTIFY_OK; +} + +static struct notifier_block bL_switcher_notifier = { + .notifier_call = bL_cpufreq_switcher_notifier, +}; + +static int __bLs_register_notifier(void) +{ + return bL_switcher_register_notifier(&bL_switcher_notifier); +} + +static int __bLs_unregister_notifier(void) +{ + return bL_switcher_unregister_notifier(&bL_switcher_notifier); +} +#else +static int __bLs_register_notifier(void) { return 0; } +static int __bLs_unregister_notifier(void) { return 0; } +#endif + static int ve_spc_cpufreq_probe(struct platform_device *pdev) { - return bL_cpufreq_register(&ve_spc_cpufreq_ops); + int ret, i; + + set_switching_enabled(bL_switcher_get_enabled()); + + for (i = 0; i < MAX_CLUSTERS; i++) + mutex_init(&cluster_lock[i]); + + ret = cpufreq_register_driver(&ve_spc_cpufreq_driver); + if (ret) { + pr_info("%s: Failed registering platform driver: %s, err: %d\n", + __func__, ve_spc_cpufreq_driver.name, ret); + } else { + ret = __bLs_register_notifier(); + if (ret) + cpufreq_unregister_driver(&ve_spc_cpufreq_driver); + else + pr_info("%s: Registered platform driver: %s\n", + __func__, ve_spc_cpufreq_driver.name); + } + + bL_switcher_put_enabled(); + return ret; } static int ve_spc_cpufreq_remove(struct platform_device *pdev) { - bL_cpufreq_unregister(&ve_spc_cpufreq_ops); + bL_switcher_get_enabled(); + __bLs_unregister_notifier(); + cpufreq_unregister_driver(&ve_spc_cpufreq_driver); + bL_switcher_put_enabled(); + pr_info("%s: Un-registered platform driver: %s\n", __func__, + ve_spc_cpufreq_driver.name); return 0; } @@ -68,4 +599,7 @@ static struct platform_driver ve_spc_cpufreq_platdrv = { }; module_platform_driver(ve_spc_cpufreq_platdrv); -MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Viresh Kumar <viresh.kumar@linaro.org>"); +MODULE_AUTHOR("Sudeep Holla <sudeep.holla@arm.com>"); +MODULE_DESCRIPTION("Vexpress SPC ARM big LITTLE cpufreq driver"); +MODULE_LICENSE("GPL v2"); diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c index 84b1ebe212b3..1b299e801f74 100644 --- a/drivers/cpuidle/cpuidle-powernv.c +++ b/drivers/cpuidle/cpuidle-powernv.c @@ -56,13 +56,10 @@ static u64 get_snooze_timeout(struct cpuidle_device *dev, return default_snooze_timeout; for (i = index + 1; i < drv->state_count; i++) { - struct cpuidle_state *s = &drv->states[i]; - struct cpuidle_state_usage *su = &dev->states_usage[i]; - - if (s->disabled || su->disable) + if (dev->states_usage[i].disable) continue; - return s->target_residency * tb_ticks_per_usec; + return drv->states[i].target_residency * tb_ticks_per_usec; } return default_snooze_timeout; diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index 0895b988fa92..569dbac443bd 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -75,44 +75,45 @@ int cpuidle_play_dead(void) static int find_deepest_state(struct cpuidle_driver *drv, struct cpuidle_device *dev, - unsigned int max_latency, + u64 max_latency_ns, unsigned int forbidden_flags, bool s2idle) { - unsigned int latency_req = 0; + u64 latency_req = 0; int i, ret = 0; for (i = 1; i < drv->state_count; i++) { struct cpuidle_state *s = &drv->states[i]; - struct cpuidle_state_usage *su = &dev->states_usage[i]; - if (s->disabled || su->disable || s->exit_latency <= latency_req - || s->exit_latency > max_latency - || (s->flags & forbidden_flags) - || (s2idle && !s->enter_s2idle)) + if (dev->states_usage[i].disable || + s->exit_latency_ns <= latency_req || + s->exit_latency_ns > max_latency_ns || + (s->flags & forbidden_flags) || + (s2idle && !s->enter_s2idle)) continue; - latency_req = s->exit_latency; + latency_req = s->exit_latency_ns; ret = i; } return ret; } /** - * cpuidle_use_deepest_state - Set/clear governor override flag. - * @enable: New value of the flag. + * cpuidle_use_deepest_state - Set/unset governor override mode. + * @latency_limit_ns: Idle state exit latency limit (or no override if 0). * - * Set/unset the current CPU to use the deepest idle state (override governors - * going forward if set). + * If @latency_limit_ns is nonzero, set the current CPU to use the deepest idle + * state with exit latency within @latency_limit_ns (override governors going + * forward), or do not override governors if it is zero. */ -void cpuidle_use_deepest_state(bool enable) +void cpuidle_use_deepest_state(u64 latency_limit_ns) { struct cpuidle_device *dev; preempt_disable(); dev = cpuidle_get_device(); if (dev) - dev->use_deepest_state = enable; + dev->forced_idle_latency_limit_ns = latency_limit_ns; preempt_enable(); } @@ -122,9 +123,10 @@ void cpuidle_use_deepest_state(bool enable) * @dev: cpuidle device for the given CPU. */ int cpuidle_find_deepest_state(struct cpuidle_driver *drv, - struct cpuidle_device *dev) + struct cpuidle_device *dev, + u64 latency_limit_ns) { - return find_deepest_state(drv, dev, UINT_MAX, 0, false); + return find_deepest_state(drv, dev, latency_limit_ns, 0, false); } #ifdef CONFIG_SUSPEND @@ -180,7 +182,7 @@ int cpuidle_enter_s2idle(struct cpuidle_driver *drv, struct cpuidle_device *dev) * that interrupts won't be enabled when it exits and allows the tick to * be frozen safely. */ - index = find_deepest_state(drv, dev, UINT_MAX, 0, true); + index = find_deepest_state(drv, dev, U64_MAX, 0, true); if (index > 0) enter_s2idle_proper(drv, dev, index); @@ -209,7 +211,7 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv, * CPU as a broadcast timer, this call may fail if it is not available. */ if (broadcast && tick_broadcast_enter()) { - index = find_deepest_state(drv, dev, target_state->exit_latency, + index = find_deepest_state(drv, dev, target_state->exit_latency_ns, CPUIDLE_FLAG_TIMER_STOP, false); if (index < 0) { default_idle_call(); @@ -247,7 +249,7 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv, local_irq_enable(); if (entered_state >= 0) { - s64 diff, delay = drv->states[entered_state].exit_latency; + s64 diff, delay = drv->states[entered_state].exit_latency_ns; int i; /* @@ -255,18 +257,15 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv, * This can be moved to within driver enter routine, * but that results in multiple copies of same code. */ - diff = ktime_us_delta(time_end, time_start); - if (diff > INT_MAX) - diff = INT_MAX; + diff = ktime_sub(time_end, time_start); - dev->last_residency = (int)diff; - dev->states_usage[entered_state].time += dev->last_residency; + dev->last_residency_ns = diff; + dev->states_usage[entered_state].time_ns += diff; dev->states_usage[entered_state].usage++; - if (diff < drv->states[entered_state].target_residency) { + if (diff < drv->states[entered_state].target_residency_ns) { for (i = entered_state - 1; i >= 0; i--) { - if (drv->states[i].disabled || - dev->states_usage[i].disable) + if (dev->states_usage[i].disable) continue; /* Shallower states are enabled, so update. */ @@ -275,22 +274,21 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv, } } else if (diff > delay) { for (i = entered_state + 1; i < drv->state_count; i++) { - if (drv->states[i].disabled || - dev->states_usage[i].disable) + if (dev->states_usage[i].disable) continue; /* * Update if a deeper state would have been a * better match for the observed idle duration. */ - if (diff - delay >= drv->states[i].target_residency) + if (diff - delay >= drv->states[i].target_residency_ns) dev->states_usage[entered_state].below++; break; } } } else { - dev->last_residency = 0; + dev->last_residency_ns = 0; } return entered_state; @@ -380,10 +378,10 @@ u64 cpuidle_poll_time(struct cpuidle_driver *drv, limit_ns = TICK_NSEC; for (i = 1; i < drv->state_count; i++) { - if (drv->states[i].disabled || dev->states_usage[i].disable) + if (dev->states_usage[i].disable) continue; - limit_ns = (u64)drv->states[i].target_residency * NSEC_PER_USEC; + limit_ns = (u64)drv->states[i].target_residency_ns; } dev->poll_limit_ns = limit_ns; @@ -554,7 +552,7 @@ static void __cpuidle_unregister_device(struct cpuidle_device *dev) static void __cpuidle_device_init(struct cpuidle_device *dev) { memset(dev->states_usage, 0, sizeof(dev->states_usage)); - dev->last_residency = 0; + dev->last_residency_ns = 0; dev->next_hrtimer = 0; } @@ -567,12 +565,16 @@ static void __cpuidle_device_init(struct cpuidle_device *dev) */ static int __cpuidle_register_device(struct cpuidle_device *dev) { - int ret; struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev); + int i, ret; if (!try_module_get(drv->owner)) return -EINVAL; + for (i = 0; i < drv->state_count; i++) + if (drv->states[i].disabled) + dev->states_usage[i].disable |= CPUIDLE_STATE_DISABLED_BY_DRIVER; + per_cpu(cpuidle_devices, dev->cpu) = dev; list_add(&dev->device_list, &cpuidle_detected_devices); diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c index 80c1a830d991..c76423aaef4d 100644 --- a/drivers/cpuidle/driver.c +++ b/drivers/cpuidle/driver.c @@ -62,24 +62,23 @@ static inline void __cpuidle_unset_driver(struct cpuidle_driver *drv) * __cpuidle_set_driver - set per CPU driver variables for the given driver. * @drv: a valid pointer to a struct cpuidle_driver * - * For each CPU in the driver's cpumask, unset the registered driver per CPU - * to @drv. - * - * Returns 0 on success, -EBUSY if the CPUs have driver(s) already. + * Returns 0 on success, -EBUSY if any CPU in the cpumask have a driver + * different from drv already. */ static inline int __cpuidle_set_driver(struct cpuidle_driver *drv) { int cpu; for_each_cpu(cpu, drv->cpumask) { + struct cpuidle_driver *old_drv; - if (__cpuidle_get_cpu_driver(cpu)) { - __cpuidle_unset_driver(drv); + old_drv = __cpuidle_get_cpu_driver(cpu); + if (old_drv && old_drv != drv) return -EBUSY; - } + } + for_each_cpu(cpu, drv->cpumask) per_cpu(cpuidle_drivers, cpu) = drv; - } return 0; } @@ -166,16 +165,27 @@ static void __cpuidle_driver_init(struct cpuidle_driver *drv) if (!drv->cpumask) drv->cpumask = (struct cpumask *)cpu_possible_mask; - /* - * Look for the timer stop flag in the different states, so that we know - * if the broadcast timer has to be set up. The loop is in the reverse - * order, because usually one of the deeper states have this flag set. - */ - for (i = drv->state_count - 1; i >= 0 ; i--) { - if (drv->states[i].flags & CPUIDLE_FLAG_TIMER_STOP) { + for (i = 0; i < drv->state_count; i++) { + struct cpuidle_state *s = &drv->states[i]; + + /* + * Look for the timer stop flag in the different states and if + * it is found, indicate that the broadcast timer has to be set + * up. + */ + if (s->flags & CPUIDLE_FLAG_TIMER_STOP) drv->bctimer = 1; - break; - } + + /* + * The core will use the target residency and exit latency + * values in nanoseconds, but allow drivers to provide them in + * microseconds too. + */ + if (s->target_residency > 0) + s->target_residency_ns = s->target_residency * NSEC_PER_USEC; + + if (s->exit_latency > 0) + s->exit_latency_ns = s->exit_latency * NSEC_PER_USEC; } } @@ -379,3 +389,31 @@ void cpuidle_driver_unref(void) spin_unlock(&cpuidle_driver_lock); } + +/** + * cpuidle_driver_state_disabled - Disable or enable an idle state + * @drv: cpuidle driver owning the state + * @idx: State index + * @disable: Whether or not to disable the state + */ +void cpuidle_driver_state_disabled(struct cpuidle_driver *drv, int idx, + bool disable) +{ + unsigned int cpu; + + mutex_lock(&cpuidle_lock); + + for_each_cpu(cpu, drv->cpumask) { + struct cpuidle_device *dev = per_cpu(cpuidle_devices, cpu); + + if (!dev) + continue; + + if (disable) + dev->states_usage[idx].disable |= CPUIDLE_STATE_DISABLED_BY_DRIVER; + else + dev->states_usage[idx].disable &= ~CPUIDLE_STATE_DISABLED_BY_DRIVER; + } + + mutex_unlock(&cpuidle_lock); +} diff --git a/drivers/cpuidle/governor.c b/drivers/cpuidle/governor.c index e9801f26c732..e48271e117a3 100644 --- a/drivers/cpuidle/governor.c +++ b/drivers/cpuidle/governor.c @@ -107,11 +107,14 @@ int cpuidle_register_governor(struct cpuidle_governor *gov) * cpuidle_governor_latency_req - Compute a latency constraint for CPU * @cpu: Target CPU */ -int cpuidle_governor_latency_req(unsigned int cpu) +s64 cpuidle_governor_latency_req(unsigned int cpu) { int global_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY); struct device *device = get_cpu_device(cpu); int device_req = dev_pm_qos_raw_resume_latency(device); - return device_req < global_req ? device_req : global_req; + if (device_req > global_req) + device_req = global_req; + + return (s64)device_req * NSEC_PER_USEC; } diff --git a/drivers/cpuidle/governors/haltpoll.c b/drivers/cpuidle/governors/haltpoll.c index 7a703d2e0064..cb2a96eafc02 100644 --- a/drivers/cpuidle/governors/haltpoll.c +++ b/drivers/cpuidle/governors/haltpoll.c @@ -49,7 +49,7 @@ static int haltpoll_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, bool *stop_tick) { - int latency_req = cpuidle_governor_latency_req(dev->cpu); + s64 latency_req = cpuidle_governor_latency_req(dev->cpu); if (!drv->state_count || latency_req == 0) { *stop_tick = false; @@ -75,10 +75,9 @@ static int haltpoll_select(struct cpuidle_driver *drv, return 0; } -static void adjust_poll_limit(struct cpuidle_device *dev, unsigned int block_us) +static void adjust_poll_limit(struct cpuidle_device *dev, u64 block_ns) { unsigned int val; - u64 block_ns = block_us*NSEC_PER_USEC; /* Grow cpu_halt_poll_us if * cpu_halt_poll_us < block_ns < guest_halt_poll_us @@ -115,7 +114,7 @@ static void haltpoll_reflect(struct cpuidle_device *dev, int index) dev->last_state_idx = index; if (index != 0) - adjust_poll_limit(dev, dev->last_residency); + adjust_poll_limit(dev, dev->last_residency_ns); } /** diff --git a/drivers/cpuidle/governors/ladder.c b/drivers/cpuidle/governors/ladder.c index 428eeb832fe7..8e9058c4ea63 100644 --- a/drivers/cpuidle/governors/ladder.c +++ b/drivers/cpuidle/governors/ladder.c @@ -27,8 +27,8 @@ struct ladder_device_state { struct { u32 promotion_count; u32 demotion_count; - u32 promotion_time; - u32 demotion_time; + u64 promotion_time_ns; + u64 demotion_time_ns; } threshold; struct { int promotion_count; @@ -68,9 +68,10 @@ static int ladder_select_state(struct cpuidle_driver *drv, { struct ladder_device *ldev = this_cpu_ptr(&ladder_devices); struct ladder_device_state *last_state; - int last_residency, last_idx = dev->last_state_idx; + int last_idx = dev->last_state_idx; int first_idx = drv->states[0].flags & CPUIDLE_FLAG_POLLING ? 1 : 0; - int latency_req = cpuidle_governor_latency_req(dev->cpu); + s64 latency_req = cpuidle_governor_latency_req(dev->cpu); + s64 last_residency; /* Special case when user has set very strict latency requirement */ if (unlikely(latency_req == 0)) { @@ -80,14 +81,13 @@ static int ladder_select_state(struct cpuidle_driver *drv, last_state = &ldev->states[last_idx]; - last_residency = dev->last_residency - drv->states[last_idx].exit_latency; + last_residency = dev->last_residency_ns - drv->states[last_idx].exit_latency_ns; /* consider promotion */ if (last_idx < drv->state_count - 1 && - !drv->states[last_idx + 1].disabled && !dev->states_usage[last_idx + 1].disable && - last_residency > last_state->threshold.promotion_time && - drv->states[last_idx + 1].exit_latency <= latency_req) { + last_residency > last_state->threshold.promotion_time_ns && + drv->states[last_idx + 1].exit_latency_ns <= latency_req) { last_state->stats.promotion_count++; last_state->stats.demotion_count = 0; if (last_state->stats.promotion_count >= last_state->threshold.promotion_count) { @@ -98,13 +98,12 @@ static int ladder_select_state(struct cpuidle_driver *drv, /* consider demotion */ if (last_idx > first_idx && - (drv->states[last_idx].disabled || - dev->states_usage[last_idx].disable || - drv->states[last_idx].exit_latency > latency_req)) { + (dev->states_usage[last_idx].disable || + drv->states[last_idx].exit_latency_ns > latency_req)) { int i; for (i = last_idx - 1; i > first_idx; i--) { - if (drv->states[i].exit_latency <= latency_req) + if (drv->states[i].exit_latency_ns <= latency_req) break; } ladder_do_selection(dev, ldev, last_idx, i); @@ -112,7 +111,7 @@ static int ladder_select_state(struct cpuidle_driver *drv, } if (last_idx > first_idx && - last_residency < last_state->threshold.demotion_time) { + last_residency < last_state->threshold.demotion_time_ns) { last_state->stats.demotion_count++; last_state->stats.promotion_count = 0; if (last_state->stats.demotion_count >= last_state->threshold.demotion_count) { @@ -152,9 +151,9 @@ static int ladder_enable_device(struct cpuidle_driver *drv, lstate->threshold.demotion_count = DEMOTION_COUNT; if (i < drv->state_count - 1) - lstate->threshold.promotion_time = state->exit_latency; + lstate->threshold.promotion_time_ns = state->exit_latency_ns; if (i > first_idx) - lstate->threshold.demotion_time = state->exit_latency; + lstate->threshold.demotion_time_ns = state->exit_latency_ns; } return 0; diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c index e5a5d0c8d66b..b0a7ad566081 100644 --- a/drivers/cpuidle/governors/menu.c +++ b/drivers/cpuidle/governors/menu.c @@ -19,22 +19,12 @@ #include <linux/sched/stat.h> #include <linux/math64.h> -/* - * Please note when changing the tuning values: - * If (MAX_INTERESTING-1) * RESOLUTION > UINT_MAX, the result of - * a scaling operation multiplication may overflow on 32 bit platforms. - * In that case, #define RESOLUTION as ULL to get 64 bit result: - * #define RESOLUTION 1024ULL - * - * The default values do not overflow. - */ #define BUCKETS 12 #define INTERVAL_SHIFT 3 #define INTERVALS (1UL << INTERVAL_SHIFT) #define RESOLUTION 1024 #define DECAY 8 -#define MAX_INTERESTING 50000 - +#define MAX_INTERESTING (50000 * NSEC_PER_USEC) /* * Concepts and ideas behind the menu governor @@ -120,14 +110,14 @@ struct menu_device { int needs_update; int tick_wakeup; - unsigned int next_timer_us; + u64 next_timer_ns; unsigned int bucket; unsigned int correction_factor[BUCKETS]; unsigned int intervals[INTERVALS]; int interval_ptr; }; -static inline int which_bucket(unsigned int duration, unsigned long nr_iowaiters) +static inline int which_bucket(u64 duration_ns, unsigned long nr_iowaiters) { int bucket = 0; @@ -140,15 +130,15 @@ static inline int which_bucket(unsigned int duration, unsigned long nr_iowaiters if (nr_iowaiters) bucket = BUCKETS/2; - if (duration < 10) + if (duration_ns < 10ULL * NSEC_PER_USEC) return bucket; - if (duration < 100) + if (duration_ns < 100ULL * NSEC_PER_USEC) return bucket + 1; - if (duration < 1000) + if (duration_ns < 1000ULL * NSEC_PER_USEC) return bucket + 2; - if (duration < 10000) + if (duration_ns < 10000ULL * NSEC_PER_USEC) return bucket + 3; - if (duration < 100000) + if (duration_ns < 100000ULL * NSEC_PER_USEC) return bucket + 4; return bucket + 5; } @@ -276,13 +266,13 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, bool *stop_tick) { struct menu_device *data = this_cpu_ptr(&menu_devices); - int latency_req = cpuidle_governor_latency_req(dev->cpu); - int i; - int idx; - unsigned int interactivity_req; + s64 latency_req = cpuidle_governor_latency_req(dev->cpu); unsigned int predicted_us; + u64 predicted_ns; + u64 interactivity_req; unsigned long nr_iowaiters; ktime_t delta_next; + int i, idx; if (data->needs_update) { menu_update(drv, dev); @@ -290,15 +280,15 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, } /* determine the expected residency time, round up */ - data->next_timer_us = ktime_to_us(tick_nohz_get_sleep_length(&delta_next)); + data->next_timer_ns = tick_nohz_get_sleep_length(&delta_next); nr_iowaiters = nr_iowait_cpu(dev->cpu); - data->bucket = which_bucket(data->next_timer_us, nr_iowaiters); + data->bucket = which_bucket(data->next_timer_ns, nr_iowaiters); if (unlikely(drv->state_count <= 1 || latency_req == 0) || - ((data->next_timer_us < drv->states[1].target_residency || - latency_req < drv->states[1].exit_latency) && - !drv->states[0].disabled && !dev->states_usage[0].disable)) { + ((data->next_timer_ns < drv->states[1].target_residency_ns || + latency_req < drv->states[1].exit_latency_ns) && + !dev->states_usage[0].disable)) { /* * In this case state[0] will be used no matter what, so return * it right away and keep the tick running if state[0] is a @@ -308,18 +298,15 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, return 0; } - /* - * Force the result of multiplication to be 64 bits even if both - * operands are 32 bits. - * Make sure to round up for half microseconds. - */ - predicted_us = DIV_ROUND_CLOSEST_ULL((uint64_t)data->next_timer_us * - data->correction_factor[data->bucket], - RESOLUTION * DECAY); - /* - * Use the lowest expected idle interval to pick the idle state. - */ - predicted_us = min(predicted_us, get_typical_interval(data, predicted_us)); + /* Round up the result for half microseconds. */ + predicted_us = div_u64(data->next_timer_ns * + data->correction_factor[data->bucket] + + (RESOLUTION * DECAY * NSEC_PER_USEC) / 2, + RESOLUTION * DECAY * NSEC_PER_USEC); + /* Use the lowest expected idle interval to pick the idle state. */ + predicted_ns = (u64)min(predicted_us, + get_typical_interval(data, predicted_us)) * + NSEC_PER_USEC; if (tick_nohz_tick_stopped()) { /* @@ -330,14 +317,15 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, * the known time till the closest timer event for the idle * state selection. */ - if (predicted_us < TICK_USEC) - predicted_us = ktime_to_us(delta_next); + if (predicted_ns < TICK_NSEC) + predicted_ns = delta_next; } else { /* * Use the performance multiplier and the user-configurable * latency_req to determine the maximum exit latency. */ - interactivity_req = predicted_us / performance_multiplier(nr_iowaiters); + interactivity_req = div64_u64(predicted_ns, + performance_multiplier(nr_iowaiters)); if (latency_req > interactivity_req) latency_req = interactivity_req; } @@ -349,27 +337,26 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, idx = -1; for (i = 0; i < drv->state_count; i++) { struct cpuidle_state *s = &drv->states[i]; - struct cpuidle_state_usage *su = &dev->states_usage[i]; - if (s->disabled || su->disable) + if (dev->states_usage[i].disable) continue; if (idx == -1) idx = i; /* first enabled state */ - if (s->target_residency > predicted_us) { + if (s->target_residency_ns > predicted_ns) { /* * Use a physical idle state, not busy polling, unless * a timer is going to trigger soon enough. */ if ((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) && - s->exit_latency <= latency_req && - s->target_residency <= data->next_timer_us) { - predicted_us = s->target_residency; + s->exit_latency_ns <= latency_req && + s->target_residency_ns <= data->next_timer_ns) { + predicted_ns = s->target_residency_ns; idx = i; break; } - if (predicted_us < TICK_USEC) + if (predicted_ns < TICK_NSEC) break; if (!tick_nohz_tick_stopped()) { @@ -379,7 +366,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, * tick in that case and let the governor run * again in the next iteration of the loop. */ - predicted_us = drv->states[idx].target_residency; + predicted_ns = drv->states[idx].target_residency_ns; break; } @@ -389,13 +376,13 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, * closest timer event, select this one to avoid getting * stuck in the shallow one for too long. */ - if (drv->states[idx].target_residency < TICK_USEC && - s->target_residency <= ktime_to_us(delta_next)) + if (drv->states[idx].target_residency_ns < TICK_NSEC && + s->target_residency_ns <= delta_next) idx = i; return idx; } - if (s->exit_latency > latency_req) + if (s->exit_latency_ns > latency_req) break; idx = i; @@ -409,12 +396,10 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, * expected idle duration is shorter than the tick period length. */ if (((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) || - predicted_us < TICK_USEC) && !tick_nohz_tick_stopped()) { - unsigned int delta_next_us = ktime_to_us(delta_next); - + predicted_ns < TICK_NSEC) && !tick_nohz_tick_stopped()) { *stop_tick = false; - if (idx > 0 && drv->states[idx].target_residency > delta_next_us) { + if (idx > 0 && drv->states[idx].target_residency_ns > delta_next) { /* * The tick is not going to be stopped and the target * residency of the state to be returned is not within @@ -422,12 +407,11 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, * tick, so try to correct that. */ for (i = idx - 1; i >= 0; i--) { - if (drv->states[i].disabled || - dev->states_usage[i].disable) + if (dev->states_usage[i].disable) continue; idx = i; - if (drv->states[i].target_residency <= delta_next_us) + if (drv->states[i].target_residency_ns <= delta_next) break; } } @@ -463,7 +447,7 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev) struct menu_device *data = this_cpu_ptr(&menu_devices); int last_idx = dev->last_state_idx; struct cpuidle_state *target = &drv->states[last_idx]; - unsigned int measured_us; + u64 measured_ns; unsigned int new_factor; /* @@ -481,7 +465,7 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev) * assume the state was never reached and the exit latency is 0. */ - if (data->tick_wakeup && data->next_timer_us > TICK_USEC) { + if (data->tick_wakeup && data->next_timer_ns > TICK_NSEC) { /* * The nohz code said that there wouldn't be any events within * the tick boundary (if the tick was stopped), but the idle @@ -491,7 +475,7 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev) * have been idle long (but not forever) to help the idle * duration predictor do a better job next time. */ - measured_us = 9 * MAX_INTERESTING / 10; + measured_ns = 9 * MAX_INTERESTING / 10; } else if ((drv->states[last_idx].flags & CPUIDLE_FLAG_POLLING) && dev->poll_time_limit) { /* @@ -501,28 +485,29 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev) * the CPU might have been woken up from idle by the next timer. * Assume that to be the case. */ - measured_us = data->next_timer_us; + measured_ns = data->next_timer_ns; } else { /* measured value */ - measured_us = dev->last_residency; + measured_ns = dev->last_residency_ns; /* Deduct exit latency */ - if (measured_us > 2 * target->exit_latency) - measured_us -= target->exit_latency; + if (measured_ns > 2 * target->exit_latency_ns) + measured_ns -= target->exit_latency_ns; else - measured_us /= 2; + measured_ns /= 2; } /* Make sure our coefficients do not exceed unity */ - if (measured_us > data->next_timer_us) - measured_us = data->next_timer_us; + if (measured_ns > data->next_timer_ns) + measured_ns = data->next_timer_ns; /* Update our correction ratio */ new_factor = data->correction_factor[data->bucket]; new_factor -= new_factor / DECAY; - if (data->next_timer_us > 0 && measured_us < MAX_INTERESTING) - new_factor += RESOLUTION * measured_us / data->next_timer_us; + if (data->next_timer_ns > 0 && measured_ns < MAX_INTERESTING) + new_factor += div64_u64(RESOLUTION * measured_ns, + data->next_timer_ns); else /* * we were idle so long that we count it as a perfect @@ -542,7 +527,7 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev) data->correction_factor[data->bucket] = new_factor; /* update the repeating-pattern data */ - data->intervals[data->interval_ptr++] = measured_us; + data->intervals[data->interval_ptr++] = ktime_to_us(measured_ns); if (data->interval_ptr >= INTERVALS) data->interval_ptr = 0; } diff --git a/drivers/cpuidle/governors/teo.c b/drivers/cpuidle/governors/teo.c index b5a0e498f798..de7e706efd46 100644 --- a/drivers/cpuidle/governors/teo.c +++ b/drivers/cpuidle/governors/teo.c @@ -104,7 +104,7 @@ struct teo_cpu { u64 sleep_length_ns; struct teo_idle_state states[CPUIDLE_STATE_MAX]; int interval_idx; - unsigned int intervals[INTERVALS]; + u64 intervals[INTERVALS]; }; static DEFINE_PER_CPU(struct teo_cpu, teo_cpus); @@ -117,9 +117,8 @@ static DEFINE_PER_CPU(struct teo_cpu, teo_cpus); static void teo_update(struct cpuidle_driver *drv, struct cpuidle_device *dev) { struct teo_cpu *cpu_data = per_cpu_ptr(&teo_cpus, dev->cpu); - unsigned int sleep_length_us = ktime_to_us(cpu_data->sleep_length_ns); int i, idx_hit = -1, idx_timer = -1; - unsigned int measured_us; + u64 measured_ns; if (cpu_data->time_span_ns >= cpu_data->sleep_length_ns) { /* @@ -127,23 +126,28 @@ static void teo_update(struct cpuidle_driver *drv, struct cpuidle_device *dev) * enough to the closest timer event expected at the idle state * selection time to be discarded. */ - measured_us = UINT_MAX; + measured_ns = U64_MAX; } else { - unsigned int lat; + u64 lat_ns = drv->states[dev->last_state_idx].exit_latency_ns; - lat = drv->states[dev->last_state_idx].exit_latency; - - measured_us = ktime_to_us(cpu_data->time_span_ns); + /* + * The computations below are to determine whether or not the + * (saved) time till the next timer event and the measured idle + * duration fall into the same "bin", so use last_residency_ns + * for that instead of time_span_ns which includes the cpuidle + * overhead. + */ + measured_ns = dev->last_residency_ns; /* * The delay between the wakeup and the first instruction * executed by the CPU is not likely to be worst-case every * time, so take 1/2 of the exit latency as a very rough * approximation of the average of it. */ - if (measured_us >= lat) - measured_us -= lat / 2; + if (measured_ns >= lat_ns) + measured_ns -= lat_ns / 2; else - measured_us /= 2; + measured_ns /= 2; } /* @@ -155,9 +159,9 @@ static void teo_update(struct cpuidle_driver *drv, struct cpuidle_device *dev) cpu_data->states[i].early_hits -= early_hits >> DECAY_SHIFT; - if (drv->states[i].target_residency <= sleep_length_us) { + if (drv->states[i].target_residency_ns <= cpu_data->sleep_length_ns) { idx_timer = i; - if (drv->states[i].target_residency <= measured_us) + if (drv->states[i].target_residency_ns <= measured_ns) idx_hit = i; } } @@ -193,30 +197,35 @@ static void teo_update(struct cpuidle_driver *drv, struct cpuidle_device *dev) * Save idle duration values corresponding to non-timer wakeups for * pattern detection. */ - cpu_data->intervals[cpu_data->interval_idx++] = measured_us; + cpu_data->intervals[cpu_data->interval_idx++] = measured_ns; if (cpu_data->interval_idx > INTERVALS) cpu_data->interval_idx = 0; } +static bool teo_time_ok(u64 interval_ns) +{ + return !tick_nohz_tick_stopped() || interval_ns >= TICK_NSEC; +} + /** * teo_find_shallower_state - Find shallower idle state matching given duration. * @drv: cpuidle driver containing state data. * @dev: Target CPU. * @state_idx: Index of the capping idle state. - * @duration_us: Idle duration value to match. + * @duration_ns: Idle duration value to match. */ static int teo_find_shallower_state(struct cpuidle_driver *drv, struct cpuidle_device *dev, int state_idx, - unsigned int duration_us) + u64 duration_ns) { int i; for (i = state_idx - 1; i >= 0; i--) { - if (drv->states[i].disabled || dev->states_usage[i].disable) + if (dev->states_usage[i].disable) continue; state_idx = i; - if (drv->states[i].target_residency <= duration_us) + if (drv->states[i].target_residency_ns <= duration_ns) break; } return state_idx; @@ -232,9 +241,10 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, bool *stop_tick) { struct teo_cpu *cpu_data = per_cpu_ptr(&teo_cpus, dev->cpu); - int latency_req = cpuidle_governor_latency_req(dev->cpu); - unsigned int duration_us, count; - int max_early_idx, constraint_idx, idx, i; + s64 latency_req = cpuidle_governor_latency_req(dev->cpu); + u64 duration_ns; + unsigned int hits, misses, early_hits; + int max_early_idx, prev_max_early_idx, constraint_idx, idx, i; ktime_t delta_tick; if (dev->last_state_idx >= 0) { @@ -244,50 +254,92 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, cpu_data->time_span_ns = local_clock(); - cpu_data->sleep_length_ns = tick_nohz_get_sleep_length(&delta_tick); - duration_us = ktime_to_us(cpu_data->sleep_length_ns); + duration_ns = tick_nohz_get_sleep_length(&delta_tick); + cpu_data->sleep_length_ns = duration_ns; - count = 0; + hits = 0; + misses = 0; + early_hits = 0; max_early_idx = -1; + prev_max_early_idx = -1; constraint_idx = drv->state_count; idx = -1; for (i = 0; i < drv->state_count; i++) { struct cpuidle_state *s = &drv->states[i]; - struct cpuidle_state_usage *su = &dev->states_usage[i]; - if (s->disabled || su->disable) { + if (dev->states_usage[i].disable) { + /* + * Ignore disabled states with target residencies beyond + * the anticipated idle duration. + */ + if (s->target_residency_ns > duration_ns) + continue; + + /* + * This state is disabled, so the range of idle duration + * values corresponding to it is covered by the current + * candidate state, but still the "hits" and "misses" + * metrics of the disabled state need to be used to + * decide whether or not the state covering the range in + * question is good enough. + */ + hits = cpu_data->states[i].hits; + misses = cpu_data->states[i].misses; + + if (early_hits >= cpu_data->states[i].early_hits || + idx < 0) + continue; + /* - * If the "early hits" metric of a disabled state is - * greater than the current maximum, it should be taken - * into account, because it would be a mistake to select - * a deeper state with lower "early hits" metric. The - * index cannot be changed to point to it, however, so - * just increase the max count alone and let the index - * still point to a shallower idle state. + * If the current candidate state has been the one with + * the maximum "early hits" metric so far, the "early + * hits" metric of the disabled state replaces the + * current "early hits" count to avoid selecting a + * deeper state with lower "early hits" metric. */ - if (max_early_idx >= 0 && - count < cpu_data->states[i].early_hits) - count = cpu_data->states[i].early_hits; + if (max_early_idx == idx) { + early_hits = cpu_data->states[i].early_hits; + continue; + } + + /* + * The current candidate state is closer to the disabled + * one than the current maximum "early hits" state, so + * replace the latter with it, but in case the maximum + * "early hits" state index has not been set so far, + * check if the current candidate state is not too + * shallow for that role. + */ + if (teo_time_ok(drv->states[idx].target_residency_ns)) { + prev_max_early_idx = max_early_idx; + early_hits = cpu_data->states[i].early_hits; + max_early_idx = idx; + } continue; } - if (idx < 0) + if (idx < 0) { idx = i; /* first enabled state */ + hits = cpu_data->states[i].hits; + misses = cpu_data->states[i].misses; + } - if (s->target_residency > duration_us) + if (s->target_residency_ns > duration_ns) break; - if (s->exit_latency > latency_req && constraint_idx > i) + if (s->exit_latency_ns > latency_req && constraint_idx > i) constraint_idx = i; idx = i; + hits = cpu_data->states[i].hits; + misses = cpu_data->states[i].misses; - if (count < cpu_data->states[i].early_hits && - !(tick_nohz_tick_stopped() && - drv->states[i].target_residency < TICK_USEC)) { - count = cpu_data->states[i].early_hits; + if (early_hits < cpu_data->states[i].early_hits && + teo_time_ok(drv->states[i].target_residency_ns)) { + prev_max_early_idx = max_early_idx; + early_hits = cpu_data->states[i].early_hits; max_early_idx = i; } } @@ -300,10 +352,19 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, * "early hits" metric, but if that cannot be determined, just use the * state selected so far. */ - if (cpu_data->states[idx].hits <= cpu_data->states[idx].misses && - max_early_idx >= 0) { - idx = max_early_idx; - duration_us = drv->states[idx].target_residency; + if (hits <= misses) { + /* + * The current candidate state is not suitable, so take the one + * whose "early hits" metric is the maximum for the range of + * shallower states. + */ + if (idx == max_early_idx) + max_early_idx = prev_max_early_idx; + + if (max_early_idx >= 0) { + idx = max_early_idx; + duration_ns = drv->states[idx].target_residency_ns; + } } /* @@ -316,18 +377,17 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, if (idx < 0) { idx = 0; /* No states enabled. Must use 0. */ } else if (idx > 0) { + unsigned int count = 0; u64 sum = 0; - count = 0; - /* * Count and sum the most recent idle duration values less than * the current expected idle duration value. */ for (i = 0; i < INTERVALS; i++) { - unsigned int val = cpu_data->intervals[i]; + u64 val = cpu_data->intervals[i]; - if (val >= duration_us) + if (val >= duration_ns) continue; count++; @@ -339,17 +399,17 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, * values are in the interesting range. */ if (count > INTERVALS / 2) { - unsigned int avg_us = div64_u64(sum, count); + u64 avg_ns = div64_u64(sum, count); /* * Avoid spending too much time in an idle state that * would be too shallow. */ - if (!(tick_nohz_tick_stopped() && avg_us < TICK_USEC)) { - duration_us = avg_us; - if (drv->states[idx].target_residency > avg_us) + if (teo_time_ok(avg_ns)) { + duration_ns = avg_ns; + if (drv->states[idx].target_residency_ns > avg_ns) idx = teo_find_shallower_state(drv, dev, - idx, avg_us); + idx, avg_ns); } } } @@ -359,9 +419,7 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, * expected idle duration is shorter than the tick period length. */ if (((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) || - duration_us < TICK_USEC) && !tick_nohz_tick_stopped()) { - unsigned int delta_tick_us = ktime_to_us(delta_tick); - + duration_ns < TICK_NSEC) && !tick_nohz_tick_stopped()) { *stop_tick = false; /* @@ -370,8 +428,8 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, * till the closest timer including the tick, try to correct * that. */ - if (idx > 0 && drv->states[idx].target_residency > delta_tick_us) - idx = teo_find_shallower_state(drv, dev, idx, delta_tick_us); + if (idx > 0 && drv->states[idx].target_residency_ns > delta_tick) + idx = teo_find_shallower_state(drv, dev, idx, delta_tick); } return idx; @@ -415,7 +473,7 @@ static int teo_enable_device(struct cpuidle_driver *drv, memset(cpu_data, 0, sizeof(*cpu_data)); for (i = 0; i < INTERVALS; i++) - cpu_data->intervals[i] = UINT_MAX; + cpu_data->intervals[i] = U64_MAX; return 0; } diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c index c8fa5f41dfc4..9f1ace9c53da 100644 --- a/drivers/cpuidle/poll_state.c +++ b/drivers/cpuidle/poll_state.c @@ -49,6 +49,8 @@ void cpuidle_poll_state_init(struct cpuidle_driver *drv) snprintf(state->desc, CPUIDLE_DESC_LEN, "CPUIDLE CORE POLL IDLE"); state->exit_latency = 0; state->target_residency = 0; + state->exit_latency_ns = 0; + state->target_residency_ns = 0; state->power_usage = -1; state->enter = poll_idle; state->disabled = false; diff --git a/drivers/cpuidle/sysfs.c b/drivers/cpuidle/sysfs.c index 2bb2683b493c..38ef770be90d 100644 --- a/drivers/cpuidle/sysfs.c +++ b/drivers/cpuidle/sysfs.c @@ -255,25 +255,6 @@ static ssize_t show_state_##_name(struct cpuidle_state *state, \ return sprintf(buf, "%u\n", state->_name);\ } -#define define_store_state_ull_function(_name) \ -static ssize_t store_state_##_name(struct cpuidle_state *state, \ - struct cpuidle_state_usage *state_usage, \ - const char *buf, size_t size) \ -{ \ - unsigned long long value; \ - int err; \ - if (!capable(CAP_SYS_ADMIN)) \ - return -EPERM; \ - err = kstrtoull(buf, 0, &value); \ - if (err) \ - return err; \ - if (value) \ - state_usage->_name = 1; \ - else \ - state_usage->_name = 0; \ - return size; \ -} - #define define_show_state_ull_function(_name) \ static ssize_t show_state_##_name(struct cpuidle_state *state, \ struct cpuidle_state_usage *state_usage, \ @@ -292,18 +273,60 @@ static ssize_t show_state_##_name(struct cpuidle_state *state, \ return sprintf(buf, "%s\n", state->_name);\ } -define_show_state_function(exit_latency) -define_show_state_function(target_residency) +#define define_show_state_time_function(_name) \ +static ssize_t show_state_##_name(struct cpuidle_state *state, \ + struct cpuidle_state_usage *state_usage, \ + char *buf) \ +{ \ + return sprintf(buf, "%llu\n", ktime_to_us(state->_name##_ns)); \ +} + +define_show_state_time_function(exit_latency) +define_show_state_time_function(target_residency) define_show_state_function(power_usage) define_show_state_ull_function(usage) -define_show_state_ull_function(time) define_show_state_str_function(name) define_show_state_str_function(desc) -define_show_state_ull_function(disable) -define_store_state_ull_function(disable) define_show_state_ull_function(above) define_show_state_ull_function(below) +static ssize_t show_state_time(struct cpuidle_state *state, + struct cpuidle_state_usage *state_usage, + char *buf) +{ + return sprintf(buf, "%llu\n", ktime_to_us(state_usage->time_ns)); +} + +static ssize_t show_state_disable(struct cpuidle_state *state, + struct cpuidle_state_usage *state_usage, + char *buf) +{ + return sprintf(buf, "%llu\n", + state_usage->disable & CPUIDLE_STATE_DISABLED_BY_USER); +} + +static ssize_t store_state_disable(struct cpuidle_state *state, + struct cpuidle_state_usage *state_usage, + const char *buf, size_t size) +{ + unsigned int value; + int err; + + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + + err = kstrtouint(buf, 0, &value); + if (err) + return err; + + if (value) + state_usage->disable |= CPUIDLE_STATE_DISABLED_BY_USER; + else + state_usage->disable &= ~CPUIDLE_STATE_DISABLED_BY_USER; + + return size; +} + define_one_state_ro(name, show_state_name); define_one_state_ro(desc, show_state_desc); define_one_state_ro(latency, show_state_exit_latency); diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c index 446490c9d635..f840e61e5a27 100644 --- a/drivers/devfreq/devfreq.c +++ b/drivers/devfreq/devfreq.c @@ -160,6 +160,7 @@ int devfreq_update_status(struct devfreq *devfreq, unsigned long freq) int lev, prev_lev, ret = 0; unsigned long cur_time; + lockdep_assert_held(&devfreq->lock); cur_time = jiffies; /* Immediately exit if previous_freq is not initialized yet. */ @@ -409,6 +410,9 @@ static void devfreq_monitor(struct work_struct *work) */ void devfreq_monitor_start(struct devfreq *devfreq) { + if (devfreq->governor->interrupt_driven) + return; + INIT_DEFERRABLE_WORK(&devfreq->work, devfreq_monitor); if (devfreq->profile->polling_ms) queue_delayed_work(devfreq_wq, &devfreq->work, @@ -426,6 +430,9 @@ EXPORT_SYMBOL(devfreq_monitor_start); */ void devfreq_monitor_stop(struct devfreq *devfreq) { + if (devfreq->governor->interrupt_driven) + return; + cancel_delayed_work_sync(&devfreq->work); } EXPORT_SYMBOL(devfreq_monitor_stop); @@ -453,6 +460,10 @@ void devfreq_monitor_suspend(struct devfreq *devfreq) devfreq_update_status(devfreq, devfreq->previous_freq); devfreq->stop_polling = true; mutex_unlock(&devfreq->lock); + + if (devfreq->governor->interrupt_driven) + return; + cancel_delayed_work_sync(&devfreq->work); } EXPORT_SYMBOL(devfreq_monitor_suspend); @@ -473,11 +484,15 @@ void devfreq_monitor_resume(struct devfreq *devfreq) if (!devfreq->stop_polling) goto out; + if (devfreq->governor->interrupt_driven) + goto out_update; + if (!delayed_work_pending(&devfreq->work) && devfreq->profile->polling_ms) queue_delayed_work(devfreq_wq, &devfreq->work, msecs_to_jiffies(devfreq->profile->polling_ms)); +out_update: devfreq->last_stat_updated = jiffies; devfreq->stop_polling = false; @@ -509,6 +524,9 @@ void devfreq_interval_update(struct devfreq *devfreq, unsigned int *delay) if (devfreq->stop_polling) goto out; + if (devfreq->governor->interrupt_driven) + goto out; + /* if new delay is zero, stop polling */ if (!new_delay) { mutex_unlock(&devfreq->lock); @@ -625,7 +643,7 @@ struct devfreq *devfreq_add_device(struct device *dev, devfreq = find_device_devfreq(dev); mutex_unlock(&devfreq_list_lock); if (!IS_ERR(devfreq)) { - dev_err(dev, "%s: Unable to create devfreq for the device.\n", + dev_err(dev, "%s: devfreq device already exists!\n", __func__); err = -EINVAL; goto err_out; @@ -1195,7 +1213,7 @@ static ssize_t available_governors_show(struct device *d, * The devfreq with immutable governor (e.g., passive) shows * only own governor. */ - if (df->governor->immutable) { + if (df->governor && df->governor->immutable) { count = scnprintf(&buf[count], DEVFREQ_NAME_LEN, "%s ", df->governor_name); /* @@ -1397,12 +1415,17 @@ static ssize_t trans_stat_show(struct device *dev, int i, j; unsigned int max_state = devfreq->profile->max_state; - if (!devfreq->stop_polling && - devfreq_update_status(devfreq, devfreq->previous_freq)) - return 0; if (max_state == 0) return sprintf(buf, "Not Supported.\n"); + mutex_lock(&devfreq->lock); + if (!devfreq->stop_polling && + devfreq_update_status(devfreq, devfreq->previous_freq)) { + mutex_unlock(&devfreq->lock); + return 0; + } + mutex_unlock(&devfreq->lock); + len = sprintf(buf, " From : To\n"); len += sprintf(buf + len, " :"); for (i = 0; i < max_state; i++) diff --git a/drivers/devfreq/event/exynos-ppmu.c b/drivers/devfreq/event/exynos-ppmu.c index 87b42055e6bc..85c7a77bf3f0 100644 --- a/drivers/devfreq/event/exynos-ppmu.c +++ b/drivers/devfreq/event/exynos-ppmu.c @@ -673,7 +673,6 @@ static int exynos_ppmu_probe(struct platform_device *pdev) for (i = 0; i < info->num_events; i++) { edev[i] = devm_devfreq_event_add_edev(&pdev->dev, &desc[i]); if (IS_ERR(edev[i])) { - ret = PTR_ERR(edev[i]); dev_err(&pdev->dev, "failed to add devfreq-event device\n"); return PTR_ERR(edev[i]); diff --git a/drivers/devfreq/governor.h b/drivers/devfreq/governor.h index bbe5ff9fcecf..dc7533ccc3db 100644 --- a/drivers/devfreq/governor.h +++ b/drivers/devfreq/governor.h @@ -31,6 +31,8 @@ * @name: Governor's name * @immutable: Immutable flag for governor. If the value is 1, * this govenror is never changeable to other governor. + * @interrupt_driven: Devfreq core won't schedule polling work for this + * governor if value is set to 1. * @get_target_freq: Returns desired operating frequency for the device. * Basically, get_target_freq will run * devfreq_dev_profile.get_dev_status() to get the @@ -49,6 +51,7 @@ struct devfreq_governor { const char name[DEVFREQ_NAME_LEN]; const unsigned int immutable; + const unsigned int interrupt_driven; int (*get_target_freq)(struct devfreq *this, unsigned long *freq); int (*event_handler)(struct devfreq *devfreq, unsigned int event, void *data); diff --git a/drivers/devfreq/tegra30-devfreq.c b/drivers/devfreq/tegra30-devfreq.c index a6ba75f4106d..0b65f89d74d5 100644 --- a/drivers/devfreq/tegra30-devfreq.c +++ b/drivers/devfreq/tegra30-devfreq.c @@ -11,11 +11,13 @@ #include <linux/devfreq.h> #include <linux/interrupt.h> #include <linux/io.h> +#include <linux/irq.h> #include <linux/module.h> -#include <linux/mod_devicetable.h> +#include <linux/of_device.h> #include <linux/platform_device.h> #include <linux/pm_opp.h> #include <linux/reset.h> +#include <linux/workqueue.h> #include "governor.h" @@ -33,6 +35,8 @@ #define ACTMON_DEV_CTRL_CONSECUTIVE_ABOVE_WMARK_EN BIT(30) #define ACTMON_DEV_CTRL_ENB BIT(31) +#define ACTMON_DEV_CTRL_STOP 0x00000000 + #define ACTMON_DEV_UPPER_WMARK 0x4 #define ACTMON_DEV_LOWER_WMARK 0x8 #define ACTMON_DEV_INIT_AVG 0xc @@ -68,6 +72,8 @@ #define KHZ 1000 +#define KHZ_MAX (ULONG_MAX / KHZ) + /* Assume that the bus is saturated if the utilization is 25% */ #define BUS_SATURATION_RATIO 25 @@ -90,9 +96,10 @@ struct tegra_devfreq_device_config { unsigned int boost_down_threshold; /* - * Threshold of activity (cycles) below which the CPU frequency isn't - * to be taken into account. This is to avoid increasing the EMC - * frequency when the CPU is very busy but not accessing the bus often. + * Threshold of activity (cycles translated to kHz) below which the + * CPU frequency isn't to be taken into account. This is to avoid + * increasing the EMC frequency when the CPU is very busy but not + * accessing the bus often. */ u32 avg_dependency_threshold; }; @@ -102,7 +109,7 @@ enum tegra_actmon_device { MCCPU, }; -static struct tegra_devfreq_device_config actmon_device_configs[] = { +static const struct tegra_devfreq_device_config actmon_device_configs[] = { { /* MCALL: All memory accesses (including from the CPUs) */ .offset = 0x1c0, @@ -117,10 +124,10 @@ static struct tegra_devfreq_device_config actmon_device_configs[] = { .offset = 0x200, .irq_mask = 1 << 25, .boost_up_coeff = 800, - .boost_down_coeff = 90, + .boost_down_coeff = 40, .boost_up_threshold = 27, .boost_down_threshold = 10, - .avg_dependency_threshold = 50000, + .avg_dependency_threshold = 16000, /* 16MHz in kHz units */ }, }; @@ -156,11 +163,16 @@ struct tegra_devfreq { struct clk *emc_clock; unsigned long max_freq; unsigned long cur_freq; - struct notifier_block rate_change_nb; + struct notifier_block clk_rate_change_nb; + + struct delayed_work cpufreq_update_work; + struct notifier_block cpu_rate_change_nb; struct tegra_devfreq_device devices[ARRAY_SIZE(actmon_device_configs)]; - int irq; + unsigned int irq; + + bool started; }; struct tegra_actmon_emc_ratio { @@ -168,8 +180,8 @@ struct tegra_actmon_emc_ratio { unsigned long emc_freq; }; -static struct tegra_actmon_emc_ratio actmon_emc_ratios[] = { - { 1400000, ULONG_MAX }, +static const struct tegra_actmon_emc_ratio actmon_emc_ratios[] = { + { 1400000, KHZ_MAX }, { 1200000, 750000 }, { 1100000, 600000 }, { 1000000, 500000 }, @@ -199,18 +211,26 @@ static void device_writel(struct tegra_devfreq_device *dev, u32 val, writel_relaxed(val, dev->regs + offset); } -static unsigned long do_percent(unsigned long val, unsigned int pct) +static unsigned long do_percent(unsigned long long val, unsigned int pct) { - return val * pct / 100; + val = val * pct; + do_div(val, 100); + + /* + * High freq + high boosting percent + large polling interval are + * resulting in integer overflow when watermarks are calculated. + */ + return min_t(u64, val, U32_MAX); } static void tegra_devfreq_update_avg_wmark(struct tegra_devfreq *tegra, struct tegra_devfreq_device *dev) { - u32 avg = dev->avg_count; u32 avg_band_freq = tegra->max_freq * ACTMON_DEFAULT_AVG_BAND / KHZ; - u32 band = avg_band_freq * ACTMON_SAMPLING_PERIOD; + u32 band = avg_band_freq * tegra->devfreq->profile->polling_ms; + u32 avg; + avg = min(dev->avg_count, U32_MAX - band); device_writel(dev, avg + band, ACTMON_DEV_AVG_UPPER_WMARK); avg = max(dev->avg_count, band); @@ -220,7 +240,7 @@ static void tegra_devfreq_update_avg_wmark(struct tegra_devfreq *tegra, static void tegra_devfreq_update_wmark(struct tegra_devfreq *tegra, struct tegra_devfreq_device *dev) { - u32 val = tegra->cur_freq * ACTMON_SAMPLING_PERIOD; + u32 val = tegra->cur_freq * tegra->devfreq->profile->polling_ms; device_writel(dev, do_percent(val, dev->config->boost_up_threshold), ACTMON_DEV_UPPER_WMARK); @@ -229,12 +249,6 @@ static void tegra_devfreq_update_wmark(struct tegra_devfreq *tegra, ACTMON_DEV_LOWER_WMARK); } -static void actmon_write_barrier(struct tegra_devfreq *tegra) -{ - /* ensure the update has reached the ACTMON */ - readl(tegra->regs + ACTMON_GLB_STATUS); -} - static void actmon_isr_device(struct tegra_devfreq *tegra, struct tegra_devfreq_device *dev) { @@ -256,10 +270,10 @@ static void actmon_isr_device(struct tegra_devfreq *tegra, dev_ctrl |= ACTMON_DEV_CTRL_CONSECUTIVE_BELOW_WMARK_EN; - if (dev->boost_freq >= tegra->max_freq) + if (dev->boost_freq >= tegra->max_freq) { + dev_ctrl &= ~ACTMON_DEV_CTRL_CONSECUTIVE_ABOVE_WMARK_EN; dev->boost_freq = tegra->max_freq; - else - dev_ctrl |= ACTMON_DEV_CTRL_CONSECUTIVE_ABOVE_WMARK_EN; + } } else if (intr_status & ACTMON_DEV_INTR_CONSECUTIVE_LOWER) { /* * new_boost = old_boost * down_coef @@ -270,31 +284,22 @@ static void actmon_isr_device(struct tegra_devfreq *tegra, dev_ctrl |= ACTMON_DEV_CTRL_CONSECUTIVE_ABOVE_WMARK_EN; - if (dev->boost_freq < (ACTMON_BOOST_FREQ_STEP >> 1)) - dev->boost_freq = 0; - else - dev_ctrl |= ACTMON_DEV_CTRL_CONSECUTIVE_BELOW_WMARK_EN; - } - - if (dev->config->avg_dependency_threshold) { - if (dev->avg_count >= dev->config->avg_dependency_threshold) - dev_ctrl |= ACTMON_DEV_CTRL_CONSECUTIVE_BELOW_WMARK_EN; - else if (dev->boost_freq == 0) + if (dev->boost_freq < (ACTMON_BOOST_FREQ_STEP >> 1)) { dev_ctrl &= ~ACTMON_DEV_CTRL_CONSECUTIVE_BELOW_WMARK_EN; + dev->boost_freq = 0; + } } device_writel(dev, dev_ctrl, ACTMON_DEV_CTRL); device_writel(dev, ACTMON_INTR_STATUS_CLEAR, ACTMON_DEV_INTR_STATUS); - - actmon_write_barrier(tegra); } static unsigned long actmon_cpu_to_emc_rate(struct tegra_devfreq *tegra, unsigned long cpu_freq) { unsigned int i; - struct tegra_actmon_emc_ratio *ratio = actmon_emc_ratios; + const struct tegra_actmon_emc_ratio *ratio = actmon_emc_ratios; for (i = 0; i < ARRAY_SIZE(actmon_emc_ratios); i++, ratio++) { if (cpu_freq >= ratio->cpu_freq) { @@ -308,25 +313,37 @@ static unsigned long actmon_cpu_to_emc_rate(struct tegra_devfreq *tegra, return 0; } +static unsigned long actmon_device_target_freq(struct tegra_devfreq *tegra, + struct tegra_devfreq_device *dev) +{ + unsigned int avg_sustain_coef; + unsigned long target_freq; + + target_freq = dev->avg_count / tegra->devfreq->profile->polling_ms; + avg_sustain_coef = 100 * 100 / dev->config->boost_up_threshold; + target_freq = do_percent(target_freq, avg_sustain_coef); + + return target_freq; +} + static void actmon_update_target(struct tegra_devfreq *tegra, struct tegra_devfreq_device *dev) { unsigned long cpu_freq = 0; unsigned long static_cpu_emc_freq = 0; - unsigned int avg_sustain_coef; - if (dev->config->avg_dependency_threshold) { - cpu_freq = cpufreq_get(0); - static_cpu_emc_freq = actmon_cpu_to_emc_rate(tegra, cpu_freq); - } + dev->target_freq = actmon_device_target_freq(tegra, dev); - dev->target_freq = dev->avg_count / ACTMON_SAMPLING_PERIOD; - avg_sustain_coef = 100 * 100 / dev->config->boost_up_threshold; - dev->target_freq = do_percent(dev->target_freq, avg_sustain_coef); - dev->target_freq += dev->boost_freq; + if (dev->config->avg_dependency_threshold && + dev->config->avg_dependency_threshold <= dev->target_freq) { + cpu_freq = cpufreq_quick_get(0); + static_cpu_emc_freq = actmon_cpu_to_emc_rate(tegra, cpu_freq); - if (dev->avg_count >= dev->config->avg_dependency_threshold) + dev->target_freq += dev->boost_freq; dev->target_freq = max(dev->target_freq, static_cpu_emc_freq); + } else { + dev->target_freq += dev->boost_freq; + } } static irqreturn_t actmon_thread_isr(int irq, void *data) @@ -354,8 +371,8 @@ static irqreturn_t actmon_thread_isr(int irq, void *data) return handled ? IRQ_HANDLED : IRQ_NONE; } -static int tegra_actmon_rate_notify_cb(struct notifier_block *nb, - unsigned long action, void *ptr) +static int tegra_actmon_clk_notify_cb(struct notifier_block *nb, + unsigned long action, void *ptr) { struct clk_notifier_data *data = ptr; struct tegra_devfreq *tegra; @@ -365,7 +382,7 @@ static int tegra_actmon_rate_notify_cb(struct notifier_block *nb, if (action != POST_RATE_CHANGE) return NOTIFY_OK; - tegra = container_of(nb, struct tegra_devfreq, rate_change_nb); + tegra = container_of(nb, struct tegra_devfreq, clk_rate_change_nb); tegra->cur_freq = data->new_rate / KHZ; @@ -375,7 +392,79 @@ static int tegra_actmon_rate_notify_cb(struct notifier_block *nb, tegra_devfreq_update_wmark(tegra, dev); } - actmon_write_barrier(tegra); + return NOTIFY_OK; +} + +static void tegra_actmon_delayed_update(struct work_struct *work) +{ + struct tegra_devfreq *tegra = container_of(work, struct tegra_devfreq, + cpufreq_update_work.work); + + mutex_lock(&tegra->devfreq->lock); + update_devfreq(tegra->devfreq); + mutex_unlock(&tegra->devfreq->lock); +} + +static unsigned long +tegra_actmon_cpufreq_contribution(struct tegra_devfreq *tegra, + unsigned int cpu_freq) +{ + struct tegra_devfreq_device *actmon_dev = &tegra->devices[MCCPU]; + unsigned long static_cpu_emc_freq, dev_freq; + + dev_freq = actmon_device_target_freq(tegra, actmon_dev); + + /* check whether CPU's freq is taken into account at all */ + if (dev_freq < actmon_dev->config->avg_dependency_threshold) + return 0; + + static_cpu_emc_freq = actmon_cpu_to_emc_rate(tegra, cpu_freq); + + if (dev_freq >= static_cpu_emc_freq) + return 0; + + return static_cpu_emc_freq; +} + +static int tegra_actmon_cpu_notify_cb(struct notifier_block *nb, + unsigned long action, void *ptr) +{ + struct cpufreq_freqs *freqs = ptr; + struct tegra_devfreq *tegra; + unsigned long old, new, delay; + + if (action != CPUFREQ_POSTCHANGE) + return NOTIFY_OK; + + tegra = container_of(nb, struct tegra_devfreq, cpu_rate_change_nb); + + /* + * Quickly check whether CPU frequency should be taken into account + * at all, without blocking CPUFreq's core. + */ + if (mutex_trylock(&tegra->devfreq->lock)) { + old = tegra_actmon_cpufreq_contribution(tegra, freqs->old); + new = tegra_actmon_cpufreq_contribution(tegra, freqs->new); + mutex_unlock(&tegra->devfreq->lock); + + /* + * If CPU's frequency shouldn't be taken into account at + * the moment, then there is no need to update the devfreq's + * state because ISR will re-check CPU's frequency on the + * next interrupt. + */ + if (old == new) + return NOTIFY_OK; + } + + /* + * CPUFreq driver should support CPUFREQ_ASYNC_NOTIFICATION in order + * to allow asynchronous notifications. This means we can't block + * here for too long, otherwise CPUFreq's core will complain with a + * warning splat. + */ + delay = msecs_to_jiffies(ACTMON_SAMPLING_PERIOD); + schedule_delayed_work(&tegra->cpufreq_update_work, delay); return NOTIFY_OK; } @@ -385,9 +474,12 @@ static void tegra_actmon_configure_device(struct tegra_devfreq *tegra, { u32 val = 0; + /* reset boosting on governor's restart */ + dev->boost_freq = 0; + dev->target_freq = tegra->cur_freq; - dev->avg_count = tegra->cur_freq * ACTMON_SAMPLING_PERIOD; + dev->avg_count = tegra->cur_freq * tegra->devfreq->profile->polling_ms; device_writel(dev, dev->avg_count, ACTMON_DEV_INIT_AVG); tegra_devfreq_update_avg_wmark(tegra, dev); @@ -405,45 +497,116 @@ static void tegra_actmon_configure_device(struct tegra_devfreq *tegra, << ACTMON_DEV_CTRL_CONSECUTIVE_ABOVE_WMARK_NUM_SHIFT; val |= ACTMON_DEV_CTRL_AVG_ABOVE_WMARK_EN; val |= ACTMON_DEV_CTRL_AVG_BELOW_WMARK_EN; - val |= ACTMON_DEV_CTRL_CONSECUTIVE_BELOW_WMARK_EN; val |= ACTMON_DEV_CTRL_CONSECUTIVE_ABOVE_WMARK_EN; val |= ACTMON_DEV_CTRL_ENB; device_writel(dev, val, ACTMON_DEV_CTRL); } -static void tegra_actmon_start(struct tegra_devfreq *tegra) +static void tegra_actmon_stop_devices(struct tegra_devfreq *tegra) { + struct tegra_devfreq_device *dev = tegra->devices; unsigned int i; - disable_irq(tegra->irq); + for (i = 0; i < ARRAY_SIZE(tegra->devices); i++, dev++) { + device_writel(dev, ACTMON_DEV_CTRL_STOP, ACTMON_DEV_CTRL); + device_writel(dev, ACTMON_INTR_STATUS_CLEAR, + ACTMON_DEV_INTR_STATUS); + } +} - actmon_writel(tegra, ACTMON_SAMPLING_PERIOD - 1, +static int tegra_actmon_resume(struct tegra_devfreq *tegra) +{ + unsigned int i; + int err; + + if (!tegra->devfreq->profile->polling_ms || !tegra->started) + return 0; + + actmon_writel(tegra, tegra->devfreq->profile->polling_ms - 1, ACTMON_GLB_PERIOD_CTRL); + /* + * CLK notifications are needed in order to reconfigure the upper + * consecutive watermark in accordance to the actual clock rate + * to avoid unnecessary upper interrupts. + */ + err = clk_notifier_register(tegra->emc_clock, + &tegra->clk_rate_change_nb); + if (err) { + dev_err(tegra->devfreq->dev.parent, + "Failed to register rate change notifier\n"); + return err; + } + + tegra->cur_freq = clk_get_rate(tegra->emc_clock) / KHZ; + for (i = 0; i < ARRAY_SIZE(tegra->devices); i++) tegra_actmon_configure_device(tegra, &tegra->devices[i]); - actmon_write_barrier(tegra); + /* + * We are estimating CPU's memory bandwidth requirement based on + * amount of memory accesses and system's load, judging by CPU's + * frequency. We also don't want to receive events about CPU's + * frequency transaction when governor is stopped, hence notifier + * is registered dynamically. + */ + err = cpufreq_register_notifier(&tegra->cpu_rate_change_nb, + CPUFREQ_TRANSITION_NOTIFIER); + if (err) { + dev_err(tegra->devfreq->dev.parent, + "Failed to register rate change notifier: %d\n", err); + goto err_stop; + } enable_irq(tegra->irq); + + return 0; + +err_stop: + tegra_actmon_stop_devices(tegra); + + clk_notifier_unregister(tegra->emc_clock, &tegra->clk_rate_change_nb); + + return err; } -static void tegra_actmon_stop(struct tegra_devfreq *tegra) +static int tegra_actmon_start(struct tegra_devfreq *tegra) { - unsigned int i; + int ret = 0; - disable_irq(tegra->irq); + if (!tegra->started) { + tegra->started = true; - for (i = 0; i < ARRAY_SIZE(tegra->devices); i++) { - device_writel(&tegra->devices[i], 0x00000000, ACTMON_DEV_CTRL); - device_writel(&tegra->devices[i], ACTMON_INTR_STATUS_CLEAR, - ACTMON_DEV_INTR_STATUS); + ret = tegra_actmon_resume(tegra); + if (ret) + tegra->started = false; } - actmon_write_barrier(tegra); + return ret; +} - enable_irq(tegra->irq); +static void tegra_actmon_pause(struct tegra_devfreq *tegra) +{ + if (!tegra->devfreq->profile->polling_ms || !tegra->started) + return; + + disable_irq(tegra->irq); + + cpufreq_unregister_notifier(&tegra->cpu_rate_change_nb, + CPUFREQ_TRANSITION_NOTIFIER); + + cancel_delayed_work_sync(&tegra->cpufreq_update_work); + + tegra_actmon_stop_devices(tegra); + + clk_notifier_unregister(tegra->emc_clock, &tegra->clk_rate_change_nb); +} + +static void tegra_actmon_stop(struct tegra_devfreq *tegra) +{ + tegra_actmon_pause(tegra); + tegra->started = false; } static int tegra_devfreq_target(struct device *dev, unsigned long *freq, @@ -463,7 +626,7 @@ static int tegra_devfreq_target(struct device *dev, unsigned long *freq, rate = dev_pm_opp_get_freq(opp); dev_pm_opp_put(opp); - err = clk_set_min_rate(tegra->emc_clock, rate); + err = clk_set_min_rate(tegra->emc_clock, rate * KHZ); if (err) return err; @@ -492,7 +655,7 @@ static int tegra_devfreq_get_dev_status(struct device *dev, stat->private_data = tegra; /* The below are to be used by the other governors */ - stat->current_frequency = cur_freq * KHZ; + stat->current_frequency = cur_freq; actmon_dev = &tegra->devices[MCALL]; @@ -503,7 +666,7 @@ static int tegra_devfreq_get_dev_status(struct device *dev, stat->busy_time *= 100 / BUS_SATURATION_RATIO; /* Number of cycles in a sampling period */ - stat->total_time = ACTMON_SAMPLING_PERIOD * cur_freq; + stat->total_time = tegra->devfreq->profile->polling_ms * cur_freq; stat->busy_time = min(stat->busy_time, stat->total_time); @@ -511,7 +674,7 @@ static int tegra_devfreq_get_dev_status(struct device *dev, } static struct devfreq_dev_profile tegra_devfreq_profile = { - .polling_ms = 0, + .polling_ms = ACTMON_SAMPLING_PERIOD, .target = tegra_devfreq_target, .get_dev_status = tegra_devfreq_get_dev_status, }; @@ -542,7 +705,7 @@ static int tegra_governor_get_target(struct devfreq *devfreq, target_freq = max(target_freq, dev->target_freq); } - *freq = target_freq * KHZ; + *freq = target_freq; return 0; } @@ -551,11 +714,19 @@ static int tegra_governor_event_handler(struct devfreq *devfreq, unsigned int event, void *data) { struct tegra_devfreq *tegra = dev_get_drvdata(devfreq->dev.parent); + unsigned int *new_delay = data; + int ret = 0; + + /* + * Couple devfreq-device with the governor early because it is + * needed at the moment of governor's start (used by ISR). + */ + tegra->devfreq = devfreq; switch (event) { case DEVFREQ_GOV_START: devfreq_monitor_start(devfreq); - tegra_actmon_start(tegra); + ret = tegra_actmon_start(tegra); break; case DEVFREQ_GOV_STOP: @@ -563,6 +734,21 @@ static int tegra_governor_event_handler(struct devfreq *devfreq, devfreq_monitor_stop(devfreq); break; + case DEVFREQ_GOV_INTERVAL: + /* + * ACTMON hardware supports up to 256 milliseconds for the + * sampling period. + */ + if (*new_delay > 256) { + ret = -EINVAL; + break; + } + + tegra_actmon_pause(tegra); + devfreq_interval_update(devfreq, new_delay); + ret = tegra_actmon_resume(tegra); + break; + case DEVFREQ_GOV_SUSPEND: tegra_actmon_stop(tegra); devfreq_monitor_suspend(devfreq); @@ -570,11 +756,11 @@ static int tegra_governor_event_handler(struct devfreq *devfreq, case DEVFREQ_GOV_RESUME: devfreq_monitor_resume(devfreq); - tegra_actmon_start(tegra); + ret = tegra_actmon_start(tegra); break; } - return 0; + return ret; } static struct devfreq_governor tegra_devfreq_governor = { @@ -582,14 +768,16 @@ static struct devfreq_governor tegra_devfreq_governor = { .get_target_freq = tegra_governor_get_target, .event_handler = tegra_governor_event_handler, .immutable = true, + .interrupt_driven = true, }; static int tegra_devfreq_probe(struct platform_device *pdev) { - struct tegra_devfreq *tegra; struct tegra_devfreq_device *dev; + struct tegra_devfreq *tegra; + struct devfreq *devfreq; unsigned int i; - unsigned long rate; + long rate; int err; tegra = devm_kzalloc(&pdev->dev, sizeof(*tegra), GFP_KERNEL); @@ -618,12 +806,22 @@ static int tegra_devfreq_probe(struct platform_device *pdev) return PTR_ERR(tegra->emc_clock); } - tegra->irq = platform_get_irq(pdev, 0); - if (tegra->irq < 0) { - err = tegra->irq; + err = platform_get_irq(pdev, 0); + if (err < 0) { dev_err(&pdev->dev, "Failed to get IRQ: %d\n", err); return err; } + tegra->irq = err; + + irq_set_status_flags(tegra->irq, IRQ_NOAUTOEN); + + err = devm_request_threaded_irq(&pdev->dev, tegra->irq, NULL, + actmon_thread_isr, IRQF_ONESHOT, + "tegra-devfreq", tegra); + if (err) { + dev_err(&pdev->dev, "Interrupt request failed: %d\n", err); + return err; + } reset_control_assert(tegra->reset); @@ -636,8 +834,13 @@ static int tegra_devfreq_probe(struct platform_device *pdev) reset_control_deassert(tegra->reset); - tegra->max_freq = clk_round_rate(tegra->emc_clock, ULONG_MAX) / KHZ; - tegra->cur_freq = clk_get_rate(tegra->emc_clock) / KHZ; + rate = clk_round_rate(tegra->emc_clock, ULONG_MAX); + if (rate < 0) { + dev_err(&pdev->dev, "Failed to round clock rate: %ld\n", rate); + return rate; + } + + tegra->max_freq = rate / KHZ; for (i = 0; i < ARRAY_SIZE(actmon_device_configs); i++) { dev = tegra->devices + i; @@ -648,7 +851,14 @@ static int tegra_devfreq_probe(struct platform_device *pdev) for (rate = 0; rate <= tegra->max_freq * KHZ; rate++) { rate = clk_round_rate(tegra->emc_clock, rate); - err = dev_pm_opp_add(&pdev->dev, rate, 0); + if (rate < 0) { + dev_err(&pdev->dev, + "Failed to round clock rate: %ld\n", rate); + err = rate; + goto remove_opps; + } + + err = dev_pm_opp_add(&pdev->dev, rate / KHZ, 0); if (err) { dev_err(&pdev->dev, "Failed to add OPP: %d\n", err); goto remove_opps; @@ -657,49 +867,33 @@ static int tegra_devfreq_probe(struct platform_device *pdev) platform_set_drvdata(pdev, tegra); - tegra->rate_change_nb.notifier_call = tegra_actmon_rate_notify_cb; - err = clk_notifier_register(tegra->emc_clock, &tegra->rate_change_nb); - if (err) { - dev_err(&pdev->dev, - "Failed to register rate change notifier\n"); - goto remove_opps; - } + tegra->clk_rate_change_nb.notifier_call = tegra_actmon_clk_notify_cb; + tegra->cpu_rate_change_nb.notifier_call = tegra_actmon_cpu_notify_cb; + + INIT_DELAYED_WORK(&tegra->cpufreq_update_work, + tegra_actmon_delayed_update); err = devfreq_add_governor(&tegra_devfreq_governor); if (err) { dev_err(&pdev->dev, "Failed to add governor: %d\n", err); - goto unreg_notifier; + goto remove_opps; } tegra_devfreq_profile.initial_freq = clk_get_rate(tegra->emc_clock); - tegra->devfreq = devfreq_add_device(&pdev->dev, - &tegra_devfreq_profile, - "tegra_actmon", - NULL); - if (IS_ERR(tegra->devfreq)) { - err = PTR_ERR(tegra->devfreq); - goto remove_governor; - } + tegra_devfreq_profile.initial_freq /= KHZ; - err = devm_request_threaded_irq(&pdev->dev, tegra->irq, NULL, - actmon_thread_isr, IRQF_ONESHOT, - "tegra-devfreq", tegra); - if (err) { - dev_err(&pdev->dev, "Interrupt request failed: %d\n", err); - goto remove_devfreq; + devfreq = devfreq_add_device(&pdev->dev, &tegra_devfreq_profile, + "tegra_actmon", NULL); + if (IS_ERR(devfreq)) { + err = PTR_ERR(devfreq); + goto remove_governor; } return 0; -remove_devfreq: - devfreq_remove_device(tegra->devfreq); - remove_governor: devfreq_remove_governor(&tegra_devfreq_governor); -unreg_notifier: - clk_notifier_unregister(tegra->emc_clock, &tegra->rate_change_nb); - remove_opps: dev_pm_opp_remove_all_dynamic(&pdev->dev); @@ -716,7 +910,6 @@ static int tegra_devfreq_remove(struct platform_device *pdev) devfreq_remove_device(tegra->devfreq); devfreq_remove_governor(&tegra_devfreq_governor); - clk_notifier_unregister(tegra->emc_clock, &tegra->rate_change_nb); dev_pm_opp_remove_all_dynamic(&pdev->dev); reset_control_reset(tegra->reset); diff --git a/drivers/mmc/host/tmio_mmc.h b/drivers/mmc/host/tmio_mmc.h index 2f0b092d6dcc..c5ba13fae399 100644 --- a/drivers/mmc/host/tmio_mmc.h +++ b/drivers/mmc/host/tmio_mmc.h @@ -163,7 +163,6 @@ struct tmio_mmc_host { unsigned long last_req_ts; struct mutex ios_lock; /* protect set_ios() context */ bool native_hotplug; - bool runtime_synced; bool sdio_irq_enabled; /* Mandatory callback */ diff --git a/drivers/mmc/host/tmio_mmc_core.c b/drivers/mmc/host/tmio_mmc_core.c index 9b6e1001e77c..86b591100f16 100644 --- a/drivers/mmc/host/tmio_mmc_core.c +++ b/drivers/mmc/host/tmio_mmc_core.c @@ -39,6 +39,7 @@ #include <linux/module.h> #include <linux/pagemap.h> #include <linux/platform_device.h> +#include <linux/pm_domain.h> #include <linux/pm_qos.h> #include <linux/pm_runtime.h> #include <linux/regulator/consumer.h> @@ -1248,10 +1249,12 @@ int tmio_mmc_host_probe(struct tmio_mmc_host *_host) /* See if we also get DMA */ tmio_mmc_request_dma(_host, pdata); + dev_pm_domain_start(&pdev->dev); + pm_runtime_get_noresume(&pdev->dev); + pm_runtime_set_active(&pdev->dev); pm_runtime_set_autosuspend_delay(&pdev->dev, 50); pm_runtime_use_autosuspend(&pdev->dev); pm_runtime_enable(&pdev->dev); - pm_runtime_get_sync(&pdev->dev); ret = mmc_add_host(mmc); if (ret) @@ -1333,11 +1336,6 @@ int tmio_mmc_host_runtime_resume(struct device *dev) { struct tmio_mmc_host *host = dev_get_drvdata(dev); - if (!host->runtime_synced) { - host->runtime_synced = true; - return 0; - } - tmio_mmc_clk_enable(host); tmio_mmc_hw_reset(host->mmc); diff --git a/drivers/opp/core.c b/drivers/opp/core.c index 9ff0538ee83a..be7a7d332332 100644 --- a/drivers/opp/core.c +++ b/drivers/opp/core.c @@ -2103,6 +2103,75 @@ put_table: } /** + * dev_pm_opp_adjust_voltage() - helper to change the voltage of an OPP + * @dev: device for which we do this operation + * @freq: OPP frequency to adjust voltage of + * @u_volt: new OPP target voltage + * @u_volt_min: new OPP min voltage + * @u_volt_max: new OPP max voltage + * + * Return: -EINVAL for bad pointers, -ENOMEM if no memory available for the + * copy operation, returns 0 if no modifcation was done OR modification was + * successful. + */ +int dev_pm_opp_adjust_voltage(struct device *dev, unsigned long freq, + unsigned long u_volt, unsigned long u_volt_min, + unsigned long u_volt_max) + +{ + struct opp_table *opp_table; + struct dev_pm_opp *tmp_opp, *opp = ERR_PTR(-ENODEV); + int r = 0; + + /* Find the opp_table */ + opp_table = _find_opp_table(dev); + if (IS_ERR(opp_table)) { + r = PTR_ERR(opp_table); + dev_warn(dev, "%s: Device OPP not found (%d)\n", __func__, r); + return r; + } + + mutex_lock(&opp_table->lock); + + /* Do we have the frequency? */ + list_for_each_entry(tmp_opp, &opp_table->opp_list, node) { + if (tmp_opp->rate == freq) { + opp = tmp_opp; + break; + } + } + + if (IS_ERR(opp)) { + r = PTR_ERR(opp); + goto adjust_unlock; + } + + /* Is update really needed? */ + if (opp->supplies->u_volt == u_volt) + goto adjust_unlock; + + opp->supplies->u_volt = u_volt; + opp->supplies->u_volt_min = u_volt_min; + opp->supplies->u_volt_max = u_volt_max; + + dev_pm_opp_get(opp); + mutex_unlock(&opp_table->lock); + + /* Notify the voltage change of the OPP */ + blocking_notifier_call_chain(&opp_table->head, OPP_EVENT_ADJUST_VOLTAGE, + opp); + + dev_pm_opp_put(opp); + goto adjust_put_table; + +adjust_unlock: + mutex_unlock(&opp_table->lock); +adjust_put_table: + dev_pm_opp_put_opp_table(opp_table); + return r; +} + +/** * dev_pm_opp_enable() - Enable a specific OPP * @dev: device for which we do this operation * @freq: OPP frequency to enable diff --git a/drivers/power/avs/smartreflex.c b/drivers/power/avs/smartreflex.c index 4684e7df833a..5376f3d22f31 100644 --- a/drivers/power/avs/smartreflex.c +++ b/drivers/power/avs/smartreflex.c @@ -905,7 +905,7 @@ static int omap_sr_probe(struct platform_device *pdev) sr_info->dbg_dir = debugfs_create_dir(sr_info->name, sr_dbg_dir); debugfs_create_file("autocomp", S_IRUGO | S_IWUSR, sr_info->dbg_dir, - (void *)sr_info, &pm_sr_fops); + sr_info, &pm_sr_fops); debugfs_create_x32("errweight", S_IRUGO, sr_info->dbg_dir, &sr_info->err_weight); debugfs_create_x32("errmaxlimit", S_IRUGO, sr_info->dbg_dir, diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c index 94ddd7d659c8..a67701ed93e8 100644 --- a/drivers/powercap/intel_rapl_common.c +++ b/drivers/powercap/intel_rapl_common.c @@ -978,6 +978,8 @@ static const struct x86_cpu_id rapl_ids[] __initconst = { INTEL_CPU_FAM6(ICELAKE_NNPI, rapl_defaults_core), INTEL_CPU_FAM6(ICELAKE_X, rapl_defaults_hsw_server), INTEL_CPU_FAM6(ICELAKE_D, rapl_defaults_hsw_server), + INTEL_CPU_FAM6(COMETLAKE_L, rapl_defaults_core), + INTEL_CPU_FAM6(COMETLAKE, rapl_defaults_core), INTEL_CPU_FAM6(ATOM_SILVERMONT, rapl_defaults_byt), INTEL_CPU_FAM6(ATOM_AIRMONT, rapl_defaults_cht), diff --git a/include/dt-bindings/pmu/exynos_ppmu.h b/include/dt-bindings/pmu/exynos_ppmu.h new file mode 100644 index 000000000000..8724abe130f3 --- /dev/null +++ b/include/dt-bindings/pmu/exynos_ppmu.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Samsung Exynos PPMU event types for counting in regs + * + * Copyright (c) 2019, Samsung Electronics + * Author: Lukasz Luba <l.luba@partner.samsung.com> + */ + +#ifndef __DT_BINDINGS_PMU_EXYNOS_PPMU_H +#define __DT_BINDINGS_PMU_EXYNOS_PPMU_H + +#define PPMU_RO_BUSY_CYCLE_CNT 0x0 +#define PPMU_WO_BUSY_CYCLE_CNT 0x1 +#define PPMU_RW_BUSY_CYCLE_CNT 0x2 +#define PPMU_RO_REQUEST_CNT 0x3 +#define PPMU_WO_REQUEST_CNT 0x4 +#define PPMU_RO_DATA_CNT 0x5 +#define PPMU_WO_DATA_CNT 0x6 +#define PPMU_RO_LATENCY 0x12 +#define PPMU_WO_LATENCY 0x16 +#define PPMU_V2_RO_DATA_CNT 0x4 +#define PPMU_V2_WO_DATA_CNT 0x5 +#define PPMU_V2_EVT3_RW_DATA_CNT 0x22 + +#endif diff --git a/include/linux/cpu.h b/include/linux/cpu.h index bc6c879bd110..1ca2baf817ed 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -184,7 +184,12 @@ void arch_cpu_idle_dead(void); int cpu_report_state(int cpu); int cpu_check_up_prepare(int cpu); void cpu_set_state_online(int cpu); -void play_idle(unsigned long duration_us); +void play_idle_precise(u64 duration_ns, u64 latency_ns); + +static inline void play_idle(unsigned long duration_us) +{ + play_idle_precise(duration_us * NSEC_PER_USEC, U64_MAX); +} #ifdef CONFIG_HOTPLUG_CPU bool cpu_wait_death(unsigned int cpu, int seconds); diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h index 4b6b5bea8f79..2dbe46b7c213 100644 --- a/include/linux/cpuidle.h +++ b/include/linux/cpuidle.h @@ -29,10 +29,13 @@ struct cpuidle_driver; * CPUIDLE DEVICE INTERFACE * ****************************/ +#define CPUIDLE_STATE_DISABLED_BY_USER BIT(0) +#define CPUIDLE_STATE_DISABLED_BY_DRIVER BIT(1) + struct cpuidle_state_usage { unsigned long long disable; unsigned long long usage; - unsigned long long time; /* in US */ + u64 time_ns; unsigned long long above; /* Number of times it's been too deep */ unsigned long long below; /* Number of times it's been too shallow */ #ifdef CONFIG_SUSPEND @@ -45,6 +48,8 @@ struct cpuidle_state { char name[CPUIDLE_NAME_LEN]; char desc[CPUIDLE_DESC_LEN]; + u64 exit_latency_ns; + u64 target_residency_ns; unsigned int flags; unsigned int exit_latency; /* in US */ int power_usage; /* in mW */ @@ -80,14 +85,14 @@ struct cpuidle_driver_kobj; struct cpuidle_device { unsigned int registered:1; unsigned int enabled:1; - unsigned int use_deepest_state:1; unsigned int poll_time_limit:1; unsigned int cpu; ktime_t next_hrtimer; int last_state_idx; - int last_residency; + u64 last_residency_ns; u64 poll_limit_ns; + u64 forced_idle_latency_limit_ns; struct cpuidle_state_usage states_usage[CPUIDLE_STATE_MAX]; struct cpuidle_state_kobj *kobjs[CPUIDLE_STATE_MAX]; struct cpuidle_driver_kobj *kobj_driver; @@ -144,6 +149,8 @@ extern int cpuidle_register_driver(struct cpuidle_driver *drv); extern struct cpuidle_driver *cpuidle_get_driver(void); extern struct cpuidle_driver *cpuidle_driver_ref(void); extern void cpuidle_driver_unref(void); +extern void cpuidle_driver_state_disabled(struct cpuidle_driver *drv, int idx, + bool disable); extern void cpuidle_unregister_driver(struct cpuidle_driver *drv); extern int cpuidle_register_device(struct cpuidle_device *dev); extern void cpuidle_unregister_device(struct cpuidle_device *dev); @@ -181,6 +188,8 @@ static inline int cpuidle_register_driver(struct cpuidle_driver *drv) static inline struct cpuidle_driver *cpuidle_get_driver(void) {return NULL; } static inline struct cpuidle_driver *cpuidle_driver_ref(void) {return NULL; } static inline void cpuidle_driver_unref(void) {} +static inline void cpuidle_driver_state_disabled(struct cpuidle_driver *drv, + int idx, bool disable) { } static inline void cpuidle_unregister_driver(struct cpuidle_driver *drv) { } static inline int cpuidle_register_device(struct cpuidle_device *dev) {return -ENODEV; } @@ -204,18 +213,20 @@ static inline struct cpuidle_device *cpuidle_get_device(void) {return NULL; } #ifdef CONFIG_CPU_IDLE extern int cpuidle_find_deepest_state(struct cpuidle_driver *drv, - struct cpuidle_device *dev); + struct cpuidle_device *dev, + u64 latency_limit_ns); extern int cpuidle_enter_s2idle(struct cpuidle_driver *drv, struct cpuidle_device *dev); -extern void cpuidle_use_deepest_state(bool enable); +extern void cpuidle_use_deepest_state(u64 latency_limit_ns); #else static inline int cpuidle_find_deepest_state(struct cpuidle_driver *drv, - struct cpuidle_device *dev) + struct cpuidle_device *dev, + u64 latency_limit_ns) {return -ENODEV; } static inline int cpuidle_enter_s2idle(struct cpuidle_driver *drv, struct cpuidle_device *dev) {return -ENODEV; } -static inline void cpuidle_use_deepest_state(bool enable) +static inline void cpuidle_use_deepest_state(u64 latency_limit_ns) { } #endif @@ -260,7 +271,7 @@ struct cpuidle_governor { #ifdef CONFIG_CPU_IDLE extern int cpuidle_register_governor(struct cpuidle_governor *gov); -extern int cpuidle_governor_latency_req(unsigned int cpu); +extern s64 cpuidle_governor_latency_req(unsigned int cpu); #else static inline int cpuidle_register_governor(struct cpuidle_governor *gov) {return 0;} diff --git a/include/linux/pm.h b/include/linux/pm.h index 4c441be03079..e057d1fa2469 100644 --- a/include/linux/pm.h +++ b/include/linux/pm.h @@ -637,6 +637,7 @@ extern void dev_pm_put_subsys_data(struct device *dev); * struct dev_pm_domain - power management domain representation. * * @ops: Power management operations associated with this domain. + * @start: Called when a user needs to start the device via the domain. * @detach: Called when removing a device from the domain. * @activate: Called before executing probe routines for bus types and drivers. * @sync: Called after successful driver probe. @@ -648,6 +649,7 @@ extern void dev_pm_put_subsys_data(struct device *dev); */ struct dev_pm_domain { struct dev_pm_ops ops; + int (*start)(struct device *dev); void (*detach)(struct device *dev, bool power_off); int (*activate)(struct device *dev); void (*sync)(struct device *dev); diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h index baf02ff91a31..5a31c711b896 100644 --- a/include/linux/pm_domain.h +++ b/include/linux/pm_domain.h @@ -366,6 +366,7 @@ struct device *dev_pm_domain_attach_by_id(struct device *dev, struct device *dev_pm_domain_attach_by_name(struct device *dev, const char *name); void dev_pm_domain_detach(struct device *dev, bool power_off); +int dev_pm_domain_start(struct device *dev); void dev_pm_domain_set(struct device *dev, struct dev_pm_domain *pd); #else static inline int dev_pm_domain_attach(struct device *dev, bool power_on) @@ -383,6 +384,10 @@ static inline struct device *dev_pm_domain_attach_by_name(struct device *dev, return NULL; } static inline void dev_pm_domain_detach(struct device *dev, bool power_off) {} +static inline int dev_pm_domain_start(struct device *dev) +{ + return 0; +} static inline void dev_pm_domain_set(struct device *dev, struct dev_pm_domain *pd) {} #endif diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h index b8197ab014f2..747861816f4f 100644 --- a/include/linux/pm_opp.h +++ b/include/linux/pm_opp.h @@ -22,6 +22,7 @@ struct opp_table; enum dev_pm_opp_event { OPP_EVENT_ADD, OPP_EVENT_REMOVE, OPP_EVENT_ENABLE, OPP_EVENT_DISABLE, + OPP_EVENT_ADJUST_VOLTAGE, }; /** @@ -113,6 +114,10 @@ int dev_pm_opp_add(struct device *dev, unsigned long freq, void dev_pm_opp_remove(struct device *dev, unsigned long freq); void dev_pm_opp_remove_all_dynamic(struct device *dev); +int dev_pm_opp_adjust_voltage(struct device *dev, unsigned long freq, + unsigned long u_volt, unsigned long u_volt_min, + unsigned long u_volt_max); + int dev_pm_opp_enable(struct device *dev, unsigned long freq); int dev_pm_opp_disable(struct device *dev, unsigned long freq); @@ -242,6 +247,14 @@ static inline void dev_pm_opp_remove_all_dynamic(struct device *dev) { } +static inline int +dev_pm_opp_adjust_voltage(struct device *dev, unsigned long freq, + unsigned long u_volt, unsigned long u_volt_min, + unsigned long u_volt_max) +{ + return 0; +} + static inline int dev_pm_opp_enable(struct device *dev, unsigned long freq) { return 0; diff --git a/include/linux/power/smartreflex.h b/include/linux/power/smartreflex.h index d0b37e937037..971c9264179e 100644 --- a/include/linux/power/smartreflex.h +++ b/include/linux/power/smartreflex.h @@ -293,6 +293,9 @@ struct omap_sr_data { struct voltagedomain *voltdm; }; + +extern struct omap_sr_data omap_sr_pdata[OMAP_SR_NR]; + #ifdef CONFIG_POWER_AVS_OMAP /* Smartreflex module enable/disable interface */ diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c index 83105874f255..26b9168321e7 100644 --- a/kernel/power/snapshot.c +++ b/kernel/power/snapshot.c @@ -734,8 +734,15 @@ zone_found: * We have found the zone. Now walk the radix tree to find the leaf node * for our PFN. */ + + /* + * If the zone we wish to scan is the the current zone and the + * pfn falls into the current node then we do not need to walk + * the tree. + */ node = bm->cur.node; - if (((pfn - zone->start_pfn) & ~BM_BLOCK_MASK) == bm->cur.node_pfn) + if (zone == bm->cur.zone && + ((pfn - zone->start_pfn) & ~BM_BLOCK_MASK) == bm->cur.node_pfn) goto node_found; node = zone->rtree; diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c index 428cd05c0b5d..ffa959e91227 100644 --- a/kernel/sched/idle.c +++ b/kernel/sched/idle.c @@ -104,7 +104,7 @@ static int call_cpuidle(struct cpuidle_driver *drv, struct cpuidle_device *dev, * update no idle residency and return. */ if (current_clr_polling_and_test()) { - dev->last_residency = 0; + dev->last_residency_ns = 0; local_irq_enable(); return -EBUSY; } @@ -165,7 +165,9 @@ static void cpuidle_idle_call(void) * until a proper wakeup interrupt happens. */ - if (idle_should_enter_s2idle() || dev->use_deepest_state) { + if (idle_should_enter_s2idle() || dev->forced_idle_latency_limit_ns) { + u64 max_latency_ns; + if (idle_should_enter_s2idle()) { rcu_idle_enter(); @@ -176,12 +178,16 @@ static void cpuidle_idle_call(void) } rcu_idle_exit(); + + max_latency_ns = U64_MAX; + } else { + max_latency_ns = dev->forced_idle_latency_limit_ns; } tick_nohz_idle_stop_tick(); rcu_idle_enter(); - next_state = cpuidle_find_deepest_state(drv, dev); + next_state = cpuidle_find_deepest_state(drv, dev, max_latency_ns); call_cpuidle(drv, dev, next_state); } else { bool stop_tick = true; @@ -311,7 +317,7 @@ static enum hrtimer_restart idle_inject_timer_fn(struct hrtimer *timer) return HRTIMER_NORESTART; } -void play_idle(unsigned long duration_us) +void play_idle_precise(u64 duration_ns, u64 latency_ns) { struct idle_timer it; @@ -323,29 +329,29 @@ void play_idle(unsigned long duration_us) WARN_ON_ONCE(current->nr_cpus_allowed != 1); WARN_ON_ONCE(!(current->flags & PF_KTHREAD)); WARN_ON_ONCE(!(current->flags & PF_NO_SETAFFINITY)); - WARN_ON_ONCE(!duration_us); + WARN_ON_ONCE(!duration_ns); rcu_sleep_check(); preempt_disable(); current->flags |= PF_IDLE; - cpuidle_use_deepest_state(true); + cpuidle_use_deepest_state(latency_ns); it.done = 0; hrtimer_init_on_stack(&it.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); it.timer.function = idle_inject_timer_fn; - hrtimer_start(&it.timer, ns_to_ktime(duration_us * NSEC_PER_USEC), + hrtimer_start(&it.timer, ns_to_ktime(duration_ns), HRTIMER_MODE_REL_PINNED); while (!READ_ONCE(it.done)) do_idle(); - cpuidle_use_deepest_state(false); + cpuidle_use_deepest_state(0); current->flags &= ~PF_IDLE; preempt_fold_need_resched(); preempt_enable(); } -EXPORT_SYMBOL_GPL(play_idle); +EXPORT_SYMBOL_GPL(play_idle_precise); void cpu_startup_entry(enum cpuhp_state state) { diff --git a/tools/power/cpupower/ToDo b/tools/power/cpupower/ToDo index 6e8b89f282e6..b196a139a3e4 100644 --- a/tools/power/cpupower/ToDo +++ b/tools/power/cpupower/ToDo @@ -8,3 +8,17 @@ ToDos sorted by priority: - Add another c1e debug idle monitor -> Is by design racy with BIOS, but could be added with a --force option and some "be careful" messages +- Add cpu_start()/cpu_stop() callbacks for monitor + -> This is to move the per_cpu logic from inside the + monitor to outside it. This can be given higher + priority in fork_it. +- Fork as many processes as there are CPUs in case the + per_cpu_schedule flag is set. + -> Bind forked process to each cpu. + -> Execute start measures via the forked processes on + each cpu. + -> Run test executable in a forked process. + -> Execute stop measures via the forked processes on + each cpu. + This would be ideal as it will not introduce noise in the + tested executable. diff --git a/tools/power/cpupower/utils/cpupower-info.c b/tools/power/cpupower/utils/cpupower-info.c index 4c9d342b70ff..d3755ea70d4d 100644 --- a/tools/power/cpupower/utils/cpupower-info.c +++ b/tools/power/cpupower/utils/cpupower-info.c @@ -10,6 +10,7 @@ #include <errno.h> #include <string.h> #include <getopt.h> +#include <sys/utsname.h> #include "helpers/helpers.h" #include "helpers/sysfs.h" @@ -30,6 +31,7 @@ int cmd_info(int argc, char **argv) extern char *optarg; extern int optind, opterr, optopt; unsigned int cpu; + struct utsname uts; union { struct { @@ -39,6 +41,13 @@ int cmd_info(int argc, char **argv) } params = {}; int ret = 0; + ret = uname(&uts); + if (!ret && (!strcmp(uts.machine, "ppc64le") || + !strcmp(uts.machine, "ppc64"))) { + fprintf(stderr, _("Subcommand not supported on POWER.\n")); + return ret; + } + setlocale(LC_ALL, ""); textdomain(PACKAGE); diff --git a/tools/power/cpupower/utils/cpupower-set.c b/tools/power/cpupower/utils/cpupower-set.c index 3cd95c6cb974..3cca6f715dd9 100644 --- a/tools/power/cpupower/utils/cpupower-set.c +++ b/tools/power/cpupower/utils/cpupower-set.c @@ -10,6 +10,7 @@ #include <errno.h> #include <string.h> #include <getopt.h> +#include <sys/utsname.h> #include "helpers/helpers.h" #include "helpers/sysfs.h" @@ -31,6 +32,7 @@ int cmd_set(int argc, char **argv) extern char *optarg; extern int optind, opterr, optopt; unsigned int cpu; + struct utsname uts; union { struct { @@ -41,6 +43,13 @@ int cmd_set(int argc, char **argv) int perf_bias = 0; int ret = 0; + ret = uname(&uts); + if (!ret && (!strcmp(uts.machine, "ppc64le") || + !strcmp(uts.machine, "ppc64"))) { + fprintf(stderr, _("Subcommand not supported on POWER.\n")); + return ret; + } + setlocale(LC_ALL, ""); textdomain(PACKAGE); diff --git a/tools/power/cpupower/utils/helpers/cpuid.c b/tools/power/cpupower/utils/helpers/cpuid.c index 5cc39d4e23ed..73bfafc60e9b 100644 --- a/tools/power/cpupower/utils/helpers/cpuid.c +++ b/tools/power/cpupower/utils/helpers/cpuid.c @@ -131,6 +131,10 @@ out: if (ext_cpuid_level >= 0x80000007 && (cpuid_edx(0x80000007) & (1 << 9))) cpu_info->caps |= CPUPOWER_CAP_AMD_CBP; + + if (ext_cpuid_level >= 0x80000008 && + cpuid_ebx(0x80000008) & (1 << 4)) + cpu_info->caps |= CPUPOWER_CAP_AMD_RDPRU; } if (cpu_info->vendor == X86_VENDOR_INTEL) { diff --git a/tools/power/cpupower/utils/helpers/helpers.h b/tools/power/cpupower/utils/helpers/helpers.h index 357b19bb136e..c258eeccd05f 100644 --- a/tools/power/cpupower/utils/helpers/helpers.h +++ b/tools/power/cpupower/utils/helpers/helpers.h @@ -69,6 +69,7 @@ enum cpupower_cpu_vendor {X86_VENDOR_UNKNOWN = 0, X86_VENDOR_INTEL, #define CPUPOWER_CAP_HAS_TURBO_RATIO 0x00000010 #define CPUPOWER_CAP_IS_SNB 0x00000020 #define CPUPOWER_CAP_INTEL_IDA 0x00000040 +#define CPUPOWER_CAP_AMD_RDPRU 0x00000080 #define CPUPOWER_AMD_CPBDIS 0x02000000 diff --git a/tools/power/cpupower/utils/idle_monitor/amd_fam14h_idle.c b/tools/power/cpupower/utils/idle_monitor/amd_fam14h_idle.c index 3f893b99b337..33dc34db4f3c 100644 --- a/tools/power/cpupower/utils/idle_monitor/amd_fam14h_idle.c +++ b/tools/power/cpupower/utils/idle_monitor/amd_fam14h_idle.c @@ -328,7 +328,7 @@ struct cpuidle_monitor amd_fam14h_monitor = { .stop = amd_fam14h_stop, .do_register = amd_fam14h_register, .unregister = amd_fam14h_unregister, - .needs_root = 1, + .flags.needs_root = 1, .overflow_s = OVERFLOW_MS / 1000, }; #endif /* #if defined(__i386__) || defined(__x86_64__) */ diff --git a/tools/power/cpupower/utils/idle_monitor/cpuidle_sysfs.c b/tools/power/cpupower/utils/idle_monitor/cpuidle_sysfs.c index f634aeb65c5f..3c4cee160b0e 100644 --- a/tools/power/cpupower/utils/idle_monitor/cpuidle_sysfs.c +++ b/tools/power/cpupower/utils/idle_monitor/cpuidle_sysfs.c @@ -207,6 +207,6 @@ struct cpuidle_monitor cpuidle_sysfs_monitor = { .stop = cpuidle_stop, .do_register = cpuidle_register, .unregister = cpuidle_unregister, - .needs_root = 0, + .flags.needs_root = 0, .overflow_s = UINT_MAX, }; diff --git a/tools/power/cpupower/utils/idle_monitor/cpupower-monitor.c b/tools/power/cpupower/utils/idle_monitor/cpupower-monitor.c index d3c3e6e7aa26..6d44fec55ad5 100644 --- a/tools/power/cpupower/utils/idle_monitor/cpupower-monitor.c +++ b/tools/power/cpupower/utils/idle_monitor/cpupower-monitor.c @@ -408,7 +408,7 @@ int cmd_monitor(int argc, char **argv) dprint("Try to register: %s\n", all_monitors[num]->name); test_mon = all_monitors[num]->do_register(); if (test_mon) { - if (test_mon->needs_root && !run_as_root) { + if (test_mon->flags.needs_root && !run_as_root) { fprintf(stderr, _("Available monitor %s needs " "root access\n"), test_mon->name); continue; diff --git a/tools/power/cpupower/utils/idle_monitor/cpupower-monitor.h b/tools/power/cpupower/utils/idle_monitor/cpupower-monitor.h index a2d901d3bfaf..5b5eb1da0cce 100644 --- a/tools/power/cpupower/utils/idle_monitor/cpupower-monitor.h +++ b/tools/power/cpupower/utils/idle_monitor/cpupower-monitor.h @@ -60,7 +60,10 @@ struct cpuidle_monitor { struct cpuidle_monitor* (*do_register) (void); void (*unregister)(void); unsigned int overflow_s; - int needs_root; + struct { + unsigned int needs_root:1; + unsigned int per_cpu_schedule:1; + } flags; }; extern long long timespec_diff_us(struct timespec start, struct timespec end); diff --git a/tools/power/cpupower/utils/idle_monitor/hsw_ext_idle.c b/tools/power/cpupower/utils/idle_monitor/hsw_ext_idle.c index 7c7451d3f494..97ad3233a521 100644 --- a/tools/power/cpupower/utils/idle_monitor/hsw_ext_idle.c +++ b/tools/power/cpupower/utils/idle_monitor/hsw_ext_idle.c @@ -39,7 +39,6 @@ static cstate_t hsw_ext_cstates[HSW_EXT_CSTATE_COUNT] = { { .name = "PC9", .desc = N_("Processor Package C9"), - .desc = N_("Processor Package C2"), .id = PC9, .range = RANGE_PACKAGE, .get_count_percent = hsw_ext_get_count_percent, @@ -188,7 +187,7 @@ struct cpuidle_monitor intel_hsw_ext_monitor = { .stop = hsw_ext_stop, .do_register = hsw_ext_register, .unregister = hsw_ext_unregister, - .needs_root = 1, + .flags.needs_root = 1, .overflow_s = 922000000 /* 922337203 seconds TSC overflow at 20GHz */ }; diff --git a/tools/power/cpupower/utils/idle_monitor/mperf_monitor.c b/tools/power/cpupower/utils/idle_monitor/mperf_monitor.c index 44806a6dae11..e7d48cb563c0 100644 --- a/tools/power/cpupower/utils/idle_monitor/mperf_monitor.c +++ b/tools/power/cpupower/utils/idle_monitor/mperf_monitor.c @@ -19,6 +19,10 @@ #define MSR_APERF 0xE8 #define MSR_MPERF 0xE7 +#define RDPRU ".byte 0x0f, 0x01, 0xfd" +#define RDPRU_ECX_MPERF 0 +#define RDPRU_ECX_APERF 1 + #define MSR_TSC 0x10 #define MSR_AMD_HWCR 0xc0010015 @@ -86,15 +90,51 @@ static int mperf_get_tsc(unsigned long long *tsc) return ret; } +static int get_aperf_mperf(int cpu, unsigned long long *aval, + unsigned long long *mval) +{ + unsigned long low_a, high_a; + unsigned long low_m, high_m; + int ret; + + /* + * Running on the cpu from which we read the registers will + * prevent APERF/MPERF from going out of sync because of IPI + * latency introduced by read_msr()s. + */ + if (mperf_monitor.flags.per_cpu_schedule) { + if (bind_cpu(cpu)) + return 1; + } + + if (cpupower_cpu_info.caps & CPUPOWER_CAP_AMD_RDPRU) { + asm volatile(RDPRU + : "=a" (low_a), "=d" (high_a) + : "c" (RDPRU_ECX_APERF)); + asm volatile(RDPRU + : "=a" (low_m), "=d" (high_m) + : "c" (RDPRU_ECX_MPERF)); + + *aval = ((low_a) | (high_a) << 32); + *mval = ((low_m) | (high_m) << 32); + + return 0; + } + + ret = read_msr(cpu, MSR_APERF, aval); + ret |= read_msr(cpu, MSR_MPERF, mval); + + return ret; +} + static int mperf_init_stats(unsigned int cpu) { - unsigned long long val; + unsigned long long aval, mval; int ret; - ret = read_msr(cpu, MSR_APERF, &val); - aperf_previous_count[cpu] = val; - ret |= read_msr(cpu, MSR_MPERF, &val); - mperf_previous_count[cpu] = val; + ret = get_aperf_mperf(cpu, &aval, &mval); + aperf_previous_count[cpu] = aval; + mperf_previous_count[cpu] = mval; is_valid[cpu] = !ret; return 0; @@ -102,13 +142,12 @@ static int mperf_init_stats(unsigned int cpu) static int mperf_measure_stats(unsigned int cpu) { - unsigned long long val; + unsigned long long aval, mval; int ret; - ret = read_msr(cpu, MSR_APERF, &val); - aperf_current_count[cpu] = val; - ret |= read_msr(cpu, MSR_MPERF, &val); - mperf_current_count[cpu] = val; + ret = get_aperf_mperf(cpu, &aval, &mval); + aperf_current_count[cpu] = aval; + mperf_current_count[cpu] = mval; is_valid[cpu] = !ret; return 0; @@ -305,6 +344,9 @@ struct cpuidle_monitor *mperf_register(void) if (init_maxfreq_mode()) return NULL; + if (cpupower_cpu_info.vendor == X86_VENDOR_AMD) + mperf_monitor.flags.per_cpu_schedule = 1; + /* Free this at program termination */ is_valid = calloc(cpu_count, sizeof(int)); mperf_previous_count = calloc(cpu_count, sizeof(unsigned long long)); @@ -333,7 +375,7 @@ struct cpuidle_monitor mperf_monitor = { .stop = mperf_stop, .do_register = mperf_register, .unregister = mperf_unregister, - .needs_root = 1, + .flags.needs_root = 1, .overflow_s = 922000000 /* 922337203 seconds TSC overflow at 20GHz */ }; diff --git a/tools/power/cpupower/utils/idle_monitor/nhm_idle.c b/tools/power/cpupower/utils/idle_monitor/nhm_idle.c index be7256696a37..114271165182 100644 --- a/tools/power/cpupower/utils/idle_monitor/nhm_idle.c +++ b/tools/power/cpupower/utils/idle_monitor/nhm_idle.c @@ -208,7 +208,7 @@ struct cpuidle_monitor intel_nhm_monitor = { .stop = nhm_stop, .do_register = intel_nhm_register, .unregister = intel_nhm_unregister, - .needs_root = 1, + .flags.needs_root = 1, .overflow_s = 922000000 /* 922337203 seconds TSC overflow at 20GHz */ }; diff --git a/tools/power/cpupower/utils/idle_monitor/snb_idle.c b/tools/power/cpupower/utils/idle_monitor/snb_idle.c index 968333571cad..df8b223cc096 100644 --- a/tools/power/cpupower/utils/idle_monitor/snb_idle.c +++ b/tools/power/cpupower/utils/idle_monitor/snb_idle.c @@ -192,7 +192,7 @@ struct cpuidle_monitor intel_snb_monitor = { .stop = snb_stop, .do_register = snb_register, .unregister = snb_unregister, - .needs_root = 1, + .flags.needs_root = 1, .overflow_s = 922000000 /* 922337203 seconds TSC overflow at 20GHz */ }; |