summaryrefslogtreecommitdiffstats
path: root/tools/power/cpupower/utils
Commit message (Collapse)AuthorAgeFilesLines
* pm: cpupower: remove hard-coded topology depth valuesShuah Khan2025-03-061-11/+29
| | | | | | | | | Remove hard-coded topology depth values and replace them with defines to improve code readability and maintainability in cpupower-monitor code. Link: https://lore.kernel.org/r/20250305225342.19447-3-skhan@linuxfoundation.org Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* pm: cpupower: Fix cmd_monitor() error legs to free cpu_topologyShuah Khan2025-03-061-0/+2
| | | | | | | | | | cmd_monitor() calls get_cpu_topology() to allocate memory for cpu topology and fails to release in error legs. Fix it to call cpu_topology_release() from error legs. Link: https://lore.kernel.org/r/20250305225342.19447-2-skhan@linuxfoundation.org Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: monitor: Exit with error status if execvp() failYiwei Lin2025-02-201-1/+5
| | | | | | | | | | | | In the case that we give a invalid command to idle_monitor for monitoring, the execvp() will fail and thus go to the next line. As a result, we'll see two differnt monitoring output. For example, running `cpupower monitor -i 5 invalidcmd` which `invalidcmd` is not executable. Link: https://lore.kernel.org/r/20250220163846.2765-1-s921975628@gmail.com Signed-off-by: Yiwei Lin <s921975628@gmail.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Adjust whitespace for amd-pstate specific printsMario Limonciello2024-12-201-4/+6
| | | | | | | | | | The amd-pstate section is grouped under boost, which isn't appropriate. Adjust the indentation so that it is it's own section. Link: https://lore.kernel.org/r/20241218191144.3440854-8-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Don't fetch maximum latency when EPP is enabledMario Limonciello2024-12-201-0/+3
| | | | | | | | | | | | | When EPP has been enabled the hardware will autonomously change frequencies on it's own and thus there is no latency with changing from the kernel. Avoid doing the maximum latency check when EPP is found. This will apply to both amd-pstate and intel-pstate drivers. Link: https://lore.kernel.org/r/20241218191144.3440854-7-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Add support for showing energy performance preferenceMario Limonciello2024-12-201-1/+24
| | | | | | | | | The EPP value is useful for characterization of performance. Show it in cpupower frequency-info output. Link: https://lore.kernel.org/r/20241218191144.3440854-6-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Don't try to read frequency from hardware when kernel uses aperfmperfMario Limonciello2024-12-201-1/+6
| | | | | | | | | | | | | When the amd-pstate is in use frequency is set by the hardware and measured by the kernel through using the aperf and mperf registers. There is no direct call to the hardware to indicate current frequency. Detect that this feature is in use and skip the check. Link: https://lore.kernel.org/r/20241218191144.3440854-5-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Add support for amd-pstate preferred core rankingsMario Limonciello2024-12-201-0/+8
| | | | | | | | | The rankings are useful information to determine if the scheduler is placing tasks appropriately for the hardware. Link: https://lore.kernel.org/r/20241218191144.3440854-4-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Remove spurious return statementMario Limonciello2024-12-201-1/+0
| | | | | | | | | print_duration() has a return; statement at the end of the function that is not necessary as it's a void function. Link: https://lore.kernel.org/r/20241218191144.3440854-2-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: fix TSC MHz calculationHe Rongguang2024-12-161-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 'cpupower: Make TSC read per CPU for Mperf monitor' (c2adb1877b7) changes TSC counter reads per cpu, but left time diff global (from start of all cpus to end of all cpus), thus diff(time) is too large for a cpu's tsc counting, resulting in far less than acutal TSC_Mhz and thus `cpupower monitor` showing far less than actual cpu realtime frequency. /proc/cpuinfo shows frequency: cat /proc/cpuinfo | egrep -e 'processor' -e 'MHz' ... processor : 171 cpu MHz : 4108.498 ... before fix (System 100% busy): | Mperf || Idle_Stats CPU| C0 | Cx | Freq || POLL | C1 | C2 171| 0.77| 99.23| 2279|| 0.00| 0.00| 0.00 after fix (System 100% busy): | Mperf || Idle_Stats CPU| C0 | Cx | Freq || POLL | C1 | C2 171| 0.46| 99.54| 4095|| 0.00| 0.00| 0.00 Fixes: c2adb1877b76 ("cpupower: Make TSC read per CPU for Mperf monitor") Signed-off-by: He Rongguang <herongguang@linux.alibaba.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: revise is_valid flag handling for idle_monitorwangfushuai2024-12-094-6/+6
| | | | | | | | | | | The is_valid flag should reflect the validity state of both the XXX_start and XXX_stop functions. But the use of '=' in XXX_stop overwrites the validity state set by XXX_start. This commit changes '=' to '|=' in XXX_stop to preserve and combine the validity state of XXX_start and XXX_stop. Signed-off-by: wangfushuai <wangfushuai@baidu.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* tools/cpupower: display residency value in idle-infoAboorva Devarajan2024-08-091-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update cpuidle tool to display the residency value of cpuidle states. This addition provides a clearer and more detailed view of idle state information when using cpuidle-info. -------------------------------- Before Patch: -------------------------------- $ cpupower idle-info CPUidle driver: intel_idle CPUidle governor: menu analyzing CPU 28: Number of idle states: 3 Available idle states: POLL C1 C1E POLL: Flags/Description: CPUIDLE CORE POLL IDLE Latency: 0 Usage: 7448 Duration: 207170 C1: Flags/Description: MWAIT 0x00 Latency: 2 Usage: 7023 Duration: 3736853 C1E: Flags/Description: MWAIT 0x01 Latency: 10 Usage: 18468 Duration: 11396212 -------------------------------- After Patch: -------------------------------- $ cpupower idle-info CPUidle driver: intel_idle CPUidle governor: menu analyzing CPU 12: Number of idle states: 3 Available idle states: POLL C1 C1E POLL: Flags/Description: CPUIDLE CORE POLL IDLE Latency: 0 Residency: 0 Usage: 1950 Duration: 38458 C1: Flags/Description: MWAIT 0x00 Latency: 2 Residency: 2 Usage: 10688 Duration: 7133020 C1E: Flags/Description: MWAIT 0x01 Latency: 10 Residency: 20 Usage: 22356 Duration: 15687259 -------------------------------- Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Change the var type of the 'monitor' subcommand display modeRoman Storozhenko2024-06-201-1/+1
| | | | | | | | | | | | | There is a type 'enum operation_mode_e' contains the display modes of the 'monitor' subcommand. This type isn't used though, instead the variable 'mode' is of a simple 'int' type. Change 'mode' variable type from 'int' to 'enum operation_mode_e' in order to improve compiler type checking. Built and tested this with different monitor cmdline params. Everything works as expected, that is nothing changed and no regressions encountered. Signed-off-by: Roman Storozhenko <romeusmeister@gmail.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* tools/power/cpupower: Fix Pstate frequency reporting on AMD Family 1Ah CPUsDhananjay Ugwekar2024-05-281-3/+23
| | | | | | | | | | | | | Update cpupower's P-State frequency calculation and reporting with AMD Family 1Ah+ processors, when using the acpi-cpufreq driver. This is due to a change in the PStateDef MSR layout in AMD Family 1Ah+. Tested on 4th and 5th Gen AMD EPYC system Signed-off-by: Ananth Narayan <Ananth.Narayan@amd.com> Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Fix cpuidle_set to accept only numeric values for idle-set operation.Likhitha Korrapati2023-07-181-9/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For both the d and e options in 'cpupower idle_set' command, an atoi() conversion is done without checking if the input argument is all numeric. So, an atoi conversion is done on any character provided as input and the CPU idle_set operation continues with that integer value, which may not be what is intended or entirely correct. The output of cpuidle-set before patch is as follows: [root@xxx cpupower]# cpupower idle-set -e 1$ Idlestate 1 enabled on CPU 0 [snip] Idlestate 1 enabled on CPU 47 [root@xxx cpupower]# cpupower idle-set -e 11 Idlestate 11 not available on CPU 0 [snip] Idlestate 11 not available on CPU 47 [root@xxx cpupower]# cpupower idle-set -d 12 Idlestate 12 not available on CPU 0 [snip] Idlestate 12 not available on CPU 47 [root@xxx cpupower]# cpupower idle-set -d qw Idlestate 0 disabled on CPU 0 [snip] Idlestate 0 disabled on CPU 47 This patch adds a check for both d and e options in cpuidle-set.c to see that the idle_set value is all numeric before doing a string-to-int conversion using strtol(). The output of cpuidle-set after the patch is as below: [root@xxx cpupower]# ./cpupower idle-set -e 1$ Bad idle_set value: 1$. Integer expected [root@xxx cpupower]# ./cpupower idle-set -e 11 Idlestate 11 not available on CPU 0 [snip] Idlestate 11 not available on CPU 47 [root@xxx cpupower]# ./cpupower idle-set -d 12 Idlestate 12 not available on CPU 0 [snip] Idlestate 12 not available on CPU 47 [root@xxx cpupower]# ./cpupower idle-set -d qw Bad idle_set value: qw. Integer expected Signed-off-by: Brahadambal Srinivasan <latha@linux.vnet.ibm.com> Signed-off-by: Likhitha Korrapati <likhitha@linux.ibm.com> Tested-by: Pavithra Prakash <pavrampu@linux.vnet.ibm.com> Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Add turbo-boost support in cpupowerWyes Karny2023-07-183-1/+42
| | | | | | | | | | | | | | | | | | | | If boost sysfs (/sys/devices/system/cpu/cpufreq/boost) file is present turbo-boost is feature is supported in the hardware. By default this feature should be enabled. But to disable/enable it write to the sysfs file. Use the same to control this feature via cpupower. To enable: cpupower set --turbo-boost 1 To disable: cpupower set --turbo-boost 0 Acked-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Wyes Karny <wyes.karny@amd.com> Tested-by: Perry Yuan <Perry.Yuan@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Add support for amd_pstate mode changeWyes Karny2023-07-183-2/+43
| | | | | | | | | | | | | | | amd_pstate supports changing of its mode dynamically via `status` sysfs file. Add the same capability in cpupower. To change the mode to active mode use below command: cpupower set --amd-pstate-mode active Acked-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Wyes Karny <wyes.karny@amd.com> Tested-by: Perry Yuan <Perry.Yuan@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Add EPP value change supportWyes Karny2023-07-183-1/+46
| | | | | | | | | | | | | | | | amd_pstate and intel_pstate active mode drivers support energy performance preference feature. Through this user can convey it's energy/performance preference to platform. Add this value change capability to cpupower. To change the EPP value use below command: cpupower set --epp performance Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Wyes Karny <wyes.karny@amd.com> Tested-by: Perry Yuan <Perry.Yuan@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Recognise amd-pstate active mode driverWyes Karny2023-07-181-1/+1
| | | | | | | | | | | | | amd-pstate active mode driver name is "amd-pstate-epp". Use common prefix for string matching condition to recognise amd-pstate active mode driver. Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Wyes Karny <wyes.karny@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Perry Yuan <Perry.Yuan@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Make TSC read per CPU for Mperf monitorWyes Karny2023-05-081-17/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | System-wide TSC read could cause a drift in C0 percentage calculation. Because if first TSC is read and then one by one mperf is read for all cpus, this introduces drift between mperf reading of later CPUs and TSC reading. To lower this drift read TSC per CPU and also just after mperf read. This technique improves C0 percentage calculation in Mperf monitor. Before fix: (System 100% busy) | Mperf || RAPL || Idle_Stats PKG|CORE| CPU| C0 | Cx | Freq || pack | core || POLL | C1 | C2 0| 0| 0| 87.15| 12.85| 2695||168659003|3970468|| 0.00| 0.00| 0.00 0| 0| 256| 84.62| 15.38| 2695||168659003|3970468|| 0.00| 0.00| 0.00 0| 1| 1| 87.15| 12.85| 2695||168659003|3970468|| 0.00| 0.00| 0.00 0| 1| 257| 84.08| 15.92| 2695||168659003|3970468|| 0.00| 0.00| 0.00 0| 2| 2| 86.61| 13.39| 2695||168659003|3970468|| 0.00| 0.00| 0.00 0| 2| 258| 83.26| 16.74| 2695||168659003|3970468|| 0.00| 0.00| 0.00 0| 3| 3| 86.61| 13.39| 2695||168659003|3970468|| 0.00| 0.00| 0.00 0| 3| 259| 83.60| 16.40| 2695||168659003|3970468|| 0.00| 0.00| 0.00 0| 4| 4| 86.33| 13.67| 2695||168659003|3970468|| 0.00| 0.00| 0.00 0| 4| 260| 83.33| 16.67| 2695||168659003|3970468|| 0.00| 0.00| 0.00 0| 5| 5| 86.06| 13.94| 2695||168659003|3970468|| 0.00| 0.00| 0.00 0| 5| 261| 83.05| 16.95| 2695||168659003|3970468|| 0.00| 0.00| 0.00 0| 6| 6| 85.51| 14.49| 2695||168659003|3970468|| 0.00| 0.00| 0.00 After fix: (System 100% busy) | Mperf || RAPL || Idle_Stats PKG|CORE| CPU| C0 | Cx | Freq || pack | core || POLL | C1 | C2 0| 0| 0| 98.03| 1.97| 2415||163295480|3811189|| 0.00| 0.00| 0.00 0| 0| 256| 98.50| 1.50| 2394||163295480|3811189|| 0.00| 0.00| 0.00 0| 1| 1| 99.99| 0.01| 2401||163295480|3811189|| 0.00| 0.00| 0.00 0| 1| 257| 99.99| 0.01| 2375||163295480|3811189|| 0.00| 0.00| 0.00 0| 2| 2| 99.99| 0.01| 2401||163295480|3811189|| 0.00| 0.00| 0.00 0| 2| 258|100.00| 0.00| 2401||163295480|3811189|| 0.00| 0.00| 0.00 0| 3| 3|100.00| 0.00| 2401||163295480|3811189|| 0.00| 0.00| 0.00 0| 3| 259| 99.99| 0.01| 2435||163295480|3811189|| 0.00| 0.00| 0.00 0| 4| 4|100.00| 0.00| 2401||163295480|3811189|| 0.00| 0.00| 0.00 0| 4| 260|100.00| 0.00| 2435||163295480|3811189|| 0.00| 0.00| 0.00 0| 5| 5| 99.99| 0.01| 2401||163295480|3811189|| 0.00| 0.00| 0.00 0| 5| 261|100.00| 0.00| 2435||163295480|3811189|| 0.00| 0.00| 0.00 0| 6| 6|100.00| 0.00| 2401||163295480|3811189|| 0.00| 0.00| 0.00 0| 6| 262|100.00| 0.00| 2435||163295480|3811189|| 0.00| 0.00| 0.00 Cc: Thomas Renninger <trenn@suse.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Fixes: 7fe2f6399a84 ("cpupowerutils - cpufrequtils extended with quite some features") Signed-off-by: Wyes Karny <wyes.karny@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: rapl monitor - shows the used power consumption in uj for each ↵Thomas Renninger2022-11-303-3/+153
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | rapl domain This CPU power monitor shows the power consumption as exposed by the powercap subsystem, cmp with: Documentation/power/powercap/powercap.rst cpupower monitor -m RAPL | RAPL CPU| pack | core | unco 0|6853926|967832|442381 8|6853926|967832|442381 1|6853926|967832|442381 9|6853926|967832|442381 Unfortunately RAPL domains cannot be directly mapped to the corresponding CPU socket/package, core it belongs to. Not sure this is possible at all with the current data exposed from the kernel. Still it can be worthful information for developers trying to optimize power consumption of workloads or their system in general. Signed-off-by: Thomas Renninger <trenn@suse.de> CC: Zhang Rui <rui.zhang@intel.com> CC: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Introduce powercap intel-rapl library and powercap-info commandThomas Renninger2022-11-303-0/+120
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Read out powercap zone information via: cpupower powercap-info and show the zone hierarchy to the user: ./cpupower powercap-info Driver: intel-rapl Powercap domain hierarchy: Zone: package-0 (enabled) Power consumption can be monitored in micro Watts Zone: core (disabled) Power consumption can be monitored in micro Watts Zone: uncore (disabled) Power consumption can be monitored in micro Watts Zone: dram (disabled) Power consumption can be monitored in micro Watts There is a dummy -a option for powercap-info which can/should be used to show more detailed info later. Like that other args can be added easily later as well. A enable/disable option via powercap-set subcommand is also an enhancement for later. Also not all RAPL domains are shown. The func walking through RAPL subdomains is restricted and hardcoded to: "intel-rapl/intel-rapl:0" On my system above powercap domains map to: intel-rapl/intel-rapl:0 -> pack (age-0) intel-rapl/intel-rapl:0/intel-rapl:0:0 -> core intel-rapl/intel-rapl:0/intel-rapl:0:1 -> uncore Missing ones on my system are: intel-rapl-mmio/intel-rapl-mmio:0 -> pack (age-0) intel-rapl/intel-rapl:1 -> psys This could get enhanced in: struct powercap_zone *powercap_init_zones() and adopted to walk through all intel-rapl zones, but also to other powercap drivers like dtpm (Dynamic Thermal Power Management framework), cmp with: drivers/powercap/dtpm_* Signed-off-by: Thomas Renninger <trenn@suse.de> CC: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* tools/cpupower: Choose base_cpu to display default cpupower detailsSaket Kumar Bhaskar2022-11-253-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The default output of cpupower info utils shows unexpected output when CPU 0 is disabled. Considering a case where CPU 0 is disabled, output of cpupower idle-info: Before change: cpupower idle-info CPUidle driver: pseries_idle CPUidle governor: menu analyzing CPU 0: *is offline After change: ./cpupower idle-info CPUidle driver: pseries_idle CPUidle governor: menu analyzing CPU 50: Number of idle states: 2 Available idle states: snooze CEDE snooze: Flags/Description: snooze Latency: 0 Usage: 101748 Duration: 2724058 CEDE: Flags/Description: CEDE Latency: 12 Usage: 270004 Duration: 283019526849 If -c option is not passed, CPU 0 was chosen as the default chosen CPU to display details. However when CPU 0 is offline, it results in showing unexpected output. This commit chooses the base_cpu instead of CPU 0, hence keeping the output more relevant in all cases. The base_cpu is the number of CPU on which the calling thread is currently executing. Signed-off-by: Saket Kumar Bhaskar <skb99@linux.vnet.ibm.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Add "perf" option to print AMD P-State informationHuang Rui2022-02-231-1/+18
| | | | | | | | | | | | Add "-c --perf" option in cpupower-frequency-info to get the performance and frequency values for AMD P-State. Commit message amended: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Add function to print AMD P-State performance capabilitiesHuang Rui2022-02-223-3/+40
| | | | | | | | | | AMD P-State kernel module is using the fine grain frequency instead of acpi hardware pstate. So add a function to print performance and frequency values. Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Move print_speed function into misc helperHuang Rui2022-02-223-48/+52
| | | | | | | | | The print_speed can be as a common function, and expose it into misc helper header. Then it can be used on other helper files as well. Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Enable boost state support for AMD P-State moduleHuang Rui2022-02-223-0/+25
| | | | | | | | | | | | | | | | | | | | | | | The legacy ACPI hardware P-States function has 3 P-States on ACPI table, the CPU frequency only can be switched between the 3 P-States. While the processor supports the boost state, it will have another boost state that the frequency can be higher than P0 state, and the state can be decoded by the function of decode_pstates() and read by amd_pci_get_num_boost_states(). However, the new AMD P-State function is different than legacy ACPI hardware P-State on AMD processors. That has a finer grain frequency range between the highest and lowest frequency. And boost frequency is actually the frequency which is mapped on highest performance ratio. The similar previous P0 frequency is mapped on nominal performance ratio. If the highest performance on the processor is higher than nominal performance, then we think the current processor supports the boost state. And it uses amd_pstate_boost_init() to initialize boost for AMD P-State function. Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Add AMD P-State sysfs definition and access helperHuang Rui2022-02-221-0/+30
| | | | | | | | | | | Introduce the marco definitions and access helper function for AMD P-State sysfs interfaces such as each performance goals and frequency levels in amd helper file. They will be used to read the sysfs attribute from AMD P-State cpufreq driver for cpupower utilities. Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Initial AMD P-State capabilityHuang Rui2022-02-221-0/+13
| | | | | | | | | | | If kernel starts the AMD P-State module, the cpupower will initial the capability flag as CPUPOWER_CAP_AMD_PSTATE. And once AMD P-State capability is set, it won't need to set legacy ACPI relative capabilities anymore. Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Add the function to check AMD P-State enabledHuang Rui2022-02-222-0/+28
| | | | | | | | | | | | | The processor with AMD P-State function also supports legacy ACPI hardware P-States feature as well. Once driver sets AMD P-State eanbled, the processor will respond the finer grain AMD P-State feature instead of legacy ACPI P-States. So it introduces the cpupower_amd_pstate_enabled() to check whether the current kernel enables AMD P-State or AMD CPUFreq module. Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Add AMD P-State capability flagHuang Rui2022-02-221-0/+1
| | | | | | | | | Add AMD P-State capability flag in cpupower to indicate AMD new P-State kernel module support on Ryzen processors. Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Add cpuid cap flag for MSR_AMD_HWCR supportNathan Fontenot2021-01-263-7/+7
| | | | | | | | | | | | | Remove the family check for accessing the MSR_AMD_HWCR MSR and replace it with a cpupower cap flag. This update also allows for the removal of the local cpupower_cpu_info variable in cpufreq_has_boost_support() since we no longer need it to check the family. Signed-off-by: Nathan Fontenot <nathan.fontenot@amd.com> Reviewed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Remove family arg to decode_pstates()Nathan Fontenot2021-01-263-17/+14
| | | | | | | | | | | The decode_pstates() routine no longer uses the CPU family and the caleed routines (get_cof() and get_did()) can grab the family from the global cpupower_cpu_info struct. These update removes passing the family arg to all these routines. Signed-off-by: Nathan Fontenot <nathan.fontenot@amd.com> Reviewed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Condense pstate enabled bit checks in decode_pstates()Nathan Fontenot2021-01-261-3/+3
| | | | | | | | | | The enabled bit (bit 63) is common for all families so we can remove the multiple enabled checks based on family and have a common check for HW pstate enabled. Signed-off-by: Nathan Fontenot <nathan.fontenot@amd.com> Reviewed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Update family checks when decoding HW pstatesNathan Fontenot2021-01-263-5/+10
| | | | | | | | | | | The family checks in get_cof() and get_did() need to use the correct MSR format depending on the family. Add a cpupower capability for using the pstatedef (family 17h and newer) to control this instead of direct family checks. Signed-off-by: Nathan Fontenot <nathan.fontenot@amd.com> Reviewed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Remove unused pscur variable.Nathan Fontenot2021-01-261-8/+1
| | | | | | | | | | | | | The pscur variable is set but not uused, just remove it. This may have previsously been set to validate the MSR_AMD_PSTATE_STATUS MSR. With the addition of the CPUPOWER_CAP_AMD_HW_PSTATE cap flag this is no longer needed since the cpuid bit to enable this cap flag also validates that the MSR_AMD_PSTATE_STATUS MSR is present. Signed-off-by: Nathan Fontenot <nathan.fontenot@amd.com> Reviewed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Add CPUPOWER_CAP_AMD_HW_PSTATE cpuid caps flagNathan Fontenot2021-01-263-8/+14
| | | | | | | | | | | | | | | Add a check in get_cpu_info() for the ability to read frequencies from hardware and set the CPUPOWER_CAP_AMD_HW_PSTATE cpuid flag. The cpuid flag is set when CPUID_80000007_EDX[7] is set, which is all families >= 10h. The check excludes family 14h because HW pstate reporting was not implemented on family 14h. This is intended to reduce family checks in the main code paths. Signed-off-by: Nathan Fontenot <nathan.fontenot@amd.com> Reviewed-by: Robert Richter <rrichter@amd.com> Reviewed-by: skhan@linuxfoundation.org Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Correct macro name for CPB caps flagRobert Richter2021-01-263-3/+3
| | | | | | | | | The name is Core Performance Boost (CPB) for the cpuid flag. Correct cpuid caps flag to use this name (instead of CBP). Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Nathan Fontenot <nathan.fontenot@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Update msr_pstate union struct namingNathan Fontenot2021-01-261-12/+14
| | | | | | | | | | | | | | | | | | | The msr_pstate union struct named fam17h_bits is misleading since this is the struct to use for all families >= 0x17, not just for family 0x17. Rename the bits structs to be 'pstate' (for pre family 17h CPUs) and 'pstatedef' (for CPUs since fam 17h) to align closer with PPR/BDKG (1) naming. There are no functional changes as part of this update. 1: AMD Processor Programming Reference (PPR) and BIOS and Kernel Developer's Guide (BKDG) available at: http://developer.amd.com/resources/developer-guides-manuals Signed-off-by: Nathan Fontenot <nathan.fontenot@amd.com> Reviewed-by: Robert Richter <rrichter@amd.com> Reviewed-by: skhan@linuxfoundation.org Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* Merge tag 'pm-5.11-rc1' of ↵Linus Torvalds2020-12-155-2/+89
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These update cpufreq (core and drivers), cpuidle (polling state implementation and the PSCI driver), the OPP (operating performance points) framework, devfreq (core and drivers), the power capping RAPL (Running Average Power Limit) driver, the Energy Model support, the generic power domains (genpd) framework, the ACPI device power management, the core system-wide suspend code and power management utilities. Specifics: - Use local_clock() instead of jiffies in the cpufreq statistics to improve accuracy (Viresh Kumar). - Fix up OPP usage in the cpufreq-dt and qcom-cpufreq-nvmem cpufreq drivers (Viresh Kumar). - Clean up the cpufreq core, the intel_pstate driver and the schedutil cpufreq governor (Rafael Wysocki). - Fix up error code paths in the sti-cpufreq and mediatek cpufreq drivers (Yangtao Li, Qinglang Miao). - Fix cpufreq_online() to return error codes instead of success (0) in all cases when it fails (Wang ShaoBo). - Add mt8167 support to the mediatek cpufreq driver and blacklist mt8516 in the cpufreq-dt-platdev driver (Fabien Parent). - Modify the tegra194 cpufreq driver to always return values from the frequency table as the current frequency and clean up that driver (Sumit Gupta, Jon Hunter). - Modify the arm_scmi cpufreq driver to allow it to discover the power scale present in the performance protocol and provide this information to the Energy Model (Lukasz Luba). - Add missing MODULE_DEVICE_TABLE to several cpufreq drivers (Pali Rohár). - Clean up the CPPC cpufreq driver (Ionela Voinescu). - Fix NVMEM_IMX_OCOTP dependency in the imx cpufreq driver (Arnd Bergmann). - Rework the poling interval selection for the polling state in cpuidle (Mel Gorman). - Enable suspend-to-idle for PSCI OSI mode in the PSCI cpuidle driver (Ulf Hansson). - Modify the OPP framework to support empty (node-less) OPP tables in DT for passing dependency information (Nicola Mazzucato). - Fix potential lockdep issue in the OPP core and clean up the OPP core (Viresh Kumar). - Modify dev_pm_opp_put_regulators() to accept a NULL argument and update its users accordingly (Viresh Kumar). - Add frequency changes tracepoint to devfreq (Matthias Kaehlcke). - Add support for governor feature flags to devfreq, make devfreq sysfs file permissions depend on the governor and clean up the devfreq core (Chanwoo Choi). - Clean up the tegra20 devfreq driver and deprecate it to allow another driver based on EMC_STAT to be used instead of it (Dmitry Osipenko). - Add interconnect support to the tegra30 devfreq driver, allow it to take the interconnect and OPP information from DT and clean it up (Dmitry Osipenko). - Add interconnect support to the exynos-bus devfreq driver along with interconnect properties documentation (Sylwester Nawrocki). - Add suport for AMD Fam17h and Fam19h processors to the RAPL power capping driver (Victor Ding, Kim Phillips). - Fix handling of overly long constraint names in the powercap framework (Lukasz Luba). - Fix the wakeup configuration handling for bridges in the ACPI device power management core (Rafael Wysocki). - Add support for using an abstract scale for power units in the Energy Model (EM) and document it (Lukasz Luba). - Add em_cpu_energy() micro-optimization to the EM (Pavankumar Kondeti). - Modify the generic power domains (genpd) framwework to support suspend-to-idle (Ulf Hansson). - Fix creation of debugfs nodes in genpd (Thierry Strudel). - Clean up genpd (Lina Iyer). - Clean up the core system-wide suspend code and make it print driver flags for devices with debug enabled (Alex Shi, Patrice Chotard, Chen Yu). - Modify the ACPI system reboot code to make it prepare for system power off to avoid confusing the platform firmware (Kai-Heng Feng). - Update the pm-graph (multiple changes, mostly usability-related) and cpupower (online and offline CPU information support) PM utilities (Todd Brandt, Brahadambal Srinivasan)" * tag 'pm-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (86 commits) cpufreq: Fix cpufreq_online() return value on errors cpufreq: Fix up several kerneldoc comments cpufreq: stats: Use local_clock() instead of jiffies cpufreq: schedutil: Simplify sugov_update_next_freq() cpufreq: intel_pstate: Simplify intel_cpufreq_update_pstate() PM: domains: create debugfs nodes when adding power domains opp: of: Allow empty opp-table with opp-shared dt-bindings: opp: Allow empty OPP tables media: venus: dev_pm_opp_put_*() accepts NULL argument drm/panfrost: dev_pm_opp_put_*() accepts NULL argument drm/lima: dev_pm_opp_put_*() accepts NULL argument PM / devfreq: exynos: dev_pm_opp_put_*() accepts NULL argument cpufreq: qcom-cpufreq-nvmem: dev_pm_opp_put_*() accepts NULL argument cpufreq: dt: dev_pm_opp_put_regulators() accepts NULL argument opp: Allow dev_pm_opp_put_*() APIs to accept NULL opp_table opp: Don't create an OPP table from dev_pm_opp_get_opp_table() cpufreq: dt: Don't (ab)use dev_pm_opp_get_opp_table() to create OPP table opp: Reduce the size of critical section in _opp_kref_release() PM / EM: Micro optimization in em_cpu_energy cpufreq: arm_scmi: Discover the power scale in performance protocol ...
| * cpupower: Provide online and offline CPU informationBrahadambal Srinivasan2020-10-265-1/+92
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a user tries to modify cpuidle or cpufreq properties on offline CPUs, the tool returns success (exit status 0) but also does not provide any warning message regarding offline cpus that may have been specified but left unchanged. In case of all or a few CPUs being offline, it can be difficult to keep track of which CPUs didn't get the new frequency or idle state set. Silent failures are difficult to keep track of when there are a huge number of CPUs on which the action is performed. This patch adds helper functions to find both online and offline CPUs and print them out accordingly. We use these helper functions in cpuidle-set and cpufreq-set to print an additional message if the user attempts to modify offline cpus. Reported-by: Pavithra R. Prakash <pavrampu@in.ibm.com> Signed-off-by: Brahadambal Srinivasan <latha@linux.vnet.ibm.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* | tools/power/cpupower: Read energy_perf_bias from sysfsBorislav Petkov2020-11-165-34/+54
|/ | | | | | | | | | | | ... instead of poking at the MSR. For that, move the accessor functions to misc.c and add a sysfs-writing function too. There should be no functional changes resulting from this. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Cc: Thomas Renninger <trenn@suse.com> Link: https://lkml.kernel.org/r/20201029190259.3476-2-bp@alien8.de
* tools: Avoid comma separated statementsJoe Perches2020-10-021-5/+9
| | | | | | | Use semicolons and braces. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Fix comparing pointer to 0 coccicheck warnsShuah Khan2020-07-061-3/+3
| | | | | | | | | | | Fix cocciccheck wanrns found by: make coccicheck MODE=report M=tools/power/cpupower/ tools/power/cpupower/utils/helpers/bitmask.c:29:12-13: WARNING comparing pointer to 0, suggest !E tools/power/cpupower/utils/helpers/bitmask.c:29:12-13: WARNING comparing pointer to 0 tools/power/cpupower/utils/helpers/bitmask.c:43:12-13: WARNING comparing pointer to 0 Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Remove unneeded semicolonZou Wei2020-05-087-9/+9
| | | | | | | | | | | | | | | | | | Fixes coccicheck warnings: tools/power/cpupower/utils/cpupower-info.c:65:2-3: Unneeded semicolon tools/power/cpupower/utils/cpupower-set.c:75:2-3: Unneeded semicolon tools/power/cpupower/utils/idle_monitor/amd_fam14h_idle.c:120:2-3: Unneeded semicolon tools/power/cpupower/utils/idle_monitor/cpuidle_sysfs.c:175:2-3: Unneeded semicolon tools/power/cpupower/utils/idle_monitor/cpuidle_sysfs.c:56:2-3: Unneeded semicolon tools/power/cpupower/utils/idle_monitor/cpuidle_sysfs.c:75:2-3: Unneeded semicolon tools/power/cpupower/utils/idle_monitor/hsw_ext_idle.c:82:2-3: Unneeded semicolon tools/power/cpupower/utils/idle_monitor/nhm_idle.c:94:2-3: Unneeded semicolon tools/power/cpupower/utils/idle_monitor/snb_idle.c:80:2-3: Unneeded semicolon Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: avoid multiple definition with gcc -fno-commonMike Gilbert2020-03-024-3/+5
| | | | | | | | | | | | | | | | | | | | | Building cpupower with -fno-common in CFLAGS results in errors due to multiple definitions of the 'cpu_count' and 'start_time' variables. ./utils/idle_monitor/snb_idle.o:./utils/idle_monitor/cpupower-monitor.h:28: multiple definition of `cpu_count'; ./utils/idle_monitor/nhm_idle.o:./utils/idle_monitor/cpupower-monitor.h:28: first defined here ... ./utils/idle_monitor/cpuidle_sysfs.o:./utils/idle_monitor/cpuidle_sysfs.c:22: multiple definition of `start_time'; ./utils/idle_monitor/amd_fam14h_idle.o:./utils/idle_monitor/amd_fam14h_idle.c:85: first defined here The -fno-common option will be enabled by default in GCC 10. Bug: https://bugs.gentoo.org/707462 Signed-off-by: Mike Gilbert <floppym@gentoo.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Revert library ABI changes from commit ae2917093fb60bdc1ed3eThomas Renninger2020-01-171-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | Commit ae2917093fb6 ("tools/power/cpupower: Display boost frequency separately") modified the library function: struct cpufreq_available_frequencies *cpufreq_get_available_frequencies(unsigned int cpu) to struct cpufreq_frequencies *cpufreq_get_frequencies(const char *type, unsigned int cpu) This patch recovers the old API and implements the new functionality in a newly introduce method: struct cpufreq_boost_frequencies *cpufreq_get_available_frequencies(unsigned int cpu) This one should get merged into stable kernels back to 5.0 when the above had been introduced. Fixes: ae2917093fb6 ("tools/power/cpupower: Display boost frequency separately") Cc: stable@vger.kernel.org Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: mperf_monitor: Update cpupower to use the RDPRU instructionJanakarajan Natarajan2019-11-053-0/+25
| | | | | | | | | | | | | | | | | AMD Zen 2 introduces the RDPRU instruction which can be used to access some processor registers which are typically only accessible in privilege level 0. ECX specifies the register to read and EDX:EAX will contain the value read. ECX: 0 - Register MPERF 1 - Register APERF This has the added advantage of not having to use the msr module, since the userspace to kernel transitions which occur during each read_msr() might cause APERF and MPERF to go out of sync. Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Acked-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: mperf_monitor: Introduce per_cpu_schedule flagJanakarajan Natarajan2019-11-052-10/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The per_cpu_schedule flag is used to move the cpupower process to the cpu on which we are looking to read the APERF/MPERF registers. This prevents IPIs from being generated by read_msr()s as we are already on the cpu of interest. Ex: If cpupower is running on CPU 0 and we execute read_msr(20, MSR_APERF, val) then, read_msr(20, MSR_MPERF, val) the msr module will generate an IPI from CPU 0 to CPU 20 to query for the MSR_APERF and then the MSR_MPERF in separate IPIs. This delay, caused by IPI latency, between reading the APERF and MPERF registers may cause both of them to go out of sync. The use of the per_cpu_schedule flag reduces the probability of this from happening. It comes at the cost of a negligible increase in cpu consumption caused by the migration of cpupower across each of the cpus of the system. Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Acked-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* cpupower: Move needs_root variable into a sub-structJanakarajan Natarajan2019-11-058-8/+10
| | | | | | | | | | | Move the needs_root variable into a sub-struct. This is in preparation for adding a new flag for cpuidle_monitor. Update all uses of the needs_root variable to reflect this change. Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Acked-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>