summaryrefslogtreecommitdiffstats
path: root/drivers/thermal/thermal_core.c
Commit message (Collapse)AuthorAgeFilesLines
* thermal: core: Remove excess empty line from a commentFlavio Suligoi2024-03-051-1/+0
| | | | | | | | | | | | The first and the third lines of the kerneldoc comment for: thermal_zone_device_set_polling() belong to the same sentences, so join them together. Signed-off-by: Flavio Suligoi <f.suligoi@asem.it> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* thermal: core: Eliminate writable trip points masksRafael J. Wysocki2024-02-271-25/+3
| | | | | | | | | | | | | | All of the thermal_zone_device_register_with_trips() callers pass zero writable trip points masks to it, so drop the mask argument from that function and update all of its callers accordingly. This also removes the artificial trip points per zone limit of 32, related to using writable trip points masks. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
* thermal: core: Add flags to struct thermal_tripRafael J. Wysocki2024-02-271-1/+8
| | | | | | | | | | | | | | | | | | | | | | In order to allow thermal zone creators to specify the writability of trip point temperature and hysteresis on a per-trip basis, add a flags field to struct thermal_trip and define flags to represent the desired trip properties. Also make thermal_zone_device_register_with_trips() set the THERMAL_TRIP_FLAG_RW_TEMP flag for all trips covered by the writable trips mask passed to it and modify the thermal sysfs code to look at the trip flags instead of using the writable trips mask directly or checking the presence of the .set_trip_hyst() zone callback. Additionally, make trip_point_temp_store() and trip_point_hyst_store() fail with an error code if the trip passed to one of them has THERMAL_TRIP_FLAG_RW_TEMP or THERMAL_TRIP_FLAG_RW_HYST, respectively, clear in its flags. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* thermal: core: Move initial num_trips assignment before memcpy()Nathan Chancellor2024-02-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | When booting a CONFIG_FORTIFY_SOURCE=y kernel compiled with a toolchain that supports __counted_by() (such as clang-18 and newer), there is a panic on boot: [ 2.913770] memcpy: detected buffer overflow: 72 byte write of buffer size 0 [ 2.920834] WARNING: CPU: 2 PID: 1 at lib/string_helpers.c:1027 __fortify_report+0x5c/0x74 ... [ 3.039208] Call trace: [ 3.041643] __fortify_report+0x5c/0x74 [ 3.045469] __fortify_panic+0x18/0x20 [ 3.049209] thermal_zone_device_register_with_trips+0x4c8/0x4f8 This panic occurs because trips is counted by num_trips but num_trips is assigned after the call to memcpy(), so the fortify checks think the buffer size is zero because tz was allocated with kzalloc(). Move the num_trips assignment before the memcpy() to resolve the panic and ensure that the fortify checks work properly. Fixes: 9b0a62758665 ("thermal: core: Store zone trips table in struct thermal_zone_device") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* thermal: core: Store zone ops in struct thermal_zone_deviceRafael J. Wysocki2024-02-231-20/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | The current code requires thermal zone creators to pass pointers to writable ops structures to thermal_zone_device_register_with_trips() which needs to modify the target struct thermal_zone_device_ops object if the "critical" operation in it is NULL. Moreover, the callers of thermal_zone_device_register_with_trips() are required to hold on to the struct thermal_zone_device_ops object passed to it until the given thermal zone is unregistered. Both of these requirements are quite inconvenient, so modify struct thermal_zone_device to contain struct thermal_zone_device_ops as field and make thermal_zone_device_register_with_trips() copy the contents of the struct thermal_zone_device_ops passed to it via a pointer (which can be const now) to that field. Also adjust the code using thermal zone ops accordingly and modify thermal_of_zone_register() to use a local ops variable during thermal zone registration so ops do not need to be freed in thermal_of_zone_unregister() any more. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
* thermal: core: Store zone trips table in struct thermal_zone_deviceRafael J. Wysocki2024-02-231-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The current code expects thermal zone creators to pass a pointer to a writable trips table to thermal_zone_device_register_with_trips() and that trips table is then used by the thermal core going forward. Consequently, the callers of thermal_zone_device_register_with_trips() are required to hold on to the trips table passed to it until the given thermal zone is unregistered, at which point the trips table can be freed, but at the same time they are not expected to access that table directly. This is both error prone and confusing. To address it, turn the trips table pointer in struct thermal_zone_device into a flex array (counted by its num_trips field), allocate it during thermal zone device allocation and copy the contents of the trips table supplied by the zone creator (which can be const now) into it, which will allow the callers of thermal_zone_device_register_with_trips() to drop their trip tables right after the zone registration. This requires the imx thermal driver to be adjusted to store the new temperature in its internal trips table in imx_set_trip_temp(), because it will be separate from the core's trips table now and it has to be explicitly kept in sync with the latter. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
* thermal: core: Use kstrdup_const() during cooling device registrationChristophe JAILLET2024-01-121-3/+3
| | | | | | | | | | Some *thermal_cooling_device_register() calls pass a string literal as the 'type' parameter, so kstrdup_const() can be used instead of kstrdup() to avoid a memory allocation in such cases. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* thermal/debugfs: Add thermal debugfs information for mitigation episodesDaniel Lezcano2024-01-121-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The mitigation episodes are recorded. A mitigation episode happens when the first trip point is crossed the way up and then the way down. During this episode other trip points can be crossed also and are accounted for this mitigation episode. The interesting information is the average temperature at the trip point, the undershot and the overshot. The standard deviation of the mitigated temperature will be added later. The thermal debugfs directory structure tries to stay consistent with the sysfs one but in a very simplified way: thermal/ `-- thermal_zones |-- 0 | `-- mitigations `-- 1 `-- mitigations The content of the mitigations file has the following format: ,-Mitigation at 349988258us, duration=130136ms | trip | type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) | | 0 | passive | 65000 | 2000 | 130136 | 68227 | 62500 | 75625 | | 1 | passive | 75000 | 2000 | 104209 | 74857 | 71666 | 77500 | ,-Mitigation at 272451637us, duration=75000ms | trip | type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) | | 0 | passive | 65000 | 2000 | 75000 | 68561 | 62500 | 75000 | | 1 | passive | 75000 | 2000 | 60714 | 74820 | 70555 | 77500 | ,-Mitigation at 238184119us, duration=27316ms | trip | type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) | | 0 | passive | 65000 | 2000 | 27316 | 73377 | 62500 | 75000 | | 1 | passive | 75000 | 2000 | 19468 | 75284 | 69444 | 77500 | ,-Mitigation at 39863713us, duration=136196ms | trip | type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) | | 0 | passive | 65000 | 2000 | 136196 | 73922 | 62500 | 75000 | | 1 | passive | 75000 | 2000 | 91721 | 74386 | 69444 | 78125 | More information for a better understanding of the thermal behavior will be added after. The idea is to give detailed statistics information about the undershots and overshots, the temperature speed, etc... As all the information in a single file is too much, the idea would be to create a directory named with the mitigation timestamp where all data could be added. Please note this code is immune against trip ordering but not against a trip temperature change while a mitigation is happening. However, this situation should be extremely rare, perhaps not happening and we might question ourselves if something should be done in the core framework for other components first. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> [ rjw: White space fixups, rebase ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* thermal/debugfs: Add thermal cooling device debugfs informationDaniel Lezcano2024-01-121-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The thermal framework does not have any debug information except a sysfs stat which is a bit controversial. This one allocates big chunks of memory for every cooling devices with a high number of states and could represent on some systems in production several megabytes of memory for just a portion of it. As the sysfs is limited to a page size, the output is not exploitable with large data array and gets truncated. The patch provides the same information than sysfs except the transitions are dynamically allocated, thus they won't show more events than the ones which actually occurred. There is no longer a size limitation and it opens the field for more debugging information where the debugfs is designed for, not sysfs. The thermal debugfs directory structure tries to stay consistent with the sysfs one but in a very simplified way: thermal/ -- cooling_devices |-- 0 | |-- clear | |-- time_in_state_ms | |-- total_trans | `-- trans_table |-- 1 | |-- clear | |-- time_in_state_ms | |-- total_trans | `-- trans_table |-- 2 | |-- clear | |-- time_in_state_ms | |-- total_trans | `-- trans_table |-- 3 | |-- clear | |-- time_in_state_ms | |-- total_trans | `-- trans_table `-- 4 |-- clear |-- time_in_state_ms |-- total_trans `-- trans_table The content of the files in the cooling devices directory is the same as the sysfs one except for the trans_table which has the following format: Transition Hits 1->0 246 0->1 246 2->1 632 1->2 632 3->2 98 2->3 98 Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> [ rjw: White space fixups, rebase ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* thermal: netlink: Pass thermal zone pointer to notify routinesRafael J. Wysocki2024-01-091-8/+5
| | | | | | | | | | | | | | | | There are several rountines in the thermal netlink API that take a thermal zone ID or a thermal zone type as their arguments, but from their callers perspective it would be more convenient to pass a thermal zone pointer to them and let them extract the necessary data from the given thermal zone object by themselves. Modify the code accordingly. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
* thermal: netlink: Pass pointers to thermal_notify_tz_trip_up/down()Rafael J. Wysocki2024-01-091-6/+2
| | | | | | | | | | | | | Instead of requiring the callers of thermal_notify_tz_trip_up/down() to provide specific values needed to populate struct param in them, make them extract those values from objects passed by the callers via const pointers. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
* Merge tag 'thermal-v6.8-rc1' of ↵Rafael J. Wysocki2024-01-021-4/+17
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux into thermal Merge thermal control material for 6.8-rc1 from Daniel Lezcano: "- Converted Mediatek Thermal to the json-schema (Rafał Miłecki) - Fixed DT bindings issue on Loongson (Binbin Zhou) - Fixed returning NULL instead of -ENODEV on Loogsoo (Binbin Zhou) - Added the DT binding for the tsens on SM8650 platform (Neil Armstrong) - Added a reboot on critical option feature (Fabio Estevam) - Made usage of DEFINE_SIMPLE_DEV_PM_OPS on AmLogic (Uwe Kleine-König) - Added the D1/T113s THS controller support on Sun8i (Maxim Kiselev) - Fixed example in the DT binding for QCom SPMI (Johan Hovold) - Fixed compilation warning for the tmon utility (Florian Eckert) - Added interrupt based configuration on Exynos along with a set of related cleanups (Mateusz Majewski)" * tag 'thermal-v6.8-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (24 commits) thermal/drivers/exynos: Use set_trips ops thermal/drivers/exynos: Use BIT wherever possible thermal/drivers/exynos: Split initialization of TMU and the thermal zone thermal/drivers/exynos: Stop using the threshold mechanism on Exynos 4210 thermal/drivers/exynos: Simplify regulator (de)initialization thermal/drivers/exynos: Handle devm_regulator_get_optional return value correctly thermal/drivers/exynos: Wwitch from workqueue-driven interrupt handling to threaded interrupts thermal/drivers/exynos: Drop id field thermal/drivers/exynos: Remove an unnecessary field description tools/thermal/tmon: Fix compilation warning for wrong format dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Clean up examples dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Fix example node names thermal/drivers/sun8i: Add D1/T113s THS controller support dt-bindings: thermal: sun8i: Add binding for D1/T113s THS controller thermal: amlogic: Use DEFINE_SIMPLE_DEV_PM_OPS for PM functions thermal: amlogic: Make amlogic_thermal_disable() return void thermal/thermal_of: Allow rebooting after critical temp reboot: Introduce thermal_zone_device_critical_reboot() thermal/core: Prepare for introduction of thermal reboot dt-bindings: thermal-zones: Document critical-action ...
| * reboot: Introduce thermal_zone_device_critical_reboot()Fabio Estevam2024-01-021-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce thermal_zone_device_critical_reboot() to trigger an emergency reboot. It is a counterpart of thermal_zone_device_critical() with the difference that it will force a reboot instead of shutdown. The motivation for doing this is to allow the thermal subystem to trigger a reboot when the temperature reaches the critical temperature. Signed-off-by: Fabio Estevam <festevam@denx.de> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231129124330.519423-3-festevam@gmail.com
| * thermal/core: Prepare for introduction of thermal rebootFabio Estevam2024-01-021-4/+10
| | | | | | | | | | | | | | | | | | | | | | Add some helper functions to make it easier introducing the support for thermal reboot. No functional change. Signed-off-by: Fabio Estevam <festevam@denx.de> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231129124330.519423-2-festevam@gmail.com
* | thermal: core: Add governor callback for thermal zone changeLukasz Luba2023-12-291-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new callback to the struct thermal_governor. It can be used for updating governors when there is a change in the thermal zone internals, e.g. thermal cooling device is bind to the thermal zone. That makes possible to move some heavy operations like memory allocations related to the number of cooling instances out of the throttle() callback. Both callback code paths (throttle() and update_tz()) are protected with the same thermal zone lock, which guaranties the consistency. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* | thermal: core: Resume thermal zones asynchronouslyRafael J. Wysocki2023-12-281-4/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The resume of thermal zones in thermal_pm_notify() is carried out sequentially, which may be a problem if __thermal_zone_device_update() takes a significant time to run for some thermal zones, because some other thermal zones may need to wait for them to resume then and if any other PM notifiers are going to be invoked after the thermal one, they will need to wait for it either. To address this, make thermal_pm_notify() switch the poll_queue delayed work over to a one-shot thermal_zone_device_resume() work function that will restore the original one during the thermal zone resume and queue up poll_queue without a delay for each thermal zone. Link: https://lore.kernel.org/linux-pm/20231120234015.3273143-1-radusolea@google.com/ Reported-by: Radu Solea <radusolea@google.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* | thermal: core: Initialize poll_queue in thermal_zone_device_init()Rafael J. Wysocki2023-12-281-10/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for a subsequent change, move the initialization of the poll_queue delayed work from thermal_zone_device_register_with_trips() to thermal_zone_device_init() which is called by the former. However, because thermal_zone_device_init() is also called by thermal_pm_notify(), make the latter call cancel_delayed_work() on poll_queue before invoking the former, so as to allow the work item to be re-initialized safely. Also move thermal_zone_device_check() which needs to be defined before thermal_zone_device_init(), so the latter can pass it to the INIT_DELAYED_WORK() macro. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* | thermal: core: Fix thermal zone suspend-resume synchronizationRafael J. Wysocki2023-12-281-7/+23
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are 3 synchronization issues with thermal zone suspend-resume during system-wide transitions: 1. The resume code runs in a PM notifier which is invoked after user space has been thawed, so it can run concurrently with user space which can trigger a thermal zone device removal. If that happens, the thermal zone resume code may use a stale pointer to the next list element and crash, because it does not hold thermal_list_lock while walking thermal_tz_list. 2. The thermal zone resume code calls thermal_zone_device_init() outside the zone lock, so user space or an update triggered by the platform firmware may see an inconsistent state of a thermal zone leading to unexpected behavior. 3. Clearing the in_suspend global variable in thermal_pm_notify() allows __thermal_zone_device_update() to continue for all thermal zones and it may as well run before the thermal_tz_list walk (or at any point during the list walk for that matter) and attempt to operate on a thermal zone that has not been resumed yet. It may also race destructively with thermal_zone_device_init(). To address these issues, add thermal_list_lock locking to thermal_pm_notify(), especially arount the thermal_tz_list, make it call thermal_zone_device_init() back-to-back with __thermal_zone_device_update() under the zone lock and replace in_suspend with per-zone bool "suspend" indicators set and unset under the given zone's lock. Link: https://lore.kernel.org/linux-pm/20231218162348.69101-1-bo.ye@mediatek.com/ Reported-by: Bo Ye <bo.ye@mediatek.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* thermal: core: Fix NULL pointer dereference in zone registration error pathRafael J. Wysocki2023-12-151-1/+0
| | | | | | | | | | | | | | | | | | | | If device_register() in thermal_zone_device_register_with_trips() returns an error, the tz variable is set to NULL and subsequently dereferenced in kfree(tz->tzp). Commit adc8749b150c ("thermal/drivers/core: Use put_device() if device_register() fails") added the tz = NULL assignment in question to avoid a possible double-free after dropping the reference to the zone device. However, after commit 4649620d9404 ("thermal: core: Make thermal_zone_device_unregister() return after freeing the zone"), that assignment has become redundant, because dropping the reference to the zone device does not cause the zone object to be freed any more. Drop it to address the NULL pointer dereference. Fixes: 3d439b1a2ad3 ("thermal/core: Alloc-copy-free the thermal zone parameters structure") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
* thermal/core: Check get_temp ops is present when registering a tzDaniel Lezcano2023-12-131-6/+1
| | | | | | | | | | | | | Initially the check against the get_temp ops in the thermal_zone_device_update() was put in there in order to catch drivers not providing this method. Instead of checking again and again the function if the ops exists in the update function, let's do the check at registration time, so it is checked one time and for all. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* thermal: core: Rework thermal zone availability checkRafael J. Wysocki2023-12-121-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to avoid running __thermal_zone_device_update() for thermal zones going away, the thermal zone lock is held around device_del() in thermal_zone_device_unregister() and thermal_zone_device_update() passes the given thermal zone device to device_is_registered(). This allows thermal_zone_device_update() to skip the __thermal_zone_device_update() if device_del() has already run for the thermal zone at hand. However, instead of looking at driver core internals, the thermal subsystem may as well rely on its own data structures for this purpose. Namely, if the thermal zone is not present in thermal_tz_list, it can be regarded as unavailable, which in fact is already the case in thermal_zone_device_unregister(). Accordingly, the device_is_registered() check in thermal_zone_device_update() can be replaced with checking whether or not the node list_head in struct thermal_zone_device is empty, in which case it is not there in thermal_tz_list. To make this work, though, it is necessary to initialize tz->node in thermal_zone_device_register_with_trips() before registering the thermal zone device and it needs to be added to thermal_tz_list and deleted from it under its zone lock. After the above modifications, the zone lock does not need to be held around device_del() in thermal_zone_device_unregister() any more. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-and-tested-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
* thermal: Drop redundant and confusing device_is_registered() checksRafael J. Wysocki2023-12-121-9/+0
| | | | | | | | | | | | | | | | | | | | | Multiple places in the thermal subsystem (most importantly, sysfs attribute callback functions) check if the given thermal zone device is still registered in order to return early in case the device_del() in thermal_zone_device_unregister() has run already. However, after thermal_zone_device_unregister() has been made wait for all of the zone-related activity to complete before returning, it is not necessary to do that any more, because all of the code holding a reference to the thermal zone device object will be waited for even if it does not do anything special to enforce this. Accordingly, drop all of the device_is_registered() checks that are now redundant and get rid of the zone locking that is not necessary any more after dropping them. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-and-tested-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
* thermal: core: Make thermal_zone_device_unregister() return after freeing ↵Rafael J. Wysocki2023-12-111-1/+5
| | | | | | | | | | | | | | | | | | | | the zone Make thermal_zone_device_unregister() wait until all of the references to the given thermal zone object have been dropped and free it before returning. This guarantees that when thermal_zone_device_unregister() returns, there is no leftover activity regarding the thermal zone in question which is required by some of its callers (for instance, modular driver code that wants to know when it is safe to let the module go away). Subsequently, this will allow some confusing device_is_registered() checks to be dropped from the thermal sysfs and core code. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-and-tested-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
* thermal: core: Add trip thresholds for trip crossing detectionRafael J. Wysocki2023-11-201-7/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The trip crossing detection in handle_thermal_trip() does not work correctly in the cases when a trip point is crossed on the way up and then the zone temperature stays above its low temperature (that is, its temperature decreased by its hysteresis). The trip temperature may be passed by the zone temperature subsequently in that case, even multiple times, but that does not count as the trip crossing as long as the zone temperature does not fall below the trip's low temperature or, in other words, until the trip is crossed on the way down. |-----------low--------high------------| |<--------->| | hyst | | | | -|--> crossed on the way up | <---|-- crossed on the way down However, handle_thermal_trip() will invoke thermal_notify_tz_trip_up() every time the trip temperature is passed by the zone temperature on the way up regardless of whether or not the trip has been crossed on the way down yet. Moreover, it will not call thermal_notify_tz_trip_down() if the last zone temperature was between the trip's temperature and its low temperature, so some "trip crossed on the way down" events may not be reported. To address this issue, introduce trip thresholds equal to either the temperature of the given trip, or its low temperature, such that if the trip's threshold is passed by the zone temperature on the way up, its value will be set to the trip's low temperature and thermal_notify_tz_trip_up() will be called, and if the trip's threshold is passed by the zone temperature on the way down, its value will be set to the trip's temperature (high) and thermal_notify_tz_trip_down() will be called. Accordingly, if the threshold is passed on the way up, it cannot be passed on the way up again until its passed on the way down and if it is passed on the way down, it cannot be passed on the way down again until it is passed on the way up which guarantees correct triggering of trip crossing notifications. If the last temperature of the zone is invalid, the trip's threshold will be set depending of the zone's current temperature: If that temperature is above the trip's temperature, its threshold will be set to its low temperature or otherwise its threshold will be set to its (high) temperature. Because the zone temperature is initially set to invalid and tz->last_temperature is only updated by update_temperature(), this is sufficient to set the correct initial threshold values for all trips. Link: https://lore.kernel.org/all/20220718145038.1114379-4-daniel.lezcano@linaro.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* thermal: core: Pass trip pointer to governor throttle callbackRafael J. Wysocki2023-10-201-25/+25
| | | | | | | | | | | | | | Modify the governor .throttle() callback definition so that it takes a trip pointer instead of a trip index as its second argument, adjust the governors accordingly and update the core code invoking .throttle(). This causes the governors to become independent of the representation of the list of trips in the thermal zone structure. This change is not expected to alter the general functionality. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
* Merge branch 'acpi-thermal'Rafael J. Wysocki2023-10-111-19/+0
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ACPI thermal driver changes include some thermal core modifications that are depended on by subsequent thermal core changes, so merge them. * acpi-thermal: (26 commits) thermal: trip: Drop lockdep assertion from thermal_zone_trip_id() thermal: trip: Remove lockdep assertion from for_each_thermal_trip() thermal: core: Drop thermal_zone_device_exec() ACPI: thermal: Use thermal_zone_for_each_trip() for updating trips ACPI: thermal: Combine passive and active trip update functions ACPI: thermal: Move get_active_temp() ACPI: thermal: Fix up function header formatting in two places ACPI: thermal: Drop list of device ACPI handles from struct acpi_thermal ACPI: thermal: Rename structure fields holding temperature in deci-Kelvin ACPI: thermal: Drop critical_valid and hot_valid trip flags ACPI: thermal: Do not use trip indices for cooling device binding ACPI: thermal: Mark uninitialized active trips as invalid ACPI: thermal: Merge trip initialization functions ACPI: thermal: Collapse trip devices update function wrappers ACPI: thermal: Collapse trip devices update functions ACPI: thermal: Add device list to struct acpi_thermal_trip ACPI: thermal: Fix a small leak in acpi_thermal_add() ACPI: thermal: Drop valid flag from struct acpi_thermal_trip ACPI: thermal: Drop redundant trip point flags ACPI: thermal: Untangle initialization and updates of active trips ...
| * thermal: core: Drop thermal_zone_device_exec()Rafael J. Wysocki2023-10-051-19/+0
| | | | | | | | | | | | | | | | | | | | | | Because thermal_zone_device_exec() has no users any more and there are no plans to use it anywhere, revert commit 9a99a996d1ec ("thermal: core: Introduce thermal_zone_device_exec()") that introduced it. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
* | thermal: core: prevent potential string overflowDan Carpenter2023-10-111-2/+4
|/ | | | | | | | | The dev->id value comes from ida_alloc() so it's a number between zero and INT_MAX. If it's too high then these sprintf()s will overflow. Fixes: 203d3d4aa482 ("the generic thermal sysfs driver") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* thermal: core: Allow trip pointers to be used for cooling device bindingRafael J. Wysocki2023-09-281-20/+34
| | | | | | | | | | | | | | Add new helper functions, thermal_bind_cdev_to_trip() and thermal_unbind_cdev_from_trip(), to allow a trip pointer to be used for binding a cooling device to a trip point and unbinding it, respectively, and redefine the existing helpers, thermal_zone_bind_cooling_device() and thermal_zone_unbind_cooling_device(), as wrappers around the new ones, respectively. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
* thermal: core: Store trip pointer in struct thermal_instanceRafael J. Wysocki2023-09-281-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the integer trip number stored in struct thermal_instance with a pointer to the relevant trip and adjust the code using the structure in question accordingly. The main reason for making this change is to allow the trip point to cooling device binding code more straightforward, as illustrated by subsequent modifications of the ACPI thermal driver, but it also helps to clarify the overall design and allows the governor code overhead to be reduced (through subsequent modifications). The only case in which it adds complexity is trip_point_show() that needs to walk the trips[] table to find the index of the given trip point, but this is not a critical path and the interface that trip_point_show() belongs to is problematic anyway (for instance, it doesn't cover the case when the same cooling devices is associated with multiple trip points). This is a preliminary change and the affected code will be refined by a series of subsequent modifications of thermal governors, the core and the ACPI thermal driver. The general functionality is not expected to be affected by this change. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
* thermal: core: Drop trips_disabled bitmaskRafael J. Wysocki2023-09-251-13/+0
| | | | | | | | | | | | | | | | | | | After recent changes, thermal_zone_get_trip() cannot fail, as invoked from thermal_zone_device_register_with_trips(), so the only role of the trips_disabled bitmask is struct thermal_zone_device is to make handle_thermal_trip() skip trip points whose temperature was initially zero. However, since the unit of temperature in the thermal core is millicelsius, zero may very well be a valid temperature value at least in some usage scenarios and the trip temperature may as well change later. Thus there is no reason to permanently disable trip points with initial temperature equal to zero. Accordingly, drop the trips_disabled bitmask along with the code related to it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
* thermal: core: Fix disabled trip point check in handle_thermal_trip()Rafael J. Wysocki2023-09-141-2/+4
| | | | | | | | | | | | | | | Commit bc840ea5f9a9 ("thermal: core: Do not handle trip points with invalid temperature") added a check for invalid temperature to the disabled trip point check in handle_thermal_trip(), but that check was added at a point when the trip structure has not been initialized yet. This may cause handle_thermal_trip() to skip a valid trip point in some cases, so fix it by moving the check to a suitable place, after __thermal_zone_get_trip() has been called to populate the trip structure. Fixes: bc840ea5f9a9 ("thermal: core: Do not handle trip points with invalid temperature") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* thermal: core: Drop thermal_zone_device_register()Rafael J. Wysocki2023-09-051-11/+0
| | | | | | | | | | | | There are no more users of thermal_zone_device_register(), so drop it from the core. Note that thermal_zone_device_register_with_trips() may be renamed to thermal_zone_device_register() in the future, but only after a grace period allowing all of the possible work in progress that may be using the latter to adjust. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* thermal: core: Add function for registering tripless thermal zonesRafael J. Wysocki2023-09-051-0/+11
| | | | | | | | Multiple callers of thermal_zone_device_register() don't pass any trips to it and they might use a shortened argument list for that, so add a special function with fewer arguments for this purpose. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* thermal: core: Drop unused .get_trip_*() callbacksRafael J. Wysocki2023-08-291-1/+1
| | | | | | | | | | After recent changes in the ACPI thermal driver and in the Intel DTS IOSF thermal driver, all thermal zone drivers are expected to use trip tables for initialization and none of them should implement .get_trip_type(), .get_trip_temp() or .get_trip_hyst() callbacks, so drop these callbacks entirely from the core. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* Merge branch 'acpi-thermal'Rafael J. Wysocki2023-08-251-1/+21
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge ACPI thermal driver changes for 6.6-rc1: - Drop non-functional nocrt parameter from ACPI thermal (Mario Limonciello). - Clean up the ACPI thermal driver, rework the handling of firmware notifications in it and make it provide a table of generic trip point structures to the core during initialization (Rafael Wysocki). * acpi-thermal: ACPI: thermal: Eliminate code duplication from acpi_thermal_notify() ACPI: thermal: Drop unnecessary thermal zone callbacks ACPI: thermal: Rework thermal_get_trend() ACPI: thermal: Use trip point table to register thermal zones thermal: core: Rework and rename __for_each_thermal_trip() ACPI: thermal: Introduce struct acpi_thermal_trip ACPI: thermal: Carry out trip point updates under zone lock ACPI: thermal: Clean up acpi_thermal_register_thermal_zone() thermal: core: Add priv pointer to struct thermal_trip thermal: core: Introduce thermal_zone_device_exec() thermal: core: Do not handle trip points with invalid temperature ACPI: thermal: Drop redundant local variable from acpi_thermal_resume() ACPI: thermal: Do not attach private data to ACPI handles ACPI: thermal: Drop enabled flag from struct acpi_thermal_active ACPI: thermal: Drop nocrt parameter
| * thermal: core: Introduce thermal_zone_device_exec()Rafael J. Wysocki2023-08-171-0/+19
| | | | | | | | | | | | | | Introduce a new helper function, thermal_zone_device_exec(), that can be used by drivers to run a given callback routine under the zone lock. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| * thermal: core: Do not handle trip points with invalid temperatureRafael J. Wysocki2023-08-101-1/+2
| | | | | | | | | | | | | | | | Trip points with temperature set to THERMAL_TEMP_INVALID are as good as disabled, so make handle_thermal_trip() ignore them. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
* | thermal: core: constify params in thermal_zone_device_registerAhmad Fatoum2023-07-241-2/+2
|/ | | | | | | | | | | | | | | Since commit 3d439b1a2ad3 ("thermal/core: Alloc-copy-free the thermal zone parameters structure"), thermal_zone_device_register() allocates a copy of the tzp argument and callers need not explicitly manage its lifetime. This means the function no longer cares about the parameter being mutable, so constify it. No functional change. Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* thermal: core: Encapsulate tz->device fieldDaniel Lezcano2023-04-271-0/+6
| | | | | | | | | | | | | | | | | | | | | There are still some drivers needing to play with the thermal zone device internals. That is not the best but until we can figure out if the information is really needed, let's encapsulate the field used in the thermal zone device structure, so we can move forward relocating the thermal zone device structure definition in the thermal framework private headers. Some drivers are accessing tz->device, that implies they need to have the knowledge of the thermal_zone_device structure but we want to self-encapsulate this structure and reduce the scope of the structure to the thermal core only. By adding this wrapper, these drivers won't need the thermal zone device structure definition and are no longer an obstacle to its relocation to the private thermal core headers. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* thermal/core: Alloc-copy-free the thermal zone parameters structureDaniel Lezcano2023-04-071-2/+13
| | | | | | | | | | | | | | | | | | | | | | | The caller of the function thermal_zone_device_register_with_trips() can pass a thermal_zone_params structure parameter. This one is used by the thermal core code until the thermal zone is destroyed. That forces the caller, so the driver, to keep the pointer valid until it unregisters the thermal zone if we want to make the thermal zone device structure private the core code. As the thermal zone device structure would be private, the driver can not access to thermal zone device structure to retrieve the tzp field after it passed it to register the thermal zone. So instead of forcing the users of the function to deal with the tzp structure life cycle, make the usage easier by allocating our own thermal zone params, copying the parameter content and by freeing at unregister time. The user can then create the parameters on the stack, pass it to the registering function and forget about it. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20230404075138.2914680-3-daniel.lezcano@linaro.org
* thermal/core: Remove thermal_bind_params structureZhang Rui2023-04-071-118/+11
| | | | | | | | | Remove struct thermal_bind_params because no one is using it for thermal binding now. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20230330104526.3196-1-rui.zhang@intel.com
* Merge tag 'thermal-v6.4-rc1-1' of ↵Rafael J. Wysocki2023-04-031-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux Pull thermal control material for 6.4-rc1 from Daniel Lezcano: "- Add more thermal zone device encapsulation: prevent setting structure field directly, access the sensor device instead the thermal zone's device for trace, relocate the traces in drivers/thermal (Daniel Lezcano) - Use the generic trip point for the i.MX and remove the get_trip_temp ops (Daniel Lezcano) - Use the devm_platform_ioremap_resource() in the Hisilicon driver (Yang Li) - Remove R-Car H3 ES1.* handling as public has only access to the ES2 version and the upstream support for the ES1 has been shutdown (Wolfram Sang) - Add a delay after initializing the bank in order to let the time to the hardware to initialze itself before reading the temperature (Amjad Ouled-Ameur) - Add MT8365 support (Amjad Ouled-Ameur)" * tag 'thermal-v6.4-rc1-1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux: thermal/drivers/ti: Use fixed update interval thermal/drivers/stm: Don't set no_hwmon to false thermal/drivers/db8500: Use driver dev instead of tz->device thermal/core: Relocate the traces definition in thermal directory thermal/drivers/hisi: Use devm_platform_ioremap_resource() thermal/drivers/imx: Use the thermal framework for the trip point thermal/drivers/imx: Remove get_trip_temp ops thermal/drivers/rcar_gen3_thermal: Remove R-Car H3 ES1.* handling thermal/drivers/mediatek: Add delay after thermal banks initialization thermal/drivers/mediatek: Add support for MT8365 SoC thermal/drivers/mediatek: Control buffer enablement tweaks dt-bindings: thermal: mediatek: Add binding documentation for MT8365 SoC
| * thermal/core: Relocate the traces definition in thermal directoryDaniel Lezcano2023-04-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The traces are exported but only local to the thermal core code. On the other side, the traces take the thermal zone device structure as argument, thus they have to rely on the exported thermal.h header file. As we want to move the structure to the private thermal core header, first we have to relocate those traces to the same place as many drivers do. Cc: Steven Rostedt <rostedt@goodmis.org> Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20230307133735.90772-2-daniel.lezcano@linaro.org
* | thermal: core: Clean up thermal_list_lock lockingRafael J. Wysocki2023-04-031-6/+2
| | | | | | | | | | | | | | | | | | | | | | Once thermal_list_lock has been acquired in __thermal_cooling_device_register(), it is not necessary to drop it and take it again until all of the thermal zones have been updated, so change the code accordingly. No expected functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* | Merge back thermal control material for 6.4-rc1.Rafael J. Wysocki2023-03-271-0/+18
|\|
| * thermal: Add a thermal zone id accessorDaniel Lezcano2023-03-031-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to get the thermal zone id but without directly accessing the thermal zone device structure, add an accessor. Use the accessor in the hwmon_scmi and acpi_thermal. No functional change intented. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| * thermal/core: Add thermal_zone_device structure 'type' accessorDaniel Lezcano2023-03-031-0/+6
| | | | | | | | | | | | | | | | | | | | The thermal zone device structure is exposed via the exported thermal.h header. This structure should stay private the thermal core code. In order to encapsulate the structure, let's add an accessor to get the 'type' of the thermal zone. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| * thermal/core: Add a thermal zone 'devdata' accessorDaniel Lezcano2023-03-031-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The thermal zone device structure is exposed to the different drivers and obviously they access the internals while that should be restricted to the core thermal code. In order to self-encapsulate the thermal core code, we need to prevent the drivers accessing directly the thermal zone structure and provide accessor functions to deal with. Provide an accessor to the 'devdata' structure and make use of it in the different drivers. No functional changes intended. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Mark Brown <broonie@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* | Merge branch 'thermal-acpi'Rafael J. Wysocki2023-03-241-7/+97
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge a fix for a recent thermal-related regression in the ACPI processor driver. * thermal-acpi: ACPI: processor: thermal: Update CPU cooling devices on cpufreq policy changes thermal: core: Introduce thermal_cooling_device_update() thermal: core: Introduce thermal_cooling_device_present() ACPI: processor: Reorder acpi_processor_driver_init()