summaryrefslogtreecommitdiff
path: root/drivers/cpufreq/cppc_cpufreq.c
AgeCommit message (Collapse)AuthorFilesLines
2024-11-19Merge tag 'cpufreq-arm-updates-6.13' of ↵Rafael J. Wysocki1-83/+53
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Merge ARM cpufreq updates for 6.13 from Viresh Kumar: "- Add virtual cpufreq driver for guest kernels (David Dai). - Minor cleanup to various cpufreq drivers (Andy Shevchenko, Dhruva Gole, Jie Zhan, Jinjie Ruan, Shuosheng Huang, Sibi Sankar, and Yuan Can). - Revert "cpufreq: brcmstb-avs-cpufreq: Fix initial command check" (Colin Ian King). - Improve DT bindings for qcom-hw driver (Dmitry Baryshkov, Konrad Dybcio, and Nikunj Kela)." * tag 'cpufreq-arm-updates-6.13' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: arm64: dts: qcom: sc8180x: Add a SoC-specific compatible to cpufreq-hw dt-bindings: cpufreq: cpufreq-qcom-hw: Add SC8180X compatible cpufreq: sun50i: add a100 cpufreq support cpufreq: mediatek-hw: Fix wrong return value in mtk_cpufreq_get_cpu_power() cpufreq: CPPC: Fix wrong return value in cppc_get_cpu_power() cpufreq: CPPC: Fix wrong return value in cppc_get_cpu_cost() cpufreq: loongson3: Check for error code from devm_mutex_init() call cpufreq: scmi: Fix cleanup path when boost enablement fails cpufreq: CPPC: Fix possible null-ptr-deref for cppc_get_cpu_cost() cpufreq: CPPC: Fix possible null-ptr-deref for cpufreq_cpu_get_raw() Revert "cpufreq: brcmstb-avs-cpufreq: Fix initial command check" dt-bindings: cpufreq: cpufreq-qcom-hw: Add SAR2130P compatible cpufreq: add virtual-cpufreq driver dt-bindings: cpufreq: add virtual cpufreq device cpufreq: loongson2: Unregister platform_driver on failure cpufreq: ti-cpufreq: Remove revision offsets in AM62 family cpufreq: ti-cpufreq: Allow backward compatibility for efuse syscon cppc_cpufreq: Remove HiSilicon CPPC workaround cppc_cpufreq: Use desired perf if feedback ctrs are 0 or unchanged dt-bindings: cpufreq: qcom-hw: document support for SA8255p
2024-11-11cpufreq: CPPC: Fix wrong return value in cppc_get_cpu_power()Jinjie Ruan1-1/+1
cppc_get_cpu_power() return 0 if the policy is NULL. Then in em_create_perf_table(), the later zero check for power is not valid as power is uninitialized. As Quentin pointed out, kernel energy model core check the return value of active_power() first, so if the callback failed it should tell the core. So return -EINVAL to fix it. Fixes: a78e72075642 ("cpufreq: CPPC: Fix possible null-ptr-deref for cpufreq_cpu_get_raw()") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Suggested-by: Quentin Perret <qperret@google.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-11-11cpufreq: CPPC: Fix wrong return value in cppc_get_cpu_cost()Jinjie Ruan1-1/+1
cppc_get_cpu_cost() return 0 if the policy is NULL. Then in em_compute_costs(), the later zero check for cost is not valid as cost is uninitialized. As Quentin pointed out, kernel energy model core check the return value of get_cost() first, so if the callback failed it should tell the core. Return -EINVAL to fix it. Fixes: 1a1374bb8c59 ("cpufreq: CPPC: Fix possible null-ptr-deref for cppc_get_cpu_cost()") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/all/c4765377-7830-44c2-84fa-706b6e304e10@stanley.mountain/ Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Suggested-by: Quentin Perret <qperret@google.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-10-30cpufreq: CPPC: Fix possible null-ptr-deref for cppc_get_cpu_cost()Jinjie Ruan1-0/+3
cpufreq_cpu_get_raw() may return NULL if the cpu is not in policy->cpus cpu mask and it will cause null pointer dereference, so check NULL for cppc_get_cpu_cost(). Fixes: 740fcdc2c20e ("cpufreq: CPPC: Register EM based on efficiency class information") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-10-30cpufreq: CPPC: Fix possible null-ptr-deref for cpufreq_cpu_get_raw()Jinjie Ruan1-0/+3
cpufreq_cpu_get_raw() may return NULL if the cpu is not in policy->cpus cpu mask and it will cause null pointer dereference. Fixes: 740fcdc2c20e ("cpufreq: CPPC: Register EM based on efficiency class information") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-10-03cppc_cpufreq: Remove HiSilicon CPPC workaroundJie Zhan1-72/+1
Since commit 6c8d750f9784 ("cpufreq / cppc: Work around for Hisilicon CPPC cpufreq"), we introduce a workround for HiSilicon platforms that do not support performance feedback counters, whereas they can get the actual frequency from the desired perf register. Later on, FIE is disabled in that workaround as well. Now the workround can be handled by the common code. Desired perf would be read and converted to frequency if feedback counters don't change. FIE would be disabled if the CPPC regs are in PCC region. Hence, the workaround is no longer needed and can be safely removed, in an effort to consolidate the driver procedure. Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com> Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by: Huisong Li <lihuisong@huawei.com> [ Viresh: Move fie_disabled withing CONFIG option to fix warning ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-10-03cppc_cpufreq: Use desired perf if feedback ctrs are 0 or unchangedJie Zhan1-11/+46
The CPPC performance feedback counters could be 0 or unchanged when the target cpu is in a low-power idle state, e.g. power-gated or clock-gated. When the counters are 0, cppc_cpufreq_get_rate() returns 0 KHz, which makes cpufreq_online() get a false error and fail to generate a cpufreq policy. When the counters are unchanged, the existing cppc_perf_from_fbctrs() returns a cached desired perf, but some platforms may update the real frequency back to the desired perf reg. For the above cases in cppc_cpufreq_get_rate(), get the latest desired perf from the CPPC reg to reflect the frequency because some platforms may update the actual frequency back there; if failed, use the cached desired perf. Fixes: 6a4fec4f6d30 ("cpufreq: cppc: cppc_cpufreq_get_rate() returns zero in all error cases.") Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com> Reviewed-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com> Reviewed-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-10-02move asm/unaligned.h to linux/unaligned.hAl Viro1-1/+1
asm/unaligned.h is always an include of asm-generic/unaligned.h; might as well move that thing to linux/unaligned.h and include that - there's nothing arch-specific in that header. auto-generated by the following: for i in `git grep -l -w asm/unaligned.h`; do sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i done for i in `git grep -l -w asm-generic/unaligned.h`; do sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i done git mv include/asm-generic/unaligned.h include/linux/unaligned.h git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
2024-09-11cpufreq/cppc: Use NSEC_PER_MSEC for deadline taskChristian Loehle1-3/+3
Convert the cppc deadline task attributes to use the available definitions to make them more readable. No functional change. Signed-off-by: Christian Loehle <christian.loehle@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Juri Lelli <juri.lelli@redhat.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20240813144348.1180344-4-christian.loehle@arm.com
2024-07-09cpufreq: Make cpufreq_driver->exit() return voidLizhe1-2/+1
The cpufreq core doesn't check the return type of the exit() callback and there is not much the core can do on failures at that point. Just drop the returned value and make it return void. Signed-off-by: Lizhe <sensor1010@163.com> [ Viresh: Reworked the patches to fix all missing changes together. ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> # Mediatek Acked-by: Sudeep Holla <sudeep.holla@arm.com> # scpi, scmi, vexpress Acked-by: Mario Limonciello <mario.limonciello@amd.com> # amd Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> # bmips Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Kevin Hilman <khilman@baylibre.com> # omap
2024-06-13cpufreq/cppc: Don't compare desired_perf in target()Riwen Lu1-7/+2
There is a corner case where the desired_perf is exactly same as the old perf, but the actual current freq is not. This happens during S3 while the cpufreq governor is set to powersave. During cpufreq resume process, the booting CPU's new_freq obtained via .get() is the highest frequency, while the policy->cur and cpu->perf_ctrls.desired_perf are set to the lowest level (powersave governor). This causes the warning: "CPU frequency out of sync:", and the cpufreq core sets policy->cur to new_freq. Then the governor->limits() calls cppc_cpufreq_set_target() to configures the CPU frequency and returns directly because the desired_perf converted from target_freq is same as the cpu->perf_ctrls.desired_perf and both are the lowest_perf. Since target_freq and policy->cur have been already compared in __cpufreq_driver_target(), there's no need to compare them again here. Drop the comparison. Signed-off-by: Riwen Lu <luriwen@kylinos.cn> [ Viresh: Updated commit message / subject ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-04-19cppc_cpufreq: Fix possible null pointer dereferenceAleksandr Mishin1-2/+12
cppc_cpufreq_get_rate() and hisi_cppc_cpufreq_get_rate() can be called from different places with various parameters. So cpufreq_cpu_get() can return null as 'policy' in some circumstances. Fix this bug by adding null return check. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: a28b2bfc099c ("cppc_cpufreq: replace per-cpu data array with a list") Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2023-12-23cpufreq/cppc: Move and rename cppc_cpufreq_{perf_to_khz|khz_to_perf}()Vincent Guittot1-122/+17
Move and rename cppc_cpufreq_perf_to_khz() and cppc_cpufreq_khz_to_perf() to use them outside cppc_cpufreq in topology_init_cpu_capacity_cppc(). Modify the interface to use struct cppc_perf_caps *caps instead of struct cppc_cpudata *cpu_data as we only use the fields of cppc_perf_caps. cppc_cpufreq was converting the lowest and nominal freq from MHz to kHz before using them. We move this conversion inside cppc_perf_to_khz and cppc_khz_to_perf to make them generic and usable outside cppc_cpufreq. No functional change Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Pierre Gondois <pierre.gondois@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20231211104855.558096-6-vincent.guittot@linaro.org
2023-08-17cpufreq: cppc: Set fie_disabled to FIE_DISABLED if fails to create kworker_fieLiao Chang1-3/+6
The function cppc_freq_invariance_init() may failed to create kworker_fie, make it more robust by setting fie_disabled to FIE_DISBALED to prevent an invalid pointer dereference in kthread_destroy_worker(), which called from cppc_freq_invariance_exit(). Signed-off-by: Liao Chang <liaochang1@huawei.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2023-08-17cpufreq: cppc: cppc_cpufreq_get_rate() returns zero in all error cases.Liao Chang1-2/+2
The cpufreq framework used to use the zero of return value to reflect the cppc_cpufreq_get_rate() had failed to get current frequecy and treat all positive integer to be succeed. Since cppc_get_perf_ctrs() returns a negative integer in error case, so it is better to convert the value to zero as the return value of cppc_cpufreq_get_rate(). Signed-off-by: Liao Chang <liaochang1@huawei.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-12-27cpufreq: CPPC: Add u64 casts to avoid overflowingPierre Gondois1-5/+6
The fields of the _CPC object are unsigned 32-bits values. To avoid overflows while using _CPC's values, add 'u64' casts. Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-09-24ACPI: CPPC: Disable FIE if registers in PCC regionsJeremy Linton1-4/+21
PCC regions utilize a mailbox to set/retrieve register values used by the CPPC code. This is fine as long as the operations are infrequent. With the FIE code enabled though the overhead can range from 2-11% of system CPU overhead (ex: as measured by top) on Arm based machines. So, before enabling FIE assure none of the registers used by cppc_get_perf_ctrs() are in the PCC region. Finally, add a module parameter which can override the PCC region detection at boot or module reload. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-08-25ACPI: CPPC: Add ACPI disabled check to acpi_cpc_valid()Perry Yuan1-1/+1
Make acpi_cpc_valid() check if ACPI is disabled, so that its callers don't need to check that separately. This will also cause the AMD pstate driver to refuse to load right away when ACPI is disabled. Also update the warning message in amd_pstate_init() to mention the ACPI disabled case for completeness. Signed-off-by: Perry Yuan <Perry.Yuan@amd.com> [ rjw: Subject edits, new changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-30cpufreq: CPPC: Fix unused-function warningPierre Gondois1-9/+8
Building the cppc_cpufreq driver with for arm64 with CONFIG_ENERGY_MODEL=n triggers the following warnings: drivers/cpufreq/cppc_cpufreq.c:550:12: error: ‘cppc_get_cpu_cost’ defined but not used [-Werror=unused-function] 550 | static int cppc_get_cpu_cost(struct device *cpu_dev, unsigned long KHz, | ^~~~~~~~~~~~~~~~~ drivers/cpufreq/cppc_cpufreq.c:481:12: error: ‘cppc_get_cpu_power’ defined but not used [-Werror=unused-function] 481 | static int cppc_get_cpu_power(struct device *cpu_dev, | ^~~~~~~~~~~~~~~~~~ Move the Energy Model related functions into specific guards. This allows to fix the warning and prevent doing extra work when the Energy Model is not present. Fixes: 740fcdc2c20e ("cpufreq: CPPC: Register EM based on efficiency class information") Reported-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> Tested-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-30cpufreq: CPPC: Fix build error without CONFIG_ACPI_CPPC_CPUFREQ_FIEZheng Bin1-1/+2
If CONFIG_ACPI_CPPC_CPUFREQ_FIE is not set, building fails: drivers/cpufreq/cppc_cpufreq.c: In function ‘populate_efficiency_class’: drivers/cpufreq/cppc_cpufreq.c:584:2: error: ‘cppc_cpufreq_driver’ undeclared (first use in this function); did you mean ‘cpufreq_driver’? cppc_cpufreq_driver.register_em = cppc_cpufreq_register_em; ^~~~~~~~~~~~~~~~~~~ cpufreq_driver Make declare of cppc_cpufreq_driver out of CONFIG_ACPI_CPPC_CPUFREQ_FIE to fix this. Fixes: 740fcdc2c20e ("cpufreq: CPPC: Register EM based on efficiency class information") Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-19cpufreq: CPPC: Enable dvfs_possible_from_any_cpuPierre Gondois1-0/+1
The communication mean of the _CPC desired performance can be PCC, System Memory, System IO, or Functional Fixed Hardware (FFH). PCC, SystemMemory and SystemIo address spaces are available from any CPU. Thus, dvfs_possible_from_any_cpu should be enabled in such case. For FFH, let the FFH implementation do smp_call_function_*() calls. Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-19cpufreq: CPPC: Enable fast_switchPierre Gondois1-0/+24
The communication mean of the _CPC desired performance can be PCC, System Memory, System IO, or Functional Fixed Hardware. commit b7898fda5bc7 ("cpufreq: Support for fast frequency switching") fast_switching is 'for switching CPU frequencies from interrupt context'. Writes to SystemMemory and SystemIo are fast and suitable this. This is not the case for PCC and might not be the case for FFH. Enable fast_switching for the cppc_cpufreq driver in above cases. Add cppc_allow_fast_switch() to check the desired performance register address space and set fast_switching accordingly. Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-06cpufreq: CPPC: Register EM based on efficiency class informationPierre Gondois1-0/+144
Performance states and energy consumption values are not advertised in ACPI. In the GicC structure of the MADT table, the "Processor Power Efficiency Class field" (called efficiency class from now) allows to describe the relative energy efficiency of CPUs. To leverage the EM and EAS, the CPPC driver creates a set of artificial performance states and registers them in the Energy Model (EM), such as: - Every 20 capacity unit, a performance state is created. - The energy cost of each performance state gradually increases. No power value is generated as only the cost is used in the EM. During task placement, a task can raise the frequency of its whole pd. This can make EAS place a task on a pd with CPUs that are individually less energy efficient. As cost values are artificial, and to place tasks on CPUs with the lower efficiency class, a gap in cost values is generated for adjacent efficiency classes. E.g.: - efficiency class = 0, capacity is in [0-1024], so cost values are in [0: 51] (one performance state every 20 capacity unit) - efficiency class = 1, capacity is in [0-1024], cost values are in [1*gap+0: 1*gap+51]. The value of the cost gap is chosen to absorb a the energy of 4 CPUs at their maximum capacity. This means that between: 1- a pd of 4 CPUs, each of them being used at almost their full capacity. Their efficiency class is N. 2- a CPU using almost none of its capacity. Its efficiency class is N+1 EAS will choose the first option. This patch also populates the (struct cpufreq_driver).register_em callback if the valid efficiency_class ACPI values are provided. Signed-off-by: Pierre Gondois <Pierre.Gondois@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-06cpufreq: CPPC: Add per_cpu efficiency_classPierre Gondois1-0/+42
In ACPI, describing power efficiency of CPUs can be done through the following arm specific field: ACPI 6.4, s5.2.12.14 'GIC CPU Interface (GICC) Structure', 'Processor Power Efficiency Class field': Describes the relative power efficiency of the associated pro- cessor. Lower efficiency class numbers are more efficient than higher ones (e.g. efficiency class 0 should be treated as more efficient than efficiency class 1). However, absolute values of this number have no meaning: 2 isn’t necessarily half as efficient as 1. The efficiency_class field is stored in the GicC structure of the ACPI MADT table and it's currently supported in Linux for arm64 only. Thus, this new functionality is introduced for arm64 only. To allow the cppc_cpufreq driver to know and preprocess the efficiency_class values of all the CPUs, add a per_cpu efficiency_class variable to store them. At least 2 different efficiency classes must be present, otherwise there is no use in creating an Energy Model. The efficiency_class values are squeezed in [0:#efficiency_class-1] while conserving the order. For instance, efficiency classes of: [111, 212, 250] will be mapped to: [0 (was 111), 1 (was 212), 2 (was 250)]. Each policy being independently registered in the driver, populating the per_cpu efficiency_class is done only once at the driver initialization. This prevents from having each policy re-searching the efficiency_class values of other CPUs. The EM will be registered in a following patch. The patch also exports acpi_cpu_get_madt_gicc() to fetch the GicC structure of the ACPI MADT table for each CPU. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Pierre Gondois <Pierre.Gondois@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-02-10cpufreq: CPPC: Fix performance/frequency conversionPierre Gondois1-22/+21
CPUfreq governors request CPU frequencies using information on current CPU usage. The CPPC driver converts them to performance requests. Frequency targets are computed as: target_freq = (util / cpu_capacity) * max_freq target_freq is then clamped between [policy->min, policy->max]. The CPPC driver converts performance values to frequencies (and vice-versa) using cppc_cpufreq_perf_to_khz() and cppc_cpufreq_khz_to_perf(). These functions both use two different factors depending on the range of the input value. For cppc_cpufreq_khz_to_perf(): - (NOMINAL_PERF / NOMINAL_FREQ) or - (LOWEST_PERF / LOWEST_FREQ) and for cppc_cpufreq_perf_to_khz(): - (NOMINAL_FREQ / NOMINAL_PERF) or - ((NOMINAL_PERF - LOWEST_FREQ) / (NOMINAL_PERF - LOWEST_PERF)) This means: 1- the functions are not inverse for some values: (perf_to_khz(khz_to_perf(x)) != x) 2- cppc_cpufreq_perf_to_khz(LOWEST_PERF) can sometimes give a different value from LOWEST_FREQ due to integer approximation 3- it is implied that performance and frequency are proportional (NOMINAL_FREQ / NOMINAL_PERF) == (LOWEST_PERF / LOWEST_FREQ) This patch changes the conversion functions to an affine function. This fixes the 3 points above. Suggested-by: Lukasz Luba <lukasz.luba@arm.com> Suggested-by: Morten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: Pierre Gondois <Pierre.Gondois@arm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-10-04cpufreq: remove useless INIT_LIST_HEAD()Han Wang1-2/+0
list cpu_data_list has been inited staticly through LIST_HEAD, so there's no need to call another INIT_LIST_HEAD. Simply remove it from cppc_cpufreq_init. Signed-off-by: Han Wang <zjuwanghan@outlook.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-07-01cpufreq: CPPC: Add support for frequency invarianceViresh Kumar1-13/+239
The Frequency Invariance Engine (FIE) is providing a frequency scaling correction factor that helps achieve more accurate load-tracking. Normally, this scaling factor can be obtained directly with the help of the cpufreq drivers as they know the exact frequency the hardware is running at. But that isn't the case for CPPC cpufreq driver. Another way of obtaining that is using the arch specific counter support, which is already present in kernel, but that hardware is optional for platforms. This patch updates the CPPC driver to register itself with the topology core to provide its own implementation (cppc_scale_freq_tick()) of topology_scale_freq_tick() which gets called by the scheduler on every tick. Note that the arch specific counters have higher priority than CPPC counters, if available, though the CPPC driver doesn't need to have any special handling for that. On an invocation of cppc_scale_freq_tick(), we schedule an irq work (since we reach here from hard-irq context), which then schedules a normal work item and cppc_scale_freq_workfn() updates the per_cpu arch_freq_scale variable based on the counter updates since the last tick. To allow platforms to disable this CPPC counter-based frequency invariance support, this is all done under CONFIG_ACPI_CPPC_CPUFREQ_FIE, which is enabled by default. This also exports sched_setattr_nocheck() as the CPPC driver can be built as a module. Cc: linux-acpi@vger.kernel.org Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Qian Cai <quic_qiancai@quicinc.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-07-01cpufreq: CPPC: Pass structure instance by referenceViresh Kumar1-8/+8
Don't pass structure instance by value, pass it by reference instead. Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Qian Cai <quic_qiancai@quicinc.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-07-01cpufreq: CPPC: Fix potential memleak in cppc_cpufreq_cpu_initViresh Kumar1-8/+20
It's a classic example of memleak, we allocate something, we fail and never free the resources. Make sure we free all resources on policy ->init() failures. Fixes: a28b2bfc099c ("cppc_cpufreq: replace per-cpu data array with a list") Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Qian Cai <quic_qiancai@quicinc.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-06-30cpufreq: CPPC: Migrate to ->exit() callback instead of ->stop_cpu()Viresh Kumar1-22/+24
Commit 367dc4aa932b ("cpufreq: Add stop CPU callback to cpufreq_driver interface") added the ->stop_cpu() callback to allow the drivers to do clean up before the CPU is completely down and its state can't be modified. At that time the CPU hotplug framework used to call the cpufreq core's registered notifier for different events like CPU_DOWN_PREPARE and CPU_POST_DEAD. The ->stop_cpu() callback was called during the CPU_DOWN_PREPARE event. This is no longer the case, cpuhp_cpufreq_offline() is called only once by the CPU hotplug core now and we don't really need two separate callbacks for cpufreq drivers, i.e. ->stop_cpu() and -<exit(), as everything can be done from the ->exit() callback itself. Migrate to using the ->exit() callback instead of ->stop_cpu(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Minor edits in the changelog and subject ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-14Revert "cpufreq: CPPC: Add support for frequency invariance"Viresh Kumar1-233/+12
This reverts commit 4c38f2df71c8e33c0b64865992d693f5022eeaad. There are few races in the frequency invariance support for CPPC driver, namely the driver doesn't stop the kthread_work and irq_work on policy exit during suspend/resume or CPU hotplug. A proper fix won't be possible for the 5.13-rc, as it requires a lot of changes. Lets revert the patch instead for now. Fixes: 4c38f2df71c8 ("cpufreq: CPPC: Add support for frequency invariance") Reported-by: Qian Cai <quic_qiancai@quicinc.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-22cpufreq: cppc: simplify default delay_us settingTom Saeger1-12/+2
Simplify case when setting default in cppc_cpufreq_get_transition_delay_us. Signed-off-by: Tom Saeger <tom.saeger@oracle.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-03-22cpufreq: CPPC: Add support for frequency invarianceViresh Kumar1-12/+233
The Frequency Invariance Engine (FIE) is providing a frequency scaling correction factor that helps achieve more accurate load-tracking. Normally, this scaling factor can be obtained directly with the help of the cpufreq drivers as they know the exact frequency the hardware is running at. But that isn't the case for CPPC cpufreq driver. Another way of obtaining that is using the arch specific counter support, which is already present in kernel, but that hardware is optional for platforms. This patch updates the CPPC driver to register itself with the topology core to provide its own implementation (cppc_scale_freq_tick()) of topology_scale_freq_tick() which gets called by the scheduler on every tick. Note that the arch specific counters have higher priority than CPPC counters, if available, though the CPPC driver doesn't need to have any special handling for that. On an invocation of cppc_scale_freq_tick(), we schedule an irq work (since we reach here from hard-irq context), which then schedules a normal work item and cppc_scale_freq_workfn() updates the per_cpu arch_freq_scale variable based on the counter updates since the last tick. To allow platforms to disable this CPPC counter-based frequency invariance support, this is all done under CONFIG_ACPI_CPPC_CPUFREQ_FIE, which is enabled by default. This also exports sched_setattr_nocheck() as the CPPC driver can be built as a module. Cc: linux-acpi@vger.kernel.org Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2020-12-15cppc_cpufreq: replace per-cpu data array with a listIonela Voinescu1-83/+91
The cppc_cpudata per-cpu storage was inefficient (1) additional to causing functional issues (2) when CPUs are hotplugged out, due to per-cpu data being improperly initialised. (1) The amount of information needed for CPPC performance control in its cpufreq driver depends on the domain (PSD) coordination type: ANY: One set of CPPC control and capability data (e.g desired performance, highest/lowest performance, etc) applies to all CPUs in the domain. ALL: Same as ANY. To be noted that this type is not currently supported. When supported, information about which CPUs belong to a domain is needed in order for frequency change requests to be sent to each of them. HW: It's necessary to store CPPC control and capability information for all the CPUs. HW will then coordinate the performance state based on their limitations and requests. NONE: Same as HW. No HW coordination is expected. Despite this, the previous initialisation code would indiscriminately allocate memory for all CPUs (all_cpu_data) and unnecessarily duplicate performance capabilities and the domain sharing mask and type for each possible CPU. (2) With the current per-cpu structure, when having ANY coordination, the cppc_cpudata cpu information is not initialised (will remain 0) for all CPUs in a policy, other than policy->cpu. When policy->cpu is hotplugged out, the driver will incorrectly use the uninitialised (0) value of the other CPUs when making frequency changes. Additionally, the previous values stored in the perf_ctrls.desired_perf will be lost when policy->cpu changes. Therefore replace the array of per cpu data with a list. The memory for each structure is allocated at policy init, where a single structure can be allocated per policy, not per cpu. In order to accommodate the struct list_head node in the cppc_cpudata structure, the now unused cpu and cur_policy variables are removed. For example, on a arm64 Juno platform with 6 CPUs: (0, 1, 2, 3) in PSD1, (4, 5) in PSD2 - ANY coordination, the memory allocation comparison shows: Before patch: - ANY coordination: total slack req alloc/free caller 0 0 0 0/1 _kernel_size_le_hi32+0x0xffff800008ff7810 0 0 0 0/6 _kernel_size_le_hi32+0x0xffff800008ff7808 128 80 48 1/0 _kernel_size_le_hi32+0x0xffff800008ffc070 768 0 768 6/0 _kernel_size_le_hi32+0x0xffff800008ffc0e4 After patch: - ANY coordination: total slack req alloc/free caller 256 0 256 2/0 _kernel_size_le_hi32+0x0xffff800008fed410 0 0 0 0/2 _kernel_size_le_hi32+0x0xffff800008fed274 Additional notes: - A pointer to the policy's cppc_cpudata is stored in policy->driver_data - Driver registration is skipped if _CPC entries are not present. Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Mian Yousaf Kaukab <ykaukab@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-12-15cppc_cpufreq: expose information on frequency domainsIonela Voinescu1-0/+14
Use the existing sysfs attribute "freqdomain_cpus" to expose information to userspace about CPUs in the same frequency domain. Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Mian Yousaf Kaukab <ykaukab@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-12-15cppc_cpufreq: clarify support for coordination typesIonela Voinescu1-7/+12
The previous coordination type handling in the cppc_cpufreq init code created some confusion: the comment mentioned "Support only SW_ANY for now" while only the SW_ALL/ALL case resulted in a failure. The other coordination types (HW_ALL/HW, NONE) were silently supported. Clarify support for coordination types while describing in comments the intended behavior. Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Mian Yousaf Kaukab <ykaukab@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-12-15cppc_cpufreq: use policy->cpu as driver of frequency settingIonela Voinescu1-2/+3
Considering only the currently supported coordination types (ANY, HW, NONE), this change only makes a difference for the ANY type, when policy->cpu is hotplugged out. In that case the new policy->cpu will be different from ((struct cppc_cpudata *)policy->driver_data)->cpu. While in this case the controls of *ANY* CPU could be used to drive frequency changes, it's more consistent to use policy->cpu as the leading CPU, as used in all other cppc_cpufreq functions. Additionally, the debug prints in cppc_set_perf() would no longer create confusion when referring to a CPU that is hotplugged out. Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Mian Yousaf Kaukab <ykaukab@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-11-17cppc_cpufreq: simplify use of performance capabilitiesIonela Voinescu1-17/+23
The CPPC performance capabilities are used significantly throughout the driver. Simplify the use of them by introducing a local pointer "caps" to point to cpu_data->perf_caps, in functions that access performance capabilities often. Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-11-17cppc_cpufreq: clean up cpu, cpu_num and cpunum variable useIonela Voinescu1-74/+69
In order to maintain the typical naming convention in the cpufreq framework: - replace the use of "cpu" variable name for cppc_cpudata pointers with "cpu_data" - replace variable names "cpu_num" and "cpunum" with "cpu" - make cpu variables unsigned int Where pertinent, also move the initialisation of cpu_data variable to its declaration and make consistent use of the local "cpu" variable. Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-11-17cppc_cpufreq: fix misspelling, code style and readability issuesIonela Voinescu1-16/+16
Fix a few trivial issues in the cppc_cpufreq driver: - indentation of function arguments - consistent use of tabs (vs space) in defines - spelling: s/Offest/Offset, s/trasition/transition - order of local variables, from long pointers to structures to short ret and i (index) variables, to improve readability Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-30cpufreq: CPPC: Reuse caps variable in few routinesXin Hao1-2/+2
The 'caps' variable has been defined in cppc_cpufreq_khz_to_perf() and cppc_cpufreq_perf_to_khz() routines, so there is no need to get 'highest_perf' value through 'cpu->caps.highest_perf', we can use 'caps->highest_perf' instead. Signed-off-by: Xin Hao <xhao@linux.alibaba.com> [ Viresh: Updated commit log ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2020-07-30cpufreq: cppc: Reorder code and remove apply_hisi_workaround variableViresh Kumar1-49/+42
With the current approach we have an extra check in the cppc_cpufreq_get_rate() callback, which checks if hisilicon's get rate implementation should be used instead. While it works fine, the approach isn't very straight forward, over that we have an extra check in the routine. Rearrange code and update the cpufreq driver's get() callback pointer directly for the hisilicon case. This gets the extra variable is removed and the extra check isn't required anymore as well. Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2020-06-05cpufreq: CPPC: add SW BOOST supportXiongfeng Wang1-2/+37
To add SW BOOST support for CPPC, we need to get the max frequency of boost mode and non-boost mode. ACPI spec 6.2 section 8.4.7.1 describes the following two CPC registers. "Highest performance is the absolute maximum performance an individual processor may reach, assuming ideal conditions. This performance level may not be sustainable for long durations, and may only be achievable if other platform components are in a specific state; for example, it may require other processors be in an idle state. Nominal Performance is the maximum sustained performance level of the processor, assuming ideal operating conditions. In absence of an external constraint (power, thermal, etc.) this is the performance level the platform is expected to be able to maintain continuously. All processors are expected to be able to sustain their nominal performance state simultaneously." To add SW BOOST support for CPPC, we can use Highest Performance as the max performance in boost mode and Nominal Performance as the max performance in non-boost mode. If the Highest Performance is greater than the Nominal Performance, we assume SW BOOST is supported. The current CPPC driver does not support SW BOOST and use 'Highest Performance' as the max performance the CPU can achieve. 'Nominal Performance' is used to convert 'performance' to 'frequency'. That means, if firmware enable boost and provide a value for Highest Performance which is greater than Nominal Performance, boost feature is enabled by default. Because SW BOOST is disabled by default, so, after this patch, boost feature is disabled by default even if boost is enabled by firmware. Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Suggested-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Subject ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-01-27cpufreq: Avoid creating excessively large stack framesRafael J. Wysocki1-1/+1
In the process of modifying a cpufreq policy, the cpufreq core makes a copy of it including all of the internals which is stored on the CPU stack. Because struct cpufreq_policy is relatively large, this may cause the size of the stack frame to exceed the 2 KB limit and so the GCC complains when -Wframe-larger-than= is used. In fact, it is not necessary to copy the entire policy structure in order to modify it, however. First, because cpufreq_set_policy() obtains the min and max policy limits from frequency QoS now, it is not necessary to pass the limits to it from the callers. The only things that need to be passed to it from there are the new governor pointer or (if there is a built-in governor in the driver) the "policy" value representing the governor choice. They both can be passed as individual arguments, though, so make cpufreq_set_policy() take them this way and rework its callers accordingly. This avoids making copies of cpufreq policies in the callers of cpufreq_set_policy(). Second, cpufreq_set_policy() still needs to pass the new policy data to the ->verify() callback of the cpufreq driver whose task is to sanitize the min and max policy limits. It still does not need to make a full copy of struct cpufreq_policy for this purpose, but it needs to pass a few items from it to the driver in case they are needed (different drivers have different needs in that respect and all of them have to be covered). For this reason, introduce struct cpufreq_policy_data to hold copies of the members of struct cpufreq_policy used by the existing ->verify() driver callbacks and pass a pointer to a temporary structure of that type to ->verify() (instead of passing a pointer to full struct cpufreq_policy to it). While at it, notice that intel_pstate and longrun don't really need to verify the "policy" value in struct cpufreq_policy, so drop those check from them to avoid copying "policy" into struct cpufreq_policy_data (which allows it to be slightly smaller). Also while at it fix up white space in a couple of places and make cpufreq_set_policy() static (as it can be so). Fixes: 3000ce3c52f8 ("cpufreq: Use per-policy frequency QoS") Link: https://lore.kernel.org/linux-pm/CAMuHMdX6-jb1W8uC2_237m8ctCpsnGp=JCxqt8pCWVqNXHmkVg@mail.gmail.com Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: 5.4+ <stable@vger.kernel.org> # 5.4+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2019-12-29cpufreq: CPPC: put ACPI table after using itHanjun Guo1-0/+2
Put the ACPI table to release the table mapping after using it successfully. Signed-off-by: Hanjun Guo <guohanjun@huawei.com> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-12-29cpufreq : CPPC: Break out if HiSilicon CPPC workaround is matchedHanjun Guo1-2/+4
Bail out if we match the OEM information, to save some possible extra iteration. Also update the code to fix minor coding style issue. Signed-off-by: Hanjun Guo <guohanjun@huawei.com> [ rjw: Subject ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-06-05treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441Thomas Gleixner1-5/+1
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation version 2 of the license extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 315 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Armijn Hemel <armijn@tjaldur.nl> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190531190115.503150771@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18cpufreq / cppc: Work around for Hisilicon CPPC cpufreqXiongfeng Wang1-0/+65
Hisilicon chips do not support delivered performance counter register and reference performance counter register. But the platform can calculate the real performance using its own method. We reuse the desired performance register to store the real performance calculated by the platform. After the platform finished the frequency adjust, it gets the real performance and writes it into desired performance register. Os can use it to calculate the real frequency. Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> [ rjw: Drop unnecessary braces ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-03cpufreq / CPPC: Mark acpi_ids as usedNathan Chancellor1-1/+1
Clang warns: drivers/cpufreq/cppc_cpufreq.c:431:36: warning: variable 'cppc_acpi_ids' is not needed and will not be emitted [-Wunneeded-internal-declaration] static const struct acpi_device_id cppc_acpi_ids[] = { ^ 1 warning generated. Mark the definition as used so that Clang understands we don't want this warning while not inhibiting Clang's dead code elimination from removing the unreferenced internal symbol when moving the data it contains to the globally available symbol via MODULE_DEVICE_TABLE. $ nm -S drivers/cpufreq/cppc_cpufreq.o | grep acpi | tail -1 0000000000000000 0000000000000040 R __mod_acpi__cppc_acpi_ids_device_table Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-07-18cpufreq / CPPC: Add cpuinfo_cur_freq support for CPPCGeorge Cherian1-0/+52
Per Section 8.4.7.1.3 of ACPI 6.2, the platform provides performance feedback via set of performance counters. To determine the actual performance level delivered over time, OSPM may read a set of performance counters from the Reference Performance Counter Register and the Delivered Performance Counter Register. OSPM calculates the delivered performance over a given time period by taking a beginning and ending snapshot of both the reference and delivered performance counters, and calculating: delivered_perf = reference_perf X (delta of delivered_perf counter / delta of reference_perf counter). Implement the above and hook this up to the cpufreq->get method. Signed-off-by: George Cherian <george.cherian@cavium.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Prashanth Prakash <pprakash@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>