author:    Linus Torvalds <torvalds@linux-foundation.org>  2018-10-23 10:28:21 +0100
committer: Linus Torvalds <torvalds@linux-foundation.org>  2018-10-23 10:28:21 +0100
commit:    12dd08fa954fb7c327382ead3bb9ac861f9b9b69 (patch)
tree:      86467d314ad13181fc5b053ec7cd48ce4ee183fa /drivers
parent:    72f86d080560d77069dd67b067a69578731320e4 (diff)
parent:    cc19b05e3883efbbfd525747a86d0567b12c4d12 (diff)
Merge tag 'pm-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These make hibernation on 32-bit x86 systems work in all of the cases
in which it works on 64-bit x86 ones, update the menu cpuidle governor
and the "polling" state to make them more efficient, add more hardware
support to cpufreq drivers and fix issues with some of them, fix a bug
in the conservative cpufreq governor, fix the operating performance
points (OPP) framework and make it more stable, update the devfreq
subsystem to take changes in the APIs it uses into account, and clean
up some things all over.
Specifics:
- Backport hibernation bug fixes from x86-64 to x86-32 and
consolidate hibernation handling on x86 to allow 32-bit systems to
work in all of the cases in which 64-bit ones work (Zhimin Gu, Chen
Yu).
- Fix hibernation documentation (Vladimir D. Seleznev).
- Update the menu cpuidle governor to fix a couple of issues with it,
make it more efficient in some cases and clean it up (Rafael
Wysocki).
- Rework the cpuidle polling state implementation to make it more
efficient (Rafael Wysocki).
- Clean up the cpuidle core somewhat (Fieah Lim).
- Fix the cpufreq conservative governor to take policy limits into
account properly in some cases (Rafael Wysocki).
- Add support for retrieving guaranteed performance information to
the ACPI CPPC library and make the intel_pstate driver use it to
expose the CPU base frequency via sysfs on systems with the
hardware-managed P-states (HWP) feature enabled (Srinivas
Pandruvada); see the sketch after this list.
- Fix clang warning in the CPPC cpufreq driver (Nathan Chancellor).
- Get rid of device_node.name printing from cpufreq (Rob Herring).
- Remove unnecessary unlikely() from the cpufreq core (Igor Stoppa).
- Add support for the r8a7744 SoC to the cpufreq-dt driver (Biju
Das).
- Update the dt-platdev cpufreq driver to allow RK3399 to have
separate tunables per cluster (Dmitry Torokhov).
- Fix the dma_alloc_coherent() usage in the tegra186 cpufreq driver
(Christoph Hellwig).
- Make the imx6q cpufreq driver read OCOTP through nvmem for
imx6ul/imx6ull (Anson Huang).
- Fix several bugs in the operating performance points (OPP)
framework and make it more stable (Viresh Kumar, Dave Gerlach).
- Update the devfreq subsystem to take changes in the APIs it uses
into account, fix some issues with it, and make it stop printing
device_node.name directly (Bjorn Andersson, Enric Balletbo i Serra,
Matthias Kaehlcke, Rob Herring, Vincent Donnefort, zhong jiang).
- Prepare the generic power domains (genpd) framework for dealing
with domains containing CPUs (Ulf Hansson).
- Prevent sysfs attributes representing low-power S0 residency
counters from being exposed if low-power S0 support is not
indicated in ACPI FADT (Rajneesh Bhardwaj).
- Get rid of custom CPU feature macros for Intel CPUs from the
intel_idle and RAPL drivers (Andy Shevchenko).
- Update the tasks freezer to list tasks that refused to freeze and
caused a system transition to a sleep state to be aborted (Todd
Brandt).
- Update the pm-graph set of tools to v5.2 (Todd Brandt).
- Fix some issues in the cpupower utility (Anders Roxell, Prarit
Bhargava)"
* tag 'pm-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (73 commits)
PM / Domains: Document flags for genpd
PM / Domains: Deal with multiple states but no governor in genpd
PM / Domains: Don't treat zero found compatible idle states as an error
cpuidle: menu: Avoid computations when result will be discarded
cpuidle: menu: Drop redundant comparison
cpufreq: tegra186: don't pass GFP_DMA32 to dma_alloc_coherent()
cpufreq: conservative: Take limits changes into account properly
Documentation: intel_pstate: Add base_frequency information
cpufreq: intel_pstate: Add base_frequency attribute
ACPI / CPPC: Add support for guaranteed performance
cpuidle: menu: Simplify checks related to the polling state
PM / tools: sleepgraph and bootgraph: upgrade to v5.2
PM / tools: sleepgraph: first batch of v5.2 changes
cpupower: Fix coredump on VMWare
cpupower: Fix AMD Family 0x17 msr_pstate size
cpufreq: imx6q: read OCOTP through nvmem for imx6ul/imx6ull
cpufreq: dt-platdev: allow RK3399 to have separate tunables per cluster
cpuidle: poll_state: Revise loop termination condition
cpuidle: menu: Move the latency_req == 0 special case check
cpuidle: menu: Avoid computations for very close timers
...
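The "cpuidle: poll_state: Revise loop termination condition" entry above replaces the fixed TICK_NSEC/16 polling cap with a limit derived from the next idle state's target residency, and records whether the limit, rather than a wakeup, ended the poll (see the poll_state.c hunk below). A rough userspace simulation of that loop follows; need_resched() and the clock are stand-ins, and the 20 us residency is an assumed example value.

```c
/*
 * Userspace simulation of the revised poll_idle() loop: the poll is now
 * bounded by the target residency of the next idle state, and a flag
 * records that the time limit (not a wakeup) terminated it, so the menu
 * governor can correct its idle-duration prediction afterwards.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define NSEC_PER_USEC		1000ULL
#define POLL_IDLE_RELAX_COUNT	200

static uint64_t local_clock_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
}

/* Stand-in: pretend no task ever becomes runnable during the poll. */
static bool need_resched(void)
{
	return false;
}

/* Returns true if the time limit, rather than a wakeup, ended the poll. */
static bool poll_idle_sim(unsigned int next_state_residency_us)
{
	uint64_t time_start = local_clock_ns();
	uint64_t limit = (uint64_t)next_state_residency_us * NSEC_PER_USEC;
	unsigned int loop_count = 0;
	bool poll_time_limit = false;

	while (!need_resched()) {
		if (loop_count++ < POLL_IDLE_RELAX_COUNT)
			continue;	/* spin; the kernel uses cpu_relax() */

		loop_count = 0;
		/* Check the clock only once per POLL_IDLE_RELAX_COUNT spins. */
		if (local_clock_ns() - time_start > limit) {
			poll_time_limit = true;
			break;
		}
	}
	return poll_time_limit;
}

int main(void)
{
	/* Assumed example: the next idle state wants 20 us of residency. */
	printf("poll_time_limit: %d\n", poll_idle_sim(20));
	return 0;
}
```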
Diffstat (limited to 'drivers')
30 files changed, 637 insertions, 423 deletions
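One change worth pulling out of the drivers/devfreq/devfreq.c hunk below: update_devfreq() now computes the effective frequency range with the kernel's min()/max() helpers instead of the driver-local MAX()/MIN() macros, and the zero-means-unset guards are dropped because the limits are kept valid elsewhere in the series. A trimmed, standalone sketch of that clamping logic, with the struct reduced to the fields involved:

```c
/*
 * Standalone sketch of the frequency clamping in update_devfreq() after
 * this series: the effective limits combine the OPP-derived scaling range
 * with the user-set sysfs limits, and the rounding flag flips at the
 * boundaries exactly as in the kernel hunk. Simplified, not kernel code.
 */
#include <stdio.h>

#define DEVFREQ_FLAG_LEAST_UPPER_BOUND	0x1

static unsigned long min_ul(unsigned long a, unsigned long b) { return a < b ? a : b; }
static unsigned long max_ul(unsigned long a, unsigned long b) { return a > b ? a : b; }

struct devfreq_limits {
	unsigned long scaling_min_freq, scaling_max_freq;	/* from the OPP table */
	unsigned long min_freq, max_freq;			/* user sysfs limits */
};

static unsigned long clamp_target_freq(const struct devfreq_limits *df,
					unsigned long freq, unsigned int *flags)
{
	unsigned long max_freq = min_ul(df->scaling_max_freq, df->max_freq);
	unsigned long min_freq = max_ul(df->scaling_min_freq, df->min_freq);

	if (freq < min_freq) {
		freq = min_freq;
		*flags &= ~DEVFREQ_FLAG_LEAST_UPPER_BOUND;	/* Use GLB */
	}
	if (freq > max_freq) {
		freq = max_freq;
		*flags |= DEVFREQ_FLAG_LEAST_UPPER_BOUND;	/* Use LUB */
	}
	return freq;
}

int main(void)
{
	struct devfreq_limits df = {
		.scaling_min_freq = 200000, .scaling_max_freq = 800000,
		.min_freq = 0,		/* DEVFREQ_MIN_FREQ */
		.max_freq = ~0UL,	/* DEVFREQ_MAX_FREQ */
	};
	unsigned int flags = 0;

	/* A governor asking for 1 GHz gets clamped to the 800 MHz ceiling. */
	printf("%lu kHz, flags=%#x\n",
	       clamp_target_freq(&df, 1000000, &flags), flags);
	return 0;
}
```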
diff --git a/drivers/acpi/acpi_lpit.c b/drivers/acpi/acpi_lpit.c index cf4fc0161164..e43cb71b6972 100644 --- a/drivers/acpi/acpi_lpit.c +++ b/drivers/acpi/acpi_lpit.c @@ -117,11 +117,17 @@ static void lpit_update_residency(struct lpit_residency_info *info, if (!info->iomem_addr) return; + if (!(acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0)) + return; + /* Silently fail, if cpuidle attribute group is not present */ sysfs_add_file_to_group(&cpu_subsys.dev_root->kobj, &dev_attr_low_power_idle_system_residency_us.attr, "cpuidle"); } else if (info->gaddr.space_id == ACPI_ADR_SPACE_FIXED_HARDWARE) { + if (!(acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0)) + return; + /* Silently fail, if cpuidle attribute group is not present */ sysfs_add_file_to_group(&cpu_subsys.dev_root->kobj, &dev_attr_low_power_idle_cpu_residency_us.attr, diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c index d9ce4b162e2c..217a782c3e55 100644 --- a/drivers/acpi/cppc_acpi.c +++ b/drivers/acpi/cppc_acpi.c @@ -1061,9 +1061,9 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps) { struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpunum); struct cpc_register_resource *highest_reg, *lowest_reg, - *lowest_non_linear_reg, *nominal_reg, + *lowest_non_linear_reg, *nominal_reg, *guaranteed_reg, *low_freq_reg = NULL, *nom_freq_reg = NULL; - u64 high, low, nom, min_nonlinear, low_f = 0, nom_f = 0; + u64 high, low, guaranteed, nom, min_nonlinear, low_f = 0, nom_f = 0; int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpunum); struct cppc_pcc_data *pcc_ss_data = NULL; int ret = 0, regs_in_pcc = 0; @@ -1079,6 +1079,7 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps) nominal_reg = &cpc_desc->cpc_regs[NOMINAL_PERF]; low_freq_reg = &cpc_desc->cpc_regs[LOWEST_FREQ]; nom_freq_reg = &cpc_desc->cpc_regs[NOMINAL_FREQ]; + guaranteed_reg = &cpc_desc->cpc_regs[GUARANTEED_PERF]; /* Are any of the regs PCC ?*/ if (CPC_IN_PCC(highest_reg) || CPC_IN_PCC(lowest_reg) || @@ -1107,6 +1108,9 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps) cpc_read(cpunum, nominal_reg, &nom); perf_caps->nominal_perf = nom; + cpc_read(cpunum, guaranteed_reg, &guaranteed); + perf_caps->guaranteed_perf = guaranteed; + cpc_read(cpunum, lowest_non_linear_reg, &min_nonlinear); perf_caps->lowest_nonlinear_perf = min_nonlinear; diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c index 4b5714199490..7f38a92b444a 100644 --- a/drivers/base/power/domain.c +++ b/drivers/base/power/domain.c @@ -467,6 +467,10 @@ static int genpd_power_off(struct generic_pm_domain *genpd, bool one_dev_on, return -EAGAIN; } + /* Default to shallowest state. */ + if (!genpd->gov) + genpd->state_idx = 0; + if (genpd->power_off) { int ret; @@ -1687,6 +1691,8 @@ int pm_genpd_init(struct generic_pm_domain *genpd, ret = genpd_set_default_power_state(genpd); if (ret) return ret; + } else if (!gov) { + pr_warn("%s : no governor for states\n", genpd->name); } device_initialize(&genpd->dev); @@ -2478,8 +2484,8 @@ static int genpd_iterate_idle_states(struct device_node *dn, * * Returns the device states parsed from the OF node. The memory for the states * is allocated by this function and is the responsibility of the caller to - * free the memory after use. If no domain idle states is found it returns - * -EINVAL and in case of errors, a negative error code. + * free the memory after use. If any or zero compatible domain idle states is + * found it returns 0 and in case of errors, a negative error code is returned. 
*/ int of_genpd_parse_idle_states(struct device_node *dn, struct genpd_power_state **states, int *n) @@ -2488,8 +2494,14 @@ int of_genpd_parse_idle_states(struct device_node *dn, int ret; ret = genpd_iterate_idle_states(dn, NULL); - if (ret <= 0) - return ret < 0 ? ret : -EINVAL; + if (ret < 0) + return ret; + + if (!ret) { + *states = NULL; + *n = 0; + return 0; + } st = kcalloc(ret, sizeof(*st), GFP_KERNEL); if (!st) diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c index 30f302149730..fd25c21cee72 100644 --- a/drivers/cpufreq/cppc_cpufreq.c +++ b/drivers/cpufreq/cppc_cpufreq.c @@ -428,7 +428,7 @@ MODULE_LICENSE("GPL"); late_initcall(cppc_cpufreq_init); -static const struct acpi_device_id cppc_acpi_ids[] = { +static const struct acpi_device_id cppc_acpi_ids[] __used = { {ACPI_PROCESSOR_DEVICE_HID, }, {} }; diff --git a/drivers/cpufreq/cpufreq-dt-platdev.c b/drivers/cpufreq/cpufreq-dt-platdev.c index fe14c57de6ca..b1c5468dca16 100644 --- a/drivers/cpufreq/cpufreq-dt-platdev.c +++ b/drivers/cpufreq/cpufreq-dt-platdev.c @@ -58,6 +58,7 @@ static const struct of_device_id whitelist[] __initconst = { { .compatible = "renesas,r8a73a4", }, { .compatible = "renesas,r8a7740", }, { .compatible = "renesas,r8a7743", }, + { .compatible = "renesas,r8a7744", }, { .compatible = "renesas,r8a7745", }, { .compatible = "renesas,r8a7778", }, { .compatible = "renesas,r8a7779", }, @@ -78,7 +79,10 @@ static const struct of_device_id whitelist[] __initconst = { { .compatible = "rockchip,rk3328", }, { .compatible = "rockchip,rk3366", }, { .compatible = "rockchip,rk3368", }, - { .compatible = "rockchip,rk3399", }, + { .compatible = "rockchip,rk3399", + .data = &(struct cpufreq_dt_platform_data) + { .have_governor_per_policy = true, }, + }, { .compatible = "st-ericsson,u8500", }, { .compatible = "st-ericsson,u8540", }, diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c index 0a9ebf00be46..e58bfcb1169e 100644 --- a/drivers/cpufreq/cpufreq-dt.c +++ b/drivers/cpufreq/cpufreq-dt.c @@ -32,6 +32,7 @@ struct private_data { struct device *cpu_dev; struct thermal_cooling_device *cdev; const char *reg_name; + bool have_static_opps; }; static struct freq_attr *cpufreq_dt_attr[] = { @@ -204,6 +205,15 @@ static int cpufreq_init(struct cpufreq_policy *policy) } } + priv = kzalloc(sizeof(*priv), GFP_KERNEL); + if (!priv) { + ret = -ENOMEM; + goto out_put_regulator; + } + + priv->reg_name = name; + priv->opp_table = opp_table; + /* * Initialize OPP tables for all policy->cpus. They will be shared by * all CPUs which have marked their CPUs shared with OPP bindings. 
@@ -214,7 +224,8 @@ static int cpufreq_init(struct cpufreq_policy *policy) * * OPPs might be populated at runtime, don't check for error here */ - dev_pm_opp_of_cpumask_add_table(policy->cpus); + if (!dev_pm_opp_of_cpumask_add_table(policy->cpus)) + priv->have_static_opps = true; /* * But we need OPP table to function so if it is not there let's @@ -240,19 +251,10 @@ static int cpufreq_init(struct cpufreq_policy *policy) __func__, ret); } - priv = kzalloc(sizeof(*priv), GFP_KERNEL); - if (!priv) { - ret = -ENOMEM; - goto out_free_opp; - } - - priv->reg_name = name; - priv->opp_table = opp_table; - ret = dev_pm_opp_init_cpufreq_table(cpu_dev, &freq_table); if (ret) { dev_err(cpu_dev, "failed to init cpufreq table: %d\n", ret); - goto out_free_priv; + goto out_free_opp; } priv->cpu_dev = cpu_dev; @@ -282,10 +284,11 @@ static int cpufreq_init(struct cpufreq_policy *policy) out_free_cpufreq_table: dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table); -out_free_priv: - kfree(priv); out_free_opp: - dev_pm_opp_of_cpumask_remove_table(policy->cpus); + if (priv->have_static_opps) + dev_pm_opp_of_cpumask_remove_table(policy->cpus); + kfree(priv); +out_put_regulator: if (name) dev_pm_opp_put_regulators(opp_table); out_put_clk: @@ -300,7 +303,8 @@ static int cpufreq_exit(struct cpufreq_policy *policy) cpufreq_cooling_unregister(priv->cdev); dev_pm_opp_free_cpufreq_table(priv->cpu_dev, &policy->freq_table); - dev_pm_opp_of_cpumask_remove_table(policy->related_cpus); + if (priv->have_static_opps) + dev_pm_opp_of_cpumask_remove_table(policy->related_cpus); if (priv->reg_name) dev_pm_opp_put_regulators(priv->opp_table); diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index f53fb41efb7b..7aa3dcad2175 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -403,7 +403,7 @@ EXPORT_SYMBOL_GPL(cpufreq_freq_transition_begin); void cpufreq_freq_transition_end(struct cpufreq_policy *policy, struct cpufreq_freqs *freqs, int transition_failed) { - if (unlikely(WARN_ON(!policy->transition_ongoing))) + if (WARN_ON(!policy->transition_ongoing)) return; cpufreq_notify_post_transition(policy, freqs, transition_failed); diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c index f20f20a77d4d..4268f87e99fc 100644 --- a/drivers/cpufreq/cpufreq_conservative.c +++ b/drivers/cpufreq/cpufreq_conservative.c @@ -80,8 +80,10 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy) * changed in the meantime, so fall back to current frequency in that * case. 
*/ - if (requested_freq > policy->max || requested_freq < policy->min) + if (requested_freq > policy->max || requested_freq < policy->min) { requested_freq = policy->cur; + dbs_info->requested_freq = requested_freq; + } freq_step = get_freq_step(cs_tuners, policy); @@ -92,7 +94,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy) if (policy_dbs->idle_periods < UINT_MAX) { unsigned int freq_steps = policy_dbs->idle_periods * freq_step; - if (requested_freq > freq_steps) + if (requested_freq > policy->min + freq_steps) requested_freq -= freq_steps; else requested_freq = policy->min; diff --git a/drivers/cpufreq/imx6q-cpufreq.c b/drivers/cpufreq/imx6q-cpufreq.c index b2ff423ad7f8..8cfee0ab804b 100644 --- a/drivers/cpufreq/imx6q-cpufreq.c +++ b/drivers/cpufreq/imx6q-cpufreq.c @@ -12,6 +12,7 @@ #include <linux/cpu_cooling.h> #include <linux/err.h> #include <linux/module.h> +#include <linux/nvmem-consumer.h> #include <linux/of.h> #include <linux/of_address.h> #include <linux/pm_opp.h> @@ -290,20 +291,32 @@ put_node: #define OCOTP_CFG3_6ULL_SPEED_792MHZ 0x2 #define OCOTP_CFG3_6ULL_SPEED_900MHZ 0x3 -static void imx6ul_opp_check_speed_grading(struct device *dev) +static int imx6ul_opp_check_speed_grading(struct device *dev) { - struct device_node *np; - void __iomem *base; u32 val; + int ret = 0; - np = of_find_compatible_node(NULL, NULL, "fsl,imx6ul-ocotp"); - if (!np) - return; + if (of_find_property(dev->of_node, "nvmem-cells", NULL)) { + ret = nvmem_cell_read_u32(dev, "speed_grade", &val); + if (ret) + return ret; + } else { + struct device_node *np; + void __iomem *base; + + np = of_find_compatible_node(NULL, NULL, "fsl,imx6ul-ocotp"); + if (!np) + return -ENOENT; + + base = of_iomap(np, 0); + of_node_put(np); + if (!base) { + dev_err(dev, "failed to map ocotp\n"); + return -EFAULT; + } - base = of_iomap(np, 0); - if (!base) { - dev_err(dev, "failed to map ocotp\n"); - goto put_node; + val = readl_relaxed(base + OCOTP_CFG3); + iounmap(base); } /* @@ -314,7 +327,6 @@ static void imx6ul_opp_check_speed_grading(struct device *dev) * 2b'11: 900000000Hz on i.MX6ULL only; * We need to set the max speed of ARM according to fuse map. 
*/ - val = readl_relaxed(base + OCOTP_CFG3); val >>= OCOTP_CFG3_SPEED_SHIFT; val &= 0x3; @@ -334,9 +346,7 @@ static void imx6ul_opp_check_speed_grading(struct device *dev) dev_warn(dev, "failed to disable 900MHz OPP\n"); } - iounmap(base); -put_node: - of_node_put(np); + return ret; } static int imx6q_cpufreq_probe(struct platform_device *pdev) @@ -394,10 +404,18 @@ static int imx6q_cpufreq_probe(struct platform_device *pdev) } if (of_machine_is_compatible("fsl,imx6ul") || - of_machine_is_compatible("fsl,imx6ull")) - imx6ul_opp_check_speed_grading(cpu_dev); - else + of_machine_is_compatible("fsl,imx6ull")) { + ret = imx6ul_opp_check_speed_grading(cpu_dev); + if (ret == -EPROBE_DEFER) + return ret; + if (ret) { + dev_err(cpu_dev, "failed to read ocotp: %d\n", + ret); + return ret; + } + } else { imx6q_opp_check_speed_grading(cpu_dev); + } /* Because we have added the OPPs here, we must free them */ free_opp = true; diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c index b6a1aadaff9f..2a99e2fd9412 100644 --- a/drivers/cpufreq/intel_pstate.c +++ b/drivers/cpufreq/intel_pstate.c @@ -373,10 +373,28 @@ static void intel_pstate_set_itmt_prio(int cpu) } } } + +static int intel_pstate_get_cppc_guranteed(int cpu) +{ + struct cppc_perf_caps cppc_perf; + int ret; + + ret = cppc_get_perf_caps(cpu, &cppc_perf); + if (ret) + return ret; + + return cppc_perf.guaranteed_perf; +} + #else static void intel_pstate_set_itmt_prio(int cpu) { } + +static int intel_pstate_get_cppc_guranteed(int cpu) +{ + return -ENOTSUPP; +} #endif static void intel_pstate_init_acpi_perf_limits(struct cpufreq_policy *policy) @@ -699,9 +717,29 @@ static ssize_t show_energy_performance_preference( cpufreq_freq_attr_rw(energy_performance_preference); +static ssize_t show_base_frequency(struct cpufreq_policy *policy, char *buf) +{ + struct cpudata *cpu; + u64 cap; + int ratio; + + ratio = intel_pstate_get_cppc_guranteed(policy->cpu); + if (ratio <= 0) { + rdmsrl_on_cpu(policy->cpu, MSR_HWP_CAPABILITIES, &cap); + ratio = HWP_GUARANTEED_PERF(cap); + } + + cpu = all_cpu_data[policy->cpu]; + + return sprintf(buf, "%d\n", ratio * cpu->pstate.scaling); +} + +cpufreq_freq_attr_ro(base_frequency); + static struct freq_attr *hwp_cpufreq_attrs[] = { &energy_performance_preference, &energy_performance_available_preferences, + &base_frequency, NULL, }; diff --git a/drivers/cpufreq/mvebu-cpufreq.c b/drivers/cpufreq/mvebu-cpufreq.c index 31513bd42705..6d33a639f902 100644 --- a/drivers/cpufreq/mvebu-cpufreq.c +++ b/drivers/cpufreq/mvebu-cpufreq.c @@ -84,9 +84,10 @@ static int __init armada_xp_pmsu_cpufreq_init(void) ret = dev_pm_opp_add(cpu_dev, clk_get_rate(clk) / 2, 0); if (ret) { + dev_pm_opp_remove(cpu_dev, clk_get_rate(clk)); clk_put(clk); dev_err(cpu_dev, "Failed to register OPPs\n"); - goto opp_register_failed; + return ret; } ret = dev_pm_opp_set_sharing_cpus(cpu_dev, @@ -99,11 +100,5 @@ static int __init armada_xp_pmsu_cpufreq_init(void) platform_device_register_simple("cpufreq-dt", -1, NULL, 0); return 0; - -opp_register_failed: - /* As registering has failed remove all the opp for all cpus */ - dev_pm_opp_cpumask_remove_table(cpu_possible_mask); - - return ret; } device_initcall(armada_xp_pmsu_cpufreq_init); diff --git a/drivers/cpufreq/s5pv210-cpufreq.c b/drivers/cpufreq/s5pv210-cpufreq.c index 5d31c2db12a3..dbecd7667db2 100644 --- a/drivers/cpufreq/s5pv210-cpufreq.c +++ b/drivers/cpufreq/s5pv210-cpufreq.c @@ -611,8 +611,8 @@ static int s5pv210_cpufreq_probe(struct platform_device *pdev) 
for_each_compatible_node(np, NULL, "samsung,s5pv210-dmc") { id = of_alias_get_id(np, "dmc"); if (id < 0 || id >= ARRAY_SIZE(dmc_base)) { - pr_err("%s: failed to get alias of dmc node '%s'\n", - __func__, np->name); + pr_err("%s: failed to get alias of dmc node '%pOFn'\n", + __func__, np); of_node_put(np); return id; } diff --git a/drivers/cpufreq/tegra186-cpufreq.c b/drivers/cpufreq/tegra186-cpufreq.c index 1f59966573aa..f1e09022b819 100644 --- a/drivers/cpufreq/tegra186-cpufreq.c +++ b/drivers/cpufreq/tegra186-cpufreq.c @@ -121,7 +121,7 @@ static struct cpufreq_frequency_table *init_vhint_table( void *virt; virt = dma_alloc_coherent(bpmp->dev, sizeof(*data), &phys, - GFP_KERNEL | GFP_DMA32); + GFP_KERNEL); if (!virt) return ERR_PTR(-ENOMEM); diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index 6df894d65d9e..4a97446f66d8 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -247,17 +247,17 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv, if (!cpuidle_state_is_coupled(drv, index)) local_irq_enable(); - diff = ktime_us_delta(time_end, time_start); - if (diff > INT_MAX) - diff = INT_MAX; - - dev->last_residency = (int) diff; - if (entered_state >= 0) { - /* Update cpuidle counters */ - /* This can be moved to within driver enter routine + /* + * Update cpuidle counters + * This can be moved to within driver enter routine, * but that results in multiple copies of same code. */ + diff = ktime_us_delta(time_end, time_start); + if (diff > INT_MAX) + diff = INT_MAX; + + dev->last_residency = (int)diff; dev->states_usage[entered_state].time += dev->last_residency; dev->states_usage[entered_state].usage++; } else { diff --git a/drivers/cpuidle/governors/ladder.c b/drivers/cpuidle/governors/ladder.c index 704880a6612a..f0dddc66af26 100644 --- a/drivers/cpuidle/governors/ladder.c +++ b/drivers/cpuidle/governors/ladder.c @@ -80,7 +80,7 @@ static int ladder_select_state(struct cpuidle_driver *drv, last_state = &ldev->states[last_idx]; - last_residency = cpuidle_get_last_residency(dev) - drv->states[last_idx].exit_latency; + last_residency = dev->last_residency - drv->states[last_idx].exit_latency; /* consider promotion */ if (last_idx < drv->state_count - 1 && diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c index e26a40971b26..575a68f31761 100644 --- a/drivers/cpuidle/governors/menu.c +++ b/drivers/cpuidle/governors/menu.c @@ -124,7 +124,6 @@ struct menu_device { int tick_wakeup; unsigned int next_timer_us; - unsigned int predicted_us; unsigned int bucket; unsigned int correction_factor[BUCKETS]; unsigned int intervals[INTERVALS]; @@ -197,10 +196,11 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev); * of points is below a threshold. If it is... then use the * average of these 8 points as the estimated value. 
*/ -static unsigned int get_typical_interval(struct menu_device *data) +static unsigned int get_typical_interval(struct menu_device *data, + unsigned int predicted_us) { int i, divisor; - unsigned int max, thresh, avg; + unsigned int min, max, thresh, avg; uint64_t sum, variance; thresh = UINT_MAX; /* Discard outliers above this value */ @@ -208,6 +208,7 @@ static unsigned int get_typical_interval(struct menu_device *data) again: /* First calculate the average of past intervals */ + min = UINT_MAX; max = 0; sum = 0; divisor = 0; @@ -218,8 +219,19 @@ again: divisor++; if (value > max) max = value; + + if (value < min) + min = value; } } + + /* + * If the result of the computation is going to be discarded anyway, + * avoid the computation altogether. + */ + if (min >= predicted_us) + return UINT_MAX; + if (divisor == INTERVALS) avg = sum >> INTERVAL_SHIFT; else @@ -286,10 +298,9 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, struct menu_device *data = this_cpu_ptr(&menu_devices); int latency_req = cpuidle_governor_latency_req(dev->cpu); int i; - int first_idx; int idx; unsigned int interactivity_req; - unsigned int expected_interval; + unsigned int predicted_us; unsigned long nr_iowaiters, cpu_load; ktime_t delta_next; @@ -298,50 +309,36 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, data->needs_update = 0; } - /* Special case when user has set very strict latency requirement */ - if (unlikely(latency_req == 0)) { - *stop_tick = false; - return 0; - } - /* determine the expected residency time, round up */ data->next_timer_us = ktime_to_us(tick_nohz_get_sleep_length(&delta_next)); get_iowait_load(&nr_iowaiters, &cpu_load); data->bucket = which_bucket(data->next_timer_us, nr_iowaiters); + if (unlikely(drv->state_count <= 1 || latency_req == 0) || + ((data->next_timer_us < drv->states[1].target_residency || + latency_req < drv->states[1].exit_latency) && + !drv->states[0].disabled && !dev->states_usage[0].disable)) { + /* + * In this case state[0] will be used no matter what, so return + * it right away and keep the tick running. + */ + *stop_tick = false; + return 0; + } + /* * Force the result of multiplication to be 64 bits even if both * operands are 32 bits. * Make sure to round up for half microseconds. */ - data->predicted_us = DIV_ROUND_CLOSEST_ULL((uint64_t)data->next_timer_us * + predicted_us = DIV_ROUND_CLOSEST_ULL((uint64_t)data->next_timer_us * data->correction_factor[data->bucket], RESOLUTION * DECAY); - - expected_interval = get_typical_interval(data); - expected_interval = min(expected_interval, data->next_timer_us); - - first_idx = 0; - if (drv->states[0].flags & CPUIDLE_FLAG_POLLING) { - struct cpuidle_state *s = &drv->states[1]; - unsigned int polling_threshold; - - /* - * Default to a physical idle state, not to busy polling, unless - * a timer is going to trigger really really soon. - */ - polling_threshold = max_t(unsigned int, 20, s->target_residency); - if (data->next_timer_us > polling_threshold && - latency_req > s->exit_latency && !s->disabled && - !dev->states_usage[1].disable) - first_idx = 1; - } - /* * Use the lowest expected idle interval to pick the idle state. 
*/ - data->predicted_us = min(data->predicted_us, expected_interval); + predicted_us = min(predicted_us, get_typical_interval(data, predicted_us)); if (tick_nohz_tick_stopped()) { /* @@ -352,34 +349,46 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, * the known time till the closest timer event for the idle * state selection. */ - if (data->predicted_us < TICK_USEC) - data->predicted_us = ktime_to_us(delta_next); + if (predicted_us < TICK_USEC) + predicted_us = ktime_to_us(delta_next); } else { /* * Use the performance multiplier and the user-configurable * latency_req to determine the maximum exit latency. */ - interactivity_req = data->predicted_us / performance_multiplier(nr_iowaiters, cpu_load); + interactivity_req = predicted_us / performance_multiplier(nr_iowaiters, cpu_load); if (latency_req > interactivity_req) latency_req = interactivity_req; } - expected_interval = data->predicted_us; /* * Find the idle state with the lowest power while satisfying * our constraints. */ idx = -1; - for (i = first_idx; i < drv->state_count; i++) { + for (i = 0; i < drv->state_count; i++) { struct cpuidle_state *s = &drv->states[i]; struct cpuidle_state_usage *su = &dev->states_usage[i]; if (s->disabled || su->disable) continue; + if (idx == -1) idx = i; /* first enabled state */ - if (s->target_residency > data->predicted_us) { - if (data->predicted_us < TICK_USEC) + + if (s->target_residency > predicted_us) { + /* + * Use a physical idle state, not busy polling, unless + * a timer is going to trigger soon enough. + */ + if ((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) && + s->exit_latency <= latency_req && + s->target_residency <= data->next_timer_us) { + predicted_us = s->target_residency; + idx = i; + break; + } + if (predicted_us < TICK_USEC) break; if (!tick_nohz_tick_stopped()) { @@ -389,7 +398,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, * tick in that case and let the governor run * again in the next iteration of the loop. */ - expected_interval = drv->states[idx].target_residency; + predicted_us = drv->states[idx].target_residency; break; } @@ -403,7 +412,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, s->target_residency <= ktime_to_us(delta_next)) idx = i; - goto out; + return idx; } if (s->exit_latency > latency_req) { /* @@ -412,7 +421,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, * expected idle duration so that the tick is retained * as long as that target residency is low enough. */ - expected_interval = drv->states[idx].target_residency; + predicted_us = drv->states[idx].target_residency; break; } idx = i; @@ -426,7 +435,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, * expected idle duration is shorter than the tick period length. */ if (((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) || - expected_interval < TICK_USEC) && !tick_nohz_tick_stopped()) { + predicted_us < TICK_USEC) && !tick_nohz_tick_stopped()) { unsigned int delta_next_us = ktime_to_us(delta_next); *stop_tick = false; @@ -450,10 +459,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, } } -out: - data->last_state_idx = idx; - - return data->last_state_idx; + return idx; } /** @@ -512,9 +518,19 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev) * duration predictor do a better job next time. 
*/ measured_us = 9 * MAX_INTERESTING / 10; + } else if ((drv->states[last_idx].flags & CPUIDLE_FLAG_POLLING) && + dev->poll_time_limit) { + /* + * The CPU exited the "polling" state due to a time limit, so + * the idle duration prediction leading to the selection of that + * state was inaccurate. If a better prediction had been made, + * the CPU might have been woken up from idle by the next timer. + * Assume that to be the case. + */ + measured_us = data->next_timer_us; } else { /* measured value */ - measured_us = cpuidle_get_last_residency(dev); + measured_us = dev->last_residency; /* Deduct exit latency */ if (measured_us > 2 * target->exit_latency) diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c index 3f86d23c592e..85792d371add 100644 --- a/drivers/cpuidle/poll_state.c +++ b/drivers/cpuidle/poll_state.c @@ -9,7 +9,6 @@ #include <linux/sched/clock.h> #include <linux/sched/idle.h> -#define POLL_IDLE_TIME_LIMIT (TICK_NSEC / 16) #define POLL_IDLE_RELAX_COUNT 200 static int __cpuidle poll_idle(struct cpuidle_device *dev, @@ -17,8 +16,11 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev, { u64 time_start = local_clock(); + dev->poll_time_limit = false; + local_irq_enable(); if (!current_set_polling_and_test()) { + u64 limit = (u64)drv->states[1].target_residency * NSEC_PER_USEC; unsigned int loop_count = 0; while (!need_resched()) { @@ -27,8 +29,10 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev, continue; loop_count = 0; - if (local_clock() - time_start > POLL_IDLE_TIME_LIMIT) + if (local_clock() - time_start > limit) { + dev->poll_time_limit = true; break; + } } } current_clr_polling(); diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c index 4c49bb1330b5..141413067b5c 100644 --- a/drivers/devfreq/devfreq.c +++ b/drivers/devfreq/devfreq.c @@ -11,6 +11,7 @@ */ #include <linux/kernel.h> +#include <linux/kmod.h> #include <linux/sched.h> #include <linux/errno.h> #include <linux/err.h> @@ -28,9 +29,6 @@ #include <linux/of.h> #include "governor.h" -#define MAX(a,b) ((a > b) ? a : b) -#define MIN(a,b) ((a < b) ? a : b) - static struct class *devfreq_class; /* @@ -221,6 +219,49 @@ static struct devfreq_governor *find_devfreq_governor(const char *name) return ERR_PTR(-ENODEV); } +/** + * try_then_request_governor() - Try to find the governor and request the + * module if is not found. + * @name: name of the governor + * + * Search the list of devfreq governors and request the module and try again + * if is not found. This can happen when both drivers (the governor driver + * and the driver that call devfreq_add_device) are built as modules. + * devfreq_list_lock should be held by the caller. Returns the matched + * governor's pointer. 
+ */ +static struct devfreq_governor *try_then_request_governor(const char *name) +{ + struct devfreq_governor *governor; + int err = 0; + + if (IS_ERR_OR_NULL(name)) { + pr_err("DEVFREQ: %s: Invalid parameters\n", __func__); + return ERR_PTR(-EINVAL); + } + WARN(!mutex_is_locked(&devfreq_list_lock), + "devfreq_list_lock must be locked."); + + governor = find_devfreq_governor(name); + if (IS_ERR(governor)) { + mutex_unlock(&devfreq_list_lock); + + if (!strncmp(name, DEVFREQ_GOV_SIMPLE_ONDEMAND, + DEVFREQ_NAME_LEN)) + err = request_module("governor_%s", "simpleondemand"); + else + err = request_module("governor_%s", name); + /* Restore previous state before return */ + mutex_lock(&devfreq_list_lock); + if (err) + return NULL; + + governor = find_devfreq_governor(name); + } + + return governor; +} + static int devfreq_notify_transition(struct devfreq *devfreq, struct devfreq_freqs *freqs, unsigned int state) { @@ -280,14 +321,14 @@ int update_devfreq(struct devfreq *devfreq) * max_freq * min_freq */ - max_freq = MIN(devfreq->scaling_max_freq, devfreq->max_freq); - min_freq = MAX(devfreq->scaling_min_freq, devfreq->min_freq); + max_freq = min(devfreq->scaling_max_freq, devfreq->max_freq); + min_freq = max(devfreq->scaling_min_freq, devfreq->min_freq); - if (min_freq && freq < min_freq) { + if (freq < min_freq) { freq = min_freq; flags &= ~DEVFREQ_FLAG_LEAST_UPPER_BOUND; /* Use GLB */ } - if (max_freq && freq > max_freq) { + if (freq > max_freq) { freq = max_freq; flags |= DEVFREQ_FLAG_LEAST_UPPER_BOUND; /* Use LUB */ } @@ -534,10 +575,6 @@ static void devfreq_dev_release(struct device *dev) list_del(&devfreq->node); mutex_unlock(&devfreq_list_lock); - if (devfreq->governor) - devfreq->governor->event_handler(devfreq, - DEVFREQ_GOV_STOP, NULL); - if (devfreq->profile->exit) devfreq->profile->exit(devfreq->dev.parent); @@ -646,9 +683,8 @@ struct devfreq *devfreq_add_device(struct device *dev, mutex_unlock(&devfreq->lock); mutex_lock(&devfreq_list_lock); - list_add(&devfreq->node, &devfreq_list); - governor = find_devfreq_governor(devfreq->governor_name); + governor = try_then_request_governor(devfreq->governor_name); if (IS_ERR(governor)) { dev_err(dev, "%s: Unable to find governor for the device\n", __func__); @@ -664,19 +700,20 @@ struct devfreq *devfreq_add_device(struct device *dev, __func__); goto err_init; } + + list_add(&devfreq->node, &devfreq_list); + mutex_unlock(&devfreq_list_lock); return devfreq; err_init: - list_del(&devfreq->node); mutex_unlock(&devfreq_list_lock); - device_unregister(&devfreq->dev); + devfreq_remove_device(devfreq); devfreq = NULL; err_dev: - if (devfreq) - kfree(devfreq); + kfree(devfreq); err_out: return ERR_PTR(err); } @@ -693,6 +730,9 @@ int devfreq_remove_device(struct devfreq *devfreq) if (!devfreq) return -EINVAL; + if (devfreq->governor) + devfreq->governor->event_handler(devfreq, + DEVFREQ_GOV_STOP, NULL); device_unregister(&devfreq->dev); return 0; @@ -991,7 +1031,7 @@ static ssize_t governor_store(struct device *dev, struct device_attribute *attr, return -EINVAL; mutex_lock(&devfreq_list_lock); - governor = find_devfreq_governor(str_governor); + governor = try_then_request_governor(str_governor); if (IS_ERR(governor)) { ret = PTR_ERR(governor); goto out; @@ -1126,17 +1166,26 @@ static ssize_t min_freq_store(struct device *dev, struct device_attribute *attr, struct devfreq *df = to_devfreq(dev); unsigned long value; int ret; - unsigned long max; ret = sscanf(buf, "%lu", &value); if (ret != 1) return -EINVAL; mutex_lock(&df->lock); - max = df->max_freq; 
- if (value && max && value > max) { - ret = -EINVAL; - goto unlock; + + if (value) { + if (value > df->max_freq) { + ret = -EINVAL; + goto unlock; + } + } else { + unsigned long *freq_table = df->profile->freq_table; + + /* Get minimum frequency according to sorting order */ + if (freq_table[0] < freq_table[df->profile->max_state - 1]) + value = freq_table[0]; + else + value = freq_table[df->profile->max_state - 1]; } df->min_freq = value; @@ -1152,7 +1201,7 @@ static ssize_t min_freq_show(struct device *dev, struct device_attribute *attr, { struct devfreq *df = to_devfreq(dev); - return sprintf(buf, "%lu\n", MAX(df->scaling_min_freq, df->min_freq)); + return sprintf(buf, "%lu\n", max(df->scaling_min_freq, df->min_freq)); } static ssize_t max_freq_store(struct device *dev, struct device_attribute *attr, @@ -1161,17 +1210,26 @@ static ssize_t max_freq_store(struct device *dev, struct device_attribute *attr, struct devfreq *df = to_devfreq(dev); unsigned long value; int ret; - unsigned long min; ret = sscanf(buf, "%lu", &value); if (ret != 1) return -EINVAL; mutex_lock(&df->lock); - min = df->min_freq; - if (value && min && value < min) { - ret = -EINVAL; - goto unlock; + + if (value) { + if (value < df->min_freq) { + ret = -EINVAL; + goto unlock; + } + } else { + unsigned long *freq_table = df->profile->freq_table; + + /* Get maximum frequency according to sorting order */ + if (freq_table[0] < freq_table[df->profile->max_state - 1]) + value = freq_table[df->profile->max_state - 1]; + else + value = freq_table[0]; } df->max_freq = value; @@ -1188,7 +1246,7 @@ static ssize_t max_freq_show(struct device *dev, struct device_attribute *attr, { struct devfreq *df = to_devfreq(dev); - return sprintf(buf, "%lu\n", MIN(df->scaling_max_freq, df->max_freq)); + return sprintf(buf, "%lu\n", min(df->scaling_max_freq, df->max_freq)); } static DEVICE_ATTR_RW(max_freq); diff --git a/drivers/devfreq/event/exynos-ppmu.c b/drivers/devfreq/event/exynos-ppmu.c index a9c64f0d3284..c61de0bdf053 100644 --- a/drivers/devfreq/event/exynos-ppmu.c +++ b/drivers/devfreq/event/exynos-ppmu.c @@ -535,8 +535,8 @@ static int of_get_devfreq_events(struct device_node *np, if (i == ARRAY_SIZE(ppmu_events)) { dev_warn(dev, - "don't know how to configure events : %s\n", - node->name); + "don't know how to configure events : %pOFn\n", + node); continue; } diff --git a/drivers/devfreq/governor.h b/drivers/devfreq/governor.h index cfc50a61a90d..f53339ca610f 100644 --- a/drivers/devfreq/governor.h +++ b/drivers/devfreq/governor.h @@ -25,6 +25,9 @@ #define DEVFREQ_GOV_SUSPEND 0x4 #define DEVFREQ_GOV_RESUME 0x5 +#define DEVFREQ_MIN_FREQ 0 +#define DEVFREQ_MAX_FREQ ULONG_MAX + /** * struct devfreq_governor - Devfreq policy governor * @node: list node - contains registered devfreq governors @@ -54,9 +57,6 @@ struct devfreq_governor { unsigned int event, void *data); }; -/* Caution: devfreq->lock must be locked before calling update_devfreq */ -extern int update_devfreq(struct devfreq *devfreq); - extern void devfreq_monitor_start(struct devfreq *devfreq); extern void devfreq_monitor_stop(struct devfreq *devfreq); extern void devfreq_monitor_suspend(struct devfreq *devfreq); diff --git a/drivers/devfreq/governor_performance.c b/drivers/devfreq/governor_performance.c index 4d23ecfbd948..ded429fd51be 100644 --- a/drivers/devfreq/governor_performance.c +++ b/drivers/devfreq/governor_performance.c @@ -20,10 +20,7 @@ static int devfreq_performance_func(struct devfreq *df, * target callback should be able to get floor value as * said in 
devfreq.h */ - if (!df->max_freq) - *freq = UINT_MAX; - else - *freq = df->max_freq; + *freq = DEVFREQ_MAX_FREQ; return 0; } diff --git a/drivers/devfreq/governor_powersave.c b/drivers/devfreq/governor_powersave.c index 0c42f23249ef..9e8897f5ac42 100644 --- a/drivers/devfreq/governor_powersave.c +++ b/drivers/devfreq/governor_powersave.c @@ -20,7 +20,7 @@ static int devfreq_powersave_func(struct devfreq *df, * target callback should be able to get ceiling value as * said in devfreq.h */ - *freq = df->min_freq; + *freq = DEVFREQ_MIN_FREQ; return 0; } diff --git a/drivers/devfreq/governor_simpleondemand.c b/drivers/devfreq/governor_simpleondemand.c index 28e0f2de7100..c0417f0e081e 100644 --- a/drivers/devfreq/governor_simpleondemand.c +++ b/drivers/devfreq/governor_simpleondemand.c @@ -27,7 +27,6 @@ static int devfreq_simple_ondemand_func(struct devfreq *df, unsigned int dfso_upthreshold = DFSO_UPTHRESHOLD; unsigned int dfso_downdifferential = DFSO_DOWNDIFFERENCTIAL; struct devfreq_simple_ondemand_data *data = df->data; - unsigned long max = (df->max_freq) ? df->max_freq : UINT_MAX; err = devfreq_update_stats(df); if (err) @@ -47,7 +46,7 @@ static int devfreq_simple_ondemand_func(struct devfreq *df, /* Assume MAX if it is going to be divided by zero */ if (stat->total_time == 0) { - *freq = max; + *freq = DEVFREQ_MAX_FREQ; return 0; } @@ -60,13 +59,13 @@ static int devfreq_simple_ondemand_func(struct devfreq *df, /* Set MAX if it's busy enough */ if (stat->busy_time * 100 > stat->total_time * dfso_upthreshold) { - *freq = max; + *freq = DEVFREQ_MAX_FREQ; return 0; } /* Set MAX if we do not know the initial frequency */ if (stat->current_frequency == 0) { - *freq = max; + *freq = DEVFREQ_MAX_FREQ; return 0; } @@ -85,11 +84,6 @@ static int devfreq_simple_ondemand_func(struct devfreq *df, b = div_u64(b, (dfso_upthreshold - dfso_downdifferential / 2)); *freq = (unsigned long) b; - if (df->min_freq && *freq < df->min_freq) - *freq = df->min_freq; - if (df->max_freq && *freq > df->max_freq) - *freq = df->max_freq; - return 0; } diff --git a/drivers/devfreq/governor_userspace.c b/drivers/devfreq/governor_userspace.c index 080607c3f34d..378d84c011df 100644 --- a/drivers/devfreq/governor_userspace.c +++ b/drivers/devfreq/governor_userspace.c @@ -26,19 +26,11 @@ static int devfreq_userspace_func(struct devfreq *df, unsigned long *freq) { struct userspace_data *data = df->data; - if (data->valid) { - unsigned long adjusted_freq = data->user_frequency; - - if (df->max_freq && adjusted_freq > df->max_freq) - adjusted_freq = df->max_freq; - - if (df->min_freq && adjusted_freq < df->min_freq) - adjusted_freq = df->min_freq; - - *freq = adjusted_freq; - } else { + if (data->valid) + *freq = data->user_frequency; + else *freq = df->previous_freq; /* No user freq specified yet */ - } + return 0; } diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index b2ccce5fb071..791b8a366e6e 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -1066,46 +1066,43 @@ static const struct idle_cpu idle_cpu_dnv = { .disable_promotion_to_c1e = true, }; -#define ICPU(model, cpu) \ - { X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, (unsigned long)&cpu } - static const struct x86_cpu_id intel_idle_ids[] __initconst = { - ICPU(INTEL_FAM6_NEHALEM_EP, idle_cpu_nehalem), - ICPU(INTEL_FAM6_NEHALEM, idle_cpu_nehalem), - ICPU(INTEL_FAM6_NEHALEM_G, idle_cpu_nehalem), - ICPU(INTEL_FAM6_WESTMERE, idle_cpu_nehalem), - ICPU(INTEL_FAM6_WESTMERE_EP, idle_cpu_nehalem), - ICPU(INTEL_FAM6_NEHALEM_EX, 
idle_cpu_nehalem), - ICPU(INTEL_FAM6_ATOM_PINEVIEW, idle_cpu_atom), - ICPU(INTEL_FAM6_ATOM_LINCROFT, idle_cpu_lincroft), - ICPU(INTEL_FAM6_WESTMERE_EX, idle_cpu_nehalem), - ICPU(INTEL_FAM6_SANDYBRIDGE, idle_cpu_snb), - ICPU(INTEL_FAM6_SANDYBRIDGE_X, idle_cpu_snb), - ICPU(INTEL_FAM6_ATOM_CEDARVIEW, idle_cpu_atom), - ICPU(INTEL_FAM6_ATOM_SILVERMONT1, idle_cpu_byt), - ICPU(INTEL_FAM6_ATOM_MERRIFIELD, idle_cpu_tangier), - ICPU(INTEL_FAM6_ATOM_AIRMONT, idle_cpu_cht), - ICPU(INTEL_FAM6_IVYBRIDGE, idle_cpu_ivb), - ICPU(INTEL_FAM6_IVYBRIDGE_X, idle_cpu_ivt), - ICPU(INTEL_FAM6_HASWELL_CORE, idle_cpu_hsw), - ICPU(INTEL_FAM6_HASWELL_X, idle_cpu_hsw), - ICPU(INTEL_FAM6_HASWELL_ULT, idle_cpu_hsw), - ICPU(INTEL_FAM6_HASWELL_GT3E, idle_cpu_hsw), - ICPU(INTEL_FAM6_ATOM_SILVERMONT2, idle_cpu_avn), - ICPU(INTEL_FAM6_BROADWELL_CORE, idle_cpu_bdw), - ICPU(INTEL_FAM6_BROADWELL_GT3E, idle_cpu_bdw), - ICPU(INTEL_FAM6_BROADWELL_X, idle_cpu_bdw), - ICPU(INTEL_FAM6_BROADWELL_XEON_D, idle_cpu_bdw), - ICPU(INTEL_FAM6_SKYLAKE_MOBILE, idle_cpu_skl), - ICPU(INTEL_FAM6_SKYLAKE_DESKTOP, idle_cpu_skl), - ICPU(INTEL_FAM6_KABYLAKE_MOBILE, idle_cpu_skl), - ICPU(INTEL_FAM6_KABYLAKE_DESKTOP, idle_cpu_skl), - ICPU(INTEL_FAM6_SKYLAKE_X, idle_cpu_skx), - ICPU(INTEL_FAM6_XEON_PHI_KNL, idle_cpu_knl), - ICPU(INTEL_FAM6_XEON_PHI_KNM, idle_cpu_knl), - ICPU(INTEL_FAM6_ATOM_GOLDMONT, idle_cpu_bxt), - ICPU(INTEL_FAM6_ATOM_GEMINI_LAKE, idle_cpu_bxt), - ICPU(INTEL_FAM6_ATOM_DENVERTON, idle_cpu_dnv), + INTEL_CPU_FAM6(NEHALEM_EP, idle_cpu_nehalem), + INTEL_CPU_FAM6(NEHALEM, idle_cpu_nehalem), + INTEL_CPU_FAM6(NEHALEM_G, idle_cpu_nehalem), + INTEL_CPU_FAM6(WESTMERE, idle_cpu_nehalem), + INTEL_CPU_FAM6(WESTMERE_EP, idle_cpu_nehalem), + INTEL_CPU_FAM6(NEHALEM_EX, idle_cpu_nehalem), + INTEL_CPU_FAM6(ATOM_PINEVIEW, idle_cpu_atom), + INTEL_CPU_FAM6(ATOM_LINCROFT, idle_cpu_lincroft), + INTEL_CPU_FAM6(WESTMERE_EX, idle_cpu_nehalem), + INTEL_CPU_FAM6(SANDYBRIDGE, idle_cpu_snb), + INTEL_CPU_FAM6(SANDYBRIDGE_X, idle_cpu_snb), + INTEL_CPU_FAM6(ATOM_CEDARVIEW, idle_cpu_atom), + INTEL_CPU_FAM6(ATOM_SILVERMONT1, idle_cpu_byt), + INTEL_CPU_FAM6(ATOM_MERRIFIELD, idle_cpu_tangier), + INTEL_CPU_FAM6(ATOM_AIRMONT, idle_cpu_cht), + INTEL_CPU_FAM6(IVYBRIDGE, idle_cpu_ivb), + INTEL_CPU_FAM6(IVYBRIDGE_X, idle_cpu_ivt), + INTEL_CPU_FAM6(HASWELL_CORE, idle_cpu_hsw), + INTEL_CPU_FAM6(HASWELL_X, idle_cpu_hsw), + INTEL_CPU_FAM6(HASWELL_ULT, idle_cpu_hsw), + INTEL_CPU_FAM6(HASWELL_GT3E, idle_cpu_hsw), + INTEL_CPU_FAM6(ATOM_SILVERMONT2, idle_cpu_avn), + INTEL_CPU_FAM6(BROADWELL_CORE, idle_cpu_bdw), + INTEL_CPU_FAM6(BROADWELL_GT3E, idle_cpu_bdw), + INTEL_CPU_FAM6(BROADWELL_X, idle_cpu_bdw), + INTEL_CPU_FAM6(BROADWELL_XEON_D, idle_cpu_bdw), + INTEL_CPU_FAM6(SKYLAKE_MOBILE, idle_cpu_skl), + INTEL_CPU_FAM6(SKYLAKE_DESKTOP, idle_cpu_skl), + INTEL_CPU_FAM6(KABYLAKE_MOBILE, idle_cpu_skl), + INTEL_CPU_FAM6(KABYLAKE_DESKTOP, idle_cpu_skl), + INTEL_CPU_FAM6(SKYLAKE_X, idle_cpu_skx), + INTEL_CPU_FAM6(XEON_PHI_KNL, idle_cpu_knl), + INTEL_CPU_FAM6(XEON_PHI_KNM, idle_cpu_knl), + INTEL_CPU_FAM6(ATOM_GOLDMONT, idle_cpu_bxt), + INTEL_CPU_FAM6(ATOM_GEMINI_LAKE, idle_cpu_bxt), + INTEL_CPU_FAM6(ATOM_DENVERTON, idle_cpu_dnv), {} }; diff --git a/drivers/opp/core.c b/drivers/opp/core.c index 31ff03dbeb83..2c2df4e4fc14 100644 --- a/drivers/opp/core.c +++ b/drivers/opp/core.c @@ -48,9 +48,14 @@ static struct opp_device *_find_opp_dev(const struct device *dev, static struct opp_table *_find_opp_table_unlocked(struct device *dev) { struct opp_table *opp_table; + bool found; 
list_for_each_entry(opp_table, &opp_tables, node) { - if (_find_opp_dev(dev, opp_table)) { + mutex_lock(&opp_table->lock); + found = !!_find_opp_dev(dev, opp_table); + mutex_unlock(&opp_table->lock); + + if (found) { _get_opp_table_kref(opp_table); return opp_table; @@ -313,7 +318,7 @@ int dev_pm_opp_get_opp_count(struct device *dev) count = PTR_ERR(opp_table); dev_dbg(dev, "%s: OPP table not found (%d)\n", __func__, count); - return 0; + return count; } count = _get_opp_count(opp_table); @@ -754,8 +759,8 @@ static void _remove_opp_dev(struct opp_device *opp_dev, kfree(opp_dev); } -struct opp_device *_add_opp_dev(const struct device *dev, - struct opp_table *opp_table) +static struct opp_device *_add_opp_dev_unlocked(const struct device *dev, + struct opp_table *opp_table) { struct opp_device *opp_dev; int ret; @@ -766,6 +771,7 @@ struct opp_device *_add_opp_dev(const struct device *dev, /* Initialize opp-dev */ opp_dev->dev = dev; + list_add(&opp_dev->node, &opp_table->dev_list); /* Create debugfs entries for the opp_table */ @@ -777,7 +783,19 @@ struct opp_device *_add_opp_dev(const struct device *dev, return opp_dev; } -static struct opp_table *_allocate_opp_table(struct device *dev) +struct opp_device *_add_opp_dev(const struct device *dev, + struct opp_table *opp_table) +{ + struct opp_device *opp_dev; + + mutex_lock(&opp_table->lock); + opp_dev = _add_opp_dev_unlocked(dev, opp_table); + mutex_unlock(&opp_table->lock); + + return opp_dev; +} + +static struct opp_table *_allocate_opp_table(struct device *dev, int index) { struct opp_table *opp_table; struct opp_device *opp_dev; @@ -791,6 +809,7 @@ static struct opp_table *_allocate_opp_table(struct device *dev) if (!opp_table) return NULL; + mutex_init(&opp_table->lock); INIT_LIST_HEAD(&opp_table->dev_list); opp_dev = _add_opp_dev(dev, opp_table); @@ -799,7 +818,7 @@ static struct opp_table *_allocate_opp_table(struct device *dev) return NULL; } - _of_init_opp_table(opp_table, dev); + _of_init_opp_table(opp_table, dev, index); /* Find clk for the device */ opp_table->clk = clk_get(dev, NULL); @@ -812,7 +831,6 @@ static struct opp_table *_allocate_opp_table(struct device *dev) BLOCKING_INIT_NOTIFIER_HEAD(&opp_table->head); INIT_LIST_HEAD(&opp_table->opp_list); - mutex_init(&opp_table->lock); kref_init(&opp_table->kref); /* Secure the device table modification */ @@ -825,7 +843,7 @@ void _get_opp_table_kref(struct opp_table *opp_table) kref_get(&opp_table->kref); } -struct opp_table *dev_pm_opp_get_opp_table(struct device *dev) +static struct opp_table *_opp_get_opp_table(struct device *dev, int index) { struct opp_table *opp_table; @@ -836,31 +854,56 @@ struct opp_table *dev_pm_opp_get_opp_table(struct device *dev) if (!IS_ERR(opp_table)) goto unlock; - opp_table = _allocate_opp_table(dev); + opp_table = _managed_opp(dev, index); + if (opp_table) { + if (!_add_opp_dev_unlocked(dev, opp_table)) { + dev_pm_opp_put_opp_table(opp_table); + opp_table = NULL; + } + goto unlock; + } + + opp_table = _allocate_opp_table(dev, index); unlock: mutex_unlock(&opp_table_lock); return opp_table; } + +struct opp_table *dev_pm_opp_get_opp_table(struct device *dev) +{ + return _opp_get_opp_table(dev, 0); +} EXPORT_SYMBOL_GPL(dev_pm_opp_get_opp_table); +struct opp_table *dev_pm_opp_get_opp_table_indexed(struct device *dev, + int index) +{ + return _opp_get_opp_table(dev, index); +} + static void _opp_table_kref_release(struct kref *kref) { struct opp_table *opp_table = container_of(kref, struct opp_table, kref); - struct opp_device *opp_dev; + struct 
opp_device *opp_dev, *temp; /* Release clk */ if (!IS_ERR(opp_table->clk)) clk_put(opp_table->clk); - opp_dev = list_first_entry(&opp_table->dev_list, struct opp_device, - node); + WARN_ON(!list_empty(&opp_table->opp_list)); - _remove_opp_dev(opp_dev, opp_table); + list_for_each_entry_safe(opp_dev, temp, &opp_table->dev_list, node) { + /* + * The OPP table is getting removed, drop the performance state + * constraints. + */ + if (opp_table->genpd_performance_state) + dev_pm_genpd_set_performance_state((struct device *)(opp_dev->dev), 0); - /* dev_list must be empty now */ - WARN_ON(!list_empty(&opp_table->dev_list)); + _remove_opp_dev(opp_dev, opp_table); + } mutex_destroy(&opp_table->lock); list_del(&opp_table->node); @@ -869,6 +912,33 @@ static void _opp_table_kref_release(struct kref *kref) mutex_unlock(&opp_table_lock); } +void _opp_remove_all_static(struct opp_table *opp_table) +{ + struct dev_pm_opp *opp, *tmp; + + list_for_each_entry_safe(opp, tmp, &opp_table->opp_list, node) { + if (!opp->dynamic) + dev_pm_opp_put(opp); + } + + opp_table->parsed_static_opps = false; +} + +static void _opp_table_list_kref_release(struct kref *kref) +{ + struct opp_table *opp_table = container_of(kref, struct opp_table, + list_kref); + + _opp_remove_all_static(opp_table); + mutex_unlock(&opp_table_lock); +} + +void _put_opp_list_kref(struct opp_table *opp_table) +{ + kref_put_mutex(&opp_table->list_kref, _opp_table_list_kref_release, + &opp_table_lock); +} + void dev_pm_opp_put_opp_table(struct opp_table *opp_table) { kref_put_mutex(&opp_table->kref, _opp_table_kref_release, @@ -896,7 +966,6 @@ static void _opp_kref_release(struct kref *kref) kfree(opp); mutex_unlock(&opp_table->lock); - dev_pm_opp_put_opp_table(opp_table); } void dev_pm_opp_get(struct dev_pm_opp *opp) @@ -940,11 +1009,15 @@ void dev_pm_opp_remove(struct device *dev, unsigned long freq) if (found) { dev_pm_opp_put(opp); + + /* Drop the reference taken by dev_pm_opp_add() */ + dev_pm_opp_put_opp_table(opp_table); } else { dev_warn(dev, "%s: Couldn't find OPP with freq: %lu\n", __func__, freq); } + /* Drop the reference taken by _find_opp_table() */ dev_pm_opp_put_opp_table(opp_table); } EXPORT_SYMBOL_GPL(dev_pm_opp_remove); @@ -1062,9 +1135,6 @@ int _opp_add(struct device *dev, struct dev_pm_opp *new_opp, new_opp->opp_table = opp_table; kref_init(&new_opp->kref); - /* Get a reference to the OPP table */ - _get_opp_table_kref(opp_table); - ret = opp_debug_create_one(new_opp, opp_table); if (ret) dev_err(dev, "%s: Failed to register opp to debugfs (%d)\n", @@ -1543,8 +1613,9 @@ int dev_pm_opp_add(struct device *dev, unsigned long freq, unsigned long u_volt) return -ENOMEM; ret = _opp_add_v1(opp_table, dev, freq, u_volt, true); + if (ret) + dev_pm_opp_put_opp_table(opp_table); - dev_pm_opp_put_opp_table(opp_table); return ret; } EXPORT_SYMBOL_GPL(dev_pm_opp_add); @@ -1707,35 +1778,7 @@ int dev_pm_opp_unregister_notifier(struct device *dev, } EXPORT_SYMBOL(dev_pm_opp_unregister_notifier); -/* - * Free OPPs either created using static entries present in DT or even the - * dynamically added entries based on remove_all param. 
- */ -void _dev_pm_opp_remove_table(struct opp_table *opp_table, struct device *dev, - bool remove_all) -{ - struct dev_pm_opp *opp, *tmp; - - /* Find if opp_table manages a single device */ - if (list_is_singular(&opp_table->dev_list)) { - /* Free static OPPs */ - list_for_each_entry_safe(opp, tmp, &opp_table->opp_list, node) { - if (remove_all || !opp->dynamic) - dev_pm_opp_put(opp); - } - - /* - * The OPP table is getting removed, drop the performance state - * constraints. - */ - if (opp_table->genpd_performance_state) - dev_pm_genpd_set_performance_state(dev, 0); - } else { - _remove_opp_dev(_find_opp_dev(dev, opp_table), opp_table); - } -} - -void _dev_pm_opp_find_and_remove_table(struct device *dev, bool remove_all) +void _dev_pm_opp_find_and_remove_table(struct device *dev) { struct opp_table *opp_table; @@ -1752,8 +1795,12 @@ void _dev_pm_opp_find_and_remove_table(struct device *dev, bool remove_all) return; } - _dev_pm_opp_remove_table(opp_table, dev, remove_all); + _put_opp_list_kref(opp_table); + + /* Drop reference taken by _find_opp_table() */ + dev_pm_opp_put_opp_table(opp_table); + /* Drop reference taken while the OPP table was added */ dev_pm_opp_put_opp_table(opp_table); } @@ -1766,6 +1813,6 @@ void _dev_pm_opp_find_and_remove_table(struct device *dev, bool remove_all) */ void dev_pm_opp_remove_table(struct device *dev) { - _dev_pm_opp_find_and_remove_table(dev, true); + _dev_pm_opp_find_and_remove_table(dev); } EXPORT_SYMBOL_GPL(dev_pm_opp_remove_table); diff --git a/drivers/opp/cpu.c b/drivers/opp/cpu.c index 0c0910709435..ab6d07e78945 100644 --- a/drivers/opp/cpu.c +++ b/drivers/opp/cpu.c @@ -108,7 +108,8 @@ void dev_pm_opp_free_cpufreq_table(struct device *dev, EXPORT_SYMBOL_GPL(dev_pm_opp_free_cpufreq_table); #endif /* CONFIG_CPU_FREQ */ -void _dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask, bool of) +void _dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask, + int last_cpu) { struct device *cpu_dev; int cpu; @@ -116,6 +117,9 @@ void _dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask, bool of) WARN_ON(cpumask_empty(cpumask)); for_each_cpu(cpu, cpumask) { + if (cpu == last_cpu) + break; + cpu_dev = get_cpu_device(cpu); if (!cpu_dev) { pr_err("%s: failed to get cpu%d device\n", __func__, @@ -123,10 +127,7 @@ void _dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask, bool of) continue; } - if (of) - dev_pm_opp_of_remove_table(cpu_dev); - else - dev_pm_opp_remove_table(cpu_dev); + _dev_pm_opp_find_and_remove_table(cpu_dev); } } @@ -140,7 +141,7 @@ void _dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask, bool of) */ void dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask) { - _dev_pm_opp_cpumask_remove_table(cpumask, false); + _dev_pm_opp_cpumask_remove_table(cpumask, -1); } EXPORT_SYMBOL_GPL(dev_pm_opp_cpumask_remove_table); @@ -222,8 +223,10 @@ int dev_pm_opp_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpumask) cpumask_clear(cpumask); if (opp_table->shared_opp == OPP_TABLE_ACCESS_SHARED) { + mutex_lock(&opp_table->lock); list_for_each_entry(opp_dev, &opp_table->dev_list, node) cpumask_set_cpu(opp_dev->dev->id, cpumask); + mutex_unlock(&opp_table->lock); } else { cpumask_set_cpu(cpu_dev->id, cpumask); } diff --git a/drivers/opp/of.c b/drivers/opp/of.c index 7af0ddec936b..5a4b47958073 100644 --- a/drivers/opp/of.c +++ b/drivers/opp/of.c @@ -23,11 +23,32 @@ #include "opp.h" -static struct opp_table *_managed_opp(const struct device_node *np) +/* + * Returns opp descriptor node for a device 
diff --git a/drivers/opp/of.c b/drivers/opp/of.c
index 7af0ddec936b..5a4b47958073 100644
--- a/drivers/opp/of.c
+++ b/drivers/opp/of.c
@@ -23,11 +23,32 @@
 
 #include "opp.h"
 
-static struct opp_table *_managed_opp(const struct device_node *np)
+/*
+ * Returns opp descriptor node for a device node, caller must
+ * do of_node_put().
+ */
+static struct device_node *_opp_of_get_opp_desc_node(struct device_node *np,
+						     int index)
+{
+	/* "operating-points-v2" can be an array for power domain providers */
+	return of_parse_phandle(np, "operating-points-v2", index);
+}
+
+/* Returns opp descriptor node for a device, caller must do of_node_put() */
+struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev)
+{
+	return _opp_of_get_opp_desc_node(dev->of_node, 0);
+}
+EXPORT_SYMBOL_GPL(dev_pm_opp_of_get_opp_desc_node);
+
+struct opp_table *_managed_opp(struct device *dev, int index)
 {
 	struct opp_table *opp_table, *managed_table = NULL;
+	struct device_node *np;
 
-	mutex_lock(&opp_table_lock);
+	np = _opp_of_get_opp_desc_node(dev->of_node, index);
+	if (!np)
+		return NULL;
 
 	list_for_each_entry(opp_table, &opp_tables, node) {
 		if (opp_table->np == np) {
@@ -47,29 +68,45 @@ static struct opp_table *_managed_opp(const struct device_node *np)
 		}
 	}
 
-	mutex_unlock(&opp_table_lock);
+	of_node_put(np);
 
 	return managed_table;
 }
 
-void _of_init_opp_table(struct opp_table *opp_table, struct device *dev)
+void _of_init_opp_table(struct opp_table *opp_table, struct device *dev,
+			int index)
 {
-	struct device_node *np;
+	struct device_node *np, *opp_np;
+	u32 val;
 
 	/*
 	 * Only required for backward compatibility with v1 bindings, but isn't
 	 * harmful for other cases. And so we do it unconditionally.
 	 */
 	np = of_node_get(dev->of_node);
-	if (np) {
-		u32 val;
-
-		if (!of_property_read_u32(np, "clock-latency", &val))
-			opp_table->clock_latency_ns_max = val;
-		of_property_read_u32(np, "voltage-tolerance",
-				     &opp_table->voltage_tolerance_v1);
-		of_node_put(np);
-	}
+	if (!np)
+		return;
+
+	if (!of_property_read_u32(np, "clock-latency", &val))
+		opp_table->clock_latency_ns_max = val;
+	of_property_read_u32(np, "voltage-tolerance",
+			     &opp_table->voltage_tolerance_v1);
+
+	/* Get OPP table node */
+	opp_np = _opp_of_get_opp_desc_node(np, index);
+	of_node_put(np);
+
+	if (!opp_np)
+		return;
+
+	if (of_property_read_bool(opp_np, "opp-shared"))
+		opp_table->shared_opp = OPP_TABLE_ACCESS_SHARED;
+	else
+		opp_table->shared_opp = OPP_TABLE_ACCESS_EXCLUSIVE;
+
+	opp_table->np = opp_np;
+
+	of_node_put(opp_np);
 }
 
 static bool _opp_is_supported(struct device *dev, struct opp_table *opp_table,
@@ -245,26 +282,10 @@ free_microvolt:
  */
 void dev_pm_opp_of_remove_table(struct device *dev)
 {
-	_dev_pm_opp_find_and_remove_table(dev, false);
+	_dev_pm_opp_find_and_remove_table(dev);
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_of_remove_table);
 
-/* Returns opp descriptor node for a device node, caller must
- * do of_node_put() */
-static struct device_node *_opp_of_get_opp_desc_node(struct device_node *np,
-						     int index)
-{
-	/* "operating-points-v2" can be an array for power domain providers */
-	return of_parse_phandle(np, "operating-points-v2", index);
-}
-
-/* Returns opp descriptor node for a device, caller must do of_node_put() */
-struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev)
-{
-	return _opp_of_get_opp_desc_node(dev->of_node, 0);
-}
-EXPORT_SYMBOL_GPL(dev_pm_opp_of_get_opp_desc_node);
-
 /**
  * _opp_add_static_v2() - Allocate static OPPs (As per 'v2' DT bindings)
  * @opp_table:	OPP table
@@ -276,15 +297,21 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_of_get_opp_desc_node);
  * removed by dev_pm_opp_remove.
  *
  * Return:
- * 0		On success OR
+ * Valid OPP pointer:
+ *		On success
+ * NULL:
  *		Duplicate OPPs (both freq and volt are same) and opp->available
- * -EEXIST	Freq are same and volt are different OR
+ *		OR if the OPP is not supported by hardware.
+ * ERR_PTR(-EEXIST):
+ *		Freq are same and volt are different OR
  *		Duplicate OPPs (both freq and volt are same) and !opp->available
- * -ENOMEM	Memory allocation failure
- * -EINVAL	Failed parsing the OPP node
+ * ERR_PTR(-ENOMEM):
+ *		Memory allocation failure
+ * ERR_PTR(-EINVAL):
+ *		Failed parsing the OPP node
  */
-static int _opp_add_static_v2(struct opp_table *opp_table, struct device *dev,
-			      struct device_node *np)
+static struct dev_pm_opp *_opp_add_static_v2(struct opp_table *opp_table,
+		struct device *dev, struct device_node *np)
 {
 	struct dev_pm_opp *new_opp;
 	u64 rate = 0;
@@ -294,7 +321,7 @@ static int _opp_add_static_v2(struct opp_table *opp_table, struct device *dev,
 
 	new_opp = _opp_allocate(opp_table);
 	if (!new_opp)
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 
 	ret = of_property_read_u64(np, "opp-hz", &rate);
 	if (ret < 0) {
@@ -369,52 +396,47 @@ static int _opp_add_static_v2(struct opp_table *opp_table, struct device *dev,
 	 * frequency/voltage list.
 	 */
 	blocking_notifier_call_chain(&opp_table->head, OPP_EVENT_ADD, new_opp);
-	return 0;
+	return new_opp;
 
 free_opp:
 	_opp_free(new_opp);
 
-	return ret;
+	return ERR_PTR(ret);
 }
 
 /* Initializes OPP tables based on new bindings */
-static int _of_add_opp_table_v2(struct device *dev, struct device_node *opp_np)
+static int _of_add_opp_table_v2(struct device *dev, struct opp_table *opp_table)
 {
 	struct device_node *np;
-	struct opp_table *opp_table;
-	int ret = 0, count = 0, pstate_count = 0;
+	int ret, count = 0, pstate_count = 0;
 	struct dev_pm_opp *opp;
 
-	opp_table = _managed_opp(opp_np);
-	if (opp_table) {
-		/* OPPs are already managed */
-		if (!_add_opp_dev(dev, opp_table))
-			ret = -ENOMEM;
-		goto put_opp_table;
+	/* OPP table is already initialized for the device */
+	if (opp_table->parsed_static_opps) {
+		kref_get(&opp_table->list_kref);
+		return 0;
 	}
 
-	opp_table = dev_pm_opp_get_opp_table(dev);
-	if (!opp_table)
-		return -ENOMEM;
+	kref_init(&opp_table->list_kref);
 
 	/* We have opp-table node now, iterate over it and add OPPs */
-	for_each_available_child_of_node(opp_np, np) {
-		count++;
-
-		ret = _opp_add_static_v2(opp_table, dev, np);
-		if (ret) {
+	for_each_available_child_of_node(opp_table->np, np) {
+		opp = _opp_add_static_v2(opp_table, dev, np);
+		if (IS_ERR(opp)) {
+			ret = PTR_ERR(opp);
 			dev_err(dev, "%s: Failed to add OPP, %d\n", __func__,
 				ret);
-			_dev_pm_opp_remove_table(opp_table, dev, false);
 			of_node_put(np);
-			goto put_opp_table;
+			goto put_list_kref;
+		} else if (opp) {
+			count++;
 		}
 	}
 
	/* There should be one or more OPPs defined */
 	if (WARN_ON(!count)) {
 		ret = -ENOENT;
-		goto put_opp_table;
+		goto put_list_kref;
 	}
 
 	list_for_each_entry(opp, &opp_table->opp_list, node)
@@ -425,28 +447,25 @@ static int _of_add_opp_table_v2(struct device *dev, struct device_node *opp_np)
 		dev_err(dev, "Not all nodes have performance state set (%d: %d)\n",
 			count, pstate_count);
 		ret = -ENOENT;
-		goto put_opp_table;
+		goto put_list_kref;
 	}
 
 	if (pstate_count)
 		opp_table->genpd_performance_state = true;
 
-	opp_table->np = opp_np;
-	if (of_property_read_bool(opp_np, "opp-shared"))
-		opp_table->shared_opp = OPP_TABLE_ACCESS_SHARED;
-	else
-		opp_table->shared_opp = OPP_TABLE_ACCESS_EXCLUSIVE;
+	opp_table->parsed_static_opps = true;
 
-put_opp_table:
-	dev_pm_opp_put_opp_table(opp_table);
+	return 0;
+
+put_list_kref:
+	_put_opp_list_kref(opp_table);
 
 	return ret;
 }
 
 /* Initializes OPP tables based on old-deprecated bindings */
-static int _of_add_opp_table_v1(struct device *dev)
+static int _of_add_opp_table_v1(struct device *dev, struct opp_table *opp_table)
 {
-	struct opp_table *opp_table;
 	const struct property *prop;
 	const __be32 *val;
 	int nr, ret = 0;
@@ -467,9 +486,7 @@ static int _of_add_opp_table_v1(struct device *dev)
 		return -EINVAL;
 	}
 
-	opp_table = dev_pm_opp_get_opp_table(dev);
-	if (!opp_table)
-		return -ENOMEM;
+	kref_init(&opp_table->list_kref);
 
 	val = prop->value;
 	while (nr) {
@@ -480,13 +497,12 @@ static int _of_add_opp_table_v1(struct device *dev)
 		if (ret) {
 			dev_err(dev, "%s: Failed to add OPP %ld (%d)\n",
 				__func__, freq, ret);
-			_dev_pm_opp_remove_table(opp_table, dev, false);
-			break;
+			_put_opp_list_kref(opp_table);
+			return ret;
 		}
 		nr -= 2;
 	}
 
-	dev_pm_opp_put_opp_table(opp_table);
 	return ret;
 }
 
@@ -509,24 +525,24 @@ static int _of_add_opp_table_v1(struct device *dev)
  */
 int dev_pm_opp_of_add_table(struct device *dev)
 {
-	struct device_node *opp_np;
+	struct opp_table *opp_table;
 	int ret;
 
+	opp_table = dev_pm_opp_get_opp_table_indexed(dev, 0);
+	if (!opp_table)
+		return -ENOMEM;
+
 	/*
-	 * OPPs have two version of bindings now. The older one is deprecated,
-	 * try for the new binding first.
+	 * OPPs have two versions of bindings now. Also try the old (v1)
+	 * bindings for backward compatibility with older dtbs.
 	 */
-	opp_np = dev_pm_opp_of_get_opp_desc_node(dev);
-	if (!opp_np) {
-		/*
-		 * Try old-deprecated bindings for backward compatibility with
-		 * older dtbs.
-		 */
-		return _of_add_opp_table_v1(dev);
-	}
+	if (opp_table->np)
+		ret = _of_add_opp_table_v2(dev, opp_table);
+	else
+		ret = _of_add_opp_table_v1(dev, opp_table);
 
-	ret = _of_add_opp_table_v2(dev, opp_np);
-	of_node_put(opp_np);
+	if (ret)
+		dev_pm_opp_put_opp_table(opp_table);
 
 	return ret;
 }
@@ -553,28 +569,29 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_of_add_table);
  */
 int dev_pm_opp_of_add_table_indexed(struct device *dev, int index)
 {
-	struct device_node *opp_np;
+	struct opp_table *opp_table;
 	int ret, count;
 
-again:
-	opp_np = _opp_of_get_opp_desc_node(dev->of_node, index);
-	if (!opp_np) {
+	if (index) {
 		/*
 		 * If only one phandle is present, then the same OPP table
 		 * applies for all index requests.
 		 */
 		count = of_count_phandle_with_args(dev->of_node,
 						   "operating-points-v2", NULL);
-		if (count == 1 && index) {
-			index = 0;
-			goto again;
-		}
+		if (count != 1)
+			return -ENODEV;
 
-		return -ENODEV;
+		index = 0;
 	}
 
-	ret = _of_add_opp_table_v2(dev, opp_np);
-	of_node_put(opp_np);
+	opp_table = dev_pm_opp_get_opp_table_indexed(dev, index);
+	if (!opp_table)
+		return -ENOMEM;
+
+	ret = _of_add_opp_table_v2(dev, opp_table);
+	if (ret)
+		dev_pm_opp_put_opp_table(opp_table);
 
 	return ret;
 }
@@ -591,7 +608,7 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_of_add_table_indexed);
  */
 void dev_pm_opp_of_cpumask_remove_table(const struct cpumask *cpumask)
 {
-	_dev_pm_opp_cpumask_remove_table(cpumask, true);
+	_dev_pm_opp_cpumask_remove_table(cpumask, -1);
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_of_cpumask_remove_table);
 
@@ -604,16 +621,18 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_of_cpumask_remove_table);
 int dev_pm_opp_of_cpumask_add_table(const struct cpumask *cpumask)
 {
 	struct device *cpu_dev;
-	int cpu, ret = 0;
+	int cpu, ret;
 
-	WARN_ON(cpumask_empty(cpumask));
+	if (WARN_ON(cpumask_empty(cpumask)))
+		return -ENODEV;
 
 	for_each_cpu(cpu, cpumask) {
 		cpu_dev = get_cpu_device(cpu);
 		if (!cpu_dev) {
 			pr_err("%s: failed to get cpu%d device\n", __func__,
 			       cpu);
-			continue;
+			ret = -ENODEV;
+			goto remove_table;
 		}
 
 		ret = dev_pm_opp_of_add_table(cpu_dev);
@@ -625,12 +644,16 @@ int dev_pm_opp_of_cpumask_add_table(const struct cpumask *cpumask)
 			pr_debug("%s: couldn't find opp table for cpu:%d, %d\n",
 				 __func__, cpu, ret);
 
-			/* Free all other OPPs */
-			dev_pm_opp_of_cpumask_remove_table(cpumask);
-			break;
+			goto remove_table;
 		}
 	}
 
+	return 0;
+
+remove_table:
+	/* Free all other OPPs */
+	_dev_pm_opp_cpumask_remove_table(cpumask, cpu);
+
 	return ret;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_of_cpumask_add_table);
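To make the indexed lookup in of.c concrete: a device node may now carry several "operating-points-v2" phandles, and each is addressed by index. A hedged sketch of a consumer, e.g. a power-domain provider (my_attach_opps and the error message are illustrative only):

#include <linux/device.h>
#include <linux/pm_opp.h>

static int my_attach_opps(struct device *dev, int index)
{
	int ret;

	/*
	 * The helper grabs the table via dev_pm_opp_get_opp_table_indexed()
	 * first, so a second caller for the same node shares the table
	 * instead of re-parsing the static OPPs.
	 */
	ret = dev_pm_opp_of_add_table_indexed(dev, index);
	if (ret)
		dev_err(dev, "no OPP table at index %d (%d)\n", index, ret);

	return ret;
}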
diff --git a/drivers/opp/opp.h b/drivers/opp/opp.h
index 7c540fd063b2..9c6544b4f4f9 100644
--- a/drivers/opp/opp.h
+++ b/drivers/opp/opp.h
@@ -126,9 +126,11 @@ enum opp_table_access {
  * @dev_list:	list of devices that share these OPPs
  * @opp_list:	table of opps
  * @kref:	for reference count of the table.
- * @lock:	mutex protecting the opp_list.
+ * @list_kref:	for reference count of the OPP list.
+ * @lock:	mutex protecting the opp_list and dev_list.
  * @np:		struct device_node pointer for opp's DT node.
  * @clock_latency_ns_max: Max clock latency in nanoseconds.
+ * @parsed_static_opps: True if OPPs are initialized from DT.
  * @shared_opp: OPP is shared between multiple devices.
  * @suspend_opp: Pointer to OPP to be used during device suspend.
  * @supported_hw: Array of version numbers to support.
@@ -156,6 +158,7 @@ struct opp_table {
 	struct list_head dev_list;
 	struct list_head opp_list;
 	struct kref kref;
+	struct kref list_kref;
 	struct mutex lock;
 
 	struct device_node *np;
@@ -164,6 +167,7 @@ struct opp_table {
 	/* For backward compatibility with v1 bindings */
 	unsigned int voltage_tolerance_v1;
 
+	bool parsed_static_opps;
 	enum opp_table_access shared_opp;
 	struct dev_pm_opp *suspend_opp;
 
@@ -186,23 +190,26 @@ struct opp_table {
 
 /* Routines internal to opp core */
 void dev_pm_opp_get(struct dev_pm_opp *opp);
+void _opp_remove_all_static(struct opp_table *opp_table);
 void _get_opp_table_kref(struct opp_table *opp_table);
 int _get_opp_count(struct opp_table *opp_table);
 struct opp_table *_find_opp_table(struct device *dev);
 struct opp_device *_add_opp_dev(const struct device *dev, struct opp_table *opp_table);
-void _dev_pm_opp_remove_table(struct opp_table *opp_table, struct device *dev, bool remove_all);
-void _dev_pm_opp_find_and_remove_table(struct device *dev, bool remove_all);
+void _dev_pm_opp_find_and_remove_table(struct device *dev);
 struct dev_pm_opp *_opp_allocate(struct opp_table *opp_table);
 void _opp_free(struct dev_pm_opp *opp);
 int _opp_add(struct device *dev, struct dev_pm_opp *new_opp, struct opp_table *opp_table, bool rate_not_available);
 int _opp_add_v1(struct opp_table *opp_table, struct device *dev, unsigned long freq, long u_volt, bool dynamic);
-void _dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask, bool of);
+void _dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask, int last_cpu);
 struct opp_table *_add_opp_table(struct device *dev);
+void _put_opp_list_kref(struct opp_table *opp_table);
 
 #ifdef CONFIG_OF
-void _of_init_opp_table(struct opp_table *opp_table, struct device *dev);
+void _of_init_opp_table(struct opp_table *opp_table, struct device *dev, int index);
+struct opp_table *_managed_opp(struct device *dev, int index);
 #else
-static inline void _of_init_opp_table(struct opp_table *opp_table, struct device *dev) {}
+static inline void _of_init_opp_table(struct opp_table *opp_table, struct device *dev, int index) {}
+static inline struct opp_table *_managed_opp(struct device *dev, int index) { return NULL; }
 #endif
 
 #ifdef CONFIG_DEBUG_FS
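The opp.h hunks split the lifetime management in two: kref guards the opp_table structure itself, while the new list_kref counts users of the parsed static OPP list. A stripped-down illustration of that pattern, assuming only <linux/kref.h> (the demo_* names are invented for this sketch and are not kernel code):

#include <linux/kref.h>

struct demo_table {
	struct kref kref;	/* keeps the table structure alive */
	struct kref list_kref;	/* counts users of the static OPP list */
};

static void demo_free_static_list(struct kref *kref)
{
	/* analogous to _opp_remove_all_static(): drop static entries only */
}

static void demo_put_list(struct demo_table *t)
{
	/* last list user gone: the static OPPs go, the table may live on */
	kref_put(&t->list_kref, demo_free_static_list);
}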
diff --git a/drivers/powercap/intel_rapl.c b/drivers/powercap/intel_rapl.c
index 295d8dcba48c..bb92874b1175 100644
--- a/drivers/powercap/intel_rapl.c
+++ b/drivers/powercap/intel_rapl.c
@@ -1133,47 +1133,40 @@ static const struct rapl_defaults rapl_defaults_cht = {
 	.compute_time_window = rapl_compute_time_window_atom,
 };
 
-#define RAPL_CPU(_model, _ops) {			\
-		.vendor = X86_VENDOR_INTEL,		\
-		.family = 6,				\
-		.model = _model,			\
-		.driver_data = (kernel_ulong_t)&_ops,	\
-		}
-
 static const struct x86_cpu_id rapl_ids[] __initconst = {
-	RAPL_CPU(INTEL_FAM6_SANDYBRIDGE,	rapl_defaults_core),
-	RAPL_CPU(INTEL_FAM6_SANDYBRIDGE_X,	rapl_defaults_core),
-
-	RAPL_CPU(INTEL_FAM6_IVYBRIDGE,		rapl_defaults_core),
-	RAPL_CPU(INTEL_FAM6_IVYBRIDGE_X,	rapl_defaults_core),
-
-	RAPL_CPU(INTEL_FAM6_HASWELL_CORE,	rapl_defaults_core),
-	RAPL_CPU(INTEL_FAM6_HASWELL_ULT,	rapl_defaults_core),
-	RAPL_CPU(INTEL_FAM6_HASWELL_GT3E,	rapl_defaults_core),
-	RAPL_CPU(INTEL_FAM6_HASWELL_X,		rapl_defaults_hsw_server),
-
-	RAPL_CPU(INTEL_FAM6_BROADWELL_CORE,	rapl_defaults_core),
-	RAPL_CPU(INTEL_FAM6_BROADWELL_GT3E,	rapl_defaults_core),
-	RAPL_CPU(INTEL_FAM6_BROADWELL_XEON_D,	rapl_defaults_core),
-	RAPL_CPU(INTEL_FAM6_BROADWELL_X,	rapl_defaults_hsw_server),
-
-	RAPL_CPU(INTEL_FAM6_SKYLAKE_DESKTOP,	rapl_defaults_core),
-	RAPL_CPU(INTEL_FAM6_SKYLAKE_MOBILE,	rapl_defaults_core),
-	RAPL_CPU(INTEL_FAM6_SKYLAKE_X,		rapl_defaults_hsw_server),
-	RAPL_CPU(INTEL_FAM6_KABYLAKE_MOBILE,	rapl_defaults_core),
-	RAPL_CPU(INTEL_FAM6_KABYLAKE_DESKTOP,	rapl_defaults_core),
-	RAPL_CPU(INTEL_FAM6_CANNONLAKE_MOBILE,	rapl_defaults_core),
-
-	RAPL_CPU(INTEL_FAM6_ATOM_SILVERMONT1,	rapl_defaults_byt),
-	RAPL_CPU(INTEL_FAM6_ATOM_AIRMONT,	rapl_defaults_cht),
-	RAPL_CPU(INTEL_FAM6_ATOM_MERRIFIELD,	rapl_defaults_tng),
-	RAPL_CPU(INTEL_FAM6_ATOM_MOOREFIELD,	rapl_defaults_ann),
-	RAPL_CPU(INTEL_FAM6_ATOM_GOLDMONT,	rapl_defaults_core),
-	RAPL_CPU(INTEL_FAM6_ATOM_GEMINI_LAKE,	rapl_defaults_core),
-	RAPL_CPU(INTEL_FAM6_ATOM_DENVERTON,	rapl_defaults_core),
-
-	RAPL_CPU(INTEL_FAM6_XEON_PHI_KNL,	rapl_defaults_hsw_server),
-	RAPL_CPU(INTEL_FAM6_XEON_PHI_KNM,	rapl_defaults_hsw_server),
+	INTEL_CPU_FAM6(SANDYBRIDGE,		rapl_defaults_core),
+	INTEL_CPU_FAM6(SANDYBRIDGE_X,		rapl_defaults_core),
+
+	INTEL_CPU_FAM6(IVYBRIDGE,		rapl_defaults_core),
+	INTEL_CPU_FAM6(IVYBRIDGE_X,		rapl_defaults_core),
+
+	INTEL_CPU_FAM6(HASWELL_CORE,		rapl_defaults_core),
+	INTEL_CPU_FAM6(HASWELL_ULT,		rapl_defaults_core),
+	INTEL_CPU_FAM6(HASWELL_GT3E,		rapl_defaults_core),
+	INTEL_CPU_FAM6(HASWELL_X,		rapl_defaults_hsw_server),
+
+	INTEL_CPU_FAM6(BROADWELL_CORE,		rapl_defaults_core),
+	INTEL_CPU_FAM6(BROADWELL_GT3E,		rapl_defaults_core),
+	INTEL_CPU_FAM6(BROADWELL_XEON_D,	rapl_defaults_core),
+	INTEL_CPU_FAM6(BROADWELL_X,		rapl_defaults_hsw_server),
+
+	INTEL_CPU_FAM6(SKYLAKE_DESKTOP,		rapl_defaults_core),
+	INTEL_CPU_FAM6(SKYLAKE_MOBILE,		rapl_defaults_core),
+	INTEL_CPU_FAM6(SKYLAKE_X,		rapl_defaults_hsw_server),
+	INTEL_CPU_FAM6(KABYLAKE_MOBILE,		rapl_defaults_core),
+	INTEL_CPU_FAM6(KABYLAKE_DESKTOP,	rapl_defaults_core),
+	INTEL_CPU_FAM6(CANNONLAKE_MOBILE,	rapl_defaults_core),
+
+	INTEL_CPU_FAM6(ATOM_SILVERMONT1,	rapl_defaults_byt),
+	INTEL_CPU_FAM6(ATOM_AIRMONT,		rapl_defaults_cht),
+	INTEL_CPU_FAM6(ATOM_MERRIFIELD,		rapl_defaults_tng),
+	INTEL_CPU_FAM6(ATOM_MOOREFIELD,		rapl_defaults_ann),
+	INTEL_CPU_FAM6(ATOM_GOLDMONT,		rapl_defaults_core),
+	INTEL_CPU_FAM6(ATOM_GEMINI_LAKE,	rapl_defaults_core),
+	INTEL_CPU_FAM6(ATOM_DENVERTON,		rapl_defaults_core),
+
+	INTEL_CPU_FAM6(XEON_PHI_KNL,		rapl_defaults_hsw_server),
+	INTEL_CPU_FAM6(XEON_PHI_KNM,		rapl_defaults_hsw_server),
 	{}
 };
 MODULE_DEVICE_TABLE(x86cpu, rapl_ids);
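The INTEL_CPU_FAM6() entries above come from the common macro in <asm/intel-family.h>, which hard-codes vendor Intel and family 6 exactly as the removed driver-local RAPL_CPU() wrapper did. A self-contained sketch of how such a table is matched at init time (the demo_* names are illustrative; x86_match_cpu() is the standard helper):

#include <linux/module.h>
#include <asm/cpu_device_id.h>
#include <asm/intel-family.h>

static const int demo_defaults = 42;	/* stands in for rapl_defaults_* */

static const struct x86_cpu_id demo_ids[] __initconst = {
	/* expands to vendor Intel, family 6, model INTEL_FAM6_SKYLAKE_DESKTOP,
	 * with &demo_defaults stored in .driver_data */
	INTEL_CPU_FAM6(SKYLAKE_DESKTOP, demo_defaults),
	{}
};

static int __init demo_init(void)
{
	const struct x86_cpu_id *id = x86_match_cpu(demo_ids);

	if (!id)
		return -ENODEV;	/* not running on a listed CPU model */

	/* id->driver_data holds &demo_defaults for the matched entry */
	return 0;
}
module_init(demo_init);

MODULE_LICENSE("GPL");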