summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/panfrost
AgeCommit message (Collapse)AuthorFilesLines
2024-12-13Merge tag 'drm-misc-next-2024-12-05' of ↵Dave Airlie2-1/+4
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next [airlied: handle module ns conflict] drm-misc-next for 6.14: UAPI Changes: Cross-subsystem Changes: Core Changes: - Remove driver date from drm_driver Driver Changes: - amdxdna: New driver! - ivpu: Fix qemu crash when using passthrough - nouveau: expose GSP-RM logging buffers via debugfs - panfrost: Add MT8188 Mali-G57 MC3 support - panthor: misc improvements, - rockchip: Gamma LUT support - tidss: Misc improvements - virtio: convert to helpers, add prime support for scanout buffers - v3d: Add DRM_IOCTL_V3D_PERFMON_SET_GLOBAL - vc4: Add support for BCM2712 - vkms: Improvements all across the board - panels: - Introduce backlight quirks infrastructure - New panels: KDB KD116N2130B12 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241205-agile-straight-pegasus-aca7f4@houat
2024-12-05drm: remove driver date from struct drm_driver and all driversJani Nikula1-1/+0
We stopped using the driver initialized date in commit 7fb8af6798e8 ("drm: deprecate driver date") and (eventually) started returning "0" for drm_version ioctl instead. Finish the job, and remove the unused date member from struct drm_driver, its initialization from drivers, along with the common DRIVER_DATE macros. v2: Also update drivers/accel (kernel test robot) Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Simon Ser <contact@emersion.fr> Acked-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # msm Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/1f2bf2543aed270a06f6c707fd6ed1b78bf16712.1733322525.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-12-01Get rid of 'remove_new' relic from platform driver structLinus Torvalds1-1/+1
The continual trickle of small conversion patches is grating on me, and is really not helping. Just get rid of the 'remove_new' member function, which is just an alias for the plain 'remove', and had a comment to that effect: /* * .remove_new() is a relic from a prototype conversion of .remove(). * New drivers are supposed to implement .remove(). Once all drivers are * converted to not use .remove_new any more, it will be dropped. */ This was just a tree-wide 'sed' script that replaced '.remove_new' with '.remove', with some care taken to turn a subsequent tab into two tabs to make things line up. I did do some minimal manual whitespace adjustment for places that used spaces to line things up. Then I just removed the old (sic) .remove_new member function, and this is the end result. No more unnecessary conversion noise. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-11-28drm/panfrost: Add GPU ID for MT8188 Mali-G57 MC3AngeloGioacchino Del Regno1-0/+4
The MediaTek MT8188 SoC has a Mali-G57 MC3 GPU and, similarly to MT8192, it has yet another special GPU ID. Add the GPU ID to the list and treat it as a standard Mali-G57. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241113112622.123044-1-angelogioacchino.delregno@collabora.com
2024-11-06drm/panfrost: Add missing OPP table refcnt decrementalAdrián Larumbe1-1/+2
Commit f11b0417eec2 ("drm/panfrost: Add fdinfo support GPU load metrics") retrieves the OPP for the maximum device clock frequency, but forgets to keep the reference count balanced by putting the returned OPP object. This eventually leads to an OPP core warning when removing the device. Fix it by putting OPP objects as many times as they're retrieved. Also remove an unnecessary whitespace. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Fixes: f11b0417eec2 ("drm/panfrost: Add fdinfo support GPU load metrics") Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241105205458.1318989-1-adrian.larumbe@collabora.com
2024-10-25drm/panfrost: Remove unused id_mask from struct panfrost_modelSteven Price1-1/+0
The id_mask field of struct panfrost_model has never been used. Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver") Signed-off-by: Steven Price <steven.price@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241025140008.385081-1-steven.price@arm.com
2024-09-06drm/sched: add optional errno to drm_sched_start()Christian König1-1/+1
The current implementation of drm_sched_start uses a hardcoded -ECANCELED to dispose of a job when the parent/hw fence is NULL. This results in drm_sched_job_done being called with -ECANCELED for each job with a NULL parent in the pending list, making it difficult to distinguish between recovery methods, whether a queue reset or a full GPU reset was used. To improve this, we first try a soft recovery for timeout jobs and use the error code -ENODATA. If soft recovery fails, we proceed with a queue reset, where the error code remains -ENODATA for the job. Finally, for a full GPU reset, we use error codes -ECANCELED or -ETIME. This patch adds an error code parameter to drm_sched_start, allowing us to differentiate between queue reset and GPU reset failures. This enables user mode and test applications to validate the expected correctness of the requested operation. After a successful queue reset, the only way to continue normal operation is to call drm_sched_job_done with the specific error code -ENODATA. v1: Initial implementation by Jesse utilized amdgpu_device_lock_reset_domain and amdgpu_device_unlock_reset_domain to allow user mode to track the queue reset status and distinguish between queue reset and GPU reset. v2: Christian suggested using the error codes -ENODATA for queue reset and -ECANCELED or -ETIME for GPU reset, returned to amdgpu_cs_wait_ioctl. v3: To meet the requirements, we introduce a new function drm_sched_start_ex with an additional parameter to set dma_fence_set_error, allowing us to handle the specific error codes appropriately and dispose of bad jobs with the selected error code depending on whether it was a queue reset or GPU reset. v4: Alex suggested using a new name, drm_sched_start_with_recovery_error, which more accurately describes the function's purpose. Additionally, it was recommended to add documentation details about the new method. v5: Fixed declaration of new function drm_sched_start_with_recovery_error.(Alex) v6 (chk): rebase on upstream changes, cleanup the commit message, drop the new function again and update all callers, apply the errno also to scheduler fences with hw fences v7 (chk): rebased Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240826122541.85663-1-christian.koenig@amd.com
2024-09-02drm/panfrost: Add cycle counter job requirementMary Guillemard2-13/+23
Extend the uAPI with a new job requirement flag for cycle counters. This requirement is used by userland to indicate that a job requires cycle counters or system timestamp to be propagated. (for use with write value timestamp jobs) We cannot enable cycle counters unconditionally as this would result in an increase of GPU power consumption. As a result, they should be left off unless required by the application. If a job requires cycle counters or system timestamps propagation, we must enable cycle counting before issuing a job and disable it right after the job completes. Since this extends the uAPI and because userland needs a way to advertise features like VK_KHR_shader_clock conditionally, we bumps the driver minor version. v2: - Rework commit message - Squash uAPI changes and implementation in this commit - Simplify changes based on Steven Price comments v3: - Add Steven Price r-b - Fix a codestyle issue Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Tested-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240819080224.24914-3-mary.guillemard@collabora.com
2024-09-02drm/panfrost: Add SYSTEM_TIMESTAMP and SYSTEM_TIMESTAMP_FREQUENCY parametersMary Guillemard4-0/+52
Expose system timestamp and frequency supported by the GPU. Mali uses an external timer as GPU system time. On ARM, this is wired to the generic arch timer so we wire cntfrq_el0 as device frequency. This new uAPI will be used in Mesa to implement timestamp queries and VK_KHR_calibrated_timestamps. v2: - Rewrote to use GPU timestamp register - Add missing include for arch_timer_get_cntfrq - Rework commit message v3: - Move panfrost_cycle_counter_get and panfrost_cycle_counter_put to panfrost_ioctl_query_timestamp - Handle possible overflow in panfrost_timestamp_read Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Tested-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240819080224.24914-2-mary.guillemard@collabora.com
2024-07-29Merge drm/drm-next into drm-misc-nextThomas Zimmermann1-1/+1
Backmerging to get a late RC of v6.10 before moving into v6.11. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-07-25drm/scheduler: remove full_recover from drm_sched_startChristian König1-1/+1
This was basically just another one of amdgpus hacks. The parameter allowed to restart the scheduler without turning fence signaling on again. That this is absolutely not a good idea should be obvious by now since the fences will then just sit there and never signal. While at it cleanup the code a bit. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240722083816.99685-1-christian.koenig@amd.com
2024-07-05Merge tag 'drm-misc-next-2024-07-04' of ↵Daniel Vetter1-0/+1
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for $kernel-version: UAPI Changes: Cross-subsystem Changes: Core Changes: - dp/mst: Fix daisy-chaining at resume - dsc: Add helper to dump the DSC configuration - tests: Add tests for the new monochrome TV mode variant Driver Changes: - ast: Refactor the mode setting code - panfrost: Fix devfreq job reporting - stm: Add LDVS support, DSI PHY updates - panels: - New panel: AUO G104STN01, K&d kd101ne3-40ti, Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704-curvy-outstanding-lizard-bcea78@houat
2024-07-03drm/panfrost: Mark simple_ondemand governor as softdepDragan Simic1-0/+1
Panfrost DRM driver uses devfreq to perform DVFS, while using simple_ondemand devfreq governor by default. This causes driver initialization to fail on boot when simple_ondemand governor isn't built into the kernel statically, as a result of the missing module dependency and, consequently, the required governor module not being included in the initial ramdisk. Thus, let's mark simple_ondemand governor as a softdep for Panfrost, to have its kernel module included in the initial ramdisk. This is a rather longstanding issue that has forced distributions to build devfreq governors statically into their kernels, [1][2] or has forced users to introduce some unnecessary workarounds. [3] For future reference, not having support for the simple_ondemand governor in the initial ramdisk produces errors in the kernel log similar to these below, which were taken from a Pine64 RockPro64: panfrost ff9a0000.gpu: [drm:panfrost_devfreq_init [panfrost]] *ERROR* Couldn't initialize GPU devfreq panfrost ff9a0000.gpu: Fatal error during GPU init panfrost: probe of ff9a0000.gpu failed with error -22 Having simple_ondemand marked as a softdep for Panfrost may not resolve this issue for all Linux distributions. In particular, it will remain unresolved for the distributions whose utilities for the initial ramdisk generation do not handle the available softdep information [4] properly yet. However, some Linux distributions already handle softdeps properly while generating their initial ramdisks, [5] and this is a prerequisite step in the right direction for the distributions that don't handle them properly yet. [1] https://gitlab.manjaro.org/manjaro-arm/packages/core/linux/-/blob/linux61/config?ref_type=heads#L8180 [2] https://salsa.debian.org/kernel-team/linux/-/merge_requests/1066 [3] https://forum.pine64.org/showthread.php?tid=15458 [4] https://git.kernel.org/pub/scm/utils/kernel/kmod/kmod.git/commit/?id=49d8e0b59052999de577ab732b719cfbeb89504d [5] https://github.com/archlinux/mkinitcpio/commit/97ac4d37aae084a050be512f6d8f4489054668ad Cc: Diederik de Haas <didi.debian@cknow.org> Cc: Furkan Kardame <f.kardame@manjaro.org> Cc: stable@vger.kernel.org Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver") Signed-off-by: Dragan Simic <dsimic@manjaro.org> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/4e1e00422a14db4e2a80870afb704405da16fd1b.1718655077.git.dsimic@manjaro.org
2024-06-21Merge tag 'drm-misc-next-2024-06-20' of ↵Dave Airlie1-0/+10
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for 6.11: UAPI Changes: - New monochrome TV mode variant Cross-subsystem Changes: - dma heaps: Change slightly the allocation hook prototype Core Changes: Driver Changes: - ivpu: various improvements over firmware handling, clocks, power management, scheduling and logging. - mgag200: Add BMC output, enable polling - panfrost: Enable MT8188 support - tidss: drm_panic support - zynqmp_dp: IRQ cleanups, debugfs DP compliance testing API - bridge: - sii902x: state validation improvements - panel: - edp: Drop legacy panel compatibles - simple-bridge: Switch to devm_drm_bridge_add Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240620-heretic-honored-macaque-b40f8a@houat
2024-06-19drm/panfrost: Add support for Mali on the MT8188 SoCAngeloGioacchino Del Regno1-0/+10
MediaTek MT8188 has a Mali-G57 MC3 (Valhall-JM): add a new compatible and platform data using the same supplies and the same power domain lists as MT8183 (one regulator, three power domains). Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240611085602.491324-3-angelogioacchino.delregno@collabora.com
2024-05-29drm/panfrost: Fix dma_resv deadlock at drm object pin timeAdrián Larumbe1-1/+1
When Panfrost must pin an object that is being prepared a dma-buf attachment for on behalf of another driver, the core drm gem object pinning code already takes a lock on the object's dma reservation. However, Panfrost GEM object's pinning callback would eventually try taking the lock on the same dma reservation when delegating pinning of the object onto the shmem subsystem, which led to a deadlock. This can be shown by enabling CONFIG_DEBUG_WW_MUTEX_SLOWPATH, which throws the following recursive locking situation: weston/3440 is trying to acquire lock: ffff000000e235a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_gem_shmem_pin+0x34/0xb8 [drm_shmem_helper] but task is already holding lock: ffff000000e235a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_gem_pin+0x2c/0x80 [drm] Fix it by replacing drm_gem_shmem_pin with its locked version, as the lock had already been taken by drm_gem_pin(). Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Steven Price <steven.price@arm.com> Fixes: a78027847226 ("drm/gem: Acquire reservation lock in drm_gem_{pin/unpin}()") Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240523113236.432585-2-adrian.larumbe@collabora.com
2024-04-22Backmerge tag 'v6.9-rc5' into drm-nextDave Airlie2-7/+12
Linux 6.9-rc5 I've had a persistent msm failure on clang, and the fix is in fixes so just pull it back to fix that. Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-04-04drm/panfrost: Fix the error path in panfrost_mmu_map_fault_addr()Boris Brezillon1-4/+9
Subject: [PATCH] drm/panfrost: Fix the error path in panfrost_mmu_map_fault_addr() If some the pages or sgt allocation failed, we shouldn't release the pages ref we got earlier, otherwise we will end up with unbalanced get/put_pages() calls. We should instead leave everything in place and let the BO release function deal with extra cleanup when the object is destroyed, or let the fault handler try again next time it's called. Fixes: 187d2929206e ("drm/panfrost: Add support for GPU heap allocations") Cc: <stable@vger.kernel.org> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Co-developed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240105184624.508603-18-dmitry.osipenko@collabora.com
2024-03-28drm/panfrost: fix power transition timeout warningsChristian Hewitt1-3/+3
Increase the timeout value to prevent system logs on Amlogic boards flooding with power transition warnings: [ 13.047638] panfrost ffe40000.gpu: shader power transition timeout [ 13.048674] panfrost ffe40000.gpu: l2 power transition timeout [ 13.937324] panfrost ffe40000.gpu: shader power transition timeout [ 13.938351] panfrost ffe40000.gpu: l2 power transition timeout ... [39829.506904] panfrost ffe40000.gpu: shader power transition timeout [39829.507938] panfrost ffe40000.gpu: l2 power transition timeout [39949.508369] panfrost ffe40000.gpu: shader power transition timeout [39949.509405] panfrost ffe40000.gpu: l2 power transition timeout The 2000 value has been found through trial and error testing with devices using G52 and G31 GPUs. Fixes: 22aa1a209018 ("drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()") Signed-off-by: Christian Hewitt <christianshewitt@gmail.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240322164525.2617508-1-christianshewitt@gmail.com
2024-03-25drm/panfrost: Only display fdinfo's engine and cycle tags when profiling is onAdrián Larumbe1-4/+6
If job accounting is disabled, then both fdinfo's drm-engine and drm-cycle key values will remain immutable. In that case, it makes more sense not to display them at all to avoid confusing user space profiling tools. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240316231306.293817-1-adrian.larumbe@collabora.com
2024-03-11drm/panfrost: Replace fdinfo's profiling debugfs knob with sysfsAdrián Larumbe6-44/+37
Debugfs isn't always available in production builds that try to squeeze every single byte out of the kernel image, but we still need a way to toggle the timestamp and cycle counter registers so that jobs can be profiled for fdinfo's drm engine and cycle calculations. Drop the debugfs knob and replace it with a sysfs file that accomplishes the same functionality, and document its ABI in a separate file. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240306015819.822128-2-adrian.larumbe@collabora.com
2023-12-12Backmerge tag 'v6.7-rc5' into drm-nextDave Airlie2-3/+16
Linux 6.7-rc5 Alex requested this for some amdkfd work relying on the symbols exports. Signed-off-by: Dave Airlie <airlied@redhat.com>
2023-12-05drm/panfrost: Synchronize and disable interrupts before powering offAngeloGioacchino Del Regno8-9/+71
To make sure that we don't unintentionally perform any unclocked and/or unpowered R/W operation on GPU registers, before turning off clocks and regulators we must make sure that no GPU, JOB or MMU ISR execution is pending: doing that requires to add a mechanism to synchronize the interrupts on suspend. Add functions panfrost_{gpu,job,mmu}_suspend_irq() which will perform interrupts masking and ISR execution synchronization, and then call those in the panfrost_device_runtime_suspend() handler in the exact sequence of job (may require mmu!) -> mmu -> gpu. As a side note, JOB and MMU suspend_irq functions needed some special treatment: as their interrupt handlers will unmask interrupts, it was necessary to add an `is_suspended` bitmap which is used to address the possible corner case of unintentional IRQ unmasking because of ISR execution after a call to synchronize_irq(). At resume, clear each is_suspended bit in the reset path of JOB/MMU to allow unmasking the interrupts. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231204114215.54575-4-angelogioacchino.delregno@collabora.com
2023-12-05drm/panfrost: Add gpu_irq, mmu_irq to struct panfrost_deviceAngeloGioacchino Del Regno3-10/+12
In preparation for adding a IRQ synchronization mechanism for PM suspend, add gpu_irq and mmu_irq variables to struct panfrost_device and change functions panfrost_gpu_init() and panfrost_mmu_init() to use those. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231204114215.54575-3-angelogioacchino.delregno@collabora.com
2023-12-05drm/panfrost: Ignore core_mask for poweroff and disable PWRTRANS irqAngeloGioacchino Del Regno1-4/+8
Some SoCs may be equipped with a GPU containing two core groups and this is exactly the case of Samsung's Exynos 5422 featuring an ARM Mali-T628 MP6 GPU: the support for this GPU in Panfrost is partial, as this driver currently supports using only one core group and that's reflected on all parts of it, including the power on (and power off, previously to this patch) function. The issue with this is that even though executing the soft reset operation should power off all cores unconditionally, on at least one platform we're seeing a crash that seems to be happening due to an interrupt firing which may be because we are calling power transition only on the first core group, leaving the second one unchanged, or because ISR execution was pending before entering the panfrost_gpu_power_off() function and executed after powering off the GPU cores, or all of the above. Finally, solve this by: - Avoid to enable the power transition interrupt on reset; and - Ignoring the core_mask and ask the GPU to poweroff both core groups Fixes: 22aa1a209018 ("drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()") Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231204114215.54575-2-angelogioacchino.delregno@collabora.com
2023-11-30drm/panfrost: Fix incorrect updating of current device frequencyAdrián Larumbe1-2/+15
It was noticed when setting the Panfrost's DVFS device to the performant governor, GPU frequency as reported by fdinfo had dropped to 0 permamently. There are two separate issues causing this behaviour: - Not initialising the device's current_frequency variable to its original value during device probe(). - Updating said variable in Panfrost devfreq's get_dev_status() rather than after the new OPP's frequency had been retrieved in target(), which meant the old frequency would be assigned instead. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Fixes: f11b0417eec2 ("drm/panfrost: Add fdinfo support GPU load metrics") Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231125205438.375407-3-adrian.larumbe@collabora.com
2023-11-30drm/panfrost: Consider dma-buf imported objects as residentAdrián Larumbe1-1/+1
A GEM object constructed from a dma-buf imported sgtable should be regarded as being memory resident, because the dma-buf API mandates backing storage to be allocated when attachment succeeds. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Fixes: 9ccdac7aa822 ("drm/panfrost: Add fdinfo support for memory stats") Reported-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231125205438.375407-2-adrian.larumbe@collabora.com
2023-11-15Merge drm/drm-next into drm-misc-nextMaxime Ripard4-15/+24
Let's kickstart the v6.8 release cycle. Signed-off-by: Maxime Ripard <mripard@kernel.org>
2023-11-10drm/panfrost: Set regulators on/off during system sleep on MediaTek SoCsAngeloGioacchino Del Regno1-3/+3
All of the MediaTek SoCs supported by Panfrost can completely cut power to the GPU during full system sleep without any user-noticeable delay in the resume operation, as shown by measurements taken on multiple MediaTek SoCs (MT8183/86/92/95). As an example, for MT8195 - a "before" with only runtime PM operations (so, without turning on/off regulators), and an "after" executing both the system sleep .resume() handler and .runtime_resume() (so the time refers to T_Resume + T_Runtime_Resume): Average Panfrost-only system sleep resume time, before: ~33500ns Average Panfrost-only system sleep resume time, after: ~336200ns Keep in mind that this additional ~308200 nanoseconds delay happens only in resume from a full system suspend, and not in runtime PM operations, hence it is acceptable. Measurements were also taken on MT8186, showing a delay of ~312000 ns. Testing of this happened on all of the aforementioned MediaTek SoCs, but: MT8183 got tested only by KernelCI with <=10 suspend/resume cycles MT8186, MT8192, MT8195 were tested manually with over 100 suspend/resume cycles with GNOME DE (Mutter + Wayland). Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231109102543.42971-7-angelogioacchino.delregno@collabora.com
2023-11-10drm/panfrost: Implement ability to turn on/off regulators in suspendAngeloGioacchino Del Regno2-1/+20
Some platforms/SoCs can power off the GPU entirely by completely cutting off power, greatly enhancing battery time during system suspend: add a new pm_feature GPU_PM_VREG_OFF to allow turning off the GPU regulators during full suspend only on selected platforms. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231109102543.42971-6-angelogioacchino.delregno@collabora.com
2023-11-10drm/panfrost: Set clocks on/off during system sleep on MediaTek SoCsAngeloGioacchino Del Regno1-0/+3
All of the MediaTek SoCs supported by Panfrost can switch the clocks off and on during system sleep to save some power without any user experience penalty. Measurements taken on multiple MediaTek SoCs (MT8183/8186/8192/8195) show that adding this will not prolong the time that is required to resume the system in any meaningful way. As an example, for MT8195 - a "before" with only runtime PM operations (so, without turning on/off GPU clocks), and an "after" executing both the system sleep .resume() handler and .runtime_resume() (so the time refers to T_Resume + T_Runtime_Resume): Average Panfrost-only system sleep resume time, before: ~28000ns Average Panfrost-only system sleep resume time, after: ~33500ns Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231109102543.42971-5-angelogioacchino.delregno@collabora.com
2023-11-10drm/panfrost: Implement ability to turn on/off GPU clocks in suspendAngeloGioacchino Del Regno2-4/+68
Currently, the GPU is being internally powered off for runtime suspend and turned back on for runtime resume through commands sent to it, but note that the GPU doesn't need to be clocked during the poweroff state, hence it is possible to save some power on selected platforms. Add suspend and resume handlers for full system sleep and then add a new panfrost_gpu_pm enumeration and a pm_features variable in the panfrost_compatible structure: BIT(GPU_PM_CLK_DIS) will be used to enable this power saving technique only on SoCs that are able to safely use it. Note that this was implemented only for the system sleep case and not for runtime PM because testing on one of my MediaTek platforms showed issues when turning on and off clocks aggressively (in PM runtime) resulting in a full system lockup. Doing this only for full system sleep never showed issues during my testing by suspending and resuming the system continuously for more than 100 cycles. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231109102543.42971-4-angelogioacchino.delregno@collabora.com
2023-11-10drm/panfrost: Tighten polling for soft reset and power onAngeloGioacchino Del Regno1-4/+4
In many cases, soft reset takes more than 1 microsecond, but definitely less than 10; moreover in the poweron flow, tilers, shaders and l2 will become ready (each) in less than 10 microseconds as well. Even in the cases (at least on my platforms, rarely) in which those take more than 10 microseconds, it's very unlikely to see both soft reset and poweron to take more than 70 microseconds. Shorten the polling delay to 10 microseconds to consistently reduce the runtime resume time of the GPU. As an indicative example, measurements taken on a MediaTek MT8195 SoC Average runtime resume time in nanoseconds before this commit: GDM, user selection up/down: 88435ns GDM, Text Entry (typing user/password): 91489ns GNOME Desktop, idling, GKRELLM running: 73200ns After this commit: GDM: user selection up/down: 26690ns GDM: Text Entry (typing user/password): 27917ns GNOME Desktop, idling, GKRELLM running: 25304ns Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231109102543.42971-3-angelogioacchino.delregno@collabora.com
2023-11-10drm/panfrost: Perform hard reset to recover GPU if soft reset failsAngeloGioacchino Del Regno2-3/+11
Even though soft reset should ideally never fail, during development of some power management features I managed to get some bits wrong: this resulted in GPU soft reset failures, where the GPU was never able to recover, not even after suspend/resume cycles, meaning that the only way to get functionality back was to reboot the machine. Perform a hard reset after a soft reset failure to be able to recover the GPU during runtime (so, without any machine reboot). Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231109102543.42971-2-angelogioacchino.delregno@collabora.com
2023-11-10drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()AngeloGioacchino Del Regno1-18/+46
The layout of the registers {TILER,SHADER,L2}_PWROFF_LO, used to request powering off cores, is the same as the {TILER,SHADER,L2}_PWRON_LO ones: this means that in order to request poweroff of cores, we are supposed to write a bitmask of cores that should be powered off! This means that the panfrost_gpu_power_off() function has always been doing nothing. Fix powering off the GPU by writing a bitmask of the cores to poweroff to the relevant PWROFF_LO registers and then check that the transition (from ON to OFF) has finished by polling the relevant PWRTRANS_LO registers. While at it, in order to avoid code duplication, move the core mask logic from panfrost_gpu_power_on() to a new panfrost_get_core_mask() function, used in both poweron and poweroff. Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver") Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231102141507.73481-1-angelogioacchino.delregno@collabora.com
2023-11-10drm/sched: implement dynamic job-flow controlDanilo Krummrich2-2/+2
Currently, job flow control is implemented simply by limiting the number of jobs in flight. Therefore, a scheduler is initialized with a credit limit that corresponds to the number of jobs which can be sent to the hardware. This implies that for each job, drivers need to account for the maximum job size possible in order to not overflow the ring buffer. However, there are drivers, such as Nouveau, where the job size has a rather large range. For such drivers it can easily happen that job submissions not even filling the ring by 1% can block subsequent submissions, which, in the worst case, can lead to the ring run dry. In order to overcome this issue, allow for tracking the actual job size instead of the number of jobs. Therefore, add a field to track a job's credit count, which represents the number of credits a job contributes to the scheduler's credit limit. Signed-off-by: Danilo Krummrich <dakr@redhat.com> Reviewed-by: Luben Tuikov <ltuikov89@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231110001638.71750-1-dakr@redhat.com
2023-11-02Merge tag 'mm-stable-2023-11-01-14-33' of ↵Linus Torvalds4-15/+24
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Many singleton patches against the MM code. The patch series which are included in this merge do the following: - Kemeng Shi has contributed some compation maintenance work in the series 'Fixes and cleanups to compaction' - Joel Fernandes has a patchset ('Optimize mremap during mutual alignment within PMD') which fixes an obscure issue with mremap()'s pagetable handling during a subsequent exec(), based upon an implementation which Linus suggested - More DAMON/DAMOS maintenance and feature work from SeongJae Park i the following patch series: mm/damon: misc fixups for documents, comments and its tracepoint mm/damon: add a tracepoint for damos apply target regions mm/damon: provide pseudo-moving sum based access rate mm/damon: implement DAMOS apply intervals mm/damon/core-test: Fix memory leaks in core-test mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval - In the series 'Do not try to access unaccepted memory' Adrian Hunter provides some fixups for the recently-added 'unaccepted memory' feature. To increase the feature's checking coverage. 'Plug a few gaps where RAM is exposed without checking if it is unaccepted memory' - In the series 'cleanups for lockless slab shrink' Qi Zheng has done some maintenance work which is preparation for the lockless slab shrinking code - Qi Zheng has redone the earlier (and reverted) attempt to make slab shrinking lockless in the series 'use refcount+RCU method to implement lockless slab shrink' - David Hildenbrand contributes some maintenance work for the rmap code in the series 'Anon rmap cleanups' - Kefeng Wang does more folio conversions and some maintenance work in the migration code. Series 'mm: migrate: more folio conversion and unification' - Matthew Wilcox has fixed an issue in the buffer_head code which was causing long stalls under some heavy memory/IO loads. Some cleanups were added on the way. Series 'Add and use bdev_getblk()' - In the series 'Use nth_page() in place of direct struct page manipulation' Zi Yan has fixed a potential issue with the direct manipulation of hugetlb page frames - In the series 'mm: hugetlb: Skip initialization of gigantic tail struct pages if freed by HVO' has improved our handling of gigantic pages in the hugetlb vmmemmep optimizaton code. This provides significant boot time improvements when significant amounts of gigantic pages are in use - Matthew Wilcox has sent the series 'Small hugetlb cleanups' - code rationalization and folio conversions in the hugetlb code - Yin Fengwei has improved mlock()'s handling of large folios in the series 'support large folio for mlock' - In the series 'Expose swapcache stat for memcg v1' Liu Shixin has added statistics for memcg v1 users which are available (and useful) under memcg v2 - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable) prctl so that userspace may direct the kernel to not automatically propagate the denial to child processes. The series is named 'MDWE without inheritance' - Kefeng Wang has provided the series 'mm: convert numa balancing functions to use a folio' which does what it says - In the series 'mm/ksm: add fork-exec support for prctl' Stefan Roesch makes is possible for a process to propagate KSM treatment across exec() - Huang Ying has enhanced memory tiering's calculation of memory distances. This is used to permit the dax/kmem driver to use 'high bandwidth memory' in addition to Optane Data Center Persistent Memory Modules (DCPMM). The series is named 'memory tiering: calculate abstract distance based on ACPI HMAT' - In the series 'Smart scanning mode for KSM' Stefan Roesch has optimized KSM by teaching it to retain and use some historical information from previous scans - Yosry Ahmed has fixed some inconsistencies in memcg statistics in the series 'mm: memcg: fix tracking of pending stats updates values' - In the series 'Implement IOCTL to get and optionally clear info about PTEs' Peter Xu has added an ioctl to /proc/<pid>/pagemap which permits us to atomically read-then-clear page softdirty state. This is mainly used by CRIU - Hugh Dickins contributed the series 'shmem,tmpfs: general maintenance', a bunch of relatively minor maintenance tweaks to this code - Matthew Wilcox has increased the use of the VMA lock over file-backed page faults in the series 'Handle more faults under the VMA lock'. Some rationalizations of the fault path became possible as a result - In the series 'mm/rmap: convert page_move_anon_rmap() to folio_move_anon_rmap()' David Hildenbrand has implemented some cleanups and folio conversions - In the series 'various improvements to the GUP interface' Lorenzo Stoakes has simplified and improved the GUP interface with an eye to providing groundwork for future improvements - Andrey Konovalov has sent along the series 'kasan: assorted fixes and improvements' which does those things - Some page allocator maintenance work from Kemeng Shi in the series 'Two minor cleanups to break_down_buddy_pages' - In thes series 'New selftest for mm' Breno Leitao has developed another MM self test which tickles a race we had between madvise() and page faults - In the series 'Add folio_end_read' Matthew Wilcox provides cleanups and an optimization to the core pagecache code - Nhat Pham has added memcg accounting for hugetlb memory in the series 'hugetlb memcg accounting' - Cleanups and rationalizations to the pagemap code from Lorenzo Stoakes, in the series 'Abstract vma_merge() and split_vma()' - Audra Mitchell has fixed issues in the procfs page_owner code's new timestamping feature which was causing some misbehaviours. In the series 'Fix page_owner's use of free timestamps' - Lorenzo Stoakes has fixed the handling of new mappings of sealed files in the series 'permit write-sealed memfd read-only shared mappings' - Mike Kravetz has optimized the hugetlb vmemmap optimization in the series 'Batch hugetlb vmemmap modification operations' - Some buffer_head folio conversions and cleanups from Matthew Wilcox in the series 'Finish the create_empty_buffers() transition' - As a page allocator performance optimization Huang Ying has added automatic tuning to the allocator's per-cpu-pages feature, in the series 'mm: PCP high auto-tuning' - Roman Gushchin has contributed the patchset 'mm: improve performance of accounted kernel memory allocations' which improves their performance by ~30% as measured by a micro-benchmark - folio conversions from Kefeng Wang in the series 'mm: convert page cpupid functions to folios' - Some kmemleak fixups in Liu Shixin's series 'Some bugfix about kmemleak' - Qi Zheng has improved our handling of memoryless nodes by keeping them off the allocation fallback list. This is done in the series 'handle memoryless nodes more appropriately' - khugepaged conversions from Vishal Moola in the series 'Some khugepaged folio conversions'" [ bcachefs conflicts with the dynamically allocated shrinkers have been resolved as per Stephen Rothwell in https://lore.kernel.org/all/20230913093553.4290421e@canb.auug.org.au/ with help from Qi Zheng. The clone3 test filtering conflict was half-arsed by yours truly ] * tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (406 commits) mm/damon/sysfs: update monitoring target regions for online input commit mm/damon/sysfs: remove requested targets when online-commit inputs selftests: add a sanity check for zswap Documentation: maple_tree: fix word spelling error mm/vmalloc: fix the unchecked dereference warning in vread_iter() zswap: export compression failure stats Documentation: ubsan: drop "the" from article title mempolicy: migration attempt to match interleave nodes mempolicy: mmap_lock is not needed while migrating folios mempolicy: alloc_pages_mpol() for NUMA policy without vma mm: add page_rmappable_folio() wrapper mempolicy: remove confusing MPOL_MF_LAZY dead code mempolicy: mpol_shared_policy_init() without pseudo-vma mempolicy trivia: use pgoff_t in shared mempolicy tree mempolicy trivia: slightly more consistent naming mempolicy trivia: delete those ancient pr_debug()s mempolicy: fix migrate_pages(2) syscall return nr_failed kernfs: drop shared NUMA mempolicy hooks hugetlbfs: drop shared NUMA mempolicy pretence mm/damon/sysfs-test: add a unit test for damon_sysfs_set_targets() ...
2023-11-01drm/sched: Convert drm scheduler to use a work queue rather than kthreadMatthew Brost1-1/+1
In Xe, the new Intel GPU driver, a choice has made to have a 1 to 1 mapping between a drm_gpu_scheduler and drm_sched_entity. At first this seems a bit odd but let us explain the reasoning below. 1. In Xe the submission order from multiple drm_sched_entity is not guaranteed to be the same completion even if targeting the same hardware engine. This is because in Xe we have a firmware scheduler, the GuC, which allowed to reorder, timeslice, and preempt submissions. If a using shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls apart as the TDR expects submission order == completion order. Using a dedicated drm_gpu_scheduler per drm_sched_entity solve this problem. 2. In Xe submissions are done via programming a ring buffer (circular buffer), a drm_gpu_scheduler provides a limit on number of jobs, if the limit of number jobs is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow control on the ring for free. A problem with this design is currently a drm_gpu_scheduler uses a kthread for submission / job cleanup. This doesn't scale if a large number of drm_gpu_scheduler are used. To work around the scaling issue, use a worker rather than kthread for submission / job cleanup. v2: - (Rob Clark) Fix msm build - Pass in run work queue v3: - (Boris) don't have loop in worker v4: - (Tvrtko) break out submit ready, stop, start helpers into own patch v5: - (Boris) default to ordered work queue v6: - (Luben / checkpatch) fix alignment in msm_ringbuffer.c - (Luben) s/drm_sched_submit_queue/drm_sched_wqueue_enqueue - (Luben) Update comment for drm_sched_wqueue_enqueue - (Luben) Positive check for submit_wq in drm_sched_init - (Luben) s/alloc_submit_wq/own_submit_wq v7: - (Luben) s/drm_sched_wqueue_enqueue/drm_sched_run_job_queue v8: - (Luben) Adjust var names / comments Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://lore.kernel.org/r/20231031032439.1558703-3-matthew.brost@intel.com Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
2023-10-30drm/panfrost: Remove incorrect IS_ERR() checkSteven Price1-10/+2
sg_page_iter_page() doesn't return an error code, so the IS_ERR() check is wrong and the error path will never be executed. This also allows simplifying the code to remove the local variable 'page'. CC: Adrián Larumbe <adrian.larumbe@collabora.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/376713ff-9a4f-4ea3-b097-fb5efb685d95@moroto.mountain Signed-off-by: Steven Price <steven.price@arm.com> Reviewed-by: Adrián Larumbe <adrian.larumbe@collabora.com> Tested-by: Adrián Larumbe <adrian.larumbe@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231020104405.53992-1-steven.price@arm.com
2023-10-26drm/sched: Convert the GPU scheduler to variable number of run-queuesLuben Tuikov1-0/+1
The GPU scheduler has now a variable number of run-queues, which are set up at drm_sched_init() time. This way, each driver announces how many run-queues it requires (supports) per each GPU scheduler it creates. Note, that run-queues correspond to scheduler "priorities", thus if the number of run-queues is set to 1 at drm_sched_init(), then that scheduler supports a single run-queue, i.e. single "priority". If a driver further sets a single entity per run-queue, then this creates a 1-to-1 correspondence between a scheduler and a scheduled entity. Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Russell King <linux+etnaviv@armlinux.org.uk> Cc: Qiang Yu <yuq825@gmail.com> Cc: Rob Clark <robdclark@gmail.com> Cc: Abhinav Kumar <quic_abhinavk@quicinc.com> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Danilo Krummrich <dakr@redhat.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Emma Anholt <emma@anholt.net> Cc: etnaviv@lists.freedesktop.org Cc: lima@lists.freedesktop.org Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20231023032251.164775-1-luben.tuikov@amd.com
2023-10-11Merge drm/drm-next into drm-misc-nextThomas Zimmermann1-1/+1
Updating drm-misc-next to the state of Linux v6.6-rc2. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2023-10-04drm/panfrost: dynamically allocate the drm-panfrost shrinkerQi Zheng4-15/+24
In preparation for implementing lockless slab shrink, use new APIs to dynamically allocate the drm-panfrost shrinker, so that it can be freed asynchronously via RCU. Then it doesn't need to wait for RCU read-side critical section when releasing the struct panfrost_device. Link: https://lkml.kernel.org/r/20230911094444.68966-23-zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: Steven Price <steven.price@arm.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Rob Herring <robh@kernel.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: David Airlie <airlied@gmail.com> Cc: Abhinav Kumar <quic_abhinavk@quicinc.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Anna Schumaker <anna@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Carlos Llamas <cmllamas@google.com> Cc: Chandan Babu R <chandan.babu@oracle.com> Cc: Chao Yu <chao@kernel.org> Cc: Chris Mason <clm@fb.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Chuck Lever <cel@kernel.org> Cc: Coly Li <colyli@suse.de> Cc: Dai Ngo <Dai.Ngo@oracle.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: "Darrick J. Wong" <djwong@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Sterba <dsterba@suse.com> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Gao Xiang <hsiangkao@linux.alibaba.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Huang Rui <ray.huang@amd.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Wang <jasowang@redhat.com> Cc: Jeff Layton <jlayton@kernel.org> Cc: Jeffle Xu <jefflexu@linux.alibaba.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Kirill Tkhai <tkhai@ya.ru> Cc: Marijn Suijten <marijn.suijten@somainline.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Mike Snitzer <snitzer@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Nadav Amit <namit@vmware.com> Cc: Neil Brown <neilb@suse.de> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Cc: Olga Kornievskaia <kolga@netapp.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rob Clark <robdclark@gmail.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Sean Paul <sean@poorly.run> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Song Liu <song@kernel.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Talpey <tom@talpey.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Yue Hu <huyue2@coolpad.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-04drm/panfrost: Implement generic DRM object RSS reporting functionAdrián Larumbe3-0/+21
BO's RSS is updated every time new pages are allocated on demand and mapped for the object at GPU page fault's IRQ handler, but only for heap buffers. The reason this is unnecessary for non-heap buffers is that they are mapped onto the GPU's VA space and backed by physical memory in their entirety at BO creation time. This calculation is unnecessary for imported PRIME objects, since heap buffers cannot be exported by our driver, and the actual BO RSS size is the one reported in its attached dmabuf structure. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230929181616.2769345-6-adrian.larumbe@collabora.com
2023-10-04drm/panfrost: Add fdinfo support for memory statsAdrián Larumbe2-0/+17
A new DRM GEM object function is added so that drm_show_memory_stats can provide more accurate memory usage numbers. Ideally, in panfrost_gem_status, the BO's purgeable flag would be checked after locking the driver's shrinker mutex, but drm_show_memory_stats takes over the drm file's object handle database spinlock, so there's potential for a race condition here. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230929181616.2769345-4-adrian.larumbe@collabora.com
2023-10-04drm/panfrost: Add fdinfo support GPU load metricsAdrián Larumbe12-1/+194
The drm-stats fdinfo tags made available to user space are drm-engine, drm-cycles, drm-max-freq and drm-curfreq, one per job slot. This deviates from standard practice in other DRM drivers, where a single set of key:value pairs is provided for the whole render engine. However, Panfrost has separate queues for fragment and vertex/tiler jobs, so a decision was made to calculate bus cycles and workload times separately. Maximum operating frequency is calculated at devfreq initialisation time. Current frequency is made available to user space because nvtop uses it when performing engine usage calculations. It is important to bear in mind that both GPU cycle and kernel time numbers provided are at best rough estimations, and always reported in excess from the actual figure because of two reasons: - Excess time because of the delay between the end of a job processing, the subsequent job IRQ and the actual time of the sample. - Time spent in the engine queue waiting for the GPU to pick up the next job. To avoid race conditions during enablement/disabling, a reference counting mechanism was introduced, and a job flag that tells us whether a given job increased the refcount. This is necessary, because user space can toggle cycle counting through a debugfs file, and a given job might have been in flight by the time cycle counting was disabled. The main goal of the debugfs cycle counter knob is letting tools like nvtop or IGT's gputop switch it at any time, to avoid power waste in case no engine usage measuring is necessary. Also add a documentation file explaining the possible values for fdinfo's engine keystrings and Panfrost-specific drm-curfreq-<keystr> pairs. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230929181616.2769345-3-adrian.larumbe@collabora.com
2023-10-04drm/panfrost: Add cycle count GPU register definitionsAdrián Larumbe1-0/+5
These GPU registers will be used when programming the cycle counter, which we need for providing accurate fdinfo drm-cycles values to user space. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230929181616.2769345-2-adrian.larumbe@collabora.com
2023-09-22Merge tag 'drm-misc-next-2023-09-11-1' of ↵Dave Airlie3-6/+6
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.7-rc1: UAPI Changes: - Nouveau changed to not set NO_PREFETCH flag explicitly. Cross-subsystem Changes: - Update documentation of dma-buf intro and uapi. - fbdev/sbus fixes. - Use initializer macros in a lot of fbdev drivers. - Add Boris Brezillon as Panfrost driver maintainer. - Add Jessica Zhang as drm/panel reviewer. - Make more fbdev drivers use fb_ops helpers for deferred io. - Small hid trailing whitespace fix. - Use fb_ops in hid/picolcd Core Changes: - Assorted small fixes to ttm tests, drm/mst. - Documentation updates to bridge. - Add kunit tests for some drm_fb functions. - Rework drm_debugfs implementation. - Update xe documentation to mark todos as completed. Driver Changes: - Add support to rockchip for rv1126 mipi-dsi and vop. - Assorted small fixes to nouveau, bridge/samsung-dsim, bridge/lvds-codec, loongson, rockchip, panfrost, gma500, repaper, komeda, virtio, ssd130x. - Add support for simple panels Mitsubishi AA084XE01, JDI LPM102A188A, - Documentation updates to accel/ivpu. - Some nouveau scheduling/fence fixes. - Power management related fixes and other fixes to ivpu. - Assorted bridge/it66121 fixes. - Make platform drivers return void in remove() callback. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/3da6554b-3b47-fe7d-c4ea-21f4f819dbb6@linux.intel.com
2023-08-30Merge tag 'drm-next-2023-08-30' of git://anongit.freedesktop.org/drm/drmLinus Torvalds4-20/+39
Pull drm updates from Dave Airlie: "The drm core grew a new generic gpu virtual address manager, and new execution locking helpers. These are used by nouveau now to provide uAPI support for the userspace Vulkan driver. AMD had a bunch of new IP core support, loads of refactoring around fbdev, but mostly just the usual amount of stuff across the board. core: - fix gfp flags in drmm_kmalloc gpuva: - add new generic GPU VA manager (for nouveau initially) syncobj: - add new DRM_IOCTL_SYNCOBJ_EVENTFD ioctl dma-buf: - acquire resv lock for mmap() in exporters - support dma-buf self import automatically - docs fixes backlight: - fix fbdev interactions atomic: - improve logging prime: - remove struct gem_prim_mmap plus driver updates gem: - drm_exec: add locking over multiple GEM objects - fix lockdep checking fbdev: - make fbdev userspace interfaces optional - use linux device instead of fbdev device - use deferred i/o helper macros in various drivers - Make FB core selectable without drivers - Remove obsolete flags FBINFO_DEFAULT and FBINFO_FLAG_DEFAULT - Add helper macros and Kconfig tokens for DMA-allocated framebuffer ttm: - support init_on_free - swapout fixes panel: - panel-edp: Support AUO B116XAB01.4 - Support Visionox R66451 plus DT bindings - ld9040: - Backlight support - magic improved - Kconfig fix - Convert to of_device_get_match_data() - Fix Kconfig dependencies - simple: - Set bpc value to fix warning - Set connector type for AUO T215HVN01 - Support Innolux G156HCE-L01 plus DT bindings - ili9881: Support TDO TL050HDV35 LCD panel plus DT bindings - startek: Support KD070FHFID015 MIPI-DSI panel plus DT bindings - sitronix-st7789v: - Support Inanbo T28CP45TN89 plus DT bindings - Support EDT ET028013DMA plus DT bindings - Various cleanups - edp: Add timings for N140HCA-EAC - Allow panels and touchscreens to power sequence together - Fix Innolux G156HCE-L01 LVDS clock bridge: - debugfs for chains support - dw-hdmi: - Improve support for YUV420 bus format - CEC suspend/resume - update EDID on HDMI detect - dw-mipi-dsi: Fix enable/disable of DSI controller - lt9611uxc: Use MODULE_FIRMWARE() - ps8640: Remove broken EDID code - samsung-dsim: Fix command transfer - tc358764: - Handle HS/VS polarity - Use BIT() macro - Various cleanups - adv7511: Fix low refresh rate - anx7625: - Switch to macros instead of hardcoded values - locking fixes - tc358767: fix hardware delays - sitronix-st7789v: - Support panel orientation - Support rotation property - Add support for Jasonic JT240MHQS-HWT-EK-E3 plus DT bindings amdgpu: - SDMA 6.1.0 support - HDP 6.1 support - SMUIO 14.0 support - PSP 14.0 support - IH 6.1 support - Lots of checkpatch cleanups - GFX 9.4.3 updates - Add USB PD and IFWI flashing documentation - GPUVM updates - RAS fixes - DRR fixes - FAMS fixes - Virtual display fixes - Soft IH fixes - SMU13 fixes - Rework PSP firmware loading for other IPs - Kernel doc fixes - DCN 3.0.1 fixes - LTTPR fixes - DP MST fixes - DCN 3.1.6 fixes - SMU 13.x fixes - PSP 13.x fixes - SubVP fixes - GC 9.4.3 fixes - Display bandwidth calculation fixes - VCN4 secure submission fixes - Allow building DC on RISC-V - Add visible FB info to bo_print_info - HBR3 fixes - GFX9 MCBP fix - GMC10 vmhub index fix - GMC11 vmhub index fix - Create a new doorbell manager - SR-IOV fixes - initial freesync panel replay support - revert zpos properly until igt regression is fixeed - use TTM to manage doorbell BAR - Expose both current and average power via hwmon if supported amdkfd: - Cleanup CRIU dma-buf handling - Use KIQ to unmap HIQ - GFX 9.4.3 debugger updates - GFX 9.4.2 debugger fixes - Enable cooperative groups fof gfx11 - SVM fixes - Convert older APUs to use dGPU path like newer APUs - Drop IOMMUv2 path as it is no longer used - TBA fix for aldebaran i915: - ICL+ DSI modeset sequence - HDCP improvements - MTL display fixes and cleanups - HSW/BDW PSR1 restored - Init DDI ports in VBT order - General display refactors - Start using plane scale factor for relative data rate - Use shmem for dpt objects - Expose RPS thresholds in sysfs - Apply GuC SLPC min frequency softlimit correctly - Extend Wa_14015795083 to TGL, RKL, DG1 and ADL - Fix a VMA UAF for multi-gt platform - Do not use stolen on MTL due to HW bug - Check HuC and GuC version compatibility on MTL - avoid infinite GPU waits due to premature release of request memory - Fixes and updates for GSC memory allocation - Display SDVO fixes - Take stolen handling out of FBC code - Make i915_coherent_map_type GT-centric - Simplify shmem_create_from_object map_type msm: - SM6125 MDSS support - DPU: SM6125 DPU support - DSI: runtime PM support, burst mode support - DSI PHY: SM6125 support in 14nm DSI PHY driver - GPU: prepare for a7xx - fix a690 firmware - disable relocs on a6xx and newer radeon: - Lots of checkpatch cleanups ast: - improve device-model detection - Represent BMV as virtual connector - Report DP connection status nouveau: - add new exec/bind interface to support Vulkan - document some getparam ioctls - improve VRAM detection - various fixes/cleanups - workraound DPCD issues ivpu: - MMU updates - debugfs support - Support vpu4 virtio: - add sync object support atmel-hlcdc: - Support inverted pixclock polarity etnaviv: - runtime PM cleanups - hang handling fixes exynos: - use fbdev DMA helpers - fix possible NULL ptr dereference komeda: - always attach encoder omapdrm: - use fbdev DMA helpers ingenic: - kconfig regmap fixes loongson: - support display controller mediatek: - Small mtk-dpi cleanups - DisplayPort: support eDP and aux-bus - Fix coverity issues - Fix potential memory leak if vmap() fail mgag200: - minor fixes mxsfb: - support disabling overlay planes panfrost: - fix sync in IRQ handling ssd130x: - Support per-controller default resolution plus DT bindings - Reduce memory-allocation overhead - Improve intermediate buffer size computation - Fix allocation of temporary buffers - Fix pitch computation - Fix shadow plane allocation tegra: - use fbdev DMA helpers - Convert to devm_platform_ioremap_resource() - support bridge/connector - enable PM tidss: - Support TI AM625 plus DT bindings - Implement new connector model plus driver updates vkms: - improve write back support - docs fixes - support gamma LUT zynqmp-dpsub: - misc fixes" * tag 'drm-next-2023-08-30' of git://anongit.freedesktop.org/drm/drm: (1327 commits) drm/gpuva_mgr: remove unused prev pointer in __drm_gpuva_sm_map() drm/tests/drm_kunit_helpers: Place correct function name in the comment header drm/nouveau: uapi: don't pass NO_PREFETCH flag implicitly drm/nouveau: uvmm: fix unset region pointer on remap drm/nouveau: sched: avoid job races between entities drm/i915: Fix HPD polling, reenabling the output poll work as needed drm: Add an HPD poll helper to reschedule the poll work drm/i915: Fix TLB-Invalidation seqno store drm/ttm/tests: Fix type conversion in ttm_pool_test drm/msm/a6xx: Bail out early if setting GPU OOB fails drm/msm/a6xx: Move LLC accessors to the common header drm/msm/a6xx: Introduce a6xx_llc_read drm/ttm/tests: Require MMU when testing drm/panel: simple: Fix Innolux G156HCE-L01 LVDS clock Revert "Revert "drm/amdgpu/display: change pipe policy for DCN 2.0"" drm/amdgpu: Add memory vendor information drm/amd: flush any delayed gfxoff on suspend entry drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix drm/amdgpu: Remove gfxoff check in GFX v9.4.3 drm/amd/pm: Update pci link speed for smu v13.0.6 ...
2023-08-21drm/panfrost: Skip speed binning on EOPNOTSUPPDavid Michael1-1/+1
Encountered on an ARM Mali-T760 MP4, attempting to read the nvmem variable can also return EOPNOTSUPP instead of ENOENT when speed binning is unsupported. Cc: <stable@vger.kernel.org> Fixes: 7d690f936e9b ("drm/panfrost: Add basic support for speed binning") Signed-off-by: David Michael <fedora.dm0@gmail.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/87msyryd7y.fsf@gmail.com
2023-08-21drm/panfrost: Do not check for 0 return after calling platform_get_irq_byname()Ruan Jinjie3-6/+6
It is not possible for platform_get_irq_byname() to return 0. Use the return value from platform_get_irq_byname(). Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230803040401.3067484-2-ruanjinjie@huawei.com