-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DCP: Initial support for overlay blending #304
base: bits/200-dcp
Are you sure you want to change the base?
DCP: Initial support for overlay blending #304
Conversation
@@ -318,16 +382,29 @@ static int apple_probe_per_dcp(struct device *dev, | |||
struct apple_crtc *crtc; | |||
struct apple_connector *connector; | |||
struct apple_encoder *enc; | |||
struct drm_plane *primary; | |||
int ret; | |||
struct drm_plane *planes[DCP_MAX_PLANES]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we only need a single plane pointer (since we do not support a cursor plane)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need the pointers to set the immutable zpos property in a sane way.
if (IS_ERR(primary)) | ||
return PTR_ERR(primary); | ||
/* Set up our other planes */ | ||
for (i = 1; i < DCP_MAX_PLANES; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there any advantage of offering more than overlay plane if we can blend only 2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason I did it like this instead of hard-coding two planes was to make it easier to extend if/when we figure out how to get all four surfaces to play nice somehow. I can revert back to hard coded planes if you don't think there's value in doing it like this
* Plane order is nondeterministic for this iterator. DCP will | ||
* almost always crash at some point if the z order of planes | ||
* flip-flops around. Make sure we are always blending them | ||
* in the correct order. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems strange and I wouldn't expect that changing the order is the cause of the crash but something else. We of course need a deterministic order for proper display.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be a race leaving a dangling pointer somewhere from all the surfaces swapping order, but I definitely did isolate the underlying cause to the fact that they were swapping order. If you leave everything as-is and comment out normalize_zpos
it will happen.
* having 4 surfaces, we can only blend two. Surface 0 is also | ||
* unusable on some machines, so ignore it. | ||
*/ | ||
l = MAX_BLEND_SURFACES - new_state->normalized_zpos; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A better approach with the current state of supported planes and formats would be to hard-code l
based on plane type
25cb4ec
to
db6d1c7
Compare
Rebased and addressed feedback |
db6d1c7
to
dfcb592
Compare
dfcb592
to
933cb59
Compare
83a2784
to
15b881f
Compare
ccc12d2
to
24c1838
Compare
933cb59
to
35b66c6
Compare
Rebased on top of |
Add a DRM/KMS driver for Apple system on chips using the DCP coprocessor, namely the Apple M1. The DCP was added in Apple A14; this driver does not apply to older iDevices. This driver targets the DCP firmware API shipped by macOS 12.1. Currently no incompatibilities with macOS 12.0.1 or 12.2.1 are known. Signed-off-by: Alyssa Rosenzweig <[email protected]> Co-developed-by: Janne Grunau <[email protected]> Signed-off-by: Janne Grunau <[email protected]>
Link: https://oftc.irclog.whitequark.org/asahi-dev/2022-03-20#30747564 Link: https://oftc.irclog.whitequark.org/asahi-dev/2022-03-20#30747570 Signed-off-by: Alyssa Rosenzweig <[email protected]> Signed-off-by: Janne Grunau <[email protected]>
Signed-off-by: Janne Grunau <[email protected]>
Xorg startup with modesetting driver triggers this. Move vblank signalling to dcp to avoid a circular dependency between apple_drv and dcp. Signed-off-by: Janne Grunau <[email protected]>
Frequency differs between M1 and M1 Pro/Max/Ultra. Signed-off-by: Janne Grunau <[email protected]>
Signed-off-by: Janne Grunau <[email protected]>
Signed-off-by: Janne Grunau <[email protected]>
The framebuffer can be unreferenced by GEM while the display controller is still using it for scanout resulting in IOVA faults and crashed dcp. dcp has to hold a reference until the swap is complete. Signed-off-by: Janne Grunau <[email protected]>
Signed-off-by: Janne Grunau <[email protected]>
Use the firmware setPowerState callback. Signed-off-by: Alyssa Rosenzweig <[email protected]>
fixup! WIP: drm/apple: Add DCP display driver Signed-off-by: Janne Grunau <[email protected]>
Two callbacks for IOMFB::MemDescRelay seems to be dropped between 12.3 and 13.5 DCP firmware. This results in the renumbering of MemDescRelay::release_descriptor from D456 to D454. Noticed while when switching the display refresh rate to 50 Hz with a 14.5 system firmware on a M1 Max Macbook Pro. Signed-off-by: Janne Grunau <[email protected]>
The most likely cause for this is an unexpected callback form which the current driver doesn't recover. Warn only once about it. Signed-off-by: Janne Grunau <[email protected]>
Reduce log spam while errors are still likely due missing state checks.
Can be still enabled by adding `apple_dcp.hdmi_audio` the kernel command line. Signed-off-by: Janne Grunau <[email protected]>
Since we don't init/uses drm's vblank support our page flip timestamps are CLOCK_MONOTONIC timestamps during the event generation. Since compositors use the timestamp to schedule their next kms commit this is timing sensitive sop move it under the drivers control. Take the timestamp directly in the swap_complete callback. Framebuffer swaps are unfortunately not fast with DCP. Measured time from swap_submit to swap_complete is ~1.5 ms for dcp and ~2.3 ms for dcpext. This warrants further investigation. Presentation timestamps might help if delay on dcp firmware side occurs after the actual swap. In the meantime doctor the time stamps and move the page flip completion up to 1 ms earler. This fixes half rate refresh on external displays displays using dcpext. Signed-off-by: Janne Grunau <[email protected]>
Signed-off-by: Asahi Lina <[email protected]>
This works even before the DMA device probes. Might help deal with runtime-pm ordering, though it doesn't solve the deferred ordering problem (since we're creating the link while already probing)... Signed-off-by: Asahi Lina <[email protected]>
Allow the DMA device driver to probe late, and still create the sound device upfront. Instead try to request the DMA channel on first PCM open. This should be safe as long as we bail early and don't allow the process to continue to configuring buffers (since that requires the DMA to be configured). Signed-off-by: Asahi Lina <[email protected]>
Signed-off-by: Asahi Lina <[email protected]>
Signed-off-by: Sven Peter <[email protected]>
This appears to be lane count as "2" is observed for USB-C DP alt mode in shared DP/USB3 mode. Signed-off-by: Janne Grunau <[email protected]>
This unfortunately doesn't work relieably with typec-altmode-displayport since the oob hotplug notification arrives before atc-phy is configured to the appropiate DP mode. Signed-off-by: Janne Grunau <[email protected]>
Signed-off-by: Asahi Lina <[email protected]>
Fixes failure to unmap buffers in dcpep_cb_unmap_piodma() due to the unaligned size. Further along this causes kernel log splat when DCP tries to map the buffers again since thye IOVA is still in use. This causes no apparent issue although map_piodma callback signals an errror and returns 0 (unmapped as DVA). It's not clear why this presents only randomly. Possibly some build or uninitialized memory triggers this unmap/free and immediate allocate/map cycle in the DCP firmware. I never notices this with a clang-built kernel on j314c. It showed with gcc build with the Fedora config at least on 6.8.8 based kernels. This did not reproduce on j375d. Signed-off-by: Janne Grunau <[email protected]>
This reverts commit fa86f31. Signed-off-by: Janne Grunau <[email protected]>
The dpavserv endpoint uses various EPICProviderClass depending on the connected display. Observed values: - "AppleDCPAgileCDIDPDisplay" (j134c, dcp, panel) - "AppleDCPMCDP29XX" (j274, dcp, hdmi) - "AppleDCPPS190" (j474s, dcpext0, hdmi) - "DCPDPService" (j474s, dcpext1, typec) So match against against EPICName which is consistent in all cases. This also allows the distinction between 'dcpav-service-epic' and 'dcpdp-service-epic'. Not sure what the second EPIC service is used for. Signed-off-by: Janne Grunau <[email protected]>
Known uses EDID retrieval and raw I2C access. Signed-off-by: Janne Grunau <[email protected]>
External display only since the EDID provided by integrated panels holds no useful / correct information. Signed-off-by: Janne Grunau <[email protected]>
Signed-off-by: Martin Povišer <[email protected]>
The for_each_oldnew_plane_in_state iterator is nondeterministic in terms of the order of planes. DCP expects surfaces to be fed to it in the correct order. Relying on the iterator to lazily increment the index into surf[] means we cannot meet this expectation. The constant reordering of planes in the surf[] array seems to cause DCP to crash under certain circumstances. Cursors will also often be drawn under the main plane, which is less than ideal. Populate surf[] in the order everyone expects us to. This fixes a whole host of odd behaviour when wiring up multiple DRM universal planes. Signed-off-by: James Calligeros <[email protected]>
Despite having 4 surfaces, DCP can only blend two of them at once. Constrain swaps to two surfaces, and warn if userspace somehow tries to give us more to swap. Signed-off-by: James Calligeros <[email protected]>
Owing to its origin in mobile devices and the Apple TV, DCP seems to have been designed under the assumption that no one could possibly want a rectangle to clip the screen. If a rectangle's bottom-right edge clips the screen, DCP will instead try to scale the destination rectangle to the best of its ability... until it can't anymore. DCP is not tolerant to faults and will crash if the onscreen portion of the framebuffer ends up smaller than 32x32, or if any dimension ends up entirely offscreen. Use apple_plane_atomic_check() to reject requested plane states that could crash DCP. This is the final piece of the puzzle required to enable preliminary support for overlay planes on Apple Silicon devices. Signed-off-by: James Calligeros <[email protected]>
DCP is capable of compositing two surfaces in hardware. This is important for zero-copy video playback, etc. Set up an overlay plane so that userspace can do cool things with it. Signed-off-by: James Calligeros <[email protected]>
Fix the call to drm_atomic_helper_check_plane_state to use the correct scaling factors. Signed-off-by: James Calligeros <[email protected]>
Some userspace may not handle invalid plane checks gracefully when falling back to a software cursor. This will manifest as the screen freezing, recoverable by moving the cursor away from a screen edge. Throw a warning once to let the user know why this has happened. Signed-off-by: James Calligeros <[email protected]>
Userspace cannot be trusted to give us a sane zpos value, but given DCP's requirement that the primary plane always be the bottommost surface, we can't rely on drm_atomic_normalize_zpos() to do the job for us either. Make the zpos property immutable, and keep the primary plane at zpos 0. Signed-off-by: James Calligeros <[email protected]>
a9bc693
to
5669272
Compare
kwin now supports using overlay planes for cursors (queued for 6.3, backported to 6.2 in fedora) so we now have a use case! |
DCP can blend two surfaces, so we should wire this up.
We currently don't really have a valid use case for this functionality, however this will be required to make full use of zero-copy video decode once AVD is merged. Work is ongoing in userspace to properly make use of hardware overlay planes. Given DCP's provenance, the best example of this would be directly scanning out hardware-decoded video to the primary plane, then overlaying subtitles/media controls on a different plane.
Given that DCP dies if we try to blend more than two surfaces, we cannot expose an overlay plane and a hardware cursor plane, as userspace has been designed to expect for the last 30 years. Even if we could, again owing to its provenance DCP cannot deal with the unique requirements of a cursor plane - it crashes if a framebuffer partially clips outside of screenspace. This can be hacked around and it is possible to expose a mostly-working cursor plane, however Kwin seems to be the only compositor that is architecturally capable of gracefully handling this in the same way macOS does (fall back to software when required and go back to hardware once the driver says it's safe). As a part of bringing up support for overlay planes, Kwin will also be able to use a
DRM_PLANE_TYPE_OVERLAY
for a hardware cursor, negating the need for any hacks. Kwin will safely use a hardware cursor on our overlay plane, and compositors which can't/won't implement a similar feature will just use it as a traditionalDRM_PLANE_TYPE_OVERLAY
.This series only implements support for ARGB/ABGR on the overlay plane. This is really only useful for cursors, media player overlays/graphics, etc. and does not magically make video work. The final piece in the zero-copy video playback puzzle will be supporting the direct scanout of planar YUV video formats. This may not be trivial.