This used to be so much easier in the times before systemd.
You see, you can use Python with Qt WebKit or WebEngine to create an easily controlled web browser, and you can run Qt on top of X11, Wayland, or even the raw framebuffer. So, all you needed was to remove the greeter, optionally start the X11/Wayland server before your application, and run your Python script as a service under a dedicated user. With very little configuration, you could (under X11) even get accelerated video playback within the controlled web browser. Simples!
Most of the Python code is configuration (window location and such) and browser manipulation. IIRC, the minimum browser implementation was something like a dozen lines or so.
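To give an idea of the scale, here is a minimal sketch of such a browser, assuming PyQt5 with the QtWebEngine module installed. The names parse_geometry and main, the default URL, and the geometry string format are my own placeholders for illustration, not anything from an actual deployment.

```python
import sys


def parse_geometry(spec):
    """Parse an X11-style 'WIDTHxHEIGHT+X+Y' string into (x, y, width, height)."""
    size, x, y = spec.split("+")
    width, height = size.split("x")
    return int(x), int(y), int(width), int(height)


def main(url="http://localhost/", geometry="1280x720+0+0"):
    # Qt is imported here so the helper above stays usable without Qt installed.
    from PyQt5.QtCore import Qt, QUrl
    from PyQt5.QtWidgets import QApplication
    from PyQt5.QtWebEngineWidgets import QWebEngineView

    app = QApplication(sys.argv)
    view = QWebEngineView()
    # Configuration: no window decorations, no right-click menu, fixed placement.
    view.setWindowFlags(Qt.FramelessWindowHint)
    view.setContextMenuPolicy(Qt.NoContextMenu)
    view.setGeometry(*parse_geometry(geometry))
    # Browser manipulation: point the view at the page the kiosk should show.
    view.load(QUrl(url))
    view.show()
    return app.exec_()
```

Start it with sys.exit(main()) from the script the service runs. The browser itself is the handful of lines inside main; everything beyond that in a real setup is configuration and site-specific page control.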
Under systemd, the procedure is much the same, except that you must remember that systemd controls all, and that you must abide by its internal model of services, users, and sessions; and ensure the necessary dbus and systemd services are running, and so on. Essentially, instead of a "kiosk", you must think of it more as a "single-user workstation" with an untrusted and limited user (but don't think of that user as a guest, because that invokes a different submodel in systemd; more like a passwordless login user).
Or, you can use one of the kiosk projects/templates, which have about as much security as a bag of marshmallows, because "security" is not something systemd design patterns ever consider; it's too hard to think about, they'll add it later, you see.
Ahem. Apologies for the rant.
As I have a few Linux SBCs, I've been asked to do a guide on how to prune Debian down for secure kiosk use, maximizing maintainability, but ripping out all the systemd nonsense that is not suitable for small appliances or secure kiosks, then rebuilding one of the init systems with a fast boot and nice bootsplash/animation; but certain details just irk me too much. (I've done this sort of stuff years ago for a couple of different universities, server and kiosk stuff, standalone testbenches for cluster tenders, etc. I do claim I know this stuff.)
The first is that I dislike RPis because of Broadcom/Foundation hostility to Linux/GPL developers. The second is that dropping systemd, which really is completely unsuitable for small appliances and secure kiosks due to its internal model of what should be happening (users, sessions, the entire dbus model), usually causes a volley of negative comments and useless time-wasting discussion from people who do not understand the details. (Simply put, the systemd model is designed for workstation use, and does not really fit anything else at all; it's just shoe-horned to kinda-sorta work. Like using a letter opener to drive a screw. For what it is designed for, systemd works quite well for most users. It is just an all-encompassing userspace framework and model, not an answer to all problems.)
Q: If we ignore security for a while, what are the main differences between existing kiosk setups and a full DIY kiosk/appliance stack?
A: Control, versatility, and freedom. You are not tied to any specific model, and have full control of the hardware.
If one of the existing kiosk/appliance projects suits your needs, and you aren't worried about security, feel free to use it.