Such ads would be detectable because they offer the ability to either skip them or click on them to access other content, and the timecodes for that would need to be provided to the client. They could run traditional TV-style ads with no interaction, but that would probably not be popular with advertisers (they -love- interaction). YouTube also precaches a huge amount at local datacenters, and they want to supply each user with different ads, so encoding the ads into the stream is difficult. They can mux them into the stream, but such muxes are detectable because they would break the I/P/B frame sequence of the video (or the equivalent for AV1/whatever Google use).
The ads would probably be detectable from the metadata, even if it is obscured, though keeping up with that would be annoying for the blockers. However, if Google refuses to send the stream segments significantly ahead of their realtime target start, then detection doesn't achieve much. You could block the ad from displaying based on the metadata, but you can't skip ahead if Google won't send you those frames early.
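To make that last point concrete, here is a rough sketch in Python. It assumes (purely for illustration) that the blocker can tell ad segments apart from the metadata and that the server releases each segment only at its realtime offset. The function name and the (is_ad, duration) tuples are made up; nothing here reflects how YouTube actually works.

```python
def blocked_playback(segments):
    """segments: list of (is_ad, duration_seconds) in stream order.

    Assumption (hypothetical): the server only releases a segment once
    its realtime offset has been reached, so nothing arrives early.
    Returns (seconds of blank screen, offsets at which content starts).
    """
    offset = 0.0           # realtime position within the muxed stream
    blank = 0.0            # time the blocker hides the ad but must still wait
    content_starts = []
    for is_ad, duration in segments:
        if is_ad:
            blank += duration              # ad hidden, time not saved
        else:
            content_starts.append(offset)  # content is still due at this offset
        offset += duration
    return blank, content_starts


# 15 s pre-roll, two 4 s content segments, 10 s mid-roll, one more segment
stream = [(True, 15.0), (False, 4.0), (False, 4.0), (True, 10.0), (False, 4.0)]
blank, starts = blocked_playback(stream)
```

With that stream the blocker hides 25 seconds of ads, but the first content segment still only becomes available at the 15 second mark, so the viewer stares at a blank player instead of an ad.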
It's not trivial, but I don't think it's beyond an entity like Google to mux the ads into the stream. They re-encode everything uploaded anyway, so they can place I-frames where they need them (or keep track of precisely where the natural ones are in the original stream). I think they actually already do something similar to try to place ads at natural cuts in the source material, though it doesn't work that well. Presumably it relies on I-frame placement, since that would be the simplest way to make such a guess. Their caches are already intelligent enough to support DASH segmented streaming, so it would not be a huge stretch to assemble the video stream per user. Whether the client is using DASH or traditional HTTP, they 'just' need to serve a virtual stream that the web server can assemble from different files. Technically it seems quite achievable. Not allowing users to grab segments before they should seems like the harder part, as it requires more state to keep track of where each player is (supposed to be) at, but all the information needed to do it is there.
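A minimal sketch of what I mean, in Python, with everything hypothetical: the `Segment` and `Session` names, the `lookahead` parameter and the splice positions are all invented for illustration, and pausing and seeking are ignored entirely. It only shows that a per-user virtual stream plus a per-session start time is enough state to refuse early segment requests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    source: str      # hypothetical file id, e.g. "video" or "ad-123"
    index: int       # segment number within that source file
    duration: float  # seconds


def assemble_virtual_stream(content, ads_by_position):
    """Splice per-user ad segments in front of the chosen content indexes."""
    stream = []
    for i, seg in enumerate(content):
        stream.extend(ads_by_position.get(i, []))
        stream.append(seg)
    return stream


class Session:
    """Per-player state: the virtual stream and when playback began."""

    def __init__(self, stream, started_at, lookahead=10.0):
        self.stream = stream
        self.started_at = started_at
        self.lookahead = lookahead  # seconds of buffering the server tolerates
        # Realtime offset at which each virtual segment is due to start.
        self.due = []
        t = 0.0
        for seg in stream:
            self.due.append(t)
            t += seg.duration

    def may_fetch(self, n, now):
        """Allow segment n only if it is due within the lookahead window."""
        elapsed = now - self.started_at
        return self.due[n] <= elapsed + self.lookahead


content = [Segment("video", i, 4.0) for i in range(3)]
ads = {0: [Segment("ad-a", 0, 5.0)], 2: [Segment("ad-b", 0, 5.0)]}
session = Session(assemble_virtual_stream(content, ads), started_at=100.0, lookahead=4.0)
```

Here the client can fetch the pre-roll immediately, but a request for the first real content segment at `now=100.0` is refused because it is not due for another 5 seconds. A real implementation would also have to deal with pauses, seeks and clock drift, which is where the extra state gets painful.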
I am just not sure it's worth it. It would definitely make other things like seeking difficult to implement for ad-based clients, and I guess simply continuing the 'block the ad blockers' actions they have already taken will mostly achieve the goal of increasing ad views.
I've no issue with image-based ads or even ads below the video, ads in search results, that kind of thing. I just don't like the pre-roll and mid-video ads. YouTube Premium is just a bit too expensive for my taste. If it were around £5-6 per month that might be more reasonable, but at £12 per month (at least in the UK) it's as expensive as Netflix and the like, when the production value and costs are far lower (especially given most YouTubers earn more from sponsors than from ad revenue).
I do think the price is a bit steep, but I think the 'low production value' user-generated content is exactly what is compelling about it. I watch far more YouTube than any other media, and it is exactly because most of what I watch is relatively niche and would never have enough viewers to support TV-level production value. In fact, much of what I enjoy watching would probably be ruined by increased production value to that level: the off-the-cuff stuff, the failed projects, the hour-long 'AMA' videos and so on. The main selling point to me is exactly that, the 'not profitable enough' content. I justify the price with the included YouTube Music subscription.
Production costs might be lower (hard to say overall, as there are orders of magnitude more content produced), but I suspect YouTube's operating costs are much higher. They host way more content, need to develop and maintain user-facing production and streaming tools, and I guess get a lot more views.