The LO is a large signal?
That's just a normal mixer: you form the product with any input amplitude, up to the limit of the mixer itself (where clipping occurs).
Typically, the LO signal is large, so that it normally causes clipping, but only in the path where it's active. This gives maximum RF-to-IF port gain, with the downside of causing higher-order mixing products (which are usually filtered away).
With that LO overdrive, the exact operation is not so much a product of sines, as the product of a sine with something closer to a square wave. Which, in turn, is closer to toggling a polarity switch on the input waveform. Square waves contain harmonics, so, you get mixing products around the harmonics too. The overall effect is still fine, but reality is dirtier than simplest theory, and you need to keep all your inputs and outputs pristine with filters (and controlled impedances).
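To make the "polarity switch" picture concrete, here's a rough numerical sketch. It's an idealized model, not any particular mixer: the frequencies (90 Hz RF, 100 Hz LO), the sample rate, and the variable names are all made up for illustration, and the LO is treated as a perfect +1/-1 square wave.

```python
import numpy as np

fs = 4000                      # sample rate in Hz (assumed)
n = fs                         # one second of samples, so FFT bins are 1 Hz apart
t = np.arange(n) / fs
f_rf, f_lo = 90, 100           # assumed RF and LO frequencies in Hz

rf = np.cos(2 * np.pi * f_rf * t)

# Hard-switched LO: +1 for half of each LO period, -1 for the other half.
samples_per_lo_period = fs // f_lo          # exactly 40 samples per period
half_period = samples_per_lo_period // 2
lo_square = np.where(np.arange(n) % samples_per_lo_period < half_period, 1.0, -1.0)

# The "polarity switch" acting on the RF waveform.
mixed = rf * lo_square

# Single-sided amplitude spectrum; index k corresponds to k Hz here
# (the DC bin is not used below).
amp = 2 * np.abs(np.fft.rfft(mixed)) / n

# Products around the LO fundamental, 3rd and 5th harmonics.
for f in (10, 190, 210, 390, 410, 590):
    print(f"{f:4d} Hz: {amp[f]:.3f}")
```

The wanted products at 10 and 190 Hz come out close to 2/pi (about 0.64) per unit of RF amplitude, and the pairs around 300 and 500 Hz are smaller by roughly 1/3 and 1/5, which is the square wave's harmonic series showing up at the output. With a 50% duty cycle there is nothing around the even harmonics (e.g. 110 Hz).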
Or, for a single balanced mixer, the switch is between "off" and "on", which loses half the signal (on average), so you get more mixer loss (a bit over 6dB) for an SBM versus a DBM.
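As a back-of-envelope check on that figure, assume ideal switches: a +1/-1 square wave has a fundamental of amplitude 4/pi, while a 0/1 square wave has a fundamental of 2/pi (plus a DC term of 1/2, which just passes RF straight through). Each IF sideband gets half of the fundamental. The function and variable names below are only for illustration.

```python
import math

def db(voltage_ratio):
    """Convert a voltage ratio to decibels."""
    return 20 * math.log10(voltage_ratio)

# IF sideband amplitude per unit RF amplitude = half the switching
# function's fundamental (ideal switches assumed, no other losses).
polarity_switch_gain = 0.5 * (4 / math.pi)   # +1/-1 switching
on_off_switch_gain = 0.5 * (2 / math.pi)     # 0/1 switching

polarity_db = db(polarity_switch_gain)
on_off_db = db(on_off_switch_gain)
difference_db = polarity_db - on_off_db

print(f"polarity switch: {polarity_db:.2f} dB")
print(f"on/off switch:   {on_off_db:.2f} dB")
print(f"difference:      {difference_db:.2f} dB")
```

That works out to about -3.9 dB for the polarity switch and about -9.9 dB for the on/off switch, a difference of 20*log10(2), i.e. a hair over 6 dB. Real mixers add diode or FET losses on top of these ideal numbers.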
Regarding dual-gate FETs, you feed RF to one gate (usually g1) and LO to the other. As long as the LO signal is a few times larger than Vgs(th) (or Vpo if depletion mode), the effect is the same: the transistor is switched between variably-on (variable depending on RF) and hard cutoff, causing drain current to follow suit. (Drain voltage is then whatever load resistance * drain current is, so you get gain this way.)
Note you can't avoid mixer loss -- that is, your noise figure will be <mixer loss> worse than if it were a linear amplifier, with constant bias instead of LO drive.
Tim