Would this be amenable to "morphing between presets", or even manually combining a selection from one network into another network? Lots of things to try out here!
Logic gates built from non-ideal transistors have non-zero rise times, so their outputs are already smooth, differentiable functions of time.
--- edit ---
After reading the paper more thoroughly, I find the way they implement differentiable logic clever. Each gate uses continuous relaxations of the 16 two-input logic operators, evaluates them in parallel, and applies a softmax over learned weights to select the most useful operator. At inference time, everything is binarized.
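
To make that concrete, here is a rough sketch of a single gate as I understand it, not the paper's actual code. The probabilistic relaxations (e.g. AND as `a * b`), the operator ordering, and the names `OPS`, `soft_gate` and `hard_gate` are my own assumptions for illustration.

```python
import math

# Real-valued relaxations of the 16 two-input Boolean operators.
# Assumption: inputs a, b in [0, 1] are read as independent probabilities
# of being 1, so AND becomes a * b, OR becomes a + b - a * b, and so on.
OPS = [
    lambda a, b: 0.0,                      # FALSE
    lambda a, b: a * b,                    # a AND b
    lambda a, b: a - a * b,                # a AND NOT b
    lambda a, b: a,                        # a
    lambda a, b: b - a * b,                # NOT a AND b
    lambda a, b: b,                        # b
    lambda a, b: a + b - 2 * a * b,        # a XOR b
    lambda a, b: a + b - a * b,            # a OR b
    lambda a, b: 1 - (a + b - a * b),      # a NOR b
    lambda a, b: 1 - (a + b - 2 * a * b),  # a XNOR b
    lambda a, b: 1 - b,                    # NOT b
    lambda a, b: 1 - b + a * b,            # a OR NOT b
    lambda a, b: 1 - a,                    # NOT a
    lambda a, b: 1 - a + a * b,            # NOT a OR b
    lambda a, b: 1 - a * b,                # a NAND b
    lambda a, b: 1.0,                      # TRUE
]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_gate(a, b, logits):
    """Training-time gate: softmax-weighted mix of all 16 relaxed operators."""
    weights = softmax(logits)
    return sum(w * op(a, b) for w, op in zip(weights, OPS))


def hard_gate(a, b, logits):
    """Inference-time gate: keep only the highest-weighted operator.

    Expects binary inputs (0 or 1) and returns a binary output.
    """
    best = max(range(len(OPS)), key=lambda i: logits[i])
    return int(round(OPS[best](a, b)))
```

During training you would backpropagate through `soft_gate` to update `logits`; afterwards `hard_gate` replaces it, so each gate collapses to one ordinary Boolean operator.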