This means we now require 2+ input frames per input, and compare the primary
stream timestamp with secondary stream timestamps in order to select the
correct output timestamp. We also must release frames and back-pressure as
soon as possible to avoid blocking upstream VFPs.
Also, improve signalling of VFP onReadyToAcceptInputFrame
PiperOrigin-RevId: 553448965