Some extractor implementations underneath MediaExtractor require a seekTo
call after tracks are selected to ensure samples are read from the correct
position. De-duplicating logic was preventing this from happening in some
cases, causing issues like:
https://github.com/google/ExoPlayer/issues/301
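As a rough sketch of the behaviour being fixed (the seek mode and the
parameters below are illustrative placeholders, not code from this change):

  import android.media.MediaExtractor;

  // Sketch only: some MediaExtractor implementations need an explicit seek
  // after track selection before samples are read from the correct position.
  static void selectAndSeek(MediaExtractor extractor, int trackIndex, long positionUs) {
    extractor.selectTrack(trackIndex);
    // Without this call, some underlying extractors read from the wrong position.
    extractor.seekTo(positionUs, MediaExtractor.SEEK_TO_PREVIOUS_SYNC);
  }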
Note that seeking all tracks as a side effect of track selection sucks if
you already have one or more tracks selected, because it introduces
discontinuities to the already selected tracks. However, in general, it
*is* necessary to specify the position for the track being selected,
because the underlying extractor doesn't have enough information to know
where to start reading from. It can't determine this based on the read
positions of the already selected tracks, because the samples in these
tracks might be very sparse with respect to time.
I think a more optimal fix would be to change the SampleExtractor
interface to receive the current position as an argument to selectTrack.
For our own extractors, we'd seek the newly selected track to that
position, whilst the already enabled tracks would be left in their
current positions (if possible). For FrameworkSampleExtractor we'd
still have no choice but to call seekTo on the extractor to seek all
of the tracks. This solution ends up being more complex though, because:
- The SampleExtractor then needs a way of telling DefaultSampleSource
which tracks were actually seeked, so that the pendingDiscontinuities
flags can be set correctly.
- It's a weird API that requires the current playback position in order
to seek only the track being enabled.
So it may not be worth it! I think this fix is definitely good for now,
in any case.
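For reference, the alternative might look roughly like this (hypothetical
signature, not something this change implements):

  // Hypothetical alternative discussed above. The caller passes the current
  // playback position; implementations that can do so seek only the newly
  // selected track, and report whether other tracks were also seeked so that
  // pendingDiscontinuities flags can be set correctly.
  boolean selectTrack(int index, long positionUs);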
Issue: #301
- This change:
1. Extracts HlsExtractor interface from TsExtractor.
2. Adds AdtsExtractor for AAC/ADTS streams, which turned out to be
really easy.
Selection of the ADTS extractor relies on seeing the .aac extension.
This is at least guaranteed not to break anything that works already
(since no-one is going to be using .aac as the extension for something
that's not elementary AAC/ADTS).
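Roughly, the selection amounts to something like the following (illustrative
only; constructor arguments and the surrounding chunk source code are elided):

  // Illustrative only: pick the extractor implementation from the extension.
  private static HlsExtractor buildExtractor(android.net.Uri uri) {
    String lastPathSegment = uri.getLastPathSegment();
    if (lastPathSegment != null && lastPathSegment.endsWith(".aac")) {
      return new AdtsExtractor(); // elementary AAC/ADTS stream
    }
    return new TsExtractor(); // default: MPEG-TS
  }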
Issue: #209
I think this is the limit of how far we should be pushing complexity
vs. efficiency. It's a little complicated to understand, but probably
worth it since the H264 bitstream is the majority of the data.
Issue: #278
Use of Sample objects was inefficient for several reasons:
- Lots of objects (1 per sample, obviously).
- When switching up bitrates, there was a tendency for all Sample
instances to need to expand, which effectively led to our whole
media buffer being GC'd as each Sample discarded its byte[] to
obtain a larger one.
- When a keyframe was encountered, the Sample would typically need
to expand to accommodate it. Over time, this would lead to a
gradual increase in the population of Samples that were sized to
accommodate keyframes. These Sample instances were then typically
underutilized whenever recycled to hold a non-keyframe, leading
to inefficient memory usage.
This CL introduces RollingBuffer, which tightly packs pending sample
data into byte[]s obtained from an underlying BufferPool, which
fixes all of the above. There is still an issue where the total
memory allocation may grow when switching up bitrate, but we can
easily fix that from this point, if we choose to restrict the buffer
based on allocation size rather than time.
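The packing idea, loosely sketched (the real RollingBuffer/BufferPool APIs
differ; the class name and ALLOCATION_SIZE below are made up):

  // Illustration only: tightly pack appended sample data into fixed-size
  // byte[]s, rather than giving each sample its own resizable byte[].
  final class PackedSampleData {
    private static final int ALLOCATION_SIZE = 64 * 1024; // assumed size
    private final java.util.ArrayList<byte[]> allocations =
        new java.util.ArrayList<byte[]>();
    private int lastAllocationOffset = ALLOCATION_SIZE;

    void appendData(byte[] data, int offset, int length) {
      while (length > 0) {
        if (lastAllocationOffset == ALLOCATION_SIZE) {
          // The real implementation obtains allocations from a BufferPool and
          // returns them for reuse once the data has been read.
          allocations.add(new byte[ALLOCATION_SIZE]);
          lastAllocationOffset = 0;
        }
        byte[] target = allocations.get(allocations.size() - 1);
        int bytesToCopy = Math.min(length, ALLOCATION_SIZE - lastAllocationOffset);
        System.arraycopy(data, offset, target, lastAllocationOffset, bytesToCopy);
        offset += bytesToCopy;
        length -= bytesToCopy;
        lastAllocationOffset += bytesToCopy;
      }
    }
  }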
Issue: #278
- Remove TsExtractor's knowledge of Sample.
- Push handling of Sample objects into SampleQueue as much
as possible. This is a precursor to replacing Sample objects
with a different type of backing memory. Ideally, the
individual readers shouldn't know how the sample data is
stored. This is true after this CL, with the exception of the
TODO in H264Reader.
- Avoid double-scanning every H264 sample for NAL units, by
moving the scan for SEI units from SeiReader into H264Reader.
Issue: #278
The complexity around not enabling the video renderer before it
has a valid surface is because MediaCodecTrackRenderer supports
a "discard" mode where it pulls through and discards samples
without a decoder. This mode means that if the demo app were to
enable the renderer before supplying the surface, the renderer
could discard the first few frames prior to getting the surface,
meaning video rendering wouldn't happen until the following sync
frame.
To get a handle on complexity, I think we're better off just removing
support for this mode, which nicely decouples how the demo app
handles surfaces from how it handles enabling/disabling renderers.
Reordering in the extractor isn't going to work well with the
optimizations I'm making there. This change moves sorting back
to the renderer, while keeping all of the renderer
simplifications. It's basically just moving where the sort
happens from one place to another.
I'm not really a fan of micro-optimizations, but given this method
scans through every H264 frame in the HLS case, it seems worthwhile.
The trick here is to examine the first 7 bits of the third byte
first. If they're not all 0s, then we know that we haven't found a
NAL unit, and also that we won't find one at the next two positions.
This allows the loop to increment 3 bytes at a time.
Speedup is around 60% on ART according to some ad-hoc benchmarking.
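The scan looks roughly like this (an illustration of the trick, not the
exact method):

  // Returns the offset of the next 0x000001 start code prefix at or after
  // startOffset, or limit if none is found before limit.
  static int findNalStartCode(byte[] data, int startOffset, int limit) {
    for (int i = startOffset; i < limit - 2; ) {
      if ((data[i + 2] & 0xFE) != 0) {
        // data[i + 2] is neither 0x00 nor 0x01, so no start code can begin at
        // i, i + 1 or i + 2. Skip 3 bytes.
        i += 3;
      } else if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
        return i;
      } else {
        i++;
      }
    }
    return limit;
  }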
1. AdtsReader would previously copy all data through an intermediate
adtsBuffer. This change eliminates the additional copy step, and
instead copies directly into Sample objects.
2. PesReader would previously accumulate a whole packet by copying
multiple TS packets into an intermediate buffer. This change
eliminates this copy step. After the change, TS packet buffers
are propagated directly to PesPayloadReaders, which are required
to handle partial payload data correctly. The copy steps in the
extractor are simplified from:
DataSource->Ts_BitArray->Pes_BitArray->Sample->SampleHolder
To:
DataSource->Ts_BitArray->Sample->SampleHolder
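The contract this implies for payload readers is roughly the following
(illustrative shape; the real interface and parameter names may differ):

  // Illustrative only: readers receive TS packet payloads piecemeal and must
  // cope with PES payloads that span multiple calls.
  interface PesPayloadReader {
    // data holds (possibly partial) payload bytes for this reader, and
    // payloadUnitStartIndicator is true when the data starts a new PES packet.
    void consume(ParsableByteArray data, boolean payloadUnitStartIndicator);
  }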
Issue: #278
- TsExtractor is now based on ParsableByteArray rather than BitArray.
This makes it much clearer that, for the most part, data is byte
aligned. It will allow us to optimize TsExtractor without worrying
about arbitrary bit offsets.
- BitArray is renamed ParsableBitArray for consistency, and is now
exclusively for bit-stream level reading.
- There are some temporary methods in ParsableByteArray that should be
cleared up once the optimizations are in place.
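As a small illustration of the split (method names are approximate):

  // Most TS header fields are byte aligned; a few sections need bit offsets.
  static boolean readPayloadUnitStartIndicator(byte[] packetBytes) {
    ParsableByteArray packet = new ParsableByteArray(packetBytes);
    if (packet.readUnsignedByte() != 0x47) {
      throw new IllegalStateException("Lost synchronization");
    }
    ParsableBitArray bits = new ParsableBitArray(packetBytes);
    bits.skipBits(9); // sync byte + transport_error_indicator
    return bits.readBit(); // payload_unit_start_indicator
  }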
Issue: #278
This is the start of a sequence of changes to fix the referenced
GitHub issue. Currently TsExtractor involves multiple memory
copy steps:
DataSource->Ts_BitArray->Pes_BitArray->Sample->SampleHolder
This is inefficient, but more importantly, the copy into
Sample is problematic, because Samples are of dynamically
varying size. The way we end up expanding Sample objects to
be large enough to hold the data being written means that we
end up gradually expanding all Sample objects in the pool
(which wastes memory), and that we generate a lot of GC churn,
particularly when switching to a higher quality which can
trigger all Sample objects to expand.
The fix will be to reduce the copy steps to:
DataSource->TsPacket->SampleHolder
We will track Pes and Sample data with lists of pointers into
TsPackets, rather than actually copying the data. We will
recycle these pointers.
The following steps are approximately how the refactor will
progress:
1. Start reducing use of BitArray. It's going to be way too
complicated to track bit-granularity offsets into multiple packets,
and allow reading across packet boundaries. In practice reads
from TS packets are all byte aligned except for small sections,
so we'll move over to using ParsableByteArray instead, so we
only need to track byte offsets.
2. Move TsExtractor to use ParsableByteArray except for small
sections where we really need bit-granularity offsets.
3. Do the actual optimization.
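A rough sketch of the kind of pointer we have in mind (field names here are
hypothetical):

  // Hypothetical illustration: a recyclable pointer into a TS packet buffer,
  // tracking sample data without copying it out of the packet.
  final class SamplePointer {
    byte[] tsPacketData; // backing array of the TsPacket (not copied)
    int offset;          // start of the region belonging to the sample
    int length;          // length of the region in bytes
  }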
Issue: #278
- Remove simple variant. Maintaining both simple + full is
unnecessary effort.
- Remove the need to specify content id in the Sample definition,
except where it's actually required (for DRM requests in
the Widevine GTS samples).
If the time source track renderer ended, but other track renderers
hadn't finished, the player would get stuck in a pending state.
This change enables automatic switching to the media clock in the
case that the time source renderer has ended, which allows other
renderers to continue to play.
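The switch amounts to roughly the following (a sketch with hypothetical
field and method names, not the exact player code):

  // Sketch only: fall back to the standalone media clock once the renderer
  // acting as the time source has ended, so other renderers keep playing.
  long positionUs;
  if (timeSourceTrackRenderer != null && !timeSourceTrackRenderer.isEnded()) {
    positionUs = timeSourceTrackRenderer.getCurrentPositionUs();
  } else {
    positionUs = standaloneMediaClock.getPositionUs();
  }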
These may occur in VOD streams where a representation's data
is small enough not to require segmentation or an index. For
example, subtitle files.
Issue: #268
SampleExtractor will initially only be implemented by FrameworkSampleExtractor
which delegates to a MediaExtractor, but eventually it will also be implemented
by additional extractors.
The sample extractor can be used as a source of samples via DefaultSampleSource.
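The interface shape is roughly as follows (illustrative; method names may
differ from the actual code):

  import java.io.IOException;

  // Illustrative shape of SampleExtractor; FrameworkSampleExtractor would
  // implement each method by delegating to a framework MediaExtractor.
  interface SampleExtractor {
    boolean prepare() throws IOException;
    TrackInfo[] getTrackInfos();
    void selectTrack(int index);
    void deselectTrack(int index);
    int readSample(int track, SampleHolder sampleHolder) throws IOException;
    void seekTo(long positionUs);
    void release();
  }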