
ExoPlayer Analysis (Part 7): ExoPlayer's Handling of Audio Timestamps

Tags: Android, media, exoplayer

Related posts

ExoPlayer Analysis (Part 1): Entering the World of ExoPlayer
ExoPlayer Analysis (Part 2): Writing an ExoPlayer Demo
ExoPlayer Analysis (Part 3): Flow Analysis — the Creation Flow from build to prepare
ExoPlayer Analysis (Part 4): From renderer.render Down to MediaCodec
ExoPlayer Analysis (Part 5): ExoPlayer's Operation of AudioTrack
ExoPlayer Analysis (Part 6): Analysis of ExoPlayer's Synchronization Mechanism
ExoPlayer Analysis (Part 7): ExoPlayer's Handling of Audio Timestamps


ExoPlayer Extensions (Part 1): An Introduction to DASH and HLS Streams

I. Preface:
From the analysis of ExoPlayer's synchronization mechanism, we know that all synchronization work is premised on an accurate audio timestamp. Because ExoPlayer's handling of the audio timestamp is quite involved, it deserves a separate post of its own.

II. Code analysis:
1. Where the audio/video timestamps are updated:
Timestamp updates are performed inside the big doSomeWork() loop, that is, once every 10 ms:

// ExoPlayer\library\core\src\main\java\com\google\android\exoplayer2\ExoPlayerImplInternal.java
private void doSomeWork() throws ExoPlaybackException, IOException {
  ...
  /* Update the audio timestamp */
  updatePlaybackPositions();
  ...
  /* Have each type of renderer process its data */
  for (int i = 0; i < renderers.length; i++) {
    ...
    renderer.render(rendererPositionUs, rendererPositionElapsedRealtimeUs);
    ...
  }
  ...
}

The updatePlaybackPositions function updates and processes the pts. Stepping into it:

  private void updatePlaybackPositions() throws ExoPlaybackException {
    MediaPeriodHolder playingPeriodHolder = queue.getPlayingPeriod();
    if (playingPeriodHolder == null) {
      return;
    }

    // Update the playback position.
    long discontinuityPositionUs =
        playingPeriodHolder.prepared
            ? playingPeriodHolder.mediaPeriod.readDiscontinuity()
            : C.TIME_UNSET;
    if (discontinuityPositionUs != C.TIME_UNSET) {
      resetRendererPosition(discontinuityPositionUs);
      // A MediaPeriod may report a discontinuity at the current playback position to ensure the
      // renderers are flushed. Only report the discontinuity externally if the position changed.
      if (discontinuityPositionUs != playbackInfo.positionUs) {
        playbackInfo =
            handlePositionDiscontinuity(
                playbackInfo.periodId,
                discontinuityPositionUs,
                playbackInfo.requestedContentPositionUs);
        playbackInfoUpdate.setPositionDiscontinuity(Player.DISCONTINUITY_REASON_INTERNAL);
      }
    } else {
      /* The pts is processed in here */
      rendererPositionUs =
          mediaClock.syncAndGetPositionUs(
              /* isReadingAhead= */ playingPeriodHolder != queue.getReadingPeriod());
      long periodPositionUs = playingPeriodHolder.toPeriodTime(rendererPositionUs);
      maybeTriggerPendingMessages(playbackInfo.positionUs, periodPositionUs);
      playbackInfo.positionUs = periodPositionUs;
    }

    // Update the buffered position and total buffered duration.
    MediaPeriodHolder loadingPeriod = queue.getLoadingPeriod();
    playbackInfo.bufferedPositionUs = loadingPeriod.getBufferedPositionUs();
    playbackInfo.totalBufferedDurationUs = getTotalBufferedDurationUs();
  }

Stepping in further:

  public long syncAndGetPositionUs(boolean isReadingAhead) {
    syncClocks(isReadingAhead);
    return getPositionUs();
  }

Note the parameter isReadingAhead: its value comes from the expression playingPeriodHolder != queue.getReadingPeriod() above, i.e. whether the renderers are already reading from a period ahead of the one being played. Inside syncAndGetPositionUs, syncClocks decides which clock drives playback: if a pts can be derived from the stream (via the audio renderer's clock), that pts is used for subsequent synchronization; if not, a standalone clock that ExoPlayer maintains itself from system time is used. Clearly, under normal circumstances the pts from the stream is used. Let's step into getPositionUs():

  @Override
  public long getPositionUs() {
    return isUsingStandaloneClock
        ? standaloneClock.getPositionUs()
        : Assertions.checkNotNull(rendererClock).getPositionUs();
  }
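
As an aside on the fallback path: a standalone clock simply extrapolates the position from elapsed real time. Below is a minimal sketch of that idea, loosely modeled on ExoPlayer's StandaloneMediaClock (simplified, playback speed ignored; not the actual source):

final class SimpleStandaloneClock {
  private long baseUs;        // media position when the clock was (re)started
  private long baseElapsedMs; // android.os.SystemClock.elapsedRealtime() at that moment
  private boolean started;

  void start(long positionUs) {
    baseUs = positionUs;
    baseElapsedMs = android.os.SystemClock.elapsedRealtime();
    started = true;
  }

  long getPositionUs() {
    if (!started) {
      return baseUs;
    }
    long elapsedMs = android.os.SystemClock.elapsedRealtime() - baseElapsedMs;
    return baseUs + elapsedMs * 1000; // extrapolate at 1x speed
  }
}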

Given the analysis above, we go straight to the latter. Note that the rendererClock implementation here is
MediaCodecAudioRenderer:

  @Override
  public long getPositionUs() {
    if (getState() == STATE_STARTED) {
      updateCurrentPosition();
    }
    return currentPositionUs;
  }

When the player state is STATE_STARTED, updateCurrentPosition() is called to keep the position updated:

  private void updateCurrentPosition() {
    long newCurrentPositionUs = audioSink.getCurrentPositionUs(isEnded());
    if (newCurrentPositionUs != AudioSink.CURRENT_POSITION_NOT_SET) {
      currentPositionUs =
          allowPositionDiscontinuity
              ? newCurrentPositionUs
              : max(currentPositionUs, newCurrentPositionUs);
      allowPositionDiscontinuity = false;
    }
  }

What this function actually does is update the member variable currentPositionUs (taking the max with the previous value unless a position discontinuity is allowed). Let's look at the
getCurrentPositionUs() function; the implementation class of audioSink is DefaultAudioSink:

  @Override
  public long getCurrentPositionUs(boolean sourceEnded) {
    if (!isAudioTrackInitialized() || startMediaTimeUsNeedsInit) {
      return CURRENT_POSITION_NOT_SET;
    }
    /* Get the pts from AudioTrack */
    long positionUs = audioTrackPositionTracker.getCurrentPositionUs(sourceEnded);
    /* Cap the result: return the min of the AudioTrack position and the duration of the frames written locally */
    positionUs = min(positionUs, configuration.framesToDurationUs(getWrittenFrames()));
    return applySkipping(applyMediaPositionParameters(positionUs));
  }

After several layers of wrapping, we have found where the pts is ultimately obtained from AudioTrack. The function getCurrentPositionUs is analyzed in detail below.

2. A walkthrough of getCurrentPositionUs:
This function is a little complex; here it is first, with comments added:

  public long getCurrentPositionUs(boolean sourceEnded) {
    /* Three important things happen here: */
    /* 1. Compute a smoothed jitter offset from the value of AudioTrack.getPlaybackHeadPosition */
    /* 2. Validate the timestamp */
    /* 3. Validate the latency */
    if (Assertions.checkNotNull(this.audioTrack).getPlayState() == PLAYSTATE_PLAYING) {
      maybeSampleSyncParams();
    }

    // If the device supports it, use the playback timestamp from AudioTrack.getTimestamp.
    // Otherwise, derive a smoothed position by sampling the track's frame position.
    long systemTimeUs = System.nanoTime() / 1000;
    long positionUs;
    AudioTimestampPoller audioTimestampPoller = Assertions.checkNotNull(this.audioTimestampPoller);
    boolean useGetTimestampMode = audioTimestampPoller.hasAdvancingTimestamp();
    /* The if branch is the timestamp mode (when the device supports it); the else branch handles positions from AudioTrack.getPlaybackHeadPosition */
    /* In testing on a real phone, the if branch was taken */
    if (useGetTimestampMode) {
      // Calculate the speed-adjusted position using the timestamp (which may be in the future).
      long timestampPositionFrames = audioTimestampPoller.getTimestampPositionFrames();
      long timestampPositionUs = framesToDurationUs(timestampPositionFrames);
      long elapsedSinceTimestampUs = systemTimeUs - audioTimestampPoller.getTimestampSystemTimeUs();
      elapsedSinceTimestampUs =
          Util.getMediaDurationForPlayoutDuration(elapsedSinceTimestampUs, audioTrackPlaybackSpeed);
      positionUs = timestampPositionUs + elapsedSinceTimestampUs;
    } else {
      if (playheadOffsetCount == 0) {
        // The AudioTrack has started, but we don't have any samples to compute a smoothed position.
        positionUs = getPlaybackHeadPositionUs();
      } else {
        // getPlaybackHeadPositionUs() only has a granularity of ~20 ms, so we base the position off
        // the system clock (and a smoothed offset between it and the playhead position) so as to
        // prevent jitter in the reported positions.
        positionUs = systemTimeUs + smoothedPlayheadOffsetUs;
      }
      if (!sourceEnded) {
        /* The obtained position must additionally have a latency subtracted */
        positionUs = max(0, positionUs - latencyUs);
      }
    }

    /* If the sampling mode switched, save the previous state */
    if (lastSampleUsedGetTimestampMode != useGetTimestampMode) {
      // We've switched sampling mode.
      previousModeSystemTimeUs = lastSystemTimeUs;
      previousModePositionUs = lastPositionUs;
    }
    long elapsedSincePreviousModeUs = systemTimeUs - previousModeSystemTimeUs;
    if (elapsedSincePreviousModeUs < MODE_SWITCH_SMOOTHING_DURATION_US) {
      // Use a ramp to smooth between the old mode and the new one to avoid introducing a sudden
      // jump if the two modes disagree.
      long previousModeProjectedPositionUs =
          previousModePositionUs
              + Util.getMediaDurationForPlayoutDuration(
                  elapsedSincePreviousModeUs, audioTrackPlaybackSpeed);
      // A ramp consisting of 1000 points distributed over MODE_SWITCH_SMOOTHING_DURATION_US.
      long rampPoint = (elapsedSincePreviousModeUs * 1000) / MODE_SWITCH_SMOOTHING_DURATION_US;
      positionUs *= rampPoint;
      positionUs += (1000 - rampPoint) * previousModeProjectedPositionUs;
      positionUs /= 1000;
    }

    if (!notifiedPositionIncreasing && positionUs > lastPositionUs) {
      notifiedPositionIncreasing = true;
      long mediaDurationSinceLastPositionUs = C.usToMs(positionUs - lastPositionUs);
      long playoutDurationSinceLastPositionUs =
          Util.getPlayoutDurationForMediaDuration(
              mediaDurationSinceLastPositionUs, audioTrackPlaybackSpeed);
      long playoutStartSystemTimeMs =
          System.currentTimeMillis() - C.usToMs(playoutDurationSinceLastPositionUs);
      listener.onPositionAdvancing(playoutStartSystemTimeMs);
    }

    lastSystemTimeUs = systemTimeUs;
    lastPositionUs = positionUs;
    lastSampleUsedGetTimestampMode = useGetTimestampMode;

    return positionUs;
  }

To fully understand this function, some background is needed. AudioTrack provides two APIs for querying the audio pts:

/* 1 */
public boolean getTimestamp(AudioTimestamp timestamp);
/* 2 */
public int getPlaybackHeadPosition();

The former requires an implementation in the device's lower layers and can be understood as the accurate way to obtain the pts, but it has one characteristic: its value does not refresh frequently, so it should not be called frequently either; the official documentation suggests calling it roughly once every 10 s to 60 s. Looking at the parameter class AudioTimestamp, it contains two members:

public long framePosition; /* the frame position that has been written */
public long nanoTime;      /* the system time at which the frame position was updated */
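
To make the first API concrete, here is a small illustrative sketch (not ExoPlayer code; audioTrack and sampleRate are assumed to exist elsewhere) of querying getTimestamp and extrapolating a position from it:

/* Illustrative sketch only: derive a current position from AudioTrack.getTimestamp. */
long getPositionFromTimestampUs(android.media.AudioTrack audioTrack, int sampleRate) {
  android.media.AudioTimestamp ts = new android.media.AudioTimestamp();
  if (!audioTrack.getTimestamp(ts)) {
    return -1; // no timestamp available on this device/route yet
  }
  long positionAtTimestampUs = (ts.framePosition * 1_000_000L) / sampleRate;
  long elapsedSinceTimestampUs = System.nanoTime() / 1000 - ts.nanoTime / 1000;
  // The timestamp may be stale; extrapolate from its system time to "now".
  return positionAtTimestampUs + elapsedSinceTimestampUs;
}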

The latter is an API that can be called frequently; it returns, counted from the start of playback, the data AudioTrack has continuously written down to the HAL. With these two functions understood, we know there are two ways to obtain the audio pts from AudioTrack, and getCurrentPositionUs in fact handles both. Let's first walk through the whole function and then analyze the details of each part:

    /* Three important things happen here: */
    /* 1. Compute a smoothed jitter offset from the value of AudioTrack.getPlaybackHeadPosition */
    /* 2. Validate the timestamp */
    /* 3. Validate the latency */
    if (Assertions.checkNotNull(this.audioTrack).getPlayState() == PLAYSTATE_PLAYING) {
      maybeSampleSyncParams();
    }

Start with the comments above the call to maybeSampleSyncParams(): it does three important things. First, it computes a smoothed jitter offset based on the value of AudioTrack.getPlaybackHeadPosition. Since the pts from AudioTrack.getPlaybackHeadPosition is obtained by frequent polling, ExoPlayer applies a smoothing algorithm so that an occasional large jitter does not disturb later synchronization, recording the offset for this API in the variable smoothedPlayheadOffsetUs for later use. Second, it validates the timestamp, which is the value obtained from the getTimestamp() API. Third, it validates the latency. What is latency? It is the delay from audio data leaving AudioTrack, being written to the hardware, until sound actually comes out; it is especially noticeable over Bluetooth. Note that the latency interface has to be implemented by the underlying chip vendor: the application can call it directly to get the value, but computing it at the application layer by itself would be very difficult. With maybeSampleSyncParams() having done all this preparation, let's see what the function does next:

    /* The if branch is the timestamp mode (when the device supports it); the else branch handles positions from AudioTrack.getPlaybackHeadPosition */
    /* In testing on a real phone, the if branch was taken */
    if (useGetTimestampMode) {
      // Calculate the speed-adjusted position using the timestamp (which may be in the future).
      long timestampPositionFrames = audioTimestampPoller.getTimestampPositionFrames();
      long timestampPositionUs = framesToDurationUs(timestampPositionFrames);
      long elapsedSinceTimestampUs = systemTimeUs - audioTimestampPoller.getTimestampSystemTimeUs();
      elapsedSinceTimestampUs =
          Util.getMediaDurationForPlayoutDuration(elapsedSinceTimestampUs, audioTrackPlaybackSpeed);
      positionUs = timestampPositionUs + elapsedSinceTimestampUs;
    } else {
      if (playheadOffsetCount == 0) {
        // The AudioTrack has started, but we don't have any samples to compute a smoothed position.
        positionUs = getPlaybackHeadPositionUs();
      } else {
        // getPlaybackHeadPositionUs() only has a granularity of ~20 ms, so we base the position off
        // the system clock (and a smoothed offset between it and the playhead position) so as to
        // prevent jitter in the reported positions.
        positionUs = systemTimeUs + smoothedPlayheadOffsetUs;
      }
      if (!sourceEnded) {
        /* The obtained position must additionally have a latency subtracted */
        positionUs = max(0, positionUs - latencyUs);
      }
    }

The whole if-else split follows the two AudioTrack APIs: if the platform the code runs on supports getting the position from getTimestamp, the if branch is taken; if not, the value from getPlaybackHeadPosition is processed instead. A conclusion up front: ExoPlayer calls getTimestamp once every 10 s to obtain an accurate pts from the lower layers. Here is the if branch with comments:

	/* Obtain the pts via getTimestamp */
    if (useGetTimestampMode) {
      // Calculate the speed-adjusted position using the timestamp (which may be in the future).
      /* The written-frame count obtained at the most recent getTimestamp() poll */
      long timestampPositionFrames = audioTimestampPoller.getTimestampPositionFrames();
      /* Convert the total frame count into a duration */
      long timestampPositionUs = framesToDurationUs(timestampPositionFrames);
      /* The difference between the current system time and the system time at which the lower layers updated the frame count */
      long elapsedSinceTimestampUs = systemTimeUs - audioTimestampPoller.getTimestampSystemTimeUs();
      /* Adjust that elapsed time for the playback speed */
      elapsedSinceTimestampUs =
          Util.getMediaDurationForPlayoutDuration(elapsedSinceTimestampUs, audioTrackPlaybackSpeed);
      /* The latest audio timestamp */
      positionUs = timestampPositionUs + elapsedSinceTimestampUs;
    }

This code can be understood like this: every 10 s the base timestamp is refreshed, and on top of that base the speed-corrected elapsed time keeps being added; the sum is the audio pts that is ultimately handed to video for synchronization.
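
A quick worked example with made-up numbers (48 kHz output assumed):

/* Hypothetical numbers, 48 kHz output:
   the last getTimestamp poll reported frame 480_000, i.e. 10 s of media. */
long timestampPositionUs = 480_000 * 1_000_000L / 48_000;        // 10_000_000 us
long elapsedSinceTimestampUs = 250_000;                          // 250 ms of wall time since that poll
float audioTrackPlaybackSpeed = 2.0f;
/* At 2x speed, 250 ms of playout corresponds to 500 ms of media — this is
   what Util.getMediaDurationForPlayoutDuration computes for a constant speed. */
long mediaElapsedUs = (long) (elapsedSinceTimestampUs * audioTrackPlaybackSpeed);
long positionUs = timestampPositionUs + mediaElapsedUs;          // 10_500_000 us
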
Now let's see what the else branch does:

else {
      if (playheadOffsetCount == 0) {
        // The AudioTrack has started, but we don't have any samples to compute a smoothed position.
        positionUs = getPlaybackHeadPositionUs();
      } else {
        // getPlaybackHeadPositionUs() only has a granularity of ~20 ms, so we base the position off
        // the system clock (and a smoothed offset between it and the playhead position) so as to
        // prevent jitter in the reported positions.
        positionUs = systemTimeUs + smoothedPlayheadOffsetUs;
      }
      if (!sourceEnded) {
        /* The obtained position must additionally have a latency subtracted */
        positionUs = max(0, positionUs - latencyUs);
      }
    }

The else code is easy to understand: take the value returned by AudioTrack.getPlaybackHeadPosition(), apply the smoothing, then subtract the latency.

III. Key function analysis:
Analysis of maybeSampleSyncParams:
Let's see how this function does the preparatory work:

  private void maybeSampleSyncParams() {
    /* 1. Get the playback duration from AudioTrack */
    long playbackPositionUs = getPlaybackHeadPositionUs();
    if (playbackPositionUs == 0) {
      // The AudioTrack hasn't output anything yet.
      return;
    }
    long systemTimeUs = System.nanoTime() / 1000;
    /* 2. Do a smoothing computation every 30 ms */
    if (systemTimeUs - lastPlayheadSampleTimeUs >= MIN_PLAYHEAD_OFFSET_SAMPLE_INTERVAL_US) {
      // Take a new sample and update the smoothed offset between the system clock and the playhead.
      playheadOffsets[nextPlayheadOffsetIndex] = playbackPositionUs - systemTimeUs;
      nextPlayheadOffsetIndex = (nextPlayheadOffsetIndex + 1) % MAX_PLAYHEAD_OFFSET_COUNT;
      if (playheadOffsetCount < MAX_PLAYHEAD_OFFSET_COUNT) {
        playheadOffsetCount++;
      }
      lastPlayheadSampleTimeUs = systemTimeUs;
      smoothedPlayheadOffsetUs = 0;
      for (int i = 0; i < playheadOffsetCount; i++) {
        smoothedPlayheadOffsetUs += playheadOffsets[i] / playheadOffsetCount;
      }
    }

    if (needsPassthroughWorkarounds) {
      // Don't sample the timestamp and latency if this is an AC-3 passthrough AudioTrack on
      // platform API versions 21/22, as incorrect values are returned. See [Internal: b/21145353].
      return;
    }

    /* 3. Validate the timestamp from AudioTrack against the system time and the time from getPlaybackHeadPosition */
    maybePollAndCheckTimestamp(systemTimeUs, playbackPositionUs);
    /* 4. Validate the latency (if the lower layers implement the interface) */
    maybeUpdateLatency(systemTimeUs);
  }

Comment 1:

  private long getPlaybackHeadPositionUs() {
    return framesToDurationUs(getPlaybackHeadPosition());
  }

Stepping into getPlaybackHeadPosition():

private long getPlaybackHeadPosition() {
	...
    int state = audioTrack.getPlayState();
    if (state == PLAYSTATE_STOPPED) {
      // The audio track hasn't been started.
      return 0;
    }
    ...	
    /* Call AudioTrack.getPlaybackHeadPosition, which returns a frame count */
    long rawPlaybackHeadPosition = 0xFFFFFFFFL & audioTrack.getPlaybackHeadPosition();
    ...
	if (lastRawPlaybackHeadPosition > rawPlaybackHeadPosition) {
		// The value must have wrapped around.
		rawPlaybackHeadWrapCount++;
	}
	lastRawPlaybackHeadPosition = rawPlaybackHeadPosition;
	return rawPlaybackHeadPosition + (rawPlaybackHeadWrapCount << 32);
}

Looking only at the key parts: the state is checked first; then AudioTrack.getPlaybackHeadPosition() is called to get the position; finally the code checks whether the value has wrapped around. The raw playhead position is an unsigned 32-bit counter (at 48 kHz, 2^32 frames is roughly 24.8 hours of playback), so when the new raw value is smaller than the previous one a wrap is recorded, and the wrap count is folded back into the returned value.
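
A tiny illustration of the wrap handling, with hypothetical values:

/* Hypothetical values illustrating the 32-bit wraparound handling above. */
long lastRawPlaybackHeadPosition = 0xFFFFFF00L; // previous raw value (frames)
long rawPlaybackHeadPosition = 0x00000100L;     // new raw value is smaller: the counter wrapped
long rawPlaybackHeadWrapCount = 1;              // incremented when the wrap is detected
long totalFrames = rawPlaybackHeadPosition + (rawPlaybackHeadWrapCount << 32);
// totalFrames keeps increasing monotonically across the 32-bit boundary
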
Back in maybeSampleSyncParams(), let's see how ExoPlayer computes the smoothed value:
Comment 2:

	/* MIN_PLAYHEAD_OFFSET_SAMPLE_INTERVAL_US is 30 ms */
    if (systemTimeUs - lastPlayheadSampleTimeUs >= MIN_PLAYHEAD_OFFSET_SAMPLE_INTERVAL_US) {
      // Take a new sample and update the smoothed offset between the system clock and the playhead.
      playheadOffsets[nextPlayheadOffsetIndex] = playbackPositionUs - systemTimeUs;
      nextPlayheadOffsetIndex = (nextPlayheadOffsetIndex + 1) % MAX_PLAYHEAD_OFFSET_COUNT;
      if (playheadOffsetCount < MAX_PLAYHEAD_OFFSET_COUNT) {
        playheadOffsetCount++;
      }
      lastPlayheadSampleTimeUs = systemTimeUs;
      smoothedPlayheadOffsetUs = 0;
      for (int i = 0; i < playheadOffsetCount; i++) {
        smoothedPlayheadOffsetUs += playheadOffsets[i] / playheadOffsetCount;
      }
    }

The principle of this code is as follows: the smoothed jitter offset is refreshed every 30 ms. Each round takes the obtained pts minus the current system time as a baseline offset and records it in the array playheadOffsets[]; then the most recent samples (up to ten) are averaged, with each offset contributing offset/count to the accumulated sum, yielding the latest smoothed jitter offset. As for why the smoothing interval is 30 ms: judging from the comment ExoPlayer gives, it is because getPlaybackHeadPositionUs() only updates at a granularity of about 20 ms:

else {
      if (playheadOffsetCount == 0) {
		...
      } else {
        // getPlaybackHeadPositionUs() only has a granularity of ~20 ms, so we base the position off
        // the system clock (and a smoothed offset between it and the playhead position) so as to
        // prevent jitter in the reported positions.
        positionUs = systemTimeUs + smoothedPlayheadOffsetUs;
      }
	  ...
    }

The English comment notes that getPlaybackHeadPositionUs() only has a granularity of about 20 ms. In effect, smoothedPlayheadOffsetUs is an algorithmically smoothed jitter offset, and adding the current system time to it can be taken as the audio pts at this moment.
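
To isolate the algorithm, here is a self-contained sketch of the smoothing logic; the constants mirror ExoPlayer's, but the class itself is made up:

/* Self-contained sketch of the playhead-offset smoothing; not ExoPlayer source. */
final class PlayheadSmoother {
  private static final int MAX_PLAYHEAD_OFFSET_COUNT = 10;
  private final long[] playheadOffsets = new long[MAX_PLAYHEAD_OFFSET_COUNT];
  private int nextPlayheadOffsetIndex;
  private int playheadOffsetCount;
  private long smoothedPlayheadOffsetUs;

  /* Record one (playheadUs - systemTimeUs) sample and recompute the average. */
  void addSample(long playheadUs, long systemTimeUs) {
    playheadOffsets[nextPlayheadOffsetIndex] = playheadUs - systemTimeUs;
    nextPlayheadOffsetIndex = (nextPlayheadOffsetIndex + 1) % MAX_PLAYHEAD_OFFSET_COUNT;
    if (playheadOffsetCount < MAX_PLAYHEAD_OFFSET_COUNT) {
      playheadOffsetCount++;
    }
    smoothedPlayheadOffsetUs = 0;
    for (int i = 0; i < playheadOffsetCount; i++) {
      smoothedPlayheadOffsetUs += playheadOffsets[i] / playheadOffsetCount;
    }
  }

  /* Smoothed position estimate for a given system time. */
  long getPositionUs(long systemTimeUs) {
    return systemTimeUs + smoothedPlayheadOffsetUs;
  }
}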

Back to maybeSampleSyncParams, and continuing downward:
Comment 3:

    /* Validate the timestamp from AudioTrack against the system time and the time from getPlaybackHeadPosition */
    maybePollAndCheckTimestamp(systemTimeUs, playbackPositionUs);

Let's see how this function validates the timestamp:

  private void maybePollAndCheckTimestamp(long systemTimeUs, long playbackPositionUs) {
    /* Check whether the platform's lower layers have updated the timestamp */
    AudioTimestampPoller audioTimestampPoller = Assertions.checkNotNull(this.audioTimestampPoller);
    if (!audioTimestampPoller.maybePollTimestamp(systemTimeUs)) {
      return;
    }

    // Check the timestamp and accept/reject it.
    long audioTimestampSystemTimeUs = audioTimestampPoller.getTimestampSystemTimeUs();
    long audioTimestampPositionFrames = audioTimestampPoller.getTimestampPositionFrames();
    /* Check whether the system time at which the timestamp was updated differs from the current system time by more than 5 s */
    if (Math.abs(audioTimestampSystemTimeUs - systemTimeUs) > MAX_AUDIO_TIMESTAMP_OFFSET_US) {
      listener.onSystemTimeUsMismatch(
          audioTimestampPositionFrames,
          audioTimestampSystemTimeUs,
          systemTimeUs,
          playbackPositionUs);
      audioTimestampPoller.rejectTimestamp();
      /* Check whether the times from AudioTrack.getTimestamp and AudioTrack.getPlaybackHeadPosition differ by more than 5 s */
    } else if (Math.abs(framesToDurationUs(audioTimestampPositionFrames) - playbackPositionUs)
        > MAX_AUDIO_TIMESTAMP_OFFSET_US) {
      listener.onPositionFramesMismatch(
          audioTimestampPositionFrames,
          audioTimestampSystemTimeUs,
          systemTimeUs,
          playbackPositionUs);
      audioTimestampPoller.rejectTimestamp();
    } else {
      audioTimestampPoller.acceptTimestamp();
    }
  }

The function is long but easy to understand: it first checks whether the lower layers have updated the timestamp. If they have, the system time recorded in that freshly updated timestamp is compared with the application's current system time; if they differ by more than 5 s, the timestamp is rejected. In the same way, the difference between the times obtained from AudioTrack.getTimestamp and AudioTrack.getPlaybackHeadPosition is also checked. The key to this function is how the layer below decides whether the timestamp was updated, and where the 10 s interval in timestamp mode comes from. Let's look at audioTimestampPoller.maybePollTimestamp():

  @TargetApi(19) // audioTimestamp will be null if Util.SDK_INT < 19.
  public boolean maybePollTimestamp(long systemTimeUs) {
    /* This if check ensures getTimestamp is polled at most once per sampleIntervalUs (10 s in steady state) */
    if (audioTimestamp == null || (systemTimeUs - lastTimestampSampleTimeUs) < sampleIntervalUs) {
      return false;
    }
    lastTimestampSampleTimeUs = systemTimeUs;
    /* Fetch the most recent timestamp via AudioTrack.getTimestamp */
    boolean updatedTimestamp = audioTimestamp.maybeUpdateTimestamp();
    switch (state) {
      case STATE_INITIALIZING:
        if (updatedTimestamp) {
          if (audioTimestamp.getTimestampSystemTimeUs() >= initializeSystemTimeUs) {
            // We have an initial timestamp, but don't know if it's advancing yet.
            initialTimestampPositionFrames = audioTimestamp.getTimestampPositionFrames();
            updateState(STATE_TIMESTAMP);
          } else {
            // Drop the timestamp, as it was sampled before the last reset.
            updatedTimestamp = false;
          }
        } else if (systemTimeUs - initializeSystemTimeUs > INITIALIZING_DURATION_US) {
          // We haven't received a timestamp for a while, so they probably aren't available for the
          // current audio route. Poll infrequently in case the route changes later.
          // TODO: Ideally we should listen for audio route changes in order to detect when a
          // timestamp becomes available again.
          updateState(STATE_NO_TIMESTAMP);
        }
        break;
      case STATE_TIMESTAMP:
        if (updatedTimestamp) {
          long timestampPositionFrames = audioTimestamp.getTimestampPositionFrames();
          if (timestampPositionFrames > initialTimestampPositionFrames) {
            updateState(STATE_TIMESTAMP_ADVANCING);
          }
        } else {
          reset();
        }
        break;
      case STATE_TIMESTAMP_ADVANCING:
        if (!updatedTimestamp) {
          // The audio route may have changed, so reset polling.
          reset();
        }
        break;
      case STATE_NO_TIMESTAMP:
        if (updatedTimestamp) {
          // The audio route may have changed, so reset polling.
          reset();
        }
        break;
      case STATE_ERROR:
        // Do nothing. If the caller accepts any new timestamp we'll reset polling.
        break;
      default:
        throw new IllegalStateException();
    }
    return updatedTimestamp;
  }

Note that this function requires Android 4.4 (API 19) or higher. From the if check at the very start of the code, whether we go on to call audioTimestamp.maybeUpdateTimestamp() is decided by the member variable sampleIntervalUs. Let's first see where sampleIntervalUs is changed. After ExoPlayer finishes initializing, the member state in this function is in the STATE_INITIALIZING state; look at that branch of the switch:

      case STATE_INITIALIZING:
        if (updatedTimestamp) {
          if (audioTimestamp.getTimestampSystemTimeUs() >= initializeSystemTimeUs) {
            // We have an initial timestamp, but don't know if it's advancing yet.
            initialTimestampPositionFrames = audioTimestamp.getTimestampPositionFrames();
            /* First state update */
            updateState(STATE_TIMESTAMP);
          } else {
            // Drop the timestamp, as it was sampled before the last reset.
            updatedTimestamp = false;
          }
		...
        break;

Here the state is updated to STATE_TIMESTAMP; look at the updateState function:

  private void updateState(@State int state) {
    this.state = state;
    switch (state) {
      case STATE_INITIALIZING:
        // Force polling a timestamp immediately, and poll quickly.
        lastTimestampSampleTimeUs = 0;
        initialTimestampPositionFrames = C.POSITION_UNSET;
        initializeSystemTimeUs = System.nanoTime() / 1000;
        sampleIntervalUs = FAST_POLL_INTERVAL_US; /* 10ms */
        break;
      case STATE_TIMESTAMP:
        sampleIntervalUs = FAST_POLL_INTERVAL_US; /* 10ms */
        break;
      case STATE_TIMESTAMP_ADVANCING:
      case STATE_NO_TIMESTAMP:
        sampleIntervalUs = SLOW_POLL_INTERVAL_US; /* 10s */
        break;
      case STATE_ERROR:
        sampleIntervalUs = ERROR_POLL_INTERVAL_US; /* 500ms */
        break;
      default:
        throw new IllegalStateException();
    }
  }

Here we can see that in the STATE_TIMESTAMP state polling still happens every 10 ms; but once the state is updated and we return to the function above, the next poll that finds the timestamp updated again advances the state to STATE_TIMESTAMP_ADVANCING, so sampleIntervalUs becomes 10 s.
To summarize the state transitions:

STATE_INITIALIZING(10ms)—>STATE_TIMESTAMP(10ms)—>STATE_TIMESTAMP_ADVANCING(10s)
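
As a compact restatement, here is a hypothetical enum summarizing the per-state polling intervals established by updateState above:

/* A condensed, hypothetical restatement of AudioTimestampPoller's per-state
   polling intervals, taken from the updateState code above. */
enum TimestampPollerState {
  INITIALIZING(10_000),             // 10 ms: poll fast until a first timestamp arrives
  TIMESTAMP(10_000),                // 10 ms: got a timestamp, waiting for it to advance
  TIMESTAMP_ADVANCING(10_000_000),  // 10 s: steady state, poll rarely
  NO_TIMESTAMP(10_000_000),         // 10 s: route doesn't provide timestamps, poll rarely
  ERROR(500_000);                   // 500 ms: last timestamp was rejected

  final long sampleIntervalUs;

  TimestampPollerState(long sampleIntervalUs) {
    this.sampleIntervalUs = sampleIntervalUs;
  }
}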

Having understood the state switching, let's see how audioTimestamp.maybeUpdateTimestamp() updates the value:

    public boolean maybeUpdateTimestamp() {
      /* Call the Android API */
      boolean updated = audioTrack.getTimestamp(audioTimestamp);
      if (updated) {
        long rawPositionFrames = audioTimestamp.framePosition;
        if (lastTimestampRawPositionFrames > rawPositionFrames) {
          // The value must have wrapped around.
          rawTimestampFramePositionWrapCount++;
        }
        lastTimestampRawPositionFrames = rawPositionFrames;
        /* Update the timestamp */
        lastTimestampPositionFrames =
            rawPositionFrames + (rawTimestampFramePositionWrapCount << 32);
      }
      return updated;
    }

The logic is simple: query the underlying API; if it returns true, a fresh value is available below, and the application just reads it out and updates its record (folding in the same 32-bit wraparound handling as the playhead position).
Comment 4:
Validate the latency; look at the maybeUpdateLatency() function:

  private void maybeUpdateLatency(long systemTimeUs) {
    /* Provided the lower layers implement getLatencyMethod, sample the latency once per MIN_LATENCY_SAMPLE_INTERVAL_US */
    if (isOutputPcm
        && getLatencyMethod != null
        && systemTimeUs - lastLatencySampleTimeUs >= MIN_LATENCY_SAMPLE_INTERVAL_US) {
      try {
        // Compute the audio track latency, excluding the latency due to the buffer (leaving
        // latency due to the mixer and audio hardware driver).
        latencyUs =
            castNonNull((Integer) getLatencyMethod.invoke(Assertions.checkNotNull(audioTrack)))
                    * 1000L
                - bufferSizeUs;
        // Check that the latency is non-negative.
        latencyUs = max(latencyUs, 0);
        // Check that the latency isn't too large.
        if (latencyUs > MAX_LATENCY_US) {
          listener.onInvalidLatency(latencyUs);
          latencyUs = 0;
        }
      } catch (Exception e) {
        // The method existed, but doesn't work. Don't try again.
        getLatencyMethod = null;
      }
      lastLatencySampleTimeUs = systemTimeUs;
    }
  }

The logic of this function is also simple. The prerequisite is that the lower layers implement the getLatencyMethod method; otherwise latency is ignored altogether. After the latency is obtained, bufferSizeUs is subtracted from it. The code comment is terse; my reading is that getLatency() reports a delay that includes the track's own buffer, and since the playback head position already reflects data sitting in that buffer, the buffer's duration is removed so that only the mixer and audio hardware driver latency remains.
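
With made-up numbers, the subtraction works out like this:

/* Hypothetical numbers illustrating the latency computation above. */
long reportedLatencyUs = 180 * 1000L; // from AudioTrack.getLatency() via reflection
long bufferSizeUs = 80 * 1000L;       // duration of the track's own buffer
/* The playback head position already accounts for time spent in the track's
   buffer, so only mixer + driver latency (100 ms here) should remain. */
long latencyUs = Math.max(0, reportedLatencyUs - bufferSizeUs);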

IV. Summary:
ExoPlayer's handling of the audio pts is involved but precise. There are two sources for the position: one is driven by the pts parsed out of the stream, the other is a standalone clock that ExoPlayer maintains itself from system time; under normal circumstances the former is used. The stream-derived pts can in turn be obtained from AudioTrack in two ways, each validated separately, and the final value is sent into the video rendering path to perform synchronization.