
Introduction to TransitionModel in Kaldi

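The HMM model in Kaldi is, in practice, a TransitionModel object. This object describes the HMM topology of each phone, stores the information related to pdf-ids and transition-ids, and provides conversions between the various integer identifiers.
TransitionModel is declared and implemented in transition-model.h and transition-model.cc. Before studying this object, read and understand the material on hmm-topology.
Before introducing TransitionModel, a few concepts need to be defined.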
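               phone:  A phone, numbered from 1. It can be mapped to a concrete phone symbol via phones.txt.
           HMM-state:  A state of a phone's HMM, numbered from 0.
              pdf-id:  The index of a pdf used by the decision tree and the acoustic model, numbered from 0.
    transition-state:  A (virtual) state that moves to itself or to other states along arcs. In some cases it corresponds one-to-one with a pdf-id.
    transition-index:  The index of a transition within an HMM state, i.e. the index into HmmTopology::HmmState::transitions, numbered from 0.
       transition-id:  A number assigned to every arc of every HMM state, numbered from 1.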
   
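   Typically, a phone, an HMM-state, and the pdf-ids (the forward-pdf-id and the self-loop-pdf-id) are grouped into a tuple, and each tuple maps to one transition-state. A transition-state combined with a specific transition-index then maps to one transition-id. The mappings are as follows: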

   (phone, HMM-state, forward-pdf-id, self-loop-pdf-id) -> transition-state
   (transition-state, transition-index)                 -> transition-id
   
The reverse mappings also exist:
                      transition-id -> transition-state
                      transition-id -> transition-index
                   transition-state -> phone
                   transition-state -> HMM-state
                   transition-state -> forward-pdf-id
                   transition-state -> self-loop-pdf-id
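
The forward and reverse mappings above can be illustrated with a small, self-contained sketch. It is not Kaldi code: `ToyTuple`, `ToyTransitionMap`, `MakeToyMap`, and the hand-written table contents are all hypothetical names and values chosen for illustration. It assumes that the tuples are kept sorted and that the transition-ids of one transition-state are contiguous, so that a transition-id is the first id of its transition-state plus the transition-index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the (phone, HMM-state, forward-pdf-id,
// self-loop-pdf-id) tuple; this is not Kaldi's own type.
struct ToyTuple {
  int32_t phone, hmm_state, forward_pdf, self_loop_pdf;
  bool operator<(const ToyTuple &o) const {
    return std::tie(phone, hmm_state, forward_pdf, self_loop_pdf) <
           std::tie(o.phone, o.hmm_state, o.forward_pdf, o.self_loop_pdf);
  }
  bool operator==(const ToyTuple &o) const {
    return std::tie(phone, hmm_state, forward_pdf, self_loop_pdf) ==
           std::tie(o.phone, o.hmm_state, o.forward_pdf, o.self_loop_pdf);
  }
};

// Toy lookup tables (assumed layout, filled in by hand below).
struct ToyTransitionMap {
  std::vector<ToyTuple> tuples;   // sorted; position i holds transition-state i+1
  std::vector<int32_t> state2id;  // first transition-id of each state; slot 0 unused
  std::vector<int32_t> id2state;  // transition-state of each transition-id; slot 0 unused

  int32_t NumStates() const { return static_cast<int32_t>(tuples.size()); }

  // (phone, HMM-state, forward-pdf-id, self-loop-pdf-id) -> transition-state
  int32_t TupleToState(const ToyTuple &t) const {
    auto it = std::lower_bound(tuples.begin(), tuples.end(), t);
    assert(it != tuples.end() && *it == t);  // the tuple must exist
    return static_cast<int32_t>(it - tuples.begin()) + 1;
  }
  // (transition-state, transition-index) -> transition-id
  int32_t PairToId(int32_t state, int32_t index) const {
    assert(state >= 1 && state <= NumStates());
    assert(index >= 0 && index < state2id[state + 1] - state2id[state]);
    return state2id[state] + index;
  }
  // transition-id -> transition-state
  int32_t IdToState(int32_t id) const {
    assert(id >= 1 && id < static_cast<int32_t>(id2state.size()));
    return id2state[id];
  }
  // transition-id -> transition-index
  int32_t IdToIndex(int32_t id) const { return id - state2id[IdToState(id)]; }
  // transition-state -> tuple (phone, HMM-state, pdf-ids)
  const ToyTuple &StateToTuple(int32_t state) const {
    assert(state >= 1 && state <= NumStates());
    return tuples[state - 1];
  }
};

// Three transition-states with two transitions each, so transition-ids run 1..6.
ToyTransitionMap MakeToyMap() {
  ToyTransitionMap m;
  m.tuples = {ToyTuple{1, 0, 0, 0}, ToyTuple{1, 1, 1, 1}, ToyTuple{2, 0, 2, 2}};
  m.state2id = {0, 1, 3, 5, 7};  // the last entry marks the end of state 3
  m.id2state = {0, 1, 1, 2, 2, 3, 3};
  return m;
}
```

With the toy tables from `MakeToyMap()`, the tuple (1, 1, 1, 1) is transition-state 2, and the pair (transition-state 2, transition-index 1) is transition-id 4. Going back, transition-id 4 yields transition-state 2 and transition-index 1.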
TransitionModel is defined in Kaldi as follows; the code has been modified to make it easier to read and understand.
class TransitionModel {

 public:

  TransitionModel() { }

  void Read(std::istream &is, bool binary); 
  void Write(std::ostream &os, bool binary) const;

  /// return reference to HMM-topology object.
  const HmmTopology &GetTopo() const { return topo_; }

  /// \name Integer mapping functions
  /// @{

  int32 TupleToTransitionState(int32 phone, int32 hmm_state, int32 pdf, int32 self_loop_pdf) const;
  int32 PairToTransitionId(int32 trans_state, int32 trans_index) const;
  int32 TransitionIdToTransitionState(int32 trans_id) const;  //return id2state_[trans_id];
  int32 TransitionIdToTransitionIndex(int32 trans_id) const;
  int32 TransitionStateToPhone(int32 trans_state) const;  //return tuples_[trans_state-1].phone;
  int32 TransitionStateToHmmState(int32 trans_state) const;
  int32 TransitionStateToForwardPdfClass(int32 trans_state) const;
  int32 TransitionStateToSelfLoopPdfClass(int32 trans_state) const;
  int32 TransitionStateToForwardPdf(int32 trans_state) const;
  int32 TransitionStateToSelfLoopPdf(int32 trans_state) const;
  int32 SelfLoopOf(int32 trans_state) const;  // returns the self-loop transition-id, or zero if
  // this state doesn't have a self-loop.

  inline int32 TransitionIdToPdf(int32 trans_id) const;  //return id2pdf_id_[trans_id];
  int32 TransitionIdToPhone(int32 trans_id) const;  //return tuples_[id2state_[trans_id]-1].phone;
  int32 TransitionIdToPdfClass(int32 trans_id) const;
  int32 TransitionIdToHmmState(int32 trans_id) const;

  /// Returns the total number of transition-ids (note, these are one-based).
  inline int32 NumTransitionIds() const { return id2state_.size()-1; }

  /// Returns the number of transition-indices for a particular transition-state.
  /// Note: "Indices" is the plural of "index".   Index is not the same as "id",
  /// here.  A transition-index is a zero-based offset into the transitions
  /// out of a particular transition state.
  int32 NumTransitionIndices(int32 trans_state) const {
    return state2id_[trans_state+1]-state2id_[trans_state];
  }

  /// Returns the total number of transition-states (note, these are one-based).
  int32 NumTransitionStates() const { return tuples_.size(); }

  // NumPdfs() actually returns the highest-numbered pdf we ever saw, plus one.
  // In normal cases this should equal the number of pdfs in the system, but if you
  // initialized this object with fewer than all the phones, and it happens that
  // an unseen phone has the highest-numbered pdf, this might be different.
  int32 NumPdfs() const { return num_pdfs_; }

  BaseFloat GetTransitionLogProb(int32 trans_id) const {
    return log_probs_(trans_id);
  }


 private:

  struct Tuple {
    int32 phone;
    int32 hmm_state;
    int32 forward_pdf;
    int32 self_loop_pdf;
    Tuple() { }
    Tuple(int32 phone, int32 hmm_state, int32 forward_pdf, int32 self_loop_pdf):
      phone(phone), hmm_state(hmm_state), forward_pdf(forward_pdf), self_loop_pdf(self_loop_pdf) { }
  };

  HmmTopology topo_;

  /// Tuples indexed by transition state minus one;
  /// the tuples are in sorted order which allows us to do the reverse mapping from
  /// tuple to transition state
  std::vector<Tuple> tuples_;

  /// Gives the first transition_id of each transition-state; indexed by
  /// the transition-state.  Array indexed 1..num-transition-states+1 (the last one
  /// is needed so we can know the num-transitions of the last transition-state).
  std::vector<int32> state2id_;

  /// For each transition-id, the corresponding transition
  /// state (indexed by transition-id).
  std::vector<int32> id2state_;

  /// For each transition-id, the corresponding pdf-id (indexed by transition-id).
  std::vector<int32> id2pdf_id_;

  /// For each transition-id, the corresponding log-prob.  Indexed by transition-id.
  Vector<BaseFloat> log_probs_;

  /// For each transition-state, the log of (1 - self-loop-prob).  Indexed by
  /// transition-state.
  Vector<BaseFloat> non_self_loop_log_probs_;

  /// This is actually one plus the highest-numbered pdf we ever got back from the
  /// tree (but the tree numbers pdfs contiguously from zero so this is the number
  /// of pdfs).
  int32 num_pdfs_;

};
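
The comment on non_self_loop_log_probs_ describes each entry as the log of (1 - self-loop-prob). A minimal sketch of that arithmetic follows. `ToyNonSelfLoopLogProb` is a hypothetical helper, not a Kaldi function, and the floor value and the handling of states without a self-loop are assumptions made for this example only.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: given the log-probability of a state's self-loop,
// return log(1 - self-loop-prob).
double ToyNonSelfLoopLogProb(bool has_self_loop, double self_loop_log_prob) {
  // Assumption: with no self-loop, all probability mass leaves the state,
  // so the result is log(1) = 0.
  if (!has_self_loop) return 0.0;
  assert(self_loop_log_prob <= 0.0);  // a log-probability cannot be positive
  double non_self_loop_prob = 1.0 - std::exp(self_loop_log_prob);
  // Assumed floor, so that a self-loop probability of 1 does not give log(0).
  const double kFloor = 1.0e-10;
  if (non_self_loop_prob < kFloor) non_self_loop_prob = kFloor;
  return std::log(non_self_loop_prob);
}
```

For a self-loop probability of 0.75, the helper returns log(0.25), the log of the total probability of leaving the state.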
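The HMM model actually written to a model file (such as final.mdl) is a TransitionModel object. Not all member variables are written, however: only topo_, tuples_, and log_probs_ are stored. The remaining members are computed afterwards. The table below summarizes several of the member variables.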
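
To make the idea of derived members concrete, here is a sketch of how tables shaped like state2id_ and id2state_ could be rebuilt once the number of transitions of every transition-state is known. `ToyDerived` and `ComputeToyDerived` are hypothetical names, not Kaldi's. The per-state transition counts are passed in directly, standing in for the lookup in the HMM topology that a real implementation would need.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical container for the two derived tables.
struct ToyDerived {
  std::vector<int32_t> state2id;  // first transition-id of each state; slot 0 unused
  std::vector<int32_t> id2state;  // transition-state of each transition-id; slot 0 unused
};

// num_transitions[i] is the number of outgoing transitions of
// transition-state i+1 (assumed to be known from the topology).
ToyDerived ComputeToyDerived(const std::vector<int32_t> &num_transitions) {
  const int32_t num_states = static_cast<int32_t>(num_transitions.size());
  ToyDerived d;
  // One extra slot at the end records where the last state's ids stop.
  d.state2id.assign(num_states + 2, 0);
  int32_t next_id = 1;  // transition-ids are one-based
  for (int32_t s = 1; s <= num_states + 1; ++s) {
    d.state2id[s] = next_id;
    if (s <= num_states) {
      assert(num_transitions[s - 1] > 0);
      next_id += num_transitions[s - 1];
    }
  }
  // next_id is now one past the largest transition-id.
  d.id2state.assign(next_id, 0);
  for (int32_t s = 1; s <= num_states; ++s)
    for (int32_t id = d.state2id[s]; id < d.state2id[s + 1]; ++id)
      d.id2state[id] = s;
  return d;
}
```

For three transition-states with 2, 2, and 3 transitions, this gives state2id = {0, 1, 3, 5, 8} and id2state = {0, 1, 1, 2, 2, 3, 3, 3}, so there are 7 transition-ids in total, matching the id2state_.size()-1 convention used by NumTransitionIds() in the listing.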

In the table, "tr_state" stands for transition-state.