Analysis of the opencv_traincascade face-detection training code in OpenCV 3.1
Build the OpenCV sources with CMake to obtain the opencv_traincascade project, set it as the startup project, and run it with the prepared training samples to start training.
We start from the main() function.
Its arguments are all user-supplied; their meanings are annotated below:
classifier.train( cascadeDirName,        // directory where the classifier is saved
                  vecName,               // .vec file name (positive samples)
                  bgName,                // negative-sample (background) file name
                  numPos, numNeg,        // numPos: positives consumed per training stage; numNeg: negatives per stage
                  precalcValBufSize, precalcIdxBufSize,
                  numStages,             // number of cascade stages
                  cascadeParams,
                  *featureParams[cascadeParams.featureType],
                  stageParams,
                  baseFormatSave,
                  acceptanceRatioBreakValue );
return 0;
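A side note on numPos: because every stage after the first rejects roughly (1 - minHitRate) of the positives it is given, the .vec file must contain more samples than numPos. Below is a minimal standalone sketch of the usual rule of thumb; the concrete numbers (1000 positives per stage, 20 stages, minHitRate 0.995) are illustrative assumptions, not values from this post.

#include <cstdio>

int main()
{
    const int    numPos     = 1000;   // assumed: positives consumed per stage
    const int    numStages  = 20;     // assumed
    const double minHitRate = 0.995;  // assumed traincascade default
    // the first stage takes numPos; each later stage additionally burns
    // roughly numPos*(1 - minHitRate) positives rejected by earlier stages
    const double needed = numPos + (numStages - 1) * (1.0 - minHitRate) * numPos;
    std::printf("the vec file should contain at least ~%.0f positive samples\n", needed);
    return 0;
}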
Step into the classifier.train() function:
bool CvCascadeClassifier::train( const string _cascadeDirName, // cascade classifier training
                                 const string _posFilename,
                                 const string _negFilename,
                                 int _numPos, int _numNeg,
                                 int _precalcValBufSize, int _precalcIdxBufSize,
                                 int _numStages,
                                 const CvCascadeParams& _cascadeParams,
                                 const CvFeatureParams& _featureParams, // here Haar features were selected
                                 const CvCascadeBoostParams& _stageParams,
                                 bool baseFormatSave,
                                 double acceptanceRatioBreakValue )
{
    // Start recording clock ticks for training time output
    const clock_t begin_time = clock();
    if( _cascadeDirName.empty() || _posFilename.empty() || _negFilename.empty() )
        CV_Error( CV_StsBadArg, "_cascadeDirName or _bgfileName or _vecFileName is NULL" );
    string dirName;
    if (_cascadeDirName.find_last_of("/\\") == (_cascadeDirName.length() - 1) )
        dirName = _cascadeDirName;
    else
        dirName = _cascadeDirName + '/';
    numPos = _numPos;
    numNeg = _numNeg;
    numStages = _numStages;
    if ( !imgReader.create( _posFilename, _negFilename, _cascadeParams.winSize ) )
    {
        cout << "Image reader can not be created from -vec " << _posFilename
             << " and -bg " << _negFilename << "." << endl;
        return false;
    }
    if ( !load( dirName ) )
    {
        cascadeParams = _cascadeParams;
        featureParams = CvFeatureParams::create(cascadeParams.featureType);
        featureParams->init(_featureParams);
        stageParams = makePtr<CvCascadeBoostParams>();
        *stageParams = _stageParams;
        featureEvaluator = CvFeatureEvaluator::create(cascadeParams.featureType);
        featureEvaluator->init( featureParams, numPos + numNeg, cascadeParams.winSize );
        stageClassifiers.reserve( numStages );
    }
    else
    {
        // Make sure that if model parameters are preloaded, that people are aware of this,
        // even when passing other parameters to the training command
        cout << "---------------------------------------------------------------------------------" << endl;
        cout << "Training parameters are pre-loaded from the parameter file in data folder!" << endl;
        cout << "Please empty this folder if you want to use a NEW set of training parameters." << endl;
        cout << "---------------------------------------------------------------------------------" << endl;
    }
    // print the parameter info
    cout << "PARAMETERS:" << endl;
    cout << "cascadeDirName: " << _cascadeDirName << endl;
    cout << "vecFileName: " << _posFilename << endl;
    cout << "bgFileName: " << _negFilename << endl;
    cout << "numPos: " << _numPos << endl;
    cout << "numNeg: " << _numNeg << endl;
    cout << "numStages: " << numStages << endl;
    cout << "precalcValBufSize[Mb] : " << _precalcValBufSize << endl;
    cout << "precalcIdxBufSize[Mb] : " << _precalcIdxBufSize << endl;
    cout << "acceptanceRatioBreakValue : " << acceptanceRatioBreakValue << endl;
    cascadeParams.printAttrs();
    stageParams->printAttrs();
    featureParams->printAttrs();

    int startNumStages = (int)stageClassifiers.size();
    if ( startNumStages > 1 )
        cout << endl << "Stages 0-" << startNumStages-1 << " are loaded" << endl;
    else if ( startNumStages == 1)
        cout << endl << "Stage 0 is loaded" << endl;

    // requiredLeafFARate: the required leaf false-alarm rate, about 0.0000009536743, close to 0
    double requiredLeafFARate = pow( (double) stageParams->maxFalseAlarm, (double) numStages ) /
                                (double)stageParams->max_depth;
    double tempLeafFARate;

    for( int i = startNumStages; i < numStages; i++ ) // train the classifier of each stage
    {
        cout << endl << "===== TRAINING " << i << "-stage =====" << endl;
        cout << "<BEGIN" << endl;

        if ( !updateTrainingSet( requiredLeafFARate, tempLeafFARate ) )
        // updateTrainingSet() picks numPos + numNeg samples from the positive/negative sets
        // into the CvCascadeImageReader imgReader. I suspect a lot of time is spent here,
        // because every candidate sample is run through predict(): the weak classifiers the
        // current strong classifier has built so far decide whether the sample still counts
        // as positive, and only samples judged positive are added to the training set (for
        // the first stage everything passes by default). So when i is large, filling the
        // negative part of the training set is especially slow. The integral images etc.
        // are also computed here and stored in the evaluator's img.
        {
            // not enough samples, stop training
            cout << "Train dataset for temp stage can not be filled. "
                    "Branch training terminated." << endl; // a training stop condition
            break;
        }
        if( tempLeafFARate <= requiredLeafFARate )
        {
            cout << "Required leaf false alarm rate achieved. "
                    "Branch training terminated." << endl;
            break;
        }
        if( (tempLeafFARate <= acceptanceRatioBreakValue) && (acceptanceRatioBreakValue >= 0) )
        {
            cout << "The required acceptanceRatio for the model has been reached to avoid overfitting of trainingdata. "
                    "Branch training terminated." << endl;
            break;
        }

        Ptr<CvCascadeBoost> tempStage = makePtr<CvCascadeBoost>();
        bool isStageTrained = tempStage->train( featureEvaluator,
                                                curNumSamples,
                                                _precalcValBufSize, _precalcIdxBufSize,
                                                *stageParams ); // train one strong classifier
        cout << "END>" << endl;

        if(!isStageTrained)
            break;

        stageClassifiers.push_back( tempStage ); // add the trained strong classifier to the container

        // save params
        if( i == 0)
        {
            std::string paramsFilename = dirName + CC_PARAMS_FILENAME;
            FileStorage fs( paramsFilename, FileStorage::WRITE);
            if ( !fs.isOpened() )
            {
                cout << "Parameters can not be written, because file " << paramsFilename
                     << " can not be opened." << endl;
                return false;
            }
            fs << FileStorage::getDefaultObjectName(paramsFilename) << "{";
            writeParams( fs );
            fs << "}";
        }
        // the following code saves the data
        // save current stage
        char buf[10];
        sprintf(buf, "%s%d", "stage", i );
        string stageFilename = dirName + buf + ".xml";
        FileStorage fs( stageFilename, FileStorage::WRITE );
        if ( !fs.isOpened() )
        {
            cout << "Current stage can not be written, because file " << stageFilename
                 << " can not be opened." << endl;
            return false;
        }
        fs << FileStorage::getDefaultObjectName(stageFilename) << "{";
        tempStage->write( fs, Mat() );
        fs << "}";

        // Output training time up till now
        float seconds = float( clock () - begin_time ) / CLOCKS_PER_SEC;
        int days = int(seconds) / 60 / 60 / 24;
        int hours = (int(seconds) / 60 / 60) % 24;
        int minutes = (int(seconds) / 60) % 60;
        int seconds_left = int(seconds) % 60;
        cout << "Training until now has taken " << days << " days " << hours << " hours "
             << minutes << " minutes " << seconds_left << " seconds." << endl;
    }
    if(stageClassifiers.size() == 0)
    {
        cout << "Cascade classifier can't be trained. Check the used training parameters." << endl;
        return false;
    }
    save( dirName + CC_CASCADE_FILENAME, baseFormatSave );
    return true;
}
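To make the requiredLeafFARate figure quoted above concrete, here is a minimal standalone computation under assumed default parameters (maxFalseAlarm = 0.5, numStages = 20, max_depth = 1); it reproduces the 0.0000009536743 value:

#include <cmath>
#include <cstdio>

int main()
{
    const double maxFalseAlarm = 0.5; // assumed default per-stage false-alarm rate
    const int    numStages     = 20;  // assumed default
    const int    max_depth     = 1;   // assumed: depth-1 trees (stumps)
    // requiredLeafFARate = maxFalseAlarm^numStages / max_depth
    const double requiredLeafFARate =
        std::pow(maxFalseAlarm, (double)numStages) / (double)max_depth;
    std::printf("requiredLeafFARate = %.13f\n", requiredLeafFARate); // prints 0.0000009536743
    return 0;
}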
Step into bool isStageTrained = tempStage->train(), which trains a strong classifier:
bool CvCascadeBoost::train( const CvFeatureEvaluator* _featureEvaluator,
int _numSamples,
int _precalcValBufSize, int _precalcIdxBufSize,
const CvCascadeBoostParams& _params )
{
bool isTrained = false;
CV_Assert( !data );
clear();
data = new CvCascadeBoostTrainData( _featureEvaluator, _numSamples,
_precalcValBufSize, _precalcIdxBufSize, _params ); // set_data(); as far as I can tell so far, this is where preCalculate() is called to compute all the feature values
CvMemStorage *storage = cvCreateMemStorage();
//create a new empty sequence (the weak classifiers) that will reside in the specified storage
weak = cvCreateSeq( 0, sizeof(CvSeq), sizeof(CvBoostTree*), storage );
storage = 0;
set_params( _params );
if ( (_params.boost_type == LOGIT) || (_params.boost_type == GENTLE) )
data->do_responses_copy();
update_weights( 0 ); // initialize the weights
cout << "+----+---------+---------+" << endl;
cout << "| N | HR | FA |" << endl;
cout << "+----+---------+---------+" << endl;
do
{
CvCascadeBoostTree* tree = new CvCascadeBoostTree;
if( !tree->train( data, subsample_mask, this ) )//train a weak classifier, one decision tree
{
delete tree;
break;
}
cvSeqPush( weak, &tree );//add the weak classifier to the strong classifier
update_weights( tree );//update the weights
trim_weights();
if( cvCountNonZero(subsample_mask) == 0 )
break;
}
while( !isErrDesired() && (weak->total < params.weak_count) );//training continues while the number of weak classifiers is below the parameter value (100) and the false-alarm rate is not yet below 0.5
if(weak->total > 0)
{
data->is_classifier = true;
data->free_train_data();
isTrained = true;
}
else
clear();
return isTrained;
}
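isErrDesired() is not expanded in this post. As a rough illustration only (toy numbers, all assumed; this is not the OpenCV implementation), the idea it evaluates is: sum the weak trees' responses per sample, choose the stage threshold so that at least minHitRate of the positives pass, then measure how many negatives also pass (the false-alarm rate):

#include <algorithm>
#include <cstdio>
#include <vector>

int main()
{
    // assumed toy scores: each value is the sum of all weak-tree responses for one sample
    std::vector<double> posScores = { 2.1, 1.7, 0.9, 1.4, -0.2 };
    std::vector<double> negScores = { -1.5, 0.3, 1.0, -0.7, -2.0 };
    const double minHitRate = 0.8; // assumed

    // pick the stage threshold so that at least minHitRate of the positives pass
    std::vector<double> sorted = posScores;
    std::sort(sorted.begin(), sorted.end());
    size_t cut = (size_t)((1.0 - minHitRate) * sorted.size() + 0.5); // lowest positives dropped
    double stageThreshold = sorted[cut];

    int falseAlarms = 0;
    for (double s : negScores)
        falseAlarms += (s >= stageThreshold); // negatives the stage wrongly accepts
    std::printf("threshold = %.2f, false-alarm rate = %.2f\n",
                stageThreshold, (double)falseAlarms / negScores.size());
    return 0;
}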
Step into tree->train( data, subsample_mask, this ), which trains a weak classifier, i.e. one decision tree:
CvBoostTree::train( CvDTreeTrainData* _train_data,
const CvMat* _subsample_idx, CvBoost* _ensemble )
{
clear();
ensemble = _ensemble;
data = _train_data;
data->shared = true;
return do_train( _subsample_idx );//what is passed in is _subsample_idx, the sub-sample indices
}
Step into do_train(); here the weak classifier is obtained by growing a decision tree (see the treatment of decision trees in Li Hang, Statistical Learning Methods).
bool CvDTree::do_train( const CvMat* _subsample_idx )//train a weak classifier
{
//1) For each feature f, compute the feature values of all training samples and sort them.
// Scan the sorted table once; for each element compute four values:
// t1: total weight of all face samples;
// t0: total weight of all non-face samples;
// s1: weight of the face samples before this element;
// s0: weight of the non-face samples before this element.
//2) This yields the classification error for each element.
// The element with the smallest error in the table is taken as the optimal threshold; with that
// threshold, our first optimal weak classifier is born.
//The weak-classifier search described above is the one commonly found online, but
//opencv_traincascade actually uses a decision tree, as follows.
bool result = false;
CV_FUNCNAME( "CvDTree::do_train" );
__BEGIN__;
root = data->subsample_data( _subsample_idx );//the sub-sampled data
CV_CALL( try_split_node(root));//try to split the node, growing the tree
if( root->split )
{
CV_Assert( root->left );
CV_Assert( root->right );
if( data->params.cv_folds > 0 )
CV_CALL( prune_cv() );//prune the grown tree
if( !data->shared )
data->free_train_data();
result = true;
}
__END__;
return result;
}
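The comment at the top of do_train() describes the classic single-feature threshold search. A self-contained toy version of that procedure (illustrative data; this is not the actual CvDTree code, which works on presorted feature indices) might look like this:

// Classic weak-classifier threshold search from the comment above:
// for each candidate threshold, error = min(s1 + (t0 - s0), s0 + (t1 - s1)),
// where t1/t0 are the total face/non-face weights and s1/s0 are the
// face/non-face weights below the current element. Toy data assumed.
#include <algorithm>
#include <cstdio>
#include <vector>

struct Sample { double feature, weight; int label; }; // label: 1 face, 0 non-face

int main()
{
    std::vector<Sample> s = { {0.2, 0.25, 0}, {0.5, 0.25, 1},
                              {0.7, 0.25, 1}, {0.9, 0.25, 0} };
    std::sort(s.begin(), s.end(),
              [](const Sample& a, const Sample& b){ return a.feature < b.feature; });

    double t1 = 0, t0 = 0;                 // total face / non-face weight
    for (const Sample& x : s) (x.label ? t1 : t0) += x.weight;

    double s1 = 0, s0 = 0, bestErr = 1e9, bestThresh = 0;
    for (const Sample& x : s)
    {
        // error if everything below the threshold is called non-face, vs. called face
        double err = std::min(s1 + (t0 - s0), s0 + (t1 - s1));
        if (err < bestErr) { bestErr = err; bestThresh = x.feature; }
        (x.label ? s1 : s0) += x.weight;
    }
    std::printf("best threshold = %.2f, error = %.2f\n", bestThresh, bestErr);
    return 0;
}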
Step into CV_CALL( try_split_node(root) ), which tries to split nodes and grow the tree:
void CvDTree::try_split_node( CvDTreeNode* node )//split a node, growing the tree
{
CvDTreeSplit* best_split = 0;
int i, n = node->sample_count, vi;
bool can_split = true;
double quality_scale;
calc_node_value( node );//compute the node's value (a node of the decision tree): the root node's value and the best feature
if( node->sample_count <= data->params.min_sample_count ||
node->depth >= data->params.max_depth )
can_split = false;//stop growing the tree under these conditions
if( can_split && data->is_classifier )
{
// check if we have a "pure" node,
// we assume that cls_count is filled by calc_node_value()
int* cls_count = data->counts->data.i;
int nz = 0, m = data->get_num_classes();
for( i = 0; i < m; i++ )
nz += cls_count[i] != 0;
if( nz == 1 ) // there is only one class
can_split = false;
}
else if( can_split )
{
if( sqrt(node->node_risk)/n < data->params.regression_accuracy )
can_split = false;
}
if( can_split )
{
best_split = find_best_split(node);
// TODO: check the split quality ...
node->split = best_split;
}
if( !can_split || !best_split )
{
data->free_node_data(node);
return;
}
quality_scale = calc_node_dir( node );
if( data->params.use_surrogates )
{
// find all the surrogate splits
// and sort them by their similarity to the primary one
for( vi = 0; vi < data->var_count; vi++ )
{
CvDTreeSplit* split;
int ci = data->get_var_type(vi);
if( vi == best_split->var_idx )
continue;
if( ci >= 0 )
split = find_surrogate_split_cat( node, vi );
else
split = find_surrogate_split_ord( node, vi );
if( split )
{
// insert the split
CvDTreeSplit* prev_split = node->split;
split->quality = (float)(split->quality*quality_scale);
while( prev_split->next &&
prev_split->next->quality > split->quality )
prev_split = prev_split->next;
split->next = prev_split->next;
prev_split->next = split;
}
}
}
split_node_data( node );
try_split_node( node->left );//left child
try_split_node( node->right );//right child
}
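find_best_split() is likewise not expanded here. As a hedged sketch of the underlying idea for an ordered variable in a regression tree (the case Gentle AdaBoost and LogitBoost use): scan the sorted values and pick the cut maximizing sum_L^2/w_L + sum_R^2/w_R, which is equivalent to minimizing the total weighted squared error. Toy data assumed; this is not the CvDTree code:

#include <algorithm>
#include <cstdio>
#include <vector>

struct S { double x, y, w; }; // feature value, response, weight

int main()
{
    std::vector<S> s = { {0.1, -1, 0.25}, {0.4, -1, 0.25},
                         {0.6,  1, 0.25}, {0.9,  1, 0.25} };
    std::sort(s.begin(), s.end(), [](const S& a, const S& b){ return a.x < b.x; });

    double wTot = 0, sumTot = 0;
    for (const S& e : s) { wTot += e.w; sumTot += e.w * e.y; }

    double wL = 0, sumL = 0, bestQ = -1e9; int bestI = 0;
    for (int i = 0; i + 1 < (int)s.size(); i++)
    {
        wL += s[i].w; sumL += s[i].w * s[i].y;
        double wR = wTot - wL, sumR = sumTot - sumL;
        double q = sumL*sumL/wL + sumR*sumR/wR;   // split quality
        if (q > bestQ) { bestQ = q; bestI = i; }
    }
    std::printf("best split between x=%.1f and x=%.1f (quality %.3f)\n",
                s[bestI].x, s[bestI+1].x, bestQ);
    return 0;
}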
Step into calc_node_value( node ), which computes the node value (the root node's value and the basis for the best feature):
void CvDTree::calc_node_value( CvDTreeNode* node )//produces a node of the decision tree (the weak classifier); the root node corresponds to the optimal feature
{
int i, j, k, n = node->sample_count, cv_n = data->params.cv_folds;
int m = data->get_num_classes();
int base_size = data->is_classifier ? m*cv_n*sizeof(int) : 2*cv_n*sizeof(double)+cv_n*sizeof(int);
int ext_size = n*(sizeof(int) + (data->is_classifier ? sizeof(int) : sizeof(int)+sizeof(float)));
cv::AutoBuffer<uchar> inn_buf(base_size + ext_size);
uchar* base_buf = (uchar*)inn_buf;
uchar* ext_buf = base_buf + base_size;
int* cv_labels_buf = (int*)ext_buf;
const int* cv_labels = data->get_cv_labels(node, cv_labels_buf);
if( data->is_classifier )
{
// in case of classification tree:
// * node value is the label of the class that has the largest weight in the node.
// * node risk is the weighted number of misclassified samples,
// * j-th cross-validation fold value and risk are calculated as above,
// but using the samples with cv_labels(*)!=j.
// * j-th cross-validation fold error is calculated as the weighted number of
// misclassified samples with cv_labels(*)==j.
// compute the number of instances of each class
int* cls_count = data->counts->data.i;
int* responses_buf = cv_labels_buf + n;
const int* responses = data->get_class_labels(node, responses_buf);
int* cv_cls_count = (int*)base_buf;
double max_val = -1, total_weight = 0;
int max_k = -1;
double* priors = data->priors_mult->data.db;
for( k = 0; k < m; k++ )
cls_count[k] = 0;
if( cv_n == 0 )
{
for( i = 0; i < n; i++ )
cls_count[responses[i]]++;
}
else
{
for( j = 0; j < cv_n; j++ )
for( k = 0; k < m; k++ )
cv_cls_count[j*m + k] = 0;
for( i = 0; i < n; i++ )
{
j = cv_labels[i]; k = responses[i];
cv_cls_count[j*m + k]++;
}
for( j = 0; j < cv_n; j++ )
for( k = 0; k < m; k++ )
cls_count[k] += cv_cls_count[j*m + k];
}
if( data->have_priors && node->parent == 0 )
{
// compute priors_mult from priors, take the sample ratio into account.
double sum = 0;
for( k = 0; k < m; k++ )
{
int n_k = cls_count[k];
priors[k] = data->priors->data.db[k]*(n_k ? 1./n_k : 0.);
sum += priors[k];
}
sum = 1./sum;
for( k = 0; k < m; k++ )
priors[k] *= sum;
}
for( k = 0; k < m; k++ )
{
double val = cls_count[k]*priors[k];
total_weight += val;
if( max_val < val )
{
max_val = val;
max_k = k;
}
}
node->class_idx = max_k;
node->value = data->cat_map->data.i[
data->cat_ofs->data.i[data->cat_var_count] + max_k];
node->node_risk = total_weight - max_val;
for( j = 0; j < cv_n; j++ )
{
double sum_k = 0, sum = 0, max_val_k = 0;
max_val = -1; max_k = -1;
for( k = 0; k < m; k++ )
{
double w = priors[k];
double val_k = cv_cls_count[j*m + k]*w;
double val = cls_count[k]*w - val_k;
sum_k += val_k;
sum += val;
if( max_val < val )
{
max_val = val;
max_val_k = val_k;
max_k = k;
}
}
node->cv_Tn[j] = INT_MAX;
node->cv_node_risk[j] = sum - max_val;
node->cv_node_error[j] = sum_k - max_val_k;
}
}
else
{
// in case of regression tree:
// * node value is 1/n*sum_i(Y_i), where Y_i is i-th response,
// n is the number of samples in the node.
// * node risk is the sum of squared errors: sum_i((Y_i - <node_value>)^2);
// * j-th cross-validation fold value and risk are calculated as above,
// but using the samples with cv_labels(*)!=j.
// * j-th cross-validation fold error is calculated
// using samples with cv_labels(*)==j as the test subset:
// error_j = sum_(i,cv_labels(i)==j)((Y_i - <node_value_j>)^2),
// where node_value_j is the node value calculated
// as described in the previous bullet, and summation is done
// over the samples with cv_labels(*)==j.
double sum = 0, sum2 = 0;
float* values_buf = (float*)(cv_labels_buf + n);
int* sample_indices_buf = (int*)(values_buf + n);
const float* values = data->get_ord_responses(node, values_buf, sample_indices_buf);
double *cv_sum = 0, *cv_sum2 = 0;
int* cv_count = 0;
if( cv_n == 0 )
{
for( i = 0; i < n; i++ )
{
double t = values[i];
sum += t;
sum2 += t*t;
}
}
else
{
cv_sum = (double*)base_buf;
cv_sum2 = cv_sum + cv_n;
cv_count = (int*)(cv_sum2 + cv_n);
for( j = 0; j < cv_n; j++ )
{
cv_sum[j] = cv_sum2[j] = 0.;
cv_count[j] = 0;
}
for( i = 0; i < n; i++ )
{
j = cv_labels[i];
double t = values[i];
double s = cv_sum[j] + t;
double s2 = cv_sum2[j] + t*t;
int nc = cv_count[j] + 1;
cv_sum[j] = s;
cv_sum2[j] = s2;
cv_count[j] = nc;
}
for( j = 0; j < cv_n; j++ )
{
sum += cv_sum[j];
sum2 += cv_sum2[j];
}
}
node->node_risk = sum2 - (sum/n)*sum;//total squared error
node->value = sum/n;
for( j = 0; j < cv_n; j++ )
{
double s = cv_sum[j], si = sum - s;
double s2 = cv_sum2[j], s2i = sum2 - s2;
int c = cv_count[j], ci = n - c;
double r = si/MAX(ci,1);
node->cv_node_risk[j] = s2i - r*r*ci;
node->cv_node_error[j] = s2 - 2*r*s + c*r*r;
node->cv_Tn[j] = INT_MAX;
}
}
}
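A quick numeric check of the regression branch: for a toy node with responses Y = {2, 4, 6} (values assumed purely for illustration), sum = 12 and sum2 = 4 + 16 + 36 = 56, so node->value = sum/n = 4 and node->node_risk = sum2 - (sum/n)*sum = 56 - 48 = 8, which indeed equals the sum of squared errors (2-4)^2 + (4-4)^2 + (6-4)^2 = 8.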
After the decision tree has been generated, control arrives back at:
do
{
CvCascadeBoostTree* tree = new CvCascadeBoostTree;
if( !tree->train( data, subsample_mask, this ) )//train a weak classifier, one decision tree
{
delete tree;
break;
}
cvSeqPush( weak, &tree );//add the weak classifier to the strong classifier
update_weights( tree );//update the weights
trim_weights();
if( cvCountNonZero(subsample_mask) == 0 )
break;
}
while( !isErrDesired() && (weak->total < params.weak_count) )
Step into the weight update, update_weights( tree ). The update implements four AdaBoost variants: Discrete AdaBoost, Real AdaBoost, Gentle AdaBoost, and LogitBoost. The algorithms are described in detail in 《基於子空間的人臉識別》 (Subspace-Based Face Recognition); the descriptions found online are not as thorough.
void CvCascadeBoost::update_weights( CvBoostTree* tree )
{
int n = data->sample_count;//e.g. 1000 training samples
double sumW = 0.;
int step = 0;
float* fdata = 0;
int *sampleIdxBuf;
const int* sampleIdx = 0;
int inn_buf_size = ((params.boost_type == LOGIT) || (params.boost_type == GENTLE) ? n*sizeof(int) : 0) +
( !tree ? n*sizeof(int) : 0 );
cv::AutoBuffer<uchar> inn_buf(inn_buf_size);
uchar* cur_inn_buf_pos = (uchar*)inn_buf;
if ( (params.boost_type == LOGIT) || (params.boost_type == GENTLE) )
{
step = CV_IS_MAT_CONT(data->responses_copy->type) ?
1 : data->responses_copy->step / CV_ELEM_SIZE(data->responses_copy->type);
fdata = data->responses_copy->data.fl;
sampleIdxBuf = (int*)cur_inn_buf_pos; cur_inn_buf_pos = (uchar*)(sampleIdxBuf + n);
sampleIdx = data->get_sample_indices( data->data_root, sampleIdxBuf );
}
CvMat* buf = data->buf;
size_t length_buf_row = data->get_length_subbuf();
if( !tree ) // before training the first tree, initialize weights and other parameters
{
int* classLabelsBuf = (int*)cur_inn_buf_pos; cur_inn_buf_pos = (uchar*)(classLabelsBuf + n);
const int* classLabels = data->get_class_labels(data->data_root, classLabelsBuf);
// in case of logitboost and gentle adaboost each weak tree is a regression tree,
// so we need to convert class labels to floating-point values
double w0 = 1./n;
double p[2] = { 1, 1 };
cvReleaseMat( &orig_response );
cvReleaseMat( &sum_response );
cvReleaseMat( &weak_eval );
cvReleaseMat( &subsample_mask );
cvReleaseMat( &weights );
orig_response = cvCreateMat( 1, n, CV_32S );
weak_eval = cvCreateMat( 1, n, CV_64F );//a 1 x n matrix, e.g. 1 x 1000
subsample_mask = cvCreateMat( 1, n, CV_8U );
weights = cvCreateMat( 1, n, CV_64F );//a 1 x n matrix
subtree_weights = cvCreateMat( 1, n + 2, CV_64F );
if (data->is_buf_16u)
{
unsigned short* labels = (unsigned short*)(buf->data.s + data->data_root->buf_idx*length_buf_row +
data->data_root->offset + (size_t)(data->work_var_count-1)*data->sample_count);
for( int i = 0; i < n; i++ )
{
// save original categorical responses {0,1}, convert them to {-1,1}
orig_response->data.i[i] = classLabels[i]*2 - 1;
// make all the samples active at start.
// later, in trim_weights(), deactivate/reactivate some again if needed
subsample_mask->data.ptr[i] = (uchar)1;
// make all the initial weights the same.
weights->data.db[i] = w0*p[classLabels[i]];//e.g. weights->data.db[i] = 0.001 for n = 1000
// set the labels to find (from within weak tree learning proc)
// the particular sample weight, and where to store the response.
labels[i] = (unsigned short)i;
}
}
else
{
int* labels = buf->data.i + data->data_root->buf_idx*length_buf_row +
data->data_root->offset + (size_t)(data->work_var_count-1)*data->sample_count;
for( int i = 0; i < n; i++ )
{
// save original categorical responses {0,1}, convert them to {-1,1}
orig_response->data.i[i] = classLabels[i]*2 - 1;
subsample_mask->data.ptr[i] = (uchar)1;
weights->data.db[i] = w0*p[classLabels[i]];
labels[i] = i;
}
}
if( params.boost_type == LOGIT )
{
sum_response = cvCreateMat( 1, n, CV_64F );
for( int i = 0; i < n; i++ )
{
sum_response->data.db[i] = 0;
fdata[sampleIdx[i]*step] = orig_response->data.i[i] > 0 ? 2.f : -2.f;
}
// in case of logitboost each weak tree is a regression tree.
// the target function values are recalculated for each of the trees
data->is_classifier = false;
}
else if( params.boost_type == GENTLE )
{
for( int i = 0; i < n; i++ )
fdata[sampleIdx[i]*step] = (float)orig_response->data.i[i];
data->is_classifier = false;
}
}
else
{
// at this moment, for all the samples that participated in the training of the most
// recent weak classifier we know the responses. For other samples we need to compute them
if( have_subsample )
{
// invert the subsample mask
cvXorS( subsample_mask, cvScalar(1.), subsample_mask );
// run tree through all the non-processed samples
for( int i = 0; i < n; i++ )
if( subsample_mask->data.ptr[i] )
{
weak_eval->data.db[i] = ((CvCascadeBoostTree*)tree)->predict( i )->value;//presumably iterates over all the (not yet processed) training samples
//to obtain the weak classifier's classification result
}
}
// now update weights and other parameters for each type of boosting
if( params.boost_type == DISCRETE )
{
// Discrete AdaBoost:
// weak_eval[i] (=f(x_i)) is in {-1,1} -- the form and range of the weak classifier's output
// err = sum(w_i*(f(x_i) != y_i))/sum(w_i); here y_i belongs to the training pair (x_i, y_i): x_i is the Haar-feature representation, y_i the sample's label
// C = log((1-err)/err) -- the weighting coefficient
// w_i *= exp(C*(f(x_i) != y_i)) -- the weight update
double C, err = 0.;
double scale[] = { 1., 0. };
for( int i = 0; i < n; i++ )
{
double w = weights->data.db[i];//after update_weights(0), .db[i] is 0.001
sumW += w;//understood here as the running sum of the weights
err += w*(weak_eval->data.db[i] != orig_response->data.i[i]);
//weak_eval->data.db[i] is the classifier output f(x_i), obtained by running all samples through the built decision tree.
//orig_response->data.i[i] is the training sample's label in {1, -1} (positive/negative samples).
//If they are equal, the detection matches the ground truth (correct); if not, it is a misdetection, and w*(...) accumulates the weighted error
}
if( sumW != 0 )
err /= sumW;
C = err = -logRatio( err );
scale[1] = exp(err);
sumW = 0;
for( int i = 0; i < n; i++ )
{
double w = weights->data.db[i]*
scale[weak_eval->data.db[i] != orig_response->data.i[i]];//misclassified samples (index 1) are scaled by scale[1] = exp(C) > 1, so their weight grows;
//correctly classified samples (index 0) keep scale[0] = 1, so their weight is unchanged before renormalization
sumW += w;
weights->data.db[i] = w;//權值更新
}
tree->scale( C );//scale the tree's node values
}
else if( params.boost_type == REAL )
{
// Real AdaBoost: the AdaBoost variant with real-valued output
// weak_eval[i] = f(x_i) = 0.5*log(p(x_i)/(1-p(x_i))), p(x_i)=P(y=1|x_i); the weak classifier's form
// w_i *= exp(-y_i*f(x_i)) -- the weight update
for( int i = 0; i < n; i++ )
weak_eval->data.db[i] *= -orig_response->data.i[i];
cvExp( weak_eval, weak_eval );
for( int i = 0; i < n; i++ )
{
double w = weights->data.db[i]*weak_eval->data.db[i];
sumW += w;
weights->data.db[i] = w;
}
}
else if( params.boost_type == LOGIT )
{
// LogitBoost:
// weak_eval[i] = f(x_i) in [-z_max,z_max]
// sum_response = F(x_i).
// F(x_i) += 0.5*f(x_i)
// p(x_i) = exp(F(x_i))/(exp(F(x_i)) + exp(-F(x_i))=1/(1+exp(-2*F(x_i)))
// reuse weak_eval: weak_eval[i] <- p(x_i)
// w_i = p(x_i)*1(1 - p(x_i))
// z_i = ((y_i+1)/2 - p(x_i))/(p(x_i)*(1 - p(x_i)))
// store z_i to the data->data_root as the new target responses
const double lbWeightThresh = FLT_EPSILON;
const double lbZMax = 10.;
for( int i = 0; i < n; i++ )
{
double s = sum_response->data.db[i] + 0.5*weak_eval->data.db[i];
sum_response->data.db[i] = s;
weak_eval->data.db[i] = -2*s;
}
cvExp( weak_eval, weak_eval );
for( int i = 0; i < n; i++ )
{
double p = 1./(1. + weak_eval->data.db[i]);
double w = p*(1 - p), z;
w = MAX( w, lbWeightThresh );
weights->data.db[i] = w;
sumW += w;
if( orig_response->data.i[i] > 0 )
{
z = 1./p;
fdata[sampleIdx[i]*step] = (float)min(z, lbZMax);
}
else
{
z = 1./(1-p);
fdata[sampleIdx[i]*step] = (float)-min(z, lbZMax);
}
}
}
else
{
// Gentle AdaBoost: the AdaBoost variant based on an additive regression model
// weak_eval[i] = f(x_i) in [-1,1]; the weak classifier comes from a weighted least-squares regression of y_i on x_i
// w_i *= exp(-y_i*f(x_i))
assert( params.boost_type == GENTLE );//if false, print an error message and call abort to terminate the program
for( int i = 0; i < n; i++ )
weak_eval->data.db[i] *= -orig_response->data.i[i];//this loop forms -y_i*f(x_i) from the weak classifier's results
//where weak_eval holds the weak classifier's responses
cvExp( weak_eval, weak_eval );
for( int i = 0; i < n; i++ )
{
double w = weights->data.db[i] * weak_eval->data.db[i];
weights->data.db[i] = w;
sumW += w;
}
}
}
// renormalize the weights so that they sum to 1
if( sumW > FLT_EPSILON )
{
sumW = 1./sumW;
for( int i = 0; i < n; ++i )
weights->data.db[i] *= sumW;
}
double original_ = *(weights->data.db);//weights->data.db[0] = 0.001, the initial per-sample weight 1/1000
}
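To close, here is the Discrete AdaBoost branch in isolation, as a minimal self-contained sketch (toy labels and weak-classifier responses assumed; this is not the OpenCV code):

// Discrete AdaBoost weight update in isolation:
//   err = sum(w_i * [f(x_i) != y_i]) / sum(w_i)
//   C   = log((1 - err) / err)
//   w_i *= exp(C * [f(x_i) != y_i]), then renormalize.
#include <cmath>
#include <cstdio>

int main()
{
    const int n = 4;
    int    y[n] = {  1,  1, -1, -1 };  // true labels (assumed)
    int    f[n] = {  1, -1, -1, -1 };  // weak-classifier outputs (assumed)
    double w[n] = { 0.25, 0.25, 0.25, 0.25 };

    double err = 0;
    for (int i = 0; i < n; i++) err += w[i] * (f[i] != y[i]);
    double C = std::log((1 - err) / err);  // here err = 0.25, so C = log(3)

    double sumW = 0;
    for (int i = 0; i < n; i++) { w[i] *= std::exp(C * (f[i] != y[i])); sumW += w[i]; }
    for (int i = 0; i < n; i++) w[i] /= sumW; // renormalize: misclassified sample ends at 0.5

    for (int i = 0; i < n; i++) std::printf("w[%d] = %.3f\n", i, w[i]);
    return 0;
}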