The efficiency of memcpy() depends on the platform.

阿新 • Published: 2019-02-14

First, look at the memcpy() source that ships with Microsoft's development tools (E:\Microsoft Visual Studio 9.0\VC\crt\src):

/***
*memcpy.c - contains memcpy routine
*
*       Copyright (c) Microsoft Corporation. All rights reserved.
*
*Purpose:
*       memcpy() copies a source memory buffer to a destination buffer.
*       Overlapping buffers are not treated specially, so propogation may occur.
*
*******************************************************************************/

#include <cruntime.h>
#include <string.h>

#ifdef _MSC_VER
#pragma function(memcpy)
#endif  /* _MSC_VER */

/***
*memcpy - Copy source buffer to destination buffer
*
*Purpose:
*       memcpy() copies a source memory buffer to a destination memory buffer.
*       This routine does NOT recognize overlapping buffers, and thus can lead
*       to propogation.
*
*       For cases where propogation must be avoided, memmove() must be used.
*
*Entry:
*       void *dst = pointer to destination buffer
*       const void *src = pointer to source buffer
*       size_t count = number of bytes to copy
*
*Exit:
*       Returns a pointer to the destination buffer
*
*Exceptions:
*******************************************************************************/

void * __cdecl memcpy (
        void * dst,
        const void * src,
        size_t count
        )
{
        void * ret = dst;

#if defined (_M_IA64)

        {


        __declspec(dllimport)


        void RtlCopyMemory( void *, const void *, size_t count );

        RtlCopyMemory( dst, src, count );

        }

#else  /* defined (_M_IA64) */
        /*
         * copy from lower addresses to higher addresses
         */
        while (count--) {
                *(char *)dst = *(char *)src;
                dst = (char *)dst + 1;
                src = (char *)src + 1;
        }
#endif  /* defined (_M_IA64) */

        return(ret);
}
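
The header comment in this source warns that overlapping buffers can lead to "propagation". A minimal sketch of what that means, assuming a hypothetical helper byte_copy_forward (a name introduced here for illustration) that repeats the same low-to-high byte loop as the non-IA64 branch:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the same forward byte loop as the quoted source. */
void *byte_copy_forward(void *dst, const void *src, size_t count)
{
    void *ret = dst;

    /* copy from lower addresses to higher addresses */
    while (count--) {
        *(char *)dst = *(const char *)src;
        dst = (char *)dst + 1;
        src = (const char *)src + 1;
    }
    return ret;
}
```

With char buf[] = "abcdef", calling byte_copy_forward(buf + 1, buf, 5) leaves "aaaaaa" in buf, because each byte written is read again on the next iteration. memmove(buf + 1, buf, 5) on the same input leaves "aabcde", which is why the comment points to memmove() for overlapping buffers.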

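On 16/32-bit systems, copying one byte at a time wastes a great deal of CPU time. These CPUs generally expect half-word or word alignment and move 16 or 32 bits per load or store, so a byte-wise loop uses only a fraction of each access, and byte accesses at odd addresses are especially inefficient. For a CPU with 4-byte alignment such as ARM, copying one word at a time, as in the version below, is therefore far more efficient, and the gain over the byte-at-a-time loop can exceed 4x: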

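#include <stdint.h>

/* Assumes dest and src are both 4-byte aligned. */
void my_memcpy(void * dest, const void * src, unsigned int n)
{
    unsigned int i = 0;
    uint32_t * Dest = (uint32_t *)dest;   /* exactly 4 bytes wide, unlike long */
    const uint32_t * Src = (const uint32_t *)src;

    /* copy one 32-bit word per iteration */
    for (i = 0; i < (n >> 2); i++) {
        Dest[i] = Src[i];
    }
    /* copy the remaining n % 4 bytes one at a time */
    for (i = n & ~3u; i < n; i++) {
        ((unsigned char *)dest)[i] = ((const unsigned char *)src)[i];
    }
}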

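Of course, if the buffers cannot be guaranteed to be 4-byte aligned, the routine above can be extended with an alignment check so that the fast path is used whenever possible: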

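#include <stdint.h>
#include <string.h>

void my_memcpy(void * dest, const void * src, unsigned int n)
{
    unsigned int i = 0;
    unsigned int done = n & ~3u;          /* bytes covered by whole words */
    uint32_t * Dest = (uint32_t *)dest;
    const uint32_t * Src = (const uint32_t *)src;

    if (((uintptr_t)src % 4 == 0) && ((uintptr_t)dest % 4 == 0)) {
        /* both pointers are 4-byte aligned: copy word by word */
        for (i = 0; i < (n >> 2); i++) {
            Dest[i] = Src[i];
        }
        /* copy the remaining n % 4 bytes */
        memcpy((char *)dest + done, (const char *)src + done, n - done);
    } else {
        /* unaligned buffers: fall back to the library routine */
        memcpy(dest, src, n);
    }
}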
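
The check above gives up on word copies as soon as either pointer is misaligned. One possible refinement, sketched here under the assumption that both pointers often share the same offset within a 4-byte word, is to copy a few leading bytes until the pointers become aligned and then switch to words. The name my_memcpy_aligned is hypothetical, and the word access through a cast pointer is the same type-punning style the routines above rely on:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: byte head, word body, byte tail. */
void *my_memcpy_aligned(void *dest, const void *src, size_t n)
{
    unsigned char *d = (unsigned char *)dest;
    const unsigned char *s = (const unsigned char *)src;

    /* Word copies are only possible when d and s have the same
       offset within a 4-byte word. */
    if (((uintptr_t)d & 3u) == ((uintptr_t)s & 3u)) {
        /* head: byte copies until d (and therefore s) is aligned */
        while (n > 0 && ((uintptr_t)d & 3u) != 0) {
            *d++ = *s++;
            n--;
        }
        /* body: one 32-bit word per iteration */
        for (; n >= 4; n -= 4) {
            *(uint32_t *)d = *(const uint32_t *)s;
            d += 4;
            s += 4;
        }
    }
    /* tail, or the whole buffer when the offsets differ */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dest;
}
```

When the two offsets differ, this sketch simply copies everything byte by byte; production library routines typically handle that case with more elaborate techniques.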

This kind of tuning belongs only in performance-critical code. Everywhere else, use the standard library function; portability and maintainability matter too.
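
Whether a hand-written copy is worth its cost on a given platform is best settled by measuring. A rough timing sketch using the standard clock() function follows; copy_fn, byte_copy and time_copy are hypothetical names introduced here, and the numbers it reports depend on the compiler, optimization level and hardware:

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical typedef: the signature shared by the routines under test. */
typedef void *(*copy_fn)(void *, const void *, size_t);

/* Byte-at-a-time baseline, the same loop as the quoted library source. */
void *byte_copy(void *dst, const void *src, size_t count)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    while (count--) {
        *d++ = *s++;
    }
    return dst;
}

/* Returns the CPU seconds spent calling fn(dst, src, n) reps times.
   Assumes n > 0. */
double time_copy(copy_fn fn, void *dst, const void *src, size_t n, int reps)
{
    volatile unsigned char sink = 0;  /* discourages removing the copies */
    clock_t start = clock();

    for (int i = 0; i < reps; i++) {
        fn(dst, src, n);
        sink ^= ((unsigned char *)dst)[n - 1];
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_copy(byte_copy, dst, src, n, reps) and time_copy(memcpy, dst, src, n, reps) with the same buffers gives two figures to compare. If the library routine is already as fast or faster, there is no reason to replace it.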
