
Intel: Spectre & Meltdown Side-Channel Attacks (5): DRAM Address Mapping

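  The previous post introduced row hammer. In theory it is elegant, but in practice it runs into an awkward problem. The smallest storage unit in DRAM is the cell (a capacitor: charged means 1, discharged means 0). Cells laid out horizontally form a row and cells laid out vertically form a column; rows and columns make up a bank, several banks make up a chip, and above that come the rank, the DIMM, and the channel. The full containment hierarchy is: channel -> DIMM -> rank -> chip -> bank -> row/column -> cell. So how does a physical address map to a specific cell? Put differently: suppose we know the virtual address of a bit whose flip enables privilege escalation. Both Linux and Windows expose system APIs that translate it to a physical address, but how do we then map that physical address to a bank, row, and column? Without that mapping, precise hammering is impossible. Intel has not published its mapping algorithm, so how can the mapping from physical addresses to DRAM locations be inferred by reverse engineering?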

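  1. Researchers at Baidu published an algorithm, described at https://cloud.tencent.com/developer/article/1620354. However, I could not find the source code of the tool (which they call DRAMDig) or a download link, so I could not verify its results. To make it easier to follow, I summarized the algorithm in a mind map:

  [Figure: mind map of the DRAMDig algorithm]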

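  The DRAMDig test results are shown below. They make it clear that different CPU models and memory sizes have different row, column, and bank functions, so the situation is fairly complicated.

  [Figure: DRAMDig test results]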

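  2. Elsewhere, a team ran row hammer tests and inferred from the results how physical addresses map to DRAM. See: http://lackingrhoticity.blogspot.com/2015/05/how-physical-addresses-map-to-rows-and-banks.html

     The corresponding test code: https://github.com/google/rowhammer-test/tree/master/extended_test  The rest of this post explains how the code works and how the mapping is derived from its output.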

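  (1) The overall flow is roughly: allocate 1 GB of memory and set every bit to 1; blindly pick 40 addresses (10 sets of 4) and hammer them repeatedly; once a flip occurs, find the set of 4 responsible and hammer its addresses in pairs; finally drop the pairing and hammer the full sets again as retries.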

     

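  (2) A new question: the author starts by hammering blindly chosen addresses, so why narrow down to a pair and hammer again after a successful flip? The goal is to pin down the relationship between hammered physical addresses and flipped bits, that is, which addresses cause which cells to flip, so that the address mapping can be reverse engineered more precisely. The code below prints the hammered addresses and the flipped address, and the mapping is then derived from that information: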

if (check(&bit_flip_info)) {
  found = true;
  printf("RESULT PAIR,0x%" PRIx64 ",0x%" PRIx64 ",0x%" PRIx64 ",%i,%i\n",
         get_physical_addr((uintptr_t) addr1),
         get_physical_addr((uintptr_t) addr2),
         get_physical_addr((uintptr_t) bit_flip_info.victim_virtual_addr),
         bit_flip_info.bit_number,
         bit_flip_info.flips_to);
}

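    After 6 hours of running, the author had recorded 22 results. Each line gives the two hammered physical addresses, the victim's physical address, the bit number within the 64-bit word, and the value the bit flipped to: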

RESULT PAIR,0x6ccc1000,0x6cd59000,0x6cd1f680,40,0
RESULT PAIR,0x708f1000,0x70969000,0x7092ef08,40,0
RESULT PAIR,0x1a1d57000,0x1a1ddc000,0x1a1d9b718,63,0
RESULT PAIR,0x1a14de000,0x72367000,0x72321c20,33,0
RESULT PAIR,0x194d63000,0x194cf8000,0x194d27b30,16,0
RESULT PAIR,0x7b664000,0x7b6ed000,0x7b622d30,47,0
RESULT PAIR,0x72366000,0x61503000,0x72321c20,33,0
RESULT PAIR,0x72366000,0x5e9cf000,0x72321c20,33,0
RESULT PAIR,0x193606000,0x193825000,0x193643c10,2,0
RESULT PAIR,0x171417000,0x171236000,0x171272980,44,0
RESULT PAIR,0x17a644000,0x17a865000,0x17a822f00,49,0
RESULT PAIR,0x80af9000,0x17ebaf000,0x80a34310,4,0
RESULT PAIR,0x1961ec000,0x196165000,0x1961abd10,39,0
RESULT PAIR,0x7248f000,0x72515000,0x724c8d88,45,0
RESULT PAIR,0x1716b7000,0x7eb69000,0x1716f1ea0,36,0
RESULT PAIR,0x16f3d6000,0x16f1f6000,0x16f3930b0,47,0
RESULT PAIR,0x72901000,0x177232000,0x1772775a0,41,0
RESULT PAIR,0x772fc000,0x77277000,0x77231830,36,0
RESULT PAIR,0x7bcf3000,0x7bd69000,0x7bd2ef10,33,0
RESULT PAIR,0x7e275000,0x7e456000,0x7e412a30,39,0
RESULT PAIR,0x1730d7000,0x17305d000,0x1730910a8,35,0
RESULT PAIR,0x80afb000,0x78671000,0x80a34310,4,0

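    How can the mapping from physical address to row, column, and bank be derived from these flips? The author's machine had a Sandy Bridge CPU, Ubuntu, and 4 GB of memory. The author first ran decode-dimms to get basic information about the memory: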

Size                                            4096 MB
Banks x Rows x Columns x Bits                   8 x 15 x 10 x 64
Ranks                                           2

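  There are 2 ranks and 8 banks. Each bank contains 2^15 = 32768 rows, and each row holds 2^10 columns * 64 bits = 8 KB. Total capacity = 8 KB per row * 32768 rows * 2 ranks * 8 banks = 4 GB. This command gives a first confirmation of the number of row, column, and bank bits. Combining that with the flipped addresses above, plus a good deal of guessing and trial and error, the author decomposed the addresses as follows: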
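
The capacity arithmetic above can be checked with a few lines of code. This is a minimal sketch; `row_size_bytes` and `dram_capacity_bytes` are hypothetical helper names, not part of decode-dimms or of the test program, and the parameters are simply the numbers from the decode-dimms output.

```cpp
#include <cassert>
#include <cstdint>

// Bytes in one row: 2^col_bits columns, each bits_per_column wide.
// With 10 column bits and 64 bits per column this is 1024 * 8 = 8192 bytes.
uint64_t row_size_bytes(unsigned col_bits, uint64_t bits_per_column) {
  assert(bits_per_column % 8 == 0);
  return (1ULL << col_bits) * (bits_per_column / 8);
}

// Total capacity from the geometry printed by decode-dimms.
// row_bits and col_bits are address widths (15 and 10 in the output above),
// so a bank has 2^row_bits rows.
uint64_t dram_capacity_bytes(uint64_t banks, unsigned row_bits,
                             unsigned col_bits, uint64_t bits_per_column,
                             uint64_t ranks) {
  uint64_t bank_bytes = row_size_bytes(col_bits, bits_per_column) << row_bits;
  return bank_bytes * banks * ranks;
}
```

Calling `dram_capacity_bytes(8, 15, 10, 64, 2)` reproduces the 4 GB figure, which confirms that "15" and "10" are bit widths rather than counts.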

result:
    diff=-39980
    addr=0x06cd1f680 -> row=01101100110100 rank=0 bank=011 col_hi=1101101 channel=0 col_lo=000000 (victim)
    addr=0x06cd59000 -> row=01101100110101 rank=0 bank=011 col_hi=0100000 channel=0 col_lo=000000 (aggressor1)
    addr=0x06ccc1000 -> row=01101100110011 rank=0 bank=011 col_hi=0100000 channel=0 col_lo=000000 (aggressor2)
    fits=True
result:
    diff=-3a0f8
    addr=0x07092ef08 -> row=01110000100100 rank=1 bank=111 col_hi=1011110 channel=0 col_lo=001000 (victim)
    addr=0x070969000 -> row=01110000100101 rank=1 bank=111 col_hi=0100000 channel=0 col_lo=000000 (aggressor1)
    addr=0x0708f1000 -> row=01110000100011 rank=1 bank=111 col_hi=0100000 channel=0 col_lo=000000 (aggressor2)
    fits=True
result:
    diff=-408e8
    addr=0x1a1d9b718 -> row=10100001110110 rank=0 bank=000 col_hi=1101110 channel=0 col_lo=011000 (victim)  
    addr=0x1a1ddc000 -> row=10100001110111 rank=0 bank=000 col_hi=0000000 channel=0 col_lo=000000 (aggressor1)
    addr=0x1a1d57000 -> row=10100001110101 rank=0 bank=000 col_hi=1100000 channel=0 col_lo=000000 (aggressor2)
    fits=True
result:
    diff=-453e0
    addr=0x072321c20 -> row=01110010001100 rank=1 bank=100 col_hi=0111000 channel=0 col_lo=100000 (victim)
    addr=0x072367000 -> row=01110010001101 rank=1 bank=100 col_hi=1100000 channel=0 col_lo=000000 (aggressor1)
    addr=0x1a14de000 -> row=10100001010011 rank=0 bank=100 col_hi=1000000 channel=0 col_lo=000000 (aggressor2)
    fits=True
result:
    diff=2fb30
    addr=0x194d27b30 -> row=10010100110100 rank=1 bank=101 col_hi=1110110 channel=0 col_lo=110000 (victim)
    addr=0x194cf8000 -> row=10010100110011 rank=1 bank=101 col_hi=0000000 channel=0 col_lo=000000 (aggressor1)
    addr=0x194d63000 -> row=10010100110101 rank=1 bank=101 col_hi=1100000 channel=0 col_lo=000000 (aggressor2)
    fits=True
    unusual?
result:
    diff=-412d0
    addr=0x07b622d30 -> row=01111011011000 rank=1 bank=000 col_hi=1011010 channel=0 col_lo=110000 (victim)
    addr=0x07b664000 -> row=01111011011001 rank=1 bank=000 col_hi=0000000 channel=0 col_lo=000000 (aggressor1)
    addr=0x07b6ed000 -> row=01111011011011 rank=1 bank=000 col_hi=0100000 channel=0 col_lo=000000 (aggressor2)
    fits=True
result:
    diff=-443e0
    addr=0x072321c20 -> row=01110010001100 rank=1 bank=100 col_hi=0111000 channel=0 col_lo=100000 (victim)
    addr=0x072366000 -> row=01110010001101 rank=1 bank=100 col_hi=1000000 channel=0 col_lo=000000 (aggressor1)
    addr=0x061503000 -> row=01100001010100 rank=0 bank=100 col_hi=1100000 channel=0 col_lo=000000 (aggressor2)
    fits=True
result:
    diff=-443e0
    addr=0x072321c20 -> row=01110010001100 rank=1 bank=100 col_hi=0111000 channel=0 col_lo=100000 (victim)
    addr=0x072366000 -> row=01110010001101 rank=1 bank=100 col_hi=1000000 channel=0 col_lo=000000 (aggressor1)
    addr=0x05e9cf000 -> row=01011110100111 rank=0 bank=100 col_hi=1100000 channel=0 col_lo=000000 (aggressor2)
    fits=True
result:
    diff=3dc10
    addr=0x193643c10 -> row=10010011011001 rank=0 bank=001 col_hi=1111000 channel=0 col_lo=010000 (victim)
    addr=0x193606000 -> row=10010011011000 rank=0 bank=001 col_hi=1000000 channel=0 col_lo=000000 (aggressor1)
    addr=0x193825000 -> row=10010011100000 rank=1 bank=001 col_hi=0100000 channel=0 col_lo=000000 (aggressor2)
    fits=True
result:
    diff=3c980
    addr=0x171272980 -> row=01110001001001 rank=1 bank=101 col_hi=1010011 channel=0 col_lo=000000 (victim)
    addr=0x171236000 -> row=01110001001000 rank=1 bank=101 col_hi=1000000 channel=0 col_lo=000000 (aggressor1)
    addr=0x171417000 -> row=01110001010000 rank=0 bank=101 col_hi=1100000 channel=0 col_lo=000000 (aggressor2)
    fits=True
result:
    diff=-42100
    addr=0x17a822f00 -> row=01111010100000 rank=1 bank=000 col_hi=1011110 channel=0 col_lo=000000 (victim)
    addr=0x17a865000 -> row=01111010100001 rank=1 bank=000 col_hi=0100000 channel=0 col_lo=000000 (aggressor1)
    addr=0x17a644000 -> row=01111010011001 rank=0 bank=000 col_hi=0000000 channel=0 col_lo=000000 (aggressor2)
    fits=True
result:
    diff=-c4cf0
    addr=0x080a34310 -> row=10000000101000 rank=1 bank=101 col_hi=0000110 channel=0 col_lo=010000 (victim)
    addr=0x080af9000 -> row=10000000101011 rank=1 bank=101 col_hi=0100000 channel=0 col_lo=000000 (aggressor1)
    addr=0x17ebaf000 -> row=01111110101110 rank=1 bank=101 col_hi=1100000 channel=0 col_lo=000000 (aggressor2)
    fits=False
    unusual?
result:
    diff=-402f0
    addr=0x1961abd10 -> row=10010110000110 rank=1 bank=100 col_hi=1111010 channel=0 col_lo=010000 (victim)
    addr=0x1961ec000 -> row=10010110000111 rank=1 bank=100 col_hi=0000000 channel=0 col_lo=000000 (aggressor1)
    addr=0x196165000 -> row=10010110000101 rank=1 bank=100 col_hi=0100000 channel=0 col_lo=000000 (aggressor2)
    fits=True
result:
    diff=39d88
    addr=0x0724c8d88 -> row=01110010010011 rank=0 bank=001 col_hi=0011011 channel=0 col_lo=001000 (victim)
    addr=0x07248f000 -> row=01110010010010 rank=0 bank=001 col_hi=1100000 channel=0 col_lo=000000 (aggressor1)
    addr=0x072515000 -> row=01110010010100 rank=0 bank=001 col_hi=0100000 channel=0 col_lo=000000 (aggressor2)
    fits=True
result:
    diff=3aea0
    addr=0x1716f1ea0 -> row=01110001011011 rank=1 bank=111 col_hi=0111101 channel=0 col_lo=100000 (victim)
    addr=0x1716b7000 -> row=01110001011010 rank=1 bank=111 col_hi=1100000 channel=0 col_lo=000000 (aggressor1)
    addr=0x07eb69000 -> row=01111110101101 rank=1 bank=111 col_hi=0100000 channel=0 col_lo=000000 (aggressor2)
    fits=True
result:
    diff=-42f50
    addr=0x16f3930b0 -> row=01101111001110 rank=0 bank=010 col_hi=1100001 channel=0 col_lo=110000 (victim)
    addr=0x16f3d6000 -> row=01101111001111 rank=0 bank=010 col_hi=1000000 channel=0 col_lo=000000 (aggressor1)
    addr=0x16f1f6000 -> row=01101111000111 rank=1 bank=010 col_hi=1000000 channel=0 col_lo=000000 (aggressor2)
    fits=True
result:
    diff=455a0
    addr=0x1772775a0 -> row=01110111001001 rank=1 bank=100 col_hi=1101011 channel=0 col_lo=100000 (victim)
    addr=0x177232000 -> row=01110111001000 rank=1 bank=100 col_hi=1000000 channel=0 col_lo=000000 (aggressor1)
    addr=0x072901000 -> row=01110010100100 rank=0 bank=100 col_hi=0100000 channel=0 col_lo=000000 (aggressor2)
    fits=True
result:
    diff=-457d0
    addr=0x077231830 -> row=01110111001000 rank=1 bank=100 col_hi=0110000 channel=0 col_lo=110000 (victim)
    addr=0x077277000 -> row=01110111001001 rank=1 bank=100 col_hi=1100000 channel=0 col_lo=000000 (aggressor1)
    addr=0x0772fc000 -> row=01110111001011 rank=1 bank=100 col_hi=0000000 channel=0 col_lo=000000 (aggressor2)
    fits=True
result:
    diff=-3a0f0
    addr=0x07bd2ef10 -> row=01111011110100 rank=1 bank=111 col_hi=1011110 channel=0 col_lo=010000 (victim)
    addr=0x07bd69000 -> row=01111011110101 rank=1 bank=111 col_hi=0100000 channel=0 col_lo=000000 (aggressor1)
    addr=0x07bcf3000 -> row=01111011110011 rank=1 bank=111 col_hi=1100000 channel=0 col_lo=000000 (aggressor2)
    fits=True
result:
    diff=-435d0
    addr=0x07e412a30 -> row=01111110010000 rank=0 bank=100 col_hi=1010100 channel=0 col_lo=110000 (victim)
    addr=0x07e456000 -> row=01111110010001 rank=0 bank=100 col_hi=1000000 channel=0 col_lo=000000 (aggressor1)
    addr=0x07e275000 -> row=01111110001001 rank=1 bank=100 col_hi=0100000 channel=0 col_lo=000000 (aggressor2)
    fits=True
result:
    diff=340a8
    addr=0x1730910a8 -> row=01110011000010 rank=0 bank=110 col_hi=0100001 channel=0 col_lo=101000 (victim)
    addr=0x17305d000 -> row=01110011000001 rank=0 bank=110 col_hi=0100000 channel=0 col_lo=000000 (aggressor1)
    addr=0x1730d7000 -> row=01110011000011 rank=0 bank=110 col_hi=1100000 channel=0 col_lo=000000 (aggressor2)
    fits=True
    unusual?
result:
    diff=-c6cf0
    addr=0x080a34310 -> row=10000000101000 rank=1 bank=101 col_hi=0000110 channel=0 col_lo=010000 (victim)
    addr=0x080afb000 -> row=10000000101011 rank=1 bank=101 col_hi=1100000 channel=0 col_lo=000000 (aggressor1)
    addr=0x078671000 -> row=01111000011001 rank=1 bank=101 col_hi=0100000 channel=0 col_lo=000000 (aggressor2)
    fits=False
    unusual?

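  Looking closely: (1) aggressor1 is the one closer to the victim (in 20 of the 22 samples its row number differs from the victim's by 1; in the other two it differs by 3), so it is most likely the address that disturbs the victim; (2) the victim and both aggressors are always in the same bank.

  This leads to the following conclusions about the address mapping: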

  • Bits 0-5: These are the lower 6 bits of the byte index within a row (i.e. the 6-bit index into a 64-byte cache line).
  • Bit 6: This is a 1-bit channel number, which selects between the 2 DIMMs.
  • Bits 7-13: These are the upper 7 bits of the index within a row (i.e. the upper bits of the column number).
  • Bits 14-16: These are XOR'd with the bottom 3 bits of the row number to give the 3-bit bank number.
  • Bit 17: This is a 1-bit rank number, which selects between the 2 ranks of a DIMM (which are typically the two sides of the DIMM's circuit board).
  • Bits 18-32: These are the 15-bit row number.
  • Bits 33+: These may be set because physical memory starts at physical addresses greater than 0.
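
To make the bit layout concrete, here is a minimal sketch that splits a physical address into the fields listed above. `DramAddr`, `bits`, `decode_sandy_bridge`, and `check_first_result` are hypothetical names for illustration; the layout is the one inferred for the author's Sandy Bridge machine and should not be assumed to hold on other CPUs or memory configurations.

```cpp
#include <cassert>
#include <cstdint>

// Fields of a physical address under the mapping listed above.
struct DramAddr {
  unsigned col_lo;   // bits 0-5: byte offset within a 64-byte cache line
  unsigned channel;  // bit 6
  unsigned col_hi;   // bits 7-13: upper column bits
  unsigned bank;     // bits 14-16 XOR the low 3 bits of the row number
  unsigned rank;     // bit 17
  unsigned row;      // bits 18-32
};

// Extracts `count` bits of `value` starting at bit `lo`.
static unsigned bits(uint64_t value, unsigned lo, unsigned count) {
  return static_cast<unsigned>((value >> lo) & ((1ULL << count) - 1));
}

DramAddr decode_sandy_bridge(uint64_t phys) {
  DramAddr d;
  d.col_lo = bits(phys, 0, 6);
  d.channel = bits(phys, 6, 1);
  d.col_hi = bits(phys, 7, 7);
  d.rank = bits(phys, 17, 1);
  d.row = bits(phys, 18, 15);
  d.bank = bits(phys, 14, 3) ^ (d.row & 7);
  return d;
}

// Checks the first result triple from the listing above: same bank,
// aggressor rows directly above and below the victim row.
void check_first_result() {
  DramAddr victim = decode_sandy_bridge(0x06cd1f680ULL);
  DramAddr aggr1 = decode_sandy_bridge(0x06cd59000ULL);
  DramAddr aggr2 = decode_sandy_bridge(0x06ccc1000ULL);
  assert(victim.bank == 3 && aggr1.bank == 3 && aggr2.bank == 3);
  assert(aggr1.row == victim.row + 1);
  assert(aggr2.row == victim.row - 1);
  assert(victim.rank == 0 && victim.channel == 0);
}
```

The decoded fields match the `row=... rank=... bank=...` lines in the analysis output, for example bank 011 and col_hi 1101101 for the victim 0x06cd1f680.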

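  What does the memory controller gain by mapping physical addresses this way?

  •  Bits 0-5 are 6 bits, exactly 2^6 = 64 bytes, the size of one cache line. With the channel bit placed directly above them, consecutive cache lines land on different channels, so the two channels can serve them in parallel, which improves throughput.
  •  Bank parallelism: the 8 banks can be read and written in parallel, which improves throughput.
  •  Bank function: XORing the bank bits with row bits spreads large sequential accesses across different banks, which reduces the chance of row-buffer conflicts (thrashing).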
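
The first and third points can be illustrated with a small sketch, assuming the bit layout listed earlier. All function names here are hypothetical, and `bank_plain` is a made-up mapping without the XOR, included only for comparison.

```cpp
#include <cassert>
#include <cstdint>

// Channel is bit 6, so adjacent 64-byte cache lines alternate channels.
unsigned channel_of(uint64_t phys) {
  return static_cast<unsigned>((phys >> 6) & 1);
}

// Hypothetical mapping for comparison: bank taken from bits 14-16 only.
unsigned bank_plain(uint64_t phys) {
  return static_cast<unsigned>((phys >> 14) & 7);
}

// Mapping from the list above: bits 14-16 XOR the low 3 row bits (18-20).
unsigned bank_xor(uint64_t phys) {
  return static_cast<unsigned>(((phys >> 14) ^ (phys >> 18)) & 7);
}

// Counts how many distinct banks are touched by `count` accesses that start
// at `base` and are `stride` bytes apart.
unsigned distinct_banks(uint64_t base, uint64_t stride, unsigned count,
                        unsigned (*bank_fn)(uint64_t)) {
  unsigned seen = 0;  // bitmask of banks hit so far
  for (unsigned i = 0; i < count; i++)
    seen |= 1u << bank_fn(base + i * stride);
  unsigned n = 0;
  for (; seen != 0; seen &= seen - 1) n++;  // count set bits
  return n;
}

void demo_interleaving() {
  // Two neighbouring cache lines go to different channels.
  assert(channel_of(0x1000) == 0 && channel_of(0x1040) == 1);
  // Eight accesses one row apart (stride 2^18 bytes): without the XOR they
  // all hit the same bank; with it they are spread over all 8 banks.
  const uint64_t row_stride = 1ULL << 18;
  assert(distinct_banks(0, row_stride, 8, bank_plain) == 1);
  assert(distinct_banks(0, row_stride, 8, bank_xor) == 8);
}
```

Accesses that stay in one bank but change rows force the row buffer to be reloaded every time, so spreading them over 8 banks is what avoids the conflicts.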

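  (3) The core hammer code is below. It reads a set of addresses 540,000 times; after every read, clflush evicts the cache line so that the CPU has to fetch from DRAM each time. The rows holding those addresses are thus activated over and over, which produces the hammer effect: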

// Read from every address in inner, then flush it from the cache; repeat 540,000 times.
static void row_hammer_inner(struct InnerSet inner) {
  if (TEST_MODE &&
      inner.addrs[0] == g_inject_addr1 &&
      inner.addrs[1] == g_inject_addr2) {
    printf("Test mode: Injecting bit flip...\n");
    g_mem[3] ^= 1;
  }

  uint32_t sum = 0;
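  for (int i = 0; i < toggles; i++) {  // repeat 540,000 times
    for (int a = 0; a < ADDR_COUNT; a++)
      // Read from each of the 4 addresses. Every read loads that address's
      // row into the row buffer, and the repeated activation disturbs cells
      // in neighbouring rows. Memory holds 0xffffffff, so value + 1 wraps to
      // 0 and sum stays 0.
      sum += *inner.addrs[a] + 1;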
    if (!TEST_MODE) {
      for (int a = 0; a < ADDR_COUNT; a++)
        // Flush the 4 addresses from the cache so the next read goes to DRAM again; row hammer only works if the reads reach memory.
        asm volatile("clflush (%0)" : : "r" (inner.addrs[a]) : "memory");
    }
  }

  // Sanity check.  We don't expect this to fail, because reading
  // these rows refreshes them.
  if (sum != 0) {
    printf("error: sum=%x\n", sum);
    exit(1);
  }
}

   The complete code follows. The key points are in the comments (some are the original author's, some were added by me).

// Copyright 2015, Google, Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// This is required on Mac OS X for getting PRI* macros #defined.
#define __STDC_FORMAT_MACROS

#include <assert.h>
#include <fcntl.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>


#if !defined(TEST_MODE)
# define TEST_MODE 0
#endif

const size_t mem_size = 1 << 30;//1GB
const int toggles = 540000;

char *g_mem;
void *g_inject_addr1;
void *g_inject_addr2;

uint64_t g_address_sets_tried;
int g_errors_found;

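/* Pick a random address inside g_mem. The offset is a multiple of the page
size and smaller than 1 GB. This is a virtual address; get_physical_addr()
translates it to a physical address later.
Mapping on the original author's hardware, from
http://lackingrhoticity.blogspot.com/2015/05/how-physical-addresses-map-to-rows-and-banks.html :
Bits 0-5: These are the lower 6 bits of the byte index within a row (i.e. the 6-bit index into a 64-byte cache line).
Bit 6: This is a 1-bit channel number, which selects between the 2 DIMMs.
Bits 7-13: These are the upper 7 bits of the index within a row (i.e. the upper bits of the column number).
Every address picked here is page-aligned, so the low 12 bits of the physical
addresses are identical. These addresses therefore share 1. the same channel
and 2. the same column bits below bit 12 (column bits 12-13 can still differ),
while 3. the bank and row may differ.
*/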
char *pick_addr() {
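  // offset is a multiple of the page size (1 << 12 = 4096); the cast avoids
  // signed overflow when shifting the int returned by rand()
  size_t offset = ((size_t) rand() << 12) % mem_size;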
  return g_mem + offset;
}
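// Translate a virtual address to a physical address via /proc/self/pagemap;
// http://0x4c43.cn/2018/0508/linux-dynamic-link/ has a detailed explanation.
// Note: recent Linux kernels report the frame number as 0 unless the process
// has CAP_SYS_ADMIN, so this needs to run as root.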
uint64_t get_physical_addr(uintptr_t virtual_addr) {
  int fd = open("/proc/self/pagemap", O_RDONLY);
  assert(fd >= 0);

  int kPageSize = 0x1000;
  off_t pos = lseek(fd, (virtual_addr / kPageSize) * 8, SEEK_SET);
  assert(pos >= 0);
  uint64_t value;
  int got = read(fd, &value, 8);
  assert(got == 8);
  int rc = close(fd);
  assert(rc == 0);

  uint64_t frame_num = value & ((1ULL << 54) - 1);
  return (frame_num * kPageSize) | (virtual_addr & (kPageSize - 1));
}

class Timer {
  struct timeval start_time_;

 public:
  Timer() {
    // Note that we use gettimeofday() (with microsecond resolution)
    // rather than clock_gettime() (with nanosecond resolution) so
    // that this works on Mac OS X, because OS X doesn't provide
    // clock_gettime() and we don't really need nanosecond resolution.
    int rc = gettimeofday(&start_time_, NULL);
    assert(rc == 0);
  }

  double get_diff() {
    struct timeval end_time;
    int rc = gettimeofday(&end_time, NULL);
    assert(rc == 0);
    return (end_time.tv_sec - start_time_.tv_sec
            + (double) (end_time.tv_usec - start_time_.tv_usec) / 1e6);
  }
};

#define ADDR_COUNT 4
#define ITERATIONS 10

struct InnerSet {
  uint32_t *addrs[ADDR_COUNT];  // 4 pointers, each to a 32-bit word
};
struct OuterSet {
  struct InnerSet inner[ITERATIONS];  // 10 inner sets of 4 addresses each, 40 addresses in total
};

/* Set every bit of g_mem to 1. Selected addresses are then read repeatedly at
a high rate. A bit flip leaves a 0 somewhere in the buffer, which check()
detects afterwards. */
static void reset_mem() {
  memset(g_mem, 0xff, mem_size);
}

// Pick 40 random addresses inside g_mem and store them in set; each offset is a multiple of the page size and smaller than 1 GB.
static void pick_addrs(struct OuterSet *set) {
  for (int j = 0; j < ITERATIONS; j++) {
    for (int a = 0; a < ADDR_COUNT; a++) {
      set->inner[j].addrs[a] = (uint32_t *) pick_addr();
    }
  }
}

// Read from every address in inner, then flush it from the cache; repeat 540,000 times.
static void row_hammer_inner(struct InnerSet inner) {
  if (TEST_MODE &&
      inner.addrs[0] == g_inject_addr1 &&
      inner.addrs[1] == g_inject_addr2) {
    printf("Test mode: Injecting bit flip...\n");
    g_mem[3] ^= 1;
  }

  uint32_t sum = 0;
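  for (int i = 0; i < toggles; i++) {  // repeat 540,000 times
    for (int a = 0; a < ADDR_COUNT; a++)
      // Read from each of the 4 addresses. Every read loads that address's
      // row into the row buffer, and the repeated activation disturbs cells
      // in neighbouring rows. Memory holds 0xffffffff, so value + 1 wraps to
      // 0 and sum stays 0.
      sum += *inner.addrs[a] + 1;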
    if (!TEST_MODE) {
      for (int a = 0; a < ADDR_COUNT; a++)
        // Flush the 4 addresses from the cache so the next read goes to DRAM again; row hammer only works if the reads reach memory.
        asm volatile("clflush (%0)" : : "r" (inner.addrs[a]) : "memory");
    }
  }

  // Sanity check.  We don't expect this to fail, because reading
  // these rows refreshes them.
  if (sum != 0) {
    printf("error: sum=%x\n", sum);
    exit(1);
  }
}

static void row_hammer(struct OuterSet *set) {
  Timer timer;
  for (int j = 0; j < ITERATIONS; j++) {
    // Read the addresses of this inner set and flush the cache, 540,000 times.
    row_hammer_inner(set->inner[j]);
    g_address_sets_tried++;
  }

  // Print statistics derived from the time and number of accesses.
  double time_taken = timer.get_diff();
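  // One set is one InnerSet (4 addresses). It takes about 58 to 59 ms, i.e.
  // about 14.5 to 15 ms per address (each address is read 540,000 times and
  // flushed from the cache after every read).
  printf("  Took %.1f ms per address set\n",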
         time_taken / ITERATIONS * 1e3);
  printf("  Took %g sec in total for %i address sets\n",
         time_taken, ITERATIONS);
  int memory_accesses = ITERATIONS * ADDR_COUNT * toggles;
  printf("  Took %.3f nanosec per memory access (for %i memory accesses)\n",
         time_taken / memory_accesses * 1e9,  // each memory access takes about 27 to 29 ns
         memory_accesses);
  int refresh_period_ms = 64;  // DRAM is refreshed every 64 ms; within one refresh period each address is accessed roughly 530,000 to 580,000 times
  printf("  This gives %i accesses per address per %i ms refresh period\n",
         (int) (refresh_period_ms * 1e-3 * ITERATIONS * toggles / time_taken),
         refresh_period_ms);
}

struct BitFlipInfo {
  uintptr_t victim_virtual_addr;
  int bit_number;
  uint8_t flips_to;  // 1 if this is a 0 -> 1 bit flip, 0 otherwise.
};

static bool check(struct BitFlipInfo *result) {
  uint64_t *end = (uint64_t *) (g_mem + mem_size);
  uint64_t *ptr;
  bool found_error = false;
  for (ptr = (uint64_t *) g_mem; ptr < end; ptr++) {
    uint64_t got = *ptr;
    uint64_t expected = ~(uint64_t) 0;  // every bit of g_mem was initialized to 1
    if (got != expected) {
      printf("error at %p (phys 0x%" PRIx64 "): got 0x%" PRIx64 "\n",
             ptr, get_physical_addr((uintptr_t) ptr), got);
      found_error = true;
      g_errors_found++;

      if (result) {
        result->victim_virtual_addr = (uintptr_t) ptr;  // address of the 64-bit word that contains the flip
        result->bit_number = -1;  // -1 means no differing bit found yet
        for (int bit = 0; bit < 64; bit++) {  // the comparison above is per 64-bit word; now find which bit flipped
          if (((got >> bit) & 1) != ((expected >> bit) & 1)) {
            result->bit_number = bit;  // the flipped bit is bit bit_number of the word at victim_virtual_addr
            result->flips_to = (got >> bit) & 1;
          }
        }
        assert(result->bit_number != -1);
      }
    }
  }
  return found_error;
}
/*
For a set that produced a flip, hammer every pair of its addresses to narrow
down which pair causes the flip.
*/
bool narrow_to_pair(struct InnerSet *inner) {
  bool found = false;
  for (int idx1 = 0; idx1 < ADDR_COUNT; idx1++) {
    for (int idx2 = idx1 + 1; idx2 < ADDR_COUNT; idx2++) {
        // Try every pair (idx1, idx2) with idx1 < idx2: 6 pairs from the 4 addresses.
      uint32_t *addr1 = inner->addrs[idx1];
      uint32_t *addr2 = inner->addrs[idx2];
      struct InnerSet new_set;
      // This is slightly hacky: We reuse row_hammer_inner(), which
      // always expects to hammer ADDR_COUNT addresses.  Rather than
      // making another version that takes a pair of addresses, we
      // just pass our 2 addresses to row_hammer_inner() multiple
      // times.  new_set alternates between the two addresses of the pair.
      for (int a = 0; a < ADDR_COUNT; a++) {
        new_set.addrs[a] = a % 2 == 0 ? addr1 : addr2;
      }
      printf("Trying pair: 0x%" PRIx64 ", 0x%" PRIx64 "\n",
             get_physical_addr((uintptr_t) addr1),
             get_physical_addr((uintptr_t) addr2));
      reset_mem();
      row_hammer_inner(new_set);
      struct BitFlipInfo bit_flip_info;
      if (check(&bit_flip_info)) {
        found = true;
        printf("RESULT PAIR,0x%" PRIx64 ",0x%" PRIx64 ",0x%" PRIx64 ",%i,%i\n",
               get_physical_addr((uintptr_t) addr1),
               get_physical_addr((uintptr_t) addr2),
               get_physical_addr((uintptr_t) bit_flip_info.victim_virtual_addr),
               bit_flip_info.bit_number,
               bit_flip_info.flips_to);
      }
    }
  }
  return found;
}
// Hammer each inner set on its own to find the set that causes the flip, then narrow that set down to a pair.
bool narrow_down(struct OuterSet *outer) {
  bool found = false;
  for (int j = 0; j < ITERATIONS; j++) {
    reset_mem();
    row_hammer_inner(outer->inner[j]);
    if (check(NULL)) {
      printf("hammered addresses:\n");
      struct InnerSet *inner = &outer->inner[j];  // the inner set whose addresses caused the flip
      for (int a = 0; a < ADDR_COUNT; a++) {
        printf("  logical=%p, physical=0x%" PRIx64 "\n",
               inner->addrs[a],
               get_physical_addr((uintptr_t) inner->addrs[a]));
      }
      found = true;

      printf("Narrowing down to a specific pair...\n");
      int tries = 0;
      while (!narrow_to_pair(inner)) {
        if (++tries >= 10) {
          printf("Narrowing to pair: Giving up after %i tries\n", tries);
          break;
        }
      }
    }
  }
  return found;
}

void main_prog() {
  printf("RESULT START_TIME,%" PRId64 "\n", time(NULL));

  g_mem = (char *) mmap(NULL, mem_size, PROT_READ | PROT_WRITE,
                        MAP_ANON | MAP_PRIVATE, -1, 0);
  assert(g_mem != MAP_FAILED);

  printf("Clearing memory...\n");
  reset_mem();  // fill the whole 1 GB buffer with 0xFF

  Timer t;
  int iter = 0;
  for (;;) {
    printf("Iteration %i (after %.2fs)\n", iter++, t.get_diff());
    struct OuterSet addr_set;
    pick_addrs(&addr_set);
    if (TEST_MODE && iter == 3) {
      printf("Test mode: Will inject a bit flip...\n");
      g_inject_addr1 = addr_set.inner[2].addrs[0];
      g_inject_addr2 = addr_set.inner[2].addrs[1];
    }
    row_hammer(&addr_set);

    Timer check_timer;
    bool found_error = check(NULL);
    printf("  Checking for bit flips took %f sec\n", check_timer.get_diff());

    if (iter % 100 == 0 || found_error) {
      // Report general progress stats:
      //  - Time since start, in seconds
      //  - Current Unix time (seconds since epoch)
      //  - Number of address sets tried
      //  - Number of bit flips found (not necessarily unique ones)
      printf("RESULT STAT,%.2f,%" PRId64 ",%" PRId64 ",%i\n",
             t.get_diff(),
             (uint64_t) time(NULL),
             g_address_sets_tried,
             g_errors_found);
    }

    if (found_error) {
      printf("\nNarrowing down to set of %i addresses...\n", ADDR_COUNT);
      int tries = 0;
      while (!narrow_down(&addr_set)) {
        if (++tries >= 10) {
          printf("Narrowing to address set: Giving up after %i tries\n", tries);
          break;
        }
      }

      printf("\nRunning retries...\n");
      for (int i = 0; i < 10; i++) {
        printf("Retry %i\n", i);
        reset_mem();
        row_hammer(&addr_set);
        check(NULL);
      }
      if (TEST_MODE)
        exit(1);
    }
  }
}


int main() {
  // Turn off unwanted buffering for when stdout is a pipe.
  setvbuf(stdout, NULL, _IONBF, 0);

  // Start with an empty line in case previous output was truncated
  // mid-line.
  printf("\n");

  if (TEST_MODE) {
    printf("Running in safe test mode...\n");
  }

  // Fork a subprocess so that we can print the test process's exit
  // status, and to prevent reboots or kernel panics if we are running
  // as PID 1.
  int pid = fork();
  if (pid == 0) {
    main_prog();
    _exit(1);
  }

  int status;
  if (waitpid(pid, &status, 0) == pid) {
    printf("** exited with status %i (0x%x)\n", status, status);
  }

  if (getpid() == 1) {
    // We're the "init" process.  Avoid exiting because that would
    // cause a kernel panic, which can cause a reboot or just obscure
    // log output and prevent console scrollback from working.
    for (;;) {
      sleep(999);
    }
  }
  return 0;
}
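
The timing comments in row_hammer() can be cross-checked with a short calculation. This is a sketch under the assumption that the 4 addresses of a set are read in turn and every access costs the same; `accesses_per_refresh` is a hypothetical helper, not part of the test program.

```cpp
#include <cassert>

// Number of times each address is accessed within one refresh window when
// addr_count addresses are read round-robin and every access takes
// ns_per_access nanoseconds.
double accesses_per_refresh(double ns_per_access, int addr_count,
                            double refresh_ms) {
  assert(ns_per_access > 0 && addr_count > 0);
  double ns_per_round = ns_per_access * addr_count;  // one pass over the set
  return refresh_ms * 1e6 / ns_per_round;            // rounds per window
}
```

With the 27 to 29 ns per access noted in the listing, 4 addresses, and a 64 ms window, this gives roughly 550,000 to 590,000 accesses per address, the same order as the figure in the comments and about the size of the `toggles` constant.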

References:

      1. http://lackingrhoticity.blogspot.com/2015/05/how-physical-addresses-map-to-rows-and-banks.html   How physical addresses map to rows and banks in DRAM

      2. https://cloud.tencent.com/developer/article/1620354  Reverse engineering DRAM address mapping (逆向DRAM地址對映)

      3. DRAMDig: A Knowledge-assisted Tool to Uncover DRAM Address Map