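JVM Source Code Analysis: The OOP-Klass Object Model
Overview
HotSpot is implemented in C++, and C++ is itself an object-oriented language with the basic object-oriented features. The simplest way to represent Java objects would therefore be to generate one C++ class for every Java class.
HotSpot does not do this. Instead it uses the OOP-Klass model. OOP stands for Ordinary Object Pointer: an oop represents the instance data of an object, and although it looks like a pointer, it is really the object hidden behind that pointer. A Klass holds the metadata and method information that describe a Java class.
The model was adopted because the HotSpot designers did not want every object to carry a vtable (virtual function table) pointer. They split the object model into klass and oop: an oop contains no virtual functions, while a Klass has a vtable and can perform method dispatch.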
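The split can be illustrated with a minimal C++ sketch. The types `Klass`, `StringKlass`, `OopDesc` and the function `describe` are hypothetical stand-ins invented for this illustration, not HotSpot's real declarations. The sketch only shows the idea that the per-object structure stays free of virtual functions while dispatch goes through the shared class description.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical stand-in for Klass: one instance per Java class, and the only
// place where virtual functions (and therefore a C++ vtable) appear.
struct Klass {
    virtual ~Klass() = default;
    virtual std::string describe() const = 0;
};

struct StringKlass : Klass {
    std::string describe() const override { return "java.lang.String"; }
};

// Hypothetical stand-in for oopDesc: plain data, no virtual functions.
struct OopDesc {
    std::uintptr_t mark;   // stand-in for the Mark Word
    const Klass*   klass;  // per-object pointer to the shared class description
};

// The object side is not polymorphic; only the Klass side is.
static_assert(!std::is_polymorphic<OopDesc>::value, "oop side has no vtable");
static_assert(std::is_polymorphic<Klass>::value, "Klass side has a vtable");

// Dispatch goes through the klass pointer rather than through the object.
inline std::string describe(const OopDesc& obj) {
    assert(obj.klass != nullptr);
    return obj.klass->describe();
}
```

Any number of `OopDesc` values can point at the same `StringKlass`, so the virtual table exists once per class instead of being referenced by a hidden pointer in every object.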
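The OOP-Klass object model
Klass
Put simply, a Klass is the C++ counterpart of a Java class inside HotSpot and is used to describe that class.
A Klass has two main jobs:
- implementing the Java class at the language level
- implementing method dispatch for Java objects
When is a Klass created? Generally, when the JVM loads a class file it creates an instanceKlass in the method area to hold the class's metadata, including the constant pool, fields, and methods.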
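OOP
A Klass is created while the class file is being loaded. An OOP is created while the Java program runs, whenever an object is instantiated with new.
An OOP object consists of the following parts:
- Object header
  - Mark Word: stores runtime bookkeeping for the object, such as the hash code, GC age, lock state flags, and the biased-locking thread ID and epoch (timestamp)
  - Metadata pointer: points to the instanceKlass instance in the method area
- Instance data: the actual payload, i.e. the contents of the fields. Fields are laid out in the order longs/doubles, ints, shorts/chars, bytes/booleans, oops (ordinary object pointers). Fields of the same width are always grouped together, which makes them easier to access later. Fields declared in a superclass come before fields declared in the subclass.
- Alignment padding: acts only as a placeholder and is not always present. It is needed only when the header plus instance data does not already end on an alignment boundary.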
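The three parts can be turned into a small size calculation. The sketch below assumes a 64-bit JVM with 8-byte object alignment, an 8-byte Mark Word, and a class pointer of 4 bytes (compressed) or 8 bytes (uncompressed). These constants and the names `object_size` and `ObjectSize` are assumptions of this sketch, not values read from a running VM.

```cpp
#include <cassert>
#include <cstddef>

// Assumed values for a 64-bit JVM with default settings (illustration only).
constexpr std::size_t kObjectAlignment  = 8;  // assumed object alignment in bytes
constexpr std::size_t kMarkWordBytes    = 8;
constexpr std::size_t kNarrowKlassBytes = 4;  // compressed class pointer
constexpr std::size_t kWideKlassBytes   = 8;  // uncompressed class pointer

// Round size up to the next multiple of alignment (alignment is a power of two).
constexpr std::size_t align_up(std::size_t size, std::size_t alignment) {
    return (size + alignment - 1) & ~(alignment - 1);
}

struct ObjectSize {
    std::size_t header;   // Mark Word + class pointer
    std::size_t fields;   // instance data
    std::size_t padding;  // alignment padding, possibly zero
    std::size_t total;
};

// Compute header, padding and total size for a given amount of instance data.
inline ObjectSize object_size(std::size_t field_bytes, bool compressed) {
    const std::size_t header =
        kMarkWordBytes + (compressed ? kNarrowKlassBytes : kWideKlassBytes);
    const std::size_t raw   = header + field_bytes;
    const std::size_t total = align_up(raw, kObjectAlignment);
    assert(total % kObjectAlignment == 0);
    return ObjectSize{header, field_bytes, total - raw, total};
}
```

Under these assumptions an object with a single int field needs no padding with a compressed class pointer (12 + 4 = 16 bytes) but 4 bytes of padding without one (16 + 4 = 20, rounded up to 24).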
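Example
Suppose we have the following code:
class Model {
    public static int a = 1;
    public int b;

    public Model(int b) {
        this.b = b;
    }
}

class Main {
    public static void main(String[] args) {
        int c = 10;
        Model modelA = new Model(2);
        Model modelB = new Model(3);
    }
}
The OOP-Klass model for the code above is shown in the following figure.
[Figure: OOP-Klass model for the code above]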
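As a rough textual substitute for a diagram, the relationships in this example can be sketched in C++. `ModelKlass`, `ModelOop` and `new_model` are hypothetical names used only here. Modelling the static field `a` as a member of the class-level structure is a simplification: where HotSpot actually stores static fields depends on the JDK version.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the class-level description of Model. It exists
// once, no matter how many Model objects are created.
struct ModelKlass {
    const char* name;
    int static_a;  // models `public static int a` (one copy per class)
};

// Hypothetical stand-in for one Model instance on the heap.
struct ModelOop {
    std::uintptr_t mark;   // stand-in for the Mark Word
    ModelKlass*    klass;  // metadata pointer shared by all Model instances
    int            b;      // instance field `public int b`
};

// Models `new Model(b)`: every instance points at the same klass.
inline ModelOop new_model(ModelKlass& klass, int b) {
    return ModelOop{0, &klass, b};
}
```

In this picture `modelA` and `modelB` are two separate `ModelOop` values with their own `b`, both pointing at one `ModelKlass`, while the local `int c` lives only in the stack frame of `main` and has no oop at all.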
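JVM source code analysis of oop-klass
oop.hpp
The oopDesc class describes the format of a Java object.
oopDesc has two data members, _mark and _metadata:
- _mark is the Mark Word. It stores runtime bookkeeping for the object, such as the hash code, GC age, lock state flags, and the biased-locking thread ID and epoch (timestamp).
- _metadata is the metadata pointer. It is a union in which _klass is an ordinary pointer and _compressed_klass is a compressed class pointer; both refer to the instanceKlass object.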
// oopDesc is the top baseclass for objects classes. The {name}Desc classes describe
// the format of Java objects so the fields can be accessed from C++.
// oopDesc is abstract.
// (see oopHierarchy for complete oop class hierarchy)
//
// no virtual functions allowed
class oopDesc {
  friend class VMStructs;
 private:
  volatile markOop _mark;          // Mark Word
  union _metadata {                // metadata pointer
    wideKlassOop _klass;           // ordinary class pointer
    narrowOop _compressed_klass;   // compressed class pointer
  } _metadata;
  // (some content omitted)
};
instanceOop.hpp
instanceOopDesc extends oopDesc and represents an instantiated object of a Java class.
// An instanceOop is an instance of a Java Class
// Evaluating "new HashTable()" will create an instanceOop.
class instanceOopDesc : public oopDesc {
 public:
  // aligned header size.
  static int header_size() { return sizeof(instanceOopDesc)/HeapWordSize; }
  // If compressed, the offset of the fields of the instance may not be aligned.
  static int base_offset_in_bytes() {
    return UseCompressedOops ?
             klass_gap_offset_in_bytes() :
             sizeof(instanceOopDesc);
  }
  static bool contains_field_offset(int offset, int nonstatic_field_size) {
    int base_in_bytes = base_offset_in_bytes();
    return (offset >= base_in_bytes &&
            (offset-base_in_bytes) < nonstatic_field_size * heapOopSize);
  }
};
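To make the two static helpers concrete, here is a small stand-alone sketch of the same arithmetic. It assumes a 64-bit layout in which the Mark Word takes 8 bytes, the klass gap starts at byte 12 when the class pointer is compressed, and the full header is 16 bytes otherwise. The functions take an explicit `compressed` flag instead of reading a VM option, which is a simplification of this sketch.

```cpp
#include <cassert>

// Assumed 64-bit layout: 8-byte Mark Word followed by the class pointer.
constexpr int kMarkWordBytes   = 8;
constexpr int kWideHeaderBytes = 16;  // Mark Word + 8-byte class pointer

// Same shape as base_offset_in_bytes(): with a compressed class pointer the
// 4-byte gap behind it can already hold a field, so fields start at 12.
constexpr int base_offset_in_bytes(bool compressed) {
    return compressed ? kMarkWordBytes + 4 : kWideHeaderBytes;
}

// Same shape as contains_field_offset(): nonstatic_field_size is counted in
// units of the heap oop size (4 bytes when compressed, otherwise 8).
constexpr bool contains_field_offset(int offset, int nonstatic_field_size,
                                     bool compressed) {
    return offset >= base_offset_in_bytes(compressed) &&
           (offset - base_offset_in_bytes(compressed)) <
               nonstatic_field_size * (compressed ? 4 : 8);
}
```

With one unit of non-static field data and compression on, offset 12 is inside the instance fields, while offsets 8 (still in the header) and 16 (past the fields) are not.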
instanceKlass.hpp
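instanceKlass is the VM-level representation of a Java class.
Its ClassState enum describes the loading state of the class: allocated, loaded, linked, initialized.
The layout of an instanceKlass includes the declared interfaces, fields, methods, constant pool, source file name, and so on.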
// An instanceKlass is the VM level representation of a Java class.
// It contains all information needed for a class at execution runtime.
class instanceKlass: public Klass {
  friend class VMStructs;
 public:
  enum ClassState {
    unparsable_by_gc = 0,   // object is not yet parsable by gc. Value of _init_state at object allocation.
    allocated,              // allocated (but not yet linked)
    loaded,                 // loaded and inserted in class hierarchy (but not linked yet)
    linked,                 // successfully linked/verified (but not initialized yet)
    being_initialized,      // currently running class initializer
    fully_initialized,      // initialized (successful final state)
    initialization_error    // error happened during initialization
  };
  // (some content omitted)
 protected:
  // Method array.
  objArrayOop _methods;
  // Interface (klassOops) this class declares locally to implement.
  objArrayOop _local_interfaces;
  // Instance and static variable information
  typeArrayOop _fields;
  // Constant pool for this class.
  constantPoolOop _constants;
  // Class loader used to load this class, NULL if VM loader used.
  oop _class_loader;
  typeArrayOop _inner_classes;     // inner classes
  Symbol* _source_file_name;       // source file name
  // (some content omitted)
};
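Because the ClassState values are declared in life-cycle order, questions such as "has this class been linked yet?" reduce to integer comparisons. The sketch below copies the enum for illustration; the free functions `is_loaded`, `is_linked` and `is_initialized` are written for this example and should not be read as HotSpot's exact declarations.

```cpp
#include <cassert>

// Copy of the ClassState values quoted above, used only for illustration.
enum ClassState {
    unparsable_by_gc = 0,
    allocated,
    loaded,
    linked,
    being_initialized,
    fully_initialized,
    initialization_error
};

// "Has the class reached at least this stage?" is a simple comparison
// because the enumerators follow the order of the class life cycle.
inline bool is_loaded(ClassState s) { return s >= loaded; }
inline bool is_linked(ClassState s) { return s >= linked; }

// Initialization is different: only one exact state counts as success,
// since initialization_error comes after fully_initialized in the enum.
inline bool is_initialized(ClassState s) { return s == fully_initialized; }
```

A class in state `being_initialized` is therefore already loaded and linked but not yet initialized.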
markOop.hpp
markOop describes the format of the Java object header.
// The markOop describes the header of an object.
// Note that the mark is not a real oop but just a word.
// It is placed in the oop hierarchy for historical reasons.
//
// Bit-format of an object header (most significant first, big endian layout below):
//
// 32 bits:
// --------
// hash:25 ------------>| age:4 biased_lock:1 lock:2 (normal object)
// JavaThread*:23 epoch:2 age:4 biased_lock:1 lock:2 (biased object)
// size:32 ------------------------------------------>| (CMS free block)
// PromotedObject*:29 ---------->| promo_bits:3 ----->| (CMS promoted object)
//
// 64 bits:
// --------
// unused:25 hash:31 -->| unused:1 age:4 biased_lock:1 lock:2 (normal object)
// JavaThread*:54 epoch:2 unused:1 age:4 biased_lock:1 lock:2 (biased object)
// PromotedObject*:61 --------------------->| promo_bits:3 ----->| (CMS promoted object)
// size:64 ----------------------------------------------------->| (CMS free block)
//
// unused:25 hash:31 -->| cms_free:1 age:4 biased_lock:1 lock:2 (COOPs && normal object)
// JavaThread*:54 epoch:2 cms_free:1 age:4 biased_lock:1 lock:2 (COOPs && biased object)
// narrowOop:32 unused:24 cms_free:1 unused:4 promo_bits:3 ----->| (COOPs && CMS promoted object)
// unused:21 size:35 -->| cms_free:1 unused:7 ------------------>| (COOPs && CMS free block)
class markOopDesc: public oopDesc {
 private:
  // Conversion
  uintptr_t value() const { return (uintptr_t) this; }
 public:
  // Constants
  enum { age_bits         = 4,
         lock_bits        = 2,
         biased_lock_bits = 1,
         max_hash_bits    = BitsPerWord - age_bits - lock_bits - biased_lock_bits,
         hash_bits        = max_hash_bits > 31 ? 31 : max_hash_bits,
         cms_bits         = LP64_ONLY(1) NOT_LP64(0),
         epoch_bits       = 2
  };
  // The biased locking code currently requires that the age bits be
  // contiguous to the lock bits.
  enum { lock_shift        = 0,
         biased_lock_shift = lock_bits,
         age_shift         = lock_bits + biased_lock_bits,
         cms_shift         = age_shift + age_bits,
         hash_shift        = cms_shift + cms_bits,
         epoch_shift       = hash_shift
  };
  // (some content omitted)
};
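The shift and width constants are easiest to understand by decoding a value. The following sketch hard-codes the 64-bit case (so `cms_bits` is 1 and `hash_bits` is 31) and packs and unpacks a normal, unlocked mark word. `DecodedMark`, `decode` and `encode_unlocked` are hypothetical helpers written for this illustration; the value 01 for the lock bits of an unlocked object is stated here as background knowledge rather than taken from the excerpt.

```cpp
#include <cassert>
#include <cstdint>

// Bit widths and shifts for the 64-bit layout, following the enums quoted above.
constexpr int lock_bits = 2, biased_lock_bits = 1, age_bits = 4,
              cms_bits = 1, hash_bits = 31;
constexpr int lock_shift        = 0;
constexpr int biased_lock_shift = lock_bits;
constexpr int age_shift         = lock_bits + biased_lock_bits;
constexpr int cms_shift         = age_shift + age_bits;
constexpr int hash_shift        = cms_shift + cms_bits;

// A mask with the lowest `bits` bits set.
constexpr std::uint64_t mask(int bits) { return (std::uint64_t{1} << bits) - 1; }

// Extract a bit field of the given width starting at `shift`.
constexpr std::uint64_t field(std::uint64_t mark, int shift, int bits) {
    return (mark >> shift) & mask(bits);
}

struct DecodedMark {
    unsigned      lock;
    bool          biased;
    unsigned      age;
    std::uint32_t hash;
};

inline DecodedMark decode(std::uint64_t mark) {
    DecodedMark d;
    d.lock   = static_cast<unsigned>(field(mark, lock_shift, lock_bits));
    d.biased = field(mark, biased_lock_shift, biased_lock_bits) != 0;
    d.age    = static_cast<unsigned>(field(mark, age_shift, age_bits));
    d.hash   = static_cast<std::uint32_t>(field(mark, hash_shift, hash_bits));
    return d;
}

// Build the mark word of a normal, unlocked object (lock bits 01, not biased).
inline std::uint64_t encode_unlocked(std::uint32_t hash, unsigned age) {
    assert(age <= mask(age_bits));
    return ((hash & mask(hash_bits)) << hash_shift) |
           (static_cast<std::uint64_t>(age) << age_shift) |
           std::uint64_t{1};
}
```

With these constants the hash starts at bit 8, so a 4-bit age caps the GC age at 15 and the identity hash has at most 31 usable bits.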
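How an instanceOopDesc object is created
The allocate_instance method
instanceOopDesc objects are created by instanceKlass::allocate_instance, which works as follows:
1. has_finalizer checks whether the class declares a non-empty finalize method.
2. size_helper determines how much memory the new object needs.
3. CollectedHeap::obj_allocate requests that much memory from the heap and creates the instanceOopDesc object.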
instanceKlass.cpp
instanceOop instanceKlass::allocate_instance(TRAPS) {
  assert(!oop_is_instanceMirror(), "wrong allocation path");
  bool has_finalizer_flag = has_finalizer(); // Query before possible GC
  int size = size_helper(); // Query before forming handle.
  KlassHandle h_k(THREAD, as_klassOop());
  instanceOop i;
  i = (instanceOop)CollectedHeap::obj_allocate(h_k, size, CHECK_NULL);
  if (has_finalizer_flag && !RegisterFinalizersAtInit) {
    i = register_finalizer(i, CHECK_NULL);
  }
  return i;
}
The obj_allocate method
CollectedHeap::obj_allocate requests the given amount of memory from the heap and creates the instanceOopDesc object. Its implementation is:
collectedHeap.inline.hpp
oop CollectedHeap::obj_allocate(KlassHandle klass, int size, TRAPS) {
  debug_only(check_for_valid_allocation_state());
  assert(!Universe::heap()->is_gc_active(), "Allocation during gc not allowed");
  assert(size >= 0, "int won't convert to size_t");
  HeapWord* obj = common_mem_allocate_init(klass, size, CHECK_NULL);
  post_allocation_setup_obj(klass, obj);
  NOT_PRODUCT(Universe::heap()->check_for_bad_heap_word_value(obj, size));
  return (oop)obj;
}
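The common_mem_allocate_noinit method
obj_allocate calls common_mem_allocate_init, which obtains the memory from common_mem_allocate_noinit and then zeroes it. common_mem_allocate_noinit works as follows:
1. If the TLAB optimization is enabled, allocate from the TLAB and return. (TLAB stands for ThreadLocalAllocBuffer, a chunk of memory private to each thread.)
2. If TLAB allocation is disabled or does not succeed, call Universe::heap()->mem_allocate to allocate on the shared heap and return.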
HeapWord* CollectedHeap::common_mem_allocate_noinit(KlassHandle klass, size_t size, TRAPS) {
  // Clear unhandled oops for memory allocation. Memory allocation might
  // not take out a lock if from tlab, so clear here.
  CHECK_UNHANDLED_OOPS_ONLY(THREAD->clear_unhandled_oops();)
  if (HAS_PENDING_EXCEPTION) {
    NOT_PRODUCT(guarantee(false, "Should not allocate with exception pending"));
    return NULL; // caller does a CHECK_0 too
  }
  HeapWord* result = NULL;
  if (UseTLAB) { // TLAB optimization is enabled
    result = allocate_from_tlab(klass, THREAD, size);
    if (result != NULL) {
      assert(!HAS_PENDING_EXCEPTION,
             "Unexpected exception, will result in uninitialized storage");
      return result;
    }
  }
  bool gc_overhead_limit_was_exceeded = false;
  result = Universe::heap()->mem_allocate(size,
                                          &gc_overhead_limit_was_exceeded);
  if (result != NULL) {
    NOT_PRODUCT(Universe::heap()->
      check_for_non_bad_heap_word_value(result, size));
    assert(!HAS_PENDING_EXCEPTION,
           "Unexpected exception, will result in uninitialized storage");
    THREAD->incr_allocated_bytes(size * HeapWordSize);
    AllocTracer::send_allocation_outside_tlab_event(klass, size * HeapWordSize);
    return result;
  }
  // (some content omitted: out-of-memory handling when both attempts fail)
}
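The fast-path/slow-path structure of this function can be sketched independently of HotSpot. In the following illustration `Tlab`, `SharedHeap` and `allocate_words` are hypothetical names: the TLAB is modelled as a bump-pointer buffer that returns a null pointer when it is full, and the shared heap simply counts how often the slow path is taken. Refilling the TLAB and garbage collection are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

using HeapWord = std::size_t;  // stand-in for one heap word

// Hypothetical thread-local buffer: allocation is a pointer bump, no locking.
class Tlab {
public:
    explicit Tlab(std::size_t words) : storage_(words), top_(0) {}
    // Returns nullptr when the remaining space cannot satisfy the request.
    HeapWord* allocate(std::size_t words) {
        if (words > storage_.size() - top_) return nullptr;
        HeapWord* result = storage_.data() + top_;
        top_ += words;
        return result;
    }
private:
    std::vector<HeapWord> storage_;
    std::size_t top_;
};

// Hypothetical shared heap used as the slow path.
class SharedHeap {
public:
    HeapWord* allocate(std::size_t words) {
        blocks_.emplace_back(new HeapWord[words]());
        ++slow_path_allocations_;
        return blocks_.back().get();
    }
    int slow_path_allocations() const { return slow_path_allocations_; }
private:
    std::vector<std::unique_ptr<HeapWord[]>> blocks_;
    int slow_path_allocations_ = 0;
};

// Same two-step shape as above: try the TLAB first, then the shared heap.
inline HeapWord* allocate_words(Tlab& tlab, SharedHeap& heap,
                                bool use_tlab, std::size_t words) {
    if (use_tlab) {
        if (HeapWord* p = tlab.allocate(words)) return p;
    }
    return heap.allocate(words);
}
```

The point of the design is that the common case touches only thread-private state, so no synchronization is needed until the buffer runs out.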
The mem_allocate method
Assuming the G1 garbage collector is in use, the method is implemented as follows:
g1CollectedHeap.cpp
HeapWord*
G1CollectedHeap::mem_allocate(size_t word_size,
                              bool* gc_overhead_limit_was_exceeded) {
  assert_heap_not_locked_and_not_at_safepoint();
  // Loop until the allocation is satisfied, or unsatisfied after GC.
  for (int try_count = 1; /* we'll return */; try_count += 1) {
    unsigned int gc_count_before;
    HeapWord* result = NULL;
    if (!isHumongous(word_size)) {
      result = attempt_allocation(word_size, &gc_count_before);
    } else {
      result = attempt_allocation_humongous(word_size, &gc_count_before);
    }
    if (result != NULL) {
      return result;
    }
    // Create the garbage collection operation...
    VM_G1CollectForAllocation op(gc_count_before, word_size);
    // ...and get the VM thread to execute it.
    VMThread::execute(&op);
    if (op.prologue_succeeded() && op.pause_succeeded()) {
      // If the operation was successful we'll return the result even
      // if it is NULL. If the allocation attempt failed immediately
      // after a Full GC, it's unlikely we'll be able to allocate now.
      HeapWord* result = op.result();
      if (result != NULL && !isHumongous(word_size)) {
        // Allocations that take place on VM operations do not do any
        // card dirtying and we have to do it here. We only have to do
        // this for non-humongous allocations, though.
        dirty_young_block(result, word_size);
      }
      return result;
    } else {
      assert(op.result() == NULL,
             "the result should be NULL if the VM op did not succeed");
    }
    // Give a warning if we seem to be looping forever.
    if ((QueuedAllocationWarningCount > 0) &&
        (try_count % QueuedAllocationWarningCount == 0)) {
      warning("G1CollectedHeap::mem_allocate retries %d times", try_count);
    }
  }
  ShouldNotReachHere();
  return NULL;
}
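Layout of member variables within an object
Layout strategy
Fields are laid out in the order longs/doubles, ints, shorts/chars, bytes/booleans, oops (ordinary object pointers). Fields of the same width are always grouped together, which makes them easier to access later. Fields declared in a superclass come before fields declared in the subclass.
In fact, there are three allocation strategies:
First (allocation_style == 0) Fields order: oops, longs/doubles, ints, shorts/chars, bytes
Second (allocation_style == 1) Fields order: longs/doubles, ints, shorts/chars, bytes, oops
Third (allocation_style == 2) Fields allocation: oops fields in super and sub classes are together.
The second strategy is the one normally used.
The JVM implementation is in classFileParser.cpp.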
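The parseClassFile method
The main job of this function is to parse the class file according to the JVM specification. It parses the following parts in order:
1. Class file meta information: the magic number and the minor/major version numbers.
2. The constant pool.
3. The class access flags and class properties (whether it is a class or an interface, the index of this class, and the index of the superclass).
4. The interfaces.
5. The fields.
6. The methods.
7. The attributes.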
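The first of these steps is simple enough to show in full. The sketch below reads the first eight bytes of a class file as laid out by the JVM specification: a big-endian u4 magic number (0xCAFEBABE) followed by u2 minor_version and u2 major_version. The names `ClassFileVersion` and `parse_header` are made up for this example, and the code is a stand-alone illustration rather than an excerpt from classFileParser.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

struct ClassFileVersion {
    std::uint16_t minor_version;
    std::uint16_t major_version;
};

// Class file integers are stored big-endian (most significant byte first).
inline std::uint32_t read_u4(const std::vector<std::uint8_t>& b, std::size_t at) {
    return (std::uint32_t{b[at]} << 24) | (std::uint32_t{b[at + 1]} << 16) |
           (std::uint32_t{b[at + 2]} << 8) | std::uint32_t{b[at + 3]};
}

inline std::uint16_t read_u2(const std::vector<std::uint8_t>& b, std::size_t at) {
    return static_cast<std::uint16_t>((b[at] << 8) | b[at + 1]);
}

// Validate the magic number and return the version numbers that follow it.
inline ClassFileVersion parse_header(const std::vector<std::uint8_t>& bytes) {
    if (bytes.size() < 8) throw std::runtime_error("truncated class file");
    if (read_u4(bytes, 0) != 0xCAFEBABEu) throw std::runtime_error("bad magic number");
    return ClassFileVersion{read_u2(bytes, 4), read_u2(bytes, 6)};
}
```

For instance, a file starting with CA FE BA BE 00 00 00 33 has minor version 0 and major version 51, which is the class file version of Java SE 7.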
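In HotSpot, the layout of a class's member variables within its objects is fixed before any instance is created. Specifically, it is computed while the class file is being parsed.
The steps are as follows (taking a class with no superclass and no static fields as the example):
1. Check whether a superclass exists; if it does, take the size of the superclass's non-static fields.
// Field size and offset computation
// Without a superclass the non-static field size is 0; otherwise it starts at the superclass's non-static field size.
int nonstatic_field_size = super_klass() == NULL ? 0 : super_klass->nonstatic_field_size();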
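2. Compute the offset of the first non-static field within the object.
instanceOopDesc::base_offset_in_bytes() effectively returns the size of the Java object header.
If there is no superclass, nonstatic_field_size is 0 and the offset of the first non-static field equals the size of the object header.
heapOopSize is the size of an oop reference. It depends on whether UseCompressedOops is enabled (it is by default): 4 bytes when enabled, 8 bytes otherwise.
nonstatic_field_size is measured in units of heapOopSize, so it has to be multiplied by heapOopSize to convert it into a byte offset.
first_nonstatic_field_offset = instanceOopDesc::base_offset_in_bytes() +
                               nonstatic_field_size * heapOopSize;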
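A short worked version of this formula may help. The function below takes the three inputs as plain parameters; the concrete numbers used with it (a 12-byte or 16-byte header, a 4-byte or 8-byte heapOopSize) are assumptions for a 64-bit JVM, not values queried from HotSpot.

```cpp
#include <cassert>

// nonstatic_field_size is counted in heap-oop-sized units, so it is scaled to
// bytes before being added to the header size.
constexpr int first_nonstatic_field_offset(int base_offset_in_bytes,
                                           int super_nonstatic_field_size,
                                           int heap_oop_size) {
    return base_offset_in_bytes + super_nonstatic_field_size * heap_oop_size;
}

// No superclass fields: the first field sits directly behind the header.
static_assert(first_nonstatic_field_offset(12, 0, 4) == 12, "header only");
// Superclass contributes 2 units of 4 bytes: fields start 8 bytes later.
static_assert(first_nonstatic_field_offset(12, 2, 4) == 20, "after super fields");
```

Without compression the same class with one unit of superclass data would start its own fields at 16 + 1 * 8 = 24.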
3. Count the fields of each type and initialize the next pointer to first.
The next_nonstatic_field_offset variable acts like a cursor.
first_nonstatic_field_offset = instanceOopDesc::base_offset_in_bytes() +
                               nonstatic_field_size * heapOopSize;
next_nonstatic_field_offset = first_nonstatic_field_offset;          // start the cursor at the first field
unsigned int nonstatic_double_count = fac.count[NONSTATIC_DOUBLE];   // double and long fields
unsigned int nonstatic_word_count   = fac.count[NONSTATIC_WORD];     // int and float fields
unsigned int nonstatic_short_count  = fac.count[NONSTATIC_SHORT];    // short and char fields
unsigned int nonstatic_byte_count   = fac.count[NONSTATIC_BYTE];     // byte and boolean fields
unsigned int nonstatic_oop_count    = fac.count[NONSTATIC_OOP];      // oop (reference) fields
4. Compute the offset of the first field type according to the allocation strategy.
With the first strategy, the offsets of the oop fields and then of the double fields are computed first.
With the second strategy, the offset of the double fields is computed first.
if( allocation_style == 0 ) {
  // Fields order: oops, longs/doubles, ints, shorts/chars, bytes
  next_nonstatic_oop_offset = next_nonstatic_field_offset;
  next_nonstatic_double_offset = next_nonstatic_oop_offset +
                                 (nonstatic_oop_count * heapOopSize);
} else if( allocation_style == 1 ) {
  // Fields order: longs/doubles, ints, shorts/chars, bytes, oops
  next_nonstatic_double_offset = next_nonstatic_field_offset;
} else if( allocation_style == 2 ) {
  // The third strategy is not discussed here.
}
5. Compute the offset of each field type within the object.
Following the field order double >> word >> short >> byte:
word offset = double offset + (number of double fields * byte length of one double field)
short offset = word offset + (number of word fields * byte length of one word field)
byte offset = short offset + (number of short fields * byte length of one short field)
next_nonstatic_word_offset = next_nonstatic_double_offset +
                             (nonstatic_double_count * BytesPerLong);
next_nonstatic_short_offset = next_nonstatic_word_offset +
                              (nonstatic_word_count * BytesPerInt);
next_nonstatic_byte_offset = next_nonstatic_short_offset +
                             (nonstatic_short_count * BytesPerShort);
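Putting steps 2 to 5 together for the second strategy (allocation_style == 1) gives the following stand-alone sketch. `FieldCounts`, `FieldOffsets` and `layout_style1` are hypothetical names. Two simplifications are assumptions of this sketch: the starting offset is required to be 8-byte aligned so that no gap handling for doubles is needed, and the oop fields are placed after the bytes, rounded up to a multiple of the heap oop size.

```cpp
#include <cassert>

constexpr int BytesPerLong = 8, BytesPerInt = 4, BytesPerShort = 2;

struct FieldCounts  { int doubles, words, shorts, bytes, oops; };
struct FieldOffsets { int doubles, words, shorts, bytes, oops; };

// Round value up to the next multiple of alignment.
constexpr int align_up(int value, int alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Order for allocation style 1: longs/doubles, ints, shorts/chars, bytes, oops.
inline FieldOffsets layout_style1(int first_offset, const FieldCounts& n,
                                  int heap_oop_size) {
    // Simplification of this sketch: start on an 8-byte boundary so that the
    // double fields need no extra alignment or gap filling.
    assert(first_offset % BytesPerLong == 0);
    FieldOffsets o;
    o.doubles = first_offset;
    o.words   = o.doubles + n.doubles * BytesPerLong;
    o.shorts  = o.words   + n.words   * BytesPerInt;
    o.bytes   = o.shorts  + n.shorts  * BytesPerShort;
    // Assumption: oops follow the byte fields, aligned to the oop size.
    o.oops    = align_up(o.bytes + n.bytes, heap_oop_size);
    return o;
}
```

For a class with one long, two ints, one short, one byte and one reference, starting at offset 16 with 4-byte oops, the groups start at offsets 16, 24, 32, 34 and 36 respectively.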