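How BufferedInputStream and BufferedOutputStream Work, and How to Use the Buffer Correctly
When it comes to performance tuning in Java BIO, most people will tell you to use BufferedInputStream and BufferedOutputStream. The reasoning goes: I/O talks to hardware and is slow, so cutting down the number of I/O interactions with BufferedInputStream greatly improves I/O performance.
Looking at the BufferedInputStream source, the class holds a buffer array: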
protected volatile byte buf[];
The buffer array defaults to 8192 bytes, i.e. 8 KB (plenty of articles online say 8 MB... everybody copying from everybody else):
private static int defaultBufferSize = 8192;
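One way to check the default for yourself, as a rough sketch: wrap the source in a stream that remembers how many bytes the last bulk read asked for, then read a single byte through BufferedInputStream. `RecordingStream` and `BufferSizeProbe` are names made up for this illustration, and the sketch assumes the JDK in use fills its buffer with one `read(byte[], int, int)` call, as the source quoted in this article does.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper: remembers the length requested by the most recent bulk read. */
class RecordingStream extends FilterInputStream {
    int lastRequestedLen = -1;

    RecordingStream(InputStream in) {
        super(in);
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        lastRequestedLen = len;
        return super.read(b, off, len);
    }
}

class BufferSizeProbe {
    /**
     * Reads one byte through a BufferedInputStream and reports how many bytes
     * the buffered stream asked the source for. bufferSize <= 0 means
     * "use the default constructor".
     */
    static int requestedOnFirstRead(int bufferSize) throws IOException {
        RecordingStream source = new RecordingStream(new ByteArrayInputStream(new byte[100_000]));
        try (InputStream in = bufferSize > 0
                ? new BufferedInputStream(source, bufferSize)
                : new BufferedInputStream(source)) {
            in.read(); // a single-byte read on an empty buffer triggers a fill
        }
        return source.lastRequestedLen;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("default buffer request: " + requestedOnFirstRead(0));
        System.out.println("custom buffer request:  " + requestedOnFirstRead(1024));
    }
}
```

The two-argument constructor `BufferedInputStream(InputStream, int)` is the standard way to pick a different buffer size when 8 KB does not suit the workload.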
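When a read method of BufferedInputStream is called, it first checks whether the buffer array still holds unread data. If not, it calls fill() to load data from the underlying stream (for a file, the disk) into the buffer, and then returns data out of the buffer. Note the check right before fill(), `len >= getBufIfOpen().length && markpos < 0`: if the requested length is at least as large as the buffer array's capacity (the capacity, not the number of valid bytes currently buffered) and no mark is active, the data is read straight from the underlying stream and BufferedInputStream's in-memory buffer is skipped entirely.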
private int read1(byte[] b, int off, int len) throws IOException {
    int avail = count - pos;
    if (avail <= 0) {
        /* If the requested length is at least as large as the buffer, and
           if there is no mark/reset activity, do not bother to copy the
           bytes into the local buffer. In this way buffered streams will
           cascade harmlessly. */
        if (len >= getBufIfOpen().length && markpos < 0) {
            return getInIfOpen().read(b, off, len);
        }
        fill();
        avail = count - pos;
        if (avail <= 0) return -1;
    }
    int cnt = (avail < len) ? avail : len;
    System.arraycopy(getBufIfOpen(), pos, b, off, cnt);
    pos += cnt;
    return cnt;
}
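The bypass in read1() can be observed from the outside. The sketch below is only an illustration under assumptions: `LengthRecorder` and `BypassDemo` are hypothetical names, and the source is an in-memory array rather than a real file. It reports the length that actually reaches the wrapped stream for a small request, a large request, and a large request made while a mark is active.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper: remembers the length requested by the most recent bulk read. */
class LengthRecorder extends FilterInputStream {
    int lastRequestedLen = -1;

    LengthRecorder(InputStream in) {
        super(in);
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        lastRequestedLen = len;
        return super.read(b, off, len);
    }
}

class BypassDemo {
    /** Returns the length seen by the source when the caller asks for callerLen bytes. */
    static int lengthSeenBySource(int callerLen, boolean withMark) throws IOException {
        LengthRecorder source = new LengthRecorder(new ByteArrayInputStream(new byte[100_000]));
        BufferedInputStream in = new BufferedInputStream(source); // default 8192-byte buffer
        if (withMark) {
            in.mark(100_000); // an active mark makes markpos >= 0, which disables the bypass
        }
        byte[] target = new byte[callerLen];
        in.read(target, 0, callerLen);
        return source.lastRequestedLen;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("caller asks 100    -> source sees " + lengthSeenBySource(100, false));
        System.out.println("caller asks 16384  -> source sees " + lengthSeenBySource(16384, false));
        System.out.println("16384 with a mark  -> source sees " + lengthSeenBySource(16384, true));
    }
}
```

A 100-byte request goes through fill(), so the source is asked for a whole buffer; a 16384-byte request is handed to the source unchanged; with a mark set, the same 16384-byte request falls back to filling the buffer.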
private void fill() throws IOException {
    byte[] buffer = getBufIfOpen();
    if (markpos < 0)
        pos = 0;            /* no mark: throw away the buffer */
    else if (pos >= buffer.length)  /* no room left in buffer */
        if (markpos > 0) {  /* can throw away early part of the buffer */
            int sz = pos - markpos;
            System.arraycopy(buffer, markpos, buffer, 0, sz);
            pos = sz;
            markpos = 0;
        } else if (buffer.length >= marklimit) {
            markpos = -1;   /* buffer got too big, invalidate mark */
            pos = 0;        /* drop buffer contents */
        } else {            /* grow buffer */
            int nsz = pos * 2;
            if (nsz > marklimit)
                nsz = marklimit;
            byte nbuf[] = new byte[nsz];
            System.arraycopy(buffer, 0, nbuf, 0, pos);
            if (!bufUpdater.compareAndSet(this, buffer, nbuf)) {
                // Can't replace buf if there was an async close.
                // Note: This would need to be changed if fill()
                // is ever made accessible to multiple threads.
                // But for now, the only way CAS can fail is via close.
                // assert buf == null;
                throw new IOException("Stream closed");
            }
            buffer = nbuf;
        }
    count = pos;
    int n = getInIfOpen().read(buffer, pos, buffer.length - pos);
    if (n > 0)
        count = n + pos;
}
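The key line of fill() is the call `getInIfOpen().read(buffer, pos, buffer.length - pos)` near the end. BufferedInputStream is a wrapper class, and the object returned by getInIfOpen() is simply the wrapped InputStream. So in essence BufferedInputStream does nothing more than add a layer of in-memory buffering on top of another stream.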
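To make the "wrapper plus in-memory buffer" idea concrete, here is a deliberately stripped-down sketch. It is not the JDK implementation: `TinyBufferedInput` and `TinyBufferedInputDemo` are hypothetical names, and mark/reset, thread safety, and the large-read bypass are all left out on purpose.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical, simplified buffered stream: no mark/reset, no thread safety. */
class TinyBufferedInput extends InputStream {
    private final InputStream in; // the wrapped stream that does the real I/O
    private final byte[] buf;     // in-memory buffer
    private int pos = 0;          // next byte to hand out
    private int count = 0;        // number of valid bytes in buf

    TinyBufferedInput(InputStream in, int size) {
        this.in = in;
        this.buf = new byte[size];
    }

    /** Discards the consumed buffer and refills it with one bulk read from the wrapped stream. */
    private void fill() throws IOException {
        pos = 0;
        count = 0;
        int n = in.read(buf, 0, buf.length);
        if (n > 0) {
            count = n;
        }
    }

    @Override
    public int read() throws IOException {
        if (pos >= count) {
            fill();
            if (pos >= count) {
                return -1; // the wrapped stream is exhausted
            }
        }
        return buf[pos++] & 0xff;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}

class TinyBufferedInputDemo {
    /** Pushes ASCII text through the tiny buffered stream one byte at a time. */
    static String roundTrip(String asciiText, int bufferSize) throws IOException {
        byte[] data = asciiText.getBytes(StandardCharsets.US_ASCII);
        StringBuilder sb = new StringBuilder();
        try (InputStream in = new TinyBufferedInput(new ByteArrayInputStream(data), bufferSize)) {
            int c;
            while ((c = in.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello buffer", 4));
    }
}
```

Every single-byte read() is served from `buf`; the wrapped stream is only touched when the buffer runs dry, which is the whole trick.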
Conclusion: BufferedInputStream improves I/O performance in only two situations.
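1. You need to read many small pieces of data (one byte, a few bytes, a few dozen bytes at a time). The classic example is Java's character streams, Writer and Reader: strings are small, and going to the hardware for each one would be very slow, so loading a larger chunk into memory once and reading from there is much faster. This is also why character-stream classes such as InputStreamReader and OutputStreamWriter buffer bytes internally while plain byte streams do not: byte streams usually do not have to be consumed in lots of tiny pieces.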
2. You need something from the BufferedInputStream API itself, for example mark() and reset().
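Both situations can be sketched in a few lines. This is an illustration under assumptions, not a benchmark: `CallCounter` and `SmallReadDemo` are hypothetical names, and the source is a 100,000-byte in-memory array, so the numbers count calls to the wrapped stream rather than real disk accesses.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper: counts every read call that reaches the wrapped source. */
class CallCounter extends FilterInputStream {
    int calls = 0;

    CallCounter(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        calls++;
        return super.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        calls++;
        return super.read(b, off, len);
    }
}

class SmallReadDemo {
    /** Case 1: drain 100,000 bytes one byte at a time and count the calls that hit the source. */
    static int sourceCalls(boolean buffered) throws IOException {
        CallCounter source = new CallCounter(new ByteArrayInputStream(new byte[100_000]));
        InputStream in = buffered ? new BufferedInputStream(source) : source;
        while (in.read() != -1) {
            // discard the byte; only the call count matters here
        }
        return source.calls;
    }

    /** Case 2: peek at the first byte with mark()/reset(), then read it again. */
    static int[] peekThenRead(byte[] data) throws IOException {
        BufferedInputStream in = new BufferedInputStream(new ByteArrayInputStream(data));
        in.mark(1);             // remember the current position
        int peeked = in.read();
        in.reset();             // rewind to the mark
        int reread = in.read();
        return new int[] {peeked, reread};
    }

    public static void main(String[] args) throws IOException {
        System.out.println("source calls, unbuffered: " + sourceCalls(false));
        System.out.println("source calls, buffered:   " + sourceCalls(true));
        int[] r = peekThenRead(new byte[] {42, 7});
        System.out.println("peeked " + r[0] + ", read again " + r[1]);
    }
}
```

With the default 8192-byte buffer the buffered run needs 13 fills plus one final call that hits end of stream, against 100,001 calls for the unbuffered loop. When the source is a file or a socket, each of those calls is a trip to the operating system, which is where the savings come from.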
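In every other case, calling InputStream's read(byte[]) or read(byte[], int, int) with a large array (at least as large as the 8 KB buffer) is enough, and putting a BufferedInputStream on top of that gains nothing. That said, using BufferedInputStream everywhere without thinking about it usually does no harm either.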
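As a rough check of that last point, the sketch below copies 1,000,000 bytes with a 64 KB array, once directly and once through BufferedInputStream, and counts the calls that reach the source. `ReadCounter` and `BulkCopyDemo` are hypothetical names, and the in-memory source is an assumption made to keep the example self-contained.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper: counts bulk reads that reach the wrapped source. */
class ReadCounter extends FilterInputStream {
    int calls = 0;

    ReadCounter(InputStream in) {
        super(in);
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        calls++;
        return super.read(b, off, len);
    }
}

class BulkCopyDemo {
    /** Plain copy loop with a 64 KB array; returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] chunk = new byte[64 * 1024];
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
            total += n;
        }
        return total;
    }

    /** Copies 1,000,000 bytes and reports how many reads reached the source. */
    static int sourceReads(boolean buffered) throws IOException {
        ReadCounter source = new ReadCounter(new ByteArrayInputStream(new byte[1_000_000]));
        InputStream in = buffered ? new BufferedInputStream(source) : source;
        copy(in, new ByteArrayOutputStream());
        return source.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("source reads, direct:   " + sourceReads(false));
        System.out.println("source reads, buffered: " + sourceReads(true));
    }
}
```

Because every request is larger than the 8 KB buffer and no mark is set, the buffered stream passes each read straight through, so both runs should report the same number of source reads.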