原文地址：Getting Started with Headless Chrome ByEngineer @ Google working on web tooling: Headless Chrome, Puppeteer, Lighthouse

Headless Chrome在Chrome59中釋出，用於在headless環境中執行Chrome瀏覽器，也就是在非Chrome環境中執行Chrome。它將Chromium和Blink渲染引擎提供的所有現代Web平臺功能引入命令列。

它有什麼用處呢？

headless瀏覽器是自動測試和伺服器環境的絕佳工具，您不需要可見的UI shell。例如，針對真實的網頁進行測試，建立網頁的PDF，或者只是檢查瀏覽器如何呈現URL。

0. 開始

最簡單的開始使用headless模式的方法是從命令列開啟Chrome。如果你已經安裝了Chrome59+的版本，可以使用 --headless 標籤：

chrome \
  --headless \                   # 在headless模式執行Chrome
  --disable-gpu \                # 在Windows上執行時需要--remote-debugging-port=9222 \
  https://www.chromestatus.com   # 開啟URL. 預設為about:blank

注意：若在Windows中執行，則需要在命令列新增 --disable-gpu

。

chrome 命令需要指向Chrome的安裝路徑。（即在Chrome的安裝路徑下執行）

1. 命令列功能

在某些情況下，您可能不需要以程式設計方式編寫Headless Chrome指令碼。下面是一些有用的命令列標誌來執行常見任務。

1.1 列印DOM --dump-dom

將 document.body.innerHTML 在stdout打印出來：

chrome --headless --disable-gpu --dump-dom https://www.chromestatus.com/

1.2 建立PDF --print-to-pdf ：

chrome --headless --disable-gpu --print-to-pdf https:// 
www.chromestatus.com/

1.3 截圖 --screenshot

chrome --headless --disable-gpu --screenshot https://www.chromestatus.com/

# 標準螢幕大小
chrome --headless --disable-gpu --screenshot --window-size=1280,1696 https://www.chromestatus.com/

# Nexus 5x
chrome --headless --disable-gpu --screenshot --window-size=412,732 https://www.chromestatus.com/

執行 --screenshot將會在當前執行目錄下生成一個 screenshot.png 檔案。若想給整個頁面的截圖，那麼會比較複雜。來自 David Schnurr 的一篇很棒的博文介紹了這一內容。請檢視使用 headless Chrome 作為自動截圖工具。

1.4 REPL模式（read-eval-print loop） --repl

在REPL模式執行Headless，該模式允許通過命令列在瀏覽器中評估JS表示式：

$ chrome --headless --disable-gpu --repl --crash-dumps-dir=./tmp https://www.chromestatus.com/
[0608/112805.245285:INFO:headless_shell.cc(278)] Type a Javascript expression to evaluate or "quit" to exit.
>>> location.href
{"result":{"type":"string","value":"https://www.chromestatus.com/features"}}
>>> quit
$

注意：使用repl模式時需要新增 --crash-dumps-dir 命令。

2. 在沒有瀏覽器介面情況下除錯Chrome

當使用 --remote-debugging-port=9222 執行Chrome時，會啟用DevTools協議的例項。該協議用於與Chrome通訊並且驅動headless瀏覽器例項。除此之外，它還是一個類似於 Sublime, VS Code, 和Node的工具，可用於遠端除錯一個應用。

由於沒有瀏覽器UI來檢視頁面，因此需要在另一個瀏覽器中導航到http：// localhost：9222以檢查一切是否正常。這將看到一個可檢視頁面的列表，可以在其中單擊並檢視Headless正在呈現的內容：

DevTools遠端除錯介面

在這裡，你可以使用熟悉的DecTools功能來檢視、除錯、修改頁面。若以程式設計方式（programmatically）使用Headless，該頁面的功能更強大，可以用於檢視所有的DecTools協議的命令，並與瀏覽器進行通訊。

3. 使用程式設計模式（Node）

3.1 Puppeteer

Puppeteer 由Chrome團隊開發的Node庫。它提供了控制headless Chrome的高階API。類似於 Phantom 和 NightmareJS這樣的自動測試庫，但它只用於最新版本的Chrome。

除此之外，Puppeteer還可用於截圖，建立PDF，頁面導航，以及獲取有關這些頁面的資訊。如果需要快速進行瀏覽器的自動化測試，建議使用該庫。它隱藏了DevTools協議的複雜性，並負責啟動Chrome的除錯例項等冗餘任務。

安裝：

npm i --save puppeteer

例子-列印使用者代理資訊：

const puppeteer = require('puppeteer');

(async() => {
  const browser = await puppeteer.launch();
  console.log(await browser.version());
  await browser.close();
})();

例子-截圖

const puppeteer = require('puppeteer');

(async() => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://www.chromestatus.com', {waitUntil: 'networkidle2'});
await page.pdf({path: 'page.pdf', format: 'A4'});

await browser.close();
})();

3.2 CRI庫

相對於Puppeteer's API來說，chrome-remote-interface 是一個低階的庫，推薦使用它更接近底層地直接使用DevTools協議。

開啟Chrome

chrome-remote-interface不能開啟Chrome，因此需要自己開啟Chrome。

在CLI部分，我們使用--headless --remote-debugging-port = 9222手動開啟Chrome。但是，要實現完全自動化測試，您可能希望從應用程式中生成Chrome。

使用 child——process 的一種方式：

const execFile = require('child_process').execFile;

function launchHeadlessChrome(url, callback) {
  // Assuming MacOSx.
  const CHROME = '/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome';
  execFile(CHROME, ['--headless', '--disable-gpu', '--remote-debugging-port=9222', url], callback);
}

launchHeadlessChrome('https://www.chromestatus.com', (err, stdout, stderr) => {
  ...
});

但是如果你想要一個適用於多個平臺的可移植解決方案，那麼事情會變得棘手。看看Chrome的硬編碼路徑吧：(

使用ChromeLaucher

Lighthouse 是測試web應用質量絕佳工具。用於啟動Chrome的強大的模組就是在Lighthouse中開發的，現在可以單獨使用。 chrome-launcher NPM module 可以找到Chrome的安裝路徑，設定除錯例項，開啟瀏覽器，並且當程式執行完成時關掉它。最棒的是，由於Node，它可以跨平臺工作！

預設情況下，chrome-launcher會嘗試啟動Chrome Canary（如果已安裝），但可以更改它以手動選擇要使用的Chrome。要使用它，首先從npm安裝：

npm i --save chrome-launcher

例子-使用 chrome-launcher 啟動Headless模式

const chromeLauncher = require('chrome-launcher');

// 可選: 設定launcher的日誌記錄級別以檢視其輸出
// 安裝：: npm i --save lighthouse-logger
// const log = require('lighthouse-logger');
// log.setLevel('info');

/**
 * 啟動Chrome的除錯例項
 * @param {boolean=} headless True (default) 啟動headless模式的Chrome.
 *     False 啟動Chrome的完成版本.
 * @return {Promise<ChromeLauncher>}
 */
function launchChrome(headless=true) {
  return chromeLauncher.launch({
    // port: 9222, // Uncomment to force a specific port of your choice.
    chromeFlags: [
      '--window-size=412,732',
      '--disable-gpu',
      headless ? '--headless' : ''
    ]
  });
}

launchChrome().then(chrome => {
  console.log(`Chrome debuggable on port: ${chrome.port}`);
  ...
  // chrome.kill();
});

執行此指令碼並沒有太大作用，但在工作管理員中應該可以看到Chrome例項已啟動，內容為 about:blank 。但是沒有瀏覽器介面。因為是headless模式。

要控制瀏覽器，我們需要DevTools協議！

檢索有關頁面的資訊

安裝：

npm i --save chrome-remote-interface

例子-列印使用者代理

const CDP = require('chrome-remote-interface');

...

launchChrome().then(async chrome => {
  const version = await CDP.Version({port: chrome.port});
  console.log(version['User-Agent']);
});

結果類似於： HeadlessChrome/60.0.3082.0

例子-檢查網站是否有應用列表

const CDP = require('chrome-remote-interface');

...

(async function() {

const chrome = await launchChrome();
const protocol = await CDP({port: chrome.port});

// Extract the DevTools protocol domains we need and enable them.
// See API docs: https://chromedevtools.github.io/devtools-protocol/
const {Page} = protocol;
await Page.enable();

Page.navigate({url: 'https://www.chromestatus.com/'});

// Wait for window.onload before doing stuff.
Page.loadEventFired(async () => {
  const manifest = await Page.getAppManifest();

  if (manifest.url) {
    console.log('Manifest: ' + manifest.url);
    console.log(manifest.data);
  } else {
    console.log('Site has no app manifest');
  }

  protocol.close();
  chrome.kill(); // Kill Chrome.
});

})();

例子-使用DOM API提取頁面的<title>

const CDP = require('chrome-remote-interface');

...

(async function() {

const chrome = await launchChrome();
const protocol = await CDP({port: chrome.port});

// Extract the DevTools protocol domains we need and enable them.
// See API docs: https://chromedevtools.github.io/devtools-protocol/
const {Page, Runtime} = protocol;
await Promise.all([Page.enable(), Runtime.enable()]);

Page.navigate({url: 'https://www.chromestatus.com/'});

// Wait for window.onload before doing stuff.
Page.loadEventFired(async () => {
  const js = "document.querySelector('title').textContent";
  // Evaluate the JS expression in the page.
  const result = await Runtime.evaluate({expression: js});

  console.log('Title of page: ' + result.result.value);

  protocol.close();
  chrome.kill(); // Kill Chrome.
});

})();

4. 使用Selenium，WebDriver和ChromeDriver

現在，Selenium打開了一個完整地Chrome的例項，也就是說，換句話說，它是一種自動化解決方案，但並非完全headless。但是，Selenium可以通過一些配置來執行headless Chrome。我建議使用headless Chrome執行Selenium，若你還是想要如何自己設定的完整說明，我已經在下面的一些例子中展示瞭如何讓你放棄。

使用ChromeDriver

ChromeDriver 2.32使用了Chrome61，並且在headless Chrome執行的更好。

安裝：

npm i --save-dev selenium-webdriver chromedriver

例子

const fs = require('fs');
const webdriver = require('selenium-webdriver');
const chromedriver = require('chromedriver');

const chromeCapabilities = webdriver.Capabilities.chrome();
chromeCapabilities.set('chromeOptions', {args: ['--headless']});

const driver = new webdriver.Builder()
  .forBrowser('chrome')
  .withCapabilities(chromeCapabilities)
  .build();

// Navigate to google.com, enter a search.
driver.get('https://www.google.com/');
driver.findElement({name: 'q'}).sendKeys('webdriver');
driver.findElement({name: 'btnG'}).click();
driver.wait(webdriver.until.titleIs('webdriver - Google Search'), 1000);

// Take screenshot of results page. Save to disk.
driver.takeScreenshot().then(base64png => {
  fs.writeFileSync('screenshot.png', new Buffer(base64png, 'base64'));
});

driver.quit();

使用WebDriverIO

WebDriverIO 是Selenium WebDriver之上的更高階的API。

安裝：

npm i --save-dev webdriverio chromedriver

例子-chromestatus.com上的CSS filter功能

const webdriverio = require('webdriverio');
const chromedriver = require('chromedriver');

const PORT = 9515;

chromedriver.start([
  '--url-base=wd/hub',
  `--port=${PORT}`,
  '--verbose'
]);

(async () => {

const opts = {
  port: PORT,
  desiredCapabilities: {
    browserName: 'chrome',
    chromeOptions: {args: ['--headless']}
  }
};

const browser = webdriverio.remote(opts).init();

await browser.url('https://www.chromestatus.com/features');

const title = await browser.getTitle();
console.log(`Title: ${title}`);

await browser.waitForText('.num-features', 3000);
let numFeatures = await browser.getText('.num-features');
console.log(`Chrome has ${numFeatures} total features`);

await browser.setValue('input[type="search"]', 'CSS');
console.log('Filtering features...');
await browser.pause(1000);

numFeatures = await browser.getText('.num-features');
console.log(`Chrome has ${numFeatures} CSS features`);

const buffer = await browser.saveScreenshot('screenshot.png');
console.log('Saved screenshot...');

chromedriver.stop();
browser.end();

})();

5. 更多資源

以下是一些有用的資源，可幫助您入門：

文件：

工具：

演示：

"The Headless Web" - Paul Kinlan關於使用Headless和api.ai的部落格文章

6. FAQ

6.1 是否需要 --disable-gpu 命令？

僅Windows平臺需要。其他平臺不需要。--disable-gpu命令是一個臨時解決一些錯誤的方案。在將來的Chrome版本中，不再需要此命令。有關更多資訊，請參閱crbug.com/737678。

6.2 是否需要 Xvfb？

不需要。Headless Chrome不使用視窗，因此不再需要像Xvfb這樣的顯示伺服器。沒有它，也可以愉快地執行自動化測試。

什麼是Xvfb？Xvfb是一種用於類Unix系統的記憶體顯示伺服器，它使您能夠執行圖形應用程式（如Chrome）而無需附加物理顯示裝置。許多人使用Xvfb執行早期版本的Chrome進行“headless”測試。

6.3 如何建立執行Headless Chrome的Docker容器？

6.4 Headless Chrome與PhantomJS有什麼關係？

Headless Chrome與PhantomJS等工具類似。兩者都可用於headless環境中的自動化測試。兩者之間的主要區別在於Phantom使用較舊版本的WebKit作為其渲染引擎，而Headless Chrome使用最新版本的Blink。

目前，Phantom還提供了比

6.5 在哪裡提交bugs？

對於Headless Chrome的bugs，請在crbug.com上提交。

tech & looking ahead to the platform's future.

Getting Started with Headless Chrome

Engineer @ Google working on web tooling: Headless Chrome, Puppeteer, Lighthouse

TL;DR

Headless Chrome is shipping in Chrome 59. It's a way to run the Chrome browser in a headless environment. Essentially, running Chrome without chrome! It brings all modern web platform features provided by Chromium and the Blink rendering engine to the command line.

Why is that useful?

A headless browser is a great tool for automated testing and server environments where you don't need a visible UI shell. For example, you may want to run some tests against a real web page, create a PDF of it, or just inspect how the browser renders an URL.

Note: Headless mode has been available on Mac and Linux since Chrome 59. Windows support came in Chrome 60.

Starting Headless (CLI)

The easiest way to get started with headless mode is to open the Chrome binary from the command line. If you've got Chrome 59+ installed, start Chrome with the --headless flag:

chrome \--headless \# Runs Chrome in headless mode.--disable-gpu \# Temporarily needed if running on Windows.--remote-debugging-port=9222\  https://www.chromestatus.com   # URL to open. Defaults to about:blank.

Note: Right now, you'll also want to include the --disable-gpu flag if you're running on Windows. See crbug.com/737678.

chrome should point to your installation of Chrome. The exact location will vary from platform to platform. Since I'm on Mac, I created convenient aliases for each version of Chrome that I have installed.

If you're on the stable channel of Chrome and cannot get the Beta, I recommend using chrome-canary:

alias chrome="/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome"alias chrome-canary="/Applications/Google\ Chrome\ Canary.app/Contents/MacOS/Google\ Chrome\ Canary"alias chromium="/Applications/Chromium.app/Contents/MacOS/Chromium"

Download Chrome Canary here.

Command line features

In some cases, you may not need to programmatically script Headless Chrome. There are some useful command line flags to perform common tasks.

Printing the DOM

The --dump-dom flag prints document.body.innerHTML to stdout:

chrome --headless --disable-gpu --dump-dom https://www.chromestatus.com/

Create a PDF

The --print-to-pdf flag creates a PDF of the page:

chrome --headless --disable-gpu --print-to-pdf https://www.chromestatus.com/

Taking screenshots

To capture a screenshot of a page, use the --screenshot flag:

chrome --headless --disable-gpu --screenshot https://www.chromestatus.com/# Size of a standard letterhead.chrome --headless --disable-gpu --screenshot --window-size=1280,1696 https://www.chromestatus.com/# Nexus 5xchrome --headless --disable-gpu --screenshot --window-size=412,732 https://www.chromestatus.com/

Running with --screenshot will produce a file named screenshot.png in the current working directory. If you're looking for full page screenshots, things are a tad more involved. There's a great blog post from David Schnurr that has you covered. Check out Using headless Chrome as an automated screenshot tool .

REPL mode (read-eval-print loop)

The --repl flag runs Headless in a mode where you can evaluate JS expressions in the browser, right from the command line:

$ chrome --headless --disable-gpu --repl --crash-dumps-dir=./tmp https://www.chromestatus.com/[0608/112805.245285:INFO:headless_shell.cc(278)]Type a Javascript expression to evaluate or"quit" to exit.>>> location.href{"result":{"type":"string","value":"https://www.chromestatus.com/features"}}>>> quit$

Note: the addition of the --crash-dumps-dir flag when using repl mode.

Debugging Chrome without a browser UI?

When you run Chrome with --remote-debugging-port=9222, it starts an instance with the DevTools protocol enabled. The protocol is used to communicate with Chrome and drive the headless browser instance. It's also what tools like Sublime, VS Code, and Node use for remote debugging an application. #synergy

Since you don't have browser UI to see the page, navigate to http://localhost:9222 in another browser to check that everything is working. You'll see a list of inspectable pages where you can click through and see what Headless is rendering:

DevTools remote debugging UI

From here, you can use the familiar DevTools features to inspect, debug, and tweak the page as you normally would. If you're using Headless programmatically, this page is also a powerful debugging tool for seeing all the raw DevTools protocol commands going across the wire, communicating with the browser.

Using programmatically (Node)

Puppeteer

Puppeteer is a Node library developed by the Chrome team. It provides a high-level API to control headless (or full) Chrome. It's similar to other automated testing libraries like Phantom and NightmareJS, but it only works with the latest versions of Chrome.

Among other things, Puppeteer can be used to easily take screenshots, create PDFs, navigate pages, and fetch information about those pages. I recommend the library if you want to quickly automate browser testing. It hides away the complexities of the DevTools protocol and takes care of redundant tasks like launching a debug instance of Chrome.

Install it:

npm i --save puppeteer

Example - print the user agent

const puppeteer =require('puppeteer');(async()=>{const browser = await puppeteer.launch();  console.log(await browser.version());  await browser.close();})();

Example - taking a screenshot of the page

const puppeteer =require('puppeteer');(async()=>{const browser = await puppeteer.launch();const page = await browser.newPage();await page.goto('https://www.chromestatus.com',{waitUntil:'networkidle2'});await page.pdf({path:'page.pdf', format:'A4'});await browser.close();})();

The CRI library

chrome-remote-interface is a lower-level library than Puppeteer's API. I recommend it if you want to be close to the metal and use the DevTools protocol directly.

Launching Chrome

chrome-remote-interface doesn't launch Chrome for you, so you'll have to take care of that yourself.

In the CLI section, we started Chrome manually using --headless --remote-debugging-port=9222. However, to fully automate tests, you'll probably want to spawn Chrome from your application.

One way is to use child_process:

const execFile =require('child_process').execFile;function launchHeadlessChrome(url, callback){// Assuming MacOSx.const CHROME ='/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome';  execFile(CHROME,['--headless','--disable-gpu','--remote-debugging-port=9222', url], callback);}launchHeadlessChrome('https://www.chromestatus.com',(err, stdout, stderr)=>{...});

But things get tricky if you want a portable solution that works across multiple platforms. Just look at that hard-coded path to Chrome :(

Using ChromeLauncher

Lighthouse is a marvelous tool for testing the quality of your web apps. A robust module for launching Chrome was developed within Lighthouse and is now extracted for standalone use. The chrome-launcher NPM modulewill find where Chrome is installed, set up a debug instance, launch the browser, and kill it when your program is done. Best part is that it works cross-platform thanks to Node!

By default, chrome-launcher will try to launch Chrome Canary (if it's installed), but you can change that to manually select which Chrome to use. To use it, first install from npm:

npm i --save chrome-launcher

Example - using chrome-launcher to launch Headless

const chromeLauncher =require('chrome-launcher');// Optional: set logging level of launcher to see its output.// Install it using: npm i --save lighthouse-logger// const log = require('lighthouse-logger');// log.setLevel('info');/** * Launches a debugging instance of Chrome. * @param {boolean=} headless True (default) launches Chrome in headless mode. *     False launches a full version of Chrome. * @return {Promise<ChromeLauncher>} */function launchChrome(headless=true){return chromeLauncher.launch({// port: 9222, // Uncomment to force a specific port of your choice.    chromeFlags:['--window-size=412,732','--disable-gpu',      headless ?'--headless':'']});}launchChrome().then(chrome =>{  console.log(`Chrome debuggable on port: ${chrome.port}`);...// chrome.kill();});

Headless Chrome入門