
ZSL

Using Shiki in Hugo, Over 100× Faster


The goal is to use the Shiki syntax highlighter in Hugo, replacing the built-in and error-prone Chroma, while making Shiki run dramatically faster. Note that this article is irrelevant to most Shiki users: its purpose is to add Shiki highlighting to HTML files that have already been built.

Preface

I had originally planned to write the tutorial around PaperMod, the most widely used theme, but in practice I found that PaperMod overrides the code block CSS, meaning the instructions would not stay valid forever, so I turned this into a performance analysis instead.

Back to the topic: Blowfish's built-in Chroma highlighting is riddled with errors. Two adjacent lines in the identical command --argument format end up highlighted differently. I then switched to the out-of-the-box PrismJS and HLJS, but as a performance nerd I dislike client-side rendering, so after seeing eallion's Shiki article I switched to Shiki without a second thought.

Before switching to Shiki, note these functional issues and usage details:

  1. Code blocks must be tagged with the correct language, otherwise they get no color at all
  2. Each Hugo theme's CSS needs separate handling; this article only covers Blowfish and PaperMod as examples
  3. The code copy button is also very likely to disappear and must be fixed on your own
  4. Live preview during development does not work with hugo server --renderToMemory
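For point 1, the language tag must sit right after the opening fence in your Markdown source; a fence with no tag (or one Shiki does not recognize) is rendered with no color:

````markdown
```python
print("hello")   # tagged: highlighted as Python
```

```
print("hello")   # untagged: no color at all
```
````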

Different Approaches

Eallion's Approach

eallion's article does it via the rehype CLI, in three steps:

  1. Disable Hugo's built-in highlighting
  2. Configure the rehype and shiki CLI
  3. Use the rehype CLI to add Shiki highlighting to the already-built HTML pages

This approach has several problems. First, it is very slow: my site has fewer than 60 Markdown files yet takes six seconds to render, and eallion's site with 600 Markdown files takes 191 seconds. Second, memory usage: it processes every HTML file in the target folder at once, so memory consumption is huge. Third, there is no preview while editing; you have to run the shiki command in a separate terminal to see the highlighted result.
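eallion's exact configuration is not reproduced here, but as a rough sketch of step 2: the rehype CLI reads its plugins from a .rehyperc file, so assuming the official @shikijs/rehype plugin and the GitHub themes, the setup might look roughly like:

```json
{
  "plugins": [
    ["@shikijs/rehype", { "themes": { "light": "github-light", "dark": "github-dark" } }]
  ]
}
```

Step 3 would then be a single pass such as rehype public -o over the build output (the plugin name and flags are assumptions, not copied from eallion's post).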

Using eallion's site as the test case, running the rehype shiki CLI on an M1 MacBook Pro takes 191 seconds.

Orta Therox's Approach

Orta is not just a Shiki contributor: his blog also happens to be built with Hugo, so his approach should be a solid reference. He runs Shiki from a script, with no rehype and with basic filtering, though without deeper performance optimization.

Using eallion's site as the test case, Orta Therox's shikify.ts takes 5.9 seconds on an M1 MacBook Pro.

My Approach

I also use a script, but with several optimizations:

  1. More sophisticated filtering
    1. Choose which directories get processed
    2. Each directory can list excluded paths; for example, Hugo auto-generates page listings, and every HTML file under such a path is skipped
    3. For large projects, development mode can watch only the specified directories to avoid unnecessary re-processing
    4. Incremental processing: during local development Hugo only re-renders the files affected by the current edit, so the script caches HTML files by modification time and skips unchanged ones
  2. More efficient processing
    1. Asynchronous file I/O, with Shiki running on multiple worker threads
    2. The thread count is derived from the CPU core count and the number of pending files
    3. Each thread fetches tasks in batches to avoid communication overhead
    4. In practice most HTML pages contain no code block at all, so an early return skips parsing pages that have none
    5. The fast htmlparser2 instead of jsdom/cheerio

Using eallion's site as the test case, the optimized script finishes in just 1.88 seconds on an M1 MacBook Pro, and previewing an edit during development completes within a second.

Performance Analysis

Summing up the three approaches, from the simplest to my most elaborate solution, the speedups are 1× / 32× / 101×. Moreover, during local development my approach benefits from the cache and avoids redundant work, completing the highlighting in about 0.8 seconds, a speedup of as much as 240×.

---
config:
    xyChart:
        width: 1000
        height: 600
    themeVariables:
        xyChart:
            plotColorPalette: "#00A2ED"
---
xychart-beta
    title "Shiki highlight speedup multipliers, using eallion.com as the sample project"
    x-axis ["My script (cached)", "My script (uncached)", "Orta's script", "Rehype CLI"]
    y-axis "Speedup Multiplier" 0 --> 250
    bar [238.75, 101.6, 32.37, 1]

If you want the fastest possible feedback during development, the script's TARGETS_DEV setting controls the dev-stage render targets; restricting it to just the files being edited finishes rendering in as little as 0.25 seconds, 764× faster than a full render.

Having compared the approaches, let's now profile my own solution with Clinic.js, running pnpm clinic flame -- node scripts/shiki/index.js. The full flame graph looks like this:

[Flame graph: full profile]

V8 accounts for about 70% of the time. If we now disable our own Shiki step and compare the two flame graphs, the difference is tiny, meaning there is no optimization headroom left:

[Flame graph: with the Shiki step disabled]

Still, let's take a quick look at the current bottleneck by hiding the V8 and dependency frames; the result is as follows:

[Flame graph: only Node and Shiki frames]

The red boxes mark time spent in my own functions. About half of it goes to scanning files, which is expected given that eallion's build output contains as many as 6,000 directories (find ./public -mindepth 1 -type d | wc -l prints 6451).

The one remaining fix would be keeping the worker threads and the Shiki highlighter instance alive during development; right now the script is simply re-executed from scratch each time.

Usage Guide

The code in this article consists of 8 files in total:

❯ tree scripts
scripts
├── dev.js
└── shiki
    ├── bench.py
    ├── config.js
    ├── file-collector.js
    ├── highlighter.js
    ├── index.js
    ├── logger.js
    └── worker.js

2 directories, 8 files

Create these files in your project; the most important ones are:

  • scripts/dev.js for live preview during development; it wraps hugo server
  • scripts/shiki/index.js for use at deploy time, so your Pages build command becomes hugo --minify && node scripts/shiki/index.js
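In package.json this typically translates to scripts like the following (the script names are just a suggestion):

```json
{
  "scripts": {
    "dev": "node scripts/dev.js",
    "build": "hugo --minify && node scripts/shiki/index.js"
  }
}
```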
dev.js
const { spawn, execSync } = require("node:child_process");

const SHIKI_COMMAND = "node scripts/shiki/index.js --dev --quiet";
const HUGO_COMMAND = ["server", "--disableKinds", "RSS", "-p", "1313"];
const SASS_INIT = "pnpm run build:css";
const SASS_WATCH = "pnpm run css:watch";

const HUGO_OK_FLAG = "Web Server is available"; // watch for Hugo finishing the HTML build
const HUGO_DEBOUNCE_MS = 300; // avoid re-running shiki in quick succession
const CLEANUP_RETRY_MS = 500; // retry delay when killing a spawned process fails

const EXIT_SIGNALS = ["SIGTERM", "SIGHUP", "SIGQUIT", "SIGABRT", "uncaughtException", "unhandledRejection"];

let debounceTimer;
let isShikiRunning = false;
let hugoProcess;
let sassWatchProcess;
const childProcesses = [];

function cleanup() {
  for (const proc of childProcesses) {
    if (proc?.pid && !proc.killed) {
      try {
        proc.kill("SIGTERM");

        setTimeout(() => {
          try {
            if (proc?.pid && !proc.killed) {
              proc.kill("SIGKILL");

              if (process.platform === "win32") {
                execSync(`taskkill /F /PID ${proc.pid}`, { stdio: "ignore" });
              } else {
                execSync(`kill -9 ${proc.pid}`, { stdio: "ignore" });
              }
            }
          } catch (e) {}
        }, CLEANUP_RETRY_MS).unref();
      } catch (e) {}
    }
  }
}

process.on("exit", cleanup);
process.on("SIGINT", cleanup);

for (const signal of EXIT_SIGNALS) {
  process.on(signal, (error) => {
    if (signal !== "SIGINT" && error) console.error(`Error: ${signal}`, error);
    cleanup();
    setTimeout(() => {
      process.exit(signal === "uncaughtException" || signal === "unhandledRejection" ? 1 : 0);
    }, CLEANUP_RETRY_MS).unref();
  });
}

try {
  execSync(SASS_INIT, { stdio: "inherit" });

  sassWatchProcess = spawn("sh", ["-c", SASS_WATCH], { detached: false });
  childProcesses.push(sassWatchProcess);

  hugoProcess = spawn("hugo", HUGO_COMMAND, { detached: false });
  childProcesses.push(hugoProcess);

  hugoProcess.stdout.on("data", (data) => {
    const output = data.toString();
    console.log(output);

    if (output.includes(HUGO_OK_FLAG)) {
      clearTimeout(debounceTimer);
      debounceTimer = setTimeout(() => {
        if (!isShikiRunning) {
          isShikiRunning = true;
          try {
            execSync(SHIKI_COMMAND, { stdio: "inherit" });
          } finally {
            isShikiRunning = false;
          }
        } else {
          console.log("Shiki is already running, skipping trigger.");
        }
      }, HUGO_DEBOUNCE_MS);
    }
  });

  hugoProcess.stderr.on("data", (data) => {
    console.error(`Hugo: ${data.toString().trim()}`);
  });

  sassWatchProcess.stdout.on("data", (data) => {
    console.log(`SASS: ${data.toString().trim()}`);
  });

  sassWatchProcess.stderr.on("data", (data) => {
    console.error(`SASS: ${data.toString().trim()}`);
  });

  hugoProcess.on("close", (code) => {
    console.log(`Hugo process exited with code ${code}`);
    cleanup();
    process.exit(code || 0);
  });

  sassWatchProcess.on("close", (code) => {
    console.log(`SASS process exited with code ${code}`);
    if (hugoProcess && !hugoProcess.killed) {
      cleanup();
      process.exit(code || 0);
    }
  });
} catch (error) {
  console.error(error);
  cleanup();
  process.exit(1);
}
config.js
const CONFIG = {
  // Target directories to scan; EXCLUDE lists *exact* folder names to skip.
  // e.g. "page" excludes only "path/page/*", not "path/*page*/*"
  TARGETS: [
    {
      DIR: "public/posts",
    },
    {
      DIR: "public/til",
    },
    {
      DIR: "public/series",
      EXCLUDE: ["folderNameToExclude", "anotherName"],
    },
  ],

  // Large projects can narrow the scope during development
  TARGETS_DEV: [
    {
      DIR: "public/posts",
      EXCLUDE: ["page"],
    },
    {
      DIR: "public/til",
    },
  ],

  // Enable light/dark dual themes
  // https://shiki.style/guide/dual-themes
  ENABLE_DUAL_THEME: true,

  // Color themes; with a single theme only LIGHT is needed
  // https://shiki.style/themes
  // THEMES: { LIGHT: "min-light", DARK: "andromeeda" },
  // THEMES: { LIGHT: "Catppuccin Latte", DARK: "Ayu Dark" },
  THEMES: { LIGHT: "github-light", DARK: "github-dark" },

  // Supported languages
  // https://shiki.style/languages
  LANGUAGES: [
    "javascript",
    "typescript",
    "html",
    "css",
    "json",
    "py",
    "yaml",
    "dotenv",
    "sh",
    "md",
    "go",
    "c",
    "ini",
    "toml",
    "matlab",
    "tex",
    "php",
    "nginx",
    "docker",
    "jsonc",
    "json5",
  ],

  // codeimporter sets the language from the file extension; map extensions to ones Shiki accepts.
  // e.g. FileName.env is rendered as dotenv
  LANGUAGE_ALIAS: {
    env: "dotenv",
    node: "js",
    gitignore: "shell",
  },

  // Number of worker threads
  THREAD_COUNT: 4,

  // Number of files each thread fetches at a time
  BATCH_SIZE: 4,
};

module.exports = CONFIG;
file-collector.js
const fs = require("node:fs").promises;
const path = require("node:path");

/**
 * @typedef {Object} TargetConfig
 * @property {string} DIR - root directory to scan
 * @property {string[]=} [EXCLUDE] - folder names to exclude
 */

/**
 * Scan the target directories and collect all .html files
 * @param {TargetConfig[]} targets - array of target directory configs
 * @returns {Promise<string[]>} - paths of all matching HTML files
 */
async function collectHtmlFiles(targets) {
  const result = new Set();

  for (const { DIR, EXCLUDE = [] } of targets) {
    await scanDirectory(DIR, EXCLUDE, result);
  }

  return Array.from(result);
}

/**
 * Scan a single directory and collect HTML files
 * @param {string} rootDir - root directory to scan
 * @param {string[]} exclude - folder names to exclude
 * @param {Set<string>} resultSet - set collecting the results
 */
async function scanDirectory(rootDir, exclude, resultSet) {
  const queue = [rootDir];

  while (queue.length > 0) {
    const currentDir = queue.shift();

    let entries;
    try {
      entries = await fs.readdir(currentDir, { withFileTypes: true });
    } catch (error) {
      console.error(`Error reading ${currentDir}: ${error.message}`);
      continue;
    }

    for (const entry of entries) {
      const fullPath = path.join(currentDir, entry.name);
      const normalized = fullPath + path.sep;

      if (shouldExclude(normalized, exclude)) continue;

      if (entry.isDirectory()) {
        queue.push(fullPath);
        continue;
      }

      if (entry.isFile() && entry.name.endsWith(".html")) {
        resultSet.add(fullPath);
      }
    }
  }
}

/**
 * Check whether a path should be excluded
 * @param {string} filePath - file path to check
 * @param {string[]} exclude - folder names to exclude
 * @returns {boolean} - whether the path should be excluded
 */
function shouldExclude(filePath, exclude) {
  return exclude.some((folder) => filePath.includes(path.sep + folder + path.sep));
}

module.exports = { collectHtmlFiles };
highlighter.js
const fs = require("node:fs/promises");
const render = require("dom-serializer").default;
const { createHighlighter } = require("shiki");
const { parseDocument } = require("htmlparser2");
const { decode } = require("html-entities");
const { DomUtils } = require("htmlparser2");

class CodeHighlighter {
  constructor(CustomConfig) {
    this.CustomConfig = CustomConfig;
    this.themeOption = this.CustomConfig.ENABLE_DUAL_THEME
      ? {
          themes: {
            light: this.CustomConfig.THEMES.LIGHT,
            dark: this.CustomConfig.THEMES.DARK,
          },
        }
      : { theme: this.CustomConfig.THEMES.LIGHT };
    this.shiki = null;
  }

  async initShiki() {
    if (this.shiki) return;

    this.shiki = await createHighlighter({
      themes: [this.CustomConfig.THEMES.LIGHT, this.CustomConfig.THEMES.DARK],
      langs: this.CustomConfig.LANGUAGES,
    });
  }

  async processFile(filePath) {
    const html = await fs.readFile(filePath, "utf8");

    // quick filter: ~300 ms (15%) improvement on a large repo
    if (!html.includes("<pre><code") || html.includes('<pre class="shiki')) return 0;

    const dom = parseDocument(html, { decodeEntities: false });
    const nodes = [];

    for (const el of DomUtils.findAll((el) => el.name === "pre", dom.children)) {
      if (el.attribs?.class?.includes("shiki")) return 0; // already processed
      const codeNode = el.children?.find((child) => child.name === "code"); // find the <code> tag
      if (codeNode) nodes.push(codeNode);
    }

    if (nodes.length === 0) return 0;

    let modified = 0;
    for (const code of nodes) {
      const pre = code.parent;
      const rawCode = decode(DomUtils.textContent(code));

      let lang = code.attribs?.class?.replace(/^language-/, "") || "text";
      lang = this.CustomConfig.LANGUAGE_ALIAS[lang] || lang;

      try {
        const highlighted = await this.shiki.codeToHtml(rawCode, { lang, ...this.themeOption });
        const newDom = parseDocument(highlighted, { decodeEntities: false }).children;

        // Add a wrapper div, following the VitePress format
        const wrapperDiv = {
          type: "tag",
          name: "div",
          attribs: { class: `language-${lang}` },
          children: [newDom[0]],
          parent: null,
        };
        DomUtils.replaceElement(pre, wrapperDiv);
        // DomUtils.replaceElement(pre, newDom[0]);
        modified++;
      } catch (e) {
        console.error(`Error highlighting code block (${lang}): ${e.message}`);
      }
    }

    if (modified > 0) {
      const updated = render(dom, { decodeEntities: false });
      await fs.writeFile(filePath, updated, "utf8");
    }

    return modified;
  }
}

module.exports = CodeHighlighter;
index.js
const { isMainThread, workerData } = require("node:worker_threads");
const CodeHighlighter = require("./highlighter");
const { MainThreadManager, WorkerThreadManager } = require("./worker");

async function main() {
  if (isMainThread) {
    const CONFIG = require("./config");
    CONFIG.DEBUG = process.argv.includes("--debug") || false;
    CONFIG.QUIET = CONFIG.DEBUG ? false : process.argv.includes("--quiet") || false;
    CONFIG.DEV = process.argv.includes("--dev") || false;

    try {
      const mainThreadManager = new MainThreadManager(CONFIG);
      await mainThreadManager.initialize();
      await mainThreadManager.runMainThread();
    } catch (error) {
      throw new Error(`Main thread error: ${error.message}\n${error.stack}`);
    }
  } else {
    const configFromMain = workerData.workerThreadConfig;
    const CONFIG = Object.assign({}, require("./config"), configFromMain);
    const { workerId, cachedProcessedFiles } = workerData;

    try {
      const highlighter = new CodeHighlighter(CONFIG);
      const workerThreadManager = new WorkerThreadManager(CONFIG);
      await workerThreadManager.initializeWorker(workerId, highlighter, cachedProcessedFiles);
    } catch (error) {
      throw new Error(`Worker ${workerId} initialization error: ${error.message}\n${error.stack}`);
    }
  }
}

if (require.main === module) {
  main().catch((error) => {
    console.error("Fatal error:", error.stack);
    process.exit(1);
  });
}

module.exports = {
  CodeHighlighter,
  MainThreadManager: isMainThread ? require("./worker").MainThreadManager : null,
  WorkerThreadManager: isMainThread ? require("./worker").WorkerThreadManager : null,
};
logger.js
const chalk = require("chalk"); // note: require() needs chalk v4; chalk v5+ is ESM-only

function patchConsole(prefix, color = "green") {
  const colorPrefix = chalk[color](prefix);
  for (const key of ["log", "warn", "error", "info", "debug"]) {
    const original = console[key].bind(console);
    console[key] = (...args) => original(colorPrefix, ...args);
  }
}

module.exports = { patchConsole };
worker.js
const path = require("node:path");
const { Worker, parentPort } = require("node:worker_threads");
const fs = require("node:fs").promises;
const { collectHtmlFiles } = require("./file-collector");

const MESSAGE_TYPES = {
  READY: "ready",
  PROCESS: "process",
  COMPLETED: "completed",
  EXIT: "exit",
  ERROR: "error",
};

class MainThreadManager {
  constructor(config, workerArgs = []) {
    this.config = config;
    this.workerArgs = workerArgs;
    this.targets = config.DEV ? config.TARGETS_DEV : config.TARGETS;
    this.mainScriptPath = path.resolve(__dirname, "index.js");
    this.processedFiles = new Map();
    this.recordFilePath = path.join(__dirname, ".processed-files.json");
    this.workers = [];
    this.totalProcessed = 0;
    this.activeWorkers = 0;
    this.startTime = Date.now();
  }

  async initialize() {
    await this.loadProcessedRecords();
  }

  async loadProcessedRecords() {
    try {
      const data = await fs.readFile(this.recordFilePath, "utf8");
      const records = JSON.parse(data);

      for (const [file, timestamp] of Object.entries(records)) {
        this.processedFiles.set(file, timestamp);
      }

      if (this.config.DEBUG) {
        console.log(`Loaded records for ${this.processedFiles.size} files`);
      }
    } catch (error) {
      if (error.code !== "ENOENT") {
        console.error("Failed to load processed files records:", error.stack);
      }
    }
  }

  async saveProcessedRecords() {
    try {
      const records = Object.fromEntries(this.processedFiles);

      await fs.writeFile(this.recordFilePath, JSON.stringify(records, null, 2), "utf8");
      if (this.config.DEBUG) {
        console.log(`Saved records for ${Object.keys(records).length} processed files`);
      }
    } catch (error) {
      console.error("Failed to save processed files records:", error.stack);
    }
  }

  async runMainThread() {
    console.log("Starting HTML syntax highlighting process in:");
    for (const target of this.targets) {
      const excludeInfo = target.EXCLUDE?.length ? ` (excluding: ${target.EXCLUDE.join(", ")})` : "";
      console.log(`- ${target.DIR}${excludeInfo}`);
    }

    const files = await collectHtmlFiles(this.targets);
    if (!files || files.length === 0) {
      console.log("No HTML files found to process.");
      return;
    }

    const filteredFiles = await this._filterFile(files);

    if (filteredFiles.length === 0) {
      console.log("No HTML files need processing.");
      return;
    }

    const workerCount = this._calculateOptimalWorkerCount(filteredFiles.length);
    if (!this.config.QUIET) {
      console.log(`Found ${filteredFiles.length} HTML files. Starting with ${workerCount} workers.`);
    }

    return this._startWorkerProcessing(filteredFiles, workerCount);
  }

  async _filterFile(files) {
    const filteredFiles = [];

    await Promise.all(
      files.map(async (filePath) => {
        try {
          const stats = await fs.stat(filePath);
          const lastModified = stats.mtimeMs;
          if (!this.processedFiles.has(filePath) || this.processedFiles.get(filePath) < lastModified) {
            filteredFiles.push(filePath);
          }
        } catch (error) {
          console.error(`Error checking file status: ${filePath}`, error.stack);
          filteredFiles.push(filePath);
        }
      }),
    );

    return filteredFiles;
  }

  _calculateOptimalWorkerCount(fileCount) {
    const cpuCount = require("node:os").cpus().length;
    const baseWorkerCount = Math.min(this.config.THREAD_COUNT, Math.ceil(fileCount / 10), cpuCount - 1 || 1);
    return Math.max(1, baseWorkerCount);
  }

  _startWorkerProcessing(files, workerCount) {
    const jobQueue = [...files];
    this.activeWorkers = workerCount;

    return new Promise((resolve) => {
      for (let i = 0; i < workerCount; i++) {
        this._createAndSetupWorker(i, jobQueue, resolve);
      }
    });
  }

  _createAndSetupWorker(workerId, jobQueue, resolvePromise) {
    const processedFilesArray = Array.from(this.processedFiles.entries());

    const worker = new Worker(this.mainScriptPath, {
      workerData: {
        workerId,
        isWorker: true,
        workerThreadConfig: this.config,
        cachedProcessedFiles: processedFilesArray,
      },
      argv: this.workerArgs,
    });

    worker.on("message", (message) => {
      this._handleWorkerMessage(message, worker, jobQueue);
      if (message.type === MESSAGE_TYPES.COMPLETED) {
        this.totalProcessed += message.count;
      }
    });

    worker.on("error", (error) => {
      console.error(`[Worker ${workerId}] encountered an error:`, error.stack);
    });

    worker.on("exit", (code) => {
      this._handleWorkerExit(workerId, code, resolvePromise);
    });

    this.workers.push(worker);
  }

  _handleWorkerExit(workerId, code, resolvePromise) {
    if (this.config.DEBUG && code !== 0) {
      console.warn(`[Worker ${workerId}] exited with code ${code}`);
    }

    this.activeWorkers--;

    if (this.activeWorkers === 0) {
      console.log(
        `Process completed. Modified ${this.totalProcessed} files using ${Date.now() - this.startTime} ms`,
      );
      this.saveProcessedRecords().then(resolvePromise);
    }
  }

  _handleWorkerMessage(message, worker, jobQueue) {
    switch (message.type) {
      case MESSAGE_TYPES.COMPLETED:
        this._updateProcessedFiles(message.processedFiles);
        this._assignNextBatch(worker, jobQueue);
        break;
      case MESSAGE_TYPES.READY:
        this._assignNextBatch(worker, jobQueue);
        break;
      case MESSAGE_TYPES.ERROR:
        console.error(`[Worker ${message.workerId}] error:`, message.error);
        break;
    }
  }

  _updateProcessedFiles(processedFiles) {
    if (!processedFiles || !processedFiles.length) return;

    for (const { filePath, timestamp } of processedFiles) {
      if (!this.processedFiles.has(filePath) || timestamp > this.processedFiles.get(filePath)) {
        this.processedFiles.set(filePath, timestamp);
      }
    }
  }

  _assignNextBatch(worker, jobQueue) {
    const nextBatch = jobQueue.splice(0, this.config.BATCH_SIZE);
    if (nextBatch.length > 0) {
      worker.postMessage({ type: MESSAGE_TYPES.PROCESS, files: nextBatch });
    } else {
      worker.postMessage({ type: MESSAGE_TYPES.EXIT });
    }
  }
}

class WorkerThreadManager {
  constructor(config) {
    this.config = config;
    this.processedFiles = new Map();
  }

  async initializeWorker(workerId, highlighter, cachedProcessedFiles) {
    try {
      if (Array.isArray(cachedProcessedFiles)) {
        this.processedFiles = new Map(cachedProcessedFiles);
      }

      await highlighter.initShiki();
      this._setupWorkerMessageHandlers(workerId, highlighter);
      parentPort.postMessage({ type: MESSAGE_TYPES.READY, workerId });
    } catch (error) {
      parentPort.postMessage({
        type: MESSAGE_TYPES.ERROR,
        workerId,
        error: error.stack,
      });
      process.exit(1);
    }
  }

  _setupWorkerMessageHandlers(workerId, highlighter) {
    parentPort.on("message", async (message) => {
      switch (message.type) {
        case MESSAGE_TYPES.PROCESS: {
          try {
            const result = await this._processBatch(highlighter, message.files, workerId);
            parentPort.postMessage({
              type: MESSAGE_TYPES.COMPLETED,
              workerId,
              count: result.count,
              processedFiles: result.processedFiles,
            });
          } catch (error) {
            // Send error to main thread
            parentPort.postMessage({
              type: MESSAGE_TYPES.ERROR,
              workerId,
              error: error.stack,
            });
          }
          break;
        }
        case MESSAGE_TYPES.EXIT:
          if (this.config.DEBUG) {
            console.log(`[Worker ${workerId}] received exit signal`);
          }
          process.exit(0);
      }
    });
  }

  async _processBatch(highlighter, files, workerId) {
    let processedCount = 0;
    const processedFiles = [];

    for (const file of files) {
      // _processFileSafe is CPU bound; do not run it concurrently
      const result = await this._processFileSafe(highlighter, file, workerId);
      if (!result) continue;

      const { filePath, modifications } = result;
      if (modifications > 0) {
        processedCount++;
        const timestamp = Date.now();
        this.processedFiles.set(filePath, timestamp);
        processedFiles.push({ filePath, timestamp });
      }
    }

    return { count: processedCount, processedFiles };
  }

  async _processFileSafe(highlighter, filePath, workerId) {
    try {
      const modifications = await this._processFile(highlighter, filePath, workerId);
      return { filePath, modifications };
    } catch (error) {
      parentPort.postMessage({
        type: MESSAGE_TYPES.ERROR,
        workerId,
        error: error.stack,
      });
      return null;
    }
  }

  async _processFile(highlighter, filePath, workerId) {
    const modifications = await highlighter.processFile(filePath);
    if (modifications > 0) {
      const timestamp = Date.now();
      this.processedFiles.set(filePath, timestamp);
      if (!this.config.QUIET) {
        console.log(`[Worker ${workerId}] Processed ${modifications} code blocks in: ${filePath}`);
      }
    }

    return modifications;
  }
}

module.exports = {
  MainThreadManager,
  WorkerThreadManager,
  MESSAGE_TYPES,
};
bench.py
#!/usr/bin/env python3
#
# Benchmark script for Shiki highlighter. Run in Python to avoid Node.js cache.
import subprocess
import time

ROUND = 10
SLEEP_INTERVAL = 0.00001  # Avoid cache


def profile(rounds: int) -> float:
    total = 0
    for i in range(1, rounds + 1):
        run_process(["hugo", "--gc", "--minify"])

        start = time.time()
        run_process(["node", "scripts/shiki/index.js"])
        end = time.time()

        duration = (end - start) * 1000
        print(f"Shiki took {duration:.0f} ms, sleeping {SLEEP_INTERVAL} s...")
        total += duration

        if i < rounds:  # no need to sleep after the final round
            time.sleep(SLEEP_INTERVAL)

    return total / rounds


def run_process(args):
    subprocess.run(args=args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)


if __name__ == "__main__":
    avg = profile(ROUND)
    print(f"\nShiki average: {avg:.2f} ms")

Setup

First, disable Hugo's code fences in your Hugo config file:

[markup]
  [markup.highlight]
    codeFences = false  # disable code fences

Next, install the dependencies, using pnpm as an example:

pnpm add -D chalk dom-serializer html-entities htmlparser2 serve shiki
Package overview
  1. chalk: logging
  2. dom-serializer html-entities htmlparser2: HTML parsing
  3. serve: live preview
  4. shiki: syntax highlighting

The next step is the CSS. Taking Blowfish as an example, custom.css only needs the change from the official Shiki docs:

.dark .shiki,
.dark .shiki span {
  color: var(--shiki-dark) !important;
  background-color: var(--shiki-dark-bg) !important;
}

But some themes override the code block CSS, so the documented approach does not apply; the widely used PaperMod has exactly this problem, so for PaperMod use this instead:

/* Using Shiki with the PaperMod theme */
.post-content .shiki {
  border-radius: var(--radius);
  overflow: hidden;
}

.post-content .shiki code {
  background: transparent !important;
  color: inherit !important;
}

.dark .post-content .shiki {
  background-color: var(--shiki-dark-bg) !important;
  color: var(--shiki-dark) !important;
}

.dark .post-content .shiki span {
  color: var(--shiki-dark) !important;
  background-color: transparent !important;
}

With the CSS in place, highlighting is ready to go. scripts/shiki/config.js provides the customization options; the key ones to set are:

  1. TARGETS, the target folders to process, i.e. Hugo's output directory
  2. ENABLE_DUAL_THEME, whether to enable light/dark dual themes
  3. THEMES, the color theme selection

Finally, run node scripts/dev.js for the development preview, and node scripts/shiki/index.js to highlight the built HTML files.

Fixing Code Copy

This change rewrites the HTML structure, so the code copy button is also very likely to break. Every theme wires up code copy differently, so no single example can cover them all.

Usage Notes

  1. The script works by locating the target HTML files, checking whether they contain the string <pre><code, and only then finding the <code class="language-CODE_LANGUAGE"> elements inside and highlighting the specified language
  2. If your HTML is not getting highlighted, check the TARGETS setting in config.js, or check the CSS as described in the previous section
  3. If your language is not in the default set, add it to LANGUAGES; if it cannot be identified correctly, map it with LANGUAGE_ALIAS
  4. node scripts/shiki/index.js --dev switches the targets to TARGETS_DEV for fast previews during development
  5. Shiki writes directly into the HTML, so your files get bigger, but that worry is unfounded: a slightly larger file beats client-side JS by a wide margin. Here is how much the files grow, again on eallion.com:
❯ uv run analysis_filesize.py
+580.0K 108.0K → 688.0K (+537.0%)       public/weasel/index.html
+140.0K 92.0K → 232.0K  (+152.2%)       public/blog-heatmap/index.html
+132.0K 92.0K → 224.0K  (+143.5%)       public/neodb/index.html
+128.0K 92.0K → 220.0K  (+139.1%)       public/cdn-cname-cloudflare/index.html
+116.0K 88.0K → 204.0K  (+131.8%)       public/memos-api/index.html
+56.0K  84.0K → 140.0K  (+66.7%)        public/mastodon-sync-to-memos/index.html
+52.0K  64.0K → 116.0K  (+81.2%)        public/judge-phone-ua/index.html
+40.0K  64.0K → 104.0K  (+62.5%)        public/share-duoshuo-css/index.html
+36.0K  84.0K → 120.0K  (+42.9%)        public/ubuntu1610/index.html
+36.0K  76.0K → 112.0K  (+47.4%)        public/hugo-redirect-landing-page/index.html

That looks like a lot, but these are only the top 10; across all 600 Markdown files the total size grows by less than 2 MB, an increase of just 1.3%.

analysis_filesize.py
# Compare file sizes before and after Shiki
# Print the sizes with the commands below, then run this script to analyze them
# find public -type f -name "*.html" -exec du -h {} + | sort -rh | tail -n 999999 > before-shiki.txt
# find public -type f -name "*.html" -exec du -h {} + | sort -rh | tail -n 999999 > after-shiki.txt

def parse_file(filename):
    data = {}
    with open(filename, 'r') as f:
        for line in f:
            parts = line.strip().split('\t')
            if len(parts) != 2:
                continue
            size_str, path = parts
            if size_str.endswith('K'):
                size = float(size_str[:-1])
            elif size_str.endswith('M'):
                size = float(size_str[:-1]) * 1024
            else:
                continue
            data[path] = size
    return data

before = parse_file('before-shiki.txt')
after = parse_file('after-shiki.txt')

all_paths = set(before.keys()).union(after.keys())
diffs = []
total_before = 0
total_after = 0

for path in all_paths:
    before_size = before.get(path, 0)
    after_size = after.get(path, 0)
    delta = after_size - before_size
    ratio = (delta / before_size * 100) if before_size != 0 else float('inf')
    diffs.append((delta, before_size, after_size, ratio, path))
    total_before += before_size
    total_after += after_size

diffs.sort(key=lambda x: -abs(x[0]))

top_diffs = diffs[:10]

for delta, before_size, after_size, ratio, path in top_diffs:
    sign = '+' if delta >= 0 else '-'
    ratio_str = f"{sign}{abs(ratio):.1f}%" if before_size != 0 else "N/A"
    print(f"{sign}{abs(delta):.1f}K\t{before_size:.1f}K → {after_size:.1f}K\t({ratio_str})\t{path}")

total_delta = total_after - total_before
total_ratio = (total_delta / total_before * 100) if total_before != 0 else float('inf')
total_sign = '+' if total_delta >= 0 else '-'
print(f"\nTotal: {total_sign}{abs(total_delta):.1f}K\t{total_before:.1f}K → {total_after:.1f}K\t({total_sign}{abs(total_ratio):.1f}%)")