HTML Kanbun Kundoku

Input

After a kanji: (…) furigana, {…} okurigana (braces optional on bare kana), […] kaeriten. For saidoku 再読文字 use ‹…› / «…»; ‘…’ groups several kanji under one ruby. One line = one column.

Notes

Kanbun kundoku (漢文訓読) is the Japanese tradition of reading Classical Chinese as Japanese: the original characters are kept in place but annotated so they can be read in Japanese word order and with Japanese inflection. This tool takes text marked up in that convention and sets it the way it appears on the page — vertically, right-to-left, with each reading mark in its proper position around the kanji.

Annotation syntax. Marks follow the kanji they attach to:

Bracket      Mark                                           Term
(…)          furigana (reading)                             振り仮名
{…}          okurigana (inflectional kana)                  送り仮名
[…]          kaeriten (reading-order marks)                 返り点
‹…› / «…»    furigana / okurigana of a re-read character    再読文字
‘…’          group several kanji under one ruby base
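
As a rough illustration of the bracket syntax, the following Python sketch splits one input line into kanji with their attached marks. It is not the engine's code: `Unit`, `parse_column`, and `is_kana` are hypothetical names, the ‘…’ grouping is left out, and treating bare kana as okurigana of the preceding character is a simplified reading of the rule described in these notes.

```python
from dataclasses import dataclass
from typing import List

# Opening bracket -> (closing bracket, field it fills). Names are illustrative.
BRACKETS = {
    "(": (")", "furigana"),
    "{": ("}", "okurigana"),
    "[": ("]", "kaeriten"),
    "‹": ("›", "saidoku_furigana"),
    "«": ("»", "saidoku_okurigana"),
}


@dataclass
class Unit:
    """One base character plus the marks written after it."""
    base: str
    furigana: str = ""
    okurigana: str = ""
    kaeriten: str = ""
    saidoku_furigana: str = ""
    saidoku_okurigana: str = ""


def is_kana(ch: str) -> bool:
    """True for hiragana, katakana, and the prolonged sound mark."""
    return "\u3041" <= ch <= "\u3096" or "\u30a1" <= ch <= "\u30fa" or ch == "\u30fc"


def parse_column(line: str) -> List[Unit]:
    """Parse one line (one column) into units. Assumption: marks follow their kanji."""
    units: List[Unit] = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch in BRACKETS:
            if not units:
                raise ValueError(f"mark {ch!r} at position {i} has no preceding character")
            close, name = BRACKETS[ch]
            end = line.find(close, i + 1)
            if end == -1:
                raise ValueError(f"unclosed {ch!r} at position {i}")
            # Append, so repeated brackets of one kind accumulate.
            setattr(units[-1], name, getattr(units[-1], name) + line[i + 1:end])
            i = end + 1
        elif is_kana(ch) and units:
            # Bare kana: treated as okurigana of the preceding character.
            units[-1].okurigana += ch
            i += 1
        else:
            units.append(Unit(base=ch))
            i += 1
    return units
```

For example, `parse_column("有(あ)リ[レ]朋")` yields two units: 有 carrying furigana あ, okurigana リ, and kaeriten レ, followed by a bare 朋.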

Bare kana need no braces: the engine recognises them and treats them as okurigana automatically. Punctuation needs no annotation. Kaeriten go inside […]: the numerals 一二三, the heaven–earth–man series 天地人, the 甲乙丙 series, the 上中下 series, and the reversing mark レ (which may follow another mark, e.g. [一レ]). The toolbar inserts the less obvious pieces: saidoku brackets, the tateten (たて点), the ninojiten (二の字点), and the multiple-kanji ruby.
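
The kaeriten rules in the paragraph above can be sketched as a small checker for the text found inside […]. This is a hedged illustration only: `split_kaeriten` and the series labels are hypothetical, the series contain just the characters named in these notes, and the engine itself may accept more marks or validate differently.

```python
from typing import Optional, Tuple

# Series named in the notes; the keys are illustrative labels, not engine identifiers.
SERIES = {
    "numeral": "一二三",
    "upper-middle-lower": "上中下",
    "stem": "甲乙丙",
    "heaven-earth-man": "天地人",
}
REVERSE = "レ"


def split_kaeriten(mark: str) -> Tuple[Optional[str], Optional[int], bool]:
    """Split the contents of [...] into (series, rank, has_reverse_mark).

    The reversing mark may stand alone or follow one series mark, as in 一レ.
    """
    reverse = mark.endswith(REVERSE)
    head = mark[:-1] if reverse else mark
    if head == "":
        if not reverse:
            raise ValueError("empty kaeriten")
        return (None, None, True)  # a lone reversing mark
    if len(head) == 1:
        for name, chars in SERIES.items():
            if head in chars:
                return (name, chars.index(head) + 1, reverse)
    raise ValueError(f"unrecognised kaeriten: {mark!r}")
```

With these assumptions, `split_kaeriten("一レ")` returns `("numeral", 1, True)` and `split_kaeriten("下")` returns `("upper-middle-lower", 3, False)`, while a string such as `"一二"` is rejected.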

Two settings. Akigumi (アキ組) gives every character a fixed cell, so columns align regardless of how much kana each character carries; betagumi (ベタ組) sets the characters solid and lets the marks tuck into the gaps. The refinements — sinking okurigana when a character has no furigana, splitting touching kana, centring a lone furigana — match the choices a typesetter makes by hand. Copy HTML yields the rendered markup for pasting elsewhere.

The display engine is ported from untunt’s kanbunHTML (AGPL-3.0); see the reference below. It renders best with a Japanese mincho font installed (Yu Mincho, Hiragino Mincho, or similar) and falls back to the system serif otherwise.

References

  • untunt. kanbunHTML — a kanbun kundoku (漢文訓読) HTML display solution. GitHub · live demo. Licensed AGPL-3.0; the display engine here is a port of that project.