Hangul Processing Toolkit
A precise, zero-dependency library for decomposing, composing, romanizing, and analyzing Korean text. Built around the systematic Unicode structure of the Hangul Syllables block.
$ hangul decompose "한글"
→ ㅎ + ㅏ + ㄴ # U+1112 U+1161 U+11AB
→ ㄱ + ㅡ + ㄹ # U+1100 U+1173 U+11AF
$ hangul romanize "한글" --scheme=rr
→ "hangeul"
Installation
Install via your preferred package manager. The library is published as ESM and CJS with TypeScript declarations bundled.
# npm
npm install hangul
# pnpm
pnpm add hangul
# deno
import { decompose } from "npm:hangul";
Quickstart
Decompose a Korean syllable into its constituent lead, vowel, and optional tail jamo. By default, the library returns codepoints in the Conjoining Jamo block (U+1100–U+11FF).
import { decompose, compose, romanize } from "hangul";
const jamo = decompose("한");
// → { lead: "ㅎ", vowel: "ㅏ", tail: "ㄴ", codepoints: [0x1112, 0x1161, 0x11AB] }
const syllable = compose({ lead: "ㅎ", vowel: "ㅏ", tail: "ㄴ" });
// → "한" (U+D55C)
const latin = romanize("안녕하세요", { scheme: "rr" });
// → "annyeonghaseyo"
decompose(input, options?)
Splits each precomposed Hangul syllable in input into its lead consonant, vowel, and optional tail consonant. Non-Hangul characters pass through unchanged unless options.passthrough is set to false.
| Parameter | Type | Default | Description |
|---|---|---|---|
| input | string | — | Korean text to decompose. |
| options.format | "jamo" \| "compat" | "jamo" | Output codepoint block: Conjoining Jamo or Compatibility Jamo. |
| options.includeOffsets | boolean | false | Include source character offsets for each emitted jamo. |
| options.passthrough | boolean | true | Preserve non-Hangul characters in the output stream. |
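To make the format and passthrough options concrete, here is a minimal sketch of how such a decomposition could work. It is not the library's implementation: decomposeSketch, SketchOptions, and the flat string-array return shape are hypothetical names chosen for illustration, and includeOffsets is left out.

```typescript
// Hypothetical stand-in for decompose(); names and return shape are assumptions.
type Format = "jamo" | "compat";

interface SketchOptions {
  format?: Format;       // output block, defaults to "jamo"
  passthrough?: boolean; // keep non-Hangul characters, defaults to true
}

const S_BASE = 0xac00; // first precomposed syllable
const S_LAST = 0xd7a3; // last precomposed syllable
const L_BASE = 0x1100; // first conjoining lead
const V_BASE = 0x1161; // first conjoining vowel
const T_BASE = 0x11a7; // one below the first conjoining tail (tail index 0 = none)

// Compatibility Jamo listed in lead, vowel, and tail index order.
const COMPAT_LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ".split("");
const COMPAT_VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ".split("");
const COMPAT_TAILS = [""].concat("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ".split(""));

function decomposeSketch(input: string, options: SketchOptions = {}): string[] {
  const { format = "jamo", passthrough = true } = options;
  const out: string[] = [];
  for (const ch of input) {
    // Hangul syllables sit in the BMP, so the first UTF-16 code unit identifies them.
    const cp = ch.charCodeAt(0);
    if (cp < S_BASE || cp > S_LAST) {
      if (passthrough) out.push(ch); // non-Hangul: keep or drop
      continue;
    }
    const index = cp - S_BASE;
    const lead = Math.floor(index / 588);
    const vowel = Math.floor((index % 588) / 28);
    const tail = index % 28;
    if (format === "jamo") {
      out.push(String.fromCharCode(L_BASE + lead), String.fromCharCode(V_BASE + vowel));
      if (tail > 0) out.push(String.fromCharCode(T_BASE + tail));
    } else {
      out.push(COMPAT_LEADS[lead], COMPAT_VOWELS[vowel]);
      if (tail > 0) out.push(COMPAT_TAILS[tail]);
    }
  }
  return out;
}
```

With the default options, decomposeSketch("한") yields the three Conjoining Jamo U+1112, U+1161, and U+11AB. Passing { format: "compat" } yields the Compatibility Jamo U+314E, U+314F, and U+3134 instead.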
compose(parts)
Reconstructs a precomposed syllable from a lead, a vowel, and an optional tail jamo. Throws InvalidJamoError if any component is outside the valid range.
compose({ lead: "ㄱ", vowel: "ㅏ" }); // → "가"
compose({ lead: "ㅎ", vowel: "ㅏ", tail: "ㄴ" }); // → "한"
compose({ lead: "ㅋ", vowel: "ㅗ", tail: "ㅁ" }); // → "콤"
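The examples above can be reproduced with a small lookup-and-add routine. The sketch below assumes that parts are given as Compatibility Jamo characters, as in the examples. composeSketch and InvalidJamoErrorSketch are illustrative stand-ins, not the library's actual code or error class.

```typescript
// Local stand-in for the InvalidJamoError mentioned above; the real class may differ.
class InvalidJamoErrorSketch extends Error {}

// Compatibility Jamo in lead, vowel, and tail index order.
const LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ".split("");
const VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ".split("");
const TAILS = "ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ".split("");

interface Parts {
  lead: string;
  vowel: string;
  tail?: string;
}

function composeSketch({ lead, vowel, tail }: Parts): string {
  const l = LEADS.indexOf(lead);
  if (l < 0) throw new InvalidJamoErrorSketch(`invalid lead: ${lead}`);

  const v = VOWELS.indexOf(vowel);
  if (v < 0) throw new InvalidJamoErrorSketch(`invalid vowel: ${vowel}`);

  let t = 0; // tail index 0 is reserved for "no tail"
  if (tail !== undefined) {
    t = TAILS.indexOf(tail) + 1;
    if (t === 0) throw new InvalidJamoErrorSketch(`invalid tail: ${tail}`);
  }

  // Same arithmetic as the Hangul Syllables block layout: 588 = 21 vowels × 28 tail slots.
  return String.fromCharCode(0xac00 + l * 588 + v * 28 + t);
}
```

Validating each component before doing the arithmetic matters because an out-of-range index would otherwise silently produce a different, valid-looking syllable.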
romanize(input, options?)
Transliterates Korean text into Latin script using the chosen romanization scheme. Supports the official Revised Romanization (scheme: "rr") as well as McCune-Reischauer (scheme: "mr").
Unicode Explorer
The Hangul Syllables block (U+AC00–U+D7A3) contains 11,172 precomposed syllables (19 leads × 21 vowels × 28 tail slots), organized in a strict mathematical pattern: codepoint = 0xAC00 + lead × 588 + vowel × 28 + tail. Here lead is a zero-based index from 0 to 18, vowel from 0 to 20, and tail from 0 to 27, where 0 means no tail.
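The formula can be run in both directions with integer division and remainders. The following sketch shows this; toIndices and toCodepoint are hypothetical helper names used only for illustration.

```typescript
const S_BASE = 0xac00;              // first syllable, 가
const V_COUNT = 21;                 // vowels
const T_COUNT = 28;                 // tail slots, including "no tail"
const N_COUNT = V_COUNT * T_COUNT;  // 588 syllables per lead

interface Indices {
  lead: number;
  vowel: number;
  tail: number;
}

// Codepoint → zero-based (lead, vowel, tail) indices.
function toIndices(codepoint: number): Indices {
  const offset = codepoint - S_BASE;
  return {
    lead: Math.floor(offset / N_COUNT),
    vowel: Math.floor((offset % N_COUNT) / T_COUNT),
    tail: offset % T_COUNT,
  };
}

// Zero-based indices → codepoint; tail defaults to 0 (no tail).
function toCodepoint(lead: number, vowel: number, tail: number = 0): number {
  return S_BASE + lead * N_COUNT + vowel * T_COUNT + tail;
}

// Worked example for 한 (U+D55C):
//   offset = 0xD55C - 0xAC00 = 10588
//   lead   = floor(10588 / 588) = 18, remainder 4
//   vowel  = floor(4 / 28) = 0
//   tail   = 4
const han = toIndices(0xd55c); // { lead: 18, vowel: 0, tail: 4 }
```

The last syllable in the block, U+D7A3, maps to the maximum indices (18, 20, 27), which is a quick sanity check on the three ranges.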
Conjoining Jamo Block
The Conjoining Jamo block (U+1100–U+11FF) provides the canonical decomposition targets for Hangul syllables. Use these codepoints for normalization-safe text processing.
| Range | Class | Count | Example |
|---|---|---|---|
| U+1100–U+1112 | Lead consonants | 19 | ㄱ ㄴ ㄷ ㄹ ㅁ |
| U+1161–U+1175 | Vowels | 21 | ㅏ ㅑ ㅓ ㅕ ㅗ |
| U+11A8–U+11C2 | Tail consonants | 27 | ㄱ ㄴ ㄷ ㄹ ㅁ |
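The ranges in this table are the same targets that Unicode canonical decomposition produces, so the relationship can be checked with the standard String.prototype.normalize method, independently of this library. The helper name toConjoiningJamo below is illustrative only.

```typescript
// Canonical decomposition (NFD) maps each precomposed syllable to Conjoining Jamo.
function toConjoiningJamo(text: string): number[] {
  return Array.from(text.normalize("NFD"), (ch) => ch.codePointAt(0) as number);
}

// 한 → U+1112 U+1161 U+11AB, 글 → U+1100 U+1173 U+11AF
const jamoCodepoints = toConjoiningJamo("\uD55C\uAE00"); // "한글"

// Canonical composition (NFC) rebuilds the precomposed syllables.
const recomposed = String.fromCodePoint(...jamoCodepoints).normalize("NFC");

// Compatibility Jamo have no canonical mapping, so NFC leaves them
// as three separate characters instead of composing a syllable.
const compat = "\u314E\u314F\u3134"; // ㅎ ㅏ ㄴ as Compatibility Jamo
const compatAfterNfc = compat.normalize("NFC");
```

This is the practical reason to prefer Conjoining Jamo for processing: a sequence of them round-trips through NFC and NFD, while Compatibility Jamo are display-oriented and do not recombine under canonical normalization.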
Revised Romanization (RR)
The official romanization standard of South Korea, adopted in 2000. Maps Korean phonemes to Latin letters using a transcription approach that reflects pronunciation rather than strict letter-by-letter conversion.
romanize("대한민국", { scheme: "rr" });
// → "daehanminguk"
romanize("서울특별시", { scheme: "rr" });
// → "seoulteukbyeolsi"
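For intuition, a syllable-by-syllable lookup already reproduces the two outputs above. The sketch below is a deliberately naive approximation, not the library's algorithm: naiveRomanize and its tables are hypothetical, and it ignores the sound-change rules (such as assimilation across syllable boundaries) that a real RR transcription has to apply.

```typescript
// RR letters by lead, vowel, and tail index (tail index 0 = no tail).
const RR_LEADS = ["g", "kk", "n", "d", "tt", "r", "m", "b", "pp", "s",
  "ss", "", "j", "jj", "ch", "k", "t", "p", "h"];
const RR_VOWELS = ["a", "ae", "ya", "yae", "eo", "e", "yeo", "ye", "o", "wa",
  "wae", "oe", "yo", "u", "wo", "we", "wi", "yu", "eu", "ui", "i"];
// Simplified syllable-final values; consonant clusters are reduced to one letter.
const RR_TAILS = ["", "k", "k", "k", "n", "n", "n", "t", "l", "k",
  "m", "l", "l", "l", "p", "l", "m", "p", "p", "t",
  "t", "ng", "t", "t", "k", "t", "p", "t"];

function naiveRomanize(input: string): string {
  let out = "";
  for (const ch of input) {
    const cp = ch.charCodeAt(0);
    if (cp < 0xac00 || cp > 0xd7a3) {
      out += ch; // non-Hangul characters pass through
      continue;
    }
    const index = cp - 0xac00;
    out += RR_LEADS[Math.floor(index / 588)];
    out += RR_VOWELS[Math.floor((index % 588) / 28)];
    out += RR_TAILS[index % 28];
  }
  return out;
}

const a = naiveRomanize("대한민국");   // "daehanminguk"
const b = naiveRomanize("서울특별시"); // "seoulteukbyeolsi"
const c = naiveRomanize("신라");       // "sinra", although RR prescribes "Silla"
```

The last case shows where the lookup falls short: ㄴ followed by ㄹ is pronounced as a double l, which is why a pronunciation-based scheme needs context-sensitive rules on top of the per-syllable tables.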
McCune-Reischauer (MR)
A romanization system devised in 1937, widely used in academic publications and in historical North Korean materials. It marks the vowels ŏ and ŭ with a breve and aspirated consonants with an apostrophe (k', t', p', ch').
romanize("평양", { scheme: "mr" });
// → "p'yŏngyang"
API Playground
The interactive playground runs the same WASM-compiled core that ships with the npm package.