Hangul Processing Toolkit

A precise, zero-dependency library for decomposing, composing, romanizing, and analyzing Korean text. Built around the systematic Unicode structure of the Hangul Syllables block.

$ hangul decompose "한글"
ᄒ + ᅡ + ᆫ   # U+1112 U+1161 U+11AB
ᄀ + ᅳ + ᆯ   # U+1100 U+1173 U+11AF
$ hangul romanize "한글" --scheme=rr
"hangeul"
npm v1.4.2 · build passing · coverage 98% · size 4.1 kB

Installation

Install via your preferred package manager. The library is published as ESM and CJS with TypeScript declarations bundled.

# npm
npm install hangul

# pnpm
pnpm add hangul

# deno
import { decompose } from "npm:hangul";

Quickstart

Decompose a Korean syllable into its constituent lead consonant, vowel, and optional tail consonant. By default, the returned codepoints lie in the Conjoining Jamo block (U+1100–U+11FF).

import { decompose, compose, romanize } from "hangul";

const jamo = decompose("한");
// → { lead: "ㅎ", vowel: "ㅏ", tail: "ㄴ", codepoints: [0x1112, 0x1161, 0x11AB] }

const syllable = compose({ lead: "ㅎ", vowel: "ㅏ", tail: "ㄴ" });
// → "한"  (U+D55C)

const latin = romanize("안녕하세요", { scheme: "rr" });
// → "annyeonghaseyo"

decompose(input, options?)

Splits each precomposed Hangul syllable in input into its lead consonant, vowel, and optional tail consonant. Non-Hangul characters pass through unchanged.

Parameter               Type                Default   Description
input                   string              -         Korean text to decompose.
options.format          "jamo" | "compat"   "jamo"    Output codepoint block: Conjoining Jamo or Compatibility Jamo.
options.includeOffsets  boolean             false     Include source character offsets for each emitted jamo.
options.passthrough     boolean             true      Preserve non-Hangul characters in the output stream.
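
To make the format and passthrough options concrete, here is a minimal standalone sketch of how such a decomposition could work. It is not the library's implementation: decomposeSketch, SketchOptions, and the flat string return value are assumptions made for illustration, and includeOffsets is left out.

```typescript
// Hypothetical stand-in for decompose(); names and output shape are assumptions.
const S_BASE = 0xac00; // first precomposed syllable, 가
const S_LAST = 0xd7a3; // last precomposed syllable, 힣

// Compatibility Jamo equivalents, indexed by lead and tail position.
const COMPAT_LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ";
const COMPAT_TAILS = "ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ";

interface SketchOptions {
  format?: "jamo" | "compat";
  passthrough?: boolean;
}

function decomposeSketch(input: string, options: SketchOptions = {}): string {
  const { format = "jamo", passthrough = true } = options;
  let out = "";
  for (const ch of input) {
    const cp = ch.codePointAt(0)!;
    if (cp < S_BASE || cp > S_LAST) {
      if (passthrough) out += ch; // non-Hangul input is copied, or dropped
      continue;
    }
    const offset = cp - S_BASE;
    const lead = Math.floor(offset / 588);
    const vowel = Math.floor((offset % 588) / 28);
    const tail = offset % 28; // 0 means "no tail"
    if (format === "jamo") {
      // Conjoining Jamo: leads from U+1100, vowels from U+1161, tails from U+11A8
      out += String.fromCodePoint(0x1100 + lead, 0x1161 + vowel);
      if (tail > 0) out += String.fromCodePoint(0x11a7 + tail);
    } else {
      // Compatibility Jamo: vowels are contiguous from U+314F; consonants need tables
      out += COMPAT_LEADS.charAt(lead) + String.fromCodePoint(0x314f + vowel);
      if (tail > 0) out += COMPAT_TAILS.charAt(tail - 1);
    }
  }
  return out;
}

decomposeSketch("한");                       // "\u1112\u1161\u11AB"
decomposeSketch("한", { format: "compat" }); // "ㅎㅏㄴ"
```

The two formats differ in more than codepoint range: Conjoining Jamo distinguish lead ㄴ (U+1102) from tail ㄴ (U+11AB), while Compatibility Jamo use a single ㄴ (U+3134) for both positions.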

compose(parts)

Reconstructs a precomposed syllable from a lead consonant, a vowel, and an optional tail consonant. Throws InvalidJamoError if any component is outside the valid range.

compose({ lead: "ㄱ", vowel: "ㅏ" });          // → "가"
compose({ lead: "ㅎ", vowel: "ㅏ", tail: "ㄴ" }); // → "한"
compose({ lead: "ㅋ", vowel: "ㅗ", tail: "ㅁ" }); // → "콤"
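
For readers curious about the validation step, the sketch below shows one way composition with range checks could be written. composeSketch and JamoParts are hypothetical names, the InvalidJamoError class here is a local stand-in rather than the library's own error type, and the sketch accepts only Compatibility Jamo input, as in the examples above.

```typescript
// Local stand-in; the library's InvalidJamoError may carry different fields.
class InvalidJamoError extends Error {}

// Valid jamo per position, in Unicode index order.
const LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ";
const VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ";
const TAILS = "ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ";

interface JamoParts {
  lead: string;
  vowel: string;
  tail?: string;
}

// Returns the index of a single jamo in its table, or throws.
function indexIn(table: string, jamo: string, role: string): number {
  const index = jamo.length === 1 ? table.indexOf(jamo) : -1;
  if (index < 0) throw new InvalidJamoError(`invalid ${role}: "${jamo}"`);
  return index;
}

function composeSketch({ lead, vowel, tail }: JamoParts): string {
  const l = indexIn(LEADS, lead, "lead");
  const v = indexIn(VOWELS, vowel, "vowel");
  // Tail index 0 is reserved for "no tail", so real tails start at 1.
  const t = tail === undefined ? 0 : indexIn(TAILS, tail, "tail") + 1;
  return String.fromCodePoint(0xac00 + l * 588 + v * 28 + t);
}

composeSketch({ lead: "ㅎ", vowel: "ㅏ", tail: "ㄴ" }); // "한"
```

Note that the valid sets differ by position: ㄸ, ㅃ, and ㅉ can appear as leads but not as tails, so a tail of "ㄸ" is rejected in this sketch.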

romanize(input, options?)

Transliterates Korean text into Latin script using the chosen romanization scheme. Supports the official Revised Romanization as well as McCune-Reischauer.

Unicode Explorer

The Hangul Syllables block (U+AC00–U+D7A3) contains 11,172 precomposed syllables (19 leads × 21 vowels × 28 tail slots, counting "no tail"), organized in a strict arithmetic pattern: codepoint = 0xAC00 + lead × 588 + vowel × 28 + tail, where lead, vowel, and tail are zero-based indices and tail = 0 means no tail consonant.

Example: 가 · codepoint U+AC00 · decompose ㄱ + ㅏ · romanize ga
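
The codepoint formula can be checked with a few lines of arithmetic. The sketch below inverts it with integer division and remainders; toIndices and fromIndices are illustrative names, not part of the library's documented API.

```typescript
const SYLLABLE_BASE = 0xac00;
const VOWEL_COUNT = 21;
const TAIL_SLOTS = 28; // 27 tail consonants plus the "no tail" slot
const SYLLABLE_COUNT = 19 * VOWEL_COUNT * TAIL_SLOTS; // 11172

interface JamoIndices {
  lead: number;
  vowel: number;
  tail: number;
}

// Inverts the formula: syllable -> zero-based (lead, vowel, tail) indices.
function toIndices(syllable: string): JamoIndices {
  const cp = syllable.codePointAt(0);
  if (cp === undefined || cp < SYLLABLE_BASE || cp >= SYLLABLE_BASE + SYLLABLE_COUNT) {
    throw new RangeError("not a precomposed Hangul syllable");
  }
  const offset = cp - SYLLABLE_BASE;
  return {
    lead: Math.floor(offset / (VOWEL_COUNT * TAIL_SLOTS)), // offset / 588
    vowel: Math.floor(offset / TAIL_SLOTS) % VOWEL_COUNT,
    tail: offset % TAIL_SLOTS,
  };
}

// Applies the formula: indices -> codepoint.
function fromIndices(lead: number, vowel: number, tail = 0): number {
  return SYLLABLE_BASE + lead * 588 + vowel * 28 + tail;
}

// 한 is U+D55C: offset 10588 = 18 × 588 + 0 × 28 + 4
toIndices("한");       // { lead: 18, vowel: 0, tail: 4 }
fromIndices(18, 0, 4); // 0xD55C
```

Because the mapping is pure arithmetic, no lookup table is needed to move between a syllable and its indices; tables only come in when turning indices into displayable jamo.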

Conjoining Jamo Block

The Conjoining Jamo block (U+1100–U+11FF) provides the canonical decomposition targets for Hangul syllables. Use these codepoints for normalization-safe text processing.

Range            Class            Count  Example
U+1100–U+1112    Lead consonants  19     ㄱ ㄴ ㄷ ㄹ ㅁ
U+1161–U+1175    Vowels           21     ㅏ ㅑ ㅓ ㅕ ㅗ
U+11A8–U+11C2    Tail consonants  27     ㄱ ㄴ ㄷ ㄹ ㅁ
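
The link to Unicode normalization can be seen without the library at all, using the standard String.prototype.normalize method: NFD splits a precomposed syllable into exactly these Conjoining Jamo, and NFC puts it back together. The helper name toCodepoints is only for illustration.

```typescript
// Formats each code point of a string as "U+XXXX".
function toCodepoints(text: string): string[] {
  return Array.from(text, (ch) => "U+" + ch.codePointAt(0)!.toString(16).toUpperCase());
}

const syllable = "한";
const decomposed = syllable.normalize("NFD"); // canonical decomposition
const recomposed = decomposed.normalize("NFC"); // canonical composition

toCodepoints(decomposed); // ["U+1112", "U+1161", "U+11AB"]
recomposed === syllable;  // true

// Compatibility Jamo (U+3131–U+318E) are not canonical decomposition targets,
// so NFC leaves a sequence of them as three separate characters.
const compatSequence = "ㅎㅏㄴ".normalize("NFC");
compatSequence.length; // 3
```

This is why the Conjoining Jamo block is the safer intermediate form: text decomposed into it round-trips through NFC, whereas Compatibility Jamo do not recombine into syllables.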

Revised Romanization (RR)

The official romanization standard of South Korea, adopted in 2000. Maps Korean phonemes to Latin letters using a transcription approach that reflects pronunciation rather than strict letter-by-letter conversion.

romanize("대한민국", { scheme: "rr" });
// → "daehanminguk"

romanize("서울특별시", { scheme: "rr" });
// → "seoulteukbyeolsi"
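
As a rough illustration of the mapping, the sketch below romanizes one syllable at a time from three lookup tables. It is a deliberate simplification and not the library's algorithm: romanizeNaive is a hypothetical name, and the sketch applies none of the cross-syllable sound-change rules that Revised Romanization requires, so it is only correct for words where no such rule applies.

```typescript
// Per-position RR spellings, in Unicode index order. Tails use the
// neutralized final pronunciation; index 0 is "no tail".
const RR_LEADS = ["g", "kk", "n", "d", "tt", "r", "m", "b", "pp", "s",
  "ss", "", "j", "jj", "ch", "k", "t", "p", "h"];
const RR_VOWELS = ["a", "ae", "ya", "yae", "eo", "e", "yeo", "ye", "o", "wa",
  "wae", "oe", "yo", "u", "wo", "we", "wi", "yu", "eu", "ui", "i"];
const RR_TAILS = ["", "k", "k", "k", "n", "n", "n", "t", "l", "k",
  "m", "l", "l", "l", "p", "l", "m", "p", "p", "t",
  "t", "ng", "t", "t", "k", "t", "p", "t"];

// Naive romanizer: no assimilation, liaison, or l/r alternation.
function romanizeNaive(input: string): string {
  let out = "";
  for (const ch of input) {
    const offset = ch.codePointAt(0)! - 0xac00;
    if (offset < 0 || offset > 0xd7a3 - 0xac00) {
      out += ch; // leave non-Hangul characters untouched
      continue;
    }
    const lead = Math.floor(offset / 588);
    const vowel = Math.floor((offset % 588) / 28);
    const tail = offset % 28;
    out += `${RR_LEADS[lead]}${RR_VOWELS[vowel]}${RR_TAILS[tail]}`;
  }
  return out;
}

romanizeNaive("대한민국"); // "daehanminguk"
romanizeNaive("한글");     // "hangeul"
```

The limits show up quickly: 신라 comes out of this sketch as "sinra", while the standard spelling is "Silla", because ㄴ followed by ㄹ is pronounced as a double l. A complete romanizer has to model those pronunciation rules, which is what "transcription" means in the description above.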

McCune-Reischauer (MR)

The 1937 romanization standard widely used in academic publications and historical North Korean materials. Uses diacritics to mark phonemic distinctions.

romanize("평양", { scheme: "mr" });
// → "P'yŏngyang"

API Playground

Edit the request on the left; the response updates on the right. The runtime is the same WASM-compiled core that ships with the npm package.
