Hangul Processing Toolkit

A precise, zero-dependency library for decomposing, composing, romanizing, and analyzing Korean text. Built around the systematic Unicode structure of the Hangul Syllables block.

$ hangul decompose "한글"
→ ㅎ + ㅏ + ㄴ  # U+1112 U+1161 U+11AB
→ ㄱ + ㅡ + ㄹ  # U+1100 U+1173 U+11AF
$ hangul romanize "한글" --scheme=rr
→ "hangeul"

npm v1.4.2 · build passing · coverage 98% · size 4.1 kB

Installation

Install via your preferred package manager. The library is published as ESM and CJS with TypeScript declarations bundled.

# npm
npm install hangul

# pnpm
pnpm add hangul

# deno
import { decompose } from "npm:hangul";

Quickstart

Decompose a Korean syllable into its constituent lead, vowel, and optional tail jamo. By default, the library returns codepoints in the Conjoining Jamo block (U+1100–U+11FF); the format option of decompose switches the output to Compatibility Jamo.

import { decompose, compose, romanize } from "hangul";

const jamo = decompose("한");
// → { lead: "ㅎ", vowel: "ㅏ", tail: "ㄴ", codepoints: [0x1112, 0x1161, 0x11AB] }

const syllable = compose({ lead: "ㅎ", vowel: "ㅏ", tail: "ㄴ" });
// → "한"  (U+D55C)

const latin = romanize("안녕하세요", { scheme: "rr" });
// → "annyeonghaseyo"

decompose(input, options?)

Splits each precomposed Hangul syllable in input into its lead consonant, vowel, and optional tail consonant. Non-Hangul characters pass through unchanged.

Parameter	Type	Default	Description
`input`	`string`	—	Korean text to decompose.
`options.format`	`"jamo" | "compat"`	`"jamo"`	Output codepoint block: Conjoining Jamo or Compatibility Jamo.
`options.includeOffsets`	`boolean`	`false`	Include source character offsets for each emitted jamo.
`options.passthrough`	`boolean`	`true`	Preserve non-Hangul characters in the output stream.
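The effect of format and passthrough can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: sketchDecompose and the two lookup tables are hypothetical names, and the flat array return shape is chosen only to keep the example short.

```javascript
// Compatibility Jamo code points, ordered by lead index and by tail index.
const COMPAT_LEADS = [
  0x3131, 0x3132, 0x3134, 0x3137, 0x3138, 0x3139, 0x3141, 0x3142, 0x3143,
  0x3145, 0x3146, 0x3147, 0x3148, 0x3149, 0x314a, 0x314b, 0x314c, 0x314d,
  0x314e,
];
const COMPAT_TAILS = [
  0x3131, 0x3132, 0x3133, 0x3134, 0x3135, 0x3136, 0x3137, 0x3139, 0x313a,
  0x313b, 0x313c, 0x313d, 0x313e, 0x313f, 0x3140, 0x3141, 0x3142, 0x3144,
  0x3145, 0x3146, 0x3147, 0x3148, 0x314a, 0x314b, 0x314c, 0x314d, 0x314e,
];

// Hypothetical helper: returns a flat array of jamo and passthrough strings.
function sketchDecompose(input, { format = "jamo", passthrough = true } = {}) {
  const compat = format === "compat";
  const out = [];
  for (const ch of input) {
    const offset = ch.codePointAt(0) - 0xac00;
    if (offset < 0 || offset >= 11172) {
      // Not a precomposed syllable: keep or drop depending on passthrough.
      if (passthrough) out.push(ch);
      continue;
    }
    const lead = Math.floor(offset / 588);
    const vowel = Math.floor((offset % 588) / 28);
    const tail = offset % 28; // 0 means "no tail"
    out.push(String.fromCodePoint(compat ? COMPAT_LEADS[lead] : 0x1100 + lead));
    // Compatibility vowels are contiguous from U+314F, so an offset is enough.
    out.push(String.fromCodePoint(compat ? 0x314f + vowel : 0x1161 + vowel));
    if (tail > 0) {
      out.push(String.fromCodePoint(compat ? COMPAT_TAILS[tail - 1] : 0x11a7 + tail));
    }
  }
  return out;
}
```

Compatibility consonants need lookup tables because that block interleaves single and cluster consonants, so lead and tail indices do not map to it by a fixed offset.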

compose(parts)

Reconstructs a precomposed syllable from a lead jamo, a vowel jamo, and an optional tail jamo. Throws InvalidJamoError if any component is outside the valid range.

compose({ lead: "ㄱ", vowel: "ㅏ" });          // → "가"
compose({ lead: "ㅎ", vowel: "ㅏ", tail: "ㄴ" }); // → "한"
compose({ lead: "ㅋ", vowel: "ㅗ", tail: "ㅁ" }); // → "콤"
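The range check and the composition step can be sketched as below. This is a minimal illustration with two assumptions that differ from the documented API: the hypothetical sketchCompose takes Conjoining Jamo code points rather than jamo strings, and it throws the built-in RangeError as a stand-in for InvalidJamoError.

```javascript
// Hypothetical sketch: validates Conjoining Jamo code points, then composes.
function sketchCompose({ lead, vowel, tail }) {
  const inRange = (cp, lo, hi) => Number.isInteger(cp) && cp >= lo && cp <= hi;

  if (!inRange(lead, 0x1100, 0x1112)) {
    throw new RangeError("lead outside U+1100–U+1112");
  }
  if (!inRange(vowel, 0x1161, 0x1175)) {
    throw new RangeError("vowel outside U+1161–U+1175");
  }
  if (tail !== undefined && !inRange(tail, 0x11a8, 0x11c2)) {
    throw new RangeError("tail outside U+11A8–U+11C2");
  }

  // Tail index 0 is reserved for "no tail", so U+11A8 maps to index 1.
  const tailIndex = tail === undefined ? 0 : tail - 0x11a7;
  return String.fromCodePoint(
    0xac00 + (lead - 0x1100) * 588 + (vowel - 0x1161) * 28 + tailIndex
  );
}

sketchCompose({ lead: 0x1100, vowel: 0x1161 });               // → "가"
sketchCompose({ lead: 0x1112, vowel: 0x1161, tail: 0x11ab }); // → "한"
```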

romanize(input, options?)

Transliterates Korean text into Latin script using the chosen romanization scheme. Supports the official Revised Romanization (scheme: "rr") and McCune-Reischauer (scheme: "mr").

Unicode Explorer

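The Hangul Syllables block (U+AC00–U+D7A3) contains 11,172 precomposed syllables (19 leads × 21 vowels × 28 tail slots), organized in a strict mathematical pattern: codepoint = 0xAC00 + lead × 588 + vowel × 28 + tail. Here lead, vowel, and tail are zero-based indices (lead 0–18, vowel 0–20, tail 0–27), and tail 0 means the syllable has no tail consonant.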

Initial (초성) classes: velar, nasal, alveolar, liquid, labial, fricative, palatal, aspirate.

Example cell: syllable 가, codepoint U+AC00, decompose ㄱ + ㅏ, romanize ga.
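The codepoint formula in this section can be checked by hand with a few lines of arithmetic. The helpers toIndices and fromIndices are hypothetical names used only for this sketch; they are not part of the library's API.

```javascript
const BASE = 0xac00;      // first precomposed syllable, 가
const VOWEL_STRIDE = 28;  // tail slots per vowel (27 tails plus "no tail")
const LEAD_STRIDE = 588;  // 21 vowels × 28 tail slots

// Split a precomposed syllable into zero-based lead, vowel, and tail indices.
function toIndices(syllable) {
  const offset = syllable.codePointAt(0) - BASE;
  if (offset < 0 || offset >= 11172) {
    throw new RangeError("not a precomposed Hangul syllable");
  }
  return {
    lead: Math.floor(offset / LEAD_STRIDE),
    vowel: Math.floor((offset % LEAD_STRIDE) / VOWEL_STRIDE),
    tail: offset % VOWEL_STRIDE, // 0 means "no tail"
  };
}

// Inverse of toIndices: apply the formula directly.
function fromIndices({ lead, vowel, tail }) {
  return String.fromCodePoint(BASE + lead * LEAD_STRIDE + vowel * VOWEL_STRIDE + tail);
}

toIndices("가"); // → { lead: 0, vowel: 0, tail: 0 }
toIndices("한"); // → { lead: 18, vowel: 0, tail: 4 }
```

For 한 (U+D55C), the offset from U+AC00 is 10,588 = 18 × 588 + 0 × 28 + 4, which gives lead 18 (ㅎ), vowel 0 (ㅏ), and tail 4 (ㄴ).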

Conjoining Jamo Block

The Conjoining Jamo block (U+1100–U+11FF) provides the canonical decomposition targets for Hangul syllables. Use these codepoints for normalization-safe text processing.

Range	Class	Count	Example
`U+1100–U+1112`	Lead consonants	19	ㄱ ㄴ ㄷ ㄹ ㅁ
`U+1161–U+1175`	Vowels	21	ㅏ ㅑ ㅓ ㅕ ㅗ
`U+11A8–U+11C2`	Tail consonants	27	ㄱ ㄴ ㄷ ㄹ ㅁ
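Because these ranges are the canonical decomposition targets, the standard String.prototype.normalize method produces the same codepoints without any library. The short check below uses only built-in JavaScript; toHex is a hypothetical helper for display.

```javascript
// Format each code point of a string as "U+XXXX".
const toHex = (s) =>
  [...s].map((ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase());

// NFD splits a precomposed syllable into Conjoining Jamo.
const decomposed = "한".normalize("NFD");
toHex(decomposed); // → ["U+1112", "U+1161", "U+11AB"]

// NFC recombines the jamo sequence into the precomposed syllable.
decomposed.normalize("NFC") === "한"; // → true
```

Comparing strings after normalizing both sides to the same form avoids mismatches between precomposed and decomposed spellings of the same text.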

Revised Romanization (RR)

The official romanization standard of South Korea, adopted in 2000. Maps Korean phonemes to Latin letters using a transcription approach that reflects pronunciation rather than strict letter-by-letter conversion.

romanize("대한민국", { scheme: "rr" });
// → "daehanminguk"

romanize("서울특별시", { scheme: "rr" });
// → "seoulteukbyeolsi"
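To see why RR is a transcription rather than a letter-by-letter conversion, compare it with a naive per-syllable mapping. The sketch below is not how the library romanizes: naiveRomanize and its tables are hypothetical, and it deliberately ignores sound changes across syllable boundaries.

```javascript
// RR letters by jamo index. ㅇ is silent as a lead; tails use syllable-final values.
const RR_LEADS = [
  "g", "kk", "n", "d", "tt", "r", "m", "b", "pp", "s",
  "ss", "", "j", "jj", "ch", "k", "t", "p", "h",
];
const RR_VOWELS = [
  "a", "ae", "ya", "yae", "eo", "e", "yeo", "ye", "o", "wa", "wae",
  "oe", "yo", "u", "wo", "we", "wi", "yu", "eu", "ui", "i",
];
const RR_TAILS = [
  "", "k", "k", "k", "n", "n", "n", "t", "l", "k", "m", "l", "l", "l",
  "p", "l", "m", "p", "p", "t", "t", "ng", "t", "t", "k", "t", "p", "t",
];

// Hypothetical naive romanizer: maps each syllable in isolation.
function naiveRomanize(text) {
  let out = "";
  for (const ch of text) {
    const offset = ch.codePointAt(0) - 0xac00;
    if (offset < 0 || offset >= 11172) {
      out += ch; // non-Hangul passes through
      continue;
    }
    out +=
      RR_LEADS[Math.floor(offset / 588)] +
      RR_VOWELS[Math.floor((offset % 588) / 28)] +
      RR_TAILS[offset % 28];
  }
  return out;
}

naiveRomanize("대한민국"); // → "daehanminguk" (happens to match RR)
naiveRomanize("신라");     // → "sinra" (RR writes "silla")
```

The second case differs because ㄴ followed by ㄹ is pronounced as a doubled l, and RR records the pronunciation. A full implementation needs rules for such assimilation and for linking a tail consonant to a following vowel.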

McCune-Reischauer (MR)

The 1937 romanization standard widely used in academic publications and historical North Korean materials. Uses diacritics to mark phonemic distinctions.

romanize("평양", { scheme: "mr" });
// → "p'yŏngyang"
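MR output contains combining-capable letters such as ŏ, which can be awkward in ASCII-only contexts such as search keys or URLs. One possible approach, sketched here with built-in JavaScript only, is to fold the diacritics away; foldDiacritics is a hypothetical helper, not a library function.

```javascript
// Fold diacritics: decompose to base letter + combining mark, then drop the marks.
function foldDiacritics(text) {
  return text.normalize("NFD").replace(/\p{M}/gu, "");
}

foldDiacritics("p'y\u014Fngyang"); // → "p'yongyang"
```

Folding is lossy: it erases the phonemic distinctions the diacritics mark (o versus ŏ, u versus ŭ), so keep the original string for display.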

API Playground

The interactive playground lets you edit a request and see the response update live. The runtime is the same WASM-compiled core that ships with the npm package.
