transliterate

Convert Unicode characters to Latin characters using transliteration

Useful for slugification and for other situations where you cannot use Unicode.

Install

npm install @sindresorhus/transliterate

Usage

import transliterate from '@sindresorhus/transliterate';

transliterate('Fußgängerübergänge');
//=> 'Fussgaengeruebergaenge'

transliterate('Я люблю единорогов');
//=> 'Ya lyublyu edinorogov'

transliterate('أنا أحب حيدات');
//=> 'ana ahb hydat'

transliterate('tôi yêu những chú kỳ lân');
//=> 'toi yeu nhung chu ky lan'

transliterate('En–dashes and em—dashes are normalized');
//=> 'En-dashes and em-dashes are normalized'
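
Since slugification is the main use case mentioned above, here is a minimal sketch of how a transliteration step might feed a slug builder. `toSlug` and `demoTransliterate` are hypothetical names, not part of this package. The stand-in transliterator only exists so the snippet runs on its own; in a real project you would presumably pass the package's default export instead.

```javascript
// Hypothetical helper, not part of the package: turns text into a URL slug.
// `transliterateFn` is any function that maps a string to Latin text.
function toSlug(text, transliterateFn) {
	return transliterateFn(text)
		.toLowerCase()
		.replace(/[^a-z0-9]+/g, '-') // Collapse each non-alphanumeric run into one hyphen
		.replace(/^-+|-+$/g, ''); // Trim leading and trailing hyphens
}

// Stand-in transliterator so the sketch runs without installing anything.
// It only knows the three characters needed for this demo.
const demoTransliterate = string => string
	.replace(/ß/g, 'ss')
	.replace(/ä/g, 'ae')
	.replace(/ü/g, 'ue');

console.log(toSlug('Fußgängerübergänge sind gut!', demoTransliterate));
//=> 'fussgaengeruebergaenge-sind-gut'
```

Transliterating before stripping non-alphanumeric characters matters here: done the other way around, the accented letters would be replaced by hyphens instead of their Latin equivalents.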

API

transliterate(string, options?)

string

Type: string

String to transliterate.

options

Type: object

customReplacements

Type: Array<string[]>
Default: []

Add your own custom replacements as [key, replacement] pairs.

The replacements are run on the original string before any other transformations.

A custom replacement overrides a default one only if it uses the same key.

import transliterate from '@sindresorhus/transliterate';

transliterate('Я люблю единорогов', {
	customReplacements: [
		['единорогов', '🦄']
	]
});
//=> 'Ya lyublyu 🦄'
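
The ordering and override rule described above can be illustrated with a small self-contained sketch. This is not the package's implementation: `defaultReplacements` is a tiny made-up table and `applyReplacements` is a hypothetical function, written only to show custom pairs running first and replacing a default with the same key.

```javascript
// Tiny stand-in table; the real package's defaults are not reproduced here.
const defaultReplacements = [
	['ä', 'ae'],
	['ö', 'oe'],
	['ü', 'ue'],
	['ß', 'ss'],
];

// Replace every occurrence of `key` without regex or `$` special cases.
const replaceEvery = (string, key, value) => string.split(key).join(value);

// Hypothetical sketch: custom pairs are applied to the original string first,
// then the defaults handle whatever is left.
function applyReplacements(string, customReplacements = []) {
	const customKeys = new Set(customReplacements.map(([key]) => key));

	// A default is skipped only when a custom pair uses the same key.
	const remainingDefaults = defaultReplacements.filter(([key]) => !customKeys.has(key));

	let result = string;
	for (const [key, value] of [...customReplacements, ...remainingDefaults]) {
		result = replaceEvery(result, key, value);
	}

	return result;
}

console.log(applyReplacements('Grüße'));
//=> 'Gruesse'

console.log(applyReplacements('Grüße', [['ü', 'u']]));
//=> 'Grusse'

console.log(applyReplacements('I ♥ ü', [['♥', 'love']]));
//=> 'I love ue'
```

In the second call only the `ü` default is overridden, so `ß` still falls through to its default. In the third call the custom key is new, so every default stays active.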

locale

Type: string

BCP-47 language tag for language-specific transliteration.

When specified, language-specific replacement rules are used for characters that are transliterated differently in different languages.

import transliterate from '@sindresorhus/transliterate';

// Swedish: ä→a, ö→o, å→a
transliterate('Räksmörgås', {locale: 'sv'});
//=> 'Raksmorgas'

// German: ä→ae, ö→oe
transliterate('Räksmörgås', {locale: 'de'});
//=> 'Raeksmoergas'
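
One way to picture locale-specific rules is as a small per-language table layered over a shared base table. The sketch below assumes that model purely for illustration. `baseTable`, `localeTables`, and `transliterateSketch` are hypothetical, and the entries cover only the three characters from the example above, not the package's real data.

```javascript
// Illustrative base table (German-style for ä and ö); not the package's data.
const baseTable = new Map([
	['ä', 'ae'],
	['ö', 'oe'],
	['å', 'a'],
]);

// Per-locale overrides, keyed by language tag. Only Swedish is sketched here.
const localeTables = new Map([
	['sv', new Map([['ä', 'a'], ['ö', 'o']])],
]);

function transliterateSketch(string, {locale} = {}) {
	const overrides = localeTables.get(locale) ?? new Map();

	// Normalize so that 'a' + combining diaeresis matches the precomposed 'ä'.
	return [...string.normalize('NFC')]
		.map(character => overrides.get(character) ?? baseTable.get(character) ?? character)
		.join('');
}

console.log(transliterateSketch('Räksmörgås', {locale: 'sv'}));
//=> 'Raksmorgas'

console.log(transliterateSketch('Räksmörgås', {locale: 'de'}));
//=> 'Raeksmoergas'
```

A locale without an override table simply falls back to the base table, which is why `de` and an omitted locale give the same result in this sketch.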

Supported locales

The following locales have specific replacement rules when using the locale option:

  • da - Danish
  • de - German
  • hu - Hungarian
  • nb - Norwegian Bokmål
  • sr - Serbian
  • sv - Swedish
  • tr - Turkish
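
Application code often holds a full BCP-47 tag such as `sv-SE` rather than a bare language code. This README does not say whether region or script subtags are accepted, so a cautious caller could reduce the tag to its language subtag first. The helper below is a hypothetical sketch using the standard `Intl.Locale` API; `pickSupportedLocale` is not part of this package.

```javascript
// The locales listed above as having specific replacement rules.
const supportedLocales = new Set(['da', 'de', 'hu', 'nb', 'sr', 'sv', 'tr']);

// Hypothetical helper: reduces a tag such as 'sv-SE' to its language subtag
// and returns it only when it appears in the supported list.
function pickSupportedLocale(tag) {
	try {
		const {language} = new Intl.Locale(tag);
		return supportedLocales.has(language) ? language : undefined;
	} catch {
		// Intl.Locale throws for malformed or missing tags.
		return undefined;
	}
}

console.log(pickSupportedLocale('sv-SE'));
//=> 'sv'

console.log(pickSupportedLocale('en-US'));
//=> undefined
```

Returning `undefined` for unsupported tags lets the result be passed straight through as the `locale` option, on the assumption that an undefined option behaves like an omitted one.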

Supported languages

Most major languages are supported.

This includes special handling for:

  • Arabic
  • Armenian
  • Czech
  • Danish
  • Dhivehi
  • Georgian
  • German (umlauts)
  • Greek
  • Hungarian
  • Latin
  • Latvian
  • Lithuanian
  • Macedonian
  • Pashto
  • Persian
  • Polish
  • Romanian
  • Russian
  • Serbian
  • Slovak
  • Swedish
  • Turkish
  • Ukrainian
  • Urdu
  • Vietnamese

However, Chinese is currently not supported.
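
Because Chinese is not supported, a caller may want to detect Han characters up front and route such input elsewhere. The check below is a minimal sketch using a standard Unicode property escape; `containsHan` is a hypothetical helper, and what the package returns for such characters is not stated here.

```javascript
// Matches any character in the Han script, which covers Chinese characters
// as well as Japanese kanji and Korean hanja.
const hanPattern = /\p{Script=Han}/u;

// Hypothetical helper: true when the string contains at least one Han character.
function containsHan(string) {
	return hanPattern.test(string);
}

console.log(containsHan('我爱独角兽'));
//=> true

console.log(containsHan('Я люблю единорогов'));
//=> false
```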
