BabelPad is a Unicode text editor for Windows that supports the proper rendering of most complex scripts, and allows you to assign different fonts to different scripts in order to facilitate multi-script text editing. BabelPad is free and fully functional, with no disabled features or time restrictions.
Summary of Features
User Interface
- Swap between Edit Mode and Browser Mode :
- Edit Mode allows documents of any size to be edited in plain text format.
- Browser Mode allows the current document to be viewed in an Internet Explorer browser window.
- In the Windows NT4/2K/XP build only, the GUI (menus, dialog boxes, status bar, etc.) may be displayed in any of the following languages :
- English
- Chinese (simplified)
- Chinese (traditional)
- Multiple instances of BabelPad may be tiled (horizontally, vertically or patchwork), cascaded, minimized, maximized, restored or closed from the "Window" menu of any of the BabelPad windows.
File Features
- Open files encoded as :
- Unicode : UTF-8
- Unicode : UTF-16 (Big Endian or Little Endian)
- Unicode : UTF-32 (Big Endian or Little Endian)
- Unicode : UTF-7
- Unicode : CESU-8
- Unicode 1.0 : UCS-2
- Unicode 1.1 : UCS-2
- Unicode 1.1 : UTF-7
- ISO-8859-1 (Latin1) : Western European
- ISO-8859-2 (Latin2) : Non-Cyrillic Central European
- ISO-8859-3 (Latin3) : Esperanto, Galician, Maltese, Turkish
- ISO-8859-4 (Latin4) : Baltic Rim
- ISO-8859-5 (Cyrillic)
- ISO-8859-6 (Arabic)
- ISO-8859-7 (Greek)
- ISO-8859-8 (Hebrew)
- ISO-8859-9 (Latin5) : Improved Turkish
- ISO-8859-10 (Latin6) : Inuit, Lappish
- ISO-8859-11 (Thai)
- ISO-8859-13 (Latin7) : Improved Baltic Rim
- ISO-8859-14 (Latin8) : Celtic
- ISO-8859-15 (Latin9, a.k.a. Latin0) : Improved Western European
- ISO-8859-16 (Latin10) : South-Eastern European
- Windows CP 874 (Thai)
- Windows CP 932 (extension of Shift-JIS) : Japanese
- Windows CP 936 (extension of GB2312) : Simplified Chinese
- Windows CP 949 (Unified Hangul Code) : Korean
- Windows CP 950 (extension of Big5) : Traditional Chinese
- Windows CP 1133 (Lao)
- Windows CP 1250 (East European)
- Windows CP 1251 (Cyrillic)
- Windows CP 1252 (West European)
- Windows CP 1253 (Greek)
- Windows CP 1254 (Turkish)
- Windows CP 1255 (Hebrew)
- Windows CP 1256 (Arabic)
- Windows CP 1257 (Baltic)
- Windows CP 1258 (Vietnamese)
- EUC-JP (Japanese)
- EUC-KR (Korean)
- GB18030 (Extended Chinese) : Unicode-mapped superset of GB2312
- GB2312 (Simplified Chinese)
- Big5 (Traditional Chinese)
- Big5-HKSCS (Big5 plus Hong Kong Supplementary Character Set)
- Shift-JIS (Japanese)
- JIS X 0201 (Latin plus Katakana)
- JIS X 0208 (Japanese)
- KSC 5601 (KS X 1001) (Korean)
- Wansung (Korean)
- Johab (Korean)
- KOI8-R (Russian)
- KOI8-U (Ukrainian)
- ARMSCII-8 (Armenian)
- VISCII (Vietnamese)
- VIQR (Vietnamese Quoted Readable)
- TIS-620 (Thai)
- Mulelao-1 (Lao)
- TSCII (Tamil)
- TAM (Tamil Monolingual)
- TAB (Tamil Bilingual)
- I.S. 434 (Ogham)
- Autodetects Unicode encoding forms and character sets declared in HTML or XML documents.
- Automatically convert CR/LF, CR, LF, Line Separator and Paragraph Separator characters.
- Option to convert Numeric Character References (NCR) and/or Universal Character Names (UCN) to Unicode characters on Open.
- Save the current document as :
- Unicode : UTF-8 (with or without a Byte Order Mark)
- Unicode : UTF-16 Big Endian or Little Endian (with or without a Byte Order Mark)
- Unicode : UTF-32 Big Endian or Little Endian (with or without a Byte Order Mark)
- GB18030 (with or without a Byte Order Mark)
- ASCII with Hexadecimal Numeric Character Reference (NCR) substitution of non Basic Latin characters
- ASCII with Decimal Numeric Character Reference (NCR) substitution of non Basic Latin characters
- ASCII with Universal Character Name (UCN) substitution of non Basic Latin characters
- ASCII with HTML Entity substitution of non Basic Latin characters
- Save line breaks as CR/LF, LF, CR, or as Unicode Line Separator [U+2028] or Paragraph Separator [U+2029] characters.
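The Byte Order Mark and line-break handling listed above can be illustrated with a short Python sketch. It is not BabelPad's code: the function names (`sniff_bom`, `decode_with_bom`, `normalize_line_breaks`) and the UTF-8 fallback are assumptions made for illustration, and it covers only BOM detection, not character sets declared inside HTML or XML documents.

```python
import codecs
import re

# Longest signatures first: the UTF-32 LE BOM (FF FE 00 00) begins with
# the UTF-16 LE BOM (FF FE), so UTF-32 must be tested before UTF-16.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) if no BOM is present."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_with_bom(data: bytes, fallback: str = "utf-8") -> str:
    """Decode bytes using the BOM if there is one, else the fallback (assumed)."""
    encoding, skip = sniff_bom(data)
    return data[skip:].decode(encoding or fallback)


def normalize_line_breaks(text: str, newline: str = "\n") -> str:
    """Map CR/LF, CR, LF, U+2028 and U+2029 to a single line-break style."""
    return re.sub(r"\r\n|\r|\n|\u2028|\u2029", newline, text)
```

Calling `normalize_line_breaks(text, "\u2028")` would correspond to saving with Unicode Line Separator characters, and passing `"\r\n"` to saving with CR/LF.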
Edit Features
- Left-To-Right (LTR) or Right-To-Left (RTL) page layout.
- Line Wrap mode or No Line Wrap mode.
- Drag and Drop editing.
- Multiple Undo/Redo.
- Indent and Unindent selected lines of text using TAB and Shift-TAB.
- Option to Auto-Indent text as you type (useful for writing code).
- Select a "word" by double-clicking and navigate by "word" by means of the left/right arrows (works for most Unicode scripts).
- Select a line of text by left-clicking in the margin (select a paragraph by double-clicking in the margin).
- Find and Replace functions.
- Select default font and font size from dropdown list on the toolbar.
- Configure individual Unicode blocks to always use a particular font regardless of which font is currently selected for default display.
- Status Bar displays codepoint and Unicode name of the character at the current caret position.
- For CJK ideographs the status bar also displays the Mandarin, Korean or Vietnamese reading for the character at the current caret position (choice of reading is user-selectable).
- Able to open and edit very large (multi-megabyte) files with little degradation in performance.
- Standard printing functionality enabled.
Text Conversion
- Case Conversion (covering all scripts that have upper/lower case distinctions, including Latin, Greek, Cyrillic, Armenian and Deseret) :
- Convert the selected alphabetic text to upper case.
- Convert the selected alphabetic text to lower case.
- Convert the selected alphabetic text to title case.
- Normalization (conforms to Unicode 5.0 normalization algorithm) :
- Convert the selected text to Normalization Form NFD (canonical decomposition).
- Convert the selected text to Normalization Form NFC (canonical decomposition followed by canonical composition).
- Convert the selected text to Normalization Form NFKD (compatibility decomposition).
- Convert the selected text to Normalization Form NFKC (compatibility decomposition followed by canonical composition).
- CJK Conversion :
- Convert the selected Simplified Chinese text to Traditional Chinese.
- Convert the selected Traditional Chinese text to Simplified Chinese.
- Entity Conversion :
- Convert all HTML Entities (e.g. &uuml;) in the selected text to Unicode characters.
- Convert all non-Basic Latin characters in the selected text to HTML Entities or hexadecimal Numeric Character References (NCRs).
- Convert all Numeric Character References (e.g. &#252; or &#xFC;) in the selected text to Unicode characters.
- Convert all non-Basic Latin characters in the selected text to hexadecimal Numeric Character References (NCRs).
- Convert all non-Basic Latin characters in the selected text to decimal Numeric Character References (NCRs).
- Convert all Universal Character Names (e.g. \u00FC) in the selected text to Unicode characters.
- Convert all non-Basic Latin characters in the selected text to Universal Character Names (UCNs).
- Convert all characters in the selected text to their Unicode Names (e.g. LATIN SMALL LETTER U WITH DIAERESIS).
- Convert all characters in the selected text to U+XXXX notation (e.g. U+00FC).
- Transliteration Conversion :
- Convert the selected Extended Wylie Tibetan transliteration to Unicode Tibetan.
- Convert the selected Mongolian transliteration to Unicode Mongolian.
- Convert the selected Manchu transliteration to Unicode Manchu.
- Convert the selected Yi romanisation to Unicode Yi.
- Convert the selected Yi romanisation to International Phonetic Alphabet (IPA).
- Convert the selected Unicode Yi text to Yi romanisation.
- Convert the selected Unicode Yi text to International Phonetic Alphabet (IPA).
- Convert the selected Vietnamese Unicode text to VIQR transliteration.
- Convert the selected VIQR transliteration to Vietnamese Unicode.
- PUA Conversion :
- Convert precomposed Tibetan (SetA) to standard Unicode Tibetan.
- Convert standard Unicode Tibetan to precomposed Tibetan (SetA).
- Convert Hong Kong Supplementary Character Set (HKSCS) PUA characters to CJK Unified Ideograph characters.
- Reordering :
- Reverse the order of all selected characters in a line.
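For readers who want to see what the normalization and entity conversions amount to, here is a minimal Python sketch using only the standard library. The helper names (`to_hex_ncr`, `to_decimal_ncr`, `to_ucn`, `to_u_notation`) are hypothetical, and Python's `unicodedata` module follows the Unicode version bundled with the interpreter rather than Unicode 5.0, so results for recently added characters may differ from BabelPad's.

```python
import html
import unicodedata


def to_hex_ncr(text: str) -> str:
    """Replace every non-Basic Latin character with a hexadecimal NCR."""
    return "".join(ch if ord(ch) < 0x80 else f"&#x{ord(ch):X};" for ch in text)


def to_decimal_ncr(text: str) -> str:
    """Replace every non-Basic Latin character with a decimal NCR."""
    return "".join(ch if ord(ch) < 0x80 else f"&#{ord(ch)};" for ch in text)


def to_ucn(text: str) -> str:
    """Replace non-Basic Latin characters with \\uXXXX or \\UXXXXXXXX names."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04X}")
        else:
            out.append(f"\\U{cp:08X}")
    return "".join(out)


def to_u_notation(text: str) -> str:
    """Render every character in U+XXXX notation, separated by spaces."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Entities and NCRs back to characters: html.unescape handles all three forms.
decoded = html.unescape("&uuml; &#252; &#xFC;")

# Normalization forms, applied to U+00FC and to the "fi" ligature U+FB01.
nfd = unicodedata.normalize("NFD", "\u00fc")    # "u" followed by U+0308
nfc = unicodedata.normalize("NFC", nfd)         # back to U+00FC
nfkc = unicodedata.normalize("NFKC", "\ufb01")  # compatibility ligature -> "fi"

# Character name lookup, as in the "Unicode Names" conversion.
name = unicodedata.name("\u00fc")
```

Note that NFD and NFC round-trip for this character, whereas the NFKC step is lossy: once the ligature has been replaced by "fi", the original compatibility character cannot be recovered.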
Rendering Features
- Utilises Microsoft's Uniscribe rendering engine to correctly render complex text.
- Option to display Unicode control characters with visible glyphs (if the selected font supports this).
- Option to render all Unicode characters as individual spacing glyphs (i.e. with no shaping or ligation of complex text, and combining characters not combined).
Input Methods
- Select any installed Windows Input Locale and/or Keyboard Layout/IME from a dropdown list on the toolbar.
- Romanised input methods for the following scripts :
- Tibetan (using the Tibetan & Himalayan Digital Library [THDL] Extended Wylie Transcription System [EWTS])
- Mongolian
- Manchu
- Yi (using the Liangshan Yi Phonetic Alphabet)
- Unicode Input Mode :
- Enter Unicode characters in the range U+0001 through U+10FFFF as scalar hexadecimal values (with or without leading zeros), demarcated by pressing the Space or Return key.
- Select One-off Unicode Input Mode by pressing Ctrl+Q (this allows you to enter a single Unicode character as described above, but on pressing Space, Enter or Escape you are returned to the original keyboard/IME).
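The Unicode Input Mode described above boils down to turning hexadecimal scalar values into characters. The following Python sketch shows the idea; `parse_hex_input` is a hypothetical name, and rejecting the surrogate range D800 to DFFF is an assumption based on the phrase "scalar values", not documented BabelPad behaviour.

```python
def parse_hex_input(keys: str) -> str:
    """Turn whitespace-separated hexadecimal scalar values into characters.

    Leading zeros are optional, so "FC" and "00FC" give the same character.
    """
    out = []
    for token in keys.split():
        cp = int(token, 16)
        if not 0x1 <= cp <= 0x10FFFF:
            raise ValueError(f"{token}: outside U+0001..U+10FFFF")
        if 0xD800 <= cp <= 0xDFFF:
            # Assumption: surrogate code points are not scalar values.
            raise ValueError(f"{token}: surrogate code point")
        out.append(chr(cp))
    return "".join(out)
```

For example, `parse_hex_input("FC 00fc 1F600")` yields the letter ü twice followed by the emoji at U+1F600.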
Tools and Utilities
- Unicode Character Map utility :
- Unicode Character Grid showing all characters in the current version of Unicode.
- Magnify any character in the Unicode Character Grid by right-clicking on it.
- Select any Unicode range in Planes 0, 1, 2, 14, 15 and 16.
- Find a character by Unicode hexadecimal or decimal codepoint value.
- Search (backwards and forwards through the Unicode Grid) for characters by full or partial name.
- Browse through Unicode Grid by plane, range or page of 128 characters.
- Display characters as characters or NCR/UCN entities.
- Copy selected characters to clipboard.
- Insert selected characters into the document at the current cursor position.
- Open a Properties dialogue box for the selected character, which displays various additional information about the character (including Unicode Properties, Notes, Aliases, Cross-References, Standardized Variants, and CJK data).
- Advanced Character Search utility :
- List all characters that meet the specified search criteria.
- Select a wide range of information to display for each character that meets the specified criteria.
- Sort results by any property that has been selected for inclusion in the output.
- Copy the search results to the clipboard in tab-delimited format.
- CJK Radical Lookup utility :
- List all CJK ideographs with a given radical and number of strokes.
- Covers all 70,000+ characters in CJK, CJK-A and CJK-B ranges.
- CJK Pinyin Lookup utility :
- List all CJK ideographs with a given pinyin pronunciation.
- Covers CJK and CJK-A ranges, plus about 200 characters in CJK-B (in total 25,500+ characters).
- Yi Radical Lookup utility :
- List all Yi syllables with a given radical and number of strokes.
- Font Analysis utility :
- Lists all fonts that cover a particular Unicode block.
- Lists all Unicode blocks that are covered by a particular font.
- Copy Font Coverage or Unicode Block Coverage information to clipboard in plain text format.
- Preview sample text for any given Unicode block under any selected font.
- Unicode Summary utility :
- Lists summary information for all planes in the current version of Unicode.
- Lists summary information for all blocks in the current version of Unicode.
- Lists summary information for all scripts in the current version of Unicode.
- Unicode Version History utility :
- Lists character repertoire statistics for all versions of Unicode from 1.0.0 onwards.
- Lists summary of the blocks included in a given version of Unicode.
- Document Analysis utility :
- Provides statistical information about the current document, including distribution of Unicode characters over Unicode blocks and scripts.
- Highlights usage of unassigned or reserved codepoints, unpaired surrogate characters, undefined variant sequences, deprecated characters, and other such problems.
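A rough idea of the kind of checks the Document Analysis utility performs can be given in Python. This is a sketch under stated assumptions, not BabelPad's algorithm: the function name `analyse` is hypothetical, the standard library has no block or script data so the statistics are grouped by general category instead, and only two of the listed problems (unassigned codepoints and lone surrogates) are detected.

```python
import unicodedata
from collections import Counter


def analyse(text: str):
    """Return (category_counts, problems) for a string.

    problems is a list of (index, "U+XXXX", description) tuples.
    """
    categories = Counter()
    problems = []
    for index, ch in enumerate(text):
        cat = unicodedata.category(ch)
        categories[cat] += 1
        if cat == "Cs":
            # A surrogate code point inside a Python str is always unpaired.
            problems.append((index, f"U+{ord(ch):04X}", "lone surrogate"))
        elif cat == "Cn":
            # Cn covers both unassigned code points and noncharacters.
            problems.append((index, f"U+{ord(ch):04X}", "unassigned or reserved"))
    return categories, problems
```

Which code points count as unassigned depends on the Unicode version of the Python build in use, so a character added in a later version of Unicode may be flagged by an older interpreter.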