Full-width/Half-width Conversion Tool (Katakana/Alphanumeric)
Batch-normalizes notation variations such as full-width alphanumeric characters, half-width katakana, and full-width spaces.
* Content is processed in your browser and is never sent to a server.
About
This tool normalizes notation variations in Japanese text, such as full-width alphanumeric characters, half-width katakana, and full-width spaces, all at once. It is useful for form input preprocessing, data cleansing, and manuscript proofreading.
In addition to five presets (form input, half-width alphanumeric characters, full-width kana, all hiragana, and all katakana), it offers advanced options: individual settings for alphanumeric characters, symbols, spaces, kana width, and kana type, plus line-break unification, removal of trailing spaces, and tab conversion. Difference highlighting lets you check the changes visually.
All processing is completed within the browser and no data is sent to an external server. No registration or installation is required; just paste the text and you can start normalizing immediately.
How to Use
Enter text
Paste the text you want to normalize into the input field. It supports all types of text, including form input, CSV, and manuscripts.
Preset options
Turn "Auto-update" ON, then choose a preset or adjust the advanced options.
Copy result
Review the normalized text and the difference highlighting, then copy or save the result.
Glossary
- Half-width to Full-width Kana
- Converts legacy half-width Japanese katakana ("ｱｲｳｴｵ") into standard full-width katakana ("アイウエオ"). Essential for modern web readability.
- Full-width to Half-width Alphanumeric
- Converts wide English letters and numbers ("１２３ＡＢＣ") to standard ASCII half-width format ("123ABC"). Critical for database validation.
- Unicode Normalization (NFC/NFD)
- The process of unifying different code point sequences that represent the same character. For example, NFC ensures "が" is stored as a single code point (U+304C) rather than "か" followed by the combining voiced sound mark (U+3099).
- Machine Dependent Characters
- Old proprietary glyphs (like circled numbers or specific Roman numerals) that can cause mojibake (garbled text) on systems that do not support them. It is usually best to normalize them out.
- Orthographical Variance
- Inconsistencies in text, such as multiple ways to write "apple" in Japanese (りんご, リンゴ, 林檎). Normalizing helps search engines accurately index text.
- Whitespace Trimming
- The removal or standardization of spaces. It includes converting Japanese wide spaces to standard ASCII spaces, stripping end-of-line spaces, and collapsing multiple spaces.
- Regex Processing
- The underlying technology (Regular Expressions) used by the tool to locate and replace matching character patterns directly in your browser.
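The Unicode Normalization entry in the glossary above can be illustrated with a short sketch. Python is used here only for illustration (the tool itself runs in the browser), the function names `compose` and `decompose` are hypothetical, and the sketch assumes the separated mark is the combining voiced sound mark U+3099.

```python
import unicodedata


def compose(text: str) -> str:
    """Return the NFC (composed) form of the text."""
    return unicodedata.normalize("NFC", text)


def decompose(text: str) -> str:
    """Return the NFD (decomposed) form of the text."""
    return unicodedata.normalize("NFD", text)


# "か" (U+304B) followed by the combining voiced sound mark (U+3099)
separated = "\u304b\u3099"
merged = compose(separated)  # one code point, U+304C ("が")

print(len(separated), len(merged))
print(decompose(merged) == separated)
```

Both strings render the same on screen, which is why comparing or searching text without normalization can miss matches.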
FAQ
- Q.Is my text data secure?
- Your text never leaves your browser. All text processing and regex operations are executed locally in real time, and nothing is sent to or processed by external servers.
- Q.Can I paste columns directly from Excel/Google Sheets?
- Yes. You can copy a whole column of messy customer data, paste it in, normalize the alphanumeric characters, and paste it directly back into your spreadsheet cleanly.
- Q.Can I disable the conversion of full-width spaces?
- Yes. You have granular control via the settings panel. Simply uncheck the associated box if you wish to preserve Japanese full-width spaces.
- Q.Will it fix separated dakuten marks like "か" + "゙"?
- Yes. The tool automatically detects separated dakuten (voiced marks) and intelligently merges them back into single, standardized characters (e.g., "が").
- Q.What are typical use cases for this normalizer?
- Typical uses include cleansing form submissions, migrating legacy databases, sanitizing e-commerce product catalogs, and standardizing formatting before text goes to print.
- Q.Does it process line breaks properly?
- Yes. It preserves your existing line breaks (or standardizes them to LF/CRLF depending on settings) while processing the text on a line-by-line basis without merging paragraphs.
- Q.Is there a limit to how much text I can process?
- Because it runs as optimized local JavaScript, the tool can comfortably handle tens of thousands of characters in milliseconds without freezing the browser.
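The line-break handling described in the FAQ above (standardizing to LF or CRLF and stripping end-of-line spaces, line by line) could look roughly like the following. This is a minimal Python sketch for illustration, not the tool's actual code; the function `normalize_lines` and its parameters are hypothetical.

```python
def normalize_lines(text: str, newline: str = "\n", strip_trailing: bool = True) -> str:
    """Unify line endings and optionally strip trailing spaces on each line."""
    # Treat CRLF, lone CR, and LF as the same line break.
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    if strip_trailing:
        # Remove ASCII spaces, tabs, and full-width spaces (U+3000) at line ends.
        lines = [line.rstrip(" \t\u3000") for line in lines]
    # Rejoin with the requested line ending; paragraphs are never merged.
    return newline.join(lines)


sample = "first line \r\nsecond line\u3000\rthird line\n"
print(repr(normalize_lines(sample)))
print(repr(normalize_lines(sample, newline="\r\n")))
```

Splitting and rejoining keeps the number of lines unchanged, so empty lines between paragraphs survive the cleanup.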
Use Cases
Preprocessing form input
Ideal as a validation step before storing input in the database.
CSV cleansing
Improves aggregation accuracy by eliminating mixed full-width and half-width characters and inconsistent spacing.
Proofreading of manuscripts and articles
Difference display allows you to visually confirm changes, greatly improving proofreading efficiency.
Program preprocessing
Unify full-width and half-width characters in user input before your program processes it.
Technical
Character code conversion mechanism
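Full-width ↔ half-width conversion is implemented by offset arithmetic on Unicode code points. Each full-width alphanumeric character (in the range U+FF01 to U+FF5E) sits exactly 0xFEE0 above its half-width ASCII counterpart (U+0021 to U+007E), so conversion only requires adding or subtracting that constant. The full-width space (U+3000) does not follow this offset and has to be mapped to the ASCII space separately.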
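The offset calculation can be sketched as follows. This is an illustrative Python version rather than the tool's browser code; the function and constant names are hypothetical, and the sketch assumes the whole U+FF01 to U+FF5E block (letters, digits, and symbols) is converted, with the full-width space U+3000 handled as a special case.

```python
OFFSET = 0xFEE0            # distance between full-width forms and ASCII
FULLWIDTH_FIRST = 0xFF01   # full-width "!"
FULLWIDTH_LAST = 0xFF5E    # full-width "~"
FULLWIDTH_SPACE = 0x3000   # ideographic space, not covered by the offset


def to_halfwidth(text: str) -> str:
    """Convert full-width ASCII variants and full-width spaces to ASCII."""
    out = []
    for ch in text:
        cp = ord(ch)
        if FULLWIDTH_FIRST <= cp <= FULLWIDTH_LAST:
            out.append(chr(cp - OFFSET))
        elif cp == FULLWIDTH_SPACE:
            out.append(" ")
        else:
            out.append(ch)  # kana, kanji, etc. are left untouched
    return "".join(out)


def to_fullwidth(text: str) -> str:
    """Convert printable ASCII and spaces to their full-width variants."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0x21 <= cp <= 0x7E:
            out.append(chr(cp + OFFSET))
        elif cp == 0x20:
            out.append(chr(FULLWIDTH_SPACE))
        else:
            out.append(ch)
    return "".join(out)


print(to_halfwidth("\uff11\uff12\uff13\uff21\uff22\uff23"))  # full-width "123ABC"
print(to_fullwidth("ABC 123"))
```

Restricting either direction to letters and digits only would be a matter of narrowing the range checks, which is one way individual settings for alphanumeric characters and symbols could be kept separate.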
Kana conversion algorithm
Unicode normalization (NFKC) is used to convert half-width kana to full-width kana. A half-width kana with a voiced mark (e.g. "ｶﾞ") is two characters, but NFKC combines it into one full-width character ("ガ").
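As a rough illustration of the NFKC step, the Python sketch below applies normalization only to runs of half-width kana (U+FF61 to U+FF9F). Limiting it to that range is an assumption made here, because NFKC applied to a whole string would also convert full-width alphanumerics and full-width spaces; the function name is hypothetical.

```python
import re
import unicodedata

# Runs of half-width katakana and related half-width punctuation/marks.
HALFWIDTH_KANA_RUN = re.compile("[\uff61-\uff9f]+")


def halfwidth_kana_to_fullwidth(text: str) -> str:
    """Apply NFKC only to half-width kana runs, leaving other text as-is."""
    return HALFWIDTH_KANA_RUN.sub(
        lambda m: unicodedata.normalize("NFKC", m.group()), text
    )


# Half-width "ga" is two code points: U+FF76 (ka) + U+FF9E (voiced mark).
half_ga = "\uff76\uff9e"
full_ga = halfwidth_kana_to_fullwidth(half_ga)

print(len(half_ga), len(full_ga))
print(full_ga == "\u30ac")  # U+30AC is full-width "ga"
```

Matching whole runs rather than single characters matters: the base kana and its voiced mark must be normalized together for them to combine into one character.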
Difference display (Myers diff)
The Myers diff algorithm is used to calculate the difference between the text before and after the change. Deletions (red) and insertions (green) are computed character by character and highlighted.
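To show what a character-level diff result looks like, here is a small Python sketch. Note that it uses the standard library's `difflib.SequenceMatcher`, which is not the Myers algorithm, as a stand-in that yields the same kind of delete/insert/equal segments; the function name `char_diff` and the tuple format are hypothetical.

```python
import difflib


def char_diff(before: str, after: str) -> list[tuple[str, str]]:
    """Return (operation, text) segments: 'equal', 'delete', or 'insert'."""
    segments = []
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            segments.append(("equal", before[i1:i2]))
            continue
        # 'replace' is split into a deletion followed by an insertion.
        if i1 < i2:
            segments.append(("delete", before[i1:i2]))  # would be shown in red
        if j1 < j2:
            segments.append(("insert", after[j1:j2]))   # would be shown in green
    return segments


for op, text in char_diff("\uff21\uff22C 12", "ABC 12"):
    print(op, repr(text))
```

Concatenating the "equal" and "delete" segments reproduces the original text, and concatenating "equal" and "insert" reproduces the normalized text, which is a handy sanity check for any diff implementation.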