The iconv command in Linux is a powerful utility designed to convert text files from one character encoding to another. Whether you’re dealing with legacy systems, internationalization requirements, or file compatibility issues, iconv provides a reliable solution for character encoding conversion tasks.
What is Character Encoding?
Character encoding is a system that assigns numerical codes to characters, enabling computers to store and display text. Common encodings include:
- UTF-8: Variable-width Unicode encoding (1 to 4 bytes per character) that covers all languages and is backward compatible with ASCII
- ASCII: Basic English character set (7-bit)
- ISO-8859-1 (Latin-1): Western European character set
- UTF-16: Unicode encoding using one or two 16-bit units per character
- CP1252 (Windows-1252): Windows code page for Western European languages
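To make the differences concrete, the following sketch prints the bytes that a single character, é (U+00E9), occupies in three of these encodings. The helper names e_acute_utf8 and hex_bytes are made up for this illustration, and the octal escapes are there only so the result does not depend on how the script file itself is encoded.

```shell
# "é" encoded as UTF-8: bytes 0xC3 0xA9 written as octal escapes
e_acute_utf8() { printf '\303\251'; }

# Print stdin as one contiguous string of hex digits
hex_bytes() { od -An -tx1 | tr -d ' \t\n'; echo; }

e_acute_utf8 | hex_bytes                                  # UTF-8: two bytes
e_acute_utf8 | iconv -f UTF-8 -t ISO-8859-1 | hex_bytes   # ISO-8859-1: one byte
e_acute_utf8 | iconv -f UTF-8 -t UTF-16LE | hex_bytes     # UTF-16LE: one 16-bit unit
```

The three lines printed are c3a9, e9, and e900: the same character, stored three different ways.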
Basic Syntax and Options
The fundamental syntax of the iconv command follows this pattern:
iconv [OPTIONS] -f FROM_ENCODING -t TO_ENCODING [INPUT_FILE]
Essential Options
| Option | Description |
|---|---|
| -f, --from-code | Specify source encoding |
| -t, --to-code | Specify target encoding |
| -o, --output | Specify output file |
| -c | Omit invalid characters |
| -s, --silent | Suppress error messages |
| -l, --list | List supported encodings |
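As a small illustration of how these options combine, the sketch below converts the same file twice: once with an input file and -o, and once through standard input and output, which iconv uses when no input file is named. The temporary directory and file names are placeholders chosen for this example.

```shell
workdir=$(mktemp -d)

# "café" in ISO-8859-1: octal 351 is byte 0xE9 (é)
printf 'caf\351\n' > "$workdir/latin1.txt"

# Form 1: name the input file and write the result with -o
iconv -f ISO-8859-1 -t UTF-8 -o "$workdir/with_o.txt" "$workdir/latin1.txt"

# Form 2: no input file, so iconv reads stdin and writes stdout
iconv -f ISO-8859-1 -t UTF-8 < "$workdir/latin1.txt" > "$workdir/with_redirect.txt"

# Both forms should produce identical bytes
cmp "$workdir/with_o.txt" "$workdir/with_redirect.txt" && echo "identical output"
```

The redirect form is handy inside pipelines, while -o keeps the command self-contained.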
Listing Available Encodings
Before converting files, you can view all supported encodings:
iconv -l
Sample output (abridged; exact names and layout vary between iconv implementations):
The following list contains all the coded character sets known.
ASCII
UTF-8
UTF-16
UTF-16BE
UTF-16LE
UTF-32
ISO-8859-1
ISO-8859-2
...
CP1252
WINDOWS-1251
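Because the list is long and its layout differs between implementations, a script may prefer to test a specific encoding name directly. The sketch below does that by asking iconv to convert empty input; supports_encoding is a hypothetical helper name, and the approach assumes that iconv rejects an unknown encoding name before reading any data.

```shell
# Return success if this iconv can convert FROM the named encoding.
# The input is empty, so nothing is converted; only the name is checked.
supports_encoding() {
    printf '' | iconv -f "$1" -t UTF-8 > /dev/null 2>&1
}

for enc in UTF-8 ISO-8859-1 CP1252 NOT-A-REAL-ENCODING; do
    if supports_encoding "$enc"; then
        echo "$enc: supported"
    else
        echo "$enc: not supported"
    fi
done
```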
Basic Conversion Examples
Converting UTF-8 to ASCII
Create a sample file with UTF-8 encoding:
echo "Hello World! Café résumé" > utf8_file.txt
Convert to ASCII. The //IGNORE suffix drops characters that have no ASCII equivalent, so the accented letters are lost:
iconv -f UTF-8 -t ASCII//IGNORE utf8_file.txt
Output:
Hello World! Caf rsum
Converting Between UTF-8 and ISO-8859-1
iconv -f UTF-8 -t ISO-8859-1 input.txt -o output.txt
Converting Windows Files to UTF-8
iconv -f CP1252 -t UTF-8 windows_file.txt -o unix_file.txt
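Note that iconv changes only the character encoding. Files created on Windows often also use CRLF line endings, which iconv leaves untouched. A minimal sketch that handles both, assuming every carriage return in the input is part of a line ending (win_to_unix is a name invented for this example):

```shell
# Convert CP1252 text on stdin to UTF-8 and strip carriage returns.
# Assumption: every \r in the input belongs to a CRLF line ending.
win_to_unix() {
    iconv -f CP1252 -t UTF-8 | tr -d '\r'
}

# "naïve" with a CRLF ending; octal 357 is byte 0xEF (ï in CP1252)
printf 'na\357ve\r\n' | win_to_unix | od -An -c
```

Stripping the carriage returns after the conversion is safe because the byte 0x0D never appears inside a multi-byte UTF-8 sequence.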
Advanced Usage Scenarios
Handling Invalid Characters
When converting between incompatible encodings, use these suffixes:
- //IGNORE: Skip invalid characters
- //TRANSLIT: Transliterate similar characters
# Skip invalid characters
iconv -f UTF-8 -t ASCII//IGNORE input.txt
# Transliterate characters
iconv -f UTF-8 -t ASCII//TRANSLIT input.txt
Example with transliteration:
echo "Café résumé naïve" | iconv -f UTF-8 -t ASCII//TRANSLIT
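Output (glibc in a UTF-8 locale; other implementations and locales may substitute different characters, such as ?):
Cafe resume naive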
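Since the result of a lossy conversion depends on the iconv implementation and the active locale, it is worth checking on your own system. The sketch below compares dropping characters (the -c option) with transliterating them; the variable and function names are illustrative only.

```shell
# Sample text as UTF-8 bytes: "Café résumé"
sample() { printf 'Caf\303\251 r\303\251sum\303\251\n'; }

# -c drops characters that the target encoding cannot represent.
# "|| true" is there because some iconv versions still exit non-zero
# after dropping characters.
dropped=$(sample | iconv -c -f UTF-8 -t ASCII 2>/dev/null || true)

# //TRANSLIT substitutes look-alikes; the exact result depends on the
# iconv implementation and the current locale.
translit=$(sample | iconv -f UTF-8 -t ASCII//TRANSLIT 2>/dev/null || true)

printf 'dropped:  %s\n' "$dropped"
printf 'translit: %s\n' "$translit"
```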
Batch File Conversion
Convert multiple files using a shell script:
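#!/bin/bash
for file in *.txt; do
    # Skip files produced by an earlier run of this script
    case "$file" in *_utf8.txt) continue ;; esac
    iconv -f ISO-8859-1 -t UTF-8 "$file" -o "${file%.txt}_utf8.txt"
done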
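The loop assumes every .txt file really is ISO-8859-1. Converting a file that is already UTF-8 would double-encode its non-ASCII characters. One possible safeguard, sketched here with the hypothetical helpers is_utf8 and convert_dir_to_utf8, is to check whether a file already decodes cleanly as UTF-8 and copy it unchanged in that case. Output goes to a separate directory so the originals stay intact.

```shell
# Succeeds if the file decodes cleanly as UTF-8 (pure ASCII also passes)
is_utf8() { iconv -f UTF-8 -t UTF-8 < "$1" > /dev/null 2>&1; }

# Usage: convert_dir_to_utf8 SRC_DIR DEST_DIR
# Assumption: any file that is not valid UTF-8 is ISO-8859-1.
convert_dir_to_utf8() {
    src=$1 dest=$2
    for file in "$src"/*.txt; do
        [ -f "$file" ] || continue            # the pattern matched nothing
        out="$dest/$(basename "$file")"
        if is_utf8 "$file"; then
            cp "$file" "$out"
            echo "copied unchanged: $file"
        elif iconv -f ISO-8859-1 -t UTF-8 < "$file" > "$out"; then
            echo "converted: $file"
        else
            rm -f "$out"
            echo "failed: $file" >&2
        fi
    done
}
```

It would be called as convert_dir_to_utf8 ./legacy ./converted, where both directories already exist.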
Practical Use Cases
Database Migration
When migrating databases with different character sets:
mysqldump --default-character-set=latin1 database_name | \
iconv -f ISO-8859-1 -t UTF-8 | \
mysql --default-character-set=utf8 new_database_name
Web Development
Ensuring consistent encoding for web files:
find . -name "*.html" -exec iconv -f ISO-8859-1 -t UTF-8 {} -o {}.utf8 \;
Log File Processing
Converting system logs for analysis:
iconv -f CP1252 -t UTF-8 system.log | grep "ERROR"
Error Handling and Troubleshooting
Common Error Messages
Invalid input sequence:
iconv: illegal input sequence at position X
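Solution: First confirm that the encoding passed to -f matches the file's actual encoding, since a wrong source encoding is a common cause of this error. If the characters genuinely cannot be represented in the target encoding, use the -c option or the //IGNORE or //TRANSLIT suffixes.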
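Before discarding anything, it can help to look at the bytes iconv complained about. The sketch below prints a few bytes starting at a given offset. It assumes the position in the error message is a zero-based byte offset, which is how glibc appears to report it; bytes_at is a helper name invented for this example.

```shell
# Print COUNT bytes of FILE as hex, starting at zero-based byte OFFSET
bytes_at() {
    file=$1 offset=$2 count=${3:-8}
    tail -c +"$(( offset + 1 ))" "$file" | head -c "$count" | od -An -tx1 | xargs
}

tmp=$(mktemp)
printf 'caf\351 ok\n' > "$tmp"    # byte 0xE9 at offset 3 is not valid UTF-8 here

# Reports an illegal input sequence on stderr
iconv -f UTF-8 -t ASCII "$tmp" > /dev/null || true

# Inspect two bytes at the reported position
bytes_at "$tmp" 3 2
```

Here the bytes shown are e9 20. A lone 0xE9 followed by a space is typical of ISO-8859-1 or CP1252 text, which suggests the -f value was wrong rather than the data being corrupt.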
Detecting File Encoding
Use the file command to identify encoding:
file -bi filename.txt
Output:
text/plain; charset=utf-8
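The charset value can be fed straight into iconv. The sketch below extracts it from the MIME string and uses it as the source encoding. The function names are hypothetical. Detection by file is a heuristic, so treat the result as a guess, and note that the sketch assumes iconv accepts the charset names file reports.

```shell
# Pull the charset value out of a MIME string such as
# "text/plain; charset=utf-8"
charset_of() {
    printf '%s\n' "$1" | sed -n 's/.*charset=\([^;[:space:]]*\).*/\1/p'
}

# Convert FILE to UTF-8 on stdout using the charset that file(1) reports.
# "binary" and "unknown-8bit" are not encodings iconv understands,
# so those cases are rejected instead of guessed.
to_utf8_autodetect() {
    cs=$(charset_of "$(file -bi "$1")")
    case "$cs" in
        ''|binary|unknown-8bit)
            echo "cannot determine encoding of $1" >&2
            return 1 ;;
    esac
    iconv -f "$cs" -t UTF-8 "$1"
}

charset_of 'text/plain; charset=utf-8'    # prints: utf-8
```

A call such as to_utf8_autodetect filename.txt > filename.utf8.txt would then convert a file without naming its encoding by hand.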
Performance Considerations
Processing Large Files
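For large files, read from standard input and write to a separate output file. Redirection does this without an extra cat process. Memory use depends on the iconv implementation and version, so try a realistically sized sample first:
iconv -f ISO-8859-1 -t UTF-8 < large_file.txt > converted_file.txt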
Monitoring Progress
Use pv (pipe viewer) for progress monitoring:
pv large_file.txt | iconv -f UTF-16 -t UTF-8 > output.txt
Integration with Other Commands
Combining with sed
iconv -f UTF-8 -t ASCII//TRANSLIT input.txt | sed 's/[^a-zA-Z0-9 ]//g'
Using with awk
iconv -f ISO-8859-1 -t UTF-8 data.csv | awk -F',' '{print $1, $3}'
Best Practices
- Always back up original files before conversion
- Test with sample data before processing large datasets
- Verify encoding using the file -bi command
- Use appropriate suffixes (//IGNORE or //TRANSLIT) for lossy conversions
- Document encoding changes for team collaboration
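One way to put the testing advice into practice is a round-trip check: convert a sample to the target encoding and back, then compare the result with the original. If the bytes differ, the conversion loses information for that file. This is a minimal sketch, and roundtrip_ok is a made-up helper name.

```shell
# Usage: roundtrip_ok FILE CURRENT_ENCODING TARGET_ENCODING
# Succeeds only if FILE survives CURRENT -> TARGET -> CURRENT unchanged.
roundtrip_ok() {
    iconv -f "$2" -t "$3" < "$1" 2>/dev/null \
        | iconv -f "$3" -t "$2" 2>/dev/null \
        | cmp -s - "$1"
}

tmp=$(mktemp)
printf 'caf\351\n' > "$tmp"    # "café" in ISO-8859-1

if roundtrip_ok "$tmp" ISO-8859-1 UTF-8; then
    echo "ISO-8859-1 -> UTF-8 is lossless for this file"
fi
if ! roundtrip_ok "$tmp" ISO-8859-1 ASCII; then
    echo "ISO-8859-1 -> ASCII would lose data for this file"
fi
```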
Alternative Tools
While iconv is the standard, consider these alternatives:
- recode: More user-friendly interface
- uconv: ICU-based converter with advanced features
- dos2unix/unix2dos: Specific for line ending conversion
Conclusion
The iconv command is an essential tool for Linux system administrators and developers working with text files in different character encodings. Its versatility in handling various encoding formats, combined with powerful options for error handling, makes it indispensable for data processing, migration tasks, and internationalization projects.
By mastering iconv, you can ensure consistent text encoding across different systems, resolve compatibility issues, and maintain data integrity during file conversions. Remember to always test conversions with sample data and back up original files to prevent data loss.







