5 Ways to Remove Junk / Special Characters In Unix

Control characters such as ^M, ^B, and ^C are a common nightmare for programmers who generate text files from database sources.
And like every other trivial problem, Unix has a solution for this too. You can remove junk characters in Unix in a variety of ways. Below are five of the most popular and easiest:

::Way One::
In the vi editor:
:%s/^V^M//g
Tells the vi editor to substitute every ^M character in the file with whatever is between the second and third slashes (nothing, in this case). To type the ^M, press CONTROL-V and then CONTROL-M while you are in command mode. CONTROL-V tells vi to take the next keystroke literally, so only ^M appears on the command line. The % applies the substitution to every line, and the g applies it to every occurrence on a line.

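:%s/.$//g
Tells the vi editor to substitute the last character of each line (denoted by .$) with whatever is between the second and third slashes (nothing, in this case). The g at the end asks for every match on a line to be replaced; it is harmless but redundant here, because .$ can match only once per line. Note that this command removes the last character of every line whether or not it is a ^M, so use it only when every line ends in a junk character.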
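
The difference between the two substitutions is easy to see outside the editor, because sed accepts the same patterns. The following is only a sketch, assuming a POSIX shell; the file name vi_demo.txt is made up for the illustration, and sed merely stands in for vi here.

```shell
# Sample file: the first line ends in a carriage return (^M), the second does not.
printf 'abc\r\ndef\n' > vi_demo.txt

cr=$(printf '\015')   # a literal carriage return, the character vi shows as ^M

# Counterpart of :%s/^M//g : only the carriage return is removed.
only_cr=$(sed "s/${cr}//g" vi_demo.txt)

# Counterpart of :%s/.$//g : the last character of every line is removed.
last_char=$(sed 's/.$//' vi_demo.txt)

printf '%s\n' "$only_cr"     # prints abc and def
printf '%s\n' "$last_char"   # prints abc and de (the f is lost)

rm -f vi_demo.txt
```

The second result shows why the .$ form is safe only when every line really ends in a junk character.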


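::Way Two::
Using the col command:
$ cat filename | col -b > newfilename
col filters reverse line feeds out of the input file, and the -b option also drops backspaces, keeping only the last character written to each column. The control sequences accepted by col are: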

ESC-8            half reverse line feed (escape then 8)
ESC-9            half forward line feed (escape then 9)
ESC-7            reverse line feed (escape then 7)
backspace        moves back one column (8); ignored in the first column
carriage return  (13)
space            moves forward one column (32)
tab              moves forward to next tab stop (9)
newline          forward line feed (10) and also does carriage return
shift in         shift to normal character set (15)
shift out        shift to alternate character set (14)
vertical tab     reverse line feed (11)
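
As a quick check of how col -b treats two of the sequences listed above, the sketch below feeds it a line containing a backspace overstrike and a trailing carriage return. It assumes col is installed, which may not be the case on minimal systems. The -x option is not part of the original command; it is added here so that col keeps spaces instead of converting runs of them to tabs.

```shell
# Input: 'a', backspace, 'b' (an overstrike in the first column), then " c",
# and finally a DOS-style CR LF line ending.
# If col is not installed, the error is discarded and col_out stays empty.
col_out=$(printf 'a\bb c\r\n' | col -bx 2>/dev/null)

# With -b only the last character written to each column survives,
# and the carriage return is not passed through.
printf '%s\n' "$col_out"
```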

 
::Way Three::
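Using the sed command:

sed (Stream EDitor) uses the same substitution syntax as the '%s' string substitution in the vi editor. In the first command below, type the ^M as CONTROL-V followed by CONTROL-M. The second command produces the same character with printf (octal 015 is the carriage return), so it can be pasted as is.

sed 's/^M//g' filename > newfilename
sed 's/'"$(printf '\015')"'//g' filename > newfilename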
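
To confirm that the substitution worked, the result can be inspected with od, which prints a carriage return as \r. This is a minimal sketch, assuming a POSIX shell; sample_dos.txt and sample_unix.txt are placeholder names. The pattern here is anchored with $ so that only a carriage return at the end of a line is removed, which is slightly stricter than the commands above.

```shell
# Build a small sample file with DOS (CR LF) line endings.
printf 'one\r\ntwo\r\n' > sample_dos.txt

# A literal carriage return; command substitution strips only trailing newlines.
cr=$(printf '\015')

# Remove the carriage return that precedes each newline.
sed "s/${cr}\$//" sample_dos.txt > sample_unix.txt

# Dump both files: \r shows up in the first dump only.
od -c sample_dos.txt
od -c sample_unix.txt
```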

::Way Four::
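Using the dos2unix command:
$ dos2unix filename newfilename
However, this utility might not be available on all Unix flavours, and its arguments differ between implementations. The two-argument form shown here is the Solaris-style one. The dos2unix shipped with most Linux distributions converts every file named on the command line in place, and expects dos2unix -n filename newfilename when the result should go to a separate file.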
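
Because dos2unix may be missing, a script can fall back to tr when it is not found. The sketch below is one way to do that; to_unix is a hypothetical helper name, and it assumes that the installed dos2unix acts as a filter (reading standard input and writing standard output) when given no file names, which holds for the commonly packaged versions but should be checked on your system.

```shell
# to_unix INPUT OUTPUT
# Convert DOS line endings in INPUT and write the result to OUTPUT.
to_unix() {
    if command -v dos2unix >/dev/null 2>&1; then
        # Assumption: with no file arguments, dos2unix filters stdin to stdout.
        dos2unix < "$1" > "$2"
    else
        # Fallback: delete every carriage return.
        tr -d '\r' < "$1" > "$2"
    fi
}
```

Usage: to_unix filename newfilename. Note that the fallback deletes every carriage return in the file, not only those that precede a newline.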

::Way Five::
To remove the ^M characters from all files in a directory:
First go to the directory where the files with the junk characters reside. Then run the script below to process all the files.
for filename in *
do
tr -d '\r' < "$filename" > temp.$$ && mv temp.$$ "$filename"
done
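
A slightly more defensive variant of the same idea is sketched below. It skips subdirectories, leaves files that contain no ^M untouched, and writes the cleaned text back into the original file so that its permissions are kept. The function name strip_cr_dir is made up for this example, and the sketch assumes that mktemp is available, which is common but not guaranteed by POSIX.

```shell
# strip_cr_dir [DIRECTORY]
# Remove carriage returns from every regular file in DIRECTORY (default: .).
strip_cr_dir() {
    dir=${1:-.}
    cr=$(printf '\015')
    for f in "$dir"/*; do
        [ -f "$f" ] || continue              # skip directories and other non-regular files
        grep -q "$cr" "$f" || continue       # nothing to do if the file has no ^M
        tmp=$(mktemp "$dir/crfix.XXXXXX") || return 1
        # Write the cleaned text to a temporary file first, then copy it back
        # into the original so the original's permissions are preserved.
        if tr -d '\r' < "$f" > "$tmp" && cat "$tmp" > "$f"; then
            rm -f "$tmp"
        else
            rm -f "$tmp"
            return 1
        fi
    done
}
```

Usage: strip_cr_dir /path/to/files, or simply strip_cr_dir from inside the directory. Like the loop above, it does not touch hidden files, because * does not match names that begin with a dot.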

