looking for a tool to convert line endings
Hi,
I am looking for a simple tool that can
1) convert Windows (CR+LF) line endings to Unix (LF)
2) trim trailing whitespace at the end of each line
-
- Posts: 481
- Joined: Thu Apr 16, 2009 12:00 pm
- Location: Slovakia, EU
-
- Posts: 38
- Joined: Wed Sep 03, 2008 4:12 am
Re: looking for a tool to convert line endings
To convert from DOS to Unix and vice versa (I think the package also handles Mac text files, which, if I am not mistaken, end lines with a bare CR), there is dos2unix:
http://sourceforge.net/projects/dos2unix/
To trim spaces you could use a Perl script, which could in fact do the format conversion as well.
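Before picking a conversion it can help to see which of the three conventions a file actually uses. A rough sketch, assuming a POSIX shell with tr, wc and mktemp available; count_eol is a made-up helper name, not part of dos2unix:

```shell
# count_eol (hypothetical helper): report how many CR and LF bytes stdin contains.
count_eol() {
    tmp=$(mktemp) || return 1        # keep a copy so the data can be scanned twice
    cat > "$tmp"
    cr=$(tr -cd '\r' < "$tmp" | wc -c)   # delete everything except CR, count what is left
    lf=$(tr -cd '\n' < "$tmp" | wc -c)   # same for LF
    rm -f "$tmp"
    echo "CR=$((cr)) LF=$((lf))"
}

printf 'one\r\ntwo\r\n' | count_eol      # prints: CR=2 LF=2
```

Equal non-zero counts suggest Windows endings, CR=0 suggests Unix, and LF=0 with CR above zero suggests old Mac files. For a real file: count_eol < file.txt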
-
- Posts: 543
- Joined: Mon Jul 05, 2010 10:27 pm
Re: looking for a tool to convert line endings
rvida wrote:Hi,
I am looking for a simple tool that can
1) convert Windows (CR+LF) line endings to Unix (LF)
2) trim trailing whitespace at the end of each line
The whitespace has to come before the line ending, so by inverting the order of the tasks you can do both in a single step. Run:
Code: Select all
perl -pe 's/ *\r\n/\n/' inputfile
Hope it helps.
Regards
Ignacio
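The same single-step idea can be sketched with sed instead of perl. This assumes a Unix-like system where sed sees the CR as an ordinary character in front of the LF; clean_line_ends is just an illustrative name. It relies on the POSIX class [[:space:]] including CR, so the CR is removed together with the spaces and tabs in front of it:

```shell
# clean_line_ends (hypothetical name): filter stdin to stdout.
# [[:space:]] covers space, tab and CR, so one substitution
# trims the trailing whitespace and drops the CR of a CR+LF pair.
clean_line_ends() {
    sed 's/[[:space:]]*$//'
}

# Show the resulting bytes; no CR and no trailing blanks should remain.
printf 'foo \t\r\nbar\r\n' | clean_line_ends | od -c
```

Unlike a pattern that requires \r\n, this also trims lines that already end in a bare LF.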
-
- Posts: 3232
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Re: looking for a tool to convert line endings
rvida wrote:Hi,
I am looking for a simple tool that can
1) convert Windows (CR+LF) line endings to Unix (LF)
2) trim trailing whitespace at the end of each line
Regular expressions can do it.
First remove all trailing spaces and tabs at the end of each line with sed:
Code: Select all
sed 's/[ \t]*$//' file.txt > file_trimmed.txt
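On a file that still has CR+LF endings, the CR sits between the trailing whitespace and the end of the line, so to do both jobs strip the CR first and then trim (the \r and \t escapes assume GNU sed):
Code: Select all
sed 's/\r$//' file.txt | sed 's/[ \t]*$//' > file_trimmed.txt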
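The \t escape inside a sed pattern is not guaranteed by POSIX, and some non-GNU seds may treat it literally. A more portable sketch, assuming only a POSIX shell with tr, sed and printf, passes a real tab character to sed instead. unix_clean is a made-up name:

```shell
# unix_clean (hypothetical name): filter stdin to stdout.
unix_clean() {
    tab=$(printf '\t')                      # a literal tab, no sed escape needed
    # tr removes every CR (including stray ones in the middle of a line),
    # then sed trims the spaces and tabs left at the end of each line.
    tr -d '\r' | sed "s/[ $tab]*\$//"
}

printf 'x \t\r\ny\t \r\n' | unix_clean | od -c
```

For a real file: unix_clean < file.txt > file_trimmed.txt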
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: looking for a tool to convert line endings
rvida wrote:Hi,
I am looking for a simple tool that can
1) convert Windows (CR+LF) line endings to Unix (LF)
2) trim trailing whitespace at the end of each line
Linux has always had dos2unix and unix2dos commands. That what you want?
-
- Posts: 38
- Joined: Wed Sep 03, 2008 4:12 am
Re: looking for a tool to convert line endings
I am not fluent in Perl one-liners, but now I am jealous of the previous posts, so here is an (untested) script to convert to any OS-specific format:
Code: Select all
#!/usr/bin/perl
use strict;
use warnings;
# Usage: script input_file output_file SYSTEM   (SYSTEM: mac, unix or windows)
my ($in, $out, $system) = @ARGV;
my %eol = (mac => "\r", unix => "\n", windows => "\r\n");
my $ending = $eol{ lc($system // '') }
    or die "SYSTEM must be one of mac, unix or windows\n";
open(my $fh_in,  "<", $in)  or die "could not open $in: $!";
open(my $fh_out, ">", $out) or die "could not open $out: $!";
binmode($fh_in);     # keep the I/O layer from translating CR/LF itself
binmode($fh_out);
# Lines are split on LF, so CR-only (old Mac) input arrives as one long line.
while (my $line = <$fh_in>)
{
    $line =~ s/\s+$//;               # drop trailing whitespace, CR and LF included
    print $fh_out $line . $ending;   # append the requested line ending
}
close($fh_in);
close($fh_out);
Just save this as whatever_you_like.pl, then make it executable:
Code: Select all
chmod +x whatever_you_like.pl
and call it as:
Code: Select all
./whatever_you_like.pl input_file output_file SYSTEM
SYSTEM being any one of mac, unix or windows (case-insensitive). Again, untested; if it does not work, just get back here and someone will point out any errors.
edit: this assumes you are developing on unix/mac; if not, you have to call the script as:
Code: Select all
perl whatever_you_like.pl input_file output_file SYSTEM
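For comparison, roughly the same idea can be sketched as a shell filter. This is only an illustration under some assumptions: convert_eol is a made-up name, the input is expected to use LF or CR+LF (CR-only input is not split into lines), and awk is relied on to interpret the escape sequences passed with -v.

```shell
# convert_eol SYSTEM (hypothetical helper): filter stdin to stdout.
# SYSTEM is mac, unix or windows, case-insensitive.
convert_eol() {
    case $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]') in
        unix)    eol='\n' ;;
        windows) eol='\r\n' ;;
        mac)     eol='\r' ;;
        *) echo "usage: convert_eol mac|unix|windows" >&2; return 2 ;;
    esac
    # Drop every CR, trim trailing blanks, then let awk write the chosen
    # ending after each line (awk -v turns \r and \n into real characters).
    tr -d '\r' | awk -v ORS="$eol" '{ sub(/[ \t]+$/, ""); print }'
}

printf 'a \r\nb\t\r\n' | convert_eol unix | od -c
```

For a real file: convert_eol windows < input_file > output_file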
-
- Posts: 543
- Joined: Mon Jul 05, 2010 10:27 pm
Re: looking for a tool to convert line endings
bob wrote:
rvida wrote:Hi,
I am looking for a simple tool that can
1) convert Windows (CR+LF) line endings to Unix (LF)
2) trim trailing whitespace at the end of each line
Linux has always had dos2unix and unix2dos commands. That what you want?
Those programs are not installed by default, and the other problem is not solved: you still have spaces before the end of the line.
@Horacio: Nice coding.
My command misses tabs. This will do the trick from the command line:
Code: Select all
perl -pe 's/\s*\r\n/\n/' infile > outfile
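Note that a pattern requiring \r\n leaves lines that already end in a bare LF untouched, so it may be worth checking the result. A small sketch, assuming a POSIX grep; has_dirty_ends is a made-up name:

```shell
# has_dirty_ends (hypothetical helper): exit status 0 if any line on stdin
# still ends in whitespace; a leftover CR counts, since it is in [[:space:]].
has_dirty_ends() {
    grep -q '[[:space:]]$'
}

if printf 'ok\nbad \r\n' | has_dirty_ends; then
    echo "needs cleaning"
fi
```

For a real file: has_dirty_ends < outfile && echo "still not clean"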
-
- Posts: 481
- Joined: Thu Apr 16, 2009 12:00 pm
- Location: Slovakia, EU
Re: looking for a tool to convert line endings
Thanks for all the answers.
I decided to use the sed based solution. It is a wonderful tool, although for people coming from Dos/Windows world the syntax is somewhat obscure...
Btw. after some googling I found a list of very useful sed one liners:
http://sed.sourceforge.net/sed1line.txt
-
- Posts: 543
- Joined: Mon Jul 05, 2010 10:27 pm
Re: looking for a tool to convert line endings
rvida wrote:Thanks for all the answers.
I decided to use the sed based solution. It is a wonderful tool, although for people coming from Dos/Windows world the syntax is somewhat obscure...
Btw. after some googling I found a list of very useful sed one liners:
http://sed.sourceforge.net/sed1line.txt
Sure, sed is great.
There are small but important differences in how each program interprets regular expressions. Keeping track of those differences in your head is a mess, so it's common to stick to one or a few programs.
Regular expression (regex) syntax is obscure, but there is logic behind it, and crafting a regex can be great fun, like solving a chess problem!
Ignacio.
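One small illustration of such a difference, as a sketch: sed's default (basic) syntax treats "+" as an ordinary character, while the extended syntax selected with -E treats it as "one or more". The -E option is assumed to be available, as it is in GNU and BSD sed; the function names are made up for the demo.

```shell
# Basic regex: "a+" means the letter a followed by a literal plus sign.
bre_demo() { sed 's/a+/X/'; }
# Extended regex: "a+" means one or more letters a.
ere_demo() { sed -E 's/a+/X/'; }

printf 'aab a+b\n' | bre_demo    # prints: aab Xb
printf 'aab a+b\n' | ere_demo    # prints: Xb a+b
```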
-
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: looking for a tool to convert line endings
rvida wrote:I decided to use the sed based solution. It is a wonderful tool,
Hi Richard,
take care to do the two steps (CR/LF->LF conversion and trailing whitespace removal) separately, starting with the CR/LF part. Depending on the platform where you perform these conversions, "sed", "perl" and other tools using regular expressions may or may not recognize a CR/LF character sequence as something that matches "$" (end of input line) in the given pattern. Therefore a pattern logically resembling "<whitespace><whitespace>*$" may or may not match an input line that ends with <whitespace><CR><LF>. You can expect it to succeed in a typical Windows-like environment, where CR/LF is the usual text file line ending, but not in a typical UNIX environment. Furthermore, combining "<whitespace><whitespace>*<CR><LF>" into one pattern will not always succeed either, since line endings could be inconsistent within one file.
rvida wrote:although for people coming from Dos/Windows world the syntax is somewhat obscure...
Hmmm ... wasn't it an invention from the DOS world to have that CR/LF line ending that created one of the biggest (in)compatibility issues in the whole IT world?
Sven
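The ordering point can be tried out with a short sketch. It assumes a Unix-like environment where sed does no CR/LF translation of its own; trim and strip_cr are made-up names, and literal tab and CR characters are passed to sed so that no sed-specific escapes are needed.

```shell
tab=$(printf '\t')
# trim: remove spaces and tabs at the end of each line.
trim()     { sed "s/[ $tab]*\$//"; }
# strip_cr: remove one CR at the end of each line, if there is one,
# so files with mixed line endings are handled line by line.
strip_cr() { sed "s/$(printf '\r')\$//"; }

# Trimming first: the CR hides the blanks from "$", so they survive.
printf 'foo  \r\n' | trim | strip_cr | od -c
# CR first, then trim: only "foo" and the LF remain.
printf 'foo  \r\n' | strip_cr | trim | od -c
```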