Monday, August 20, 2007

Perl script to remove duplicate files created by itunes

The following scripts recursively runs and scans directories. If it finds a directory that has files: song.mp3 and song 1.mp3 or song 2.mp3 then it keeps song.mp3 and removes the others. It does a size check before removing just to make sure they are duplicates (it doesnt do content checking or bit by bit comparision).

By the way this is a windows specific script. it needs a bit of customization for unix, etc.
i am using fc to do a binary file compare on vista.

Begin Perl Script: removedupmusicfiles.pl



# This script scans the itunes music folder and removes duplicated files created by itunes.
# run this as: /perl removedupmusicfiles.pl

sub traverse($)
{
my($dir) = @_;


# print "\nTraverse: $dir";
removeDuplicates($dir);


local *DIR;
opendir (DIR, $dir) or die "Could not open directory $dir: $!";

while (defined(my $f = readdir (DIR)))
{
# Only recurse on directories, which do not start with '.', and skip symbolic links
if (-d "$dir/$f" &&
!(-l "$dir/$f") &&
($f !~ /^\.{1,2}$/)
)
{
traverse ("$dir/$f");
}
}

}

sub removeDuplicates($)
{
my($dir) = @_;
my %fnamehash;


# print "\nremoveDuplicates: $dir";

local *DIR;
opendir (DIR, $dir) or die "Could not open directory $dir: $!";

while (defined(my $f = readdir (DIR)))
{
#my $fsize = -s $f;
$fnamehash{$f} = $f;
}


foreach $f ( keys %fnamehash ) {

# if you find files like "foo 1.mp3"
if($f =~ /(.+)\s\d\.mp3/) {

#print "\n$dir: $f could possibly be duplicate of $orig";

$orig = "$dir/$1.mp3";


if(-e $orig) {
my $fsize = -s "$dir/$f";
my $osize = -s "$orig";

#print "\n$f Size: $fsize could be duplicate of $orig Size: $osize ";

if($osize == $fsize) {

my $cmd = "\"$orig\" \"$dir/$f\"";
$cmd =~ s/\//\\/g;
my $output = `fc /B $cmd`;
if ($output =~ /no differences encountered/){
my $name = "$dir/$f";
print "\n$dir/$f ( $fsize bytes) is a duplicate of $orig ( $osize bytes ) ";
print "\nRemoving: $name";

unlink($name) || die "Cannont unlink $name: $!";
}


}
}
}

}
}


$dir = shift;
traverse($dir);








End Perl Script: removedupmusicfiles.pl

No comments: