Using Perl and ExifTool to Access EXIF Data in Digital Images
Posted by Spencer in Programming on September 2nd, 2009
Overview
EXIF data in digital images is a fairly complete snapshot of the camera and flash settings at the time of exposure. The information is useful in many ways: automatic image organization, metadata repositories for advanced searching, and unique file renaming, to name a few. Nikon NEF files even store a full-size Basic JPG image in the EXIF information, along with a thumbnail-sized preview. Because of this, I never shoot RAW+JPG: I get that JPG for free with every RAW file anyway, and I just have to extract it. This article documents how to access this and other data using Perl and the ExifTool package, written by Phil Harvey (documented on his website, http://www.sno.phy.queensu.ca/~phil/exiftool/). Both are available as free downloads for several platforms, including Linux and Windows XP, and I have used both on both platforms.
Reading Data
To use the package, load it at the top of the Perl file:
#!/usr/bin/perl -w
use Image::ExifTool;
Reading data is fairly simple once a general process is established. Information is organized for the most part in a “key/value” structure, like a hash in Perl (in fact, the data returned by the ExifTool object is a reference to a hash). The following Perl code reads the Shutter Speed associated with a particular image:
my $srcfile = "/home/spencerkellis/dsc0001.jpg";
my $srcfileExif = new Image::ExifTool;
my @srcfileTagList = ('ShutterSpeed');
my $srcfileInfo = $srcfileExif->ImageInfo($srcfile,@srcfileTagList);
print "Shutterspeed: ".$$srcfileInfo{'ShutterSpeed'}."\n";
The code is straightforward enough; line 1 establishes a file (note the absolute path); line 2 instantiates an Image::ExifTool object; line 3 specifies which tags to read; line 4 actually reads the information; finally, line 5 prints the returned value.
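Because ImageInfo returns a hash reference, the `$$srcfileInfo{...}` form above can also be written with Perl's arrow syntax. The following is a minimal sketch of both forms. The `$info` hash and its tag values are made-up stand-ins for what ImageInfo would return, and `tag_or_default` is a hypothetical helper, not part of ExifTool.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for the hash reference that ImageInfo() returns; the tag
# values here are made up for illustration only.
my $info = {
    ShutterSpeed => '1/250',
    Aperture     => '5.6',
};

# These two forms read the same hash element through the reference.
my $old_style   = $$info{'ShutterSpeed'};
my $arrow_style = $info->{ShutterSpeed};

# Hypothetical helper: return the value for a tag, or a default
# when the tag is not present in the hash.
sub tag_or_default {
    my ($hashref, $tag, $default) = @_;
    return defined $hashref->{$tag} ? $hashref->{$tag} : $default;
}

# Print every tag that was returned, sorted by name.
foreach my $tag (sort keys %$info) {
    printf "%-14s %s\n", $tag, $info->{$tag};
}
print "ISO: ", tag_or_default($info, 'ISO', '(not present)'), "\n";
```

Checking for a missing tag before printing avoids "uninitialized value" warnings when an image does not carry the requested tag.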
A Simple Example: Renaming Files Based on EXIF Date and Time
One of my principal uses of Perl and ExifTool is to rename files automatically to a unique filename based on date and time, in the following format:
YYYYMMDD-HHMMSS-II.FFF
‘I’ stands for an “index”, to allow for cases in which multiple pictures were taken in the same second. ‘F’ represents the file format extension (e.g., JPG, or NEF for Nikon RAW files). The process of renaming files, tedious by hand, is simple and efficient to automate with Perl.
#!/usr/bin/perl -w
use File::Spec;
use Image::ExifTool;

# path to the image to rename, given on the command line
my $p = shift;

# extract date from EXIF
my @ioTagList = ('DateTimeOriginal');
my $exifTool = new Image::ExifTool;
$exifTool->Options(DateFormat => "%Y%m%d-%H%M%S-");
my $info = $exifTool->ImageInfo($p, @ioTagList);

# create new filename (the index is fixed at 00 in this simple example)
my $name = sprintf("%s00",$$info{'DateTimeOriginal'});
my ($pathvol,$pathdir,$pathfile) = File::Spec->splitpath($p);
# catpath is the inverse of splitpath, so relative paths stay relative
my $newfile = File::Spec->catpath($pathvol,$pathdir,$name.".JPG");

# rename the file
if( -f $newfile )
{
    print "skipping $p: already exists at $newfile\n";
    exit;    # inside a sub this would be "return"
}
if($p ne $newfile)
{
    rename($p,$newfile) or die "Error: could not rename $p to $newfile: $!\n";
}
Again, a straightforward example with a few extra lines to take note of. In this case, to simplify my Perl code, I used an ExifTool construct that allows options to be set which govern the output format:
$exifTool->Options(DateFormat => "%Y%m%d-%H%M%S-");
This option instructs ExifTool to output the date in a specific format corresponding to the filename format I described above. To keep the example simple, I did not include my method for handling multiple files with the same date and time; the code will simply skip renaming files where the target filename already exists. The sprintf line constructs the string holding the filename, and the File::Spec package is used to split apart and reconstruct paths. The code requires an argument (the path to a file); in my full script, this code is a sub, and is passed the filename for each entry in a directory.
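One possible way to fill in the two-digit index is sketched below. It is not the method used in the full script mentioned above, and `next_free_name` is a hypothetical helper. The sketch also uses POSIX `strftime` to show what the date format codes produce, since the DateFormat option takes strftime-style codes. The timestamp is an arbitrary example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;
use POSIX qw(strftime);

# Hypothetical helper: return the first unused path of the form
# STAMP-II.EXT inside $dir, trying indices 00 through 99.
# Returns undef (in scalar context) if all of them are taken.
sub next_free_name {
    my ($dir, $stamp, $ext) = @_;
    for my $i (0 .. 99) {
        my $file      = sprintf("%s-%02d.%s", $stamp, $i, $ext);
        my $candidate = File::Spec->catfile($dir, $file);
        return $candidate unless -e $candidate;
    }
    return;
}

# Arguments after the format: sec, min, hour, day of month,
# month (0-11), year minus 1900. This is 2005-10-08 13:36:01.
my $stamp = strftime("%Y%m%d-%H%M%S", 1, 36, 13, 8, 9, 105);
print "$stamp\n";    # prints 20051008-133601
```

Calling `next_free_name($dir, $stamp, 'JPG')` just before the rename would give the second picture taken in the same second an index of 01 instead of skipping it.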
Command Line Alternative
After spending quite a lot of time getting to know the ExifTool package in Perl, I started to think about ways to do it with simpler code. As it turns out, using the command-line version of exiftool can result in much cleaner code. Consider the following, which is virtually the same Perl script as above, except that it runs the exiftool executable instead of calling the Perl module inline:
use File::Spec;

my $p = shift;

# extract date from EXIF (the quotes protect paths that contain spaces)
my $date = `exiftool -d %Y%m%d-%H%M%S- -DateTimeOriginal -S -s "$p"`;
chomp $date;
$date .= "00";
my ($pathvol,$pathdir,$pathfile) = File::Spec->splitpath($p);
# catpath is the inverse of splitpath, so relative paths stay relative
my $newfile = File::Spec->catpath($pathvol,$pathdir,$date.".JPG");

# rename the file
if( -f $newfile )
{
    print "skipping $p: already exists at $newfile\n";
    exit;    # inside a sub this would be "return"
}
if($p ne $newfile)
{
    rename($p,$newfile) or die "Error: could not rename $p to $newfile: $!\n";
}
Notice that the entire chunk of code instantiating the ExifTool object, etc. has been replaced by one line calling the executable in backticks. I haven’t done any testing to analyze which is more efficient, but it does look better, and it’s easier to conceptualize.
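One caveat with backticks is that the command goes through the shell, so a filename containing spaces or shell metacharacters can be split or misread. A minimal sketch of an alternative is the list form of a piped open, which passes each argument to the program untouched. `capture` is a hypothetical helper name, and the exiftool call only runs when a file is given on the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: run a command without a shell and return its
# output with one trailing newline removed. Each list element reaches
# the program as exactly one argument.
sub capture {
    my @cmd = @_;
    open(my $pipe, '-|', @cmd) or die "cannot run $cmd[0]: $!\n";
    my $out = do { local $/; <$pipe> };    # read all output at once
    close($pipe);
    $out = '' unless defined $out;
    chomp $out;
    return $out;
}

# Same exiftool arguments as the backtick version above.
if (@ARGV) {
    my $date = capture('exiftool', '-d', '%Y%m%d-%H%M%S-',
                       '-DateTimeOriginal', '-S', '-s', $ARGV[0]);
    print "$date\n";
}
```

The trade-off is a few more lines in exchange for not having to quote or escape the path.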
EDIT 3 Sept 2009: Updated Command Line Alternative
Thanks so much to Phil Harvey, the author of ExifTool, for reading and commenting on this article. He suggested a much cleaner command-line alternative, so disregard the Perl script above!
exiftool -d %Y%m%d-%H%M%S-%%.2c.%%e "-FileName<DateTimeOriginal" FILE
Writing Data
Let’s pose an issue based on a problem I faced some time back. Consider an automated process that copies NEF files into a source directory (all uniquely renamed), and creates a smaller JPG file specifically sized for the web (about 600×400). The process of copying and resizing, however, does not preserve the EXIF data, and I would like to restore the information for possible inclusion into my website’s database.
To do this, we need (1) the original EXIF information from the NEF source file; and (2) the ability to write or copy that EXIF data into the destination JPG file. The following Perl code accomplishes this task. It works on an image in the form of a “blob,” in this case what has been returned from the ImageMagick function “ImageToBlob()”, which will be discussed in a different article soon.
#!/usr/bin/perl -w
use Image::ExifTool;
use Image::Magick;

my $srcfile = "/home/spencerkellis/20051008-133601-00.nef";
# webfile should not exist yet!
my $webfile = "/home/spencerkellis/20051008-133601-00.jpg";

# process image as needed
my $IM = Image::Magick->new(magick=>'jpg');
$IM->Read($srcfile);
# ... resize here ... #

# create blob
my $final_blob = $IM->ImageToBlob();

# copy exif
my $exifTool = new Image::ExifTool;
$exifTool->SetNewValuesFromFile($srcfile);
$exifTool->WriteInfo(\$final_blob,$webfile);
All of the Image::Magick section will be discussed in a different article. Basically, it’s a package to perform image manipulation in Perl (and other languages), the same as if you had opened the image in GIMP or Photoshop. The last two lines are where the magic happens; after instantiating the ExifTool object, the second-to-last line retrieves the EXIF data from the NEF source file, and the last line creates a new file using the image information in $final_blob (the backslash preceding the variable creates a reference to the variable) and the EXIF data already stored in the $exifTool object. $webfile now has the same EXIF data as $srcfile!
Extracting Basic JPG from NEF
Accounting for about 700KB of a NEF’s size (usually around 5MB) is a full-sized Basic JPG file. Incorporating the ability to extract this file into an automated post-processing phase means never shooting RAW+JPG, which further means more space on a compact flash card! Not to mention potential space savings on a hard drive, and potential nightmares keeping track of which files have both NEF and JPG vs. NEF only or JPG only, vs. what has been edited for print or for web… the list goes on, and it’s a battle every digital photographer tackles at some point.
Using Perl and ExifTool, I wrote a script which will extract these Basic JPGs automatically. The following code is incorporated into a larger script, but it shows the basic idea. I also want to note that there are simpler ways of running batch jobs to get JPGs out of NEF files; for instance, I have used and enjoyed Udi Fuchs’s UFRaw package on the command line in batch mode, and it runs fairly quickly with pretty good output.
#!/usr/bin/perl -w
use Image::Magick;
use Image::ExifTool;

my $srcfile = "/home/spencerkellis/dsc0001.nef";
my $webfile = "/home/spencerkellis/dsc0001.jpg";

# extract the embedded JPG as binary data
my @ioTagList = ('JpgFromRaw');
my $exifTool = new Image::ExifTool;
$exifTool->Options(Binary=>1);
my $info = $exifTool->ImageInfo($srcfile, @ioTagList);
die "no JpgFromRaw tag in $srcfile\n" unless defined $$info{'JpgFromRaw'};

# manipulation
my $IM = Image::Magick->new(magick=>'jpg');
$IM->BlobToImage(${$$info{'JpgFromRaw'}});
# ... do manipulation here ... #
my $final_blob = $IM->ImageToBlob();

# restore exif; WriteInfo takes a reference to the in-memory image
$exifTool->SetNewValuesFromFile($srcfile);
$exifTool->WriteInfo(\$final_blob,$webfile);
undef $IM;
undef $final_blob;
There are a few things to note here. First, in order to extract binary data (i.e., a JPG image), we have to set the Binary option in the ExifTool object to 1 (versus the default of 0). Since in my script I’m manipulating the image using the ImageMagick package, I instantiate the extracted JPG as a new ImageMagick object, do some manipulation, then export it as a blob. Then, since EXIF data has not been preserved, I restore it using ExifTool. The extracted JPG is finally written using ExifTool to the path specified as $webfile.
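If no manipulation is needed, the extracted JPG can be written straight to disk without ImageMagick. With the Binary option set, the value comes back as a scalar reference, which is why the code above dereferences it with `${...}`. The following is a minimal sketch along those lines. `save_binary_tag` is a hypothetical helper, and unlike the script above it does not copy the EXIF data into the new file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: write the binary value stored under $tag in the
# $info hash reference to $path. Returns the number of bytes written,
# or 0 if the tag is absent or is not a scalar reference.
sub save_binary_tag {
    my ($info, $tag, $path) = @_;
    my $ref = $info->{$tag};
    return 0 unless ref($ref) eq 'SCALAR';
    open(my $out, '>', $path) or die "cannot write $path: $!\n";
    binmode($out);                 # no newline translation on Windows
    print {$out} $$ref;
    close($out) or die "cannot close $path: $!\n";
    return length($$ref);
}
```

With `$info` obtained as in the script above, `save_binary_tag($info, 'JpgFromRaw', $webfile)` would write the embedded JPG to `$webfile`.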
P.S. This article is a transfer of most of the content of an article on my old website.