Archive for category Programming

Using Perl and ExifTool to Access EXIF Data in Digital Images

Overview

EXIF data in digital images is a fairly complete snapshot of the camera and flash settings at the time of exposure. The information is useful in many ways: automatic image organization, metadata repositories for advanced searching, and unique file renaming, to name a few. In fact, Nikon NEF files have a full-size Basic JPG image stored in the EXIF information, along with a thumbnail-sized preview. Because of this, I never shoot RAW+JPG: I get that JPG for free with every RAW file anyway, and I just have to extract it. This article documents how to access this and other data using Perl and the ExifTool package, written by Phil Harvey (documented on his website, http://www.sno.phy.queensu.ca/~phil/exiftool/). Both are available as free downloads for several platforms, including Linux and Windows XP, and I have used both on both platforms.

Reading Data

To use the package, include it at the top of the perl file:

#!/usr/bin/perl -w
use Image::ExifTool;

Reading data is fairly simple once a general process is established. Information is organized for the most part in a “key/value” structure, like a hash in perl (and in fact, data returned by the exiftool object is represented by a hash structure). The following perl code reads the Shutter Speed associated with a particular image:

my $srcfile = "/home/spencerkellis/dsc0001.jpg";
my $srcfileExif = new Image::ExifTool;
my @srcfileTagList = ('ShutterSpeed');
my $srcfileInfo = $srcfileExif->ImageInfo($srcfile,@srcfileTagList);
print "Shutterspeed: ".$$srcfileInfo{'ShutterSpeed'};

The code is straightforward enough; line 1 establishes a file (note the absolute path); line 2 instantiates an Image::ExifTool object; line 3 specifies which tags to read; line 4 actually reads the information; finally, line 5 prints the returned value.

A Simple Example: Renaming Files Based on EXIF Date and Time

One of my principal uses of perl and exiftool is to rename files automatically to a unique filename based on date and time, in the following format:

YYYYMMDD-HHMMSS-II.FFF

‘I’ stands for an “index”, to allow for cases in which multiple pictures were taken in the same second. ‘F’ represents the file format extension (i.e., JPG or NEF for Nikon RAW files). The process of renaming files, tedious by hand, is simple and efficient to automate with perl.

#!/usr/bin/perl -w
use File::Spec;
use Image::ExifTool;

my $p = shift or die "usage: $0 FILE\n";

# extract date from EXIF
my @ioTagList = ('DateTimeOriginal');
my $exifTool = new Image::ExifTool;
$exifTool->Options(DateFormat => "%Y%m%d-%H%M%S-");
my $info = $exifTool->ImageInfo($p, @ioTagList);
defined $$info{'DateTimeOriginal'} or die "Error: no DateTimeOriginal in $p\n";

# create new filename (the index is fixed at 00 in this simple example)
my $name = sprintf("%s00", $$info{'DateTimeOriginal'});
my ($pathvol, $pathdir, $pathfile) = File::Spec->splitpath($p);
# catpath is the counterpart of splitpath and keeps relative paths relative
my $newfile = File::Spec->catpath($pathvol, $pathdir, $name . ".JPG");

# write files
if ( -f $newfile )
{
    print "skipping $p: already exists at $newfile\n";
    exit;    # "return" in my full script, where this code is a sub
}
if ($p ne $newfile)
{
    rename($p, $newfile) or die "Error: could not rename $p to $newfile: $!\n";
}

Again, a straightforward example with a few extra lines to take note of. In this case, to simplify my perl code, I used an exiftool construct that allows options governing the output format to be set:

$exifTool->Options(DateFormat => "%Y%m%d-%H%M%S-");

This option instructs exiftool to output the date in a format matching the filename format I described above. To keep the example simple, I did not include my method for handling multiple files with the same date and time; the code simply skips files whose target filename already exists. The sprintf line constructs the string holding the filename, and the File::Spec package is used to split apart and reconstruct paths. The code requires an argument (the path to a file); in my full script, this code is a sub that is passed the filename for each entry in a directory.
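For the same-second case that the script above skips, one possible approach is to bump the two-digit index until the name is free. The following is only a sketch, written in Python rather than perl, and it is not the method from my full script: the function name unique_name and the set of taken names (which stands in for a real check against the directory) are made up for illustration.

```python
from datetime import datetime


def unique_name(taken_at, ext, taken):
    """Return YYYYMMDD-HHMMSS-II.EXT using the lowest index II not in `taken`.

    `taken` is a set of names already in use; it stands in for a real
    filesystem check (hypothetical, for illustration only).
    """
    stamp = taken_at.strftime("%Y%m%d-%H%M%S")
    for index in range(100):  # two digits allow indexes 00..99
        candidate = f"{stamp}-{index:02d}.{ext}"
        if candidate not in taken:
            return candidate
    raise ValueError(f"more than 100 files share the timestamp {stamp}")


existing = {"20051008-133601-00.JPG"}
shot = datetime(2005, 10, 8, 13, 36, 1)
print(unique_name(shot, "JPG", existing))  # prints 20051008-133601-01.JPG
```

The loop stops at 99 because the filename format only leaves room for two index digits.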

Command Line Alternative

After spending quite a lot of time getting to know the exiftool package in perl, I started to think about ways to do it with simpler code. As it turns out, using the command line version of exiftool can result in much cleaner code. Consider the following, which is virtually the same perl script as above, except that it runs the exiftool executable instead of inline perl code:

use File::Spec;

my $p = shift;

# extract date from EXIF (the quotes protect paths containing spaces)
my $date = `exiftool -d %Y%m%d-%H%M%S- -DateTimeOriginal -S -s "$p"`;
chomp $date;
$date .= "00";
my ($pathvol,$pathdir,$pathfile) = File::Spec->splitpath($p);
# catpath is the counterpart of splitpath and keeps relative paths relative
my $newfile = File::Spec->catpath($pathvol,$pathdir,$date.".JPG");

# write files
if( -f $newfile )
{
    print "skipping $p: already exists at $newfile\n";
    exit;    # "return" in my full script, where this code is a sub
}
if($p ne $newfile)
{
    rename($p,$newfile) or die "Error: could not rename $p to $newfile: $!\n";
}

Notice that the entire chunk of code instantiating the exiftool object, etc. has been replaced by one line calling the executable in backticks. I haven’t done any testing to analyze which is more efficient, but it does look better, and it’s easier to conceptualize.
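One caveat with the backtick approach: building a shell command as a single string gets fragile once filenames contain quotes or other shell metacharacters. A common way around this is to pass the arguments as a list so that no shell is involved. Here is a rough sketch of that idea in Python; the function names are hypothetical, the flags are simply the ones used above, and it assumes exiftool is on the PATH.

```python
import subprocess


def exiftool_date_args(path):
    """Build the argument list for the exiftool call used above."""
    return [
        "exiftool",
        "-d", "%Y%m%d-%H%M%S-",  # same date format as in the perl script
        "-DateTimeOriginal",
        "-S", "-s",              # same output flags as in the perl script
        path,                    # passed as one argument, whatever it contains
    ]


def read_date_prefix(path):
    """Run exiftool (assumed to be on PATH) and return its trimmed output."""
    result = subprocess.run(
        exiftool_date_args(path), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

Because the path is a separate list element, a name like "my photo.jpg" reaches exiftool intact without any quoting.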

EDIT 3 Sept 2009: Updated Command Line Alternative

Thanks so much to Phil Harvey, the author of ExifTool, for reading and commenting on this article. He suggested a much cleaner command-line alternative, so disregard the perl script above! :)

exiftool -d %Y%m%d-%H%M%S-%%.2c.%%e "-FileName<DateTimeOriginal" FILE

Writing Data

Let’s pose an issue based on a problem I faced some time back. Consider an automated process that copies NEF files into a source directory (all uniquely renamed), and creates a smaller JPG file specifically sized for the web (about 600×400). The process of copying and resizing, however, does not preserve the EXIF data, and I would like to restore the information for possible inclusion into my website’s database.

In order to do this, we need (1) the original EXIF information from the NEF source file; and (2) the ability to write or copy that EXIF data into the destination JPG file. The following perl accomplishes this task. The code below uses an image in the form of a “blob,” in this case what has been returned from the ImageMagick function “ImageToBlob()” which will be discussed in a different article soon.

#!/usr/bin/perl -w
use Image::ExifTool;
use Image::Magick;

$srcfile = "/home/spencerkellis/20051008-133601-00.nef";

#webfile should not exist yet!
$webfile = "/home/spencerkellis/20051008-133601-00.jpg";

#process image as needed
my $IM = Image::Magick->new(magick=>'jpg');
$IM->Read($srcfile);

# ... resize here ... #

#create blob
my $final_blob = $IM->ImageToBlob();

#copy exif
my $exifTool = new Image::ExifTool;
$exifTool->SetNewValuesFromFile($srcfile);
$exifTool->WriteInfo(\$final_blob,$webfile);

All of the Image::Magick section will be discussed in a different article. Basically, it’s a package for image manipulation in perl (and other languages), much as if you had opened the image in GIMP or Photoshop. The last two lines are where the magic happens: after instantiating the exiftool object, the second-to-last line retrieves the EXIF data from the NEF source file, and the last line creates a new file using the image data in $final_blob (the backslash preceding the variable creates a reference to the variable) and the EXIF data already stored in the $exifTool object. $webfile now has the same EXIF data as $srcfile!

Extracting Basic JPG from NEF

Accounting for about 700KB of a NEF’s size (usually around 5MB) is a full-sized Basic JPG file. Incorporating the ability to extract this file into an automated post-processing phase means never shooting RAW+JPG, which further means more space on a compact flash card! Not to mention potential space savings on a hard drive, and potential nightmares keeping track of which files have both NEF and JPG vs. NEF only or JPG only, vs. what has been edited for print or for web… the list goes on, and it’s a battle every digital photographer tackles at some point.

Using Perl and ExifTool, I wrote a script which will extract these basic JPGs automatically. The following code is incorporated into a larger script, but it shows the basic idea. I also want to note that there are simpler ways of running batch jobs to get JPGs out of NEF files; for instance, I have used and enjoyed Udi Fuchs’s UFRaw package on the command line in batch mode, and it runs fairly quickly with pretty good output.

#!/usr/bin/perl -w
use Image::Magick;
use Image::ExifTool;

my $srcfile = "/home/spencerkellis/dsc0001.nef";
my $webfile = "/home/spencerkellis/dsc0001.jpg";
my @ioTagList = ('JpgFromRaw');
my $exifTool = new Image::ExifTool;
$exifTool->Options(Binary=>1);
my $info = $exifTool->ImageInfo($srcfile, @ioTagList);

#manipulation
$IM = Image::Magick->new(magick=>'jpg');
$IM->BlobToImage(${$$info{'JpgFromRaw'}});

# ... do manipulation here ... #

$final_blob = $IM->ImageToBlob();

#restore exif
$exifTool->SetNewValuesFromFile($srcfile);
$exifTool->WriteInfo(\$final_blob,$webfile);
undef $IM;
undef $final_blob;

There are a few things to note here. First, in order to extract binary data (i.e., a JPG image), we have to set the Binary option in the ExifTool object to 1 (versus the default of 0). Since in my script I’m manipulating the image using the ImageMagick package, I instantiate the extracted JPG as a new ImageMagick object, do some manipulation, then export it as a blob. Then, since EXIF data has not been preserved, I restore it using ExifTool. The extracted JPG is finally written using ExifTool to the path specified as $webfile.

P.S. This article is a transfer of most of the content of an article on my old website.


Merging Google Syntax Highlighter with TinyMCE

Syntax Highlighting in TinyMCE

TinyMCE is a powerful in-browser WYSIWYG editor. It’s used in well-known platforms such as WordPress to allow users the ability to edit blog posts right in the browser. I’ve been using TinyMCE for several years now, but until today I hadn’t found a decent solution for adding code snippets with syntax highlighting inside TinyMCE.

Google SyntaxHighlighter

Enter Google SyntaxHighlighter. You may have noticed the signature appearance on several web-design related websites: nettuts.com, davidwalsh.name, and scriptandstyle.com, just to name a few. It integrates several attractive features – decent syntax highlighting, cross-browser (all javascript/css), and the option to view plain text.

SyntaxHighlighter can be configured to use either ‘pre’ or ‘textarea’ elements (see this discussion for more details on the choice). In either case, add two attributes to the element and you’re set:

<pre name="code" class="javascript"></pre>

Unfortunately, working with textareas in TinyMCE is awkward at best (consider – what other use could there possibly be for textareas inside a WYSIWYG editor?). Okay, no textareas – no problem, just switch to the pre element! Here’s where TinyMCE’s powerful featureset gets in the way: the ‘name’ attribute isn’t technically supported for the pre tag, and TinyMCE will strip it from your HTML if you try and add it by viewing the code.

The cleanest way to get around this problem is to add an extended_valid_elements to your tinyMCE init, and include pre[name] in the element list. TinyMCE will merge the extended_valid_elements with the default valid_elements to allow the name attribute along with already-allowed attributes.

tinyMCE.init({
mode : "textareas",
theme : "advanced",
extended_valid_elements : "pre[name]"
});

Be aware that caching can make it seem like your changes aren’t making any difference! If in doubt, clear your cache.

RichGuk’s syntaxhl TinyMCE plugin

There’s an easier way than editing HTML every time you want to add a code snippet. I found a great plugin – syntaxhl by RichGuk (installation instructions are included with the download).

By default, his plugin uses textareas but changing it to use pre tags is simple. Edit syntaxhl/js/dialog.js and replace all instances of the textarea tag with a pre tag (there are only two instances, opening and closing tags). The final version is shown below:

f.syntaxhl_code.value = f.syntaxhl_code.value.replace(/</g,'&lt;');
f.syntaxhl_code.value = f.syntaxhl_code.value.replace(/>/g,'&gt;');
textarea_output = '<pre name="code" ';
textarea_output += 'class="' + f.syntaxhl_language.value + options + '" cols="50" rows="15">';
textarea_output +=  f.syntaxhl_code.value;
textarea_output += '</pre> '; /* note space at the end, had a bug it was inserting twice? */
tinyMCEPopup.editor.execCommand('mceInsertContent', false, textarea_output);
tinyMCEPopup.close();

You’ll need to get the newline and br options set up correctly to preserve whitespace in your code snippets.

Fully Integrated Syntax Highlighting

With syntaxhl integrated and working, I have single-button access to UI-level syntax highlighting. Writing tutorials is infinitely easier with a simple solution for sharing code. If you’re interested, I found a few alternatives along the way that might be better suited to your needs:

I hope this article has been useful. Let me know in the comments if you have suggestions or questions!

P.S. This article is a transfer of most of the content of an article on my old website.


MATLAB Tips and Tricks

Over the course of my graduate work, I’ve spent a fair amount of time with MATLAB.  These are a few small tricks I wish I had known from the beginning.

Preallocation

I am a believer in preallocation.  For a particular application, I read about 13GB of data from a file into a 4-D matrix (this was running on a machine with 32GB of memory).  Before preallocation, I let the process run for about 10 hours, and it still hadn’t finished.  With preallocation the process finished in about 20 minutes.  That’s at least 30 times faster (600 minutes versus 20)!

Permute

The permute function can be quite handy – it can shuffle around the dimensions in a matrix with a single function call.  Consider the following matrix:

m_one = rand([2 5 4000]);

The size of this matrix is reported as 2x5x4000 in MATLAB. Now, we can shuffle the dimensions. Let’s make it a 4000x2x5 array:

m_one = permute(m_one, [3 1 2]);

The permute() function takes the array to shuffle around, and the new order of dimensions. In this case, the 3rd dimension moved to become first, 1st dimension second, and 2nd dimension last so that a 2x5x4000 matrix becomes a 4000x2x5 matrix.
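The bookkeeping behind the order vector can be spelled out in a few lines. This is just a sketch in Python of how the sizes get rearranged, not of what MATLAB does internally; permuted_size is a made-up helper name, and it only handles order vectors that list each existing dimension exactly once.

```python
def permuted_size(size, order):
    """Size after permute(A, order); `order` is 1-based, as in MATLAB."""
    if sorted(order) != list(range(1, len(size) + 1)):
        raise ValueError("order must use each dimension exactly once")
    # The i-th new dimension takes its length from old dimension order[i]
    return [size[d - 1] for d in order]


print(permuted_size([2, 5, 4000], [3, 1, 2]))  # prints [4000, 2, 5]
```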

Be careful, though: as handy as permute can be, it’s easy to use it inefficiently.  Remember that 13GB 4-D matrix?  I ran permute on that, and memory usage immediately doubled.  In general, I recommend creating the data the right way first! It will save a lot of headache (and RAM) down the road.

Vector notation

If you desperately need only a subset of dimensions, an alternative solution is to use MATLAB’s built-in, efficient vector notation.  For example, to extract the first and third dimensions for a single 2nd-dimension element, just use

m_two = m_one(:,1,:);  

The one downside here is that you’ll end up with an annoying singleton dimension that can frustrate other built-in functions like plot. The squeeze function will rescue us.

Squeeze

Squeeze is cool.  After running the previous code, size(m_two) shows us that m_two is a 4000x1x5 matrix.  We could use indexing to access all these elements, but squeeze will make life much easier – it will remove the singleton dimension in the middle.

m_two = squeeze(m_two);  

Now, size(m_two) tells us we’ve got a 4000×5 matrix and using the matrix just got that much simpler.
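For the size bookkeeping, squeeze boils down to dropping every dimension of length 1. Here is a small Python sketch of that rule as I understand it from the MATLAB documentation; squeezed_size is a hypothetical helper, and the two special cases are noted in the comments.

```python
def squeezed_size(size):
    """Size after squeeze: singleton dimensions are dropped."""
    if len(size) <= 2:
        return list(size)  # vectors and 2-D matrices are left alone
    kept = [n for n in size if n != 1]
    # MATLAB arrays always have at least two dimensions, so pad with 1s
    return kept + [1] * (2 - len(kept))


print(squeezed_size([4000, 1, 5]))  # prints [4000, 5]
```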

Removing elements from vectors and matrices

There are times when you want to discard elements from a vector or matrix.  I used to do this by creating a new variable to hold just the elements I wanted to keep.  Obviously, there’s a better way.  Let’s remove all the elements of a matrix that are less than 0.5.  It’s insanely easy:

m_three = rand(1,1000);
m_three( m_three < 0.5 ) = [];

Now, size(m_three) gives something close to 1x500; the exact count changes from run to run because the data are random.
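The same idea, keeping only the elements that fail the test, carries over to other languages. Below is a rough Python equivalent using a list comprehension; the fixed seed is my own addition so that repeated runs produce the same data, and it is not part of the MATLAB example.

```python
import random

random.seed(0)  # fixed seed only so repeated runs give the same data
m_three = [random.random() for _ in range(1000)]

# Keep the elements that are NOT below 0.5, i.e. delete the rest
m_three = [x for x in m_three if x >= 0.5]

print(len(m_three))  # roughly 500; the exact count depends on the data
```

Unlike the MATLAB version, this builds a new list instead of deleting in place, but the result holds the same elements in the same order.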

Conclusions

If you haven't noticed yet, MATLAB is all about the matrices. Understanding how to efficiently operate on subsets of matrices will give you huge returns in performance. Learn how and when to use permute, squeeze, and vector notation and you'll be well on your way. Anything else you think should be on this page? Let me know in the comments!

P.S. This article is a transfer of most of the content of an article on my old website.


RSSPhoto: A WordPress Widget

I’ve made a WordPress widget called RSSPhoto that pulls images from photoblog RSS or Atom feeds and displays a thumbnail in the sidebar (I’m using it in my sidebar).  I mentioned the beginnings of this project a couple of posts back.  That one was pulling images straight out of my database, though.  This version is more generic and uses the SimplePie PHP library for parsing feeds.

Even though I say it pulls from photoblog feeds, really it can pull from any feed–it will just pull one of the images out of the content of a feed item.  I’ve used it to pull Flickr RSS feeds and a couple other random ones.  I also installed it easily on Emily’s blog.

Check it out on my new Projects page (direct link).  If you have a WordPress blog, try it out and let me know how your experience was.  I’d definitely appreciate any feedback.


Plugins and Widgets and WordPress–Oh my!

From my last post you know I’m trying to figure out the best configuration of my domain and subdomains.  I’m leaning toward the blog as my principal domain content with the photography blog as a linked subdomain.


To sleep well at night with this change, I need a good way to interface my photography with the blog.  After testing a few options, I decided to try developing a WordPress widget to display the most recent photo.  It was surprisingly easy!  I thought I might document a bit about the design of a WordPress widget.  I know my audience (consisting mostly of family) has really been hoping for a technical article with some code to sink their teeth into.  Right?

To really understand widgets, you should play around with them first in your own admin section, under Appearance->Widgets.  You can drag and drop from available widgets to the sidebar or footer to define the contents of those areas of your blog. You’ll notice when you drag a widget, a form appears with some fields to define.

On to creating a widget.  First, make a directory to hold all the plugin files in the wp-content/plugins directory.  Mine, for example, is wp-content/plugins/skphoto/.

Second, you need to define a class that extends WP_Widget, and includes a constructor function, widget function, update function, and form function.  This same file should also register the widget.  Here’s an outline of what that looks like:

class WidgetName extends WP_Widget
{
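  /* constructor (the id and name passed to the parent are placeholders) */
  function __construct() {
    parent::__construct('widget_name', 'Widget Name');
  }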

  /* display the widget */
  function widget($args, $instance) {}

  /* save widget settings */
  function update($new_instance, $old_instance) {}

  /* edit form for the widget */
  function form($instance) {}
}

function WidgetNameInit()
{
  register_widget('WidgetName');
}
add_action('widgets_init', 'WidgetNameInit');

See Jesse Altman’s tutorial post for details on these functions.

I do want to give a bit more information about the widget function. If you want to access your MySQL database from within the widget function, you’ll need to add global $wpdb; to the top of your function.  Then, you can use standard WordPress database functions to pull data out.

Here are the basics of my widget function for reference:

  function widget($args, $instance){
    global $wpdb;
    extract($args);
    $title = apply_filters('widget_title', empty($instance['title']) ? ' ' : $instance['title']);
    $feed_url = empty($instance['feed_url']) ? 'http://photography.spencerkellis.net/atom.php' : $instance['feed_url'];

    # Before the widget
    echo $before_widget;

    # The title
    if ( $title )
    echo $before_title . $title . $after_title;

    $sql = "SELECT `path` FROM `photos` ORDER BY `date` DESC LIMIT 1";
    $path = $wpdb->get_var($sql);
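    # Output the photo; the original markup is missing here, so a plain img tag is assumed
    echo '<img src="' . esc_url($path) . '" alt="" />';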

    # After the widget
    echo $after_widget;
  }

Finally, activate your new plugin in the WordPress Plugins section. Then go back to the Widgets page, where the new widget will be available to drag and drop.

So there it is.  Easy-peezy, right?  Just a few lines of code, as I always like to say.  Here are a few resources if you want to explore on your own:

So I’ll be looking for your thoughts.  What do you think about this solution for a home page?
