Collecting an image file from the internet provides a couple of obvious pieces of information:
- The objects and/or persons shown in the picture (which can be identified or unidentified)
- The URL from where the image file was collected
In addition, the web page in which the image file was embedded may carry metadata that can tell you things like:
- the name of the photographer
- the geographic location where the photo was taken
- the time and date of the occasion
- the reason for taking the photo
- the purpose of publishing the photo
- the message communicated along with the photo (which could be political propaganda, commercial marketing etc)
Not many people are aware of the additional information that can be read from an image file, because metadata is saved in the file automatically. Embedding metadata in the actual image file avoids the risk of the image and its metadata being separated by mistake. In common speech, such data is referred to as EXIF data and IPTC data, since it is stored in the image file in accordance with two specifications: EXIF (Exchangeable image file format) and IPTC IIM (International Press Telecommunications Council Information Interchange Model). In addition to these, there is a newer standard called XMP (Extensible Metadata Platform), created by Adobe Systems Inc. Serialized XMP can be embedded into several kinds of files without breaking their readability by applications that are not XMP-aware. XMP information is typically included alongside EXIF and IPTC IIM data.
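To make the coexistence of the three formats concrete, here is a minimal Python sketch that walks the header segments of a JPEG file and reports which metadata blocks seem to be present. It assumes the usual conventions: EXIF in an APP1 segment starting with `Exif\0\0`, XMP in an APP1 segment starting with Adobe's namespace string, and IPTC IIM inside an APP13 "Photoshop 3.0" block. The function name `find_metadata_blocks` is my own, and the sketch only detects the containers; it does not decode their contents.

```python
import struct

# Identifier prefixes that conventionally open each metadata payload in a JPEG.
# Key: (marker byte, payload prefix). APP1 = 0xE1, APP13 = 0xED.
SIGNATURES = {
    (0xE1, b"Exif\x00\x00"): "EXIF",
    (0xE1, b"http://ns.adobe.com/xap/1.0/\x00"): "XMP",
    # APP13 is the Photoshop resource container that usually holds IPTC IIM.
    (0xED, b"Photoshop 3.0\x00"): "IPTC IIM",
}


def find_metadata_blocks(data: bytes) -> list[str]:
    """Return the names of metadata blocks found before the image scan data."""
    if data[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        raise ValueError("not a JPEG stream")
    found = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # lost sync with the marker structure; stop scanning
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):  # start of scan / end of image
            break
        # The 2-byte big-endian length includes the length field itself.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        for (sig_marker, prefix), name in SIGNATURES.items():
            if marker == sig_marker and payload.startswith(prefix):
                found.append(name)
        pos += 2 + length
    return found
```

Calling `find_metadata_blocks(open("photo.jpg", "rb").read())` on a file of your own (the file name is just a placeholder) returns a list such as `["EXIF", "XMP"]`, depending on what the camera and any editing software wrote.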
The list of information types that an image file can contain in its embedded metadata is in theory endless, since software and hardware manufacturers are free to define their own “XMP tags”. This is central to the idea behind XMP, hence the name “extensible”. Among the more exciting types of information that cameras can add to digital photos are:
- GPS coordinates for the geographic location where the picture was taken (requires GPS function in camera)
- Camera temperature (which should be close to the surrounding air temperature in most cases)
- Camera make and model, as well as firmware version (which can be important info in a forensics setting)
- Degree of zoom used (focal length), which provides a hint about the distance to the subject
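The GPS item deserves a practical note: EXIF stores latitude and longitude as three rational numbers (degrees, minutes, seconds) plus a separate reference letter (N/S or E/W), so they must be converted before you can paste them into a map service. A minimal Python sketch of that conversion follows. The function name `dms_to_decimal` and the sample values are my own illustration, not taken from any real photo.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert EXIF-style GPS values to signed decimal degrees.

    dms: three (numerator, denominator) pairs for degrees, minutes, seconds.
    ref: "N", "S", "E" or "W" (the GPSLatitudeRef / GPSLongitudeRef value).
    """
    degrees, minutes, seconds = (Fraction(num, den) for num, den in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # Southern latitudes and western longitudes are negative by convention.
    if ref.upper() in ("S", "W"):
        value = -value
    return float(value)


# Made-up sample values: 59 deg 19' 48" N, 18 deg 3' 36" E
latitude = dms_to_decimal(((59, 1), (19, 1), (48, 1)), "N")
longitude = dms_to_decimal(((18, 1), (3, 1), (36, 1)), "E")
print(latitude, longitude)
```

Exact fractions are used until the final step so that rounding does not creep in while the three parts are combined.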
An image that has been edited using some software will usually reveal the name and version number of that software product in the embedded metadata, as well as the date and time of the last change.
While a lot of information is added to image files automatically by cameras and software, a file can also contain embedded information, such as keywords and a description, that was added by someone working with the picture in photo management software before publishing it on the net. For example, when using Picasa, the keywords and captions you add to a picture are stored as IPTC values inside the actual image file, and follow along if the file is copied. The tragic downside of that neat feature is that the second you add the first keyword (label) to a picture using Picasa, the entire set of camera maker XMP metadata in the image file is destroyed: it simply disappears. I would love to hear the Picasa people at Google explain why this is so, since the labels and captions added with Picasa are stored as IPTC IIM, not as XMP, so there does not seem to be any obvious conflict. Could this be a simple bug in Picasa?
For investigating (and modifying) the metadata embedded in an image file, the tool preferred by professionals worldwide is ExifTool by Phil Harvey. It comes as a platform-independent Perl library, or as a command-line application without a graphical user interface. For Windows users, I therefore recommend the GUI for ExifTool provided by a Slovenian guy using the alias HBx: http://freeweb.siol.net/hrastni3/foto/exif/exiftoolgui.htm. By putting Phil Harvey’s exiftool.exe and HBx’s ExifToolGUI.exe in the same folder, you get an easy-to-use application that can read and write a very large part of the different types of image file metadata out there.
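If you prefer scripting to a GUI, the command-line ExifTool can also be driven from Python. The sketch below assumes that `exiftool` is installed and on your PATH, and relies on its `-json` and `-G` options, which as far as I know produce one JSON record per file with keys of the form `Group:Tag` (for example `EXIF:Make`). The function names and the sample record in the usage note are my own inventions for illustration.

```python
import json
import subprocess


def exiftool_command(path):
    """Build the ExifTool command line for one file."""
    # -json: machine-readable output
    # -G: prefix each tag name with its group (EXIF, IPTC, XMP, ...)
    return ["exiftool", "-json", "-G", path]


def group_tags(exiftool_json: str) -> dict:
    """Sort the tags of the first file in the JSON output by metadata group."""
    record = json.loads(exiftool_json)[0]
    grouped = {}
    for key, value in record.items():
        # Keys without a group prefix (such as SourceFile) go under "Other".
        group, _, tag = key.rpartition(":")
        grouped.setdefault(group or "Other", {})[tag] = value
    return grouped


def read_metadata(path):
    """Run ExifTool on a file and return its tags sorted by group."""
    result = subprocess.run(
        exiftool_command(path), capture_output=True, text=True, check=True
    )
    return group_tags(result.stdout)
```

With this in place, `read_metadata("photo.jpg")["EXIF"]` would give you the EXIF tags of a hypothetical photo.jpg as a dictionary, and the `IPTC` and `XMP` keys, when present, hold the other two families.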
In addition to ExifTool, there are dozens and dozens of free applications that let you view and edit embedded image metadata. I will point out one of them, which I recently learned about, since it has an interface that provides a good overview and also presents the tag ID values and data format types: http://www.photome.de/home.html. PhotoMe is created and offered for free by Jens Duttke from Germany.
And finally, a quote from the Wikipedia article on EXIF, highlighting the OSINT potential in harvesting embedded metadata from digital images: “Since the Exif tag contains information about the photo, it can pose a privacy issue. For example, a photo taken with a GPS-enabled camera can reveal the exact location it was taken, which is undesirable in some situations. By removing the Exif tag with software such as ExifTool and Exif Tag Remover before publishing, the photographer can avoid possible problems.”