Face-recognition added in Picasa 3.6 – great OSINT processing tool

In release 3.6 of Picasa, Google added the Name Tags functionality. This means they put face recognition logic into Picasa, and added a special tag for specifying person identity by name, in addition to the previously available metadata types Labels and Caption.

So what does this mean? Well, it means that anyone with a personal computer can build a searchable library of portrait photos. Picasa will automatically locate faces in the pictures, and build a library of cropped pictures showing faces only, one by one.

As you identify a face by adding a name to it, Picasa will automatically apply the same name tag to all other pictures where a face has been detected and where the software finds high enough resemblance. This means that as new photos are added later on, Picasa will automatically name tag them, provided that there are previously tagged face images that allow for a comparison and identification to be made.

When Picasa finds a possible but not certain match, the image is tentatively tagged with the name, and you can later press green to confirm or press red to deny.

The face-recognition algorithm in Picasa is more likely to give false positives than to miss anything, in terms of finding faces in pictures. Below are a couple of examples of “false positives”, i.e. parts of images that Picasa suspected might be faces of people, but are not. As you can see, the software is not likely to miss a face where there is one.

So what are the potential use cases? Well, let’s see if we can invent a few.

Use case 1
You are given the mission of collecting biographic information, including portrait photos, on industry specialists and key decision makers at a trade show. While you don’t really know much about who is who when the trade show starts, you can start by taking massive amounts of photos of people and crowds, wherever you see them. Let Picasa index the photos and list the faces detected in the pictures. As a first step, tag the faces with some code or number to indicate which face pictures have the same identity. As the trade show goes on, you may increasingly be able to connect names and faces. As you do, you replace the dummy name tag codes with real names. In this way, you will not have wasted any opportunity to take pictures of people just because you didn’t know who they were from the start.
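The bookkeeping behind this workflow can be sketched in a few lines of Python. This is only an illustration of the idea, kept outside Picasa; it does not use any Picasa interface. The placeholder codes, file names, the person's name and the helper `rename_tag` are all made up for the example.

```python
from collections import defaultdict


def rename_tag(tags, placeholder, real_name):
    """Move every photo filed under a placeholder code to a real name.

    `tags` maps a tag (placeholder code or name) to a set of photo files.
    If the real name already has photos, the two sets are merged.
    """
    photos = tags.pop(placeholder, set())
    tags[real_name] |= photos
    return tags


# Hypothetical data: two unknown faces tagged with dummy codes on day one.
tags = defaultdict(set)
tags["P-001"].update({"hall_a_0012.jpg", "booth_17_0044.jpg"})
tags["P-002"].add("keynote_0003.jpg")

# Later, face P-001 is identified (the name is invented for illustration).
rename_tag(tags, "P-001", "Jane Doe")
```

The point of the placeholder step is that grouping and naming are separate decisions: photos can be grouped by identity immediately, and the name attached whenever it becomes known.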

Use case 2
You have a large collection of digital photos of people (100 000 images) that are not tagged or indexed in any way. If you cannot search for a person's name and get pictures of that person as a result, the photo collection has limited value. Manually evaluating, classifying and tagging each photo, for each of the persons in each photo, is simply not feasible. But if Picasa does the job of framing faces in the pictures and working out which faces are the same, the situation changes. You will definitely avoid the double and triple work caused by photo duplicates, and you may well find that many faces are automatically identified and name tagged by Picasa once you have identified a few different pictures of the same person.

CHANGE 2010: Robert Steele on open source intelligence

Retired Major Robert Steele, former CIA clandestine officer, founder of the Marine Corps Intelligence Center and expert on open source and real time intelligence, offers his perspective on how the latest information technology can greatly enhance America’s national security capabilities, at the National Press Club.

[…] “The new revolution started about 3 or 4 years ago in Sweden, which has now displaced Canada as the third-party of choice for United Nations peace keeping intelligence and sense-making. I’ve been helping develop a multinational information sharing and sense-making architecture and protocol and training course, and the first conference on this will be in Madrid in November 2010. We’re not waiting for the US government. We are going to create what the Swedes call M4IS2: Multi national, Multi Agency, Multi discipline,  Multi Domain, Information Sharing and Sense-making. And then at larger context real-time information is the difference between intelligence, and non-intelligence.”


TED.com: Sean Gourley on the mathematics of war

By analyzing raw data on violent incidents in the Iraq war and others, Sean Gourley and his team claim to have found a surprisingly strong mathematical relationship linking the fatality and frequency of attacks.  Sean Gourley, trained as a physicist, has turned his scientific mind to analyzing data about a messier topic: modern war and conflict.

“…And I saw that there was information there. There was data within the streams of news that we consume. All this noise around us actually has information. So what I started thinking was, perhaps there is something like open source intelligence here. If we can get enough of these streams of information together we can perhaps start to understand the war…”

Why is that spying? That’s surfing the web…

A 30-second clip that recycles the good old cliché about old-school spooks not knowing about or understanding OSINT:


Social Media Risks

Below, I have listed online articles that are relevant to issues of privacy, identity theft and fraud in relation to Social Media.

Siciliano, Robert
April 7, 2010
Using Facebook to Steal Company Data
https://www.infosecisland.com/blogview/3579–Using-Facebook-to-Steal-Company-Data.html
Robert Siciliano is CEO of IDTheftSecurity.com a professional speaker and author.

Siciliano, Robert
March 30, 2010
Social Media and Identity Theft Risks PT II
https://www.infosecisland.com/blogview/3456-Social-Media-and-Identity-Theft-Risks-PT-II.html
Robert Siciliano is CEO of IDTheftSecurity.com a professional speaker and author.

Siciliano, Robert
March 24, 2010
Social Media and Identity Theft Risks PT I
https://www.infosecisland.com/blogview/3417-Social-Media-and-Identity-Theft-Risks-PT-I.html
Robert Siciliano is CEO of IDTheftSecurity.com a professional speaker and author.

Himley, Mike
March 19, 2010
The limits of social network privacy

Siciliano, Robert
March 15, 2010
Social Media Sticky Situations
https://www.infosecisland.com/blogview/3283-Social-Media-Sticky-Situations.html
Robert Siciliano is CEO of IDTheftSecurity.com a professional speaker and author.

Language guide – OSINT in foreign tongues

Below are translations of “Open Source Intelligence” into a number of languages:

French: Renseignement de source ouverte
German: Offene Informationsgewinnung
Italian: Informazioni di fonti aperte
Norwegian: Etterretning fra åpne kilder
Russian: Разведка по открытым источникам
Spanish: Inteligencia de fuente abierta
Swedish: Underrättelser från öppna källor

Past, Present and Future of OSINT

The International Relations and Security Network (ISN) is the world’s leading open access information service for international relations and security professionals. In a 12-minute podcast from October 12, 2009, Professor Arthur S Hulnick is interviewed. He provides a brief history of the use of OSINT, and touches upon its limitations in relation to HUMINT. He also underlines the importance of source criticism and information evaluation.

This is 12 minutes well spent for anyone interested in an introduction to the concept of OSINT.
Listen to the podcast on this site: ISN podcast – Past, Present and Future of OSINT
Professor Arthur S Hulnick has a background of 30 years in the US intelligence community, mostly in the CIA. Today, he is an Associate Professor at the Department of International Relations at Boston University.

Unstructured text processing – software doing the job

An Information Week article on Tuesday, March 23, picked up a press release from Clarabridge, announcing that Wendy’s will start using Clarabridge for automated processing of unstructured data:

The Clarabridge text analytics solution will be used to analyze nearly half a million text-based customer comments per year collected from Wendy’s Web-based feedback form, call center notes, e-mail messages, receipt-based surveys, and social media sources.

[…]

Over the last decade, text analytics has evolved from a rarified technology used almost exclusively by government intelligence agencies and high-end financial firms to something far more accessible.

[…]

Clarabridge and Attensity are among the leading best-of-breed text analytics vendors.

Read the full article here:
Wendy’s Taps Text Analytics To Mine Customer Feedback

Software that can successfully process unstructured text is an exciting area. For example, the potential value of any investment in such systems increases day by day thanks to the growing use of social media (read more: Social Media: Marketing Input, Intelligence Output). A commonly cited figure is that 80% or more of all data available globally is unstructured. In plain English: most of it is free text, not tables.

Employment ads give it all away

Employment advertisements are a very useful source of information when doing competitive intelligence work. While any organization or company will often keep the description of itself pretty polished and non-internal in its marketing and PR communication, there is a lot more that is both explicitly said and written between the lines in its job ads.

This is particularly useful when researching non-listed companies. In my experience, judging from how many employment ads are written, companies don’t seem to think of competitors as being among the readers. Also, it can be a difficult balancing act to reveal enough to attract candidates, while not giving away details that provide competitors with too much insight.

A single employment advertisement can provide a lot, and a series of ads over time even more. Things such as organizational structure and chains of command can be mapped out, even for companies and organizations that are otherwise very discreet with that type of information. New technology development projects can be spotted as the company hires new specialists. The size of the team or department in question, in terms of coworkers, is often stated outright. Salary levels, and in turn the approximate total cost of staff, are another thing that can be deduced by collecting ads over time. The sheer frequency with which employment ads are published, and how it varies over time, is an indicator of the state of business: is the company growing and expanding, or not?

For software development companies, the programming skills and programming language knowledge demanded will tell a lot about what the company is up to. In some countries, it is required by law to list the name and phone number of a workers’ union representative. That name can in turn be a key to additional information from LinkedIn or Facebook.

So, my advice in summary:

  • monitor your competitors’ career pages on the web,
  • monitor their ads on Monster.com and similar services,
  • collect the ads they publish,
  • keep good track of when they were published and when applications were due,
  • process and organize the bits of information found in the ads,
  • combine the information with facts from other sources.
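
For those who prefer to keep the collected ads in a small script rather than a spreadsheet, the tracking described above could look roughly like the Python sketch below. It is a minimal illustration under my own assumptions: the `JobAd` record, the two helper functions, the company "ExampleCorp" and all sample ads are invented for the example, and the skill matching is plain substring search, which is crude.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class JobAd:
    """One collected employment ad (hypothetical record layout)."""
    company: str
    title: str
    published: date   # when the ad was published
    deadline: date    # when applications were due
    text: str         # full ad text as collected


def ads_per_month(ads):
    """Count published ads per calendar month (key format YYYY-MM)."""
    return Counter(ad.published.strftime("%Y-%m") for ad in ads)


def skill_mentions(ads, skills):
    """Count how many ads mention each skill (case-insensitive substring).

    Substring matching is crude: a short skill name can also match
    inside a longer word, so treat the counts as a rough signal only.
    """
    counts = Counter()
    for ad in ads:
        lowered = ad.text.lower()
        for skill in skills:
            if skill.lower() in lowered:
                counts[skill] += 1
    return counts


# Invented sample data for a fictional competitor.
ads = [
    JobAd("ExampleCorp", "Backend developer", date(2010, 1, 12),
          date(2010, 2, 1), "We need Python and PostgreSQL experience."),
    JobAd("ExampleCorp", "Systems engineer", date(2010, 1, 28),
          date(2010, 2, 15), "Erlang skills required; Python is a plus."),
    JobAd("ExampleCorp", "Team lead", date(2010, 3, 3),
          date(2010, 3, 31), "Lead a team of 12 developers."),
]

monthly = ads_per_month(ads)
skills = skill_mentions(ads, ["Python", "Erlang"])
```

Here `monthly` gives the publication frequency per month, and `skills` gives a first hint of which technologies the company is hiring for. Both become more telling the longer the ads are collected.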