Comparing Images and Measuring their Similarity in PHP

I’m very proud of myself tonight. You see, I’m a programmer at heart. Yet it has been a long time since I just programmed something for fun. Not a contest, not a class. Just out of pure usefulness and interest. Here’s the story.

I set up a webcam to watch my room for the day, along with some software that uploads the webcam’s image to my server via FTP six times per minute. The software is Active Webcam, and I just used the free evaluation version. Over the course of the day, it generated 6,522 images, way too many to view at once. So I decided to script something to make looking at the images more interesting.

The first obvious problem is that most of the images look exactly the same. Nothing happened during the day, so there is no easily visible distinction between them. Yet they are not identical bit for bit: the brightness varies slightly, the JPEG compression differs, and so on. So an md5 comparison, which matches only when two files are byte-for-byte identical, doesn’t cut it. I need to actually look at the image data.

Fortunately, PHP has a great built-in image library known as GD. PHP has to be compiled with it, but my host’s is, so I suspect most others are, too. After much trial and error, I managed to do the following:

  1. Get a directory listing of all the images in the directory.
  2. Sort these images so that the oldest ones appear first, the newest last.
  3. Remove duplicates by comparing md5 hashes.
  4. Create a custom PHP function that compares two images and determines how similar (or different) they are.
  5. Loop through all the images, displaying only the ones that are substantially different from the one previously displayed.
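
Steps 1 through 3 might look roughly like the sketch below. The original script was never published, so this is only an illustration under assumptions: the function name `listUniqueImages`, the `*.jpg` pattern, and the use of file modification time as "age" are all hypothetical choices, not the author's.

```php
<?php
// Hypothetical helper (not from the original script): list the JPEGs in
// $dir, oldest first, skipping files that are byte-for-byte duplicates.
function listUniqueImages(string $dir): array
{
    // Step 1: directory listing. The *.jpg pattern is an assumption.
    $files = glob(rtrim($dir, '/') . '/*.jpg') ?: [];

    // Step 2: oldest first by modification time; ties broken by file name.
    usort($files, function (string $a, string $b): int {
        return [filemtime($a), $a] <=> [filemtime($b), $b];
    });

    // Step 3: keep only the first file seen for each md5 hash.
    $seen = [];
    $unique = [];
    foreach ($files as $file) {
        $hash = md5_file($file);
        if (!isset($seen[$hash])) {
            $seen[$hash] = true;
            $unique[] = $file;
        }
    }
    return $unique;
}
```

Because md5 only catches exact duplicates, this pass removes very little on its own; the real reduction comes from the pixel comparison in steps 4 and 5.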

In the end, this brought the total number of images from 6,522 down to 247. Take a look at the finished output here.

Here’s how the image comparison works:

First, I consider the image as a grid of 10×10 squares and check only one pixel per square. This lets me check far fewer pixels, so the function takes much less time; I’ll call this the sampling rate. Then, I look at the color of that pixel in each image. By separating the Red, Blue, and Green values, I can find the difference between the two pixels with a simple 3-D distance formula; namely, sqrt( pow(r1-r2,2) + pow(b1-b2,2) + pow(g1-g2,2) ) / 255. (Each color channel ranges from 0 to 255, the 2^8 = 256 values provided by 8 bits.) This gives me a decimal number. I add up the values of all these comparisons, round the final result to the nearest integer, and use this number.
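
A minimal sketch of such a comparison function, assuming two GD images of the same dimensions, could look like this. The names `colorDistance` and `imageDifference` and the `$step` parameter are hypothetical, and sampling the top-left pixel of each square is an assumption; the original code may differ.

```php
<?php
// Distance between two RGB colors, divided by 255 as in the formula above.
// Ranges from 0 (identical) to sqrt(3), about 1.73 (black versus white).
function colorDistance(array $c1, array $c2): float
{
    return sqrt(
        pow($c1['red'] - $c2['red'], 2) +
        pow($c1['blue'] - $c2['blue'], 2) +
        pow($c1['green'] - $c2['green'], 2)
    ) / 255;
}

// Sum the color distances over one sampled pixel per $step x $step square
// (the top-left pixel of each square, which is an assumption), then round.
function imageDifference($imgA, $imgB, int $step = 10): int
{
    // Only the area common to both images is compared.
    $width = min(imagesx($imgA), imagesx($imgB));
    $height = min(imagesy($imgA), imagesy($imgB));

    $total = 0.0;
    for ($y = 0; $y < $height; $y += $step) {
        for ($x = 0; $x < $width; $x += $step) {
            $c1 = imagecolorsforindex($imgA, imagecolorat($imgA, $x, $y));
            $c2 = imagecolorsforindex($imgB, imagecolorat($imgB, $x, $y));
            $total += colorDistance($c1, $c2);
        }
    }
    return (int) round($total);
}
```

It could be called as `imageDifference(imagecreatefromjpeg($fileA), imagecreatefromjpeg($fileB))`. Note that the result grows with the number of sampled pixels, so a value is only meaningful relative to a fixed image size and sampling rate.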

I did not do any extensive mathematical proof to see what my final range of values should be. Experimentally, I could see that a value of 98 (for my test images, sizes, and sampling rate) was a huge difference; and fortunately, 7 was a very small difference, too small for me to discern. Thus my function returns a wide range of values, and it is very accurate. Just look at the final output and see for yourself (warning! the page does contain 247 images!).
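
Step 5, showing only the images that differ substantially from the one previously displayed, could be sketched as follows. The function name `filterByChange` is hypothetical, and the post does not say which threshold was actually used, so the cutoff is left as a parameter.

```php
<?php
// Hypothetical helper: keep the first item, then keep each later item only
// if it differs from the most recently kept item by at least $threshold.
// $difference is any callable that scores two items with a number.
function filterByChange(array $items, callable $difference, float $threshold): array
{
    $kept = [];
    foreach ($items as $item) {
        if ($kept === [] || $difference($kept[count($kept) - 1], $item) >= $threshold) {
            $kept[] = $item;
        }
    }
    return $kept;
}
```

Here `$items` would be the sorted list of images and `$difference` the image comparison described above. Given that 7 was imperceptible and 98 was huge in the author’s tests, a threshold somewhere in between would be the natural starting point, but the exact value is a guess and depends on image size and sampling rate.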

Here are the resources I used:

  • php sort array
  • php image compare
  • imagecopyresized and imagecopyresampled
  • php file does not exist (file_exists)
  • imagecreatefromjpeg
  • php substr
  • javascript load images / javascript slideshow (not implemented yet)
  • simple image comparison in php / compare images
  • php image similarity
  • imagesy, Image, imagecolorexact, imagecolorclosest, imagecolorat, imagecreate
  • 3d distance formula php
  • php imagecolorsforindex
  • php math, square [^ is actually the bitwise XOR operator, so ^2 does not square a number; you have to use pow(base,power)]
  • php round
  • dissimilarity
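
The squaring pitfall noted in the list above is easy to reproduce. This small snippet is only an illustration with made-up variable names:

```php
<?php
// In PHP, ^ is bitwise XOR, not exponentiation.
$xor = 5 ^ 2;         // binary 101 XOR 010 = 111, which is 7
$power = pow(5, 2);   // 5 squared, which is 25
$alsoPower = 5 ** 2;  // the ** operator (PHP 5.6 and later), also 25

var_dump($xor, $power, $alsoPower);
```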

If you have any questions, or ideas for applications of my dynamic image comparison function for PHP, leave a comment below!

P.S. I created from scratch the “code on a slate” image above using Fireworks, a web graphics editor. Awesome, isn’t it? Only took a few minutes, and I got to learn about masks in the process. Inside of it is real code from the PHP script I created to process my webcam images.

42 Responses to “Comparing Images and Measuring their Similarity in PHP”

  1. Adam Dempsey says:

    Very nice! any chance you might release the script?

  2. Ryan Rampersad says:

    That is truly something useful! I know about GD but I never thought to use it to compare images like that.

  3. Michael says:

    I like it! Was talking to someone the other day about webcam monitoring at a retail establishment for security and theft prevention. I was wondering how useful would it be to scroll through a days’ worth of video.

    Monitoring for differences may not be the right application if you want to keep an eye on someone in a video (like over the cash register). But I can imagine it used as a cheap/convenient way to monitor areas with minor traffic to see what’s happening.

  4. N4news says:

    Elliot, thanks for the cool idea – I’ll try to make the script myself. If it is successful, I’ll release it.

    • mijndert spel says:

      Did you manage to create the php code for comparing (webcam) images? I’m still very interested. greetz Mijndert

  5. mijnderd spel says:

    Useful idea!
    Can you provide the code?
    That would really help me a lot in understanding the above writing.
    thanks.

  6. yhoan says:

    Will you provide the code?

  7. Michael says:

    As there’s no source here, I found a tool that does a similar thing and is open-source. Thought I’d post it here for the masses:

    http://www.marengo-ltd.com/open_source/index.php

    It’s a JPEG Comparison Tool (c’mon Google, index that term!)

  8. Joel says:

    why the HELL did you NOT provide code ?Do you do this to tease or insult us? Do you know how many people NEED this? What the fuck was the point of this?

    What a jerk

  9. bro says:

    I’m very interested in this code if you would like to share. I have exactly the same problem, except it’s on a family blog site and there are thousands of snapshots from a web cam.

    I’m trying to mark the images with significant changes. Size, pixels…

    can you share please.

    Chris

  10. Bart says:

    @joel : charming, really …

  11. Peter says:

    What a bastard :P

  12. John Dawson says:

    Hey… I’d love to take a look at the script you used. Care to share? Thx!

  13. Remco says:

    Sounds very interesting, can you post the code somewhere?

    remco

  14. Dave says:

    I’m looking for code to compare two images to see if they are the same. My question is, although I’m checking to see if it’s an identical image, won’t there be slight discrepancies because the images won’t all be the same size? Would I be able to compare the md5 or no?

  15. frank says:

    why don’t you post the code? the link doesn’t work

  16. jane says:

    moron

  17. paul says:

    very smart… all GD library functions… nice! can you post the code?

  18. paul says:

    can i ask you which kind of AVERAGE you use to evaluate all the final values?

    thanks

  19. milton says:

    i think you were wrong not to post the code because your code lacks a lot of smartness.

    you didn’t consider rotating the images and checking whether the images are similar at each 90 degree rotation. your script works only with images that are pretty similar, webcamwise, but doesn’t work with images that are utterly different, for instance:

    image A: a square half red and half white
    image B: a square half white and half red

    for a human eye they are the same image but your script is too dumb to recognize it. so basically you should check the similarity between images A and B for each 90 degree rotation, so you’ll have

    image A 0 degrees : image B 0 degrees = 0% similarity

    image A 0 degrees : image B 90 degrees = 50% similarity

    image A 0 degrees : image B 180 degrees = 100% similarity

    image A 0 degrees : image B 270 degrees = 50% similarity

    the median average is 50% and it’s definitely better than your script, which would have given 0% similarity

    i think there are other solutions but it’s a better starting point

  20. mdaud says:

    I’m trying to write some code for my telescope tracking system. first I thought of vb, but I don’t like windows. now I think php is great because it has all the image manipulation functions.
    what I plan to do is compare 2 images and direct step motors to keep some object just in center.
    I intend to use just black and white colors.
    can you help

  21. Richie says:

    Very cool. I’m using something similar for a non-monospaced ascii art generator. People shouldn’t be complaining, you provided good instructions.

    @milton you should check out tineye ( http://tineye.com )

  22. dav says:

    Since you don’t post your code, it does not exist ;)

  23. davy says:

    Not very nice to keep the code all by yourself :(, I found this http://codefat.com/?p=80 , it’s not very nice but it’s working and it’s open for modification…

  24. t.muthukumari says:

    hello sir im final year student in trichyengineering college I ll do the project in php, that project similarly to ur project… so pls send ur project ideas and how u connected this project to php……..

  25. Asylum says:

    I am just commenting to all those who have been so rude to this article’s author in comments here. You should all be ashamed of yourselves. So he wrote a very detailed description of how to do an image comparison and did not give you the source code. Go pout now…

    I think it is safe to assume by this article’s title that this is not a PHP tutorial. There is one found at PHP.net so try RTFM. As for the contents of this article, he gives plenty of detail to make your own and I do not see once where he suggests that he is going to spoon feed anyone, so go make your own and stop whining. You may just learn something, and if you don’t want to make your own then there are plenty of software developers out there that would be glad to make one for you for some form of fee.

    Had any of you been decent I would have offered up my version’s source code as a comment here, but seriously not a one is deserving of it so I won’t bother.

    Props to you Elliot Lee
