Comparing Images and Measuring their Similarity in PHP

I’m very proud of myself tonight. You see, I’m a programmer at heart. Yet it has been a long time since I just programmed something for fun. Not a contest, not a class. Just out of pure usefulness and interest. Here’s the story.

[Image: PHP code for webcam]

I set up a webcam to watch my room for the day, and some software that uploads the webcam’s image to my server via FTP, 6 times per minute. The software is Active Webcam, and I just used the free evaluation version. Over the course of the day, it generated 6,522 images – way too many to view at once. So I decided to script something to make looking at the images more interesting.

The first obvious problem is that most of the images look exactly the same. Nothing happened during the day, so the images have no easily visible distinction. Yet they are not the same in terms of bits: the brightness has slight variations, the JPEG compression differed, etc. So doing an md5 comparison (which matches only when two files are byte-for-byte identical) doesn’t cut it. I need to actually look at the image data.

Fortunately, PHP has a great built-in image library known as gd. You have to have PHP compiled with it, but my host does, so I suspect most others do, too. After much trial and error, I managed to do the following:

  1. Get a directory listing of all the images in the directory.
  2. Sort these images so that the oldest ones appear first, the newest last.
  3. Remove duplicates by comparing md5 hashes.
  4. Create a custom PHP function that compares two images and determines how similar (or different) they are.
  5. Loop through all the images, displaying only the ones that are substantially different from the one previously displayed.
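
Steps 1 to 3 can be sketched roughly as follows. This is a minimal sketch, not the original script: the function name listUniqueImages, the .jpg extension, and the use of file modification time as the image's age are all assumptions made for illustration.

```php
<?php
// Hypothetical helper (not from the original script): list the JPEGs in $dir,
// oldest first, skipping files that are byte-for-byte duplicates.
function listUniqueImages(string $dir): array
{
    // Step 1: directory listing. Assumes the images end in ".jpg".
    $files = glob(rtrim($dir, '/') . '/*.jpg');
    if ($files === false) {
        return [];
    }

    // Step 2: oldest first, by modification time (ties broken by file name).
    usort($files, function ($a, $b) {
        return [filemtime($a), $a] <=> [filemtime($b), $b];
    });

    // Step 3: keep only the first file seen for each md5 hash.
    $seen = [];
    $unique = [];
    foreach ($files as $file) {
        $hash = md5_file($file);
        if (!isset($seen[$hash])) {
            $seen[$hash] = true;
            $unique[] = $file;
        }
    }
    return $unique;
}
```

The md5 pass only removes exact duplicates, which is cheap; the near-duplicates still have to be handled by the pixel comparison in steps 4 and 5.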

In the end, this brought the total number of images from 6,522 down to 247. Take a look at the finished output here.

Here’s how the image comparison works:

First, I treat the image as a grid of 10×10 squares and check only one pixel per square: this lets me check far fewer pixels, so the function takes much less time. I’ll call this the sampling rate. Then, I look at the color of that pixel in each image. By separating the red, green, and blue values, I can find the difference between the two pixels with a simple 3-D distance formula; namely, sqrt( pow(r1-r2,2) + pow(g1-g2,2) + pow(b1-b2,2) ) / 255 (each channel ranges from 0 to 255, the 2^8 = 256 values provided by 8 bits). This gives me a decimal number for each sampled pixel. I add up the values of all these comparisons, round the final result to the nearest integer, and use this number.
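
A minimal sketch of that comparison, under some assumptions: the names pixelDistance and imageDifference are hypothetical, the sampled pixel is taken to be the top-left pixel of each square, and both images are assumed to be GD images of the same size. With the division by 255, a single pixel contributes between 0 and sqrt(3) (about 1.73).

```php
<?php
// Normalized 3-D distance between two colors given as [r, g, b] arrays (0-255 each).
function pixelDistance(array $c1, array $c2): float
{
    return sqrt(
        pow($c1[0] - $c2[0], 2) +
        pow($c1[1] - $c2[1], 2) +
        pow($c1[2] - $c2[2], 2)
    ) / 255;
}

// Hypothetical sketch: sum the distance at one sampled pixel per $step x $step square.
// Assumption: the top-left pixel of each square is the one sampled.
function imageDifference($img1, $img2, int $step = 10): int
{
    // Stay inside both images in case the sizes differ slightly.
    $width  = min(imagesx($img1), imagesx($img2));
    $height = min(imagesy($img1), imagesy($img2));

    $total = 0.0;
    for ($y = 0; $y < $height; $y += $step) {
        for ($x = 0; $x < $width; $x += $step) {
            $a = imagecolorsforindex($img1, imagecolorat($img1, $x, $y));
            $b = imagecolorsforindex($img2, imagecolorat($img2, $x, $y));
            $total += pixelDistance(
                [$a['red'], $a['green'], $a['blue']],
                [$b['red'], $b['green'], $b['blue']]
            );
        }
    }

    // Round the summed distances to the nearest integer.
    return (int) round($total);
}
```

Because the result is a sum over sampled pixels, it grows with image size and shrinks with a coarser sampling rate, so scores are only comparable between images of the same size sampled the same way.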

I did not do any extensive mathematical proof to see what my final range of values should be. Experimentally, I could see that a value of 98 (for my test images, sizes, and sampling rate) was a huge difference, while 7 was a very small one: too small for me to discern. Thus my function returns a wide range of values, and in my tests it was very accurate. Just look at the final output and see for yourself (warning! the page does contain 247 images!).
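
The filtering loop from step 5 might look something like the sketch below. The function name filterByDifference is hypothetical, and the post does not state the actual cutoff that was used, so the threshold is left as a parameter to be tuned somewhere between the "too small to see" and "huge" scores for a given set of images.

```php
<?php
// Hypothetical helper: keep an item only when it differs from the last KEPT item
// by more than $threshold, as measured by the $difference callback.
function filterByDifference(array $items, callable $difference, int $threshold): array
{
    $kept = [];
    foreach ($items as $item) {
        // Always keep the first item; after that, compare against the last one kept.
        if (empty($kept) || $difference($kept[count($kept) - 1], $item) > $threshold) {
            $kept[] = $item;
        }
    }
    return $kept;
}

// Usage with plain numbers standing in for images:
$result = filterByDifference(
    [10, 12, 30, 31, 60],
    function ($a, $b) { return abs($a - $b); },
    5
);
// $result is [10, 30, 60]
```

Comparing against the last displayed item rather than the immediately preceding one means a slow drift (for example, gradually changing daylight) eventually produces a new image instead of being ignored forever.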

Here are the resources I used:

  • php sort array
  • php image compare
  • imagecopyresized and imagecopyresampled
  • php file does not exist (file_exists)
  • imagecreatefromjpeg
  • php substr
  • javascript load images / javascript slideshow (not implemented yet)
  • simple image comparison in php / compare images
  • php image similarity
  • imagesy, Image, imagecolorexact, imagecolorclosest, imagecolorat, imagecreate
  • 3d distance formula php
  • php imagecolorsforindex
  • php math, square [^ is actually the bitwise XOR operator, so ^2 does not square a number; you have to use pow(base,power), or the ** operator in PHP 5.6 and later]
  • php round
  • dissimilarity
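
As a quick illustration of the squaring pitfall noted in the list above, here is a small example (the variable names are only for illustration):

```php
<?php
// In PHP, ^ is bitwise XOR, not exponentiation.
$xor         = 3 ^ 2;      // binary 11 XOR 10 = 01, so this is 1
$squared     = pow(3, 2);  // 9
$alsoSquared = 3 ** 2;     // 9 (the ** operator needs PHP 5.6 or later)
```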

If you have any questions, or ideas for applications of my dynamic image comparison function for PHP, leave a comment below!

P.S. I created from scratch the “code on a slate” image above using Fireworks, a web graphics editor. Awesome, isn’t it? Only took a few minutes, and I got to learn about masks in the process. Inside of it is real code from the PHP script I created to process my webcam images.

42 Responses to “Comparing Images and Measuring their Similarity in PHP”

  1. Alex says:

    Seems overly simplistic. Maybe it’s good for comparing something that might be slightly similar, but I think your false positive rate would be very high, and done on a large scale most of your “buckets” would actually store images together with almost no similarity. Say someone used a negative filter. There would be no similarity, even though only a filter was applied.

  2. Nick says:

    I think the article was quite clear on how he did it. Very interesting read, thanks. As someone mentioned, checking rotations of the images would be great, and you’d also want to check mirrored copies. I imagine you could come up with an algorithm that wouldn’t even have to perform the translations in order to check the aforementioned variations.

    Anyway, a very interesting read, thanks!

  3. moli says:

    useless amateur.

  4. bollegijs says:

    Dude the links to the code don’t work…
    Here is my version, do with it what you want.
    Sorry for the Dutch comments :P [translated to English here]

    //Determine the color difference between 2 photos
    function compare_imgs($path, $source_pic, $compare_pic){

        //Determine the width and height of the source file
        list($width, $height) = getimagesize($path.$source_pic);

        //Calculate the factor that preserves the aspect ratio
        $factor = $height / $width;

        //Determine the new size
        $new_width = 13;
        $new_height = $new_width * $factor;

        //Shrink the source image
        $source_image = imagecreatefromjpeg($path.$source_pic);
        imagecopyresampled($source_image, $source_image, 0, 0, 0, 0, $new_width, $new_height, $width, $height);

        //Shrink the compare image
        $compare_image = imagecreatefromjpeg($path.$compare_pic);
        imagecopyresampled($compare_image, $compare_image, 0, 0, 0, 0, $new_width, $new_height, $width, $height);

        //Determine the X & Y pixel start coordinate
        $px_x = 1;
        $px_y = 1;

        //Reset the total diff
        $total_diff = 0;

        //Loop through the Y values
        while($px_y < $new_height){

            //RGB of the source image
            $source_rgb = imagecolorat($source_image, $px_x, $px_y);
            $source_colors = imagecolorsforindex($source_image, $source_rgb);

            //RGB of the compare image
            $compare_rgb = imagecolorat($compare_image, $px_x, $px_y);
            $compare_colors = imagecolorsforindex($compare_image, $compare_rgb);

            //Store X & Y with the color difference
            $diff['x'] = $px_x;
            $diff['y'] = $px_y;
            $diff['diff'] = colordiff($source_colors['red'], $source_colors['green'], $source_colors['blue'], $compare_colors['red'], $compare_colors['green'], $compare_colors['blue']);

            //Add to the full list of calculated pixels
            $colordiff[$i] = $diff;
            $total_diff += $diff['diff'];

            //Loop through the X values
            while($px_x < $new_width){

                //RGB of the source image
                $source_rgb = imagecolorat($source_image, $px_x, $px_y);
                $source_colors = imagecolorsforindex($source_image, $source_rgb);

                //RGB of the compare image
                $compare_rgb = imagecolorat($compare_image, $px_x, $px_y);
                $compare_colors = imagecolorsforindex($compare_image, $compare_rgb);

                //Store X & Y with the color difference
                $diff['x'] = $px_x;
                $diff['y'] = $px_y;
                $diff['diff'] = colordiff($source_colors['red'], $source_colors['green'], $source_colors['blue'], $compare_colors['red'], $compare_colors['green'], $compare_colors['blue']);

                //Add to the full list of calculated pixels
                $colordiff[$i] = $diff;
                $total_diff += $diff['diff'];

                $px_x++;
                $i++;
            }
            $px_x = 0;

            $px_y++;
            $i++;

        }

    • Sarabjit says:

      Hi dear

      I tried the above code but it returns an error: “Fatal error: Call to undefined function colordiff() in D:\My projects\Image_match\a.php on line 50”

      Please let me know how I can use this.

      thanks

    • tarun says:

      not work at php 5

  5. priya says:

    I need to compare two images. One image is stored in a database, the other is stored in a folder. How do I compare them?

  6. yauri says:

    Try my application on http://image.visualtechsolution.com
    I compare the images by their colors. Each image is split into 9 pieces, so each image has 9 histograms, which I store in a database. Images are then compared by color histogram. Try it :) The system can also be used to detect whether an image has been rotated in 90-degree steps.
    I still want to optimize the algorithm…. (Sorry for my English)

  7. ravi says:

    I got stuck on a project. I need the visual comparison technique from your code.
    My project is to compare pages with the original website and measure the visual similarity.
    I have to compare all phishing pages.
    I would be extremely thankful if you could share this, so that my efforts of 4 months can come to an end.
    Thank you very much.

  8. tarun says:

    $diff[‘diff’] = colordiff($source_colors[‘red’], $source_colors[‘green’], $source_colors[‘blue’],

    Not work please help

  9. Jay says:

    OK, for those left without the rest of the above function, I’ve found a colordiff function elsewhere that was doing a similar thing and rewritten it slightly to work with the above code:

    function colorDiff($r1,$g1,$b1,$r2,$g2,$b2)
    {
        // do the math on each tuple
        // imagecolorsforindex() already returns each channel as a decimal integer (0-255),
        // so cast to int; hexdec() would misread 255 as hexadecimal 0x255
        $red1 = (int) $r1;
        $green1 = (int) $g1;
        $blue1 = (int) $b1;

        $red2 = (int) $r2;
        $green2 = (int) $g2;
        $blue2 = (int) $b2;

        // sum of the absolute per-channel differences (plain ASCII minus signs)
        return abs($red1 - $red2) + abs($green1 - $green2) + abs($blue1 - $blue2);

    }

    Also, the function posted above isn’t quite complete as it doesn’t return anything and gives a warning about $i not being declared, so it should be (with the Dutch comments translated):

    //Determine the color difference between 2 photos
    function compare_imgs($path, $source_pic, $compare_pic){

        //Determine the width and height of both files
        list($width, $height) = getimagesize($path.$source_pic);
        list($compare_width, $compare_height) = getimagesize($path.$compare_pic);

        //Calculate the factor that preserves the aspect ratio
        $factor = $height / $width;

        //Determine the new size (whole pixels, at least 1 high)
        $new_width = 13;
        $new_height = max(1, (int) round($new_width * $factor));

        //Shrink the source image into its own thumbnail
        //(resampling an image onto itself leaves it at full size)
        $source_full = imagecreatefromjpeg($path.$source_pic);
        $source_image = imagecreatetruecolor($new_width, $new_height);
        imagecopyresampled($source_image, $source_full, 0, 0, 0, 0, $new_width, $new_height, $width, $height);

        //Shrink the compare image the same way, using its own dimensions
        $compare_full = imagecreatefromjpeg($path.$compare_pic);
        $compare_image = imagecreatetruecolor($new_width, $new_height);
        imagecopyresampled($compare_image, $compare_full, 0, 0, 0, 0, $new_width, $new_height, $compare_width, $compare_height);

        //Reset the total diff and the list of calculated pixels
        $total_diff = 0;
        $colordiff = array();
        $i = 0;

        //Loop through the Y values
        for($px_y = 0; $px_y < $new_height; $px_y++){

            //Loop through the X values, visiting each pixel exactly once
            for($px_x = 0; $px_x < $new_width; $px_x++){

                //RGB of the source image
                $source_rgb = imagecolorat($source_image, $px_x, $px_y);
                $source_colors = imagecolorsforindex($source_image, $source_rgb);

                //RGB of the compare image
                $compare_rgb = imagecolorat($compare_image, $px_x, $px_y);
                $compare_colors = imagecolorsforindex($compare_image, $compare_rgb);

                //Store X & Y with the color difference
                $diff['x'] = $px_x;
                $diff['y'] = $px_y;
                $diff['diff'] = colordiff($source_colors['red'], $source_colors['green'], $source_colors['blue'], $compare_colors['red'], $compare_colors['green'], $compare_colors['blue']);

                //Add to the full list of calculated pixels
                $colordiff[$i] = $diff;
                $total_diff += $diff['diff'];
                $i++;
            }
        }

        return $total_diff;

    }

  10. MJ says:

    Hello Everyone,

    I’ve encountered a problem with comparing images. Is it possible in PHP to get the differences between two images? For example, if the first image has a horizontal scratch and the second image has a vertical scratch, is it possible in PHP to detect that difference?

    Thanks in advance.

  11. Or Koren says:

    IS this an opensource code, can you share?
    please contact me

  12. Marcio says:

    Parse error: syntax error, unexpected T_STRING

    return abs($red1 – $red2) + abs($green1 – $green2) + abs($blue1 – $blue2) ;
