A Scala fork of the pHash library. This library identifies whether images are similar. You can try it on the demo page.
The original pHash uses the CImg library for image processing, but I could not find CImg for the JVM. Therefore I use java.awt
and self-made functions for image processing. Consequently, the results of my library differ from those of the original pHash.
My library implements three Perceptual Hashing algorithms: Radial Hash, DCT hash and Marr hash. More info about it.
libraryDependencies += "com.github.poslegm" %% "scala-phash" % "1.2.2"
There are three functions for each hashing algorithm. Let's look at them using DCT hash as an example:
def dctHash(image: BufferedImage): Either[Throwable, DCTHash]
  computes the image's hash;
def unsafeDctHash(image: BufferedImage): DCTHash
  computes the image's hash unsafely (may throw an exception);
def dctHashDistance(hash1: DCTHash, hash2: DCTHash): Long
  compares the hashes of two images.
Similar functions are provided for the Marr and Radial Hash algorithms.
The whole public API is described with scaladocs in object PHash.
import scalaphash.PHash._
import java.io.File
import javax.imageio.ImageIO

val img1 = ImageIO.read(new File("img1.jpg"))
val img2 = ImageIO.read(new File("img2.jpg"))

// Both hashes must succeed before the distance is computed
val radialDistance: Either[Throwable, Double] = for {
  img1rad <- radialHash(img1)
  img2rad <- radialHash(img2)
} yield radialHashDistance(img1rad, img2rad)

radialDistance.foreach {
  case distance if distance > 0.95 => println("similar")
  case _ => println("not similar")
}
// Report the error if either hash could not be computed
radialDistance.left.foreach(e => println(e.getMessage))
The radial distance is higher when images are similar, while the DCT and Marr distances are lower. It is recommended to treat two images as similar only when at least two of the three hashes pass their thresholds.
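The two-out-of-three rule above can be sketched as a small helper. This is only an illustration: `Thresholds`, `looksSimilar` and `exampleThresholds` are hypothetical names that are not part of scala-phash, and the DCT and Marr threshold values are made-up placeholders (only the 0.95 radial threshold appears in the example above). It also assumes the Marr distance is a Double, as the sample output suggests.

```scala
// Hypothetical helper, not part of scala-phash: a majority vote over the three distances.
final case class Thresholds(radialMin: Double, dctMax: Long, marrMax: Double)

def looksSimilar(radial: Double, dct: Long, marr: Double, t: Thresholds): Boolean = {
  val votes = Seq(
    radial > t.radialMin, // radial distance: higher means more similar
    dct < t.dctMax,       // DCT distance: lower means more similar
    marr < t.marrMax      // Marr distance: lower means more similar
  )
  // Similar only when at least two of the three hashes agree
  votes.count(identity) >= 2
}

// Illustrative values only (assumption); tune them on your own images.
val exampleThresholds = Thresholds(radialMin = 0.95, dctMax = 20L, marrMax = 0.45)
```

Pass the three distances computed with radialHashDistance, dctHashDistance and the corresponding Marr function into `looksSimilar` together with thresholds chosen for your data.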
radial: 0.9508017124330319
dct: 13
marr: 0.5052083333333334
radial: 0.3996241672331173
dct: 41
marr: 0.4704861111111111
My results are not compatible with the original pHash. Use the original library if you have the opportunity.
Also, it works much slower than the C++ version (about 5-7 times).