This is a brief comment for anyone interested in learning how to do image processing and computer vision. It seems that some of the books mentioned here have some sort of free online version, so you may want to do a search before you buy.
Image processing is a big subject. If you’re interested in both image processing and things like neural networks, books about computer vision are a good place to start: they apply image processing and expose areas you may want to focus on. Learning OpenCV describes algorithms briefly and applies them. Computer Vision by Dana H. Ballard and Christopher M. Brown (ISBN 9780131653160) is a bit old these days, but very nice. There’s a free online version, too.
Getting back to pure image processing, perhaps the most important technique is filtering, using convolution, so you’d probably want to be comfortable with that early on. The Fourier Transform allows you to view your images in the “frequency domain”, where an image is really a sum of waves whose amplitudes add in bright areas and cancel each other in dark areas. Sharp edges are built by summing high-frequency waves, so smoothing involves eliminating the high frequencies, and so on. You may never need to work in the frequency domain, but it’s good to know about it. For me, it is not easy to grok, but the insights are worth the effort (and for large filters, filtering via the FFT is quicker than direct convolution).
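To make the two views concrete, here is a minimal sketch in plain Python, with no OpenCV or NumPy. It smooths one row of pixels containing a sharp edge in two ways: first by convolving with a small box filter, then by taking a discrete Fourier transform and zeroing the high frequencies. The function names, the 16-pixel row, and the cutoff are made up for illustration. The slow textbook DFT is used so the code stays self-contained; a real program would use an FFT routine from a library.

```python
import cmath

def convolve_row(row, kernel):
    """Direct convolution, same output length, border pixels repeated at the ends."""
    half = len(kernel) // 2
    last = len(row) - 1
    out = []
    for i in range(len(row)):
        total = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + half - k, 0), last)  # clamp the index at the borders
            total += weight * row[j]
        out.append(total)
    return out

def dft(values):
    """Textbook O(n^2) discrete Fourier transform."""
    n = len(values)
    return [sum(values[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]

def inverse_dft(spectrum):
    """Inverse transform; the input here came from real data, so keep the real part."""
    n = len(spectrum)
    return [sum(spectrum[f] * cmath.exp(2j * cmath.pi * f * t / n) for f in range(n)).real / n
            for t in range(n)]

def low_pass(row, keep):
    """Zero every frequency above `keep`, counting mirrored (negative) frequencies too."""
    n = len(row)
    spectrum = dft(row)
    filtered = [c if min(f, n - f) <= keep else 0 for f, c in enumerate(spectrum)]
    return inverse_dft(filtered)

# One image row: dark on the left, bright on the right, with a sharp edge between.
row = [0] * 8 + [255] * 8

box_smoothed = convolve_row(row, [1 / 3, 1 / 3, 1 / 3])  # 3-pixel moving average
freq_smoothed = low_pass(row, keep=2)                    # keep only the slowest waves

print([round(v) for v in box_smoothed])
print([round(v) for v in freq_smoothed])
```

Both outputs turn the hard step into a ramp. A hard frequency cutoff like this one can overshoot the original brightness range ("ringing"), which is one reason practical low-pass filters roll off gradually.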
For pure image processing, Digital Image Processing by Kenneth R. Castleman (ISBN 9780132114677) is something of a classic (I think; I’ve had it for a long time), and it also has some sort of free online version. Likewise, Digital Image Processing: PIKS Scientific Inside by William K. Pratt (ISBN 9780471767770; I have a 1991 edition). Pratt and Castleman get very detailed and deep in the weeds, if you need that. There’s also Fundamentals of Digital Image Processing by Anil K. Jain (ISBN 9780133361650) and Image Processing, Analysis, and Machine Vision by Milan Sonka, Vaclav Hlavac, and Roger Boyle (ISBN 9780495082521), both beautiful books with lots of cool techniques covering a wide area. Browsing through these can give you ideas.
My copies are earlier editions, so they don’t go into the latest stuff. But the fundamentals don’t change much, anyway. I strongly recommend Python for your research because the feedback is immediate. There’s a website, PyImageSearch, which is extremely informative, with many examples in Python.
A blog about the PicLookup site, image search engines, web sites, databases and other technical and web business stuff.
Sunday, July 3, 2016
Sunday, May 1, 2016
If you're interested in image processing and image search, and you want to quickly get up to speed with current technology, take a look at PyImageSearch, and tell Adrian we sent you! He'll have you doing stuff like face detection and OCR in no time. Seriously, if you like to learn by doing, he's got cookbooks to get you started and offers advanced training from there. We're talking 15 lines of code, and you're detecting objects! And if you like theory, you get to learn about the state of the art, and can then deepen your knowledge in areas important to you.
At PicLookup, we're pretty good at fast search, but we're getting interested in AI, Deep Learning, and all that stuff, which is how we found Adrian's site.
Friday, April 29, 2016
A Tiny but Brave New Reverse Image Search Engine
A recent post describes how to use the Reverse Image Search offered by the major search engines. Basically, you upload an entire image and discover where else it can be found. Our new reverse image search engine, piclookup.com, introduces a new twist: it works even if you have only a tiny piece of the original image. After all, with text search, a "piece of quoted text" reveals documents containing that phrase. So why not expect the same thing with images?
Finding the original Flickr picture from a small portion of the left area. This is the Flickr license.
This article briefly describes how our little site came into existence, and how it works. But first, here's an example use case, called "catfishing". Let's say a stranger presents an image as their own selfie. Is it really them, or is it a tiny clipping copied from someone else's online class photo? Unless the photo is famous and heavily visited, today's big search engines are unlikely to find the original from a small part, so it's hard to catch the faker. Below is a snapshot of our examples page, with image portions taken from Flickr. Just copy one of these images to your clipboard, and paste it using the "paste" button on the home page. Beyond catfishing, our search engine can find any sort of clipping: not just faces, but tree branches, coffee cups or whatever, regardless of rotation. In fact, the fourth image has been rotated.
PicLookup got started when one of us was developing a program for computer vision. The program chopped an image into lots of little pieces and memorized them for recall. Later, when exposed to only a portion of the original image, the program could still identify the image. By memorizing the tiny pieces, it could recall the original from a few pieces. And it could do this for a great number of images. Realizing this, we stopped working on vision and began work on the image search site instead.
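The "memorize the pieces, recall from a few" idea can be illustrated with a toy sketch. This is not our production code: it matches exact pixel values in tiny grayscale grids, so unlike the real engine it would not survive rotation, resizing or recompression. The `PatchIndex` class, the 2x2 patch size and the `cat`/`dog` test images are all made up for the example.

```python
from collections import Counter, defaultdict

PATCH = 2  # patch side length in pixels (an illustrative choice)

def patches(image, step):
    """Yield each PATCH x PATCH block of a 2-D list of pixel values as a hashable tuple."""
    height, width = len(image), len(image[0])
    for top in range(0, height - PATCH + 1, step):
        for left in range(0, width - PATCH + 1, step):
            yield tuple(image[top + dy][left + dx]
                        for dy in range(PATCH) for dx in range(PATCH))

class PatchIndex:
    """Remembers which images each small patch was seen in."""

    def __init__(self):
        self._owners = defaultdict(set)  # patch -> ids of images containing it

    def memorize(self, image_id, image):
        # Store patches at every offset so a clipping cut anywhere still lines up.
        for patch in patches(image, step=1):
            self._owners[patch].add(image_id)

    def recall(self, fragment):
        # Each patch of the fragment votes for every image that contains it.
        votes = Counter()
        for patch in patches(fragment, step=PATCH):
            for image_id in self._owners.get(patch, ()):
                votes[image_id] += 1
        return votes.most_common(1)[0][0] if votes else None

# Two synthetic 8x8 "images" with no pixel values in common.
cat = [[x + 8 * y for x in range(8)] for y in range(8)]
dog = [[100 + x + 8 * y for x in range(8)] for y in range(8)]

index = PatchIndex()
index.memorize("cat", cat)
index.memorize("dog", dog)

clipping = [row[2:6] for row in cat[3:7]]  # a 4x4 piece cut from the middle of "cat"
print(index.recall(clipping))
```

The clipping holds only a quarter of the original's pixels, yet every one of its patches points back to the same image, which is what makes recall from a small piece possible.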
We hurried to build the web site, the search engine, web crawler-robots to scour the web for pictures, and a big database to hold data derived from all those pictures. The next few paragraphs offer some details about how we got these pieces to work, starting with the heart of our site, the search engine itself.
The engine is a Java program, serving as the backend of the web site. It leverages the amazing, free image-processing library called OpenCV (see the book, "Learning OpenCV") by leading every image through the processing steps, letting OpenCV do the gritty work of finding and extracting pieces, and then sending the final chunks of information to the database. That's for storage. For recall, it does exactly the same thing, except this time, instead of inserting data into the database, it queries the database for a match.
The database simply holds all the chunks of information about each image, including where the image is located. We are using MySQL, a well-established standard database, supported by Oracle. High-speed database performance is essential for an online search engine. Today, developers use a trick called caching to give lightning-fast recall. Since RAM (your computer's memory chips) works thousands of times faster than a disk drive, the trick is to load your data into RAM, your cache. Alas, RAM is expensive. A middle ground is to use solid state drives instead of disk drives whenever possible. We use all three media. For an enormous database, one that is too big for a single machine to contain, there is another essential technique called "sharding". The idea is to distribute the data across a number of separate machines, called "shards". A very informative book, "High Performance MySQL", has been an essential reference for us.
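Here is a rough sketch of how sharding and caching fit together, written in Python rather than against MySQL. Plain dictionaries stand in for the separate database machines, and the class names, the SHA-256 key hashing and the cache size are assumptions made for the example, not a description of our actual setup.

```python
import hashlib
from collections import OrderedDict

class ShardedStore:
    """Spreads keys over several dicts that stand in for separate database machines."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def shard_for(self, key):
        # A stable hash keeps a key on the same shard from one run to the next
        # (Python's built-in hash() of a string changes between processes).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, value):
        self.shards[self.shard_for(key)][key] = value

    def get(self, key):
        return self.shards[self.shard_for(key)].get(key)

class CachedStore:
    """Keeps the most recently used entries in RAM in front of a slower store."""

    def __init__(self, backing, capacity):
        self.backing = backing
        self.capacity = capacity
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.hits += 1
            self.cache.move_to_end(key)      # mark as most recently used
            return self.cache[key]
        self.misses += 1
        value = self.backing.get(key)        # the slow trip to the shard
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict the least recently used entry
        return value

store = ShardedStore(shard_count=4)
for n in range(100):
    store.put(f"chunk-{n}", f"http://example.com/image-{n}.jpg")

cached = CachedStore(store, capacity=10)
cached.get("chunk-7")   # miss: fetched from its shard
cached.get("chunk-7")   # hit: served from the in-memory cache
print(cached.hits, cached.misses)
print([len(shard) for shard in store.shards])
```

Taking the hash modulo the shard count is the simplest placement rule, but adding a machine later moves most keys to a different shard. Schemes such as consistent hashing or a lookup table reduce that reshuffling.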
The data is collected by our web crawlers, which were written in Java. We combined some online examples and added our custom code, creating two separate crawlers. The first scans the web looking for images, recording their locations. The second crawler loads the images, extracts image data and stores it in the database. This robot shares image processing code with the search engine, since they both treat images the same way, capturing information about each tiny piece of the image, and where the image was found.
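The first crawler's core step, finding image locations on a page, can be sketched with Python's standard library. Our crawlers are written in Java, so this only illustrates the idea. The `ImageLinkCollector` class, the `find_image_urls` helper and the sample page are hypothetical, and a real crawler would also fetch pages over the network, follow links, respect robots.txt and avoid revisiting URLs.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class ImageLinkCollector(HTMLParser):
    """Collects the absolute URL of every <img src=...> seen in a page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths against the address of the page itself.
            self.image_urls.append(urljoin(self.page_url, src))

def find_image_urls(page_url, html):
    """Return the image locations found in one page's HTML."""
    collector = ImageLinkCollector(page_url)
    collector.feed(html)
    return collector.image_urls

sample_page = """
<html><body>
  <img src="/logo.png">
  <img src="pics/cat.gif" alt="a cat"/>
  <img src="http://other.example/dog.jpg">
  <img alt="no source here">
</body></html>
"""

for url in find_image_urls("http://example.com/gallery/index.html", sample_page):
    print(url)
```

The list of locations this produces is what a second stage would work through: download each image, extract its pieces, and store them alongside the URL.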
We wrote most of the web page HTML ourselves, but eventually we were helped by consultants for the finishing touches, like improving the CSS and enhancing the page layout. Technical web issues we overcame included using AJAX to upload the image, perform the search and return results. This involves JavaScript, PHP, and Java, a fairly standard "stack", which enables us to solve problems using plenty of online advice from sources like StackOverflow.com.
Conclusion: our image upload is remarkably easy for users. Search for an image by pasting from the clipboard or using file upload. (Confession: we still have work to do for mobile platforms.) So far, our robots have harvested over one million images. However, the web has untold billions of images, and we're wrestling with the resource problem: we need many more servers. We hope to grow by gaining "traction" and investing in more hardware. Thanks for letting us share our experience. We very much appreciate any and all feedback. If you're curious, please try out our site or watch a 2-minute whirlwind demo video on YouTube, which dashes through some "reverse image lookups".