Search by color
I will explain how search by color works.
A few features are worth highlighting, because no other gallery I know of offers them:
- Search by multiple colors at the same time.
- A single color can be added more than once. That way we can increase or decrease the density of one color. For example, black can be weighted three times more than white.
- Search by keyword combined with color weighting. This search combination uses only Sphinx and does not rely on the MySQL database. I will explain later how this was achieved.
- Search for images with similar color density. It simply searches for other images whose color densities are close to those of the given image.
- Search modes: include and exclude. You may want images that contain yellow (include) but do not contain black (exclude). Example (screenshots must contain green, but must not contain black): http://pixwebs.com/gallery/color/(color)/116/(ncolor)/120
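The weighting and include/exclude ideas from the list above can be sketched roughly as follows. This is only an illustration in Python, not the gallery's actual code: the index layout (`images` as a dictionary of palette id to pixel count), the function names and the scoring formula are all assumptions. The palette ids 116 and 120 are taken from the example URL.

```python
from collections import Counter

def color_weights(selected):
    """Turn repeated palette ids into weights, e.g. [1, 1, 1, 2] -> {1: 0.75, 2: 0.25}."""
    counts = Counter(selected)
    total = sum(counts.values())
    return {color: n / total for color, n in counts.items()}

def search(images, include, exclude=()):
    """Rank images that contain every included color and none of the excluded ones.

    images: {image_id: {palette_id: pixel_count}} (hypothetical index layout)
    """
    weights = color_weights(include)
    results = []
    for image_id, density in images.items():
        if any(color in density for color in exclude):
            continue  # exclude mode: drop images that contain an unwanted color
        if not all(color in density for color in weights):
            continue  # include mode: every requested color must be present
        total = sum(density.values())
        # Assumed scoring: weighted share of the requested colors in the image
        score = sum(w * density[c] / total for c, w in weights.items())
        results.append((score, image_id))
    results.sort(reverse=True)
    return [image_id for _, image_id in results]

images = {
    "a": {116: 50, 120: 50},  # green and black
    "b": {116: 80, 7: 20},    # mostly green
    "c": {116: 30, 7: 70},    # a little green
}
ranked = search(images, include=[116], exclude=[120])
```

Here image "a" is dropped because it contains the excluded color, and the remaining images are ordered by how much of the included color they contain.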
The search by color algorithm is based on information found at:
The search by color functionality also introduces a few new settings in the settings file:
'color_search' => array (
    'search_enabled' => true,
    'delay_index' => false,
    'memory_table' => false,
    'minimum_color_match' => 25,
    'maximum_filters' => 8,
    'color_indexer_external' => false,
    'color_indexer_path' => './bin/color_indexer/color_indexer',
    'delay_index_portion' => 1000,
    'max_matches' => 1000,
    'database_handler' => true,
    'extended_search' => false,
    'inside_width' => 68,
    'inside_height' => 68,
    'roi_form' => 'rectangle'
),
- search_enabled - enables or disables search by color.
- delay_index - activates delayed color indexing. Because color indexing is CPU and MySQL intensive, it can be done in the background by the main gallery cronjob.
- memory_table - uses a memory table for sorting matched images. With a small number of photos I suggest keeping it false, but if your gallery has more than 10k images I suggest activating this feature. Just make sure that your MySQL installation supports Engine = Memory.
- minimum_color_match - images are indexed from thumbnail-size images. By default this size is 120x130, so the script has to index 15600 pixels for each image. This variable sets how many pixels of the same color have to be matched for the color to be stored in the database index. It is the color threshold. Changing this value requires re-indexing all images.
- maximum_filters - defines how many color filters a user can use at the same time. By default it is 8.
- color_indexer_external - if you are running on a dedicated or VPS server, you can enable the external color indexer. This small application was written by me and gives a 25x speed improvement over the standard PHP and MySQL implementation. This feature works only on Linux machines. Read more about color_indexer
- color_indexer_path - defines the location of the executable.
- delay_index_portion - when using the delay_index feature, this defines how many records can be processed in a single cronjob cycle. If you are using the standard color indexing feature, I suggest keeping this number relatively low to avoid server overload, but if you are using the provided color_indexer I suggest increasing it to 500 or more.
- max_matches - the maximum number of matches when multiple filters are used. Because multiple color filters involve heavy internal MySQL work, this is limited to 1000 by default: only the first 1000 images are returned and displayed, and the default sort order of these images is from newest to oldest. We may lose some images, but performance increases dramatically.
- database_handler - by default all color searches are done by MySQL, but if you set this variable to false, Sphinx is used as the color search handler. Sphinx produces better results and does not use max_matches. It provides much better and more stable performance when more than one color is searched at a time. I definitely recommend setting this to false when you have Sphinx running.
- extended_search - by default, when searching by multiple colors without keywords, all sorting is based on keyword density using the Sphinx word_count ranker. With multicolor search this produces slightly worse results than the MySQL search handler. To use another algorithm, which is based on the same relevance ranking as the MySQL handler, you have to do a few things:
- Create a view based on this view.
- Update the Sphinx index configuration according to this configuration sample.
- Set extended_search to true.
- Notice: this search produces more relevant results for multicolor searches, but it increases the load on MySQL, mainly CPU, because of the dynamic color density calculation. It is up to you whether to use it. If you do not care much how precisely multicolor search works with Sphinx, I suggest keeping this set to false.
- inside_width - the percentage of the original thumbnail width to use for indexing. I recommend keeping this setting around 68%. Reference: http://www.cs.cmu.edu/~har/visapp2006.pdf
- inside_height - the percentage of the original thumbnail height to use for indexing. I recommend keeping this setting around 68%. Reference: http://www.cs.cmu.edu/~har/visapp2006.pdf
- roi_form - the shape of the region of interest. It can be rectangle or ellipse.
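To make minimum_color_match, inside_width, inside_height and roi_form more concrete, here is a rough Python sketch of how they could interact. It is not the gallery's indexer: it assumes the region of interest is centered in the thumbnail, that each pixel has already been mapped to a palette id, and the function names are made up for the illustration.

```python
from collections import Counter

def roi_pixels(width, height, inside_width=68, inside_height=68, roi_form="rectangle"):
    """Yield (x, y) coordinates that fall inside a centered region of interest."""
    half_w = width * inside_width / 100 / 2
    half_h = height * inside_height / 100 / 2
    cx, cy = width / 2, height / 2
    for y in range(height):
        for x in range(width):
            # Offset of the pixel center from the thumbnail center,
            # scaled so that 1.0 lies on the edge of the region
            dx = (x + 0.5 - cx) / half_w
            dy = (y + 0.5 - cy) / half_h
            if roi_form == "ellipse":
                inside = dx * dx + dy * dy <= 1
            else:
                inside = abs(dx) <= 1 and abs(dy) <= 1
            if inside:
                yield x, y

def index_colors(pixels, width, height, minimum_color_match=25, **roi):
    """pixels[y][x] is a palette id; keep only colors that reach the threshold."""
    counts = Counter(pixels[y][x] for x, y in roi_pixels(width, height, **roi))
    return {color: n for color, n in counts.items() if n >= minimum_color_match}

# How many of the 15600 pixels of a 120x130 thumbnail are actually inspected
rectangle_count = sum(1 for _ in roi_pixels(120, 130))
ellipse_count = sum(1 for _ in roi_pixels(120, 130, roi_form="ellipse"))
```

With the default 68% settings only the central part of the thumbnail is counted, and an ellipse covers fewer pixels than a rectangle of the same size. Colors that appear on fewer than minimum_color_match pixels inside the region never reach the index.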
To get maximum performance from color search, the main color index table can be partitioned using the MySQL partitioning feature. If you know that your MySQL server supports partitions, I suggest running this query:
ALTER TABLE lh_gallery_pallete_images PARTITION BY HASH(pallete_id) PARTITIONS 200;
Each color in the palette will have its own partition, which will boost search and indexing speed. If anyone has a better solution for storing image color indexes, I'm ready to listen.
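For an integer column, MySQL HASH partitioning picks the partition as the column value modulo the number of partitions. The small Python sketch below only mimics that rule to show why every color ends up in its own partition. The helper name is made up, and the range of palette ids used here (1 to 200) is an assumption, since the palette size is not stated above.

```python
def partition_for(pallete_id, partitions=200):
    """Mimic MySQL HASH partitioning on an integer column: MOD(pallete_id, partitions)."""
    return pallete_id % partitions

# Palette ids from the example URL land in partitions 116 and 120
green_partition = partition_for(116)
black_partition = partition_for(120)

# With at most 200 consecutive palette ids, no two colors share a partition
used = {partition_for(pallete_id) for pallete_id in range(1, 201)}
```

If the palette had more than 200 colors, some colors would start sharing partitions, so the number of partitions should be at least the number of palette colors.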
Best practices for best performance
- Partition the MySQL table; it will boost indexing speed. The biggest table I have has 9.5M records.
- Disable searching through the database layer (MySQL) by setting database_handler to false.
- Set color_indexer_external to true; it will boost indexing speed.
Enjoy the amazing usability and a feature that no other gallery has.