In some embodiments, image spam is identified by comparing color histograms of suspected spam images with color histograms of reference (known) images. The histogram comparison includes comparing a first color content in a query image with a range of similar color contents in the reference image. For example, a pixel count for a given color in the query image may be compared to pixel counts for a range of similar colors in the reference image. A histogram distance between two images may be determined according to a computed pixel count difference between the given query histogram color and a selected color in the range of similar reference histogram colors.
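The comparison of one query color against a range of similar reference colors can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the patented implementation: the function names, the quantization into 8 bins per RGB channel, and the choice of the neighbor with the smallest count difference as the "selected color" are all hypothetical.

```python
from collections import Counter
from itertools import product

def color_histogram(pixels, bins_per_channel=8):
    """Count pixels per quantized (r, g, b) bin; channel values are 0..255.

    The 8-bins-per-channel quantization is an assumption made for this sketch.
    """
    step = 256 // bins_per_channel
    return Counter((r // step, g // step, b // step) for r, g, b in pixels)

def neighbor_bins(color, delta=1, bins_per_channel=8):
    """Yield every valid bin within +/- delta of `color` on each channel."""
    for offset in product(range(-delta, delta + 1), repeat=3):
        candidate = tuple(c + o for c, o in zip(color, offset))
        if all(0 <= v < bins_per_channel for v in candidate):
            yield candidate

def closest_count_difference(query_hist, reference_hist, color, delta=1):
    """Smallest |query count - reference count| over the similar-color range.

    Picking the neighbor with the smallest difference is one possible way
    to choose the "selected color"; the source does not prescribe it.
    """
    q = query_hist[color]
    return min(abs(q - reference_hist[n]) for n in neighbor_bins(color, delta))

# A query image whose red is slightly brighter than the reference image's red.
query = color_histogram([(250, 0, 0)] * 10)
reference = color_histogram([(220, 0, 0)] * 9)
exact = closest_count_difference(query, reference, (7, 0, 0), delta=0)
fuzzy = closest_count_difference(query, reference, (7, 0, 0), delta=1)
```

With `delta=0` only the identical color is compared, so a small color shift makes the two images look unrelated; with `delta=1` the shifted red in the reference falls inside the evaluated range and the count difference becomes small.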
1. An image classification method comprising employing a computer system comprising at least one processor to perform the steps of: performing a plurality of subtraction operations between a pixel count of a first color in a first image and a corresponding plurality of pixel counts in the second image, the plurality of pixel counts in the second image representing pixel counts for a range of colors centered about the first color, the first image and the second image being selected from a reference image and a query image, the first image being different from the second image; and classifying the query image according to a plurality of results of the plurality of subtraction operations.
2. The method of claim 1, further comprising determining whether an electronic communication comprising the query image is spam or non-spam according to a result of classifying the query image.
3. The method of claim 2, wherein the electronic communication comprises an electronic mail message.
4. The method of claim 1, wherein the range of colors is determined by three ranges of basis color contents centered substantially about a vector defined by three basis color contents of the first color.
5. The method of claim 1, further comprising determining an image distance between the first and second images according to the plurality of results of the plurality of subtraction operations, and determining whether the electronic communication is spam or non-spam according to the distance.
6. The method of claim 5, wherein determining the distance comprises comparing the plurality of results of the plurality of the subtraction operations to a color bin similarity threshold.
7. The method of claim 5, wherein determining the distance comprises determining a distance indicator D(h,g) substantially according to

$$D(h,g) = \frac{\displaystyle\sum_{A}\sum_{B}\sum_{C} \min\bigl(h(a,b,c),\, g(a\pm\delta,\, b\pm\delta,\, c\pm\delta)\bigr) \;\Bigm|\; \frac{\lvert h(a,b,c) - g(a\pm\delta,\, b\pm\delta,\, c\pm\delta)\rvert}{\max\bigl(h(a,b,c),\, g(a\pm\delta,\, b\pm\delta,\, c\pm\delta)\bigr)} \le \Delta}{\min(h,\, g)},$$

wherein h and g denote histogram representations selected from a query histogram representation of the query image and a reference histogram representation of the reference image, g being distinct from h, wherein a, b, and c are basis colors, wherein δ is a basis color evaluation range, and wherein Δ is a bin similarity threshold.
8. The method of claim 1, further comprising performing an identical-color subtraction operation between the pixel count of the first color in the first image and a pixel count of the first color in the second image, and classifying the query image according to a result of the identical-color subtraction operation.
9. A non-transitory computer-readable storage medium encoding instructions which, when executed on a computer system, cause the computer system to perform the steps of: performing a plurality of subtraction operations between a pixel count of a first color in a first image and a corresponding plurality of pixel counts in the second image, the plurality of pixel counts in the second image representing pixel counts for a range of colors centered about the first color, the first image and the second image being selected from a reference image and a query image, the first image being different from the second image; and classifying the query image according to a plurality of results of the plurality of subtraction operations.
10. The storage medium of claim 9, wherein the instructions further cause the computer system to determine whether an electronic communication comprising the query image is spam or non-spam according to a result of classifying the query image.
11. The storage medium of claim 10, wherein the electronic communication comprises an electronic mail message.
12. The storage medium of claim 10, wherein the range of colors is determined by three ranges of basis color contents centered substantially about a vector defined by three basis color contents of the first color.
13. The storage medium of claim 9, wherein the instructions further cause the computer system to determine an image distance between the first and second images according to the plurality of results of the plurality of subtraction operations, and determining whether the electronic communication is spam or non-spam according to the distance.
14. The storage medium of claim 13, wherein determining the distance comprises comparing the plurality of results of the plurality of subtraction operations to a color bin similarity threshold.
15. The storage medium of claim 13, wherein determining the distance comprises determining a distance indicator D(h,g) substantially according to

$$D(h,g) = \frac{\displaystyle\sum_{A}\sum_{B}\sum_{C} \min\bigl(h(a,b,c),\, g(a\pm\delta,\, b\pm\delta,\, c\pm\delta)\bigr) \;\Bigm|\; \frac{\lvert h(a,b,c) - g(a\pm\delta,\, b\pm\delta,\, c\pm\delta)\rvert}{\max\bigl(h(a,b,c),\, g(a\pm\delta,\, b\pm\delta,\, c\pm\delta)\bigr)} \le \Delta}{\min(h,\, g)},$$

wherein h and g denote histogram representations selected from a query histogram representation of the query image and a reference histogram representation of the reference image, g being distinct from h, wherein a, b, and c are basis colors, wherein δ is a basis color evaluation range, and wherein Δ is a bin similarity threshold.
16. The storage medium of claim 9, wherein the instructions further cause the computer system to perform an identical-color subtraction operation between the pixel count of the first color in the first image and a pixel count of the first color in the second image, and to classify the query image according to a result of the identical-color subtraction operation.
17. An image classification method comprising employing a computer system comprising at least one processor to perform the steps of: performing a plurality of comparisons between a pixel count of a first color in a first image and each of a plurality of pixel counts of the second image, the first image and the second image being selected from a reference image and a query image, the first image being different from the second image, the plurality of pixel counts of the second image representing pixel counts for a range of colors centered about the first color; and classifying the query image according to the plurality of comparisons.
18. The method of claim 17, further comprising determining whether an electronic communication comprising the query image is spam or non-spam according to a result of classifying the query image.
19. A non-transitory computer-readable storage medium encoding instructions which, when executed on a computer system, cause the computer system to perform the steps of: performing a plurality of comparisons between a pixel count of a first color in a first image and each of a plurality of pixel counts of the second image, the first image and the second image being selected from a reference image and a query image, the first image being different from the second image, the plurality of pixel counts of the second image representing pixel counts for a range of colors centered about the first color; and classifying the query image according to the plurality of comparisons.
20. The storage medium of claim 19, wherein the instructions further cause the computer system to determine whether an electronic communication comprising the query image is spam or non-spam according to a result of classifying the query image.
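The distance indicator D(h,g) recited in claims 7 and 15, together with the classification step of claims 1 and 2, might be sketched as follows. This is a hedged illustration rather than the claimed method itself. Several points are assumptions of the sketch: histograms are plain dictionaries mapping quantized (a, b, c) bins to pixel counts; among the bins in the ±δ range, the one with the smallest relative difference is used; the normalizing term min(h, g) is read as the smaller of the two total pixel counts; and the names `fuzzy_distance`, `classify`, `bin_threshold`, and `spam_threshold`, along with their default values, are hypothetical.

```python
from itertools import product

def fuzzy_distance(h, g, delta=1, bin_threshold=0.2):
    """Similarity-style indicator between histograms h and g (dict: bin -> count).

    For each color in h, the bins of g within +/- delta on every basis color
    are examined. A bin qualifies when |h - g| / max(h, g) <= bin_threshold;
    the qualifying bin with the smallest relative difference contributes
    min(h, g) to the sum (an assumed tie-breaking rule).
    """
    total = 0
    for color, h_count in h.items():
        if h_count == 0:
            continue
        best = None  # (relative difference, g_count) of the closest qualifying bin
        for offset in product(range(-delta, delta + 1), repeat=3):
            neighbor = tuple(c + o for c, o in zip(color, offset))
            g_count = g.get(neighbor, 0)
            rel_diff = abs(h_count - g_count) / max(h_count, g_count)
            if rel_diff <= bin_threshold and (best is None or rel_diff < best[0]):
                best = (rel_diff, g_count)
        if best is not None:
            total += min(h_count, best[1])
    # Assumed normalization: the smaller of the two total pixel counts.
    norm = min(sum(h.values()), sum(g.values()))
    return total / norm if norm else 0.0

def classify(query_hist, reference_hists, spam_threshold=0.8):
    """Label the query as spam if it is close enough to any known spam histogram."""
    score = max((fuzzy_distance(query_hist, ref) for ref in reference_hists),
                default=0.0)
    return "spam" if score >= spam_threshold else "non-spam"

# The query's red bin is shifted by one bin relative to the reference.
query = {(7, 0, 0): 10, (0, 7, 0): 5}
reference = {(6, 0, 0): 9, (0, 7, 0): 5}
label = classify(query, [reference])
```

In this reading, a larger value of D(h,g) means the two histograms overlap more, so the hypothetical `spam_threshold` is applied as a lower bound. Setting `delta=0` reduces the comparison to the identical-color case of claims 8 and 16.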
July 5, 2010
December 18, 2012