Finding the Name for Color

Zhenghao Wu

2022-02-23

Status: Finished Confidence: likely Importance: 8

Post Details

This post is part 3 of 3 in the Project series.

This topic first came to mind when I was a college freshman. I don't quite remember which application I was going to embed it in, but the goal was clear: given a color in HEX or RGB format, find a name that describes it accurately.

Recently, I saw a forum thread in which a color-blind person was looking for a way to identify colors. I immediately recalled the experience I had almost five years ago.

But after some prototype coding, I found that the results were not as good as I expected, which drove me to learn more about the topic.

This article is the result.

Understand Color (1)

We won't discuss the basics of color itself, just the numeric representation used by the RGB color model. You can learn more about color on Wikipedia.

First, we need to know how to represent a color in the computer. The RGB color model describes a color as a mixture of three primary colors: Red, Green, and Blue. Each primary has an integer intensity between 0 and 255 (0 means the primary is absent, 255 means it is at maximum intensity). With these three numbers, we can specify a color.

A representation of RGB color mixing. Projection of primary color (Red, Green, Blue) lights on a white wall show secondary colors where two overlap; the combination of all three in equal intensities makes white. Wikimedia Commons, by User:Bb3cxv

In HTML, pure white under this model is rgb(255, 255, 255); the comma-separated values are the red, green, and blue intensities, in that order. Converting each value from decimal to two-digit hexadecimal gives a tidier form, #FFFFFF. The interactive slider below lets you play with the intensity of each primary color and see the mixing result.
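The decimal-to-hexadecimal step can be sketched in a few lines of Python. The helper names `rgb_to_hex` and `hex_to_rgb` are chosen here for illustration; this is a minimal sketch that assumes six-digit `#RRGGBB` strings and integer channels.

```python
def rgb_to_hex(r, g, b):
    """Format three 0-255 channel intensities as a '#RRGGBB' string."""
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError(f"channel out of range: {channel}")
    # :02X writes each channel as two uppercase hexadecimal digits.
    return f"#{r:02X}{g:02X}{b:02X}"


def hex_to_rgb(hex_str):
    """Parse a '#RRGGBB' string back into an (R, G, B) tuple of ints."""
    h = hex_str.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


print(rgb_to_hex(255, 255, 255))  # #FFFFFF
print(hex_to_rgb("#6A4764"))      # (106, 71, 100)
```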


Interactive: Color Mixing


Color Naming

With three primary colors, each with 256 possible intensities, we have $256^3 = 16777216$ possible colors. Common sense tells us that we don't have anywhere near that many color names. We can, however, draw on several widely used color naming schemes that provide a variety of names.

Two easily accessible examples are the X11 color names and CSS Color Module Level 3 (Basic, Extended). The CSS names are especially handy because you can use them directly when writing web pages.

PANTONE is a proprietary and well-known color naming system. The company puts effort into standardizing colors, and its color matching system is used all around the world. The general public mostly knows it through the annual “Color of the Year”, published since 2000. The color of the year for 2022 is PANTONE 17-3938 Very Peri.

Pantone Color of the Year 2022: PANTONE 17-3938 Very Peri

In our case, we need color names and their corresponding RGB values, so I made a Chinese color name mapping file using the list from Wikipedia. I also made an English one, which I stored here.

The mapping files I created contain 248 colors named in Chinese and 977 colors named in English. Personally, I find some of the English color names less intuitive, and as a native Chinese speaker I am more familiar with the Chinese ones. So in the following color naming examples, I will use the Chinese name mapping file (with a translation or description).

Distance Between Colors

Consider the RGB color space. If we use the intensity of each primary color as an axis of a rectangular coordinate system, we get a cube with side length 255. The $256^3$ integer points in the cube (including its surface) represent the different colors.

The RGB color model mapped to a cube. The horizontal x-axis as red values increasing to the left, y-axis as blue increasing to the lower right and the vertical z-axis as green increasing towards the top. The origin, black is the vertex hidden from view. Wikimedia Commons, by SharkD

When we look for a name for a color and there is no exact match in the name mapping, we can find a similar one by calculating the “distance” to every named color and picking the name whose color has the smallest distance.

The distance here is the $L_2$ distance (Euclidean distance). For $C_1: [R_1, G_1, B_1]$ and $C_2: [R_2, G_2, B_2]$:

$$d([R_1, G_1, B_1], [R_2, G_2, B_2]) = \sqrt{(R_1-R_2)^2+(G_1-G_2)^2+(B_1-B_2)^2}$$

The algorithm traverses the list, calculates the distance between the query color $C_1$ and every color $C_k$ in it, and uses the color with the minimum distance $\min_k d([R_1, G_1, B_1], [R_k, G_k, B_k])$ as the closest match.

I wrote the following code to carry out the procedure above.

import math

import pandas as pd


def hex_to_rgb(hex_str):
    """Convert '#RRGGBB' to an (R, G, B) tuple of ints in 0-255."""
    h = hex_str.lstrip('#')
    return tuple(int(h[i:i+2], 16) for i in (0, 2, 4))


def calculate_distance(cor1, cor2):
    """Euclidean (L2) distance between two hex colors in RGB space."""
    c1 = hex_to_rgb(cor1)
    c2 = hex_to_rgb(cor2)
    return math.sqrt((c1[0]-c2[0])**2+(c1[1]-c2[1])**2+(c1[2]-c2[2])**2)


def closest_n_color(c_list, candidate, n=10):
    """Return the n named colors closest to candidate as [name, distance] pairs."""
    distances = [[name, calculate_distance(candidate, _hex)] for name, _hex in c_list]
    return sorted(distances, key=lambda c: c[1])[:n]


if __name__ == "__main__":
    # Each CSV row is unpacked above as (name, hex value).
    match_list = pd.read_csv('color-name-mapping-cn.csv').values.tolist()
    source = '#FFFFFF'
    print(closest_n_color(match_list, source))

Trying #FFFFFF (white), the top three matches the program returns are: [['白色 (White)', 0.0], ['雪色 (Snow White)', 7.0710678118654755], ['幽灵白 (Ghost White)', 9.899494936611665]]. Apart from the exact match, the other two also refer to essentially the same color: white. So it seems that our little program works perfectly!

Problem

But then I tried another color, #6A4764. To my eye, it sits between dark magenta and dark purple. The program's result is not nearly as close: it returns iron-gray, dust gray, and dark rock blue. Looking at the color distances, dark magenta (#8B008B) has $d = 87.46$, while the three returned colors are much closer ($[25.15, 34.38, 52.69]$).
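The dark magenta distance can be reproduced by hand. The sketch below is self-contained and uses `math.dist` (Python 3.8+) instead of the explicit formula; the helper name `rgb_distance` is chosen here for illustration. The channel differences are 33, 71, and 39, so $d = \sqrt{33^2 + 71^2 + 39^2} = \sqrt{7651}$, which is the value quoted above up to rounding.

```python
import math


def hex_to_rgb(hex_str):
    """Convert '#RRGGBB' to an (R, G, B) tuple of ints in 0-255."""
    h = hex_str.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def rgb_distance(hex_a, hex_b):
    """Euclidean distance between two hex colors in RGB space."""
    return math.dist(hex_to_rgb(hex_a), hex_to_rgb(hex_b))


query = "#6A4764"
dark_magenta = "#8B008B"

print(hex_to_rgb(query))         # (106, 71, 100)
print(hex_to_rgb(dark_magenta))  # (139, 0, 139)
print(round(rgb_distance(query, dark_magenta), 2))  # 87.47
```

The large gap comes mostly from the green channel (71 versus 0): dark magenta is a fully saturated color, while the query is a muted one, so in RGB space the grays end up closer.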

Program Generated Top three matches

Why does this happen?

Understand Color (2)

I tried visualizing the named colors in 3D space. They are not uniformly distributed: some are clustered together, while some regions are sparse, with almost no named colors. This means some colors may not be easy to describe.

Named Colors in 3D space

The other problem is that the RGB model is hard to reason about. When manipulating a color, how do we shift it toward another color or make it “more of that color” (in terms of hue and saturation, not just red, green, or blue)? How do we make it brighter or darker?

For such tasks, the HSV color model is preferred over RGB. The HSV color space is a transformation of the RGB color space.

Visualization of the conversion between the RGB and HSV color models. Video from Wikimedia Commons, by VerbaGleb.

As shown in the video, the color cube of the RGB model is converted into a cylinder. On the cylinder's cap, we can easily see the distinguishable colors. Colors at the top are the brightest and get darker toward the bottom. If we slice out a horizontal layer of the cylinder, colors near the center are more faded and colors near the exterior are more vivid.

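This verbal description of the HSV color cylinder corresponds to its three variables. Hue is the angle around the cylinder's axis and tells us which color it is. Saturation is the distance from the central axis and tells us how vivid the color is. Value (or Brightness) is the height along the axis and tells us how bright the color is.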

Hue in the HSV color model helps distinguish colors, which you can try with the following interactive tool.


Interactive: HSV Color Mixing


RGB to HSV Conversion

First, scale $R, G, B$ to the range $[0, 1]$ by dividing each color channel by 255, giving $R'$, $G'$, and $B'$.

$$M=\max(R',G',B')$$

$$m=\min(R',G',B')$$

$$C= M - m$$

$C$ is also called chroma.

Then, we can calculate $H$, $S$, and $V$ with the following equations.

Hue

$$H=\begin{cases} 0\degree, & \text{if } C = 0 \\ 60\degree\times(\frac{G'-B'}{C} \mod 6), & \text{if } M = R' \\ 60\degree\times(\frac{B'-R'}{C} + 2), & \text{if } M = G' \\ 60\degree\times(\frac{R'-G'}{C} + 4), & \text{if } M = B' \end{cases}$$

Saturation

$$S=\begin{cases} 0, & \text{if } M = 0 \\ \frac{C}{M}, & \text{if } M \neq 0 \end{cases}$$

Value

$$V= M$$
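The equations above translate almost line by line into Python. This is a minimal sketch rather than a reference implementation: the function name `rgb_to_hsv` and the return convention (hue in degrees, saturation and value in $[0, 1]$) are choices made here. Python's `%` on floats returns a non-negative result for a positive divisor, which is what the $\mod 6$ in the red case needs.

```python
def rgb_to_hsv(r, g, b):
    """Convert 0-255 RGB to (H in degrees, S in [0, 1], V in [0, 1])."""
    rp, gp, bp = r / 255, g / 255, b / 255
    big_m = max(rp, gp, bp)
    small_m = min(rp, gp, bp)
    chroma = big_m - small_m

    if chroma == 0:
        hue = 0.0  # gray: hue is undefined, 0 by convention
    elif big_m == rp:
        hue = 60 * (((gp - bp) / chroma) % 6)
    elif big_m == gp:
        hue = 60 * ((bp - rp) / chroma + 2)
    else:
        hue = 60 * ((rp - gp) / chroma + 4)

    saturation = 0.0 if big_m == 0 else chroma / big_m
    value = big_m
    return hue, saturation, value


print(rgb_to_hsv(255, 0, 0))     # (0.0, 1.0, 1.0)
print(rgb_to_hsv(106, 71, 100))  # hue around 310 degrees, in the magenta range
```

The standard library's `colorsys.rgb_to_hsv` performs the same conversion, but it takes channels already scaled to $[0, 1]$ and returns the hue as a fraction of a full turn instead of degrees.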

Color Distance under HSV color space

Since the HSV color space is shaped like a cylinder, it is not a good idea to use a plain $L_2$ distance here. I found this solution on Stack Overflow for comparing colors in HSV space.

First, create a weighted function that converts the vector-like HSV triple to a single value.

$$f(H,S,V) = \sqrt{a\times H^2+b\times S^2+c\times V^2}$$

Then compare colors by the closeness of these values. The weights $a, b, c$ should reflect how strongly each component influences human judgments of color difference. The suggested weights are $a=b=1.0$ and $c=0.5$.
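One way to put the weights to work is sketched below. Note that this is a variation and not the exact formula above: it applies the same weights to the differences between two colors' components, and it treats hue as an angle, so that hues just below and just above $0\degree$ count as close. Both choices are assumptions of this sketch, as are the helper names. All three components are scaled to $[0, 1]$ by `colorsys.rgb_to_hsv`, so the hue does not dominate just because of its unit.

```python
import colorsys
import math


def hex_to_hsv(hex_str):
    """Convert '#RRGGBB' to (h, s, v), each component in [0, 1]."""
    h = hex_str.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) / 255 for i in (0, 2, 4))
    return colorsys.rgb_to_hsv(r, g, b)


def hsv_distance(hex_a, hex_b, a=1.0, b=1.0, c=0.5):
    """Weighted distance over HSV component differences (a sketch)."""
    h1, s1, v1 = hex_to_hsv(hex_a)
    h2, s2, v2 = hex_to_hsv(hex_b)
    dh = abs(h1 - h2)
    dh = min(dh, 1.0 - dh)  # hue is circular: take the shorter way around
    return math.sqrt(a * dh ** 2 + b * (s1 - s2) ** 2 + c * (v1 - v2) ** 2)


# Two reds on either side of the 0 degree hue boundary stay close together.
print(hsv_distance("#FF0000", "#FF002B"))
# Red and cyan sit on opposite sides of the hue circle.
print(hsv_distance("#FF0000", "#00FFFF"))
```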

Perceptually-uniform Color Space

The distance functions for the RGB and HSV color spaces can partially solve our problem, but they leave the fundamental issue untouched: neither RGB nor HSV models how humans perceive color. Specifically, color perception is non-linear and not exactly orthogonal.

Scientists have therefore tried to model the way humans see color and created relatively perceptually uniform color spaces, such as CIELAB and CIELUV.

CIELAB is the one I want to introduce. It represents a color using three values: $L$ for lightness, $a$ for the green-red channel, and $b$ for the blue-yellow channel. The two color channels are defined this way because they follow the opponent color model of human vision.

How the CIE modeled the CIELAB color space is quite complicated; I suggest checking out Wikipedia or the related papers.

For a more intuitive view of the color spaces, please try the following 3D color space visualization. You can switch between color spaces and manipulate the model.


Interactive: Color Space in 3D Powered by afc163/color3d

Color Distance in CIELAB Color Space

The relations between $L$, $a$, and $b$ are non-linear, to mimic the human non-linear response to color. The space itself, however, is uniform enough that differences can be measured with a simple $L_2$ distance. Before doing the calculation, you need to convert the RGB color to the CIELAB color space.

The conversion goes RGB -> CIEXYZ -> CIELAB.
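The two-step conversion can be sketched as follows. The article does not say which RGB space or white point it uses, so this sketch assumes 8-bit sRGB input and the D65 reference white, which is the common setup for screen colors; the function name `srgb_to_lab` is chosen here. The matrix and constants are the widely published sRGB/D65 values.

```python
def srgb_to_lab(r, g, b):
    """Convert 8-bit sRGB to CIELAB (assumes the D65 reference white)."""

    def linearize(channel):
        # Undo the sRGB transfer curve to get linear light in [0, 1].
        v = channel / 255
        return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Step 1: linear sRGB -> CIEXYZ (D65).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    # Step 2: CIEXYZ -> CIELAB, relative to the D65 white point.
    xn, yn, zn = 0.95047, 1.0, 1.08883

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


print(srgb_to_lab(255, 255, 255))  # close to L = 100, a = 0, b = 0
print(srgb_to_lab(106, 71, 100))   # the #6A4764 query color from earlier
```

White should land very close to $L = 100$ with $a$ and $b$ near zero, and black at $L = 0$, which makes a quick sanity check for the constants.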

Once you have the CIELAB values, calculating the “perceptual color distance” is easy.

$$\Delta E([L_1, a_1, b_1], [L_2, a_2, b_2]) = \sqrt{(L_1-L_2)^2+(a_1-a_2)^2+(b_1-b_2)^2}$$

The square root is unnecessary if you only need to find the closest color.
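A minimal sketch of this distance and of the square-root shortcut is shown below. It takes CIELAB triples directly; the function names and the two named entries are hypothetical placeholders, not values from the mapping files. Because the square root is monotonic, ranking by the squared distance picks the same closest color.

```python
import math


def delta_e_squared(lab1, lab2):
    """Squared Euclidean distance between two (L, a, b) triples."""
    return sum((p - q) ** 2 for p, q in zip(lab1, lab2))


def delta_e(lab1, lab2):
    """Euclidean color difference between two (L, a, b) triples."""
    return math.sqrt(delta_e_squared(lab1, lab2))


def closest_name(query_lab, named_labs):
    """Name of the closest entry in a {name: (L, a, b)} dict.

    Only the ranking matters here, so the squared distance is enough.
    """
    return min(named_labs, key=lambda name: delta_e_squared(query_lab, named_labs[name]))


# Hypothetical Lab values, for illustration only.
named = {"near": (50.0, 3.0, 4.0), "far": (80.0, 0.0, 0.0)}

print(delta_e((50.0, 0.0, 0.0), named["near"]))  # 5.0
print(closest_name((50.0, 2.0, 2.0), named))     # near
```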

This distance is usually called $\Delta E$ in color science, standing for color difference. When an electronic product with a screen claims a $\Delta E$ below a certain number, this is what it means (the figure is an average of the differences measured over multiple colors). A $\Delta E \leq 1$ is hard for humans to distinguish; a good monitor should have a $\Delta E \leq 3$.

Epilogue

This article is the first academic-style work I have finished without external pressure (such as school or work). Initially, I was not very familiar with the topic, and even after my research I still do not understand the complete picture, such as how CIELAB is modeled and converted. But I had a lot of fun writing the text and creating the visualizations and interactive modules.

Back to the color naming problem: I can give a good enough solution from what I have learned, but there is more that I have not covered and that could improve it.

I am not a professional in this area, so I hope I have not gotten too much wrong, and I hope you learned something, just as I did. You are welcome to leave a comment.
