Quantitative Performance Evaluation of Commonly Used Colormaps for Image Display in Myocardial Perfusion Imaging: Analysis based on Perceptual Metrics

Mohsen Qutbi

doi:10.4274/mirt.galenos.2024.34711

ABSTRACT

Objectives

To quantitatively evaluate the performance of the most used colormaps in image display using perceptual metrics and to what extent these measures are congruent with the true intensity or uptake of pixels at different levels of defect severity in simulated cardiac images.

Methods

Six colormaps, labeled “Gray”, “Thermal”, “Cool”, “CEqual”, “Siemens” and “S Pet” extracted from FIJI ImageJ software are included. Colormap data are converted from the red, green, blue color space to CIELAB. Perceptual metrics for measuring “color difference” were calculated, including difference (ΔE⁷⁶) and “speed”. The pairwise color difference in every two levels or entries is visualized in a 2-dimensional “heatmap distance matrix” for each colormap. Curves are plotted for each colormap and compared. In addition, to apply this technique to clinical images, simulated short-axis cardiac slices with incremental defect severity (10% grading) were employed. The circumferential profile curves of true pixel intensity, lightness or luminance, and color difference are plotted simultaneously for each defect severity to visualize the concordance of the three curves in various colormaps.

Results

In 0% defect, all the curves are at the highest level, except for “s pet”, in that the lightness is not at its maximum value. In the phantom with 10% defect (or 90% of maximum value), discrepancies among curves appear. In “Siemens” colormap, the ΔE⁷⁶ drops sharply. In 80% defect, the ΔE⁷⁶ curve in “gray” colormap drops more slowly than the curves of other colormaps. In “s pet”, the lightness curve rises paradoxically, although the count intensity and ΔE⁷⁶ curves match. In 70% defect, again, the curves are in good agreement in “thermal”, “Siemens” and “cequal”. However, a consistent lag exists in “gray”. Up to 50% defect, curves maintain their expected pattern, but in defects more severe than 40%, lightness and ΔE⁷⁶ curves in “cool” and “cequal” rise paradoxically, and in “thermal”, they start to slow down in descent. In “Siemens”, the falling pattern of the three curves continues. For “s pet” colormap, an erratic pattern of lightness and ΔE⁷⁶ curves exists.

Conclusion

Of 6 colormaps investigated for estimating defect severity, “grayscale” is less favorable than others and “thermal” performs slightly better. “s pet” or rainbow, which is used traditionally by many practitioners, is strongly discouraged. The “Siemens” colormap suffers from decreased discriminating power in the range of mild to moderate/severe. In contrast, the “cool” and “cequal” colormaps outperform the other colormaps employed in this study to some extent, although they have some shortcomings.

Keywords:

Colormap, look-up table, performance, perceptual metric, image display, quantitative analysis

Introduction

Image display or visualization is one of the key steps in medical image interpretation after necessary image processing and analysis. In most circumstances, readers apply various combinations of colors to gray-intensity images for various reasons, a process known as image pseudo-coloring. Visualization is generally enhanced and optimized depending on the purpose and application. Lesion-finding tasks and estimating the level of intensity (for example, the severity of defects in myocardial perfusion imaging) require the use of different colormaps. The former requires enhancing contrast and maximizing the conspicuity of the area of interest (or lesions), and the latter is based on the input-output relationship between the original data and the displayed image (1, 2, 3, 4). The input-output relationship is related to the intensity transformation of images and is defined as a linear or other simple nonlinear mathematical function (logarithmic or exponential) relating input to output. Except for the direct input-output relationship in the gray colormap, this relationship is non-linear for other colormaps and therefore compromises the estimation of relative intensity. This non-linearity between input and output frequently occurs with multiple-hue colormaps (5, 6, 7, 8). Furthermore, display in two- or three-dimensional modes using different methods of shading and rendering and fusing of images also affects one’s judgment (1, 9, 10). However, based on the application, particular colormaps may be preferred. Furthermore, colormaps that consist of different hues, saturation, and intensity may have a strong effect on the reader’s or interpreter’s perception. In most cases, medical imaging practitioners use commercially available colormaps embedded in the software by a vendor, and a given colormap is chosen because the reader is accustomed to it or on the basis of prior experience, preference, or convention. Despite this fact, domain-specific colormaps are widely used across various disciplines (3).

Frequently, colormaps are compared subjectively. Therefore, interpretation is subject to biased estimation because of the different interobserver perceptual impacts of colors. Because of the complex multi-faceted nature of the phenomenon of color perception, quantification and measurement are inherently difficult tasks, and the results may not satisfactorily reflect the true psychological consequences (9, 11, 12, 13, 14, 15, 16, 17). Owing to the reasons mentioned and difficulties in quantifying the characteristics, specifications, and behavior of different colormaps, few studies have investigated the properties of various colormaps and, more importantly, their implications in clinical settings such as nuclear medicine imaging. Evaluation of colormaps with more quantitative measures facilitates comparison and enables one to choose the right one. In addition, except for grayscale, colors in other colormaps are perceived differently in humans. This issue complicates the problem because using perceptual metrics and assessing colors in various available color spaces are not straightforward. Employing mathematical modeling for the psychophysical and physiological aspects of different colors and their association in a sequential pattern may be beneficial to some extent. Of the various color spaces available, CIELAB (L*a*b*) and CIELUV are more compliant with human perception of colors with slight differences. In the CIELAB color space, the dominant feature is the perceptually uniform distances between colors. Therefore, the colors in the red, green, blue (RGB) color space can be converted to corresponding ones in the L*a*b* color space to examine the distance of colors in a sequential colormap (2, 3, 4, 18, 19). Thus, the evaluation of the compatibility of its pattern with the raw data values of pixels (or counts) in images is feasible. Then, according to the application, one can find the agreement and correlation between them.

The present study aims to quantitatively evaluate the performance of the most commonly used colormaps in image display and visualization using measures that are considered perceptual metrics. In addition, it examines the extent to which these measures are congruent with the true intensity or uptake of pixels over various values, i.e., different levels of defect severity in simulated cardiac single photon emission computed tomography (SPECT) or positron emission tomography (PET) images.

Materials and Methods

Mathematical Representation and Quantitative Metrics

Mathematically, a colormap is a discrete-valued function that relates the values as input to the corresponding values as output. Discretized input values, because of spatial sampling and quantization, are sorted and then “mapped” to another set of discrete values. For an 8-bit or 256-level colormap function, this process can be expressed mathematically as follows:

$$F: p_i \mapsto c_i, \quad c_i = F(p_i), \quad i = 1, 2, \dots, 256$$

In this case, index i represents the ith bin of the image histogram, and the one-to-one function, F, maps p_i to c_i as elements of the domain and codomain. By this means, a distinct color or shade of gray is assigned to each level. For colormaps consisting of a range of color hues, each level or entry is formed by a triplet of channel components (R, G, and B, representing red, green, and blue). Therefore, c_i equals [c_i(R), c_i(G), c_i(B)]. In gray shades, in contrast, all the values in each triplet are equal (4). To demonstrate the overall characteristics and major features of each colormap, a gray transformation is applied using the following formula (20):

$$g = \alpha f_R + \beta f_G + \gamma f_B$$

where the coefficients α = 0.2989, β = 0.5870, and γ = 0.1140 are weights according to the perceptual impacts of different wavelengths in humans, f is the original image with channels f_R, f_G, and f_B, and g is the gray-transformed image. The resulting gray intensity is graphed and compared.
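The gray transformation described above can be sketched in a few lines. This is only an illustration in Python (the study itself used MATLAB); the function name `gray_intensity` and the test ramp are hypothetical, and only the three weights come from the text.

```python
# Weights quoted in the text for the R, G, and B channels.
ALPHA, BETA, GAMMA = 0.2989, 0.5870, 0.1140

def gray_intensity(colormap):
    """Return the weighted gray value of every (R, G, B) entry of a colormap."""
    return [ALPHA * r + BETA * g + GAMMA * b for r, g, b in colormap]

# Hypothetical 256-level gray ramp: all three channels equal the level index.
gray_ramp = [(i, i, i) for i in range(256)]
gray_curve = gray_intensity(gray_ramp)
```

Because the three weights sum to 0.9999, a gray ramp is returned almost unchanged, which is the behavior expected of the target line for the “gray” colormap.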

Next, for each colormap, a conversion is implemented from the RGB color space to the CIE L*a*b* or CIELAB color space. First, the colormap in RGB space is converted to the corresponding one in XYZ color space, and the XYZ/L*a*b* transformation is conducted. To accomplish this transformation, a MATLAB built-in algorithm is employed. L* denotes the luminance or lightness and ranges from 0 to 100. Values of 0 and 100 specify black and white in the image. a* and b* indicate the perpendicular coordinate axes in the chromaticity plane. The a* axis determines the amount of red and green hues (in positive and negative directions, respectively). Likewise, the b* axis determines the number of yellow and blue hues (in positive and negative directions, respectively). For graphing, L* is demonstrated as a straight or curved line ranging from 0 to 100 as the target line, and the order of the color components in the colormap is depicted as a path (multiple segmented curved lines) in the a*b* plane.
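As a rough illustration of the two-step conversion (RGB to XYZ, then XYZ to L*a*b*), the sketch below assumes sRGB-encoded 8-bit input and a D65 reference white. Both are assumptions of this example; the settings of the MATLAB routine used in the study are not stated in the text, and the function names are hypothetical.

```python
# Assumed constants: linear-sRGB-to-XYZ matrix and D65 reference white.
M = ((0.4124564, 0.3575761, 0.1804375),
     (0.2126729, 0.7151522, 0.0721750),
     (0.0193339, 0.1191920, 0.9503041))
WHITE_D65 = (0.95047, 1.00000, 1.08883)

def srgb_to_linear(c8):
    """Undo the sRGB transfer curve for one 8-bit channel value."""
    c = c8 / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def lab_f(t):
    """Piecewise cube-root function of the CIELAB definition."""
    delta = 6.0 / 29.0
    return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3.0 * delta ** 2) + 4.0 / 29.0

def rgb_to_lab(rgb):
    """Convert one 8-bit (R, G, B) triple to (L*, a*, b*)."""
    r, g, b = (srgb_to_linear(c) for c in rgb)
    xyz = [row[0] * r + row[1] * g + row[2] * b for row in M]
    fx, fy, fz = (lab_f(v / w) for v, w in zip(xyz, WHITE_D65))
    return (116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz))

white_lab = rgb_to_lab((255, 255, 255))
black_lab = rgb_to_lab((0, 0, 0))
red_lab = rgb_to_lab((255, 0, 0))
```

Applying `rgb_to_lab` to each of the 256 entries of a colormap yields the L* curve and the path in the a*b* plane described above.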

To compare the impact of colors in neighboring or distant levels in the colormap, the pairwise “color difference” is computed by a distance metric in the Euclidean space as follows:

$$\Delta E^{76} = \sqrt{(L^*_2 - L^*_1)^2 + (a^*_2 - a^*_1)^2 + (b^*_2 - b^*_1)^2}$$

Here, ∆E⁷⁶ represents the distance between two colors in the CIELAB color space (version 1976). This metric is computed between each possible pair in the colormap. For an 8-bit colormap (256 levels or entries and a range of values from 0 to 255 in each level), a heatmap distance matrix with a size of 256×256 is computed, and the value of ∆E⁷⁶ is visualized as gray intensity at the intersection of the row and column of interest. The higher the value of that element in the matrix (pixel), the brighter the intensity of that pixel. Furthermore, the “speed” of color change between two arbitrary levels in the colormap (levels i and j, for example) is calculated using the following equation (4, 21):

$$V_{i,j} = \frac{\Delta E^{76}(c_i, c_j)}{|l_i - l_j|}$$

V_i,j is the speed between two arbitrary levels (l_i and l_j). Therefore, for the “local speed” between two successive levels (l_j and l_j-1), the equation is as follows:

$$V_j = \frac{\Delta E^{76}(c_j, c_{j-1})}{l_j - l_{j-1}}$$

Then, the average, standard deviation, maximum speed, and minimum speed are calculated (4).
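The distance matrix, local speed, and summary statistics described in this subsection can be sketched as follows. This is an illustrative Python example, not the study's MATLAB code. It assumes that levels are consecutive integer indices (so the level step is 1) and uses an idealized gray ramp whose L* rises linearly from 0 to 100, which a real display gray ramp need not do. All function names are hypothetical.

```python
import math
import statistics

def delta_e76(lab1, lab2):
    """Euclidean distance between two (L*, a*, b*) triples."""
    return math.dist(lab1, lab2)

def distance_matrix(lab_map):
    """Pairwise color difference between every two levels of a colormap."""
    n = len(lab_map)
    return [[delta_e76(lab_map[i], lab_map[j]) for j in range(n)] for i in range(n)]

def local_speed(lab_map):
    """Color difference between successive levels, divided by a level step of 1."""
    return [delta_e76(lab_map[j], lab_map[j - 1]) for j in range(1, len(lab_map))]

# Idealized gray ramp in CIELAB (assumption): linear L*, with a* = b* = 0.
gray_lab = [(100.0 * i / 255.0, 0.0, 0.0) for i in range(256)]

D = distance_matrix(gray_lab)      # 256 x 256 heatmap distance matrix
speed = local_speed(gray_lab)      # 255 local speed values
summary = {
    "mean": statistics.mean(speed),
    "std": statistics.pstdev(speed),
    "max": max(speed),
    "min": min(speed),
}
```

For this idealized ramp the local speed is constant, so the standard deviation is essentially zero; a colormap with abrupt hue transitions would show peaks in `speed` and a larger spread.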

Colormaps

In the present study, 15 colormaps were categorized into four groups according to general similarity in the hues included (i.e., the sequential pattern of hues and the gray intensity curves), as shown in Figure 1. Six colormaps that are frequently used in medical visualization were selected and analyzed. To perform this task, the original files for generating colormaps were extracted from FIJI ImageJ software, which is a Java-based framework for biomedical image processing and analysis developed by the National Institutes of Health (22). The 6 selected colormaps under investigation in this study are “Gray”, “Thermal”, “Cool”, “CEqual”, “Siemens” and “S Pet”. As can be seen in the upper panel of the figure, the colormaps are presented with the lower bound at the leftmost side and the upper bound at the rightmost side of each color bar. The gray intensity values of each colormap are plotted in the lower panel. In category 1, the gray intensity curves of “gray” and “thermal” fit the target line nearly perfectly. There is a consistent linear pattern over the entire spectrum. In category 2, “cool”, “cequal”, “ge” and “mmc” are present, and the patterns of the gray intensity curves are roughly similar. There is a linear pattern from the mid-range to the upper bound with a slope steeper than that of the target line. In the lower half, no consistent relationship exists. “cequal” reaches a plateau near the upper bound. In category 3, “Siemens”, “hot iron”, “warm metal” and “fire” are selected. There is an approximately linear pattern from the lower to upper bounds with some irregular fluctuations. Finally, in category 4, “s pet”, “rain bow”, “a squared”, “physics” and “prism” are included. There are wide fluctuations over the entire range of these colormaps, which are far from the target line.

These labels are the standard names that are well known by researchers and practitioners in biomedical disciplines, although some differences exist, particularly in non-medical contexts. The files in the format of .lut files are converted to .csv text files and are then imported into MATLAB software for processing and analysis. The file consisted of the values of RGB triples for 256 levels or entries. In each entry, values are in the range [0, 255] as integers that indicate the 8-bit format of colormaps. For visualization and analysis, necessary reformatting, transformation, and conversion from RGB to other color spaces are implemented. The scale is preserved as linear in the original data (no logarithmic, exponential, or other custom-function-based transformation is applied).

Cardiac Phantom

A simplistic model of the short axis of the left ventricle (LV) is designed in a 128×128 matrix. Gaussian blurring is then applied (with σ = 8 pixels). To simulate a perfusion defect, the short-axis phantom is divided into four sectors, namely anterior, septal, lateral, and inferior, each spanning 90°. The defect is placed in the anterior sector with a graded 10-percent increment from no defect (100% of maximal myocardial uptake) to absent uptake (0% of maximal myocardial uptake). In total, 9 phantoms with various levels of defect severity were obtained. This simplistic model is employed because, in cardiac images, the field is limited to the organ of interest (LV).
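A phantom of this kind might be built as in the following Python sketch. The matrix size, the four 90° sectors, and the Gaussian width of 8 pixels follow the text; the ring radii (30 and 42 pixels), the zero padding at the image border, and all function names are assumptions made only for illustration.

```python
import math

def make_phantom(uptake, size=128, r_in=30.0, r_out=42.0):
    """Ring-shaped short-axis slice; the anterior sector is scaled by `uptake` (0 to 1)."""
    c = (size - 1) / 2.0
    img = [[0.0] * size for _ in range(size)]
    for row in range(size):
        for col in range(size):
            dx, dy = col - c, c - row            # dy > 0 points to the top (anterior)
            if r_in <= math.hypot(dx, dy) <= r_out:
                angle = math.degrees(math.atan2(dy, dx))
                in_anterior = 45.0 <= angle < 135.0
                img[row][col] = uptake if in_anterior else 1.0
    return img

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at three standard deviations."""
    radius = int(3 * sigma)
    k = [math.exp(-0.5 * (x / sigma) ** 2) for x in range(-radius, radius + 1)]
    total = sum(k)
    return [v / total for v in k]

def blur_1d(line, kernel):
    """Convolve one row or column with the kernel, treating outside pixels as zero."""
    radius = len(kernel) // 2
    n = len(line)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < n:
                acc += w * line[j]
        out.append(acc)
    return out

def gaussian_blur(img, sigma=8.0):
    """Separable Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    rows = [blur_1d(r, kernel) for r in img]
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]

# Example: anterior wall at 50% of maximal myocardial uptake.
phantom = gaussian_blur(make_phantom(uptake=0.5))
```

Calling `make_phantom` with `uptake` values from 1.0 down to 0.0 in steps of 0.1 would give the graded series of defect severities.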

Tomographic Reconstruction

All phantoms are entered one by one during the process of tomographic reconstruction. Iterative maximum likelihood expectation maximization (MLEM) is used. An acquisition arc of 360° and angular sampling of 3° were set. Collectively, 120 projections were generated during the Radon transform. Then, the generated sinograms are back-projected (inverse Radon transform) to create the tomographic slices. This procedure is repeated 10 times (number of iterations: 10), and in each step, the tomographic slice is updated by dot multiplication of the tomographic slice of the previous iteration to the error image resulting from the element-by-element division of the measured and estimated sinograms. The initial (guess) image is considered to be a uniform one-valued image of the same size as the phantom.
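The multiplicative update at the heart of MLEM can be illustrated on a toy problem. The sketch below is not the study's MATLAB code: it replaces the Radon transform with a small hypothetical system matrix `A` (six rays through a 2×2 image), and it includes the usual division by the sensitivity image (the back-projection of ones), which the text does not mention explicitly and is therefore an assumption here.

```python
def forward(A, x):
    """Forward projection: estimated sinogram A x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def backward(A, y):
    """Back-projection: A^T y."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]

def mlem(A, measured, n_iter=10, eps=1e-12):
    """Iterative MLEM starting from a uniform one-valued image."""
    x = [1.0] * len(A[0])
    sensitivity = backward(A, [1.0] * len(A))
    for _ in range(n_iter):
        estimated = forward(A, x)
        # Element-by-element division of measured and estimated sinograms.
        ratio = [m / max(e, eps) for m, e in zip(measured, estimated)]
        correction = backward(A, ratio)
        # Multiply the previous image by the back-projected error image.
        x = [xi * ci / max(si, eps) for xi, ci, si in zip(x, correction, sensitivity)]
    return x

# Toy geometry (hypothetical): two row sums, two column sums, two diagonal sums.
A = [[1, 1, 0, 0],
     [0, 0, 1, 1],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 0, 0, 1],
     [0, 1, 1, 0]]
truth = [1.0, 2.0, 3.0, 4.0]
measured = forward(A, truth)
recon = mlem(A, measured, n_iter=10)
```

With noise-free data, each iteration keeps the image non-negative and preserves the total counts, which is why a uniform positive image is a convenient initial guess.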

Image Analysis and Plotting

For the analysis of tomographic images, maximum normalization is applied. Thus, the maximum value in each tomographic slice is set to one. This process is performed to ensure the comparability of images and curves and to avoid possible normalization errors. To visually assess the tomographic images and defects, they are displayed side-by-side using the 6 colormaps introduced previously. Each slice with a known perfusion defect in the anterior sector (or wall) is profiled by a vertical line crossing the anterior and inferior walls. The intensity profile is then plotted. To plot the circumferential intensity profile of the myocardium, 16 samples (non-uniformly dispersed spatially in the left and right halves of the LV) were selected. To this end, the radial profile of the walls is fitted to a Gaussian curve, and the sample points are chosen such that they reside at the maximum or peak of the radial profile. Then, the real count or intensity profile curve is drawn. In addition, for each colormap and each defect severity, the lightness or luminance curve and the ∆E⁷⁶ curve (as discussed before) are plotted for those sample points.

Implementation and Coding

Codes for designing phantoms, image acquisition, iterative tomographic reconstruction, and image analysis are written in the MATLAB programming language and run in MATLAB software (The MathWorks Inc., version 2021b). The colormap files (lookup tables in .lut text file format) are extracted from FIJI ImageJ software.

Results

In Figure 2, the heatmap Euclidean distance matrix is visualized for the 6 colormaps selected for more in-depth analysis. The entire range of the colormap is displayed along the main diagonal of the matrix, and the distance between each pair of levels or entries (i.e., color hues or shades of gray) is displayed as a shade of gray. The greater the distance between the two levels of a pair on the main diagonal of the matrix, the higher the intensity of the corresponding pixel (which is located at the intersection of the row and column of that pair). As can be seen in the “gray” colormap, the maximum distance is between the lower and upper bounds, as expected. The distance between neighboring pairs (that are close to each other) is too small to be recognized (barely distinguishable) by the human eye. In addition, there is a consistent pattern of distance over the entire range. In “thermal” colormap, an almost similar pattern exists, with the exception that the distance between the lower and upper bounds is more prominent than in “gray” colormap. In other words, the distance between each pair is augmented in “thermal” compared to “gray”. In the other colormaps, “cool”, “Siemens”, “cequal” and “s pet”, no regular pattern is visible. In “cool” colormap, the highest distance between pairs is in the mid-range of the map. The largest distance is between the orange and blue pixels. In the upper third of the spectrum, the distance is minimal. An almost similar pattern is evident in “cequal”. In “Siemens”, the lower third (blue pixels) is remarkably distant from the upper two-thirds (red-orange-yellow pixels), but the pixels in the upper two-thirds of the spectrum are minimally distant. In “s pet”, the greatest distance is between the lower and middle thirds (blue pixels and green-yellow pixels).

The curves of lightness or L* in the CIELAB color space and ∆E⁷⁶ for the 6 colormaps are provided in Figure 3 and Figure 4, respectively. As demonstrated in Figure 3, the y-axis and x-axis denote the absolute value of lightness in the CIELAB color space and the levels or entries from 1 to 256, respectively. The mid-value (50% of myocardial uptake) is indicated by a vertical green line in bold and splits the graph into two parts (lower half and upper half). The curves of “gray” and “thermal” are straight lines almost fitted to the target line. The curve of “Siemens” follows the target line to a high extent. The curve of “s pet” deviates far from the target line. In the upper half (myocardial uptake in the defect 50% to 100% of maximal myocardial uptake), the curves of “cool” and “cequal” rise almost linearly, but in the lower half, an irregular pattern is evident. In the upper half of the graph (>50%), the steepest line is for “cool”, but for “gray”, “thermal” and “Siemens”, the slope is lower and equals that of the target line. The curve for “s pet” is reversed. In Figure 4, the y-axis and x-axis denote the absolute value of the color difference in the CIELAB color space and the levels or entries from 1 to 256, respectively. The mid-value (50% of myocardial uptake) is indicated by a vertical green line in bold and splits the graph into two parts (lower half and upper half). Here, the ∆E⁷⁶ is the difference between the color at the uppermost bound and the color of an arbitrary level, so that the value at the rightmost part of the x-axis ends in 0. For the “gray” spectrum, the ∆E⁷⁶ curve is a straight line running from its maximum at level 1 to 0 at level 256. In “thermal” colormap, the curve plateaus in the first half (<50%) and gradually falls in the upper half (>50%). The curves of “cool” and “cequal” colormaps are similar in pattern, but the output range is wider compared to “thermal” (0-100 for “thermal” and 0-140 for “cool” and “cequal”). In the upper half, the curves are almost a straight line with a notch midway (about 75% of maximal uptake or 25% defect severity). The ∆E⁷⁶ curve for “Siemens” shows an irregular pattern in the lower half and remains steady in the upper half before a sharp decline at the end of the range. The curve plateaus over a wide range in the middle. A similar pattern is seen for “s pet” in the lower half, but the curve falls linearly in the upper half.

The local speed and its statistical measures for each colormap are shown in Figures 5 and 6. In Figure 5, the blue line indicates the original data, and the red curve in bold indicates the smoothed data. For “gray”, the speed is high in the leftmost part of the range and then decreases uniformly. In absolute terms, the changes in speed are profoundly low compared with those of other colormaps. In “thermal”, the speed is remarkably high in the rightmost and leftmost parts of the range. For “cool” and “cequal” colormaps, the speed is minimum in the upper and lower zones of the range, but in the middle, it demonstrates several peaks. For “Siemens”, the speed curve is irregular over its entire range. Finally, for “s pet”, the speed is uniform in the upper half and peaks sharply.

The tomographic slices reconstructed using the iterative MLEM method with graded defect severity are displayed in Figure 7. Each row is dedicated to one colormap. The incremental defect severity of the anterior wall was calculated as the percentage of maximal myocardial uptake (100% means no defect and 0% means no uptake). Here, the appearance of the defects, based on the color perceived by the reader, is to a high extent subjective, and readers estimate the severity of the defect based on their own experience and perception. Although this way of interpretation constitutes the main part of decision making in clinical settings, noticeable differences appear to exist among different colormaps. In “gray” and “thermal” colormaps, defects seem to the reader’s eye to change more slowly than in other colormaps, “cool”, “cequal” and “s pet” in particular. In other words, the perceptual impact of changes is dramatic, in close agreement with the results shown in Figures 2 and 4.

To make the results more objective, the colormaps were analyzed with more quantitative metrics. For this task, certain sample points are selected on the tomographic slice of the cardiac phantom (Figure 8), and the values of those pixels are analyzed in terms of relative count or intensity, lightness or luminance, and the ∆E⁷⁶. Figure 9A-F consists of 6 parts according to defect severity from 100% uptake (or no defect) to 50% of maximal uptake. The circumferential profiles of intensity or count, lightness, and ∆E⁷⁶ are plotted for various levels of defect severity. Thus, the three curves are comparable for each defect severity. In each part, the upper panel displays the tomographic image with its specific defect severity in six colormaps, and in the lower panel, the curves of count, lightness and ∆E⁷⁶ are plotted. In each plot, the profile starts from the inferior wall and then, through the septum, reaches the anterior wall (which is defective). Subsequently, by passing through the lateral wall, it reaches a similar point in the inferior wall. Thus, the defect is in the mid-profile. Each plot has three curves. The black curve denotes the true value of the sample points (or counts), and the red and cyan curves indicate lightness (L*) and ∆E⁷⁶, respectively. When there is no defect, all the curves are at the highest level (equal to 1 or 100%), except for “s pet”, in that the lightness is not at its maximum value. In the phantom with 10% defect (or 90% of maximum value), discrepancies among curves appear. In “Siemens” colormap, the ∆E⁷⁶ drops sharply. This pattern is concordant with the way it appears to the eye. In 80% defect, the ∆E⁷⁶ curve in “gray” colormap drops more slowly than the curves of other colormaps. Instead, the defect is more noticeable in “cool”, “Siemens” and “cequal”. In “s pet” colormap, the lightness curve rises paradoxically, although the count intensity and ∆E⁷⁶ curves match. In 70% defect, again, the curves are in good agreement in “thermal”, “Siemens” and “cequal”. However, a consistent lag exists in “gray” colormap. Up to 50% defect, the curves maintain their expected pattern, but in defects more severe than 40%, lightness and ∆E⁷⁶ curves in “cool” and “cequal” rise paradoxically, and in “thermal” they start to slow down in descent. In “Siemens”, the falling pattern of the three curves continues. For “s pet” colormap, an erratic pattern of lightness and ∆E⁷⁶ curves exists.

Discussion

Colormaps have long been used for color coding of inherently gray-intensity images. Broadly speaking, they are generally employed for scientific data visualization. The process of pseudo-coloring, apart from its aesthetical purposes, seeks ways to facilitate the conveyance of information qualitatively and semi-quantitatively (1, 2, 3, 4). For this reason, many colormaps have been designed, some of which are frequently used in medical imaging. In nuclear medicine, in particular, a few colormaps including “grayscale”, “inverted gray”, “thermal”, “cool”, “Siemens”, and traditionally, “rainbow” or “s pet” are of high interest among practitioners in this field (23, 24, 25). In other words, these colormaps are specific to this particular domain and are mostly used based on convention, convenience, accessibility, etc. Few studies have investigated the effect of using colormaps in clinical images. As mentioned before, the inherently complex and multi-aspect nature of the assessment of colors and the difficulty in devising accurate methods for quantification and modeling preclude much investigation on this topic.

Colors are mathematically modeled using several color spaces and particular measures. Of the several color models, CIELAB, developed by the Commission Internationale de l’Eclairage, is considered the most perceptually uniform. The model comprises a pyramidal geometry. The vertical axis indicates lightness, and the horizontal plane includes a coordinate system with a- and b-axes. The a-axis ranges the colors between red and green on the positive and negative sides, respectively. The b-axis includes colors between blue and yellow on the negative and positive sides. The colors of points that reside in the origin or on the lightness axis are a shade of gray between black and white. Distances between two colors (or points) are calculated in Euclidean space, as mentioned in the methods section. The main application of this color space is to create perceptually uniform colormaps. However, the related measures are employed as perceptual metrics. ∆E⁷⁶ is the first ever metric introduced in 1976 (4, 20, 21). This study was intended as a preliminary investigation for characterizing and specifying various colormaps and demonstrating their advantages and pitfalls. In cardiac SPECT and PET images, the colormaps are generally used to categorize the level of uptake or defect severity as well as the changes from one image to another corresponding image (for example, in comparing stress and rest images, termed as reversibility, or in serial imaging) (9, 24, 26). Therefore, the performance of colormaps in each of these applications can impact the process of decision making. Since there is not much knowledge and understanding about the precise performance of each image, it is expected that the interpretation of such images is highly subject to errors. More prevalently, readers are not fully aware of the advantages and disadvantages and of where to use each one based on the application. 
Considering all these facts, we intended to analyze their performance as quantitatively as possible. This warrants consistent objectivity in our comparative evaluation.

There are no perfect colormaps that are applicable to various purposes. Therefore, according to the basic characteristics and specifications of a colormap, the selection should be made. The performance of colormaps is assessed using several metrics, including distance, speed, linearity, uniformity, discriminative power, order, and smoothness (4, 20, 21). Unfortunately, there is no unanimously accepted nomenclature for these attributes and measures. This issue creates some ambiguities when comparing the colormaps. Despite this fact, we tried to use measures that are more clarified in definition, such as distance and speed. Heatmap Euclidean distance matrix is an easy-to-use method for visualizing pairwise distance between every two colors in the colormap at a glance. Because the distance is based on its perceptual impact, colormaps with softer fluctuations or variations are more desirable. This fact defines the property of uniformity and smoothness measured by local speed (the speed between the colors of two neighboring levels in colormap. Uniformity can be evaluated on the basis of statistical parameters, including standard deviation. According to our findings in Figure 3, Figure “gray” and “thermal” colormaps are perfectly linear. After that, “Siemens”, “cool” and “cequal” come. In this respect, “s pet” colormap is far from a linear property. Now, the question that arises here is which one is important, whole-range or partial-range linearity. The answer, to the best of our knowledge, lies in the application. Since in cardiac SPECT and PET images, the defects with a severity from 0% to 50% are much more of interest, the upper half of the range of the colormaps are of greatest importance (levels from 128 to 256). 
In semi-quantitative analysis, the grading of defect severity is as follows: mild 10%-25%, moderate 25%-50%, and severe >50% reduction in count compared with maximal myocardial uptake or equivalent semi-quantitative scores of 1, 2, and 3, respectively (26, 27). Thus, mild, moderate, and severe lesions are distinguishable. Although “gray”, “thermal” and “Siemens” as well as “cool” and “cequal” are all linear in the upper half, the slope of the curves of “cool” and “cequal” is steeper. This leads to higher and more accentuated discrimination or power. In technical terms, “cool” and “cequal” colormaps inherently have an effect similar to that of an exponential function during intensity transformation or a wider dynamic range. This can result in more distinguishing ability. In summary, the whole-range linearity of “gray” and “thermal” does not seem to be an advantage for the analysis of cardiac SPECT and PET images, in contrast to the common belief. It may be of more benefit in other applications because of maintaining whole-range linearity. In the figures, the upper and lower halves are separated by a green vertical line, which indicates 50% of the maximal myocardial uptake. For a more in-depth analysis, the distance is computed using the perceptual metric, ∆E⁷⁶. The distance, as shown in Figure 4, Figure is calculated based on the color of the maximal value or level 256 in the colormap as a reference. Therefore, the distance between all other levels with level 256 is plotted. Likewise, the output of the upper half in “gray” colormap ranges from 0 to 50 compared to “thermal” colormap, which ranges from 0 to 100, and is steeper in the uppermost sub-range. This finding also provides a more distinguishing ability or discriminative power to categorize lesions as mild or moderate. The results are even better for “cool” and “cequal”. In “Siemens”, the distance curve is remarkably steep at the rightmost sub-range, enabling the distinction of mild lesions from normal uptake. 
The findings presented in Figure 5 confirm these results. The speed in “gray” is too low (in the range of 0.2 to 0.5) to distinguish levels from each other. Considering this property, “cool”, “cequal” and “thermal” are more favorable compared to other colormaps.
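The distance-to-reference and speed curves discussed above can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: it assumes sRGB colormap entries with a D65 white point, takes “speed” to be the ∆E⁷⁶ between adjacent colormap levels, and uses a synthetic 256-level gray ramp in place of the FIJI ImageJ lookup table. All function names are hypothetical.

```python
import math

# Assumption: D65 reference white (2-degree observer), the usual choice for sRGB.
WHITE_D65 = (0.95047, 1.0, 1.08883)


def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIELAB (L*, a*, b*)."""
    # Undo the sRGB transfer curve to obtain linear-light values in 0..1.
    lin = []
    for c in rgb:
        c = c / 255.0
        lin.append(c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4)
    r, g, b = lin
    # Linear RGB to CIE XYZ using the sRGB primaries.
    xyz = (
        0.4124564 * r + 0.3575761 * g + 0.1804375 * b,
        0.2126729 * r + 0.7151522 * g + 0.0721750 * b,
        0.0193339 * r + 0.1191920 * g + 0.9503041 * b,
    )

    def f(t):
        # Cube root with the standard linear segment near black.
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = (f(v / w) for v, w in zip(xyz, WHITE_D65))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def delta_e76(lab1, lab2):
    """CIE76 color difference: Euclidean distance in CIELAB."""
    return math.dist(lab1, lab2)


def distance_to_top(lut):
    """Delta E76 of every level from the last (maximal) level of the colormap."""
    labs = [srgb_to_lab(c) for c in lut]
    return [delta_e76(lab, labs[-1]) for lab in labs]


def speed(lut):
    """Assumed definition of speed: Delta E76 between adjacent levels."""
    labs = [srgb_to_lab(c) for c in lut]
    return [delta_e76(a, b) for a, b in zip(labs, labs[1:])]


# Synthetic stand-in for the "gray" lookup table (not extracted from ImageJ).
gray = [(i, i, i) for i in range(256)]
dist = distance_to_top(gray)
spd = speed(gray)
print(f"gray: distance at 50% level = {dist[128]:.1f}, "
      f"speed range = {min(spd):.2f} to {max(spd):.2f}")
```

Replacing `gray` with the 256 RGB entries of another colormap would give the corresponding curves for that colormap under the same assumptions.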

In the next part of this project, the above analysis is applied to simulated cardiac images displayed with the six colormaps. To this end, a cardiac phantom with graded levels of defect severity in the anterior wall was designed, and the color differences between normal walls and defects were assessed qualitatively, that is, visually (Figure 7). Displaying the graded defect severity in the various colormaps enables readers to make a consistent qualitative assessment. In addition, the circumferential profile of the myocardial walls in the short-axis section is plotted for true pixel intensity or count, for the lightness of the colors, and for color distance as the main perceptual metric. The profile is drawn from the samples with the highest myocardial intensity in the radial direction. The plotting order is inferior, septal, anterior, lateral, and then inferior wall again, so that, in the defect region, the three curves overlap when they are congruent and separate when they are incongruent. Thus, one can evaluate the effect of each colormap on the perception and estimation of defect severity. As expected, some colormaps underestimate the defects, which means that the curves for lightness and color difference (∆E⁷⁶) fall behind the intensity or uptake curve. It is worth noting that this lag occurs at different levels as the grade of defect severity increases. In contrast, others overestimate the defects, which means that the curves for lightness and color difference (∆E⁷⁶) run ahead of the uptake curve. As an example (Figure 9), in the “gray” colormap, the curve for ∆E⁷⁶ always lags behind the other two curves. Therefore, there is a consistent underestimation, but the pattern is uniform over the entire range of defect severity. In the “thermal” colormap, the curves of ∆E⁷⁶, lightness, and uptake run at the same pace.
This pattern is more favorable than that of “gray”, but the output intensity accentuation is less than that of other colormaps, which means that it lacks enhanced discrimination between mild, moderate, and severe defects. The “cool” and “cequal” colormaps have a favorable effect on readers. However, in the “cequal” colormap, the distinction between normal and mild defects is so small that the curve of lightness reaches a plateau. In the “Siemens” colormap, the defect is markedly overestimated at first (a large gap between the ∆E⁷⁶ and uptake curves) but remains steady after that, which means that the distinction between mild and moderate/severe defects becomes vague. Finally, for the “s pet” colormap, the curves for ∆E⁷⁶ and lightness change irregularly, which hampers the accurate estimation of defect severity. Therefore, its use in the interpretation of cardiac SPECT and PET images is strongly discouraged.
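The circumferential profile of true uptake described above can be illustrated with a small Python sketch. It is not the phantom used in this study: the image size, ring radii, and the 60-degree anterior sector are assumptions chosen for illustration, the function names are hypothetical, and the handling of the shared 25% and 50% grade boundaries is also an assumption. The profile takes the maximum intensity along each radial ray and starts at the inferior wall, proceeding through the septal, anterior, and lateral walls.

```python
import math


def make_short_axis(size=64, r_in=14.0, r_out=22.0, defect=0.3, half_width_deg=30.0):
    """Simulated short-axis slice: a ring of uptake 1.0 with a reduced anterior sector.

    `defect` is the fractional count reduction (0.3 means a 30% defect).
    All geometric parameters are illustrative assumptions.
    """
    c = (size - 1) / 2
    img = [[0.0] * size for _ in range(size)]
    for row in range(size):
        for col in range(size):
            dx, dy = col - c, c - row  # y points up, so anterior is at the top
            if r_in <= math.hypot(dx, dy) <= r_out:
                angle = math.degrees(math.atan2(dy, dx))  # 90 degrees = anterior
                in_defect = abs(angle - 90.0) <= half_width_deg
                img[row][col] = 1.0 - defect if in_defect else 1.0
    return img


def circumferential_profile(img, n_angles=36, n_samples=40):
    """Maximum intensity along each radial ray, normalized to the profile maximum.

    Rays start at the inferior wall (bottom) and proceed through the septal
    (left), anterior (top), and lateral (right) walls.
    """
    c = (len(img) - 1) / 2
    profile = []
    for k in range(n_angles):
        theta = math.radians(-90.0 - 360.0 * k / n_angles)
        best = 0.0
        for s in range(n_samples):
            r = c * s / (n_samples - 1)
            col = round(c + r * math.cos(theta))
            row = round(c - r * math.sin(theta))
            best = max(best, img[row][col])
        profile.append(best)
    peak = max(profile)
    return [v / peak for v in profile]


def severity_score(percent_reduction):
    """Semi-quantitative score from percent count reduction.

    Assumption: the shared boundaries at 25% and 50% are assigned to 'moderate'.
    """
    if percent_reduction < 10:
        return 0  # normal
    if percent_reduction < 25:
        return 1  # mild
    if percent_reduction <= 50:
        return 2  # moderate
    return 3      # severe


profile = circumferential_profile(make_short_axis(defect=0.3))
worst = (1.0 - min(profile)) * 100.0
print(f"deepest reduction: {worst:.0f}% -> score {severity_score(worst)}")
```

Mapping each profile value to a colormap level and converting that color to lightness and ∆E⁷⁶ would give the other two curves for comparison with this uptake profile.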

This project focuses on a particular technical aspect of color perception, namely color difference, using quantitative metrics. The process of perception itself is much more complicated and involves neurophysiological and psychophysical aspects. Even with quantification by these methods, perception differs among individuals. The issue can also be investigated with observer performance models using receiver operating characteristic curve analysis among different readers or observers. All these methods are complementary for modeling and quantitatively evaluating human perception of colors and lesion detection (15, 16, 28).

Conclusion

There is certainly no single colormap that is perfect for all purposes of image visualization; the selection depends on the application. Of the six colormaps investigated in this study for estimating defect severity, “gray” is less favorable than the others, and “thermal” performs slightly better. The “s pet” or rainbow colormap, which is traditionally used by many practitioners, is strongly discouraged. The “Siemens” colormap suffers from decreased discriminating power in the range of mild to moderate/severe defects. In contrast, the “cool” and “cequal” colormaps outperform the other colormaps employed in this study to some extent, although they have some shortcomings.

Ethics

Ethics Committee Approval: Not necessary.

Informed Consent: Not necessary.

Financial Disclosure: The author declared that this study received no financial support.

References

Silva S, Sousa Santos B, Madeira J. Using color in visualization: A survey. Comput Graph 2011;35:320-333.

Stone MC. Representing colors as three numbers. IEEE Comput Graph Appl 2005;25:78-85.