# Attributes for
Connected Components _Connected components are fundamental to the fields of image processing and computer vision, providing the basis for segmenting and analyzing images. In my [Introduction to Max-Trees](./mt.html) I delve into the methods of extracting these components from images in order to construct a Max-Tree --- a data structure centered around the concept of connected components. However, the true utility of connected components emerges when we compute attributes that provide meaningful information about the image._ This article compiles a set of attributes that can be computed relatively easily, offering insights into the properties and characteristics of connected components. Additionally, a comprehensive collection of GNU Octave implementations for these attributes is provided, facilitating practical application. Some of these implementations are included in the body of this article as well. While many of these attributes are particularly relevant for connected components within a max-tree structure, they are also applicable in other contexts. The content of this article is derived from my master's thesis, "Detecting Astronomical Objects with Machine Learning" and aims to make its insights more accessible to a broader audience. ## Area An attribute that can easily be computed is the _area_. Berger et al. define this attribute as the number of elements within the component.[^berger+et-al:2007] Tushabe provides the formal mathematical definition of area $A(X)$ of component $X$ as[^tushabe:2010] $$ A(X) = \sum_{x \in X} \mathbf{1}_X(x) \text{.} $$ Here, $\mathbf{1}_X(x)$ is the so-called identicator function, which is $1$ if $x \in X$ or $0$ otherwise. While the term "area" originated from two-dimensional components, it can also apply to higher dimensions, such as indicating the volume of a cube in three dimensions. Berger et al. provide an algorithm for computing the area of a component in the context of a max-tree, which is included below in GNU Octave code.[^berger+et-al:2007] ```octave function a = area(f, R, parent) a = ones(size(R)); [_, v] = sort(R(:)); for x = v' if parent(x) == x continue end a(parent(x)) = a(parent(x)) + a(x); end end ``` ## Perimeter The perimeter is closely related to the area attribute. Gonzalez and Woods describe the perimeter as the length of the boundary of a component.[^gonzalez+woods:2008] Interpreting the boundary as the number of elements not exclusively connected to elements within the same component, a novel approach to computing the perimeter is presented in my master's thesis.[^van-de-weerd:2021] The complexity of computing the perimeter arises from the fact that an element on the boundary of its own component can also contribute to the boundary of its parent, its parent's parent, etc. To address this, a _boundary vector_ $\mathcal{B}(X)$ is constructed, indicating the contribution of a component $X$ to each layer in the tree. These contributions apply only to the direct and indirect parents of the component. First, we compute the intermediate contribution set $\Lambda(x)$ for each element $x$, composed of the layers where $x$ is part of the perimeter: $$ \Lambda(x) = \left{y \in \mathbb{Z} \text{ : } \underset{n \in \mathcal{N}(x)}\text{argmin} \text{ } \lambda(n) < u \leq \lambda(x) \right} \text{.} $$ Here, $\mathcal{N}(x)$ are the neighbors of $x$ and $\lambda(x)$ its level. With $\Lambda(x)$ computed for every element $x$, the contribution of $X$ to the perimeter of layer $\lambda$ is the number of $\lambda$ inclusions in the contribution set of elements in $X$: $$ \mathcal{B}(X)_\lambda = { \lvert\left{x \in X \text{ : } \lambda \in \Lambda(x)\right}\rvert } \text{.} $$ The perimeter $\mathfrak{B}(X_\lambda^\mu)$ of component $X_\lambda^\mu$ in branch $\mu$ and layer $\lambda$ is determined by summing the contributions of all components to $\lambda$: $$ \mathfrak{B}\left(X_\lambda^\mu\right) = \sum_{\nu \in N} \mathcal{B}\left(X_\gamma^v\right)_\lambda \text{,} $$ where $N = {\nu \in \mathbb{Z} \text{ : } \mu \text{ } \preceq \text{ } v}$ and $\gamma \geq \lambda$. The set of branches $N$ includes all branches $\nu$ that are either equal to or divarications of branch $\mu$ (written as $\mu \text{ } \preceq \text{ } \nu$). Below is a GNU Octave implementation of the perimeter attribute: ```octave function l = per(f, R, parent) b = zeros(size(f)); l = zeros(size(f)); for a = unique(f)' g = f >= a; c = zeros(size(f)); c = c | g > [zeros(1, size(g, 2)); g(1:end-1, :)]; % n c = c | g > [zeros(size(g, 1), 1), g(:, 1:end-1)]; % e c = c | g > [g(2:end, :); zeros(1, size(g, 2))]; % s c = c | g > [g(:, 2:end), zeros(size(f, 1), 1)]; % w b = b + c; end [_, s] = sort(f(:), "descend"); for x = s' y = x; while b(x) > 0 l(y) = l(y) + 1; b(x) = b(x) - 1; y = parent(y); end end end ``` ## Compactness & Circularity Ratio Using the area $A(X)$ and perimeter $\mathfrak{B}(X)$ attributes, composite attributes like _compactness_ and the _circularity ratio_ can be computed. Gonzales and Woods provide the following formal definitions of compactnes $C(X)$ and circularity ratio $R_c(X)$:[^gonzalez+woods:2008] $$ \begin{split} C(X) = & \frac{\mathfrak{B}(X)^2}{A(X)} \text{, and} \\ R_c(X) = & \frac{4 \pi A(X)}{\mathfrak{B}(X)^2} \text{.} \end{split} $$ These attributes indicate the shape of the components they represent. For example, $R_c(X)$ approaches a value of $1$ as the shape of the component $X$ becomes more circular. ## Grayscale, Power & Volume In addition to positional information, the value or _intensity_ represented by elements can be indicative of a component attribute. Many examples are straightforward, such as the _sum of (squared) intensities_ and _grayscale_ (average value within a component). Tushabe provides the formal definition of the latter: $$ G(X) = \frac{\sum_{x \in X} f(x)}{A(X)} \text{.} $$ This can be implemented in GNU Octave code as follows: ```octave function g = grayscale(f, R, parent) g = zeros(size(f)); a = area(f, R, parent); [_, s] = sort(f(:), "descend"); for x = s' y = x; while true g(y) = g(y) + f(x) / a(y); if y == parent(y) break; end y = parent(y); end end end ``` More sophisticated measurements based on intensity levels include _power_ $$ \mathcal{P}(X, f, \alpha) = \sum_{x \in X}\left(f(x) - \alpha\right)^2 $$ for an image $f$ and parent intensity level $\alpha$, and _volume_ $$ V(X, f, \alpha) = \sum_{x \in X}\left(f(x) - \alpha\right) \text{.} $$ ## Entropy Gonzalez and Woods define the entropy attribute as the average amount of information conveyed by each element in a component.[^gonzalez+woods:2008] Given a discrete set of distinct grayscale values $\left{a_1, a_2, \ldots, a_J\right}$ in a component of size $M \times N$, the probability of encountering a grayscale value $a_j$ in the component is $$ P(a_j) = \frac{a_J}{MN} \text{.} $$ This function can be used to compute a _histogram_ of component $X$. Using this definition, Gonzalez and Woods provide the formal definition of entropy:[^gonzalez+woods:2008] $$ H(X) = -\sum_{j = 1}^J P(a_j) \log P(a_j) \text{.} $$ ## Invariant Moments Westenberg, Roerdink, and Wilkinson provide definition for the four _invariant moments_: _non-compactness_, _elongation_, _flatness_ and _sparseness_.[^westenberg+roerdink+wilkinson:2007] These are suitable for usage in two- and three-dimensional components, unlike those presented by Hu, which only apply to two-dimensional components.[^hu:1962] Westenberg, Roerdink, and Wilkinson define the non-compactness attribute $\mathcal{N}(X)$ as $$ \mathcal{N}(X) = \frac{\text{Tr}\mathbf{I}(X)^\frac{3}{5}}{A(X)} $$ with the _moment of inertia_ tensor $\mathbf{I}(X)$ defined as $$ \mathbf{I}_{ij}(X) = \begin{cases} \sum_X(i - \bar{i})^2 + \frac{A(X)}{12} & \text{if } i = j \\ \sum_X(i - \bar{i})(j - \bar{j}) & \text{otherwise} \end{cases} $$ for $i, j \in \left{x, y, z\right}$. Computing _eigenvalues_ $\lambda_i(X)$ of $\mathbf{I}(X)$ and ordering them such that $$ {\lvert\lambda_1(X)\rvert} \geq {\lvert\lambda_2(X)\rvert} \geq {\lvert\lambda_3(X)\rvert} \text{.} $$ allows the computation of the attributes elongation $\mathcal{E}(X)$, flatness $\mathcal{F}(X)$ and sparseness $\mathcal{S}(X)$: $$ \begin{split} \mathcal{E}(X) = & {\lvert\frac{\lambda_1(X)}{\lambda_2(X)}\rvert} \text{,} \\ \mathcal{F}(X) = & {\lvert\frac{\lambda_2(X)}{\lambda_3(X)}\rvert} \text{, and} \\ \mathcal{S}(X) = & \frac{\pi}{6A(X)}\prod_{i=1}^3\sqrt{\frac{20{\lvert\lambda_i(X)\rvert}}{A(X)}} \text{.} \end{split} $$ [^berger+et-al:2007]: Christophe Berger, Theirry Géraud, Roland Levillain, Nicolas Widynski, Anthony Baillard and E. Berin. "Effective Component Tree Computation with Application to Pattern Recognition in Astronomical Imaging". In _Proceedings of the International Conference of image Processing_, vol. 4, 2007, pp. 41 -- 44. ISBN: 978-1-4244-1436-9. DOI: [10.1109/ICIP2007.4379949] [^tushabe:2010]: Florence F. Tushabe. "Extending Attribute Filters to Color Processing and Multi-Media Applications". 2010. ISBN: 978-90-367-4630-4. [^gonzalez+woods:2008]: Rafael C. Gonzalez and Richard E. Woods. "Digital Image Processing". 3rd edition. 2008. ISBN: 978-0-13-505267-9. [^westenberg+roerdink+wilkinson:2007]: Michael A. Westenberg, Jos B. Roerdink and Michael H.F. Wilkinson. "Volumetric Attribute Filtering and Interactive Visualization Using the Max-Tree Representation". In: _IEEE Transactions on Image Processing_, vol. 16, issue 12, 2007, pp. 2943 -- 2952. DOI: [10.1109/TIP.2007.909317]. [^hu:1962]: Ming-Kuei Hu. "Visual Pattern Recognition by Moment Invariants". In: _IRE Transactions on Informational Technology_, vol. 8, issue 2, 1962, pp. 179 -- 187. DOI: [10.1109/tit.1962.1057692]. [^van-de-weerd:2021]: Michaël P. van de Weerd. "Detecting Astronomical Objects with Machine Learning". 2021. URL: [https://fse.studenttheses.ub.rug.nl/23854/](https://fse.studenttheses.ub.rug.nl/23854/). [10.1109/ICIP2007.4379949]: https://doi.org/10.1109/ICIP2007.4379949 [10.1109/TIP.2007.909317]: https://doi.org/10.1109/TIP.2007.909317 [10.1109/tit.1962.1057692]: https://doi.org/10.1109/tit.1962.1057692 *[URL]: Uniform Resource Locator *[DOI]: Digital Object Identifier *[ISBN]: International Standard Book Number *[GNU]: GNU's Not Unix! *[IRE]: Institute of Radio Engineers *[IEEE]: Institute of Electrical and Electronics Engineers