Morphology (biology)

Morphology of a male Caprella mutica

Morphology is a branch of biology dealing with the study of the form and structure of organisms and their specific structural features.

This includes aspects of the outward appearance (shape, structure, colour, pattern, size), i.e., external morphology (eidonomy), as well as the form and structure of internal parts such as bones and organs, i.e., internal morphology or anatomy. This is in contrast to physiology, which deals primarily with function. Morphology is a branch of life science dealing with the study of the gross structure of an organism or taxon and its component parts.

    Etymology and term usage

    The word "morphology" is from the Ancient Greek μορφή, morphé, meaning "form", and λόγος, lógos, meaning "word, study, research". The biological concept of morphology was developed by Johann Wolfgang von Goethe (1790) and independently by the German anatomist and physiologist Karl Friedrich Burdach (1800).

    In English-speaking countries, the term "molecular morphology" has been used for some time to describe the structure of compound molecules, such as polymers and RNA. The term "gross morphology" refers to the collective structures of an organism as a whole: a general description of the form and structure of an organism, taking into account all of its structures without specifying an individual structure.

    Branches of morphology

    • Comparative Morphology is analysis of the patterns of the locus of structures within the body plan of an organism, and forms the basis of taxonomical categorization.
    • Functional Morphology is the study of the relationship between the structure and function of morphological features.
    • Experimental Morphology is the study of the effects of external factors upon the morphology of organisms under experimental conditions, such as the effect of genetic mutation.
    • "Anatomy" is a "branch of morphology that deals with the structure of organisms".

    Morphology and classification

    Most taxa differ morphologically from other taxa. Typically, closely related taxa differ much less than more distantly related ones, but there are exceptions to this. Cryptic species are species which look very similar, or perhaps even outwardly identical, but are reproductively isolated. Conversely, sometimes unrelated taxa acquire a similar appearance as a result of convergent evolution or even mimicry. In addition, there can be morphological differences within a species, such as in Apoica flavissima where queens are significantly smaller than workers. A further problem with relying on morphological data is that what may appear, morphologically speaking, to be two distinct species, may in fact be shown by DNA analysis to be a single species. The significance of these differences can be examined through the use of allometric engineering in which one or both species are manipulated to phenocopy the other species.

    3D cell morphology: classification

    The invention and development of microscopy enable the observation of 3-D cell morphology with both high spatial and high temporal resolution. The dynamic processes of cell morphology, which are controlled by a complex system, play an important role in various biological processes, such as immune and invasive responses. Extracting this information requires a large amount of data processing, because the 3-D cell morphology must be analyzed systematically and quantitatively, which is challenging for multi-signal processing. Existing computational imaging methods are rather limited in analyzing and tracking such time-lapse data sets, and manual analysis is unreasonably time-consuming and subject to observer variance.

    One of these challenges is how to classify different cells once the raw 3-D signals have been reduced to simplified representations by shape extraction and description, since we need to know the relationship between the numerical patterns and the function that the multidimensional signal represents. Machine learning algorithms build a model that describes the properties of a given population of individuals, in order to characterize subgroups of individuals with similar properties or to predict the properties of a new unknown (or simulated) individual.
    There are three main supervised machine-learning approaches for processing the 3-D signals, all of which require that a subset of the data in each subpopulation be manually annotated in order to classify the rest of the data set.

    K-nearest Neighbors Method

    The basic principle is illustrated in the figure. Suppose we have a pattern located at (x1, y1, z1). If Nk(x1, y1, z1) denotes the set of its k nearest neighboring samples under some metric, e.g., Euclidean distance or Hamming distance, then the pattern is classified by a majority vote over {Pi | (xi, yi, zi) ∈ Nk(x1, y1, z1)}, where Pi is the class of the sample at (xi, yi, zi).
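
    The majority-vote rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in any particular study: the function name `knn_classify`, the sample coordinates and the class labels "A" and "B" are all made up for the example, and Euclidean distance is chosen as the metric.

```python
from collections import Counter
import math


def knn_classify(query, samples, labels, k):
    """Classify `query` by majority vote among its k nearest samples."""
    # Euclidean distance from the query point to every labelled sample.
    dists = [math.dist(query, s) for s in samples]
    # Indices of the k samples with the smallest distances.
    nearest = sorted(range(len(samples)), key=lambda i: dists[i])[:k]
    # Each neighbor contributes one equal vote for its own class.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 3-D feature points for two cell types, "A" and "B".
samples = [(0.0, 0.0, 0.0), (0.1, 0.2, 0.0), (0.2, 0.1, 0.1),
           (1.0, 1.0, 1.0), (0.9, 1.1, 1.0), (1.1, 0.9, 0.8)]
labels = ["A", "A", "A", "B", "B", "B"]

predicted = knn_classify((0.15, 0.1, 0.05), samples, labels, k=3)
print(predicted)
```

    With two classes, choosing an odd k avoids tied votes.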

    K-nearest neighbors method. The nearest units (determined based on some metric) in the different cells (A, B, C) contribute equally to the unit (yellow) to be classified.

    Specifically, in the identification of cancer cells at different time phases, k = 6 and a subset of 35 features were reported to give the best results. In addition, certain constraints need to be applied to compensate for phase identification errors: a) the go-forward rule: cell cycle progress can only go forward in the biological cell phase sequence; b) the continuation rule: cell cycle progress cannot skip a cell phase and enter the phase after it; c) the phase-timing rule: the time period that a cell stays in a phase cannot change dramatically. The results of the k-nearest neighbors method show good accuracy.
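
    The go-forward and continuation rules can be expressed as checks on a sequence of per-frame predictions. The sketch below only flags violations; how they are then corrected is not specified here. The phase names in `PHASES`, the function name `find_violations` and the example track are hypothetical, and the sequence is treated as linear (no wrap-around from the last phase to the first), which is a simplifying assumption. The phase-timing rule is omitted because no threshold for a "dramatic" change is given.

```python
# Hypothetical ordered phase labels; the text does not list the actual phases.
PHASES = ["interphase", "prophase", "metaphase", "anaphase", "telophase"]


def find_violations(predicted):
    """Return (frame index, rule name) for each transition that breaks a rule."""
    order = {name: i for i, name in enumerate(PHASES)}
    violations = []
    for t in range(1, len(predicted)):
        step = order[predicted[t]] - order[predicted[t - 1]]
        if step < 0:
            # The predicted phase moved backwards in the sequence.
            violations.append((t, "go-forward"))
        elif step > 1:
            # The predicted phase jumped over at least one phase.
            violations.append((t, "continuation"))
    return violations


# A made-up track of per-frame predictions for one cell.
track = ["interphase", "prophase", "interphase", "prophase", "anaphase"]
violations = find_violations(track)
print(violations)
```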

    The k-nearest neighbors method is easy to implement. However, it is not suitable when the distribution of cell types is seriously skewed, and it becomes time-consuming when the data set is very large.

    Decision Trees

    Schematic View of Decision Tree

    The decision tree adds candidate nodes for splitting by defining the half planes P1 = {x | x_j ≤ s} and P2 = {x | x_j > s}, in which x_j is the splitting variable and s is the splitting point. At each candidate node, the impurity is computed, e.g., the Gini index I = p_{-1}(1 - p_{-1}) + p_{+1}(1 - p_{+1}), in which p_k (k = -1, +1) is the fraction of class k observed at that node. The splitting nodes are selected sequentially to improve the homogeneity, and the decision at each leaf node is made by majority vote.
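
    As a small worked example, the Gini index and a search for the split point s on a single variable might look as follows. The names `gini` and `best_split` and the data are illustrative. Two choices here are assumptions: samples are assigned by x_j ≤ s versus x_j > s so that each falls on exactly one side, and candidate splits are scored by the size-weighted average impurity of the two sides, a common convention that the text above does not prescribe.

```python
def gini(labels):
    """Gini index for two classes labelled -1 and +1."""
    if not labels:
        return 0.0
    p_pos = sum(1 for y in labels if y == 1) / len(labels)
    p_neg = 1.0 - p_pos
    return p_neg * (1.0 - p_neg) + p_pos * (1.0 - p_pos)


def best_split(values, labels):
    """Find the split point s on one variable with the lowest weighted impurity."""
    best_s, best_score = None, float("inf")
    # Splitting at the largest value would leave the right side empty, so skip it.
    for s in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= s]
        right = [y for x, y in zip(values, labels) if x > s]
        # Weight each side's impurity by the share of samples it receives.
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_s, best_score = s, score
    return best_s, best_score


# Made-up values of one splitting variable with their class labels.
values = [0.2, 0.4, 0.5, 0.9, 1.1, 1.3]
labels = [-1, -1, -1, 1, 1, 1]
s, score = best_split(values, labels)
print(s, score)
```

    A node containing equal numbers of both classes has the maximum two-class Gini index of 0.5, while a pure node has an index of 0.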

    This method is used to determine the relationship between gene expression and image traits, and helps to find modules of co-regulated genes. First, the gene expression values associated with the image traits are obtained. From these, one can find a rule describing the qualitative behavior (upregulation, no change or downregulation) of a small set of genes that control the expression of the genes in the module. A regression tree over the gene expression arrays is then built from two kinds of building block: decision nodes and leaf nodes. Each decision node corresponds to one of the regulatory inputs (regulator expression values) and a query on its value. Each decision node has two child nodes: one is chosen when the answer to the query is true, and the other when it is false. For a given array, the answers to the queries define a path down to the corresponding leaf node. The response at each leaf is modeled as a normal distribution over the expression values of the module's genes, and this distribution is encoded by a mean and variance stored at that leaf node. The regression tree thus allows for expression profiles with different degrees of conservation of the mean behavior of the module. Once the rule has been encoded as a regression tree, an iterative algorithm is applied to find the best result, namely the one with the smallest number of modules of genes that could be co-regulated.
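
    The two building blocks and the walk from the root to a leaf can be sketched as a small data structure. Everything named here is hypothetical: the classes `Leaf` and `DecisionNode`, the regulators "reg1" and "reg2", the thresholds and the stored means and variances. The form of the query (is the regulator's expression value above a threshold?) is also an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    mean: float      # mean expression of the module's genes in this context
    variance: float  # variance of the normal distribution at this leaf


@dataclass
class DecisionNode:
    regulator: str   # regulatory input whose expression value is queried
    threshold: float # query: is the regulator's value above this threshold?
    if_true: "Node"  # child followed when the answer is true
    if_false: "Node" # child followed when the answer is false


Node = Union[Leaf, DecisionNode]


def find_leaf(node: Node, array: Dict[str, float]) -> Leaf:
    """Follow the query answers for one expression array down to a leaf."""
    while isinstance(node, DecisionNode):
        answer = array[node.regulator] > node.threshold
        node = node.if_true if answer else node.if_false
    return node


# A made-up two-level tree over two hypothetical regulators.
tree = DecisionNode(
    "reg1", 0.5,
    if_true=Leaf(mean=2.0, variance=0.3),
    if_false=DecisionNode(
        "reg2", -0.5,
        if_true=Leaf(mean=0.0, variance=0.2),
        if_false=Leaf(mean=-1.5, variance=0.4),
    ),
)

leaf = find_leaf(tree, {"reg1": 0.1, "reg2": -0.9})
print(leaf.mean, leaf.variance)
```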

    The decision tree method provides an efficient way to classify data. However, it can be unstable when the data are perturbed, which means that it does not work well for classification at high noise levels.

    Support Vector Machine (SVM)

    In principle, the SVM method finds a hyperplane in a higher- or infinite-dimensional space that gives the largest separation of the sample data. The assumption is that data sets which are not linearly separable may be mapped into a higher-dimensional space in which the separation becomes easier. These data sets can be multidimensional and are not limited to 2-D. The data points nearest the boundary, which define the hyperplane, are called 'support vectors'.
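
    The idea of mapping into a higher-dimensional space can be shown on a toy example. The sketch below is not a full SVM trainer and does not use a kernel: it applies an explicit, hand-picked map x → (x, x²) to made-up 1-D data that no single threshold on x can separate. Because the toy data are symmetric in x, the widest-margin separating line in the mapped space is horizontal, so it can be read off directly from the closest points of each class. All names and values are illustrative assumptions.

```python
def feature_map(x):
    """Map a 1-D point into 2-D: x -> (x, x^2)."""
    return (x, x * x)


# Made-up 1-D data: class +1 lies on both sides of class -1,
# so no single threshold on x separates the two classes.
xs = [-3.0, -2.0, -0.5, 0.0, 0.5, 2.0, 3.0]
ys = [1, 1, -1, -1, -1, 1, 1]

mapped = [feature_map(x) for x in xs]

# In the mapped space the classes separate along the second coordinate.
inner = max(z[1] for z, y in zip(mapped, ys) if y == -1)  # largest x^2 in class -1
outer = min(z[1] for z, y in zip(mapped, ys) if y == 1)   # smallest x^2 in class +1

# The separating line sits midway between the closest points of each class.
boundary = (inner + outer) / 2.0
margin = (outer - inner) / 2.0


def classify(x):
    """Assign +1 above the boundary in the mapped space, -1 below it."""
    return 1 if feature_map(x)[1] > boundary else -1


# The points closest to the boundary play the role of support vectors.
support_vectors = [x for x in xs if feature_map(x)[1] in (inner, outer)]
print(boundary, margin, support_vectors)
```

    Removing any point other than the four support vectors would leave the boundary unchanged, which is why only the nearest points are said to define the hyperplane.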

    Principle of the support vector machine

    The SVM method has been used for time-resolved phenotype annotation. M. Held et al. first used watershed split-and-merge to segment each cell. From the segmentation results, a radial basis function kernel and probability estimates, 8 training patterns were defined for the different cell phases. Using the training algorithm designed by B.E. Boser et al., which maximizes the margin between the training patterns and the decision boundary (the closest patterns being the 'support vectors'), they were able to build the hyperplane. With this algorithm they could predict the annotation of each cell, and the predictions agreed very well with human annotation.

    The advantage of this method is that it is very efficient. However, this kind of kernel fitting is very sensitive to overfitting the model selection criterion.

    See also