Techniques that distinguish signal from noise.
Pattern recognition techniques distinguish signal from noise through statistical analysis, Bayesian analysis, classification, cluster analysis, and analysis of texture and edges. They apply to sensors, data, imagery, sound, speech, and language.
Automated classification tools distinguish, characterize, and categorize data based on a set of observed features. For example, one might determine whether a particular mushroom is “poisonous” or “edible” based on its color, size, and gill size. Classifiers can be trained automatically from a set of labeled examples through supervised learning. Classification rules discriminate between different contents of a document or partitions of a database based on various attributes within the repository.
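As a rough illustration of training a classifier from examples, the sketch below fits a small categorical naive Bayes model to the mushroom scenario. The choice of naive Bayes, the function names `train` and `classify`, and the six training rows are all assumptions made for illustration. The rows are invented toy data, not real mushroom characteristics.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy examples: (color, size, gill_size) -> label.
# These rows are invented for illustration only.
TRAINING = [
    (("red", "small", "narrow"), "poisonous"),
    (("red", "large", "narrow"), "poisonous"),
    (("white", "small", "narrow"), "poisonous"),
    (("brown", "large", "broad"), "edible"),
    (("brown", "small", "broad"), "edible"),
    (("white", "large", "broad"), "edible"),
]


def train(examples):
    """Count labels and per-label feature values from labeled examples."""
    label_counts = Counter(label for _, label in examples)
    # feature_counts[label][i][value] = times feature i took value under label
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    values = defaultdict(set)  # distinct values seen for each feature index
    for features, label in examples:
        for i, value in enumerate(features):
            feature_counts[label][i][value] += 1
            values[i].add(value)
    return label_counts, feature_counts, values


def classify(model, features):
    """Return the label with the highest smoothed log-probability."""
    label_counts, feature_counts, values = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)  # log prior
        for i, value in enumerate(features):
            count = feature_counts[label][i][value]
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((count + 1) / (n + len(values[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING)
print(classify(model, ("red", "small", "narrow")))
print(classify(model, ("brown", "large", "broad")))
```

The model here is nothing more than counts gathered from the examples, so adding or changing training rows changes the classifier without any hand-written rules.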
Statistical learning techniques construct quantitative models of an entity based on surface features drawn from a large corpus of examples. In the domain of natural language, for example, statistics of language usage (e.g., word trigram frequencies) are compiled from large collections of input documents and are used to categorize or make predictions about new text.
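The trigram idea can be sketched in a few lines. The example below compiles word trigram counts from a tiny corpus and uses them to predict the most likely next word after a two-word context. The four sentences and the helper names `trigram_counts` and `predict_next` are hypothetical. A real system would draw on a far larger corpus and would typically smooth the counts.

```python
from collections import Counter, defaultdict


def trigram_counts(documents):
    """Map each (w1, w2) context to a Counter of the words that followed it."""
    counts = defaultdict(Counter)
    for doc in documents:
        words = doc.lower().split()
        for w1, w2, w3 in zip(words, words[1:], words[2:]):
            counts[(w1, w2)][w3] += 1
    return counts


def predict_next(counts, w1, w2):
    """Return the most frequent follower of (w1, w2), or None if unseen."""
    followers = counts.get((w1, w2))
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical miniature corpus, invented for illustration.
corpus = [
    "the cat sat on the mat",
    "the cat sat on the sofa",
    "the cat sat on the mat again",
    "the dog sat on the rug",
]

counts = trigram_counts(corpus)
print(predict_next(counts, "on", "the"))
print(predict_next(counts, "cat", "sat"))
print(predict_next(counts, "purple", "cow"))
```

A context that never appeared in the corpus yields no prediction at all, which is one concrete reason such models depend on large collections of examples.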
Statistical techniques can have high precision within a domain at the cost of generality across domains. Systems trained through statistical learning do not require human-engineered domain modeling. However, they require access to large corpora of examples and a retraining step for each new domain of interest.