In information science, profiling refers to the process of construction and application of user profiles generated by computerized data analysis.
Information science is an academic field which is primarily concerned with analysis, collection, classification, manipulation, storage, retrieval, movement, dissemination, and protection of information.
Data analysis is a process of inspecting, cleansing, transforming and modeling data with the goal of discovering useful information, informing conclusions and supporting decision-making.
A user profile is a visual display of personal data associated with a specific user, or a customized desktop environment.
Computerized data analysis in this context means the use of algorithms or other mathematical techniques to discover patterns or correlations in large quantities of data aggregated in databases.
In statistics, dependence or association is any statistical relationship, whether causal or not, between two random variables or two sets of data.
When these patterns or correlations are used to identify or represent people, they can be called profiles.
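As a minimal sketch of what "discovering a correlation" can mean in practice, the following computes the Pearson correlation coefficient between two attributes. The attribute names and values are hypothetical, and Pearson's coefficient is only one of many possible measures of association; the text above does not prescribe a particular one.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical data for five people (illustrative values only)
late_payments = [0, 1, 2, 3, 4]
monthly_purchases = [2, 4, 6, 8, 10]

r = pearson(late_payments, monthly_purchases)
```

A coefficient near 1 or -1 indicates a strong linear association between the two attributes; as noted above, such an association says nothing about whether the relationship is causal.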
Beyond profiling technologies or population profiling as such, profiling in this sense covers not only the construction of profiles but also the application of group profiles to individuals, e.g., in credit scoring, price discrimination, or the identification of security risks.
A credit score is a numerical expression, based on an analysis of a person's credit files, that represents the creditworthiness of an individual.
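The application of a group profile to an individual can be illustrated with a deliberately simplified sketch. The group labels, the records, and the use of a raw default rate as a "score" are all assumptions made for illustration; real credit scoring models are considerably more elaborate.

```python
from collections import defaultdict

# Hypothetical history: (group label, whether the person defaulted on a loan)
history = [
    ("postcode_A", False), ("postcode_A", False),
    ("postcode_A", True), ("postcode_A", False),
    ("postcode_B", True), ("postcode_B", True),
    ("postcode_B", False), ("postcode_B", True),
]

def build_group_profiles(records):
    """Construct a profile per group: the observed default rate."""
    counts = defaultdict(lambda: [0, 0])  # group -> [defaults, total]
    for group, defaulted in records:
        counts[group][0] += int(defaulted)
        counts[group][1] += 1
    return {group: d / t for group, (d, t) in counts.items()}

def score_individual(group, profiles):
    """Apply the group profile to one person, based only on group membership."""
    return profiles[group]

profiles = build_group_profiles(history)
applicant_risk = score_individual("postcode_B", profiles)
```

In this sketch the applicant is assigned the rate of their group regardless of their own behaviour, which is precisely the step from group profile to individual that the text describes.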
Security is the degree of resistance to, or protection from, harm.
Profiling is not simply a matter of computerized pattern-recognition; it enables refined price-discrimination, targeted servicing, fraud detection, and extensive social sorting.
Social sorting is understood as the breakdown and categorization of group- or person-related raw data into various categories and segments by data manipulators and data brokers.
In law, fraud is intentional deception to secure unfair or unlawful gain, or to deprive a victim of a legal right.
Real-time machine profiling constitutes the precondition for emerging socio-technical infrastructures envisioned by advocates of ambient intelligence, autonomic computing and ubiquitous computing.
Sociotechnical systems in organizational development is an approach to complex organizational work design that recognizes the interaction between people and technology in workplaces.
Ubiquitous computing is a concept in software engineering and computer science where computing is made to appear anytime and everywhere.
In computing, ambient intelligence refers to electronic environments that are sensitive and responsive to the presence of people.
One of the most challenging problems of the information society involves dealing with increasing data-overload.
An information society is a society where the creation, distribution, use, integration and manipulation of information is a significant economic, political, and cultural activity.
As content of all kinds is digitized and recording technologies improve and fall in cost, the amount of available information has become enormous and continues to increase exponentially.
Digitization is the process of converting information into a digital format, in which the information is organized into bits.
It has thus become important for companies, governments, and individuals to discriminate information from noise, detecting useful or interesting data.
The development of profiling technologies must be seen against this background.
These technologies are thought to efficiently collect and analyse data in order to find or test knowledge in the form of statistical patterns between data.
This process, called Knowledge Discovery in Databases (KDD), provides the profiler with sets of correlated data that can be used as "profiles".
Data mining is a process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.
The technical process of profiling can be separated into several steps:
Preliminary grounding: The profiling process starts with a specification of the applicable problem domain and the identification of the goals of analysis.
Data collection: The target dataset or database for analysis is formed by selecting the relevant data in the light of existing domain knowledge and data understanding.
Data collection is the process of gathering and measuring information on targeted variables in an established system, which then enables one to answer relevant questions and evaluate outcomes.
Data preparation: The data are preprocessed to remove noise and to reduce complexity by eliminating attributes.
Data preparation is the act of manipulating raw data into a form that can readily and accurately be analysed, e.g. for business purposes.
Data mining: The data are analysed with the algorithm or heuristics developed to suit the data, model and goals.
Interpretation: The mined patterns are evaluated on their relevance and validity by specialists and/or professionals in the application domain.
Application: The constructed profiles are applied, e.g. to categories of persons, to test and fine-tune the algorithms.
Institutional decision: The institution decides what actions or policies to apply to groups or individuals whose data match a relevant profile.
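The steps above can be sketched as a toy pipeline. Everything in it is an assumption made for illustration: the records, the attribute names, the thresholds `min_rate` and `min_support`, and the very simple frequency-based "mining" step stand in for whatever data and algorithms a real profiler would use. The interpretation step is performed by domain specialists and is therefore only marked by a comment.

```python
# Preliminary grounding: hypothetical goal, "find customer groups with a high fraud rate".
ATTRIBUTES = ("age_band", "channel")

# Data collection: hypothetical records selected as relevant to the goal.
records = [
    {"id": 1, "age_band": "18-25", "channel": "web", "fraud": True},
    {"id": 2, "age_band": "18-25", "channel": "web", "fraud": True},
    {"id": 3, "age_band": "26-40", "channel": "store", "fraud": False},
    {"id": 4, "age_band": "26-40", "channel": "web", "fraud": False},
    {"id": 5, "age_band": None, "channel": "web", "fraud": False},
    {"id": 6, "age_band": "18-25", "channel": "store", "fraud": False},
]

def prepare(rows, attributes):
    """Data preparation: drop incomplete rows, keep only the chosen attributes."""
    cleaned = []
    for row in rows:
        if any(row.get(a) is None for a in attributes):
            continue  # treat rows with missing values as noise
        slim = {a: row[a] for a in attributes}
        slim["fraud"] = row["fraud"]
        cleaned.append(slim)
    return cleaned

def mine(rows, attributes, min_rate=0.6, min_support=2):
    """Data mining: keep attribute combinations with a high observed fraud rate."""
    stats = {}
    for row in rows:
        key = tuple(row[a] for a in attributes)
        hits, total = stats.get(key, (0, 0))
        stats[key] = (hits + int(row["fraud"]), total + 1)
    return {
        key: hits / total
        for key, (hits, total) in stats.items()
        if total >= min_support and hits / total >= min_rate
    }

def matches(person, profiles, attributes):
    """Application: check whether a person's data match a mined profile."""
    return tuple(person.get(a) for a in attributes) in profiles

def decide(person, profiles, attributes):
    """Institutional decision: choose an action for a matching individual."""
    if matches(person, profiles, attributes):
        return "manual review"
    return "standard processing"

prepared = prepare(records, ATTRIBUTES)
profiles = mine(prepared, ATTRIBUTES)
# Interpretation: in practice, domain specialists would assess these profiles here.

decision_a = decide({"age_band": "18-25", "channel": "web"}, profiles, ATTRIBUTES)
decision_b = decide({"age_band": "26-40", "channel": "web"}, profiles, ATTRIBUTES)
```

The `min_support` threshold is a design choice in this sketch: it prevents a combination seen only once from becoming a profile, which is one simple way to reduce the influence of noise.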