A data model is a design for how information is structured and represented. It is driven by the data strategy and gives data both structure and meaning.
As a first step, a "data model theory" is formulated and then translated into a data model instance. Such an instance models the information required about things in the business or trade, usually with a particular system or context in mind. It is further broken down by kind of model, and its classes represent the information that must be stored about the things or objects in that context. Here is a quick glance at how data science and data modeling fit together...
Data modeling is the process that produces this design, and it is a critical milestone for an organization embarking on the journey of datafication. It delivers the strategic goal of a specific definition of the data, which is then analysed for data requirements by professional data modelers working closely with business stakeholders. The data model thus becomes a critical reference for all data management and transformation stages, strategies and execution approaches.
A Data Model is thus...
A conceptual data model illustrates the entities, their attributes and their relationships, not their technical implementation. The data model must represent the active references or classifications of the dataset or subject matter.
This analysis helps build a real data model that gives a collective view of entities, attributes and relationships.
What are the reasons to have a data model?
There can be many reasons and desired outcomes for putting a data model in place for a specific use case, or for deriving references from an existing one. The fundamental need will always be to...
The true sense of the data model is always integrated with the DNA of the organization, so it is important to have focus and clarity around it.
Levels of data models in the data modeling process
The data model is an outcome of the larger enterprise information management strategy and a key enabler of it. Because it touches the entire audience and provides the big picture, we refer to it as part of the enterprise data strategy. A data model instance can be one of three kinds, as defined by ANSI in 1975: conceptual, logical and physical.
This three-schema approach has also been well received in software engineering, although a few forums hold that an enterprise level exists above all three to establish the business use case as a scope statement from the sponsors.
Once a data model is in place, it can be further normalized to remove redundant data, ensure data integrity, make the data generally available for multiple data mining needs, maintain its quality and keep it current.
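To make the idea of normalization concrete, here is a minimal sketch in Python. The record layout, the field names and the normalize helper are all hypothetical and chosen only for illustration; a real model would normally be normalized in the database design itself rather than in application code.

```python
# Hypothetical flat records: customer details are repeated on every account row.
flat_rows = [
    {"customer_id": 1, "customer_name": "Asha", "account_no": "A-100", "balance": 250.0},
    {"customer_id": 1, "customer_name": "Asha", "account_no": "A-101", "balance": 75.5},
    {"customer_id": 2, "customer_name": "Ben", "account_no": "B-200", "balance": 10.0},
]


def normalize(rows):
    """Split flat rows into a customers table and an accounts table."""
    customers = {}  # keyed by customer_id, so each customer is stored once
    accounts = []
    for row in rows:
        cid = row["customer_id"]
        name = row["customer_name"]
        # Integrity check: the same id must never map to two different names.
        if customers.setdefault(cid, name) != name:
            raise ValueError(f"conflicting names for customer {cid}")
        # The account row keeps only a reference to the customer, not a copy.
        accounts.append(
            {"account_no": row["account_no"], "customer_id": cid, "balance": row["balance"]}
        )
    return customers, accounts


customers, accounts = normalize(flat_rows)
print(customers)
print(len(accounts), "accounts")
```

After the split, each customer name is stored once, so a change of name touches one record instead of every account row.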
Requirements for Data Models
All data models have a context and scope, although they may not be formally defined. The context of a data model is the range within which it is valid, whilst the scope of a model is what it contains. A problem that often arises is that when business requirements change due to market, regulatory or organizational dynamics, the resulting addition to the model's scope takes it outside its original context.
Since there are no set standards for data modeling, here are a few requirements that, in principle, every data model should follow...
More about Entities, Attributes and Relationships...
Data modeling is either a top-down or a bottom-up process. In the top-down process, the data model is derived from an intimate understanding of the business; in the bottom-up process, it is derived by reviewing specifications and business documents. In either case, a basic model representing entities and relationships is developed first to help visualize the model. Detail is then added by including information about attributes and business rules. There are different approaches, yet everything revolves around three key tenets: entities, attributes and relationships. Whatever the approach, an effective data model must completely and accurately represent the data requirements of the stakeholders (read: end users).
Entities are objects that contain descriptive information. If a data object has been identified and is described by other objects, then it is an entity. By and large, an entity is a "thing", "concept" or "object", yet an entity can sometimes represent a relationship between two or more data objects, in which case it is known as an associative entity.
Attributes are data objects that either identify or describe entities. Attributes that identify entities are called key attributes and have values that are atomic (read: they present a single fact).
Relationships are associations between entities. Typically, a relationship is indicated by a verb connecting two or more entities, as in "a customer has a bank account", and should be further classified in terms of cardinality, optionality, direction and dependence. Cardinality quantifies the relationship, as in "a customer has two accounts", while optionality states whether the relationship is mandatory, as in "a customer must have at least one account". The point to remember is that a relationship can be "one-to-one", "one-to-many" or "many-to-many", depending on the context in which the data model is set.
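The three tenets can be sketched with plain Python dataclasses. This is only an illustration under assumed names: Customer, Account and the validate method are hypothetical and are not taken from any particular modeling tool.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    account_no: str       # key attribute: identifies the account
    balance: float = 0.0  # descriptive attribute


@dataclass
class Customer:
    customer_id: int  # key attribute: identifies the customer
    name: str         # descriptive attribute
    # Relationship "a customer has accounts", with one-to-many cardinality.
    accounts: list = field(default_factory=list)

    def validate(self):
        # Optionality: the relationship is mandatory, at least one account.
        if not self.accounts:
            raise ValueError(f"customer {self.customer_id} must have at least one account")


asha = Customer(1, "Asha", [Account("A-100", 250.0), Account("A-101", 75.5)])
asha.validate()  # passes: two accounts satisfy "at least one"
print(len(asha.accounts), "accounts for", asha.name)
```

A customer created without any accounts would fail validate, which is how the "must have at least one account" rule is expressed in this sketch.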
Diving a bit deeper, complex relationships include ternary relationships, which are associations among three entities. A ternary relationship should be replaced by an associative entity, with the original entities related to this new entity, to simplify the implementation; this is the job of the data modeler and analysts working in tandem. It is resolved by introducing "keys" that tie the different entities together, keeping the model simple and contextual.
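As a rough sketch of how a ternary relationship can be resolved, consider the association "a supplier supplies a part to a project". The entities, the Supply class and the add_supply helper below are hypothetical examples, not part of the original discussion.

```python
from dataclasses import dataclass

# Three original entities, each reduced to a key -> name lookup for brevity.
suppliers = {1: "Acme"}
parts = {10: "Bolt"}
projects = {100: "Bridge"}


@dataclass(frozen=True)
class Supply:
    """Associative entity replacing the ternary supplier-part-project relationship."""
    supplier_id: int
    part_id: int
    project_id: int
    quantity: int  # attribute that belongs to the association itself

    @property
    def key(self):
        # Composite key built from the keys of the three related entities.
        return (self.supplier_id, self.part_id, self.project_id)


supplies = {}


def add_supply(supply):
    # Every key in the associative entity must point at an existing entity.
    if (supply.supplier_id not in suppliers
            or supply.part_id not in parts
            or supply.project_id not in projects):
        raise KeyError("supply refers to an unknown supplier, part or project")
    if supply.key in supplies:
        raise ValueError("this supplier, part and project combination already exists")
    supplies[supply.key] = supply


add_supply(Supply(1, 10, 100, quantity=500))
print(sorted(supplies))
```

Each of the three original entities now has a simple one-to-many relationship with Supply, which is usually easier to implement than a single three-way association.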
Validating keys and relationships is basic hygiene, and three rules (read: constraints) governing the identification and migration of primary keys must be enforced, such as...
In a nutshell, primary and foreign keys are the vital components on which the relational theory of data modeling rests. Each entity must have an attribute or set of attributes, the primary key, whose values uniquely identify each instance of the entity. Every child entity must have an attribute, the foreign key, that completes the association with the parent entity and links each child instance to its parent. This helps build a generalization hierarchy for structured grouping of entities that share common attributes while preserving their respective differences.
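The role of primary and foreign keys can be demonstrated with Python's built-in sqlite3 module. The customer and account tables and the try_insert helper are hypothetical and serve only to illustrate the two constraints; note that SQLite enforces foreign keys only after the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default

# Parent entity: the primary key uniquely identifies each customer.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# Child entity: the foreign key completes the association with the parent.
conn.execute(
    """CREATE TABLE account (
           account_no  TEXT PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
           balance     REAL NOT NULL DEFAULT 0
       )"""
)
conn.execute("INSERT INTO customer VALUES (1, 'Asha')")
conn.execute("INSERT INTO account VALUES ('A-100', 1, 250.0)")


def try_insert(sql):
    """Return True if the insert succeeds, False if a key constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# An account pointing at a customer that does not exist violates the foreign key.
orphan_ok = try_insert("INSERT INTO account VALUES ('X-1', 99, 0)")
# A second customer with the same id violates the primary key.
duplicate_ok = try_insert("INSERT INTO customer VALUES (1, 'Other')")
print(orphan_ok, duplicate_ok)
```

Both inserts are rejected, so the database itself guarantees that every account belongs to exactly one existing customer.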
Data Model integration with Process Model
In the context of business process integration, data modeling complements business process modeling and ultimately results in proof of value. The two must co-exist and be well orchestrated for synergy and balance. Process modeling, or more specifically Business Process Modeling (BPM), involves representing the processes of an enterprise so that existing processes can be analyzed to improve quality and efficiency. BPM is generally a diagrammatic representation of the sequence of activities or workflows carried out in an organization. It displays the events, actions and connection points from the start to the end of the sequence, and is used to improve the efficiency and quality of the business process. The difference is that a data model focuses on data objects and assets, whereas a process model focuses on functional primitives: processes, activities and tasks.
A data flow diagram (DFD) is a good example of an integrated outcome of both data and process models. It does not, however, show program logic or processing steps; it is limited to describing what the system does rather than how it does it. The second deliverable is a set of entries about data objects to be stored in a repository or project dictionary. This repository links the data, process and logic models of an information system. Data elements that are included in the DFD must appear in the data model and vice versa. Each data store in a process model must relate to business objects represented in the data model. A quick glance at the different data modeling methodologies might include:
In conclusion, without a data modeling exercise, enterprise data fails to provide business value and in some cases impedes business success through inaccuracy, misuse or misunderstanding. Adopting well-defined data modeling as a best practice accelerates and augments the business value of data.