Sharding is a technique for splitting a single large logical dataset into multiple smaller databases so that the data is easier to store and manage. The word shard means "a small part of a whole", and these smaller parts of a large database are called data shards.

What are Horizontal and Vertical Sharding?

When each new table or dataset has the same schema but holds a unique set of rows, it is called horizontal sharding. On the other hand, when each new table has a schema that is a proper subset of the original table's schema, it is known as vertical sharding. The names horizontal and vertical come from the traditional tabular view of a database: horizontal sharding cuts the table between rows, while vertical sharding cuts it between columns.
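To make the distinction concrete, here is a minimal Python sketch. The users table, its columns, and the helper names horizontal_shards and vertical_shards are all hypothetical, chosen only for illustration. It assumes rows are split by id modulo the shard count, which is just one of several possible ways to assign rows.

```python
# Hypothetical "users" table, represented as a list of row dictionaries.
users = [
    {"id": 1, "name": "Ana", "email": "ana@example.com", "bio": "Likes maps"},
    {"id": 2, "name": "Ben", "email": "ben@example.com", "bio": "Runs daily"},
    {"id": 3, "name": "Cy", "email": "cy@example.com", "bio": "Plays chess"},
    {"id": 4, "name": "Di", "email": "di@example.com", "bio": "Bakes bread"},
]


def horizontal_shards(rows, shard_count):
    """Split rows across shards; every shard keeps the full schema."""
    shards = [[] for _ in range(shard_count)]
    for row in rows:
        shards[row["id"] % shard_count].append(row)
    return shards


def vertical_shards(rows, key, column_groups):
    """Split columns across shards; every shard keeps the key and all rows."""
    return [
        [{col: row[col] for col in (key, *group)} for row in rows]
        for group in column_groups
    ]


by_rows = horizontal_shards(users, 2)
by_columns = vertical_shards(users, "id", [("name", "email"), ("bio",)])
```

In by_rows, each shard holds different users with all four columns. In by_columns, each shard holds all four users but only some of the columns, plus the id needed to join the pieces back together.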

What are the different Sharding strategies?

1. The Lookup strategy
2. The Range strategy
3. The Hash strategy
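The three strategies differ in how a key is mapped to a shard. The sketch below shows one common reading of each, not a definitive implementation. The shard names, the tenant map, the range boundaries, and the function names are all assumptions made for illustration.

```python
import bisect
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2"]  # hypothetical shard names

# Lookup strategy: an explicit map from a key (here, a tenant) to a shard.
TENANT_TO_SHARD = {"acme": "shard-0", "globex": "shard-2"}


def lookup_shard(tenant):
    return TENANT_TO_SHARD[tenant]


# Range strategy: contiguous key ranges, each owned by one shard.
# Keys below 1000 go to shard-0, 1000-1999 to shard-1, the rest to shard-2.
RANGE_UPPER_BOUNDS = [1000, 2000]


def range_shard(order_id):
    return SHARDS[bisect.bisect_right(RANGE_UPPER_BOUNDS, order_id)]


# Hash strategy: a stable hash of the key, reduced modulo the shard count.
def hash_shard(key):
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]
```

The lookup map gives full control over placement but must be maintained. Ranges keep neighbouring keys together, which suits range queries. Hashing tends to spread keys evenly but scatters neighbouring keys across shards.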

Data Modelling

What is Data Modelling?

Data modelling in software engineering is the process of creating data models or visual representations of either a whole information system or parts of it to communicate connections between data points and structures. Data modelling represents data with diagrams, symbols, or text to visualize the interrelation and data structure. It helps to improve data analytics by increasing consistency in naming, rules, semantics, and security.

In simple language, the goal of data modelling is to illustrate the types of data used and stored within the system, the relationships among these data types, the ways the data can be grouped and organized, and its formats and attributes. We generally build these models around business needs. Data representation helps the organization comprehend, analyze, and reuse complex data more easily.

What is Data Modelling used for, and why is it important?

Data modelling helps create relations between data sets, tables, and data models across the information system.
The data model makes sure that all the data objects required by the database are mapped properly.
Data modelling helps create a robust design that can show an organization's entire data on the same platform.
A data model helps maintain consistency across all projects.
It also helps in designing the conceptual, logical, and physical levels to define the model's parameters, structure the data for use, and produce sets of data in a common format for analysis and use by the organization.
Data modelling reduces errors and faults, which improves the accuracy of data reports.
Data models create a visual representation of the data, which makes data analysis easier.

What are the stages of Data Modelling?

There are three stages in data modelling: conceptual, logical, and physical. Each stage brings the database closer to reality.

Conceptual Data Modelling

The conceptual model defines what the system contains: which data entities are represented and what kinds of relationships exist between them. It is a view of the data required to support business processes, and it also tracks business events and related performance measures. This type of data modelling focuses on identifying the data used in a business rather than on processing flow. Its main purpose is to organize and define business rules and concepts. For example, it helps business people view data such as market data, customer data, and purchase data.

Logical Data Modelling

A logical data model is a map of rules and data structures that covers the required data, such as tables, rows, and columns. It describes how the system should be implemented and what the data representation should look like. Its purpose is to develop a technical map of rules and data structures, and it forms the base for the physical model.

Physical Data Modelling

The physical data model focuses on the implementation of the data: which database management technology will be used, the design of the tables that make up the actual database, and the keys that represent the relationships between these tables. It specifies each table, column, and constraint, such as primary keys and foreign keys. Since the physical data model is the most detailed and usually the final step before database creation, it often accounts for database management system-specific properties and rules.
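As an illustration of what a physical model pins down, here is a small sketch using Python's built-in sqlite3 module. The customer and purchase tables and their columns are hypothetical. The PRAGMA line is an example of a system-specific rule: SQLite only enforces foreign keys when that setting is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement

# Physical design: concrete tables, column types, keys, and constraints.
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE purchase (
    purchase_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    amount      REAL NOT NULL CHECK (amount >= 0)
);
""")

conn.execute("INSERT INTO customer VALUES (1, 'Ana')")
conn.execute("INSERT INTO purchase VALUES (10, 1, 25.0)")


def insert_orphan_purchase():
    """Try to insert a purchase for a customer that does not exist."""
    try:
        conn.execute("INSERT INTO purchase VALUES (11, 99, 5.0)")
        return True
    except sqlite3.IntegrityError:
        return False


orphan_accepted = insert_orphan_purchase()
```

The foreign key turns the customer-to-purchase relationship from the earlier stages into a rule the database itself enforces, so the orphan row is rejected.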

What are the Data Modelling techniques/types?

Data models show how data is stored, connected, accessed, and updated in a database management system. Many data models are in use today, and the relational model is the most widely used. The main types of data models are as follows.

Hierarchical Model

This model organizes data in a hierarchical tree structure. The hierarchy starts at the root, which holds the root data, and expands as a tree by adding child nodes to parent nodes. Each child node has exactly one parent.
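A minimal sketch of such a tree in Python follows. The Node class, the paths helper, and the sample organization data are hypothetical and serve only to show the parent-child structure.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list["Node"] = field(default_factory=list)

    def add_child(self, name):
        """Attach a new child under this node and return it."""
        child = Node(name)
        self.children.append(child)
        return child


def paths(node, prefix=""):
    """Yield the path from the root to every node, depth first."""
    path = f"{prefix}/{node.name}"
    yield path
    for child in node.children:
        yield from paths(child, path)


root = Node("company")
sales = root.add_child("sales")
sales.add_child("emea")
root.add_child("engineering")

all_paths = list(paths(root))
```

Because every node is reached through exactly one parent, each record has a single path from the root, which is what makes lookups along the hierarchy straightforward.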

Network Model

This model is similar to the hierarchical model. The only difference is that a record can have more than one parent, so the hierarchical tree is replaced with a graph.
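The sketch below stores parent-child links as edges of a graph, so one record can sit under several parents. The record names are hypothetical sample data.

```python
from collections import defaultdict

# Each pair is (parent, child). "db-course" has two parents,
# which a strict tree cannot express.
links = [
    ("cs-department", "db-course"),
    ("prof-lee", "db-course"),
    ("cs-department", "prof-lee"),
]

parents = defaultdict(set)
children = defaultdict(set)
for parent, child in links:
    parents[child].add(parent)
    children[parent].add(child)

course_parents = sorted(parents["db-course"])
```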

Entity-Relationship Model

In this model, we represent the real-world problem in pictorial form, as entities and the relationships between them, to make it easy for stakeholders to understand.

Relational Model

The relational model is the most widely used data model. In this model, the data is maintained in two-dimensional tables, and all the information is stored in rows and columns.
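For a concrete view of rows and columns, here is a short sketch using Python's built-in sqlite3 module. The employee table and its contents are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE employee (id INTEGER, name TEXT, department TEXT)")
db.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ana", "Sales"), (2, "Ben", "Engineering"), (3, "Cy", "Sales")],
)

# Selection (WHERE) picks rows; projection (the column list) picks columns.
sales_names = [
    name
    for (name,) in db.execute(
        "SELECT name FROM employee WHERE department = ? ORDER BY id", ("Sales",)
    )
]
```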

Object-Oriented Data Model

Real-world problems are represented more closely by the object-oriented data model. In this model, both the data and its relationships are held in a single structure known as an object.
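A small sketch of the idea, with hypothetical Author and Book classes: the object carries its own data, its relationship to another object, and behaviour that uses both.

```python
from dataclasses import dataclass, field


@dataclass
class Author:
    name: str


@dataclass
class Book:
    title: str
    author: Author  # the relationship is held as an object reference
    tags: list[str] = field(default_factory=list)

    def citation(self):
        """Combine the book's own data with data reached via the relationship."""
        return f"{self.author.name}, {self.title}"


book = Book("Data Models", Author("R. Example"))
citation = book.citation()
```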

Object-Relational Data Model

As the name suggests, it is a combination of relational and object-oriented models. This model was built to fill the gap between the object-oriented model and the relational model.

Flat Data Model

It is a simple model in which the database is represented as a table consisting of rows and columns.
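A CSV file is a familiar example of this shape. The sketch below reads a hypothetical flat file with Python's built-in csv module. Note that every value comes back as text, since a flat file carries no column types.

```python
import csv
import io

# Hypothetical flat file: one table, one header row, no links to other tables.
flat_file = "id,name,city\n1,Ana,Lisbon\n2,Ben,Oslo\n"

rows = list(csv.DictReader(io.StringIO(flat_file)))
cities = [row["city"] for row in rows]
```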

Semi-Structured Data Model

In this model, some entities may be missing attributes while others may have extra attributes. This gives flexibility both in how the data is stored and in which attributes each entity carries.
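JSON documents are a common example of semi-structured data. In the sketch below, which uses hypothetical records, the second record lacks an email and the third carries an extra nickname, yet all three live in the same collection.

```python
import json

raw = """
[
  {"id": 1, "name": "Ana", "email": "ana@example.com"},
  {"id": 2, "name": "Ben"},
  {"id": 3, "name": "Cy", "email": "cy@example.com", "nickname": "C"}
]
"""

records = json.loads(raw)

# The set of attributes is discovered from the data, not fixed in advance.
all_attributes = sorted({key for record in records for key in record})

# Readers must tolerate missing attributes; .get returns None when absent.
emails = [record.get("email") for record in records]
```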

Associative Data Model

The associative data model divides data into two parts. Everything that has an independent existence is called an entity, and the relationships among these entities are called associations.
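One simple way to picture this split is sketched below. The entity identifiers, the sample flight data, and the describe helper are hypothetical, and storing each association as a (source, verb, target) triple is an assumption made for illustration.

```python
# Entities: things with an independent existence (hypothetical sample data).
entities = {1: "Flight 101", 2: "Lisbon", 3: "Oslo"}

# Associations: (source entity, verb, target entity) links between entities.
associations = [
    (1, "departs from", 2),
    (1, "arrives at", 3),
]


def describe(entity_id):
    """List every association whose source is the given entity."""
    return [
        f"{entities[source]} {verb} {entities[target]}"
        for source, verb, target in associations
        if source == entity_id
    ]


flight_facts = describe(1)
```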

Context Data Model

The context data model is a collection of several models, such as network models and relational models. By combining them, we can perform various tasks that are not possible with any single model alone.
