Sharding

What is Sharding?

Sharding is a technique for splitting, or partitioning, a single large logical dataset into multiple smaller databases that are stored separately so the data is easier to manage. The word shard means "a small part of a whole", and these smaller parts of a large database are called data shards. Distributing data among multiple machines lets enterprises build clusters of database systems that are easier to manage and faster to access.

Sharding is essential when a dataset is too large to be stored in a single database or on a single machine. Many sharding strategies also allow additional machines to be added to the system to distribute the load, which enables a database cluster to scale as its data and traffic grow.

Sharding is a database partitioning technique widely used by blockchain companies to achieve scalability and efficiency and to process more transactions per second. Each shard holds its own data, which makes it distinct from and independent of the other shards in the system.

Sharding can help reduce the processing load and latency of a network, since it splits a blockchain network into separate, smaller shards. Each shard is managed by an individual server. Depending on the replication scheme, a shard may also be replicated twice on other servers, which can complicate the system because the copies must be kept consistent.

Source: SingleStore

What are Horizontal and Vertical Sharding?

When each new table or dataset has the same schema but unique rows, it is called horizontal sharding. On the other hand, when each new table has a schema that is a proper subset of the original table's schema, it is known as vertical sharding. The distinction between horizontal and vertical comes from the traditional tabular view of a database, in which rows run horizontally and columns run vertically.

In horizontal sharding, more machines are added to an existing stack to spread out the load, partition the data, increase processing speed, and support more traffic. This method is most effective when queries return a subset of rows that are often grouped by common characteristics. Vertical sharding is effective when queries usually return only a subset of the columns.

A database can be split vertically, storing different tables and columns in separate databases, or horizontally, storing rows of the same table across multiple database nodes. Vertical partitioning is very domain-specific: you draw a logical split within your application data and store the parts in different databases inside the system.

Original Dataset

Patient ID | Name | Age | Department | Doctor
1221 | Keith | 23 | Cardiology | Dr Moses
1222 | Simon | 37 | General | Dr Solomon
1223 | Jhon | 43 | Orthopaedic | Dr Jacob
1224 | Rebecca | 26 | Gynecology | Dr Issac
1225 | Catherine | 51 | Oncology | Dr Noah

Horizontal Shards

Shard 1

Patient ID | Name | Age | Department | Doctor
1221 | Keith | 23 | Cardiology | Dr Moses
1222 | Simon | 37 | General | Dr Solomon

Shard 2

Patient ID | Name | Age | Department | Doctor
1223 | Jhon | 43 | Orthopaedic | Dr Jacob
1224 | Rebecca | 26 | Gynecology | Dr Issac
1225 | Catherine | 51 | Oncology | Dr Noah

Vertical Shards

Shard 1

Patient ID | Name | Age
1221 | Keith | 23
1222 | Simon | 37
1223 | Jhon | 43

Shard 2

Patient ID | Department
1221 | Cardiology
1222 | General
1223 | Orthopaedic

Shard 3

Patient ID | Doctor
1221 | Dr Moses
1222 | Dr Solomon
1223 | Dr Jacob
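
The two kinds of split can be sketched in a few lines of Python. This is only an illustration on the example dataset above: the function names, the dictionary field names, and the choice of 1223 as the boundary ID are assumptions made for the sketch, not part of any particular database product.

```python
# Example patient records, copied from the original dataset above.
PATIENTS = [
    {"patient_id": 1221, "name": "Keith", "age": 23, "department": "Cardiology", "doctor": "Dr Moses"},
    {"patient_id": 1222, "name": "Simon", "age": 37, "department": "General", "doctor": "Dr Solomon"},
    {"patient_id": 1223, "name": "Jhon", "age": 43, "department": "Orthopaedic", "doctor": "Dr Jacob"},
    {"patient_id": 1224, "name": "Rebecca", "age": 26, "department": "Gynecology", "doctor": "Dr Issac"},
    {"patient_id": 1225, "name": "Catherine", "age": 51, "department": "Oncology", "doctor": "Dr Noah"},
]


def horizontal_shards(rows, boundary_id):
    """Split by rows: both shards keep every column but hold different rows."""
    shard_1 = [row for row in rows if row["patient_id"] < boundary_id]
    shard_2 = [row for row in rows if row["patient_id"] >= boundary_id]
    return shard_1, shard_2


def vertical_shards(rows, column_groups, key="patient_id"):
    """Split by columns: each shard keeps the key plus one group of columns."""
    shards = []
    for columns in column_groups:
        shard = [{key: row[key], **{c: row[c] for c in columns}} for row in rows]
        shards.append(shard)
    return shards


# Hypothetical boundary: IDs below 1223 go to shard 1, the rest to shard 2.
h1, h2 = horizontal_shards(PATIENTS, boundary_id=1223)

# Column groups matching the three vertical shards in the example.
v1, v2, v3 = vertical_shards(PATIENTS, [["name", "age"], ["department"], ["doctor"]])
```

Every vertical shard repeats the Patient ID column so that the pieces of a record can be joined back together when a query needs columns from more than one shard.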

What are the different Sharding strategies?

The Lookup strategy

In the lookup strategy, the sharding logic implements a map that routes a request for data to the shard that holds that data, using a unique shard key. The mapping between shard keys and physical storage can be based on physical shards, where each shard key maps to a physical partition. Another way to shard under the lookup strategy is to distribute the shards virtually: system designers assign shard keys to virtual shards, which are in turn mapped onto a smaller number of physical partitions. In this approach, an application locates data using a shard key that refers to a virtual shard, and the system transparently maps virtual shards to physical partitions.
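
As a rough sketch of the lookup strategy with virtual shards, the two maps below route a shard key first to a virtual shard and then to a physical partition. The use of the department as the shard key, the virtual shard labels (`v1` to `v5`), and the partition names (`db-a`, `db-b`, `db-c`) are all hypothetical, chosen only to make the example concrete.

```python
# Hypothetical lookup table: shard key (department) -> virtual shard.
KEY_TO_VIRTUAL = {
    "Cardiology": "v1",
    "General": "v2",
    "Orthopaedic": "v3",
    "Gynecology": "v4",
    "Oncology": "v5",
}

# Hypothetical lookup table: virtual shard -> physical partition.
# Five virtual shards share two physical partitions.
VIRTUAL_TO_PHYSICAL = {
    "v1": "db-a",
    "v2": "db-a",
    "v3": "db-b",
    "v4": "db-b",
    "v5": "db-b",
}


def locate(shard_key, virtual_to_physical=VIRTUAL_TO_PHYSICAL):
    """Return the physical partition that holds data for the given shard key."""
    virtual_shard = KEY_TO_VIRTUAL[shard_key]
    return virtual_to_physical[virtual_shard]


# Rebalancing: move virtual shard v3 to a new partition by editing one entry.
# The shard keys and the application code that calls locate() do not change.
REBALANCED = {**VIRTUAL_TO_PHYSICAL, "v3": "db-c"}
```

With these tables, `locate("Cardiology")` returns `"db-a"`, and `locate("Orthopaedic", REBALANCED)` returns `"db-c"` after the rebalance. The indirection through virtual shards is what keeps the move invisible to the application.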

The Range strategy

Under the range strategy, related items are grouped together in the same shard and ordered by shard key, with shard keys assigned sequentially. This strategy is useful for applications and services that frequently retrieve sets of items using range queries, that is, queries that return all items whose shard key falls within a given range. For example, a hospital application may regularly need to list its patients for a given month. With the range strategy, all patient records for a month can be stored in date and time order in the same shard. If each record were stored in a different shard, the records would have to be fetched individually with a large number of point queries, which is slow. With a range strategy and a composite shard key, for example one built from the year and month, the hospital can retrieve a month's data from a single shard.
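
The monthly example can be sketched as follows. The shard names and start dates are hypothetical, and the sketch assumes each shard covers the dates from its start date up to the next shard's start date, with the last shard left open-ended.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical shard boundaries, sorted by start date. Each shard covers its
# start date up to (but not including) the next shard's start date; the last
# shard is open-ended in this sketch.
SHARD_STARTS = [date(2023, 1, 1), date(2023, 2, 1), date(2023, 3, 1)]
SHARD_NAMES = ["shard-2023-01", "shard-2023-02", "shard-2023-03"]


def _shard_index(day):
    """Index of the shard whose range contains the given date."""
    index = bisect_right(SHARD_STARTS, day) - 1
    if index < 0:
        raise ValueError(f"no shard covers {day}")
    return index


def shard_for(day):
    """Shard that stores a record with the given date."""
    return SHARD_NAMES[_shard_index(day)]


def shards_for_range(start, end):
    """Shards touched by a query over the inclusive date range [start, end]."""
    return SHARD_NAMES[_shard_index(start):_shard_index(end) + 1]


# A query for all of February touches a single shard.
february = shards_for_range(date(2023, 2, 1), date(2023, 2, 28))
```

Because the boundaries are sorted, a binary search finds the shard for any date, and a range query only needs the contiguous run of shards between its two endpoints.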

The Hash strategy

The hash strategy applies a hash function to the shard key to decide which shard stores each item. It is related to consistent hashing, in which load is divided evenly across servers with the help of virtual nodes. In database management, the benefit of this strategy is a lower chance of hotspots, that is, shards that receive a disproportionate share of the data or load. The system distributes data across the shards so as to balance the size of each shard against the average load each shard will encounter. Because a good hash function scatters keys in a way that looks random, hashing the shard key spreads the data roughly evenly across the shards.
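
A minimal sketch of hash-based shard selection is shown below. It assumes the shard key is a string and uses SHA-256 from the Python standard library as the hash function. The function name, the number of shards, and the `patient-` key format are hypothetical.

```python
import hashlib
from collections import Counter


def shard_for(shard_key, num_shards):
    """Map a shard key to a shard number in the range [0, num_shards)."""
    # A cryptographic digest is used here because it is stable across runs
    # and machines; Python's built-in hash() is randomized per process for str.
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Distribute 1000 hypothetical patient IDs over 4 shards and count the result.
keys = [f"patient-{patient_id}" for patient_id in range(1221, 2221)]
counts = Counter(shard_for(key, 4) for key in keys)
```

Sequential IDs such as 1221, 1222, 1223 would all fall into the same shard under a range strategy, whereas hashing scatters them across shards. One trade-off of the simple modulo step is that changing `num_shards` reassigns most keys, which is the problem that consistent hashing with virtual nodes is designed to reduce.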
