Concept drift

What is concept drift?

In machine learning, predictive modeling, and data mining, concept drift is the change over time in the relationship between input data and output data in the underlying problem. The ‘concept’ in question is the unknown, hidden relationship between the input and output variables.

“A difficult problem with learning in many real-world domains is that the concept of interest may depend on some hidden context, not given explicitly in the form of predictive features. Often the cause of change is hidden, not known a priori, making the learning task more complicated.”
—  The problem of concept drift: definitions and related work, 2004.


It is a phenomenon in which the statistical properties of the target variable, the one the model is trying to predict, change over time. The context changes without the model being aware of the change, so the patterns the model learned are no longer valid.

Usually, either the law behind the data changes, so that the model built on past data can no longer be used, or the assumptions the model made from past data need to be revised in light of current data.
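
To make the idea concrete, here is a minimal Python sketch. It is an illustration under invented assumptions, not a real workload: the one-feature data, the cut-off values 0.3 and 0.7, and the helper names fit_threshold and accuracy are all made up for the example. A rule learned under the old concept scores perfectly on old data and noticeably worse once the hidden relationship moves.

```python
import random


def fit_threshold(xs, ys):
    """Pick the cut-off t that best separates the labels with the rule x > t."""
    def accuracy_at(t):
        return sum((x > t) == y for x, y in zip(xs, ys)) / len(xs)
    return max(sorted(set(xs)), key=accuracy_at)


def accuracy(threshold, xs, ys):
    """Share of points that the rule x > threshold labels correctly."""
    return sum((x > threshold) == y for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)

# Old concept (assumed): the label is True when x exceeds 0.3.
old_x = [rng.random() for _ in range(500)]
old_y = [x > 0.3 for x in old_x]

# New concept (assumed): same input distribution, but the boundary moved to 0.7.
new_x = [rng.random() for _ in range(500)]
new_y = [x > 0.7 for x in new_x]

threshold = fit_threshold(old_x, old_y)
acc_old = accuracy(threshold, old_x, old_y)
acc_new = accuracy(threshold, new_x, new_y)
print(f"accuracy on old concept: {acc_old:.2f}, on new concept: {acc_new:.2f}")
```

The inputs look the same before and after the change; only the mapping from input to output has moved, which is why the model cannot notice the drift from the inputs alone.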


How do concept drift changes occur?

The way concept drift takes place helps determine how it should be handled. The changes may occur:

1. Suddenly

There is an abrupt shift from an old concept to a new one. For example, the lockdowns triggered by the pandemic caused instantaneous changes in the behavioral patterns of populations around the world.


2. Gradually

The change from one concept to another may be incremental, as new information comes to light and new concepts emerge. Model quality then declines slowly, usually because of changes in external factors.


3. Seasonally

These are recurring changes. For example, buying patterns change during festive seasons and revert to normal after those seasons end.
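
The three patterns can be sketched as simple functions of time. This is only an illustration: the "concept" is reduced to a single parameter between 0.0 (old concept) and 1.0 (new concept), and the change points and the period are arbitrary values chosen for the example.

```python
import math


def sudden(t, change_at=500):
    """Concept parameter jumps from 0.0 to 1.0 at one instant."""
    return 0.0 if t < change_at else 1.0


def gradual(t, start=300, end=700):
    """Concept parameter moves in small increments from 0.0 to 1.0."""
    if t <= start:
        return 0.0
    if t >= end:
        return 1.0
    return (t - start) / (end - start)


def seasonal(t, period=250):
    """Concept parameter cycles away from 0.0 and back again every period."""
    return 0.5 * (1 - math.cos(2 * math.pi * t / period))


for t in (0, 125, 250, 499, 500, 700):
    print(t, sudden(t), round(gradual(t), 3), round(seasonal(t), 3))
```

A sudden change calls for a fast reaction, a gradual one rewards continuous adaptation, and a seasonal one suggests keeping earlier models around because the old concept will return.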

How do you find concept drift?

Here are a few methods that can be used to detect concept drift:

Statistical process control (SPC) / sequential analysis concept drift detectors

These detectors check whether the predictive model’s error rate is in control, and alert you when it goes out of control.
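
As a rough sketch of the in-control idea, the monitor below loosely follows the Drift Detection Method (DDM): it tracks the running error rate and its standard deviation, remembers the lowest level seen so far, and raises a warning or a drift signal when the current level climbs two or three standard deviations above that minimum. The class name, the 30-sample warm-up, and the synthetic error stream are assumptions made for this example, not a reference implementation.

```python
import math


class ErrorRateMonitor:
    """DDM-style monitor over a stream of 0/1 prediction errors (a sketch)."""

    def __init__(self, min_samples=30, warn_sigmas=2.0, drift_sigmas=3.0):
        self.min_samples = min_samples
        self.warn_sigmas = warn_sigmas
        self.drift_sigmas = drift_sigmas
        self.reset()

    def reset(self):
        self.n = 0
        self.errors = 0
        self.p_min = math.inf
        self.s_min = math.inf

    def update(self, error):
        """error is 1 for a wrong prediction and 0 for a correct one."""
        self.n += 1
        self.errors += error
        p = self.errors / self.n              # running error rate
        s = math.sqrt(p * (1 - p) / self.n)   # its standard deviation
        if self.n < self.min_samples:
            return "in-control"
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s     # best level seen so far
        if p + s > self.p_min + self.drift_sigmas * self.s_min:
            self.reset()                      # start again on the new concept
            return "drift"
        if p + s > self.p_min + self.warn_sigmas * self.s_min:
            return "warning"
        return "in-control"


# Synthetic stream: 10% errors for 200 predictions, then 50% errors.
stream = [1 if i % 10 == 0 else 0 for i in range(200)]
stream += [1 if i % 2 == 0 else 0 for i in range(200)]

monitor = ErrorRateMonitor()
statuses = [monitor.update(e) for e in stream]
first_drift = statuses.index("drift")
print("first drift signal at prediction", first_drift)
```

The warning level is useful in practice as the point at which to start collecting data for a replacement model, so that one is ready if the drift signal follows.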

Some methods in this family include:

  • Drift Detection Method (DDM) and Early Drift Detection Method (EDDM):
    EDDM is more effective than DDM at detecting incremental drift, but it is also more sensitive to noise.
  • Linear Four Rates (LFR), a sequential analysis method
  • Page-Hinkley test (PHT):
    This is typically used to detect a change in the average of a Gaussian signal.
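
The Page-Hinkley test from the list above is short enough to sketch directly. The version below watches for an increase in the mean of a signal: it accumulates how far each value sits above the running mean (minus a small tolerance) and raises an alarm when that sum rises more than a threshold above its lowest point. The parameter values, the class layout, and the synthetic Gaussian signal are assumptions chosen for this example.

```python
import random


class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a signal (a sketch)."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm level
        self.n = 0
        self.mean = 0.0
        self.cumulative = 0.0
        self.minimum = 0.0

    def update(self, x):
        """Feed one value; return True when a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cumulative += x - self.mean - self.delta
        self.minimum = min(self.minimum, self.cumulative)
        return self.cumulative - self.minimum > self.threshold


rng = random.Random(42)
# Gaussian signal whose mean shifts from 0.0 to 1.0 after 100 values.
signal = [rng.gauss(0.0, 0.1) for _ in range(100)]
signal += [rng.gauss(1.0, 0.1) for _ in range(100)]

detector = PageHinkley()
alarms = [i for i, x in enumerate(signal) if detector.update(x)]
first_alarm = alarms[0]
print("change signalled at index", first_alarm)
```

In a drift-detection setting the signal would typically be the model's per-prediction error or loss, so an upward shift in its mean corresponds to degrading performance.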

How do you fix concept drift?

Here are a few ways to address concept drift:

1. Static model

The most common approach is to do nothing: fit a static model once and assume that the data does not change. If you suspect that your dataset is subject to concept drift, you can monitor the skill of the static model over time to detect it, and use that skill as a baseline against which to compare any changes you make.
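
Monitoring the skill of a static model can be as simple as tracking accuracy over a rolling window and comparing it with the accuracy measured at deployment time. The sketch below assumes that ground-truth labels arrive eventually, so each prediction can be marked correct or incorrect; the window size, the tolerance, and the function names are illustrative choices.

```python
from collections import deque


def rolling_accuracy(correct_flags, window=50):
    """Yield the accuracy over the most recent `window` predictions."""
    recent = deque(maxlen=window)
    for flag in correct_flags:
        recent.append(flag)
        yield sum(recent) / len(recent)


def first_degradation(correct_flags, baseline, tolerance=0.1, window=50):
    """Index at which rolling accuracy first falls below baseline - tolerance."""
    for i, acc in enumerate(rolling_accuracy(correct_flags, window)):
        if i + 1 >= window and acc < baseline - tolerance:
            return i
    return None


# Synthetic outcomes: 90% correct for 300 predictions, then 60% correct.
stable = [0 if i % 10 == 0 else 1 for i in range(300)]
degraded = [0 if i % 10 < 4 else 1 for i in range(300)]

alert_index = first_degradation(stable + degraded, baseline=0.9)
no_alert = first_degradation(stable, baseline=0.9)
print("degradation flagged at prediction", alert_index)
```

The same baseline figure can then be reused to judge whether any of the interventions below actually improves on doing nothing.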


2. Periodically Re-Fit

This involves periodically re-fitting the model on more recent historical data. You may need to back-test the model to decide how much historical data to include when re-fitting. Sometimes the best option is to use only a small amount of recent data, so that the model captures the new relationship between input data and output data.
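
One way to choose how much history to include is to back-test several window lengths: re-fit on each candidate window, score the result on the most recent points, and keep the window with the lowest error. The sketch below uses a one-variable least-squares line and synthetic data in which the relationship changes at index 300; the candidate window sizes, the 20-point hold-out, and the helper names are assumptions for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def backtest_window(xs, ys, window, horizon=20):
    """Mean absolute error on the last `horizon` points after fitting
    on the `window` points that precede them."""
    train_x = xs[-horizon - window:-horizon]
    train_y = ys[-horizon - window:-horizon]
    a, b = fit_line(train_x, train_y)
    held_out = zip(xs[-horizon:], ys[-horizon:])
    return sum(abs(a * x + b - y) for x, y in held_out) / horizon


# Synthetic history: y = 2x for the first 300 points, then y = 5 - x.
xs = [i % 10 for i in range(500)]
ys = [2.0 * x if i < 300 else 5.0 - x for i, x in enumerate(xs)]

errors = {w: backtest_window(xs, ys, w) for w in (50, 100, 400)}
best = min(errors, key=errors.get)
print(errors, "best window:", best)
```

Here the longest window reaches back into the old concept and is penalised for it, which is the situation in which a small amount of recent data is the better choice.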


3. Periodically Update

This involves updating the existing model fit with a sample of the most recent historical data, rather than discarding the model and fitting it again from scratch. It suits machine learning algorithms that use weights or coefficients, such as regression algorithms and neural networks.
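
For a model defined by coefficients, an update can simply continue training from the current values. The sketch below applies stochastic gradient descent to a one-variable linear model, starting from weights that were fitted on an older relationship; the learning rate, the epoch count, the data, and the function names are assumptions made for the example.

```python
def sgd_update(weights, batch, learning_rate=0.1, epochs=200):
    """Continue training y = w0 + w1 * x from its current weights."""
    w0, w1 = weights
    for _ in range(epochs):
        for x, y in batch:
            error = (w0 + w1 * x) - y
            w0 -= learning_rate * error
            w1 -= learning_rate * error * x
    return w0, w1


def mse(weights, batch):
    """Mean squared error of the linear model on a batch of (x, y) pairs."""
    w0, w1 = weights
    return sum((w0 + w1 * x - y) ** 2 for x, y in batch) / len(batch)


# Weights fitted earlier, when the relationship was y = 1 + 2x (assumed).
old_weights = (1.0, 2.0)
# Recent sample drawn from the drifted relationship y = 1 + 3x.
recent_batch = [(i / 10, 1.0 + 3.0 * (i / 10)) for i in range(11)]

new_weights = sgd_update(old_weights, recent_batch)
print("error before:", mse(old_weights, recent_batch))
print("error after: ", mse(new_weights, recent_batch))
```

Because the update starts from the existing weights, it needs far less data and computation than a full re-fit, at the cost of carrying some of the old concept forward if the update is too small.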


4. Weight Data

This involves using a weighting that is inversely proportional to the age of the data, allowing your model to prioritize the most recent data.
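
A minimal sketch of age-based weighting with a weighted least-squares line follows. The weight 1 / (1 + age) is one possible choice (the added 1 only avoids division by zero for the newest observation), and the data and function name are invented for the example.

```python
def fit_weighted_line(xs, ys, weights):
    """Weighted least squares for y = a * x + b."""
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


n = 500
xs = [i % 10 for i in range(n)]
# Old concept y = 2x for the first 400 points, new concept y = 5 - x after.
ys = [2.0 * x if i < 400 else 5.0 - x for i, x in enumerate(xs)]

ages = [n - 1 - i for i in range(n)]            # newest observation has age 0
recency_weights = [1.0 / (1 + age) for age in ages]

plain_slope, _ = fit_weighted_line(xs, ys, [1.0] * n)
weighted_slope, _ = fit_weighted_line(xs, ys, recency_weights)
print("unweighted slope:", plain_slope, "recency-weighted slope:", weighted_slope)
```

The unweighted fit is dominated by the older, larger share of the data, while the weighted fit moves toward the slope of the recent concept without discarding history outright.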


5. Learn The Change

This method leaves the static model untouched, while a new model is fit on more recent data to learn how the relationship has changed and to correct the static model’s predictions.
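
One common way to arrange this is to have the new model learn only the residual, that is, the part of recent outcomes the static model gets wrong, and then add its output to the static prediction. The sketch below assumes a frozen static model y = 2x and a linear correction; both the models and the data are invented for the example.

```python
def static_model(x):
    """Frozen model learned on the old concept (assumed here to be y = 2x)."""
    return 2.0 * x


def fit_correction(xs, ys):
    """Fit a line to the residuals the static model leaves on recent data."""
    residuals = [y - static_model(x) for x, y in zip(xs, ys)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_r = sum(residuals) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxr = sum((x - mean_x) * (r - mean_r) for x, r in zip(xs, residuals))
    a = sxr / sxx
    return a, mean_r - a * mean_x


def corrected_predict(x, correction):
    """Static prediction plus the learned correction."""
    a, b = correction
    return static_model(x) + a * x + b


# Recent data follows the drifted relationship y = 2.5x + 3.
recent_x = [float(i) for i in range(10)]
recent_y = [2.5 * x + 3.0 for x in recent_x]

correction = fit_correction(recent_x, recent_y)
print("static:", static_model(10.0),
      "corrected:", corrected_predict(10.0, correction))
```

The static model stays available as a fallback and as a baseline, and the correction model can be re-fit or dropped without touching it.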


6. Detect and Choose Model

In some domains, abrupt changes have occurred in the past, and you may want to check for them in the future. You can design a system that detects such changes and chooses a different model to make predictions.
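
A simple version of such a system keeps a pool of models, one per known regime, and routes predictions to whichever model has the lowest error on the most recent labelled data. The two regimes and their coefficients below are hypothetical, as is the function name.

```python
def choose_model(models, recent_x, recent_y):
    """Return the name of the model with the lowest mean absolute error
    on the most recent labelled observations."""
    def mae(predict):
        pairs = zip(recent_x, recent_y)
        return sum(abs(predict(x) - y) for x, y in pairs) / len(recent_x)
    return min(models, key=lambda name: mae(models[name]))


# Hypothetical pool: one model per regime seen in the past.
models = {
    "normal": lambda x: 2.0 * x,
    "holiday": lambda x: 3.5 * x,
}

in_normal_times = choose_model(models, [1, 2, 3], [2.1, 3.9, 6.0])
in_holiday_times = choose_model(models, [1, 2, 3], [3.4, 7.1, 10.4])
print(in_normal_times, in_holiday_times)
```

The selection step could equally be triggered by a drift detector rather than run on every batch; the sketch only shows the choice itself.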


7. Data Preparation

Sometimes the data is expected to change over time, especially in time series forecasting. The data can then be prepared so as to remove systematic changes over time, such as trends and seasonality, for example by differencing.
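
Differencing replaces each value with its change from an earlier value. A lag of 1 removes a linear trend, and a lag equal to the seasonal period removes a repeating seasonal pattern. The series below is synthetic, with a trend of 2 per step and a seasonal pattern of period 4 chosen for the example.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def invert_difference(last_values, diffs, lag=1):
    """Rebuild the original scale from the `lag` values that precede the
    differenced span and the differences themselves."""
    rebuilt = list(last_values)
    for d in diffs:
        rebuilt.append(rebuilt[-lag] + d)
    return rebuilt[lag:]


season = [0, 3, 1, -2]
series = [2 * t + season[t % 4] for t in range(12)]

seasonal_diff = difference(series, lag=4)
restored = invert_difference(series[:4], seasonal_diff, lag=4)
print(series)
print(seasonal_diff)
```

After seasonal differencing the series is constant, so a model fitted to it no longer has to track the trend or the seasonal cycle; predictions are mapped back to the original scale by inverting the difference.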




October 14, 2020

