
Overfitting and Underfitting


I have done some research on overfitting and underfitting, and I understand exactly what they are, but I cannot find the reasons behind them.

What are the main reasons for overfitting and underfitting?

Why do we face these two problems in training a model?

machine-learning dataset overfitting

asked yesterday – Goktug

• You might find "What's a real-world example of “overfitting”?" useful – Silverfish, 12 hours ago

5 Answers

I'll try to answer in the simplest way. Each of those problems has its own main origin:



Overfitting: Data is noisy, meaning that there are some deviations from reality (because of measurement errors, random influencing factors, unobserved variables and spurious correlations) that make it harder for us to see the data's true relationship with our explanatory factors. Also, the data is usually incomplete (we don't have examples of everything).



As an example, let's say I am trying to classify boys and girls based solely on their height, because that's the only information I have about them. We all know that even though boys are taller on average than girls, there is a huge overlap region, so it's impossible to separate them perfectly with just that bit of information. Depending on the density of the data, a sufficiently complex model can achieve a higher success rate on the training dataset than is theoretically possible on new data, because it can draw boundaries that let individual points stand alone. So, if the only person in our data who is 2.04 meters tall is a woman, the model could draw a little circle around that height, implying that a random person who is 2.04 meters tall is most likely a woman.



The underlying reason for it all is trusting the training data too much (in the example, since there is no man of height 2.04 in the data, the model concludes that only women can be that tall).



Underfitting is the opposite problem, in which the model fails to recognize the real complexities in our data (i.e. the non-random structure in our data). The model assumes that the noise is greater than it really is and thus uses too simplistic a shape. So, if the dataset contains many more girls than boys for whatever reason, the model might just classify them all as girls.



In this case, the model didn't trust the data enough and simply assumed that all deviations are noise (in the example, the model assumes that boys do not exist).
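
To make the height example concrete, here is a minimal sketch (my own construction, not part of this answer; the synthetic data and the scikit-learn models are assumptions) contrasting a 1-nearest-neighbour classifier, which memorises individual training points such as the lone 2.04 m woman, with a majority-class predictor that ignores height entirely:

    # Overfitting vs. underfitting on simulated heights (illustrative only).
    import numpy as np
    from sklearn.dummy import DummyClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(0)
    n = 500
    # Heights in metres: two overlapping normal distributions.
    girls = rng.normal(1.62, 0.07, n)
    boys = rng.normal(1.76, 0.07, n)
    X = np.concatenate([girls, boys]).reshape(-1, 1)
    y = np.concatenate([np.zeros(n), np.ones(n)])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    models = [("1-NN (trusts the data too much)", KNeighborsClassifier(n_neighbors=1)),
              ("majority class (trusts it too little)", DummyClassifier(strategy="most_frequent"))]
    for name, model in models:
        model.fit(X_tr, y_tr)
        print(f"{name}: train={model.score(X_tr, y_tr):.2f}, test={model.score(X_te, y_te):.2f}")

The 1-NN model scores near-perfectly on the training set but noticeably worse on held-out data, while the majority-class model scores about 0.5 on both: the first pattern is overfitting, the second underfitting.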



Bottom line is that we face these problems because:




  • We don't have complete information.

  • We don't know how noisy the data is (we don't know how much we should trust it).

  • We don't know in advance the underlying function that generated our data, and thus the optimal model complexity.


I hope this helps.






– Luis Da Silva, answered 22 hours ago

  • welcome to CV. nice answer, which makes me want to del my answer ... – hxd1011, 22 hours ago

  • Was the "button line" instead of "bottom line" intentional? – Džuris, 19 hours ago




















Overfitting is when a model estimates the variable you are modeling really well on the original data, but poorly on a new data set (hold-out, cross-validation, forecasting, etc.). You have too many variables or estimators in your model (dummy variables, etc.), and these cause your model to become too sensitive to the noise in your original data. As a result of fitting that noise, the model predicts new data poorly.



Underfitting is when a model does not estimate the variable well on either the original data or new data. Your model is missing some variables that are necessary to better estimate and predict the behavior of your dependent variable.



The balancing act between overfitting and underfitting is challenging, and sometimes there is no clear finish line. In econometric time-series modeling, this issue is handled fairly well by regularization models (LASSO, ridge regression, elastic net), which are designed specifically to reduce overfitting by, respectively, reducing the number of variables in your model, reducing the sensitivity of the coefficients to your data, or a combination of both.
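
As a hedged sketch of that last point (my own example, not the answer's; the synthetic data and scikit-learn usage are assumptions), LASSO's L1 penalty can zero out irrelevant coefficients outright, while ridge only shrinks them and elastic net blends the two:

    # Regularisation on a regression with many irrelevant variables.
    import numpy as np
    from sklearn.linear_model import ElasticNet, Lasso, Ridge

    rng = np.random.default_rng(1)
    X = rng.normal(size=(100, 20))                  # 20 candidate regressors
    beta = np.zeros(20)
    beta[:3] = [3.0, -2.0, 1.0]                     # only 3 truly matter
    y = X @ beta + rng.normal(scale=0.5, size=100)

    for name, model in [("lasso", Lasso(alpha=0.1)),
                        ("ridge", Ridge(alpha=1.0)),
                        ("elastic net", ElasticNet(alpha=0.1, l1_ratio=0.5))]:
        model.fit(X, y)
        kept = int(np.sum(np.abs(model.coef_) > 1e-6))
        print(f"{name}: {kept} of 20 coefficients are non-zero")

Typically the LASSO and elastic-net fits keep only a handful of coefficients, which is exactly the variable-reduction behaviour described above; ridge keeps all 20, but with smaller magnitudes.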






– Sympa, answered 15 hours ago

    Perhaps during your research you came across the following equation:



    $$\text{Error} = \text{Irreducible Error} + \text{Bias}^2 + \text{Variance}$$




    Why do we face these two problems in training a model?




    The learning problem itself is basically a trade-off between bias and variance.




    What are the main reasons for overfitting and underfitting?




    Short: Noise.



    Long: The irreducible error: Measurement errors/fluctuations in the data as well as the part of the target function that cannot be represented by the model. Remeasuring the target variable or changing the hypothesis space (i.e. selecting a different model) changes this component.



    Edit (to link to the other answers): model performance as complexity is varied:

    [Figure: training error and $error_D$ as model complexity is varied]

    where $error_D$ is the error over the entire distribution $D$ (in practice estimated with test sets).
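
A rough numerical illustration of that decomposition (my own construction under assumed settings, not part of the answer): refit a polynomial of fixed degree to many training sets drawn from the same $y = f(x) + \varepsilon$ process, then estimate bias² and variance pointwise:

    # Empirical bias/variance for polynomial fits of increasing complexity.
    import numpy as np

    rng = np.random.default_rng(2)
    f = lambda x: np.sin(2 * np.pi * x)   # assumed "true" target function
    x = np.linspace(0.0, 1.0, 30)
    sigma = 0.3                           # irreducible noise level

    for degree in (1, 3, 9):
        preds = []
        for _ in range(200):              # many training sets, same process
            y = f(x) + rng.normal(scale=sigma, size=x.size)
            preds.append(np.polyval(np.polyfit(x, y, degree), x))
        preds = np.asarray(preds)
        bias2 = np.mean((preds.mean(axis=0) - f(x)) ** 2)
        variance = np.mean(preds.var(axis=0))
        print(f"degree {degree}: bias^2={bias2:.4f}, variance={variance:.4f}, "
              f"irreducible={sigma**2:.4f}")

Low-degree fits show large bias² and small variance (underfitting), high-degree fits the reverse (overfitting), and the $\sigma^2$ term is untouched by either, mirroring the error-versus-complexity curve sketched above.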






    – lnathan, answered 14 hours ago, edited 2 hours ago

    • I think you should define your terminology. OP doesn't use the terms "bias" or "variance" in the question, and you don't use the terms "overfitting" or "underfitting" in your answer (except in a quote of the question). I think this would be a much clearer answer if you explained the relationship between these terms. – Gregor, 13 hours ago


    What are the main reasons for overfitting and underfitting?




    For overfitting, the model is too complex, so it fits the training data too well (noise included). For underfitting, the model is too simple.




    Why do we face these two problems in training a model?




    It is hard to pick the "just right" model and parameters for the data.
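
One common way to search for that "just right" point, shown here as a small sketch with assumed synthetic data (not taken from this answer), is to compare candidate complexities on held-out data via cross-validation:

    # Choosing polynomial degree by 5-fold cross-validation.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures

    rng = np.random.default_rng(3)
    X = rng.uniform(0.0, 1.0, size=(60, 1))
    y = np.sin(2 * np.pi * X[:, 0]) + rng.normal(scale=0.3, size=60)

    for degree in range(1, 10):
        model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
        mse = -cross_val_score(model, X, y, cv=5,
                               scoring="neg_mean_squared_error").mean()
        print(f"degree {degree}: CV MSE = {mse:.3f}")

The cross-validated error is typically U-shaped in the degree: too-simple models underfit on every fold, too-complex ones overfit their training folds, and the minimum sits somewhere in between.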






    – hxd1011, answered 22 hours ago


      Almost all statistical problems can be stated in the following form:




      1. Given the data $(y, x)$, find $\hat{f}$ which produces $\hat{y}=\hat{f}(x)$.

      2. Make this $\hat{f}$ as close as possible to the "true" $f$, where $f$ is defined as

      $$y = f(x) + \varepsilon$$



      The temptation is always to make $\hat{f}$ produce $\hat{y}$ which are very close to the data $y$. But when a new data point arrives, or we use data which was not used to construct $\hat{f}$, the prediction may be way off. This happens because we are trying to explain $\varepsilon$ instead of $f$. When we do this we stray from the "true" $f$, and hence when a new observation comes in we get a bad prediction. This is when overfitting happens.



      On the other hand, when we have found some $\hat{f}$, the question is always: can we get a better $\tilde{f}$ which produces a better fit and is closer to the "true" $f$? If we can, then we underfitted in the first case.



      If you look at the statistical problem this way, fitting the model is always a balance between underfitting and overfitting and any solution is always a compromise. We face this problem because our data is random and noisy.
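
To see the "explaining $\varepsilon$ instead of $f$" point numerically, here is a minimal sketch (my construction, with an assumed true $f(x) = x^2$): a polynomial with as many coefficients as data points interpolates the noise exactly, so its training residuals vanish while its error on fresh draws from the same process grows:

    # Interpolating the noise: zero training error, poor generalisation.
    import numpy as np

    rng = np.random.default_rng(4)
    f = lambda x: x ** 2                   # assumed "true" f
    x = np.linspace(-1.0, 1.0, 10)
    y = f(x) + rng.normal(scale=0.2, size=x.size)

    # polyfit may warn about conditioning at degree 9; it still interpolates.
    fits = [("degree 2 (matches f)", np.polyfit(x, y, 2)),
            ("degree 9 (interpolates y)", np.polyfit(x, y, 9))]

    x_new = rng.uniform(-1.0, 1.0, 1000)   # new observations, same process
    y_new = f(x_new) + rng.normal(scale=0.2, size=1000)
    for name, c in fits:
        train_mse = np.mean((np.polyval(c, x) - y) ** 2)
        test_mse = np.mean((np.polyval(c, x_new) - y_new) ** 2)
        print(f"{name}: train MSE={train_mse:.4f}, test MSE={test_mse:.4f}")

The degree-9 fit has essentially zero training error because it has chased $\varepsilon$, and its test error typically exceeds both the degree-2 fit's and the noise floor $\mathrm{Var}(\varepsilon) = 0.04$; the degree-2 fit is the compromise the answer describes.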






      – mpiktas, answered 57 mins ago












