Why Your SMOTE-Oversampled Data Is Leaking Into Your Validation Set
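You applied SMOTE to handle class imbalance, your cross-validation scores look great, and then your model falls apart on real data. A very common culprit is data leakage from oversampling before splitting: SMOTE builds each synthetic minority sample by interpolating between real minority samples, so if you oversample the full dataset first, the validation set ends up holding points derived from training rows (and the training set holds points derived from validation rows), which inflates your scores. Here's exactly what goes wrong and how to fix it.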
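To make the mechanism concrete, here is a minimal, dependency-free sketch. It is not the imbalanced-learn implementation: `interpolate_minority` is a simplified stand-in for SMOTE (fixed midpoint, single nearest neighbor), and `split` and `cross_fold_links` are hypothetical helpers invented for this illustration. It models only the minority class and counts how many synthetic rows have a real "parent" on the other side of the train/validation split.

```python
# Toy illustration only: interpolate_minority is a simplified stand-in for SMOTE,
# and every name here is hypothetical. Only minority-class rows are modeled.

def interpolate_minority(originals, n_new):
    """Create n_new synthetic rows, each midway between a minority row and its
    nearest minority neighbor. (Real SMOTE picks a random neighbor among the k
    nearest and a random interpolation factor.)"""
    synthetic = []
    for s in range(n_new):
        base = originals[s % len(originals)]
        others = [r for r in originals if r["id"] != base["id"]]
        neighbor = min(others, key=lambda r: abs(r["x"] - base["x"]))
        synthetic.append({
            "x": (base["x"] + neighbor["x"]) / 2,
            "parents": {base["id"], neighbor["id"]},  # remember where it came from
        })
    return synthetic


def split(rows):
    """Deterministic 75/25 split: every fourth row goes to validation."""
    val = rows[::4]
    train = [r for i, r in enumerate(rows) if i % 4 != 0]
    return train, val


def cross_fold_links(train, val):
    """Count synthetic rows that have a real parent in the opposite fold."""
    train_ids = {r["id"] for r in train if "id" in r}
    val_ids = {r["id"] for r in val if "id" in r}
    links = sum(1 for r in val if "parents" in r and r["parents"] & train_ids)
    links += sum(1 for r in train if "parents" in r and r["parents"] & val_ids)
    return links


minority = [{"id": i, "x": float(i)} for i in range(12)]

# Wrong order: oversample the whole dataset, then split.
train, val = split(minority + interpolate_minority(minority, 12))
leaky_links = cross_fold_links(train, val)

# Right order: split first, then oversample the training fold only.
train_orig, val = split(minority)
train = train_orig + interpolate_minority(train_orig, 12)
clean_links = cross_fold_links(train, val)

print("oversample then split:", leaky_links, "cross-fold links")
print("split then oversample:", clean_links, "cross-fold links")
```

In the split-first flow the count is zero by construction, because synthetic rows are only ever built from training rows and the validation fold contains only real data. With imbalanced-learn, the usual way to get the same ordering inside cross-validation is to put `SMOTE` in an `imblearn.pipeline.Pipeline`, which applies the sampler only when fitting on each training fold.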