Pinned: David B Rosen (PhD) in Towards Data Science
How To Deal With Imbalanced Classification, Without Re-balancing the Data
Before considering over-sampling your data, try simply tuning your classification decision threshold
Aug 2, 2020

Pinned: David B Rosen (PhD) in Towards Data Science
How to calculate confidence intervals for performance metrics using an automatic bootstrap method
Are your model performance measurements very precise due to a “large” test set, or very uncertain due to a “small” or imbalanced test…
Sep 8, 2021

Pinned: David B Rosen (PhD) in Towards Data Science
How to Auto-Detect the Date/Datetime Columns & Set Their Datatype When Reading a CSV File in Pandas
When read_csv() reads “2021-03-04” and “2021-03-04 21:37:01.123” as mere “object” types, often you can auto-convert them all to datetimes.
Aug 16, 2021

David B Rosen (PhD)
Three more tips:
1. Find and copy Unicode math symbols on Mathematical operators and symbols in Unicode - Wikipedia
Dec 16, 2023

David B Rosen (PhD)
R² is completely equivalent to MSE in that it is simply a (negatively) scaled and offset form of…
Aug 29, 2022

David B Rosen (PhD)
Good article!
How To Deal With Imbalanced Classification, Without Re-balancing the Data (Before considering over-sampling your data, try simply tuning…
Aug 29, 2022

David B Rosen (PhD)
Good article!
As an alternative to using a specific set of functional forms for the relationships, we could use the ROC area of each feature as a "score"…
Jun 7, 2022

David B Rosen (PhD)
Good article -- I was not aware of how flawed the mean decrease in feature impurity method could be!
One pitfall of permutation feature importance could be if there are some important subsets of the features that are highly-correlated or…
Jun 7, 2022

David B Rosen (PhD)
The cost to the business of false positives vs. false negatives is unlikely to be equal, so you need…
Also, it is incorrect to apply ordinary cross-validation to an already-oversampled training set (even one oversampled by ADASYN or SMOTE)…
Jun 3, 2022

David B Rosen (PhD)
You mention the average squared error -- how is this defined if it is different from mean squared…
Also it is worth mentioning that R-squared (coeff. of determination) is a rescaled version of MSE such that 100% is perfection and 0%…
Nov 1, 2021