statistics - Data Mining and Unbalanced Classes
I have unbalanced classes of records, with data like the following:
x y z class 1 4 3 5 7 6 8 7 excellent 4 8 pass 3 7 34 6 1 5 4 3 excellent b 4 4 excellent b
I want to predict the class:
- What are the best data mining techniques for this?
- I used a decision tree, but unfortunately I ran into the problem of unbalanced records and wasn't able to classify the data.
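I'd recommend looking into oversampling the minority class, for example with SMOTE (Synthetic Minority Oversampling Technique). Plain random oversampling randomly selects, with replacement, a set of minority instances within the training dataset and adds them as duplicates. SMOTE instead creates new synthetic minority instances by interpolating between a selected minority instance and one of its nearest minority-class neighbors. Either way, the training dataset ends up with more balanced classes, which helps prevent the classifier from simply learning to predict the majority class.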
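The selection with replacement described above can be sketched in a few lines of plain Python. This is only a minimal illustration of duplicate-based random oversampling, not SMOTE's interpolation step. The function name `random_oversample` and the toy rows and labels are hypothetical and are not taken from the question's data.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows, sampled with replacement, until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, l in zip(rows, labels) if l == cls]
        # rng.choices samples with replacement; k is 0 for the majority class.
        extra = rng.choices(pool, k=target - n)
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Hypothetical toy data: four "excellent" rows and two "pass" rows.
rows = [(1, 4, 3), (5, 7, 6), (8, 7, 2), (4, 8, 1), (3, 7, 34), (6, 1, 5)]
labels = ["excellent", "excellent", "excellent", "excellent", "pass", "pass"]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Oversampling should be applied to the training split only. Otherwise duplicated rows can leak into the test set and inflate the measured accuracy.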
Depending on the software or module you are using, and whether or not you need to use decision trees specifically, there may be other options. For instance, SVMs (again depending on the software or module used) often come with the ability to specify class-specific costs. To combat the imbalance problem, you can specify a higher cost (i.e., penalty) for errors on the minority class.
Hope this helps!
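One common way to choose class-specific costs is to weight each class inversely to its frequency. The sketch below assumes that heuristic. The helper name `inverse_frequency_weights` and the toy labels are hypothetical, and other weighting schemes are just as valid.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return a {class: weight} dict in which rarer classes get larger weights.

    Uses weight = n_samples / (n_classes * class_count), so a perfectly
    balanced dataset gives every class a weight of 1.0.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Hypothetical toy labels: four "excellent" records and two "pass" records.
labels = ["excellent", "excellent", "excellent", "excellent", "pass", "pass"]
weights = inverse_frequency_weights(labels)
print(weights)
```

If you happen to be using scikit-learn, a dictionary like this can be passed as the `class_weight` argument of `sklearn.svm.SVC`, which scales the penalty parameter `C` per class. Other packages expose class costs under different names, so check the documentation of the one you use.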