
SMOTE-N

Potentially, SMOTE can also be extended to nominal features -- SMOTE-N -- with the nearest neighbors computed using the modified version of the Value Difference Metric [36] proposed by Cost and Salzberg (1993). The Value Difference Metric (VDM) looks at the overlap of feature values over all feature vectors. A matrix defining the distance between corresponding feature values for all feature vectors is created. The distance $\delta$ between two corresponding feature values is defined as follows.

 
\begin{displaymath}
\delta(V_{1}, V_{2}) = \sum_{i=1}^{n} \left\vert \frac{C_{1i}}{C_{1}} - \frac{C_{2i}}{C_{2}} \right\vert^{k}
\end{displaymath} (1)

In the above equation, $V_1$ and $V_2$ are the two corresponding feature values. $C_1$ is the total number of occurrences of feature value $V_1$, and $C_{1i}$ is the number of occurrences of feature value $V_1$ for class $i$. A similar convention applies to $C_2$ and $C_{2i}$. $k$ is a constant, usually set to 1. This equation is used to compute the matrix of value differences for each nominal feature in the given set of feature vectors. Equation 1 gives a geometric distance on a fixed, finite set of values [37]. Cost and Salzberg's modified VDM omits the weight term $w_{fa}$ included in the $\delta$ computation by Stanfill and Waltz, which has the effect of making $\delta$ symmetric. The distance $\Delta$ between two feature vectors is given by:
\begin{displaymath}
\Delta(X, Y) = w_{x} w_{y} \sum_{i=1}^{N} \delta(x_{i}, y_{i})^{r}
\end{displaymath} (2)
$r = 1$ yields the Manhattan distance, and $r = 2$ yields the Euclidean distance [37]. $w_x$ and $w_y$ are the exemplar weights in the modified VDM: $w_y = 1$ for a new example (feature vector), and $w_x$ is the bias towards more reliable examples (feature vectors), computed as the ratio of the number of uses of a feature vector to the number of correct uses of that feature vector; thus, more accurate feature vectors will have $w_x \approx 1$.

For SMOTE-N we can ignore these weights in Equation 2, as SMOTE-N is not used for classification purposes directly. However, we could redefine these weights to give more weight to minority class feature vectors falling closer to majority class feature vectors, thus making those minority class features appear further away from the feature vector under consideration. Since we are more interested in forming broad but accurate regions of the minority class, the weights might be used to avoid populating along neighbors which fall closer to the majority class.

To generate new minority class feature vectors, we can create a new set of feature values by taking the majority vote of the feature vector under consideration and its k nearest neighbors; a sketch of the distance computation appears below. Table 7 shows an example of creating a synthetic feature vector.
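As an illustrative sketch (not part of the original proposal), the following Python code computes the per-feature value difference matrix of Equation 1 and the unweighted distance of Equation 2 (i.e., $w_x = w_y = 1$, as suggested above for SMOTE-N). The function names and the default choices $k = 1$ and $r = 1$ are our own assumptions.

  from collections import Counter

  def value_difference(feature_col, labels, k=1):
      """Build the delta matrix of Equation (1) for one nominal feature.

      feature_col : sequence of nominal values, one per example
      labels      : sequence of class labels, one per example
      Returns a dict mapping (v1, v2) -> delta(v1, v2).
      """
      classes = sorted(set(labels))
      values = sorted(set(feature_col))
      # C_v  = total occurrences of value v
      # C_vi = occurrences of value v among examples of class i
      totals = Counter(feature_col)
      per_class = Counter(zip(feature_col, labels))
      delta = {}
      for v1 in values:
          for v2 in values:
              delta[(v1, v2)] = sum(
                  abs(per_class[(v1, c)] / totals[v1]
                      - per_class[(v2, c)] / totals[v2]) ** k
                  for c in classes
              )
      return delta

  def vdm_distance(x, y, deltas, r=1):
      """Equation (2) with the exemplar weights w_x, w_y dropped (set to 1),
      as suggested for SMOTE-N. `deltas` holds one delta dict per feature."""
      return sum(deltas[i][(x[i], y[i])] ** r for i in range(len(x)))

  # Example: one nominal feature over five labeled examples
  col = ["A", "A", "B", "B", "B"]
  y = [0, 1, 0, 1, 1]
  deltas = [value_difference(col, y)]
  print(vdm_distance(["A"], ["B"], deltas))  # |1/2 - 1/3| + |1/2 - 2/3| = 1/3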

 
Table 7: Example of SMOTE-N
Let F1 = A B C D E be the feature vector under consideration,
and let its 2 nearest neighbors be:
    F2 = A F C G N
    F3 = H B C D N
The application of SMOTE-N would create the following feature vector:
    FS = A B C D N
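As a small illustration of the majority-vote step, the Python sketch below reproduces Table 7. The function name smote_n_synthetic and the tie-breaking rule (the first-seen value wins, which favors the vector under consideration) are our own assumptions, since ties are not specified above.

  from collections import Counter

  def smote_n_synthetic(sample, neighbors):
      """Create one synthetic nominal feature vector by taking, at each
      position, the majority vote over the sample and its neighbors.
      Ties break toward the first value encountered (an arbitrary choice)."""
      vectors = [sample] + list(neighbors)
      synthetic = []
      for position in range(len(sample)):
          votes = Counter(v[position] for v in vectors)
          synthetic.append(votes.most_common(1)[0][0])
      return synthetic

  # Reproducing Table 7: F1 with its 2 nearest neighbors F2 and F3
  f1 = list("ABCDE")
  f2 = list("AFCGN")
  f3 = list("HBCDN")
  print(smote_n_synthetic(f1, [f2, f3]))  # ['A', 'B', 'C', 'D', 'N']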
 


