# Classification example

### Problem statement

In this example, we consider a dataset where each input vector $$X = (x, y)$$ is associated with a class, A ($$+1$$) or B ($$-1$$). The following figure illustrates the classification problem:

### Network architecture

The single layer architecture is the following:

Since we need to distinguish class A from class B, we have to use an activation function that can separate the two classes. In this example, the hyperbolic tangent has been selected:

The choice of the hyperbolic tangent is motivated by the fact that this function outputs a value between $$-1$$ and $$+1$$. The output can be interpreted in two ways: in terms of binary classes (A or B), or in terms of probabilities.
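The bounded output is easy to check numerically. The tutorial's code is in Matlab, but as a quick illustrative sketch in Python:

```python
import math

# tanh squashes any real activation into the open interval (-1, +1),
# which matches the +/-1 labels used for classes A and B.
activations = [-5.0, -1.0, 0.0, 1.0, 5.0]
outputs = [math.tanh(s) for s in activations]
for s, o in zip(activations, outputs):
    print(f"s = {s:+.1f}  ->  tanh(s) = {o:+.4f}")
```

Large positive activations saturate near $$+1$$ and large negative ones near $$-1$$, which is exactly what makes the two interpretations below possible.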

##### Binary interpretation

To determine whether a sample belongs to class A or B, one can apply the following rule: positive outputs belong to class A, negative outputs to class B. Mathematically, we add the following function after the output of the network:

• $$o=+1$$ when the tanh output is positive
• $$o=-1$$ when the tanh output is negative
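This hard decision is simply the sign of the network output. A minimal Python sketch (the function name `classify` is illustrative; the text does not specify the boundary case $$o=0$$, which this sketch assigns to class B):

```python
import math

def classify(s):
    """Map the raw activation s to a class label: +1 (A) or -1 (B)."""
    return 1 if math.tanh(s) > 0 else -1

label_a = classify(0.7)    # positive activation -> class A (+1)
label_b = classify(-2.3)   # negative activation -> class B (-1)
print(label_a, label_b)
```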

##### Probabilistic interpretation

The second option for interpreting the network output is to consider it as the probability of belonging to class A or B. When the output equals $$+1$$, the probability of the sample being in class A or B is respectively one and zero. The following equations generalize this concept and convert the network output into probabilities:

Probability of being in class A: $$p_A = \frac{o+1}{2}$$

Probability of being in class B: $$p_B = \frac{1-o}{2}$$

Note that the probabilities always sum to one ($$p_A + p_B = 1$$).
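These two affine rescalings can be sketched in a few lines of Python (the helper name `to_probabilities` is illustrative):

```python
def to_probabilities(o):
    """Convert a tanh output o in (-1, 1) into (p_A, p_B)."""
    p_a = (o + 1.0) / 2.0
    p_b = (1.0 - o) / 2.0
    return p_a, p_b

p_a, p_b = to_probabilities(0.5)
print(p_a, p_b)                        # 0.75 0.25
print(to_probabilities(1.0))           # (1.0, 0.0): certain class A
assert abs(p_a + p_b - 1.0) < 1e-12    # the probabilities sum to one
```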

### Results

The following figure shows how the input space is split to separate the two classes:

The following figure gives an overview of the training results:

• The surface is the raw output of the network.
• Red dots are training points belonging to class A.
• Blue dots are training points belonging to class B.
• Green circles mark correctly classified points.
• Black crosses mark misclassified points.

### Source code

Click on one of the following languages to display the source code of this classifier:

### Matlab

```matlab
%% Single layer classifier
close all;
clear all;
clc;

%% Parameters
% Dataset size
N=1000;
% Learning rate
Eta=0.003;

%% Dataset
% Generate dataset for each class [ X , Y , class (+1 or -1) ]

% Class A (+1)

% ~ 98% good classification
classA = mvnrnd([2, 2], [5 1.5; 1.5 1], N/2);
% Uncomment the following line to create a 100% good classification
%classA = mvnrnd([2, 4], [5 1.5; 1.5 1], N/2);
classA = [classA, ones(N/2,1)];

% Class B (-1)
classB = mvnrnd([2, -2], [3 0; 0 0.5], N/2);
classB = [classB, -ones(N/2,1)];

% Merge classes to create the dataset
dataset = [classA; classB];
% Shuffle the dataset
dataset = dataset(randperm(size(dataset,1)), :);

%% Initialize weights (two inputs + bias)
W = [0; 0; 0];

%% Training loop
for i = 1:size(dataset,1)
    % Forward pass (append 1 for the bias input)
    S = W' * [dataset(i,1:2), 1]';
    Y = tanh(S);

    % Expected output
    Y_ = dataset(i,3);

    % Update weights (delta rule; 1 - Y^2 is the tanh derivative)
    W = W + Eta * (Y_ - Y) * [dataset(i,1:2), 1]' * (1 - Y*Y);
end

%% Display

% Get boundaries (for display)
Xmin = min(dataset(:,1));
Xmax = max(dataset(:,1));
Ymin = min(dataset(:,2));
Ymax = max(dataset(:,2));

%% Output surface
[X,Y] = meshgrid(Xmin:0.1:Xmax, Ymin:0.1:Ymax);
Z = tanh(X*W(1) + Y*W(2) + W(3));
surf(X, Y, Z, 'facecolor', 'texture');
hold on;

%% Display dataset
plot3(classA(:,1), classA(:,2), 4+classA(:,3), '.r'); hold on;
plot3(classB(:,1), classB(:,2), 4+classB(:,3), '.b');
grid on;
axis square equal;

%% Test on the training set
good = 0;
for i = 1:size(dataset,1)
    % Compute the network output
    Y = tanh(W' * [dataset(i,1:2), 1]');

    % Compare to the expected output
    if (sign(Y) == dataset(i,3))
        % Good classification (green circle)
        good = good + 1;
        plot3(dataset(i,1), dataset(i,2), 8+sign(Y), 'og');
    else
        % Wrong classification (black cross)
        plot3(dataset(i,1), dataset(i,2), 8+sign(Y), 'xk');
    end
end

% Axis labels and colormap
colormap(jet);
colorbar;
xlabel('X');
ylabel('Y');
% Uncomment for a top view
%view(0,90);

% Compute success ratio
badly_classified = size(dataset,1) - good;
fprintf('Success ratio: %.2f%%\n', 100*good/size(dataset,1));
```
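For readers who prefer Python, here is an illustrative NumPy transcription of the training loop above, a sketch using the same dataset parameters, learning rate, and delta-rule update as the Matlab script (plotting omitted; the fixed random seed is an addition for reproducibility):

```python
import numpy as np

rng = np.random.default_rng(1)
eta = 0.003
n = 1000

# Two Gaussian clouds mirroring the Matlab script, labelled +1 (A) and -1 (B)
xa = rng.multivariate_normal([2, 2], [[5, 1.5], [1.5, 1]], n // 2)
xb = rng.multivariate_normal([2, -2], [[3, 0], [0, 0.5]], n // 2)
X = np.vstack([xa, xb])
T = np.hstack([np.ones(n // 2), -np.ones(n // 2)])

# Shuffle the dataset
order = rng.permutation(n)
X, T = X[order], T[order]

# Weights: [w_x, w_y, bias]
w = np.zeros(3)
for x, t in zip(X, T):
    xi = np.array([x[0], x[1], 1.0])   # input with bias term appended
    y = np.tanh(w @ xi)
    # Delta rule: error * tanh'(s) * input, with tanh'(s) = 1 - y^2
    w += eta * (t - y) * (1.0 - y * y) * xi

# Evaluate on the training set with the sign rule
accuracy = np.mean(np.sign(np.tanh(X @ w[:2] + w[2])) == T)
print(f"training accuracy: {accuracy:.3f}")
```

With these class means and covariances the classes overlap slightly, so, as in the Matlab version, the accuracy should land near (but below) 100%.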