I. Overview
The perceptron model, also called the neuron model, is inspired by the working mechanism of biological neurons, which receive, process, and output information in sequence. The various artificial neural network models in use today are built from artificial neurons, so the perceptron (neuron) model introduced in this article is the basic unit of those neural network models.
II. Principles of the model
Model Principle
The core of the model can be summarized as linear regression plus a sign-function mapping: an unknown sample is first fitted linearly, and the resulting value is then mapped through the sign function to complete the class determination. The perceptron model is therefore a direct model for binary classification tasks. The model can be represented schematically as follows.
[Figure: perceptron (neuron) model schematic]
The modeling principle can also be expressed directly as
\(f(x) = \text{sign}(w \cdot x + b)\)
where \(w\) is the weight vector, \(b\) is the bias, and \(\text{sign}(\cdot)\) outputs +1 for non-negative arguments and -1 otherwise.
For any sample to be classified, it suffices to substitute its feature vector directly into this expression.
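As a minimal sketch of this decision rule (the weight and bias values below are made-up assumptions, not learned parameters), the calculation for a single sample looks like:

import numpy as np

def perceptron_predict(x, w, b):
    # Classify one sample: +1 if w . x + b >= 0, otherwise -1
    return 1 if np.dot(w, x) + b >= 0 else -1

w = np.array([0.5, -1.2])  # hypothetical weights
b = 0.3                    # hypothetical bias
print(perceptron_predict(np.array([2.0, 1.0]), w, b))  # prints 1, since 0.5*2 - 1.2*1 + 0.3 = 0.1 >= 0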
Training of the model
The parameters of the model are the weights and the bias of the linear regression; once they are determined, the entire model is determined. They are usually determined from a training dataset: a loss function of the parameters to be solved is constructed from the correspondence between the training samples and their labels, and the optimal parameter values are found by iterative optimization. The loss function is usually constructed as the sum of the distances from all misclassified samples to the decision function. Its expression is
\(L(w, b) = \frac{1}{\left| \left| w \right| \right|} \sum_{x_{i} \in M} \left| w \cdot x_{i} + b \right|\)
where \(\left| \left| w \right| \right| = \sqrt{w_{1}^{2} + w_{2}^{2} + ... + w_{n}^{2}}\) and \(M\) is the set of misclassified samples.
For further simplicity, the absolute value operation can be replaced by the equivalent factor \(-y_{i}\), where \(y_{i}\) is the label of the sample and takes the value 1 or -1. If \(y_{i} = 1\), the sample is positive and the regression value computed for a misclassified sample is negative, so \(-y_{i}\) times this negative value is positive; if \(y_{i} = -1\), the sample is negative and the regression value computed for a misclassified sample is positive, so \(-y_{i}\) times this positive value is again positive. The replacement is therefore equivalent to the absolute value operation, and the loss function becomes
\(L(w, b) = -\frac{1}{\left| \left| w \right| \right|} \sum_{x_{i} \in M} y_{i} \left( w \cdot x_{i} + b \right)\)
The factor \(\frac{1}{\left| \left| w \right| \right|}\) essentially only rescales the decision function. Since the model is concerned with deciding which of the two classes a sample belongs to, and not with the exact distance from a sample to the decision function, this factor can be omitted, and the loss function simplifies to
\(L(w, b) = -\sum_{x_{i} \in M} y_{i} \left( w \cdot x_{i} + b \right)\)
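To connect this loss to the gradient-descent training used below, note that for a fixed misclassified set \(M\) the gradients of the loss with respect to the parameters are
\(\nabla_{w} L(w, b) = -\sum_{x_{i} \in M} y_{i} x_{i}, \qquad \nabla_{b} L(w, b) = -\sum_{x_{i} \in M} y_{i}\)
so each iteration moves the parameters against the gradient with a learning rate \(\alpha\):
\(w \leftarrow w + \alpha \sum_{x_{i} \in M} y_{i} x_{i}, \qquad b \leftarrow b + \alpha \sum_{x_{i} \in M} y_{i}\)
This is exactly the update applied to the misclassified samples in the manual implementation below.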
III. Python implementation
Manual implementation:
import numpy as np
from sklearn import datasets

def model(X, theta):
    # Linear regression part: X already contains a leading column of ones,
    # so theta holds the bias in its first component.
    return X @ theta

def predict(x, theta):
    # Sign-function mapping: samples with a negative regression value get label -1,
    # all others get label +1.
    flags = model(x, theta)
    y = np.ones_like(flags)
    y[np.where(flags < 0)[0]] = -1
    return y

def computerCost(X, y, theta):
    # Perceptron loss: -sum of y_i * (w . x_i + b) over the misclassified samples.
    y_pred = predict(X, theta)
    error_index = np.where(y_pred != y)[0]
    return -np.sum(model(X, theta)[error_index] * y[error_index])

def gradientDescent(X, y, alpha, num_iters=1000):
    n = X.shape[1]
    theta = np.zeros((n, 1))
    J_history = []
    for i in range(num_iters):
        y_pred = predict(X, theta)
        error_index = np.where(y_pred != y)[0]
        # Update the parameters using all currently misclassified samples.
        theta = theta + alpha * X[error_index, :].T @ y[error_index]
        cur_cost = computerCost(X, y, theta)
        J_history.append(cur_cost)
        print('.', end='')
        if cur_cost == 0:
            print(f'Finished in advance in iteration {i + 1}!')
            break
    return theta, J_history

# Load the iris dataset and turn it into a binary classification problem:
# class 0 (setosa) is the positive class (+1), the other classes are negative (-1).
iris = datasets.load_iris()
X = iris.data
m = X.shape[0]
X = np.hstack((np.ones((m, 1)), X))  # prepend a column of ones for the bias term
y = iris.target.copy()
y[np.where(y != 0)[0]] = -1
y[np.where(y == 0)[0]] = 1
y = y.reshape((len(y), 1))

theta, J_history = gradientDescent(X, y, 0.01, 1000)
y_pred = predict(X, theta)
acc = np.sum(y_pred == y) / len(y)
print('acc:\n', acc)
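As a quick usage sketch (the measurement values below are made-up assumptions, not taken from the dataset), the learned parameters can be applied to a new sample in the same way:

# Hypothetical new sample with a leading 1 to match the bias column added to X
x_new = np.array([[1.0, 5.1, 3.5, 1.4, 0.2]])
print(predict(x_new, theta))  # outputs [[1.]] or [[-1.]] depending on the learned theta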
Implementation based on PyTorch:
import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt
import numpy as np

# Generate some random linearly separable data
np.random.seed(42)
num_samples = 100
features = 2
x = 10 * np.random.rand(num_samples, features)  # random input features
w_true = np.array([2, -3.4])  # true weights
b_true = 4.2  # true bias
y_true = np.dot(x, w_true) + b_true + 0.1 * np.random.randn(num_samples)  # add noise
y_true = np.where(y_true > 0, 1, -1)  # convert the outputs to binary class labels

# Convert the data to PyTorch tensors
x = torch.tensor(x, dtype=torch.float32)
y_true = torch.tensor(y_true, dtype=torch.float32)

# Define the perceptron model
class Perceptron(nn.Module):
    def __init__(self, input_size):
        super(Perceptron, self).__init__()
        self.linear = nn.Linear(input_size, 1)

    def forward(self, x):
        # Return the raw linear score; the sign function is applied only at
        # prediction time so that gradients can flow during training.
        return self.linear(x)

# Initialize the perceptron model
perceptron = Perceptron(input_size=features)

# Define the loss function and optimizer
criterion = nn.MSELoss()
optimizer = optim.SGD(perceptron.parameters(), lr=0.01)

# Train the perceptron model
num_epochs = 100
for epoch in range(num_epochs):
    # Forward propagation
    y_pred = perceptron(x)
    # Compute the loss against the +1/-1 labels
    loss = criterion(y_pred.view(-1), y_true)
    # Backpropagation and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Print the loss
    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch + 1}/{num_epochs}], Loss: {loss.item():.4f}')

# Prediction on the training data
with torch.no_grad():
    predictions = torch.sign(perceptron(x)).numpy()

# Visualize the results
x_np = x.numpy()
plt.scatter(x_np[:, 0], x_np[:, 1], c=predictions.flatten(), cmap='coolwarm', marker='o')
plt.title('Perceptron Model')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.show()
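As a brief follow-up check (a sketch, not part of the original listing), the training accuracy can be computed from the predictions obtained above:

# Compare the signed predictions with the true labels (both are +1/-1)
accuracy = float((predictions.flatten() == y_true.numpy()).mean())
print(f'Training accuracy: {accuracy:.2%}')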
End.