preamble
Picking up where the previous post left off with the optimizer.
optimizer
The code is the same as in the previous post, as follows.
import torch
import numpy as np
import torch.nn as nn

X = torch.tensor([1, 2, 3, 4], dtype=torch.float32)
Y = torch.tensor([2, 4, 6, 8], dtype=torch.float32)
w2 = torch.tensor(0.0, requires_grad=True)

def forward(_x):
    return w2 * _x

learning_rate = 0.01
n_iter = 100  # number of iterations

loss = nn.MSELoss()
optimizer = torch.optim.SGD([w2], lr=learning_rate)

for epoch in range(n_iter):
    y_pred = forward(X)    # forward pass
    l = loss(Y, y_pred)
    l.backward()           # after backpropagation, the gradient is already stored in w2
    optimizer.step()       # the optimizer received w2 at initialization; now that w2 has a grad, step() updates it using w2.grad and the learning rate
    optimizer.zero_grad()  # clear the gradient back to zero
    if epoch % 1 == 0:
        print(f'epoch {epoch+1}: w2 = {w2:.3f}, loss = {l:.8f}')

print(f'f(5) = {forward(5):.3f}')
As you can see, we have added import torch.nn as nn to the imports here.
Here we simply use nn.MSELoss() so we don't have to hand-write the step of computing the mean squared error ourselves.
Then we define an optimizer, torch.optim.SGD, which receives two arguments: the parameter list [w2] and the learning rate learning_rate.
Note that we are passing a list of tensors, [w2], not w2 itself.
The previous article explained that w is updated according to the gradient, so it should be a matrix of the same shape as x. Here w2 is not the same shape as x; it is just a single number. This still works because broadcasting automatically expands such a scalar into a matrix of the same shape as x during the calculation.
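To make that broadcasting behaviour concrete, here is a minimal sketch (the tensor values are made up purely for illustration) showing that a scalar w is automatically expanded to match the shape of X:

import torch

X = torch.tensor([1, 2, 3, 4], dtype=torch.float32)
w_scalar = torch.tensor(0.5)                   # a single number
w_vector = torch.tensor([0.5, 0.5, 0.5, 0.5])  # same shape as X

# Broadcasting expands the scalar, so both products are identical.
print(w_scalar * X)  # tensor([0.5000, 1.0000, 1.5000, 2.0000])
print(w_vector * X)  # tensor([0.5000, 1.0000, 1.5000, 2.0000])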
The correct way to write it would be something like the following.
import torch
import numpy as np
import torch.nn as nn

X = torch.tensor([1, 2, 3, 4], dtype=torch.float32)
Y = torch.tensor([2, 4, 6, 8], dtype=torch.float32)
w2 = torch.tensor([0.0, 0.0, 0.0, 0.0], requires_grad=True)

def forward(_x):
    return w2 * _x  # w2 must have either 1 element or 4 elements, otherwise this multiplication would fail

learning_rate = 0.01
n_iter = 100  # number of iterations

loss = nn.MSELoss()
optimizer = torch.optim.SGD([w2], lr=learning_rate)

for epoch in range(n_iter):
    y_pred = forward(X)
    l = loss(Y, y_pred)
    l.backward()
    optimizer.step()
    optimizer.zero_grad()
    if epoch % 1 == 0:
        print(f'epoch {epoch+1}: w21 = {w2[0]:.3f} w22 = {w2[1]:.3f}, loss = {l:.8f}')
Computation Logic Restatement
Recalling the frog example from earlier, let's restate the logic of this calculation. First we have y, our target, and x, our input data. Then, using the two parameters w and b together with the formula y = wx + b, we repeatedly try to find w and b. The computed w and b are applied to x, giving a new matrix, the corrected x. We take y_predict to be this corrected x; in other words, we form the prediction of y, y_predict, by transforming x. We can then compare y with y_predict.
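As a self-contained sketch of that logic (the numbers are made up purely for illustration), here is one manual gradient-descent step for y = w * x + b:

import torch

x = torch.tensor([1.0, 2.0, 3.0, 4.0])
y = torch.tensor([2.0, 4.0, 6.0, 8.0])   # the target
w = torch.tensor(0.0, requires_grad=True)
b = torch.tensor(0.0, requires_grad=True)

y_predict = w * x + b                     # form the prediction from x
loss = ((y_predict - y) ** 2).mean()      # compare y_predict with y
loss.backward()                           # gradients land in w.grad and b.grad

with torch.no_grad():                     # one correction of w and b
    w -= 0.01 * w.grad
    b -= 0.01 * b.grad
print(w.item(), b.item())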
torch.nn synopsis
torch.nn is a core module of PyTorch dedicated to building and training neural networks. It provides a variety of classes and functions that make it easy to define neural network models, implement forward and backward propagation, define loss functions, and perform tasks such as parameter optimization.
Linear
Concept
nn.Linear is a module in PyTorch for implementing linear transformations (fully connected layers). Let's ignore its formal definition here.
Let's look at the meaning of a few variables first.
X.shape: the information returned is the number of rows (the number of samples) and the number of columns (the number of features) of the tensor. Pay special attention to the terms sample and feature; these two terms cause a lot of confusion when learning.
nn.Linear(input_size, output_size): this is how Linear is instantiated. The arguments are two numbers, called input_size and output_size respectively, whose meanings are as follows.
The counterintuitive definition
input_size: the number of input features, i.e. the dimension of each input sample.
output_size: the number of output features, i.e. the dimension of the features the model outputs.
The normal definition
input_size: the number of columns of the input data X.
output_size: the number of columns of the model's predicted output y_predict.
Note: it's important to read the counterintuitive definition a few times here, because if you study AI you will see people using it to describe operations and problems in all sorts of videos and articles.
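A quick way to see both definitions at once is to check the shapes directly. This is only a sketch with made-up sizes: a layer with input_size = 3 and output_size = 4 maps each row of 3 features to a row of 4 features, so the number of columns changes while the number of rows (samples) stays the same.

import torch
import torch.nn as nn

layer = nn.Linear(3, 4)  # input_size = 3, output_size = 4
x = torch.randn(5, 3)    # 5 samples, each with 3 features (3 columns)
y_predict = layer(x)
print(y_predict.shape)   # torch.Size([5, 4]) -> still 5 samples, now 4 columns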
Here, with a little thought about input_size and output_size, we might surmise that we could feed in a 3 * 3 matrix x and, using this library, get back a 4 * 4 matrix that could then be compared against a 4 * 4 matrix y.
A single Linear layer, however, only transforms the feature (column) dimension, so it cannot turn a 3 * 3 matrix into a 4 * 4 one on its own; we don't need to do that here anyway, but a little association leads to the conclusion that multilayer neural networks or other layers (such as convolutional layers) can certainly perform such more complex mappings, as in the sketch below.
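For example, one possible way to get from a 3 * 3 input to a 4 * 4 output (purely a hedged sketch, not something the later examples need) is to flatten the matrix, pass it through a linear layer, and reshape the result:

import torch
import torch.nn as nn

x = torch.randn(3, 3)
layer = nn.Linear(9, 16)                  # 9 flattened inputs -> 16 flattened outputs
y = layer(x.reshape(1, 9)).reshape(4, 4)  # reshape back into a 4 x 4 matrix
print(y.shape)                            # torch.Size([4, 4])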
The Linear usage code is as follows:
import torch
import numpy as np
import torch.nn as nn

X = torch.tensor([[1], [2], [3], [4]], dtype=torch.float32)
Y = torch.tensor([[2], [4], [6], [8]], dtype=torch.float32)

n_samples, n_features = X.shape  # X is a 4-row, 1-column matrix, so this returns 4 and 1
print("n_samples", n_samples, "n_features", n_features)

input_size = n_features
output_size = n_features
model = nn.Linear(input_size, output_size)

learning_rate = 0.01
n_iter = 100  # number of iterations

loss = nn.MSELoss()
[w, b] = model.parameters()
optimizer = torch.optim.SGD([w, b], lr=learning_rate)

for epoch in range(n_iter):
    y_pred = model(X)  # calling model(X) invokes the model's forward method
    l = loss(Y, y_pred)
    l.backward()
    optimizer.step()
    optimizer.zero_grad()
    if epoch % 1 == 0:
        [w, b] = model.parameters()
        print(f'epoch {epoch+1}: w2 = {w[0][0].item():.3f}, loss = {l:.8f}')
As in the code above, we define a linear model object with model = nn.Linear(input_size, output_size).
Then the return value of model.parameters() is passed to the optimizer when it is created.
The values returned by model.parameters() are w and b; nn.Linear creates a w and a b internally when it is instantiated.
Weight matrix w: shape [output_size, input_size].
Bias vector b: shape [output_size].
We then call this instance using model(x); the __call__ method is implemented in the Module base class, so instances of the class can be called like functions.
Here we pass x, and with x the model can run forward propagation; that is, model(x) passes x in and at the same time triggers forward propagation.
So the return value of model(x) is the predicted y value.
We then use the loss function we defined via nn.MSELoss() to compute the scalar loss.
This scalar can then be used for backpropagation.
Finally, we read out the values of the model parameters w and b for printing.
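A small sketch tying these pieces together (the sizes match the code above): inspect the shapes of w and b returned by model.parameters(), and confirm that calling the instance runs forward:

import torch
import torch.nn as nn

model = nn.Linear(1, 1)  # input_size = 1, output_size = 1
w, b = model.parameters()
print(w.shape)           # torch.Size([1, 1]) -> [output_size, input_size]
print(b.shape)           # torch.Size([1])    -> [output_size]

x = torch.tensor([[5.0]])
print(model(x))          # calls __call__, which runs forward(x) = x @ w.T + b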
nn.Module synopsis
nn.Module is the base class for all neural network modules in PyTorch. All neural network layers (such as linear, convolutional, and LSTM layers) inherit from this class.
By inheriting from nn.Module, you can define your own network layers or models and make use of PyTorch's automatic differentiation for training.
nn.Linear is a subclass of nn.Module: it is a concrete neural network layer class that inherits from nn.Module. It implements one of the simplest linear transformation layers, also called a fully connected layer.
Through this inheritance it can take advantage of all the functionality nn.Module provides, such as registering parameters, forward propagation, and saving and loading models.
The structure is as follows:
# nn.Module
# |
# |-- nn.Linear
# |-- nn.Conv2d
# |-- nn.LSTM
# |-- (other modules)
Below we define a custom class that inherits from nn.Module and uses nn.Linear internally to implement the same model:
import torch
import torch.nn as nn

X = torch.tensor([[1], [2], [3], [4]], dtype=torch.float32)  # a 4-row, 1-column matrix
Y = torch.tensor([[2], [4], [6], [8]], dtype=torch.float32)

n_samples, n_features = X.shape
print("n_samples", n_samples, "n_features", n_features)

input_size = n_features
output_size = n_features

# model = nn.Linear(input_size, output_size)
class LinearRegression(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(LinearRegression, self).__init__()
        # define layers
        self.lin = nn.Linear(input_dim, output_dim)

    def forward(self, x):
        return self.lin(x)

model = LinearRegression(input_size, output_size)

learning_rate = 0.01
n_iter = 100  # number of iterations

loss = nn.MSELoss()
[w, b] = model.parameters()
optimizer = torch.optim.SGD([w, b], lr=learning_rate)

for epoch in range(n_iter):
    y_pred = model(X)
    l = loss(Y, y_pred)
    l.backward()
    optimizer.step()
    optimizer.zero_grad()  # clear the gradients
    if epoch % 1 == 0:
        [w, b] = model.parameters()
        print(f'epoch {epoch+1}: w2 = {w[0][0].item():.3f}, loss = {l:.8f}')
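Because LinearRegression inherits from nn.Module, it also gets the saving and loading machinery mentioned earlier for free. Continuing from the code above, here is a minimal sketch (the file name model.pth is just an assumption for illustration):

# Save only the learned parameters (the w and b of the inner nn.Linear).
torch.save(model.state_dict(), "model.pth")

# Later: rebuild the same architecture and load the weights back in.
restored = LinearRegression(input_size, output_size)
restored.load_state_dict(torch.load("model.pth"))
print(restored(torch.tensor([[5.0]])))  # should print a value close to 10 after training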
Portal:
Learning Artificial Intelligence from Zero - Python-Pytorch Learning (I)
Learning Artificial Intelligence from Zero - Python-Pytorch Learning (II)
Learning Artificial Intelligence from Zero - Python-Pytorch Learning (III)
Learning Artificial Intelligence from Zero - Python-Pytorch Learning (IV)
That's enough studying for now.
Note: This post is original, please contact the author for authorization and attribution for any form of reproduction!
If you think this article is still good, please click [Recommend] below, thank you very much!
/kiba/p/18354543