PaddlePaddle: Project 1: Hello Paddle
1. New Project
Interpreter Type: virtual environment
2. Ordinary program VS Machine Learning
The biggest difference between a machine learning program and an ordinary program is this: an ordinary program is given the rules for processing data, applies them to inputs, and produces outputs, while a machine learning program is given data and learns the rules from it, without being told those rules in advance.
Task: When taking a taxi, there is a starting fee of 10 Euros, charged as soon as you get in the car. For every kilometer the taxi travels, you pay an additional 2 Euros. When a passenger gets off, the taximeter needs to calculate the fare the passenger has to pay.
def calculate_fee(distance_travelled):
    return 10 + 2 * distance_travelled

for x in [1.0, 3.0, 5.0, 9.0, 10.0, 20.0]:
    print(calculate_fee(x))

Now, let's change the problem a little. We know the number of kilometers each passenger travels, and we also know the total fare each passenger pays the driver on getting off. However, we know neither the starting fare nor the fare per kilometer. We hope to let the computer learn the rule for calculating the total fare from these data.
More specifically, we want the machine learning program to learn the parameters $w$ and $b$ in the following formula: $$totalFee = w \times distanceTravelled + b$$
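As a quick sanity check (plain Python, independent of Paddle; the variable names are just for illustration): with noiseless data like this, two samples are already enough to solve for $w$ and $b$ directly, which is exactly the rule we will ask the machine to discover:

```python
# Two (distance, fare) samples taken from the taxi data.
x1, y1 = 1.0, 12.0
x2, y2 = 3.0, 16.0

# totalFee = w * distanceTravelled + b gives two linear equations;
# subtracting them isolates w, then b follows.
w = (y2 - y1) / (x2 - x1)
b = y1 - w * x1

print(w, b)  # 2.0 10.0: the fare per kilometer and the starting fee
```

Real data is noisy, so instead of solving equations exactly, machine learning fits the parameters to minimize the overall error, which is what the rest of this project does with Paddle.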
3. Import PaddlePaddle
(1) Install Paddle Lib
File --> Settings --> Project: Project Interpreter --> Add --> paddlepaddle --> version 2.6.0 --> Install Package.
Alternatively, install from the command line:
python -m pip install paddlepaddle==3.0.0b1 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/
(2) Import PaddlePaddle
import paddle
print("Paddle " + paddle.__version__)
4. Prepare data
In this machine learning task, the distance travelled by each passenger, $distanceTravelled$, and the corresponding total fare paid, $totalFee$, are already known.
Typically, in machine learning tasks, input values like $distanceTravelled$ are called $x$ (or features), and output values like $totalFee$ are called $y$ (or labels).
Like other deep learning frameworks, PaddlePaddle uses Tensors to represent data. Convert the sample data into Paddle Tensors:
x_data = paddle.to_tensor([[1.0], [3.0], [5.0], [9.0], [10.0], [20.0]])
y_data = paddle.to_tensor([[12.0], [16.0], [20.0], [28.0], [30.0], [50.0]])
5. Calculation of the model defined by PaddlePaddle
Use machine learning methods through PaddlePaddle to learn the parameters $w$ and $b$ in the following formula from the data: $$yPredict = w \times x + b$$
In this way, we can later estimate the value of $y$ given $x$.
PaddlePaddle's linear transformation layer ($paddle.nn.Linear$) will be used to implement this calculation. The variables $x$, $y$, $w$, $b$, and $yPredict$ all correspond to Tensors in Paddle.
Tips: In this example, we already know from experience that there is a linear relationship between $distanceTravelled$ and $totalFee$. But in most practical problems, the relationship between features (x) and labels (y) is nonlinear, so more complex types of neural networks are needed. (For example, the BMI index is not linearly related to your height, and a certain pixel value in a picture is not linearly related to whether the picture shows a cat or a dog.)
linear = paddle.nn.Linear(in_features=1, out_features=1)
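Conceptually, this layer stores a weight and a bias and computes $w \times x + b$ for each input. A minimal pure-Python sketch of the same forward computation, with $w$ and $b$ fixed by hand (Paddle would initialize and then learn them):

```python
# Hand-picked parameters, only to illustrate the shape of the calculation.
w, b = 2.0, 10.0

def linear_forward(xs, w, b):
    """Apply the affine map element-wise, as a 1-in/1-out linear layer would."""
    return [w * x + b for x in xs]

print(linear_forward([1.0, 3.0, 5.0], w, b))  # [12.0, 16.0, 20.0]
```

With in_features=1 and out_features=1, Paddle's layer performs the same affine map, except that its $w$ and $b$ are learnable.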
6. Get ready to run Paddle
The computer starts by randomly guessing $w$ and $b$; let's see how well it guesses at the beginning. Paddle initializes $w$ to a random value and $b$ to zero, which is also a commonly used initialization strategy in this field.
w_before_opt = linear.weight.numpy().item()
b_before_opt = linear.bias.numpy().item()
print("w before optimize: {}".format(w_before_opt))
print("b before optimize: {}".format(b_before_opt))
7. Tell PaddlePaddle how to learn
After defining the neural network (although it is the simplest one), we need to tell PaddlePaddle how to learn, so that it can obtain the parameters $w$ and $b$.
In machine/deep learning, the machine starts by guessing the parameters randomly. When these randomly guessed values are used to perform calculations (predictions), there is inevitably a gap between the predicted values and the actual values. Next, the machine adjusts $w$ and $b$ according to this gap. With such gradual adjustments, $w$ and $b$ become more and more accurate, the gap between predicted and actual values becomes smaller and smaller, and useful values of $w$ and $b$ are finally obtained.
The function that measures the gap is the loss function, and the method used to adjust the parameters is the optimization algorithm.
In this example, the mean squared error is used as the loss function, and SGD (stochastic gradient descent) is used as the optimization algorithm. Its parameter learning_rate can be understood as controlling the step size of each adjustment.
mse_loss = paddle.nn.MSELoss()
sgd_optimizer = paddle.optimizer.SGD(learning_rate=0.001, parameters=linear.parameters())
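Together, these two objects implement a computation that can be written out by hand. For the mean squared error $$L = \frac{1}{n}\sum_{i}(w x_i + b - y_i)^2$$ the gradients are $\partial L/\partial w = \frac{2}{n}\sum_i x_i (w x_i + b - y_i)$ and $\partial L/\partial b = \frac{2}{n}\sum_i (w x_i + b - y_i)$, and SGD subtracts learning_rate times each gradient. A pure-Python sketch of a single update step (an illustration of the math, not Paddle's actual implementation; the starting guess here is arbitrary):

```python
xs = [1.0, 3.0, 5.0, 9.0, 10.0, 20.0]
ys = [12.0, 16.0, 20.0, 28.0, 30.0, 50.0]
w, b = 0.5, 0.0   # an arbitrary starting guess
lr = 0.001        # same learning rate as the SGD optimizer above
n = len(xs)

# residuals: prediction minus target for each sample
errs = [w * x + b - y for x, y in zip(xs, ys)]

# gradients of the mean squared error with respect to w and b
grad_w = 2 * sum(x * e for x, e in zip(xs, errs)) / n
grad_b = 2 * sum(errs) / n

# one SGD step: move against the gradient
w -= lr * grad_w
b -= lr * grad_b
print(w, b)
```

Repeating this step many times is exactly what the training loop in the next section does.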
8. Run the optimization algorithm
Next, let PaddlePaddle run this optimization algorithm. This is the process of gradually adjusting parameters described earlier. You should see the loss value (which measures the gap between y and y_predict) steadily decreasing.
total_epoch = 5000
for i in range(total_epoch):
    y_predict = linear(x_data)
    loss = mse_loss(y_predict, y_data)
    loss.backward()
    sgd_optimizer.step()
    sgd_optimizer.clear_grad()
    if i % 1000 == 0:
        print("epoch {} loss {}".format(i, loss.numpy()))
print("finished training, loss {}".format(loss.numpy()))
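The same loop can be reproduced without Paddle at all, which makes it easy to see gradient descent converging toward $w = 2$ and $b = 10$. A pure-Python sketch (same data and learning rate as above; a hand-rolled illustration, not the library's implementation):

```python
xs = [1.0, 3.0, 5.0, 9.0, 10.0, 20.0]
ys = [12.0, 16.0, 20.0, 28.0, 30.0, 50.0]
w, b, lr, n = 0.0, 0.0, 0.001, len(xs)

for i in range(5000):
    errs = [w * x + b - y for x, y in zip(xs, ys)]          # predictions minus targets
    w -= lr * 2 * sum(x * e for x, e in zip(xs, errs)) / n  # step along -dL/dw
    b -= lr * 2 * sum(errs) / n                             # step along -dL/db

print(w, b)  # approximately 2.0 and 10.0
```

The values printed at the end land near w = 2 and b = 10, matching what the Paddle version learns after the same number of epochs.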
9. Parameters learned by machine learning method
After adjusting the parameters (learning) in this way, we get a value of $w$ very close to 2.0 and a value of $b$ close to 10.0. Although they are not exactly 2 and 10, they are good model parameters learned from the data. If you want, let the machine learn for longer to get parameter values even closer to the true values.
w_after_opt = linear.weight.numpy().item()
b_after_opt = linear.bias.numpy().item()
print("w after optimize: {}".format(w_after_opt))
print("b after optimize: {}".format(b_after_opt))

10. Results
b before optimize: 0.0
epoch 0 loss 1069.3748779296875
epoch 1000 loss 8.01093578338623
epoch 2000 loss 1.7911834716796875
epoch 3000 loss 0.40050235390663147
epoch 4000 loss 0.0895506963133812
finished training, loss 0.020052796229720116
w after optimize: 2.0180323123931885
b after optimize: 9.769429206848145
