Learning Artificial Intelligence from Zero - Python-Pytorch Learning (XIII)

Preamble

I recently came across a distinction between scientific discovery and technological invention, in which scientific discovery ranks above technological invention. I find this quite reasonable. We always say that China's science and technology is not as good as that of Europe and the United States, yet in everyday experience, whether in buildings, hardware, or software, we seem to be ahead. So why do we still say that we are behind?
The idea that scientific discovery ranks above technological invention explains this well. Our online payments, construction industry, and so on are technological inventions, not scientific discoveries, and it is scientific discovery that leads technological invention. Europe and the US are far ahead of us in scientific discovery, and they also hold a few substantial leads in technological invention, ChatGPT being one example.

The point of all this is that software development is also technological invention, so even the top experts in this profession are still working at that level.
That said, even if you graduated from Tsinghua or Peking University, once you join the technological-invention camp, that is about as far as it goes.
Neural networks aren't hard, as this series of posts shows: you can pick them up quickly even if you know no Python and have never studied algorithms. I personally feel they can be learned within a week to a month.
So those who know them need not look down on others, and those who don't need not assume that those who do are on some higher level.

Content of this article

This article focuses on using a neural network to build a chatbot.

Preparation

Before running the code, we need the nltk package and its data.
First install the nltk package:

pip install nltk

Then download the NLTK data. Create a .py file with the following code:

import nltk
nltk.download()

Then open cmd as administrator and run this .py file.

C:\Project\python_test\github\PythonTest\venv\Scripts\ C:\Project\python_test\github\PythonTest\robot_nltk\

The NLTK downloader window then pops up; change the download directory there:

PS: Some sources say you can just run nltk.download('punkt') to download only the package we need, but that did not work for me, so I downloaded everything.

# nltk.download('punkt')    # downloads the 'punkt' resource, which NLTK (Natural Language Toolkit) uses for tokenization.
# nltk.download('popular')  # downloads the most popular NLTK resources, which include more than punkt.

Coding

Write a model

First write a NeuralNet class in model.py as follows:

import torch.nn as nn


class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(NeuralNet, self).__init__()
        # three fully connected layers with a shared ReLU activation
        self.l1 = nn.Linear(input_size, hidden_size)
        self.l2 = nn.Linear(hidden_size, hidden_size)
        self.l3 = nn.Linear(hidden_size, num_classes)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.l1(x)
        out = self.relu(out)
        out = self.l2(out)
        out = self.relu(out)
        out = self.l3(out)
        # no activation and no softmax at the end
        return out
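
As a rough sanity check on how small this network is, the number of trainable parameters can be worked out by hand: a fully connected layer with n_in inputs and n_out outputs has n_in * n_out weights plus n_out biases. The sketch below does that arithmetic in plain Python. The sizes 10, 8 and 3 are made-up values for illustration, not numbers from this article, and the helper names are hypothetical.

```python
def linear_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def count_params(input_size, hidden_size, num_classes):
    """Total trainable parameters of the three-layer network l1 -> l2 -> l3."""
    return (linear_params(input_size, hidden_size)
            + linear_params(hidden_size, hidden_size)
            + linear_params(hidden_size, num_classes))


# Hypothetical sizes: 10 vocabulary words, 8 hidden units, 3 intent tags.
total = count_params(10, 8, 3)
print(total)  # 187
```

ReLU has no parameters of its own, so only the three linear layers contribute.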

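Then write a utility module, nltk_utils.py, as follows:

import numpy as np
import nltk

from nltk.stem.porter import PorterStemmer
stemmer = PorterStemmer()


def tokenize(sentence):
    # split a sentence into a list of word and punctuation tokens
    return nltk.word_tokenize(sentence)


def stem(word):
    # lower-case the word and reduce it to its stem
    return stemmer.stem(word.lower())


def bag_of_words(tokenized_sentence, words):
    # return a vector with 1 for each known word that occurs in the sentence, 0 otherwise
    sentence_words = [stem(word) for word in tokenized_sentence]
    bag = np.zeros(len(words), dtype=np.float32)
    for idx, w in enumerate(words):
        if w in sentence_words:
            bag[idx] = 1

    return bag


if __name__ == "__main__":
    # quick test; runs only when this file is executed directly, not when it is imported
    a = "How long does shipping take?"
    print(a)
    a = tokenize(a)
    print(a)

This file can be run directly to test the functions in the module.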
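
To make the bag-of-words encoding concrete, here is a small pure-Python sketch of the same idea, without NumPy or NLTK. The function name bag_of_words_plain and the sample vocabulary are illustrative assumptions, and the tokens are taken to be already stemmed.

```python
def bag_of_words_plain(stemmed_tokens, vocabulary):
    """Return 1.0 for each vocabulary word present in the sentence, else 0.0."""
    present = set(stemmed_tokens)
    return [1.0 if word in present else 0.0 for word in vocabulary]


# Illustrative, already-stemmed data.
sentence = ["hello", "how", "are", "you"]
vocabulary = ["hi", "hello", "i", "you", "bye", "thank", "cool"]

bag = bag_of_words_plain(sentence, vocabulary)
print(bag)  # [0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0]
```

Words that are not in the vocabulary ("how", "are") are simply ignored, and word order is lost; the vector only records which known words occur.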

Stemming and tokenization

Stemming reduces a word to its stem. The logic is as follows:

words = ["Organize", "organizes", "organizing"]
stemmed_words = [stem(w) for w in words]
print(stemmed_words)
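
To give a feel for what a stemmer does, here is a deliberately simplified suffix-stripping sketch. It is not the Porter algorithm that NLTK implements; the suffix list and the function name toy_stem are assumptions chosen only so that the three sample words collapse to one stem.

```python
# Suffixes are checked longest first so that "izing" wins over "ing".
SUFFIXES = ("izing", "izes", "ize", "ing", "es", "s")


def toy_stem(word, min_stem_len=3):
    """Lower-case the word and strip the first matching suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        stem_len = len(word) - len(suffix)
        if word.endswith(suffix) and stem_len >= min_stem_len:
            return word[:stem_len]
    return word


words = ["Organize", "organizes", "organizing"]
print([toy_stem(w) for w in words])  # ['organ', 'organ', 'organ']
```

The point is only that different surface forms map to one vocabulary entry, which keeps the bag-of-words vector short.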

The process is illustrated below:

Tokenization splits a sentence into tokens (words and punctuation marks).
The following code tests tokenization:

a="How long does shipping take?"
print(a)
a = tokenize(a)
print(a)
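
For intuition, the effect of tokenization on this sentence can be approximated with a regular expression that picks out runs of word characters and individual punctuation marks. This is only a rough stand-in for nltk.word_tokenize, which treats contractions, abbreviations and many other cases differently; simple_tokenize is a hypothetical helper name.

```python
import re


def simple_tokenize(sentence):
    """Split into runs of word characters and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


print(simple_tokenize("How long does shipping take?"))
# ['How', 'long', 'does', 'shipping', 'take', '?']
```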

The logic of tokenization is roughly as follows:

Preparation of test data

Write the JSON file (English version):

{
  "intents": [
    {
      "tag": "greeting",
      "patterns": [
        "Hi",
        "Hey",
        "How are you",
        "Is anyone there?",
        "Hello",
        "Good day"
      ],
      "responses": [
        "Hey :-)",
        "Hello, thanks for visiting",
        "Hi there, what can I do for you?",
        "Hi there, how can I help?"
      ]
    },
    {
      "tag": "goodbye",
      "patterns": ["Bye", "See you later", "Goodbye"],
      "responses": [
        "See you later, thanks for visiting",
        "Have a nice day",
        "Bye! Come back again soon."
      ]
    },
    {
      "tag": "thanks",
      "patterns": ["Thanks", "Thank you", "That's helpful", "Thank's a lot!"],
      "responses": ["Happy to help!", "Any time!", "My pleasure"]
    },
    {
      "tag": "items",
      "patterns": [
        "Which items do you have?",
        "What kinds of items are there?",
        "What do you sell?"
      ],
      "responses": [
        "We sell coffee and tea",
        "We have coffee and tea"
      ]
    },
    {
      "tag": "payments",
      "patterns": [
        "Do you take credit cards?",
        "Do you accept Mastercard?",
        "Can I pay with Paypal?",
        "Are you cash only?"
      ],
      "responses": [
        "We accept VISA, Mastercard and Paypal",
        "We accept most major credit cards, and Paypal"
      ]
    },
    {
      "tag": "delivery",
      "patterns": [
        "How long does delivery take?",
        "How long does shipping take?",
        "When do I get my delivery?"
      ],
      "responses": [
        "Delivery takes 2-4 days",
        "Shipping takes 2-4 days"
      ]
    },
    {
      "tag": "funny",
      "patterns": [
        "Tell me a joke!",
        "Tell me something funny!",
        "Do you know a joke?"
      ],
      "responses": [
        "Why did the hipster burn his mouth? He drank the coffee before it was cool.",
        "What did the buffalo say when his son left for college? Bison."
      ]
    }
  ]
}
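
Each intent groups a tag with example patterns and candidate responses. As a sketch of how this structure is consumed, the snippet below parses a shortened copy of the data with the standard json module and flattens it into (pattern, tag) pairs, which is the shape the training step works with. The inline string and the name pattern_tag_pairs are for illustration only.

```python
import json

# A shortened copy of the intents data, inlined so the sketch is self-contained.
raw = """
{
  "intents": [
    {"tag": "greeting", "patterns": ["Hi", "Hello"], "responses": ["Hey :-)"]},
    {"tag": "goodbye", "patterns": ["Bye"], "responses": ["Have a nice day"]}
  ]
}
"""


def pattern_tag_pairs(intents):
    """Flatten the nested intents into a list of (pattern, tag) tuples."""
    return [(pattern, intent["tag"])
            for intent in intents["intents"]
            for pattern in intent["patterns"]]


pairs = pattern_tag_pairs(json.loads(raw))
print(pairs)  # [('Hi', 'greeting'), ('Hello', 'greeting'), ('Bye', 'goodbye')]
```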

The Chinese-version data file, intents_cn.json:

{
  "intents": [
    {
      "tag": "greeting",
      "patterns": [
        "Hello",
        "Hi",
        "Is anyone there?",
        "Good morning",
        "Good afternoon",
        "Good evening"
      ],
      "responses": [
        "Hello! Is there anything I can do to help?",
        "Hello! Thank you for stopping by.",
        "Hi! Is there anything I can do for you?",
        "Good morning! How are you today?"
      ]
    },
    {
      "tag": "goodbye",
      "patterns": [
        "Goodbye",
        "See you next time",
        "Take care",
        "Good night"
      ],
      "responses": [
        "Bye! Hope to see you again soon.",
        "Bye! Have a nice day.",
        "Take care! See you next time.",
        "Good night and sweet dreams!"
      ]
    },
    {
      "tag": "thanks",
      "patterns": [
        "Thank you",
        "Thanks a lot"
      ],
      "responses": [
        "You're welcome! I'm glad I could help.",
        "No problem! At your service anytime.",
        "Don't mention it! I hope I can help you.",
        "Glad to help!"
      ]
    },
    {
      "tag": "help",
      "patterns": [
        "What can you do for me?",
        "What can you do?",
        "Can you help me?",
        "I need help"
      ],
      "responses": [
        "I can help you answer questions, provide information, or perform simple tasks.",
        "I can help you look up information, organize tasks, etc.",
        "You can ask me questions or have me do simple things.",
        "Please let me know what you need help with!"
      ]
    },
    {
      "tag": "weather",
      "patterns": [
        "What's the weather like today?",
        "What is the weather forecast?",
        "Is it cold outside?",
        "Is it nice?"
      ],
      "responses": [
        "It's a beautiful day to be outside!",
        "It's a little chilly today, so remember to dress warmly.",
        "It's a beautiful day to go for a walk.",
        "The weather is sunny and the temperature is perfect for going out."
      ]
    },
    {
      "tag": "about",
      "patterns": [
        "What are you?",
        "Who are you?",
        "What do you do?",
        "What can you do?"
      ],
      "responses": [
        "I am a chatbot that can answer your questions and help you solve your problems.",
        "I am an intelligent assistant that helps you with various tasks.",
        "I am a virtual assistant that can handle simple tasks and queries.",
        "I can help you get information or do simple tasks."
      ]
    }
  ]
}

Training data

The training script is as follows:

import numpy as np
import random
import json

import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

from nltk_utils import bag_of_words, tokenize, stem
from model import NeuralNet

with open('intents_cn.json', 'r', encoding='utf-8') as f:
    intents = json.load(f)

all_words = []
tags = []
xy = []
# loop through each sentence in our intents patterns
for intent in intents['intents']:
    tag = intent['tag']
    # add to tag list
    tags.append(tag)
    for pattern in intent['patterns']:
        # tokenize each word in the sentence
        w = tokenize(pattern)
        # add to our words list
        all_words.extend(w)
        # add to xy pair
        xy.append((w, tag))

# stem and lower each word
ignore_words = ['?', '.', '!']
all_words = [stem(w) for w in all_words if w not in ignore_words]
# remove duplicates and sort
all_words = sorted(set(all_words))
tags = sorted(set(tags))

print(len(xy), "patterns")
print(len(tags), "tags:", tags)
print(len(all_words), "unique stemmed words:", all_words)

# create training data
X_train = []
y_train = []
for (pattern_sentence, tag) in xy:
    # X: bag of words for each pattern_sentence
    bag = bag_of_words(pattern_sentence, all_words)
    X_train.append(bag)
    # y: PyTorch CrossEntropyLoss needs only class labels, not one-hot
    label = tags.index(tag)
    y_train.append(label)

X_train = np.array(X_train)
y_train = np.array(y_train)

# Hyper-parameters 
num_epochs = 1000
batch_size = 8
learning_rate = 0.001
input_size = len(X_train[0])
hidden_size = 8
output_size = len(tags)
print(input_size, output_size)

class ChatDataset(Dataset):

    def __init__(self):
        self.n_samples = len(X_train)
        self.x_data = X_train
        self.y_data = y_train

    # support indexing such that dataset[i] can be used to get i-th sample
    def __getitem__(self, index):
        return self.x_data[index], self.y_data[index]

    # we can call len(dataset) to return the size
    def __len__(self):
        return self.n_samples

dataset = ChatDataset()
train_loader = DataLoader(dataset=dataset,
                          batch_size=batch_size,
                          shuffle=True,
                          num_workers=0)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = NeuralNet(input_size, hidden_size, output_size).to(device)

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

# Train the model
for epoch in range(num_epochs):
    for (words, labels) in train_loader:
        words = words.to(device)
        labels = labels.to(dtype=torch.long).to(device)

        # Forward pass
        outputs = model(words)
        # if y would be one-hot, we must apply
        # labels = torch.max(labels, 1)[1]
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    if (epoch+1) % 100 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')


print(f'final loss: {loss.item():.4f}')

data = {
    "model_state": model.state_dict(),
    "input_size": input_size,
    "hidden_size": hidden_size,
    "output_size": output_size,
    "all_words": all_words,
    "tags": tags
}

FILE = ""  # set this to the checkpoint path; the chat script must load the same file
torch.save(data, FILE)

print(f'training complete. file saved to {FILE}')
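
The comments in the code note that the network ends without a softmax and that the labels are plain class indices rather than one-hot vectors. The reason is that cross-entropy loss, as PyTorch's nn.CrossEntropyLoss computes it, applies log-softmax to the raw scores itself and then picks out the entry for the true class. A pure-Python sketch of that calculation for a single sample follows; the scores are made-up numbers and cross_entropy is an illustrative helper, not the PyTorch implementation.

```python
import math


def cross_entropy(logits, label):
    """Negative log-softmax of the true class, computed from raw scores."""
    shift = max(logits)  # subtract the max for numerical stability
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return log_sum_exp - logits[label]


# Made-up raw scores for three tags.
logits = [2.0, 0.5, -1.0]
print(round(cross_entropy(logits, 0), 4))  # small loss: class 0 has the top score
print(round(cross_entropy(logits, 2), 4))  # larger loss: class 2 has the lowest score
```

Training pushes the score of the correct tag up relative to the others, which drives this value toward zero.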

Using the chatbot

Write the chat script as follows:

import random
import json

import torch

from model import NeuralNet
from nltk_utils import bag_of_words, tokenize

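device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

with open('intents_cn.json', 'r', encoding='utf-8') as json_data:
    intents = json.load(json_data)

FILE = ""  # set this to the checkpoint path written by the training script
data = torch.load(FILE)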

input_size = data["input_size"]
hidden_size = data["hidden_size"]
output_size = data["output_size"]
all_words = data['all_words']
tags = data['tags']
model_state = data["model_state"]

model = NeuralNet(input_size, hidden_size, output_size).to(device)
model.load_state_dict(model_state)
model.eval()

bot_name = "laptops"
print("Let's chat! (type 'quit' to exit)")
while True:
    # sentence = "do you use credit cards?"
    sentence = input("me:")
    if sentence == "quit":
        break

    sentence = tokenize(sentence)
    X = bag_of_words(sentence, all_words)
    X = X.reshape(1, X.shape[0])
    X = torch.from_numpy(X).to(device)

    output = model(X)
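    _, predicted = torch.max(output, dim=1)

    tag = tags[predicted.item()]

    # turn the raw scores into probabilities and answer only when confident
    probs = torch.softmax(output, dim=1)
    prob = probs[0][predicted.item()]
    if prob.item() > 0.75:
        for intent in intents['intents']:
            if tag == intent["tag"]:
                print(f"{bot_name}: {random.choice(intent['responses'])}")
    else:
        print(f"{bot_name}: I don't know")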
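
The reply logic above boils down to three steps: turn the raw scores into probabilities with softmax, take the most likely tag, and answer only if its probability exceeds 0.75. Here is a pure-Python sketch of that decision, with made-up scores and tag names and a hypothetical helper pick_tag:

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_tag(scores, tags, threshold=0.75):
    """Return the most likely tag, or None if the model is not confident."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return tags[best] if probs[best] > threshold else None


tags = ["goodbye", "greeting", "thanks"]  # illustrative tag names
print(pick_tag([0.1, 4.0, 0.3], tags))  # greeting
print(pick_tag([1.0, 1.2, 0.9], tags))  # None
```

Returning None corresponds to the fallback branch, where the bot says it does not know. Raising the threshold makes the bot more cautious; lowering it makes it answer more often, at the cost of more wrong replies.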

The result of running it is as follows:

Portal:
Learning Artificial Intelligence from Zero - Python-Pytorch Learning - Full Episode

Note: This post is original, please contact the author for authorization and attribution for any form of reproduction!


/kiba/p/18610399