Today, our main task is to run through the established modeling process again and load the resulting model into a web application so that it can be invoked through a web interface. The model will infer the country where a UFO sighting occurred from the sighting's duration and its latitude and longitude. Admittedly, the latitude and longitude alone would seem sufficient to answer that question, but this is simply an exercise: hopefully, working through it will give you a better understanding of how machine learning models are used and integrated in web applications.
Knowledge Review
For this task, you'll need two tools: Flask and Pickle, both of which run on Python.
Flask is a lightweight web application framework for building simple web applications and RESTful APIs. It follows the WSGI (Web Server Gateway Interface) standard, supports a wide range of extensions, and has the following characteristics (a minimal example follows the list):
- Simple and easy to use: Flask's core is very small and easy to pick up, making it well suited to beginners.
- Flexible: developers can select and add extensions as needed to make it more powerful.
- Routing system: Flask provides a flexible URL routing mechanism for managing different pages and requests.
- Template engine: the integrated Jinja2 template engine makes it easy to generate dynamic HTML pages.
- RESTful API support: well suited to building RESTful services.
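To get a feel for Flask before we use it with the model, here is a minimal sketch of a Flask application; the route and the returned string are purely illustrative:

```python
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    # A trivial route that returns plain text; real apps usually render templates.
    return "Hello from Flask!"

if __name__ == "__main__":
    # debug=True enables auto-reload and an interactive debugger during development.
    app.run(debug=True)
```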
Pickle is a built-in Python module for object serialization and deserialization. Serialization converts Python objects into byte streams, while deserialization restores byte streams back into Python objects. Its features include the following (a short example follows):
- Simple and convenient: easy to use for saving and loading Python objects.
- Support for many object types: it can serialize most Python data types, including custom classes.
- Data persistence: well suited to storing data in files for later use.
Note: Pickle has security pitfalls; deserializing untrusted data can lead to code-execution vulnerabilities.
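As a quick illustration (not the UFO model yet), here is a minimal sketch of serializing and restoring an ordinary Python object with pickle; the file name is arbitrary:

```python
import pickle

data = {"sightings": 3, "country": "us"}

# Serialize the object to a byte stream on disk.
with open("data.pkl", "wb") as f:
    pickle.dump(data, f)

# Deserialize it back into an equivalent Python object.
with open("data.pkl", "rb") as f:
    restored = pickle.load(f)

print(restored)  # {'sightings': 3, 'country': 'us'}
```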
Cleaning the Data
First, let's dive into our dataset to see its exact structure and content. This step is very important because the quality and characteristics of the data will directly affect the performance of the model and the prediction results.
import pandas as pd
import numpy as np

# Load the raw UFO sightings CSV (the file name is assumed; adjust it to your dataset).
ufos = pd.read_csv('./data/ufos.csv')
ufos.head()
Next, depending on the problem we are trying to solve, we need to extract the fields relevant to it: the country, the latitude and longitude coordinates, and the duration of the sighting. These fields provide the input features for our subsequent analysis and modeling, helping us infer the country where a UFO sighting occurred.
# Keep only the columns we need, under shorter names.
ufos = pd.DataFrame({'Seconds': ufos['duration (seconds)'], 'Country': ufos['country'], 'Latitude': ufos['latitude'], 'Longitude': ufos['longitude']})
ufos.Country.unique()

# Drop rows with missing values and keep sightings lasting 1-60 seconds.
ufos.dropna(inplace=True)
ufos = ufos[(ufos['Seconds'] >= 1) & (ufos['Seconds'] <= 60)]
ufos.info()
After this round of data cleansing and filtering, the dataset contains only the fields we need, keeping it lean and efficient. We have also restricted the sighting duration to values between 1 and 60 seconds to keep that feature valid and consistent.
Because the model is sensitive to how the data is represented, we next convert the text field into a numerical format the model can work with. This conversion lets the model understand and use the country information, improving its predictive power.
from sklearn.preprocessing import LabelEncoder

# Encode the country text labels as integers so the model can use them.
ufos['Country'] = LabelEncoder().fit_transform(ufos['Country'])
ufos.head()
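If you also want to know which integer was assigned to each country code, a slight variation of the block above keeps a reference to the encoder so that its classes_ attribute can be inspected (the country codes in the comment are an assumption about the dataset's contents):

```python
from sklearn.preprocessing import LabelEncoder

# Keep a reference to the encoder so the label mapping can be inspected later.
encoder = LabelEncoder()
ufos['Country'] = encoder.fit_transform(ufos['Country'])

# classes_ lists the original labels in the order of their integer codes,
# e.g. ['au' 'ca' 'de' 'gb' 'us'] -> 0..4 (assuming these codes are present).
print(encoder.classes_)
```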
So far, we have completed data cleaning and preprocessing, ensuring that the dataset is of good quality and consistency. Next, we move on to model training. Since our task is to predict the country, and the country labels form a small, fixed set of classes, a logistic regression model is a natural choice.
Training the Model
Now we come to the key step of preparing to train the model: splitting the dataset into training and test sets.
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report
from sklearn.linear_model import LogisticRegression
Selected_features = ['Seconds','Latitude','Longitude']
X = ufos[Selected_features]
y = ufos['Country']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LogisticRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
print(classification_report(y_test, predictions))
print('Predicted labels: ', predictions)
print('Accuracy: ', accuracy_score(y_test, predictions))
Returns the following:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        41
           1       0.83      0.23      0.36       250
           2       1.00      1.00      1.00         8
           3       1.00      1.00      1.00       131
           4       0.96      1.00      0.98      4743

    accuracy                           0.96      5173
   macro avg       0.96      0.85      0.87      5173
weighted avg       0.96      0.96      0.95      5173

Predicted labels:  [4 4 4 ... 3 4 4]
Accuracy:  0.9605644693601392
The model's accuracy is quite good, at about 96%. This result is not surprising, because there is indeed a strong relationship between the target, Country, and the Latitude/Longitude features.
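Accuracy alone can hide weak per-class performance (note the low recall for class 1 in the report above). If you want a fuller picture, a confusion matrix is one quick check, reusing the y_test and predictions variables from the training step:

```python
from sklearn.metrics import confusion_matrix

# Rows are true labels, columns are predicted labels; large off-diagonal
# counts reveal which countries the model confuses despite high accuracy.
print(confusion_matrix(y_test, predictions))
```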
Next, we need to package the trained model so that we can integrate it into our web application. That way, users will be able to invoke the model for country prediction from a web page and enjoy a smooth interactive experience.
Packaging the Model
We can directly leverage the pickle library, a convenient dependency, to serialize the trained model and write it to a file.
import pickle

# Serialize the trained model to disk (the file name is assumed; any .pkl name works).
model_filename = 'ufo-model.pkl'
pickle.dump(model, open(model_filename, 'wb'))

# Reload it to confirm the file can be deserialized.
model = pickle.load(open('ufo-model.pkl', 'rb'))
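As a quick sanity check, you can feed the reloaded model a single sample; the values 50, 44 and -12 below are just an arbitrary choice of seconds, latitude and longitude:

```python
# Predict the encoded country label for one sighting: [Seconds, Latitude, Longitude].
sample = [[50, 44, -12]]
print(model.predict(sample))
```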
Building the Web App
Now our goal is to call the trained model from a Flask application and display the prediction results on a web page in an attractive, user-friendly way. As a lightweight web framework, Flask helps us quickly build web applications that let users interact with the model through an intuitive interface.
web-app/
  static/
    css/
  templates/
This is the new directory structure we need to create. Be sure to follow this layout closely; if the files and folders are not organized this way, the program may fail at runtime.
Add the following lines to requirements.txt:
scikit-learn
pandas
numpy
flask
Then install the dependencies:
pip install -r requirements.txt
The remaining CSS and HTML code is not listed here; please go to the Learning Center, copy it, and refer to it yourself.
Then create app.py in the web-app directory and add the following code:
import numpy as np
from flask import Flask, request, render_template
import pickle

app = Flask(__name__)

# Load the pickled model once at startup (the file name matches the one saved earlier).
model = pickle.load(open("./ufo-model.pkl", "rb"))


@app.route("/")
def home():
    return render_template("index.html")


@app.route("/predict", methods=["POST"])
def predict():
    # Read the form values, convert them to integers, and build a feature vector.
    int_features = [int(x) for x in request.form.values()]
    final_features = [np.array(int_features)]
    prediction = model.predict(final_features)

    output = prediction[0]

    # Map the encoded label back to a readable country name.
    countries = ["Australia", "Canada", "Germany", "UK", "US"]

    return render_template(
        "index.html", prediction_text="Likely country: {}".format(countries[output])
    )


if __name__ == "__main__":
    app.run(debug=True)
This code implements a simple web application that allows the user to input feature data and make predictions based on a loaded machine learning model. The prediction results are rendered to the same page and displayed to the user. User requests and responses are easily handled through Flask's routing system.
For example, when we input the feature values 50 (seconds), 44 (latitude), and -12 (longitude), the application calls the trained model and displays the predicted country on the page.
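If you prefer to test the endpoint without the browser form, you can POST the same values with the requests library. Note that the field names below (seconds, latitude, longitude) are assumptions and must match the input names used in your index.html form:

```python
import requests

# Form fields must match the <input name="..."> attributes in the HTML template.
payload = {"seconds": "50", "latitude": "44", "longitude": "-12"}
response = requests.post("http://127.0.0.1:5000/predict", data=payload)

# The response body is the rendered HTML page containing the prediction text.
print(response.status_code)
```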
Summary
In this project, we successfully integrated a machine learning model into a web application using Flask and Pickle, allowing users to make predictions through a user-friendly interface. The process let us experience the details of data preprocessing and model training, and gave us a deeper understanding of how machine learning is put to work in real applications.
Through this series of hands-on exercises, we not only consolidated our understanding of the tools and techniques, but also strengthened our ability to turn theoretical knowledge into practical applications. The value of this process lies in the fact that it challenges our technical skills while also developing our ability to solve real problems.
I'm Rain, a Java server-side developer exploring the mysteries of AI technology. I love technical communication and sharing, and I am passionate about the open source community. I am also a Tencent Cloud Creative Star, an Alibaba Cloud Expert Blogger, a Huawei Cloud Enjoyment Expert, and a Nuggets Excellent Author.
💡 I won't be shy about sharing my personal explorations and experiences on the path of technology, in the hope that I can bring some inspiration and help to your learning and growth.
🌟 Welcome to the effortless drizzle! 🌟