The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 57 Python Strings and Serialization - Serialization and Deserialization
Abstract:
Python serialization and deserialization is the process of converting Python objects into a stream of bytes (serialization) for storage or transmission, and converting the stream of bytes back into an object (deserialization). The pickle module and shelve module are Python's built-in serialization tools to serialize Python objects into binary data for storage or transmission.
Link to original article:
FreakStudio's Blog
Past Recommendations:
You're learning embedded and you don't know how to be object oriented?
The Best Object-Oriented Programming Tutorials on the Web for Getting Started: 00 Introduction to Object-Oriented Design Methods
The network's most suitable for the introduction of object-oriented programming tutorials: 01 Basic Concepts of Object-Oriented Programming
The Best Object-Oriented Programming Tutorials for Getting Started on the Web: 02 Python Implementations of Classes and Objects - Creating Classes with Python
The Best Object-Oriented Programming Tutorials for Getting Started on the Web: 03 Python Implementations of Classes and Objects - Adding Attributes to Custom Classes
The Best Object-Oriented Programming Tutorial on the Net for Getting Started: 04 Python Implementation of Classes and Objects - Adding Methods to Custom Classes
The Best Object-Oriented Programming Tutorial on the Net for Getting Started: 05 Python Implementation of Classes and Objects - PyCharm Code Tags
The best object-oriented programming tutorials on the net for getting started: 06 Python implementation of classes and objects - data encapsulation of custom classes
The best object-oriented programming tutorial on the net for getting started: 07 Python implementation of classes and objects - type annotations
The best object-oriented programming tutorials on the net for getting started: 08 Python implementations of classes and objects - @property decorator
The best object-oriented programming tutorials on the net for getting started: 09 Python implementation of classes and objects - the relationship between classes
The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 10 Python Implementations of Classes and Objects - Class Inheritance and the Liskov Substitution Principle
The best object-oriented programming tutorials on the net for getting started: 11 Python implementation of classes and objects - subclasses call parent class methods
The network's most suitable for the introduction of object-oriented programming tutorials: 12 classes and objects of the Python implementation - Python using the logging module to output the program running logs
The network's most suitable for the introduction of object-oriented programming tutorials: 13 classes and objects of the Python implementation - visual reading code artifacts Sourcetrail's installation use
The Best Object-Oriented Programming Tutorials on the Web for Getting Started: 14 Python Implementations of Classes and Objects - Static Methods and Class Methods for Classes
The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 15 Python Implementations of Classes and Objects - __slots__ Magic Methods
The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 16 Python Implementations of Classes and Objects - Polymorphism, Method Overriding, and the Principle of Open-Close
The Best Object-Oriented Programming Tutorials for Getting Started on the Web: 17 Python Implementations of Classes and Objects - Duck Types and "file-like objects"
The network's most suitable for the introduction of object-oriented programming tutorials: 18 classes and objects Python implementation - multiple inheritance and PyQtGraph serial data plotting graphs
The Best Object-Oriented Programming Tutorials on the Web for Getting Started: 19 Python Implementations of Classes and Objects - Using PyCharm to Automatically Generate File Annotations and Function Annotations
The best object-oriented programming tutorials on the web for getting started: 20 Python implementation of classes and objects - Combinatorial relationship implementation and CSV file saving
The best introductory object-oriented programming tutorials on the net: 21 Python implementation of classes and objects - Organization of multiple files: modules and packages
The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 22 Python Implementations of Classes and Objects - Exceptions and Syntax Errors
The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 23 Python Implementation of Classes and Objects - Throwing Exceptions
The Best Object-Oriented Programming Tutorials on the Web for Getting Started: 24 Python Implementations of Classes and Objects - Exception Catching and Handling
The best object-oriented programming tutorials on the web for getting started: 25 Python implementation of classes and objects - Python to determine the type of input data
The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 26 Python Implementations of Classes and Objects - Context Managers and with Statements
The best introductory object-oriented programming tutorials on the web: 27 Python implementation of classes and objects - Exception hierarchy and custom exception class implementation in Python
The best object-oriented programming tutorials on the net for getting started: 28 Python implementations of classes and objects - Python programming principles, philosophies and norms in a big summary
The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 29 Python Implementations of Classes and Objects - Assertions and Defensive Programming and Use of the help Function
The Best Object-Oriented Programming Tutorials for Getting Started on the Web: 30 Python's Built-In Data Types - the root class of object
The Best Object-Oriented Programming Tutorials on the Web for Getting Started: 31 Python's Built-In Data Types - Object Object and Type Type
The Best Object-Oriented Programming Tutorials on the Web for Getting Started: 32 Python's Built-in Data Types - Class Class and Instance Instance
The Best Object-Oriented Programming Tutorials for Getting Started on the Web: 33 Python's Built-In Data Types - The Relationship Between the Object Object and the Type Type
The Best Object-Oriented Programming Tutorials on the Web for Getting Started: 34 Python's Built-In Data Types - Python's Common Compound Data Types: Tuples and Named Tuples
The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 35 Python's Built-In Data Types - Document Strings and the __doc__ Attribute
The Best Object-Oriented Programming Tutorials on the Web for Getting Started: 36 Python's Built-In Data Types - Dictionaries
The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 37 Python's Common Composite Data Types - Lists and List Derivatives
The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 38 Python's Common Composite Data Types - Using Lists to Implement Stacks, Queues, and Double-Ended Queues
The Best Object-Oriented Programming Tutorials on the Web for Getting Started: 39 Python Common Composite Data Types - Collections
The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 40 Python's Common Compound Data Types - Enumeration and Use of the enum Module
The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 41 Python's Common Composite Data Types - Queues (FIFO, LIFO, Priority Queue, Double-Ended Queue, and Ring Queue)
The best introductory object-oriented programming tutorials on the web: 42 Python commonly used composite data types-collections container data type
The Best Object-Oriented Programming Tutorials on the Web for Getting Started: 43 Python's Common Composite Data Types - Extended Built-In Data Types
The Best Object-Oriented Programming Tutorial on the Net for Getting Started: 44 Python Built-In Functions and Magic Methods - Magic Methods for Rewriting Built-In Types
The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 45 Python Implementations of Common Data Structures - Chain Tables, Trees, Hash Tables, Graphs, and Heaps
The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 46 Python Function Methods and Interfaces - Functions and Event-Driven Frameworks
The network's most suitable for the introduction of object-oriented programming tutorials: 47 Python function methods and interfaces - callback function Callback
The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 48 Python Function Methods and Interfaces - Positional Arguments, Default Arguments, Variable Arguments, and Keyword Arguments
Best Object-Oriented Programming Tutorials on the Net for Getting Started: 49 Python Functions Methods and Interfaces - Difference between Functions and Methods and lamda Anonymous Functions
The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 50 Python Function Methods and Interfaces - Interfaces and Abstract Base Classes
The Best Object-Oriented Programming Tutorials on the Web for Getting Started: 51 Python Function Methods and Interfaces - Implementing Interfaces with Zope
Best Object-Oriented Programming Tutorials for Beginners on the Web: 52 Python Functions Methods and Interfaces-Protocol Protocols and Interfaces
The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 53 Python Strings and Serialization - Strings and Character Encoding
The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 54 Python Strings and Serialization - String Formatting and the format method
The Best Object-Oriented Programming Tutorial on the Net for Getting Started: 55 Python Strings and Serialization - Byte Sequence Types and Variable Byte Strings
The Best Object-Oriented Programming Tutorials on the Net for Getting Started: 56 Python Strings and Serialization - Regular Expressions and re Module Applications
Documentation and code acquisition:
The following link can be accessed to download the document:
/leezisheng/Doc
This document introduces object-oriented programming in Python and assumes a basic understanding of Python syntax and microcontroller development. Compared with other blogs or books on the subject, it is more detailed and focuses on embedded host-computer applications: serial port data sending and receiving, data processing, and dynamic graph drawing serve as application examples for the host computer and the lower computer, and the Sourcetrail code visualization tool is used to help readers read and understand the code.
The link to get the relevant sample code is below:/leezisheng/Python-OOP-Demo
Main text
Serialization and Deserialization
We have established that all variables are stored in memory while a program runs. For example, suppose we define a dictionary named d that contains key-value pairs such as name, age, grade, and score. We can change these values at any time during execution, for example by changing the value of name from 'Larry' to 'david'. However, once the program finishes, all the memory occupied by these variables is reclaimed by the operating system. If we modify a variable during execution but do not persist the modified data to disk, then the next time the program runs the variable is re-initialized to its original state, i.e. the value of name is still 'Larry'. Therefore, to ensure data continuity and consistency, we need to write critical data to disk at the right time so that it can be restored to the correct state after the program restarts.
On the other hand, objects stored in memory are difficult to transmit and exchange over a network because of factors such as differences in programming language and network environment. This gave rise to a mechanism that converts in-memory objects to and from data formats (strings, bytes, etc.) that are easy to persist on disk or exchange over a network. This mechanism is called serialization and deserialization:
- Serialization: the process of converting in-memory objects, which cannot be persisted or transmitted directly, into a form that can be easily persisted and transmitted. In Python it is called pickling; in other languages it is called serialization, marshalling, flattening, and so on, which all mean the same thing. Once serialized, the content can be written to disk or transferred to another machine over the network;
- Deserialization: the reverse process of converting the persisted or transmitted form back into an in-memory object, i.e., reading the contents of a variable back into memory from its serialized form. In Python it is also known as unpickling.
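The two directions can be sketched with pickle.dumps() and pickle.loads(). This is only a minimal illustration; the dictionary d and its values are assumptions chosen to mirror the example in the text.

```python
import pickle

# Hypothetical dictionary mirroring the example above (values are assumptions)
d = {"name": "Larry", "age": 20, "grade": 3, "score": 90}

# Serialization (pickling): in-memory object -> bytes
data = pickle.dumps(d)
print(type(data))

# Deserialization (unpickling): bytes -> a new object equal to the original
restored = pickle.loads(data)
print(restored == d, restored is d)
```

The restored dictionary compares equal to d but is a separate object, which is what makes the bytes suitable for writing to disk or sending to another process.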
Three common ways to serialize objects in Python are json, pickle, and shelve. The json module converts Python data into the widely used JSON format; it is commonly used in web interfaces to pass data to other systems or clients, and it can also save Python data to a local file. The pickle module implements a binary protocol for serializing and deserializing Python object structures. The shelve module can be seen as an upgraded version of pickle: it uses pickle's serialization protocol but provides a simpler and more convenient way of operating.
Serialization with the pickle module
Python's pickle module stores objects directly in a special, Python-specific storage format. An object (everything it holds exists as an attribute) must be converted into a sequence of bytes before it can be stored or transferred, so that it can be restored when we need it.
The pickle module provides the following functions for storing and loading data:
Function | Description | Operates on |
---|---|---|
dump | The dump function takes an object and a file-like object and writes the serialized bytes to the file. The file object must have a write method, and this method must know how to handle a bytes argument (so files opened in text-output mode cannot be used). A file-like object is simply an object that looks like a file object and has at least two methods, read() and write(). | file-like objects |
load | The load function reads a serialized object from the file object. The file object must have appropriate read and readline methods, and both must return bytes. The pickle module loads the object from these bytes, and load returns a completely rebuilt object. | file-like objects |
dumps | Returns the pickled representation of the object as a bytes object instead of writing it to a file. | bytes object |
loads | Reconstructs and returns the object hierarchy from the pickled representation data. data must be a bytes-like object. | bytes object |
To serialize an object hierarchy, simply call the dumps() function. Similarly, to deserialize a stream of data, call the loads() function. For more control over serialization and deserialization, you can create Pickler or Unpickler objects, respectively.
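As a rough sketch of the Pickler and Unpickler route, the example below binds each to an in-memory stream. The io.BytesIO buffer stands in for a binary file, and the sample objects are assumptions for illustration.

```python
import io
import pickle

buffer = io.BytesIO()  # in-memory file-like object that accepts bytes

# A Pickler is bound to one output stream and can store several objects
pickler = pickle.Pickler(buffer, protocol=pickle.HIGHEST_PROTOCOL)
pickler.dump({"id": 1})
pickler.dump(["a", "b"])

buffer.seek(0)  # rewind before reading

# An Unpickler is bound to one input stream; each load() returns one object
unpickler = pickle.Unpickler(buffer)
first = unpickler.load()
second = unpickler.load()
print(first, second)
```

Compared with the module-level functions, these classes keep the stream and the chosen protocol in one place, and they can be subclassed when finer control is needed.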
Let's use the pickle module to store and load a list object. The sample code is as follows:
import pickle
# List object to be serialized
some_data = ["a list", "containing", 5, "values including another list", ["inner", "list"]]
# Serialize the object and store the list in a file:
# open a file named "pickled_list" in binary write mode 'wb'
with open("pickled_list", 'wb') as file:
    # Serialize some_data and write it to the file using pickle.dump()
    pickle.dump(some_data, file)
# Deserialize the object and load the list from the file:
# open the same file in binary read mode 'rb'
with open("pickled_list", 'rb') as file:
    # Rebuild the object from the file using pickle.load()
    loaded_data = pickle.load(file)
# Print the loaded list
print(loaded_data)
# Check whether the loaded list is equal to the original list
if loaded_data == some_data:
    print("Deserialized list is the same as the original list.")
After running the code, a new binary file named pickled_list appears, and the deserialized list is equal to the original list.
Both dump and dumps take an optional protocol parameter. If the objects we save and load are used only by Python 3 programs, we do not need to specify it. However, if the stored objects may need to be read by older versions of Python, we have to use an older and relatively inefficient protocol. To ensure compatibility, consider the setting of this parameter carefully.
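A small sketch of the protocol parameter follows. The sample record is an assumption; protocol 2 is used here because it is the newest protocol that Python 2 can read.

```python
import pickle

record = {"name": "Larry", "scores": [95, 88, 72]}  # sample data (assumption)

# No protocol argument: the interpreter's default protocol is used
default_bytes = pickle.dumps(record)

# Protocol 2 is the newest protocol that Python 2 can read
legacy_bytes = pickle.dumps(record, protocol=2)

# loads() detects the protocol from the data itself, so no argument is needed
from_default = pickle.loads(default_bytes)
from_legacy = pickle.loads(legacy_bytes)
print(from_default == from_legacy == record)

# The defaults depend on the Python version in use
print(pickle.DEFAULT_PROTOCOL, pickle.HIGHEST_PROTOCOL)
```

Only the writing side chooses a protocol; the reading side just needs an interpreter new enough to understand it.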
When data is deserialized, pickle assumes that all the required source code is available. Modules, classes, and functions are automatically imported as needed. For applications where Python data is shared by interpreters on different machines, this can be a problem, because all machines must have access to the same source code.
One of the side effects of pickle loading is that it automatically loads the appropriate module and constructs the instance object.
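One way to see why the source code must be available is to pickle a class itself: the bytes contain only the module and class names, not the code. The sketch below uses collections.OrderedDict purely as a convenient standard-library example.

```python
import pickle
from collections import OrderedDict

# Pickling a class stores a reference (module name + class name), not its code
data = pickle.dumps(OrderedDict)
print(b"collections" in data, b"OrderedDict" in data)

# Unpickling imports the module if needed and looks the class up by name
print(pickle.loads(data) is OrderedDict)
```

Instances work the same way: the pickle records which class to look up plus the instance state, so the loading side must be able to import that class.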
When using pickle, note that it is not a secure format: loading pickled data from an unknown or untrusted source can execute malicious code. Do not unpickle data received from untrusted sources, and do not pass pickles over the internet to unknown interpreters.
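The risk can be illustrated with a deliberately harmless sketch. The class name Payload is hypothetical, and print stands in for whatever callable an attacker might choose.

```python
import pickle

class Payload:
    """Benign stand-in for a malicious object (hypothetical name)."""

    def __reduce__(self):
        # Tells pickle: "to rebuild me, call print(...)".
        # An attacker could name any importable callable here instead.
        return (print, ("this ran during pickle.loads()",))

untrusted_bytes = pickle.dumps(Payload())

# Loading runs the callable chosen by whoever produced the bytes
result = pickle.loads(untrusted_bytes)
print(result)
```

The call happens inside pickle.loads() itself, before the caller has any chance to inspect the result, which is why validating the object after loading is not a sufficient defence.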
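The pickle module also defines exceptions for failed serialization and deserialization operations: PickleError is the common base class, PicklingError is raised when an unpicklable object is encountered, and UnpicklingError is raised when there is a problem unpickling an object, such as corrupted data.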
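A hedged sketch of handling such failures is shown below. The helper names try_dump and try_load are hypothetical; because pickle may also raise other exceptions (for example AttributeError, TypeError, or EOFError) depending on the object and the Python version, the handlers catch a small group rather than a single type.

```python
import pickle

def try_dump(obj):
    """Return pickled bytes, or None if obj cannot be pickled."""
    try:
        return pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError) as exc:
        print(f"could not pickle: {type(exc).__name__}")
        return None

def try_load(data):
    """Return the unpickled object, or None if data is not a complete pickle."""
    try:
        return pickle.loads(data)
    except (pickle.UnpicklingError, EOFError) as exc:
        print(f"could not unpickle: {type(exc).__name__}")
        return None

ok = try_dump([1, 2, 3])
bad = try_dump(lambda x: x)      # lambdas cannot be pickled
truncated = try_load(ok[:-1])    # drop the final byte to corrupt the stream
```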
The dump or load method can be executed multiple times on an open file. Each call to dump will store a single object (plus all the objects it contains), while a load will load and return only one object. So for individual files, each call to dump to store an object should be accompanied by an associated call to load.
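The pairing of dump and load calls can be sketched as follows. An io.BytesIO buffer stands in for a file opened in binary mode, and the stored objects are arbitrary sample values.

```python
import io
import pickle

stream = io.BytesIO()  # stands in for a file opened in binary mode

# Three dump() calls store three separate objects back to back
for obj in ("first", {"second": 2}, [3, 3, 3]):
    pickle.dump(obj, stream)

stream.seek(0)
loaded = []
while True:
    try:
        loaded.append(pickle.load(stream))  # one object per load() call
    except EOFError:
        break  # no more pickled objects in the stream

print(loaded)
```

When the number of stored objects is not known in advance, reading until EOFError is one simple way to recover all of them.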
pickle is not an efficient way to encode large data structures such as binary arrays created with the array or numpy modules. If you need to move a large amount of array data, you're better off saving it as an array block in a file or using a more advanced standard encoding such as HDF5 (which requires third-party library support).
For the most common Python objects, pickle does a good job of serializing them. Basic types such as integers, floats, and strings can be serialized, as can container objects such as lists or dictionaries. Importantly, any object can be pickled as long as all of its attributes are picklable.
But be warned: time-dependent attributes or objects that depend on external system state, for example an open network socket, an open file, a running thread, or a database connection, should not be serialized with the pickle module if possible. It is not reasonable to attempt to reload these objects at some point in the future, as much of the system state associated with them may no longer exist.
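A quick way to see this limitation is to try pickling such objects directly. The sketch below uses a lock and an open file as two representative examples; the dictionary labels are only for display.

```python
import os
import pickle
import threading

resources = {
    "lock": threading.Lock(),
    "open file": open(os.devnull, "rb"),
}

failures = {}
for label, resource in resources.items():
    try:
        pickle.dumps(resource)
    except TypeError as exc:
        failures[label] = str(exc)  # the message names the offending type

resources["open file"].close()
print(failures)
```

An ordinary object that merely holds one of these resources as an attribute fails in the same way, because pickle walks through every attribute.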
When we want to serialize objects that have such time-dependent attributes, we can customize how this transient data is stored and loaded. User-defined classes can get around these restrictions by providing __getstate__() and __setstate__() methods. If __getstate__() is defined, pickle.dump() calls it to obtain the state to serialize. Similarly, __setstate__() is called on deserialization.
In the following code, we define a class called UpdatedURL that is used to periodically update the content of a given URL:
- In the initialization method of the class, pass in a URL parameter and call the update() method to get the content of the URL and the last update time. Then call the schedule() method to set the timer to call update() every hour (3600 seconds);
- The update() method uses the urlopen() function to open the specified URL, read its contents, and record the current time as the last update time. The schedule() method is then called again to set the next timer;
- The schedule() method creates a Timer object, uses the update() method as a callback function and sets it as a daemon thread, and finally starts the timer.
Note that this code uses functions and classes such as urlopen(), datetime.datetime.now(), and Timer, which need to be imported before they can be used.
The sample code is as follows:
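from threading import Timer
import datetime
from urllib.request import urlopen
import pickle

class UpdatedURL:
    def __init__(self, url):
        self.url = url
        self.contents = ''
        self.last_updated = None
        # Fetch the content once; update() also schedules the next refresh
        self.update()

    def update(self):
        # Download the page and record when it was fetched
        self.contents = urlopen(self.url).read()
        self.last_updated = datetime.datetime.now()
        self.schedule()

    def schedule(self):
        # Call update() again in one hour (3600 seconds) on a daemon thread
        self.timer = Timer(3600, self.update)
        self.timer.daemon = True
        self.timer.start()

# Pass the full URL to monitor
u = UpdatedURL("/")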
url, contents and last_updated are all serializable, so let's try to serialize an instance of this class:
serialized = pickle.dumps(u)
Running this raises an error: the timer attribute holds a running Timer thread, which pickle cannot serialize.
When the pickle module serializes an object, it first checks whether the object has a __getstate__ method. If it does, pickle stores the return value of __getstate__; otherwise, it stores the object's __dict__ attribute. __dict__ is a dictionary that maps all attribute names of an object to their corresponding values.
Next, we implement serialization of objects with time-related properties by overriding the __getstate__ method of the UpdatedURL class. In this __getstate__ method, all properties and values of the class instance are first copied into a new dictionary object, new_state. Then the new_state is checked to see if it contains a key named 'timer', and if it does, the key and its corresponding value are deleted. Finally, the new dictionary object is returned:
    def __getstate__(self):
        # Copy the instance state and drop the unpicklable timer
        new_state = self.__dict__.copy()
        if 'timer' in new_state:
            del new_state['timer']
        return new_state
Now serializing this object will not fail anymore. It can be successfully loaded via loads. However, the reloaded object no longer has a timer attribute, so it won't be able to refresh periodically as it was originally designed to do, and we need to create a new timer for the deserialized object.
Just as overriding __getstate__ customizes serialization, we can customize deserialization by defining the __setstate__ method. This method takes only one argument, the object returned by __getstate__. If you implement both methods, __getstate__ does not have to return a dictionary, because __setstate__ can handle whatever object is returned. Here, the custom __setstate__ method restores __dict__ and then restarts the timer.
    def __setstate__(self, data):
        self.__dict__ = data
        # Create a new timer for the restored object
        self.schedule()
Next, we deserialize the bytes object serialized using the loads function:
u2 = pickle.loads(serialized)
# Use the hasattr() function to check whether the object has the given attribute
print(hasattr(u2, 'timer'))
print()
The hasattr() check prints True: by overriding the __setstate__ method we implement a custom deserialization operation that creates a new timer for the deserialized object.
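The whole __getstate__ and __setstate__ pattern can also be tried without network access. The sketch below is an assumption-laden stand-in, not the UpdatedURL class itself: Heartbeat and tick are hypothetical names, and the timer callback does nothing.

```python
import pickle
from threading import Timer

class Heartbeat:
    """Hypothetical stand-in for UpdatedURL that needs no network access."""

    def __init__(self, name):
        self.name = name
        self.schedule()

    def tick(self):
        pass  # placeholder for the periodic work

    def schedule(self):
        self.timer = Timer(3600, self.tick)
        self.timer.daemon = True  # do not keep the interpreter alive
        self.timer.start()

    def __getstate__(self):
        state = self.__dict__.copy()
        state.pop("timer", None)  # drop the unpicklable Timer thread
        return state

    def __setstate__(self, state):
        self.__dict__ = state
        self.schedule()  # give the restored object a fresh timer

original = Heartbeat("demo")
clone = pickle.loads(pickle.dumps(original))
print(clone.name, hasattr(clone, "timer"), clone.timer is not original.timer)

# Cancel both timers so the example exits cleanly
original.timer.cancel()
clone.timer.cancel()
```

The restored object gets its own timer rather than sharing the original's, which matches the idea that transient state is recreated instead of stored.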
Serialization with the shelve module
The shelve module uses Python's pickle module to serialize and deserialize Python objects, saving them to a file on disk. Unlike the pickle module, it stores data as key-value pairs, similar to a dictionary.
Because shelve is part of the Python standard library, no additional installation is required; simply import it in a Python script. When you save data with shelve, you create a shelf file, which is a database file containing key-value pairs, usually with a .db, .shelf, or .dat extension.
In the next example, we create a Shelve file and store the data into the file where we can access and store the data using keys.
import shelve
# Create or open a shelf file using the shelve.open() function;
# pass the shelf's filename as the first argument
with shelve.open('') as shelf:
    # Write key-value pairs to the shelf with shelf['key'] = value
    shelf['name'] = 'Alice'
    shelf['age'] = 30
    shelf['scores'] = [95, 88, 72]
    # Read the data back with shelf['key'] and
    # assign it to the appropriate variables
    name = shelf['name']
    age = shelf['age']
    scores = shelf['scores']
    print(f'Name: {name}')
    print(f'Age: {age}')
    print(f'Scores: {scores}')
Running the code prints the name, age, and scores that were just stored.
We can also update the data in a Shelve file like a dictionary. If an already existing key is used to store a new value, it will overwrite the old value. Similarly, keys can be deleted to remove the corresponding values.
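# Pass the shelf's filename as the first argument
with shelve.open('', writeback=True) as shelf:
    # Update data
    shelf['name'] = 'Bob'
    # Delete data
    del shelf['age']
    name = shelf['name']
    print(name)
    try:
        age = shelf['age']
        print(age)
    except KeyError:
        # The key was deleted above, so the lookup fails
        print("No ages")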
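The writeback=True argument used above matters most when a stored value is mutated in place rather than reassigned. The following sketch compares the two behaviours; the temporary directory and the filename demo_shelf are assumptions made so the example leaves no files behind.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_shelf")  # hypothetical filename

    with shelve.open(path) as shelf:
        shelf["scores"] = [95, 88, 72]

    # Without writeback, shelf["scores"] returns a fresh copy each time,
    # so an in-place append is silently lost
    with shelve.open(path) as shelf:
        shelf["scores"].append(60)
    with shelve.open(path) as shelf:
        without_writeback = shelf["scores"]

    # With writeback=True, accessed entries are cached and written back
    # when the shelf is synced or closed
    with shelve.open(path, writeback=True) as shelf:
        shelf["scores"].append(60)
    with shelve.open(path) as shelf:
        with_writeback = shelf["scores"]

print(without_writeback)
print(with_writeback)
```

Plain assignments such as shelf['name'] = 'Bob' are stored either way; writeback only changes what happens to objects that are fetched and then modified.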
While the shelve module is very convenient, it has some limitations and caveats. shelve does not support concurrent writes from multiple threads; if you need to write a shelf file in a multi-threaded environment, protect the file operations with a thread lock. The keys of a shelf must be strings, while the values can be any picklable Python objects. shelve is usually suitable for small applications, configuration files, and simple database needs, but not for storing large amounts of data; in particular, with writeback=True every accessed entry is kept in an in-memory cache until the shelf is synced or closed.
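One possible way to apply the thread-lock advice is sketched below. The helper save, the lock name shelf_lock, and the filename demo_shelf are all hypothetical; each thread holds the lock for the whole open, write, and close cycle.

```python
import os
import shelve
import tempfile
import threading

shelf_lock = threading.Lock()  # one lock shared by every writer thread

def save(path, key, value):
    # Hold the lock for the whole open-write-close cycle so that
    # only one thread at a time touches the shelf file
    with shelf_lock:
        with shelve.open(path) as shelf:
            shelf[key] = value

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_shelf")  # hypothetical filename
    threads = [
        threading.Thread(target=save, args=(path, f"item{i}", i))
        for i in range(5)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Read everything back once all writers have finished
    with shelve.open(path) as shelf:
        stored = {key: shelf[key] for key in shelf}

print(stored)
```

Opening the shelf inside the lock in each thread is a conservative choice: it avoids sharing one open shelf object between threads, at the cost of reopening the file for every write.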
When using the pickle and shelve modules, note that their serialization protocols are specific to Python, so the serialized data can only be read by Python programs. In addition, different Python versions use different default serialization protocols, so for compatibility you need to specify the protocol version with the protocol parameter when serializing. Despite these drawbacks, the pickle and shelve modules have an advantage over the json module: custom data types can be serialized and deserialized directly, without writing additional conversion functions or classes.
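The trade-off with json can be sketched as follows. The Student class is a hypothetical custom type used only for this comparison.

```python
import json
import pickle

class Student:
    """Hypothetical custom type used only for this comparison."""

    def __init__(self, name, score):
        self.name = name
        self.score = score

s = Student("Larry", 90)

# pickle handles the instance directly
restored = pickle.loads(pickle.dumps(s))

# json cannot serialize the instance as-is and raises TypeError
try:
    json.dumps(s)
    json_direct_ok = True
except TypeError:
    json_direct_ok = False

# With json, the conversion in both directions is written by hand
text = json.dumps(s.__dict__)          # human-readable, language-neutral
rebuilt = Student(**json.loads(text))  # manual reconstruction

print(restored.name, json_direct_ok, text, rebuilt.score)
```

The json text can be read by programs in other languages, while the pickle bytes cannot, so the choice depends on who needs to consume the data.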