One of the most powerful and mysterious keywords in Python is yield. Beginners are often confused by it, while advanced developers use it to write efficient code. What exactly is yield? How does it work in Python code? Let's find out.
yield is a very important, but also rather confusing, keyword in Python.
What is yield and why do we need it in Python?
Come on, let's break it down and see what yield really is.
Iteration and Iterable Objects
To understand yield, we first need to figure out what iterable objects (iterables) are.
An iterable object is simply an object whose elements you can read one by one, such as a list, a string, a file, and so on. For example, when you create a list, you can read its elements one by one using a for loop:
mylist = [1, 2, 3]
for i in mylist:
    print(i)
The output will be:
1
2
3
mylist is an iterable object. You can also use a list comprehension to create a list, which is iterable as well:
mylist = [x*x for x in range(3)]
for i in mylist:
    print(i)
The output is:
0
1
4
Anything you can use in a for ... in ... loop is an iterable object, including lists, strings, files, and so on.
Iterable objects like these are convenient because you can read their values as many times as you want, but all of the values are stored in memory. This becomes a problem when the amount of data is large.
Generators
Generators are a type of iterator that you can traverse only once. Instead of storing all the values in memory like a list, a generator produces each value as it is needed. Take a look at an example of a generator:
mygenerator = (x*x for x in range(3))
for i in mygenerator:
    print(i)
The output is the same as with the list comprehension:
0
1
4
But watch out: a generator can only be used once, because its values are "used and forgotten". It computes 0 and forgets it, computes 1 and forgets it, and finishes after computing 4. If you run a second for loop over the same generator object, it produces nothing.
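To make the single-use behavior concrete, here is a minimal sketch that traverses a list and a generator twice each. The names squares_list and squares_gen are made up for this illustration.

```python
squares_list = [x * x for x in range(3)]   # list comprehension
squares_gen = (x * x for x in range(3))    # generator expression

# A list can be traversed as often as you like.
list_first = list(squares_list)
list_second = list(squares_list)

# A generator is used up by the first traversal.
gen_first = list(squares_gen)
gen_second = list(squares_gen)

print(list_first, list_second)  # [0, 1, 4] [0, 1, 4]
print(gen_first, gen_second)    # [0, 1, 4] []
```

If you need the values a second time, create a new generator or store the values in a list.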
The yield keyword
yield is a keyword that is used like return, except that calling a function that contains yield returns a generator instead of a single value. Take a look at this example:
def create_generator():
    mylist = range(3)
    for i in mylist:
        yield i*i

mygenerator = create_generator()  # Create a generator
print(mygenerator)  # mygenerator is a generator object!
The output is:
<generator object create_generator at 0xb7555c34>
Iterate through this generator with a for loop:
for i in mygenerator:
    print(i)
Output:
0
1
4
This example may seem simple, but it's especially useful when working with large amounts of data because the generator only generates values when it needs them, rather than generating them all at once and then storing them in memory.
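As a rough illustration of the memory difference, the sketch below compares the size of a list object with the size of a generator object using sys.getsizeof. The value of n is arbitrary, and getsizeof only measures the container itself (not the integers it refers to), so treat the numbers as an indication rather than an exact measurement.

```python
import sys

n = 100_000  # arbitrary size, chosen only for illustration

squares_list = [x * x for x in range(n)]   # every value is created up front
squares_gen = (x * x for x in range(n))    # values are created on demand

list_size = sys.getsizeof(squares_list)    # grows with n
gen_size = sys.getsizeof(squares_gen)      # small and independent of n

print(list_size > gen_size)                # True

# Both produce the same values; the generator just never holds them all at once.
same_total = sum(squares_gen) == sum(squares_list)
print(same_total)                          # True
```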
A deeper understanding of yield
To fully grasp yield, you need to understand that when a generator function is called, the code in the function body is not executed immediately. The call only returns a generator object. The body runs each time the for loop asks the generator for a value, resuming from where it last paused and continuing until it reaches the next yield.
The first time the for loop asks for a value, the generator runs the function body from the beginning until it reaches a yield, then returns the first value. Each subsequent request runs the next iteration of the loop inside the function and returns the next value, until the generator has no more values to return. This happens when the loop has ended or when a condition is no longer satisfied.
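The pause-and-resume behavior described above can be observed with a small sketch. The events list and the traced_generator function are hypothetical names used only to record the order in which the code runs.

```python
events = []  # records what the generator body does, in order

def traced_generator():
    events.append("start")
    for i in range(3):
        events.append(f"before yield {i * i}")
        yield i * i
        events.append(f"after yield {i * i}")
    events.append("end")

gen = traced_generator()
events_after_call = list(events)    # calling the function runs none of its body

first = next(gen)                   # runs the body up to the first yield
events_after_first = list(events)

rest = list(gen)                    # resumes after each yield until the body ends

print(events_after_call)            # []
print(first, events_after_first)    # 0 ['start', 'before yield 0']
print(rest)                         # [1, 4]
print(events[-1])                   # end
```

Note that "after yield 0" is only recorded once the second value is requested, which shows that execution really stops at the yield.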
Take a look at a practical example.
def _get_child_candidates(self, distance, min_dist, max_dist):
    # Yield the left child if it exists and may contain values in range
    if self._leftchild and distance - max_dist < self._median:
        yield self._leftchild
    # Yield the right child if it exists and may contain values in range
    if self._rightchild and distance + max_dist >= self._median:
        yield self._rightchild
This code runs each time the generator object is asked for a value:
- If the node has a left child and the distance is suitable, it yields the left child.
- If the node has a right child and the distance is suitable, it yields the right child.
- If there are no more child nodes to yield, the generator is exhausted.
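The caller uses this generator as follows:
# Start with an empty result list and a work list that holds only the current node
result, candidates = list(), [self]
while candidates:
    # Take the last candidate off the work list
    node = candidates.pop()
    distance = node._get_dist(obj)
    # Keep the node's values if its distance is within range
    if distance <= max_dist and distance >= min_dist:
        result.extend(node._values)
    # Add the node's qualifying children to the work list
    candidates.extend(node._get_child_candidates(distance, min_dist, max_dist))

return result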
There are a couple of clever things about the code here:
- The loop traverses a list that grows while the loop is running. This is a concise way to walk through nested data, although it is a bit dangerous because it can end up in an infinite loop. In this example, candidates.extend(node._get_child_candidates(distance, min_dist, max_dist)) exhausts all the values of the generator, but the while loop keeps creating new generator objects, which produce different values because they act on different nodes.
- The extend() method is a list method that expects an iterable object and adds its values to the list. Normally we pass it a list, but in this code it receives a generator, which is a good idea because:
  - You don't need to read the values twice.
  - You may have a lot of child nodes and don't want to store them all in memory.
This code shows why Python is so cool: it doesn't care if the method's arguments are lists or other iterable objects. This feature is called duck typing, and it's an example of Python's flexibility.
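The same pattern can be tried out on a small scale. The sketch below is not the tree class from the example above. It assumes a hypothetical tree made of nested dictionaries and a made-up helper, children_of, to show a work list that grows during the loop and an extend() call that receives a generator.

```python
def children_of(node):
    """Hypothetical helper: yield the child nodes of a nested-dict node."""
    for child in node.get("children", []):
        yield child

# A tiny tree invented for this illustration
tree = {
    "value": 1,
    "children": [
        {"value": 2, "children": [{"value": 4}]},
        {"value": 3},
    ],
}

result, candidates = [], [tree]
while candidates:
    node = candidates.pop()
    result.append(node["value"])
    # extend() accepts any iterable, so the generator is consumed directly
    candidates.extend(children_of(node))

print(sorted(result))  # [1, 2, 3, 4]
```

The loop terminates here because every node is visited once and the tree has no cycles. With cyclic data, the same loop would never finish.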
Advanced Usage
Let's look at a more advanced use - controlling the exhaustion of the generator.
class Bank():
    crisis = False

    def create_atm(self):
        # Keep handing out money as long as there is no crisis
        while not self.crisis:
            yield "$100"

hsbc = Bank()
corner_street_atm = hsbc.create_atm()
print(next(corner_street_atm))  # Output $100
print(next(corner_street_atm))  # Output $100
print([next(corner_street_atm) for _ in range(5)])  # Output ['$100', '$100', '$100', '$100', '$100']

hsbc.crisis = True
print(next(corner_street_atm))  # Raises StopIteration
Here we have simulated an ATM where you can keep withdrawing money when there is no crisis in the bank, but as soon as the crisis comes, the ATM stops working and you can't withdraw any more money even from a new ATM.
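The claim about new ATMs can be checked with a short sketch. It repeats the Bank class so that it runs on its own, and it passes a default value to next() so that an exhausted generator returns "no cash" instead of raising StopIteration. The names wall_street_atm and brand_new_atm and the "no cash" marker are assumptions made for this illustration.

```python
class Bank:
    crisis = False

    def create_atm(self):
        while not self.crisis:
            yield "$100"

hsbc = Bank()
corner_street_atm = hsbc.create_atm()
first_withdrawal = next(corner_street_atm)                   # "$100"

hsbc.crisis = True
during_crisis = next(corner_street_atm, "no cash")           # existing ATM stops
wall_street_atm = hsbc.create_atm()
new_atm_during_crisis = next(wall_street_atm, "no cash")     # new ATM is empty too

hsbc.crisis = False
old_atm_after_crisis = next(corner_street_atm, "no cash")    # an exhausted generator stays exhausted
brand_new_atm = hsbc.create_atm()
new_atm_after_crisis = next(brand_new_atm, "no cash")        # a fresh generator works again

print(first_withdrawal, during_crisis, new_atm_during_crisis)  # $100 no cash no cash
print(old_atm_after_crisis, new_atm_after_crisis)              # no cash $100
```

Once a generator has finished, it cannot be restarted. The only way to get values again is to call the generator function and create a new generator.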
The itertools module
Finally, I'd like to introduce you to a very useful module called itertools, which contains a number of special functions for manipulating iterable objects. If you ever want to copy a generator, join two generators, group values into nested lists with a single line of code, or use map and zip without creating another list, then you should import itertools.
As an example, let's look at the possible order of arrival for a four-horse race.
import itertools

horses = [1, 2, 3, 4]
# permutations() lazily produces every possible ordering of the horses
races = itertools.permutations(horses)
print(list(races))
Output:
[(1, 2, 3, 4), (1, 2, 4, 3), (1, 3, 2, 4), (1, 3, 4, 2), (1, 4, 2, 3), (1, 4, 3, 2), (2, 1, 3, 4), (2, 1, 4, 3), (2, 3, 1, 4), (2, 3, 4, 1), (2, 4, 1, 3), (2, 4, 3, 1), (3, 1, 2, 4), (3, 1, 4, 2), (3, 2, 1, 4), (3, 2, 4, 1), (3, 4, 1, 2), (3, 4, 2, 1), (4, 1, 2, 3), (4, 1, 3, 2), (4, 2, 1, 3), (4, 2, 3, 1), (4, 3, 1, 2), (4, 3, 2, 1)]
The itertools module is a great companion for Python programmers and makes working with iterable objects much easier.
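A few of the other operations mentioned above, joining two generators and copying a generator, can be sketched with itertools.chain and itertools.tee, and itertools.islice takes just the first few values of a lazy sequence. The squares function is a made-up helper for this illustration.

```python
import itertools

def squares(n):
    """Hypothetical helper: yield the squares of 0 .. n-1."""
    for i in range(n):
        yield i * i

# Join two generators without building an intermediate list
joined = list(itertools.chain(squares(3), squares(2)))

# "Copy" a generator: tee() returns independent iterators over the same values
first_copy, second_copy = itertools.tee(squares(3))
copies = (list(first_copy), list(second_copy))

# Take only the first two finishing orders instead of generating all 24
first_two_orders = list(itertools.islice(itertools.permutations([1, 2, 3, 4]), 2))

print(joined)            # [0, 1, 4, 0, 1]
print(copies)            # ([0, 1, 4], [0, 1, 4])
print(first_two_orders)  # [(1, 2, 3, 4), (1, 2, 4, 3)]
```

After calling tee(), use only the returned copies and leave the original generator alone, because advancing the original would make the copies skip values.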
Summary
yield is a powerful tool in Python that helps you work with large amounts of data efficiently. Understanding how yield works is an important step toward mastering Python.
Processing large amounts of data has become the norm. Generators suit this kind of work because they produce one value at a time instead of holding everything in memory, which makes them useful for log processing, data stream analysis, and real-time data processing.
We have covered the basic concepts and usage of yield and seen why it matters for efficient data processing. Mastering yield adds a sharp tool to your Python toolbox.