Recently I have been refactoring some Python code, and I ran into cases where unit tests became troublesome, or even close to impossible, to write after the refactoring. One of the main reasons was that the constructors were too complex.
So this article summarizes what kind of constructor we actually need. The concepts discussed here are not limited to Python.
What is a constructor
A constructor runs when an object is created. If a class does not define one, many modern languages automatically supply a parameterless constructor at compile time and set the class members to their default values. In Python, by contrast, an instance attribute must be assigned before it can be read. In C-family languages such as C#, numeric fields such as int and float default to 0 and bool defaults to false. Reference types that are not primitive values, such as string or a class, default to null.
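As a small illustration of the Python behavior, the sketch below shows that reading an attribute nobody assigned raises AttributeError instead of yielding a default value. The class name Counter is hypothetical and exists only for this example.

```python
class Counter:
    """Hypothetical class with no __init__, so instances start with no attributes."""


counter = Counter()

try:
    counter.count  # never assigned, so there is no default value to fall back on
    read_succeeded = True
except AttributeError:
    read_succeeded = False

counter.count = 0  # once assigned, the attribute can be read normally
```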
The constructor is the natural place to initialize a class, because it runs at the moment a new object is created. Setting up an object's properties after construction instead brings far more trouble than benefit: null references, temporal coupling, multi-threading problems, and so on. If there is interest, I will discuss these in a later article. In short, putting the initialization of a class into the constructor is like laying the foundation first and then building the house, rather than building half of the house and then repairing the foundation. It also prevents the class from ever being in a "half-finished" state.
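To make the "half-finished" state concrete, here is a minimal sketch. Both classes, Connection and SafeConnection, are hypothetical and only illustrate the timing problem: the first needs a separate setup call before it is usable, the second is complete as soon as it exists.

```python
class Connection:
    """Hypothetical class that relies on a separate setup step."""

    def __init__(self):
        self.host = None  # must be filled in later via configure()

    def configure(self, host):
        self.host = host

    def url(self):
        return "db://" + self.host  # fails if configure() was never called


class SafeConnection:
    """Same idea, but the constructor receives everything it needs."""

    def __init__(self, host):
        self.host = host

    def url(self):
        return "db://" + self.host


half_built = Connection()
try:
    half_built.url()
    failed = False
except TypeError:  # raised by "db://" + None
    failed = True

ready = SafeConnection("localhost")
```

Nothing stops a caller from forgetting configure(), whereas SafeConnection cannot be created without a host.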
Although a constructor should do a complete job so that no half-finished objects exist, giving it too much responsibility causes a lot of trouble for the system. It is like moving furniture into the house before the main structure (the constructor) is completed, which usually brings unnecessary burdens.
What kind of constructor do we need
In a word: in my opinion, a constructor should only do assignments and the most basic parameter validation. It should not make external calls or perform complex initialization. Simple constructors bring the following benefits:
Maintainability
Single responsibility to avoid surprises
The constructor should also follow the single responsibility principle, be responsible only for the initialization and basic verification of objects, and should not include other complex operations. When the constructor takes too much responsibility, unexpected "surprises" will occur, making the code difficult to understand and maintain.
For example, the following code performs a database query (an external dependency) and a statistical calculation (no external dependency, but complex internal logic) in the constructor. It is hard to see at a glance what the constructor actually initializes, which increases the cognitive burden of reading and understanding the code.
class UserReport:
    def __init__(self, user_id):
        self.user_id = user_id
        # The constructor performs a database operation (external dependency)
        self.user = database.fetch_user(user_id)
        # The constructor performs a complex calculation (internal, no external dependency)
        self.statistics = self._calculate_statistics()

    def _calculate_statistics(self):
        # Assume this is a complex statistical calculation
        return {"login_count": 42, "active_days": 15}
An ideal constructor should simply do the "initialization assignment" operation, as shown below:
class UserReport:
    def __init__(self, user, statistics):
        """The constructor is only responsible for initialization and does nothing else."""
        self.user = user
        self.statistics = statistics
This constructor only performs assignments, so nothing unexpected can happen. In the earlier example, by contrast, if the methods called by the constructor go on to use other classes, and those classes access external dependencies (I/O, API calls, database operations, and so on), surprises will follow.
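A question the simple constructor raises is where the fetching and calculating should go instead. One common answer is a separate factory function that does the work and then hands plain values to the constructor. The following is only a sketch under assumptions: build_user_report, compute_statistics, and FakeDatabase are hypothetical names that do not appear in the original code, and activities are simplified to (date, action) tuples.

```python
class UserReport:
    def __init__(self, user, statistics):
        self.user = user
        self.statistics = statistics


def compute_statistics(activities):
    """Hypothetical pure function: takes (date, action) tuples, touches nothing external."""
    return {
        "login_count": len(activities),
        "active_days": len({date for date, _action in activities}),
    }


def build_user_report(user_id, database):
    """Hypothetical factory: the external calls live here, not in the constructor."""
    user = database.fetch_user(user_id)
    activities = database.fetch_user_activities(user_id)
    return UserReport(user, compute_statistics(activities))


class FakeDatabase:
    """In-memory stand-in so the sketch runs without a real database."""

    def fetch_user(self, user_id):
        return {"id": user_id, "name": "Test User"}

    def fetch_user_activities(self, user_id):
        return [("2023-01-01", "login"), ("2023-01-02", "login"), ("2023-01-02", "logout")]


report = build_user_report(1, FakeDatabase())
```

With this split, a failure in the database call surfaces in build_user_report, while UserReport itself can always be constructed from values already in hand.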
Reduce unexpected side effects
The inclusion of complex operations in the constructor not only violates the principle of single responsibility, but also may bring unexpected side effects. These side effects may cause unpredictable system behavior, increase debugging difficulty, and even cause undetectable bugs.
Let's continue to look at the previous code example:
class UserReport:
    def __init__(self, user_id):
        self.user_id = user_id
        # The constructor performs a database operation
        self.user = database.fetch_user(user_id)
        # The constructor performs a complex calculation
        self.statistics = self._calculate_statistics()

    def _calculate_statistics(self):
        # Complex statistical calculation that also queries the database
        data = database.fetch_user_activities(self.user_id)
        if not data:
            # An exception may be raised
            raise ValueError(f"No activity data for user {self.user_id}")
        return {"login_count": len(data), "active_days": len(set(d.date for d in data))}
This code shows that _calculate_statistics() accesses the database, which is a hidden dependency. If the database access raises an exception, the whole object fails to be created. The caller only wants to create an object, yet may receive an exception saying that the database cannot be reached. None of this is visible until runtime.
Traceback (most recent call last):
  File "", line 42, in <module>
    report = UserReport(user_id=1001) # The caller just wants to create a report object
  File "user_report.py", line 5, in __init__
    self.user = database.fetch_user(user_id) # The database query may fail
  File "", line 78, in fetch_user
    user_data = self._execute_query(f"SELECT * FROM users WHERE id = {user_id}")
  File "", line 31, in _execute_query
    connection = self._get_connection()
  File "", line 15, in _get_connection
    return (host=, user=, password=, db=self.db_name)
  File "/usr/local/lib/python3.8/site-packages/pymysql/__init__.py", line 94, in Connect
    return Connection(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/pymysql/", line 327, in __init__
    ()
  File "/usr/local/lib/python3.8/site-packages/pymysql/", line 629, in connect
    raise exc
pymysql.err.OperationalError: (2003, "Can't connect to MySQL server on '' (timed out)")
If the calculation logic is extracted into a dedicated method, and the logic that accesses external dependencies is supplied through injection, creating the object can no longer fail in this way:
class UserReport:
    def __init__(self, user, statistics=None):
        """The constructor is only responsible for initialization and has no side effects."""
        self.user = user
        self.statistics = statistics if statistics is not None else {}

    def calculate_statistics(self, activity_source):
        """Keep the calculation in its own method and accept the data source by injection."""
        activities = activity_source.get_activities()
        self.statistics = {
            "login_count": len(activities),
            "active_days": len(set(a.date for a in activities)),
        }
        return self.statistics


class UserActivity:
    def __init__(self, user_id, date, action):
        self.user_id = user_id
        self.date = date
        self.action = action


class DatabaseActivity:
    def __init__(self, user_id):
        self.user_id = user_id

    def get_activities(self):
        # In a real application this queries the database
        return database.fetch_user_activities(self.user_id)
Easier debugging and evolution
When the constructor is only responsible for simple initialization, the code becomes easier to debug and evolve. In contrast, constructors containing complex logic make it hard to locate problems and to extend the system. Consider the following example:
class UserReport:
    def __init__(self, user_id):
        self.user_id = user_id
        self.user = database.fetch_user(user_id)
        self.activities = database.fetch_user_activities(user_id)
        self.statistics = self._calculate_statistics()
        self.recommendations = self._generate_recommendations()
        # More complex logic...
The constructor contains too many possible failure points, and it is hard to tell which line failed when debugging. The following version is much easier to debug:
class UserReport:
    def __init__(self, user, activities=None, statistics=None, recommendations=None):
        self.user = user
        self.activities = activities or []
        self.statistics = statistics or {}
        self.recommendations = recommendations or []
Evolving a complex constructor is also risky. For example:
# The original constructor has to be modified, which is very risky
class UserReport:
    def __init__(self, user_id, month=None):  # Add a new parameter
        self.user_id = user_id
        self.user = database.fetch_user(user_id)
        # Modify the existing logic
        if month:
            self.activities = database.fetch_user_activities_by_month(user_id, month)
        else:
            self.activities = database.fetch_user_activities(user_id)
        # The following calculations may need adjusting
        self.statistics = self._calculate_statistics()
        self.recommendations = self._generate_recommendations()
Here we needed to filter activity data by month, so we added a parameter. This situation is common in real code maintenance: logic gets added wherever it first comes to mind, which makes the constructor complex, hard to understand, and more error-prone. A better way is as follows:
class UserReport:
    def __init__(self, user, activities=None, statistics=None, recommendations=None):
        self.user = user
        self.activities = activities or []
        self.statistics = statistics or {}
        self.recommendations = recommendations or []

    def filter_by_month(self, month):
        """Add the new feature as a separate method."""
        filtered_activities = [a for a in self.activities if a.date.month == month]
        return UserReport(
            self.user,
            activities=filtered_activities,
            # The original data can be recalculated or retained as needed
        )
New features can be added independently without affecting existing ones. This also reduces the risk of releasing insufficiently tested changes to this core logic.
Testability
Good constructor design has a decisive impact on the testability of code. When the constructor is simple and only responsible for basic initialization, tests become easier to write, more reliable, and independent of any specific environment. This is also why I wrote this article: while writing unit tests, I found that many classes were almost untestable (some came from third-party libraries, and others belonged to components that I had no right to modify).
Dependency injection and testability
If the constructor has more logic, for example:
class UserReport:
    def __init__(self, user_id):
        self.user_id = user_id
        self.user = database.fetch_user(user_id)
        self.activities = database.fetch_user_activities(user_id)
        self.statistics = self._calculate_statistics()
Then our unit tests become very expensive, because every external dependency has to be mocked. Even a very simple test case has to simulate all of the external dependencies, for example:
from datetime import datetime
from unittest.mock import patch


def test_user_report():
    # A lot of mock setup is required
    with patch('database.fetch_user') as mock_fetch_user:
        with patch('database.fetch_user_activities') as mock_fetch_activities:
            # Configure the mock return values
            mock_fetch_user.return_value = User(1, "Test User", "test@")
            mock_fetch_activities.return_value = [
                UserActivity(1, datetime(2023, 1, 1), "login"),
                UserActivity(1, datetime(2023, 1, 2), "login")
            ]
            # Create an object - even to test a small piece of behavior, every dependency must be mocked
            report = UserReport(1)
            # Verify the results
            assert report.statistics["login_count"] == 2
            assert report.statistics["active_days"] == 2
            # Verify the calls
            mock_fetch_user.assert_called_once_with(1)
            mock_fetch_activities.assert_called_once_with(1)
With a simple constructor, the unit tests become very simple as well. Take the following code:
class UserReport:
    def __init__(self, user, activities=None):
        self.user = user
        self.activities = activities or []
        self.statistics = {}

    def calculate_statistics(self):
        """Calculate statistics."""
        login_count = len(self.activities)
        active_days = len(set(a.date for a in self.activities))
        self.statistics = {
            "login_count": login_count,
            "active_days": active_days
        }
        return self.statistics
The unit tests no longer require complex mocks:
def test_report_should_calculate_correct_statistics_when_activities_provided():
    # Create test objects directly without mocking external dependencies
    user = User(1, "Test User", "test@")
    activities = [
        UserActivity(1, datetime(2023, 1, 1), "login"),
        UserActivity(1, datetime(2023, 1, 2), "login"),
        UserActivity(1, datetime(2023, 1, 2), "logout")  # Another activity on the same day
    ]
    # Creating the object is very simple
    report = UserReport(user, activities)
    # Test a specific method
    stats = report.calculate_statistics()
    # Verify the results
    assert stats["login_count"] == 3
    assert stats["active_days"] == 2
Injecting a mock object during testing also becomes very simple, as follows:
def test_report_should_use_activity_source_when_calculating_statistics():
    # Prepare test data
    user = User(42, "Test User", "test@")
    mock_activities = [
        UserActivity(42, datetime(2023, 1, 1), "login"),
        UserActivity(42, datetime(2023, 1, 2), "login")
    ]
    # Create a simulated data source
    activity_source = MockActivity(mock_activities)
    # Use dependency injection
    report = UserReport(user)
    report.calculate_statistics(activity_source)
    # Verify the results
    assert report.statistics["login_count"] == 2
    assert report.statistics["active_days"] == 2
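The test above relies on a MockActivity class that the article does not show. A minimal sketch of what it might look like, assuming the data source only needs a get_activities() method that returns a list (the call_count attribute is an extra added here for illustration):

```python
class MockActivity:
    """Assumed test double: returns a fixed list instead of querying a database."""

    def __init__(self, activities):
        self._activities = list(activities)
        self.call_count = 0  # lets a test check how often the source was used

    def get_activities(self):
        self.call_count += 1
        return list(self._activities)


source = MockActivity(["first activity", "second activity"])
fetched = source.get_activities()
```

Because the report only calls get_activities(), any object with that method can stand in for the real data source, with no patching involved.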
Boundary value tests are easier to write as well:
import pytest


def test_statistics_should_be_empty_when_activities_list_is_empty():
    user = User(1, "Test User", "test@")
    report = UserReport(user, [])  # Empty activity list
    stats = report.calculate_statistics()
    assert stats["login_count"] == 0
    assert stats["active_days"] == 0


def test_constructor_should_throw_exception_when_user_is_null():
    # Test the invalid user case
    with pytest.raises(ValueError):
        report = UserReport(None)  # Assumes the constructor verifies that the user is not None
As a result, unit tests can make the whole code base more robust without requiring a large number of complex mocks. Complex mocks make unit tests very fragile: a small change in logic can invalidate existing tests.
Architecture-related impact
Easier dependency injection
The core idea of dependency injection is that high-level modules should not depend on the implementation details of low-level modules, but on abstractions. Taking a taxi to work is a good analogy: we just open Didi and enter the destination. Our high-level need is to get from A to B. Which car arrives and who drives it are implementation details we do not care about, and the service can swap either of them at any time.
Dependency injection is one of the core practices of modern software architecture, and simple constructor design is the basis for doing it effectively. By injecting dependencies through the constructor, we can build loosely coupled, highly cohesive systems and significantly improve the maintainability and extensibility of our code.
# Create dependencies directly inside the class
class UserReport:
    def __init__(self, user_id):
        self.user_id = user_id
        # Depends directly on a specific implementation
        self.database = MySQLDatabase()
        self.user = self.database.fetch_user(user_id)


# Inject dependencies through the constructor
class UserReport:
    def __init__(self, user, activity_source):
        self.user = user
        self.activity_source = activity_source
        self.statistics = {}

    def calculate_statistics(self):
        activities = self.activity_source.get_activities()
        # Calculation logic...
The second piece of code shows that dependency injection is easy to implement when the constructor is simple. In real projects, dependency injection containers (IoC containers) are usually used to automate the creation and injection of dependencies, but that is beyond the scope of this article.
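The abstraction that the high-level class depends on can also be written down explicitly. The sketch below uses typing.Protocol from the standard library (Python 3.8+) to describe it. The names ActivitySource and InMemoryActivitySource are assumptions made for illustration, activities are simplified to strings, and the objects are wired by hand rather than by a container.

```python
from typing import List, Protocol


class ActivitySource(Protocol):
    """Assumed abstraction: anything with get_activities() can be injected."""

    def get_activities(self) -> List[str]:
        ...


class InMemoryActivitySource:
    """Hypothetical implementation; a database-backed one would expose the same method."""

    def __init__(self, activities: List[str]):
        self._activities = activities

    def get_activities(self) -> List[str]:
        return self._activities


class UserReport:
    def __init__(self, user: str, activity_source: ActivitySource):
        self.user = user
        self.activity_source = activity_source
        self.statistics = {}

    def calculate_statistics(self) -> dict:
        activities = self.activity_source.get_activities()
        self.statistics = {"login_count": len(activities)}
        return self.statistics


# Manual wiring: the caller decides which implementation to use.
report = UserReport("Test User", InMemoryActivitySource(["login", "login"]))
stats = report.calculate_statistics()
```

UserReport never names a concrete data source, so swapping the in-memory version for a real one changes only the wiring line.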
Design problems surface more easily
When a constructor only performs assignments, problems in the class design are easy to spot. A constructor that becomes bloated or complex often points to a deeper design flaw.
For example, when a class constructor has a large number of parameters, it usually means that the class has taken on too many responsibilities:
# Warning sign: a constructor with too many parameters
class UserReport:
    def __init__(self, user, activity_list, login_calculator, active_days_calculator,
                 visualization_tool, report_exporter, notification_system):
        self.user = user
        self.activity_list = activity_list
        self.login_calculator = login_calculator
        self.active_days_calculator = active_days_calculator
        self.visualization_tool = visualization_tool
        self.report_exporter = report_exporter
        self.notification_system = notification_system
        self.statistics = {}
A common response is to use the Builder pattern to make initialization look more elegant, but this usually only covers up the problem instead of solving it.
A constructor with too many parameters should therefore be treated as a red flag. The correct solution is to revisit the class design and separate the responsibilities:
# Core report class, focused only on data and basic statistics
class UserReport:
    def __init__(self, user, activities):
        self.user = user
        self.activities = activities
        self.statistics = {}

    def calculate(self, calculator):
        self.statistics = calculator.compute(self.activities)
        return self


# Separated statistics calculation
class ActivityStatistics:
    def compute(self, activities):
        login_count = len([a for a in activities if a.action == 'login'])
        unique_days = len(set(a.date for a in activities))
        return {"logins": login_count, "active_days": unique_days}


# Separated report export
class ReportExport:
    def to_pdf(self, report):
        # PDF export logic
        pass

    def to_excel(self, report):
        # Excel export logic
        pass


# Separated notification
class ReportNotification:
    def send(self, report, recipients):
        # Notification sending logic
        pass
Then the class calls will become very clear:
# Clear separation of responsibilities
user = User(42, "John Doe", "john@")
activities = activity_database.get_user_activities()
# Create the report and calculate its statistics
calculator = ActivityStatistics()
report = UserReport(user, activities).calculate(calculator)
# Export the report (if required)
if export_needed:
    exporter = ReportExport()
    pdf_file = exporter.to_pdf(report)
# Send notifications (if required)
if notify_admin:
    notifier = ReportNotification()
    notifier.send(report, ["admin@"])
This way, each class has a single clear responsibility, the constructor stays simple and straightforward, the pieces can be combined on demand, and testing becomes simple because each component can be tested individually.
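As an illustration of testing one component in isolation, here is a small sketch that exercises ActivityStatistics without a report, a user, or a database. It assumes activity records expose date and action attributes, and the Activity namedtuple is a hypothetical stand-in for whatever record type the real code uses.

```python
from collections import namedtuple
from datetime import date

# Hypothetical stand-in for the activity records used in the article
Activity = namedtuple("Activity", ["user_id", "date", "action"])


class ActivityStatistics:
    def compute(self, activities):
        login_count = len([a for a in activities if a.action == "login"])
        unique_days = len(set(a.date for a in activities))
        return {"logins": login_count, "active_days": unique_days}


def test_compute_counts_only_logins():
    activities = [
        Activity(1, date(2023, 1, 1), "login"),
        Activity(1, date(2023, 1, 2), "login"),
        Activity(1, date(2023, 1, 2), "logout"),  # same day, not a login
    ]
    result = ActivityStatistics().compute(activities)
    assert result == {"logins": 2, "active_days": 2}


test_compute_counts_only_logins()
```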
Special cases
In some cases it is reasonable for the constructor to do some work beyond assignment:
Parameter validation
It is reasonable to perform basic parameter validation in the constructor, which ensures that the object is in a valid state from the moment it is created. This holds as long as the constructor does not touch external dependencies or perform complex logic:
class User:
    def __init__(self, id, name, email):
        # Basic parameter validation
        if id <= 0:
            raise ValueError("User ID must be positive")
        if not name or not name.strip():
            raise ValueError("User name cannot be empty")
        if not email or "@" not in email:
            raise ValueError("Invalid email format")
        self.id = id
        self.name = name
        self.email = email
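A quick way to see the effect of this validation is to try creating a few users and look at what comes back. The sketch below repeats the User class so that it runs on its own. The helper creation_error and the sample email address are hypothetical and exist only for this example.

```python
class User:
    def __init__(self, id, name, email):
        # Basic parameter validation
        if id <= 0:
            raise ValueError("User ID must be positive")
        if not name or not name.strip():
            raise ValueError("User name cannot be empty")
        if not email or "@" not in email:
            raise ValueError("Invalid email format")
        self.id = id
        self.name = name
        self.email = email


def creation_error(*args):
    """Hypothetical helper: return the validation message, or None if creation succeeds."""
    try:
        User(*args)
    except ValueError as error:
        return str(error)
    return None


bad_id = creation_error(0, "Ann", "ann@example.com")
blank_name = creation_error(1, "   ", "ann@example.com")
bad_email = creation_error(1, "Ann", "not-an-email")
valid = creation_error(1, "Ann", "ann@example.com")
```

An invalid User never exists, so code that receives one does not need to re-check these rules.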
Simple derived value calculation
Sometimes it is reasonable to calculate simple derived values in the constructor, as long as those values remain unchanged throughout the object's lifetime:
class Rectangle:
    def __init__(self, width, height):
        if width <= 0 or height <= 0:
            raise ValueError("Dimensions must be positive")
        self.width = width
        self.height = height
        # Simple derived value calculation
        self.area = width * height
        self.perimeter = 2 * (width + height)
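The "remain unchanged" condition matters: if width could be reassigned later, a stored area would silently go stale. One way to sidestep that, sketched here as an alternative and not as the approach used above, is to compute derived values on access with the built-in @property decorator:

```python
class Rectangle:
    def __init__(self, width, height):
        if width <= 0 or height <= 0:
            raise ValueError("Dimensions must be positive")
        self.width = width
        self.height = height

    @property
    def area(self):
        # Computed on access, so it always reflects the current dimensions
        return self.width * self.height

    @property
    def perimeter(self):
        return 2 * (self.width + self.height)


rect = Rectangle(2, 3)
area_before = rect.area
rect.width = 4  # the derived values follow the change automatically
area_after = rect.area
```

The trade-off is that the value is recomputed on every access, which is negligible for arithmetic like this.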
Initialization of immutable objects
For immutable objects (objects whose state cannot be changed after creation), the constructor needs to complete all the necessary initialization work:
class ImmutablePoint:
    def __init__(self, x, y):
        self._x = x
        self._y = y
        # Pre-calculate commonly used values
        self._distance_from_origin = (x**2 + y**2)**0.5

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y

    @property
    def distance_from_origin(self):
        return self._distance_from_origin
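Python's standard library offers another way to express the same idea. The sketch below uses a frozen dataclass, which rejects attribute assignment after construction. FrozenPoint is a hypothetical name, and this is an alternative to the hand-written properties, not a replacement the article prescribes.

```python
import math
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass(frozen=True)
class FrozenPoint:
    x: float
    y: float
    # Derived value: excluded from __init__ and filled in by __post_init__
    distance_from_origin: float = field(init=False)

    def __post_init__(self):
        # Frozen dataclasses block normal assignment, so use object.__setattr__
        object.__setattr__(self, "distance_from_origin", math.hypot(self.x, self.y))


point = FrozenPoint(3, 4)

try:
    point.x = 10  # any assignment after construction is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False
```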
Summary
A well-designed constructor is the foundation of a system that is easy to maintain, test, and extend. We should stick to the principle that a constructor does only assignments and the necessary basic validation, which keeps the code clearer and more flexible.
A simple constructor can bring the following advantages:
- Easy to maintain: a single responsibility and few side effects make later debugging and iteration easier.
- Easy to test: there is no dependence on the external environment, so mocking and unit testing are straightforward.
- A clearer architecture: dependency injection is easier to implement, the code is more in line with SOLID principles, and design problems are identified sooner.
When a constructor starts to grow more complicated and its parameter list keeps expanding, the cause is usually the code design itself, not something to be papered over with techniques such as the Builder pattern. The right approach is to take a step back, re-examine the responsibilities of the class, and refactor in time.
Of course, in real code we sometimes make compromises, such as basic validity checks on parameters, simple derived-value calculations, or initializing immutable objects. These should remain rare exceptions, not the general rule.
In short, by keeping constructors concise and intuitive, we not only write higher-quality code but also discover and solve potential design problems early, making the whole system more stable and easier to maintain.