
Handling associated collections when processing data asynchronously in SQLAlchemy

2024-09-05 10:43:20

When defining associations in the SQLAlchemy object model, we declare them with relationship(). Its lazy parameter selects one of several loading strategies. This article describes those strategies and how they differ, and gives some code examples for asynchronous processing.

1. Define relationships in SQLAlchemy

In SQLAlchemy, the relationship() function is used to define relationships between tables (one-to-many, many-to-one, many-to-many, etc.). It supports many parameters that control how the associated data is loaded and processed. Here are some commonly used relationship() parameters and their descriptions:

1. lazy

  • Purpose: Controls how the associated data is loaded.
  • Possible values:
    • 'select': Lazy loading (the default). When the relationship attribute is accessed, a separate query is sent to fetch the associated data.
    • 'selectin': Loads the associated objects in bulk with an IN query, avoiding the n+1 query problem.
    • 'joined': Loads the associated data directly in the main query using a JOIN.
    • 'subquery': Loads the associated objects in bulk using a subquery.
    • 'immediate': Loads the associated objects immediately after the primary object is loaded.
    • 'dynamic': Applies only to collection relationships such as one-to-many; returns a query object that can be further filtered before the associated data is fetched.
  • Details:

In SQLAlchemy, lazy is the relationship() parameter that defines how an ORM relationship is loaded. It controls when and how the associated data (one-to-many, many-to-one, etc.) is fetched.

1) lazy='select' (default)

  • Description: The most common strategy, known as lazy loading. When the relationship attribute is first accessed, SQLAlchemy sends a new SQL query to load the data.
  • Advantages: Avoids unnecessary queries and saves resources when the associated data is never used.
  • Disadvantages: When you access the associated objects of many parent objects, a new SQL query is issued for each parent, which leads to the "n+1 query problem".

2) lazy='selectin'

  • Description: Similar to lazy='select', but the related objects are queried in bulk with an IN clause. SQLAlchemy fetches the related data for many parent objects in a single additional query, rather than querying for each object individually.
  • Advantages: Solves the "n+1 query problem" and is more efficient than select when many parents are loaded.
  • Disadvantages: Suited to cases where an IN query is efficient; performance may suffer if the result set is very large.

3) lazy='joined'

  • Description: The associated objects are loaded as part of the main query using a JOIN. They are available as soon as the query returns, and no additional queries are required.
  • Advantages: Avoids multiple SQL queries; suitable when the associated data is always needed together with the main objects.
  • Disadvantages: If the joined tables contain many rows, the result set grows and becomes more complex, which can degrade performance.

4) lazy='immediate'

  • Description: Loads the associated objects as soon as the main object is loaded. It works like select, except that the extra query is sent right after the main object is loaded instead of on first access.
  • Advantages: Ensures that the complete data is available immediately after the object is loaded.
  • Disadvantages: A separate query is still sent for each parent object, so the "n+1 query problem" can still occur.

5) lazy='subquery'

  • Description: Loads related objects with a subquery. SQLAlchemy re-issues the main query as a subquery and joins it to the related table to load the related objects in bulk.
  • Advantages: Avoids the "n+1 query problem" and is suitable for handling large data sets.
  • Disadvantages: The subquery may make the statement less efficient, especially when the main query is complex.

6) lazy='dynamic'

  • Description: Applies only to collection relationships such as one-to-many. The attribute returns a query object instead of the actual result set, and you can filter or otherwise refine that query before fetching the related objects.
  • Advantages: Very flexible; you can query exactly the associated objects you need, when you need them.
  • Disadvantages: The relationship attribute cannot be used like an ordinary collection; the data must always be fetched through a query.
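
To make the difference between the 'select' and 'selectin' strategies concrete, here is a minimal sketch at the SQL level. It does not use SQLAlchemy at all: it uses Python's built-in sqlite3 module and a hypothetical query() helper that records each statement, and it only imitates the query patterns the two strategies produce (one query per parent versus a single IN query).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (id INTEGER PRIMARY KEY,
                        parent_id INTEGER REFERENCES parent(id));
    INSERT INTO parent (id) VALUES (1), (2), (3);
    INSERT INTO child (id, parent_id) VALUES (10, 1), (11, 1), (12, 2);
""")

executed = []  # every SQL statement sent through query()


def query(sql, params=()):
    """Hypothetical helper: run a statement and remember that it was sent."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


# Pattern of lazy='select': 1 query for the parents + 1 query per parent.
parents = [row[0] for row in query("SELECT id FROM parent ORDER BY id")]
lazy_children = {
    pid: [row[0] for row in query(
        "SELECT id FROM child WHERE parent_id = ? ORDER BY id", (pid,))]
    for pid in parents
}
lazy_query_count = len(executed)

# Pattern of lazy='selectin': 1 query for the parents + 1 IN query for all children.
executed.clear()
parents = [row[0] for row in query("SELECT id FROM parent ORDER BY id")]
placeholders = ", ".join("?" for _ in parents)
rows = query(
    f"SELECT parent_id, id FROM child "
    f"WHERE parent_id IN ({placeholders}) ORDER BY id",
    parents,
)
batched_children = {pid: [] for pid in parents}
for parent_id, child_id in rows:
    batched_children[parent_id].append(child_id)
batched_query_count = len(executed)

print(lazy_query_count, batched_query_count)
```

Both patterns produce the same parent-to-children mapping; only the number of round trips differs, and the gap grows with the number of parent rows.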

 

2. backref

  • Purpose: Defines a reverse reference that allows the current class to be reached from the associated class.

  • Usage: With backref, the reverse relationship is generated automatically on the associated class, so you do not have to define both sides of a bidirectional relationship by hand.

  • Example:

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", backref="parent")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

3. back_populates

  • Purpose: When a bidirectional relationship is defined by hand, back_populates explicitly names the matching relationship on the other class.

  • Example:

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))
    parent = relationship("Parent", back_populates="children")
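
As a minimal sketch of what a bidirectional relationship gives you, the following snippet shows that the two sides are kept in sync in memory, before anything is written to a database. It assumes SQLAlchemy 1.4 or later (for sqlalchemy.orm.declarative_base) and reuses the Parent/Child models from the example above.

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))
    parent = relationship("Parent", back_populates="children")


parent = Parent()
child = Child()

# Appending on one side sets the attribute on the other side (no session needed).
parent.children.append(child)
linked_after_append = child.parent is parent

# Clearing the many-to-one side removes the child from the collection again.
child.parent = None
removed_after_reset = child not in parent.children

print(linked_after_append, removed_after_reset)
```

The same in-memory synchronization happens with backref; back_populates only makes both sides explicit in the class definitions.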

4. cascade

  • Purpose: Defines cascade behavior, i.e. whether an operation on a parent object is automatically applied to its associated child objects.

  • Common values:

    • 'save-update': When the parent object is added to the session or updated, its child objects are added or updated as well.
    • 'delete': When the parent object is deleted, its child objects are deleted as well.
    • 'delete-orphan': A child object is deleted when it is removed from its parent's collection.
    • 'all': Shorthand for save-update, merge, refresh-expire, expunge and delete. It does not include delete-orphan, which is why the two are usually combined.
  • Example:

children = relationship("Child", cascade="all, delete-orphan")
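
The effect of cascade="all, delete-orphan" can be sketched with a small synchronous example. It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database; count_children() is a hypothetical helper added only for this illustration.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", cascade="all, delete-orphan")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)


def count_children(session):
    """Hypothetical helper: number of rows currently in the child table."""
    return session.execute(select(func.count()).select_from(Child)).scalar_one()


with Session(engine) as session:
    parent = Parent(children=[Child(), Child()])
    session.add(parent)        # save-update: both children are added too
    session.commit()
    after_insert = count_children(session)

    parent.children.pop()      # delete-orphan: the removed child is deleted
    session.commit()
    after_orphan = count_children(session)

    session.delete(parent)     # delete: the remaining child is deleted too
    session.commit()
    after_delete = count_children(session)

print(after_insert, after_orphan, after_delete)
```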

5. uselist

  • Purpose: Controls whether the relationship attribute returns a list. Applies to one-to-one and one-to-many relationships.

  • Usage:

    • True: Returns a list (for one-to-many; this is the default).
    • False: Returns a single object (for one-to-one).
  • Example:

parent = relationship("Parent", uselist=False)  # one-to-one relationship

6. order_by

  • Purpose: Defines how the associated objects are sorted when the collection is loaded.

  • Example:

children = relationship("Child", order_by="Child.id")  # Child.id is an illustrative sort column

7. foreign_keys

  • Purpose: Explicitly specifies which columns are the foreign keys of the relationship. Needed when several foreign keys point to the same table.

  • Example:

parent = relationship("Parent", foreign_keys="[Child.parent_id]")

8. primaryjoin

  • Purpose: Explicitly defines the join condition of the relationship. Typically used when SQLAlchemy cannot infer it automatically.

  • Example:

parent = relationship("Parent", primaryjoin="Parent.id == Child.parent_id")

9. secondary

  • Purpose: Defines a many-to-many relationship by specifying the intermediate (association) table.

  • Example:

class Association(Base):
    __tablename__ = 'association'
    # A mapped class needs a primary key; here both foreign keys form a composite key
    parent_id = Column(Integer, ForeignKey('parent.id'), primary_key=True)
    child_id = Column(Integer, ForeignKey('child.id'), primary_key=True)

children = relationship("Child", secondary="association")
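
As a minimal sketch of how secondary behaves, the following example shows that appending to the collection is enough for SQLAlchemy to write the rows of the association table. It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database, and it declares the association as a plain Table (a common alternative to the mapped Association class above); the parents relationship on Child is an addition for this illustration.

```python
from sqlalchemy import Column, ForeignKey, Integer, Table, create_engine, select
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Association table: one row per (parent, child) link.
association = Table(
    "association",
    Base.metadata,
    Column("parent_id", Integer, ForeignKey("parent.id"), primary_key=True),
    Column("child_id", Integer, ForeignKey("child.id"), primary_key=True),
)


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", secondary=association, back_populates="parents")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parents = relationship("Parent", secondary=association, back_populates="children")


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    parent = Parent(children=[Child(), Child()])
    session.add(parent)
    session.commit()  # inserts the parent, both children and two association rows

    link_rows = session.execute(select(association)).all()
    link_parent_ids = {row.parent_id for row in link_rows}
    link_child_ids = sorted(row.child_id for row in link_rows)
    parent_id = parent.id
    child_ids = sorted(child.id for child in parent.children)

print(link_rows)
```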

10. secondaryjoin

  • Purpose: Defines the join condition between the secondary table and the target table. Typically used for complex many-to-many relationships.

  • Example:

children = relationship("Child", secondary="association",
                        secondaryjoin="Child.id == Association.child_id")

11. viewonly

  • Purpose: Defines a read-only relationship; changes made through it are not persisted.

  • Example:

children = relationship("Child", viewonly=True)

12. passive_deletes

  • Purpose: Controls the behavior on deletion. If set to True, SQLAlchemy does not load and delete the associated objects itself but relies on the database's cascading delete (for example ON DELETE CASCADE on the foreign key).

  • Example:

children = relationship("Child", passive_deletes=True)

These parameters can be tuned to specific business needs and scenarios to optimize query and data management strategies.

 

2. Relationship analysis of the user role table

In real business scenarios, organizations and users have a many-to-many relationship. We use the definition of the organization table to analyze that relationship.

Suppose the model for the organization table is defined roughly as follows.

class Ou(Base):
    """Institutional (departmental) information-table model"""
    __tablename__ = "t_acl_ou"
    id = Column(Integer, primary_key=True, comment="primary key", autoincrement=True)
    pid = Column(Integer, ForeignKey("t_acl_ou.id"), comment="Parent organization ID", default="-1")
    handno = Column(String, comment="Organization code")
    name = Column(String, comment="Name of organization")

    # Define parent relationship
    parent = relationship(
        "Ou", remote_side=[id], back_populates="children", lazy="immediate"
    )
    # Define children relationship
    children = relationship("Ou", back_populates="parent", lazy="immediate")
    # Define the users relationship
    users = relationship(
        "User", secondary="t_acl_ou_user", back_populates="ous", lazy="select"
    )

Note that the many-to-many users relationship here is loaded with lazy='select'.

When you fetch an Ou object with await db.get(Ou, ou_id) (where db is an AsyncSession) and then access one of its relationship attributes (e.g. ou.users), you may run into asynchrony-related problems. The reason is that with an asynchronous session, associated data has to be loaded explicitly, with selectinload or another eager loading option, so that it is fetched correctly in an asynchronous environment.

With the default lazy='select', accessing the relationship attribute triggers a synchronous query (implicit I/O). This is incompatible with an asynchronous session and results in an error, typically sqlalchemy.exc.MissingGreenlet. To work around this, make sure the relationship is loaded as part of an awaited call.

Solution:

1. Preload with selectinload

At query time, explicitly pass selectinload to load the associated users relationship:

from sqlalchemy.orm import selectinload

ou = await db.get(Ou, ou_id, options=[selectinload(Ou.users)])

# ou.users can now be accessed; the relationship was already loaded asynchronously
print(ou.users)

 

2. Use lazy='selectin' or another async-compatible loading strategy

You can also set lazy='selectin' as the default loading strategy in the model's relationship definition, so that SQLAlchemy loads the associated collection together with the parent object and no implicit query is needed when the attribute is accessed:

class Ou(Base):
    __tablename__ = 'ou'
    id = Column(Integer, primary_key=True)
    users = relationship("User", lazy='selectin')  # Loaded eagerly with a SELECT ... IN query

ou = await db.get(Ou, ou_id)
print(ou.users)  # The associated objects are already loaded and can be accessed normally

Summary:

  • When accessing relationship attributes in an asynchronous environment, the synchronous lazy='select' strategy leads to async incompatibility errors.
  • The solution is to pass selectinload when querying, or to set the relationship's lazy attribute to an async-compatible option such as selectin.

Therefore, to obtain the organizations associated with a user, we can load the relationship with selectinload, or alternatively go through the association (intermediate) table, as the following code shows. It returns the list of organizations associated with the specified user.

    async def get_ous_by_user(self, db: AsyncSession, user_id: str) -> list[int]:
        """Get the list of organizations associated with the specified user"""
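        # Approach 1: load the collection through the ORM relationship (selectinload)
        stmt = select(User).options(selectinload(User.ous)).where(User.id == user_id)
        result = await db.execute(stmt)
        user = result.scalars().first()
        ous = user.ous if user else []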

        # Approach 2: join through the association table
        # (assumes user_ou has an ou_id column alongside user_id)
        # stmt = (
        #     select(Ou)
        #     .join(user_ou, Ou.id == user_ou.c.ou_id)
        #     .where(user_ou.c.user_id == user_id)
        # )
        # result = await db.execute(stmt)
        # ous = result.scalars().all()

        ouids = [ou.id for ou in ous]
        return ouids

The two approaches are equivalent: the first fetches the collection through the ORM relationship, and the second retrieves the main-table rows by joining through the association table.
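
The equivalence of the two approaches can be checked with a small synchronous sketch. It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database; the table name t_acl_user and the user_id/ou_id column names of the association table are assumptions made for this illustration, and the models are reduced to the columns needed here.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine, select
from sqlalchemy.orm import Session, declarative_base, relationship, selectinload

Base = declarative_base()

# Association table; the column names are assumptions for this sketch.
user_ou = Table(
    "t_acl_ou_user",
    Base.metadata,
    Column("user_id", Integer, ForeignKey("t_acl_user.id"), primary_key=True),
    Column("ou_id", Integer, ForeignKey("t_acl_ou.id"), primary_key=True),
)


class User(Base):
    __tablename__ = "t_acl_user"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    ous = relationship("Ou", secondary=user_ou, back_populates="users")


class Ou(Base):
    __tablename__ = "t_acl_ou"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    users = relationship("User", secondary=user_ou, back_populates="ous")


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    alice = User(id=1, name="alice", ous=[Ou(id=10, name="HQ"), Ou(id=20, name="R&D")])
    session.add_all([alice, Ou(id=30, name="Sales")])  # Sales has no users
    session.commit()

with Session(engine) as session:
    # Approach 1: through the ORM relationship
    stmt = select(User).options(selectinload(User.ous)).where(User.id == 1)
    user = session.execute(stmt).scalars().first()
    ids_via_relationship = sorted(ou.id for ou in user.ous)

    # Approach 2: through the association table
    stmt = (
        select(Ou)
        .join(user_ou, Ou.id == user_ou.c.ou_id)
        .where(user_ou.c.user_id == 1)
    )
    ids_via_join = sorted(ou.id for ou in session.execute(stmt).scalars().all())

print(ids_via_relationship, ids_via_join)
```

The join variant never loads the User row, which can matter when only the organization IDs are needed.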

The association table also makes it easy to add relationship rows. For example, the following code adds a user to a role by working on the association table directly.

    async def add_user(self, db: AsyncSession, role_id: int, user_id: int) -> bool:
        """Adding Roles - User Associations"""
        stmt = select(user_role).where(
            and_(
                user_role.c.role_id == role_id,
                user_role.c.user_id == user_id,
            )
        )

        # Insert the link only if it does not exist yet
        if (await db.execute(stmt)).first() is None:
            await db.execute(
                user_role.insert().values(role_id=role_id, user_id=user_id)
            )
            await db.commit()
            return True

        return False
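
The check-then-insert logic can be tried out in isolation with a synchronous SQLAlchemy Core sketch. It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database; the user_role table definition below is an assumption, since the article does not show it, and this add_user takes a connection instead of an AsyncSession.

```python
from sqlalchemy import Column, Integer, MetaData, Table, and_, create_engine, select

metadata = MetaData()

# Assumed shape of the association table: one row per (role, user) link.
user_role = Table(
    "user_role",
    metadata,
    Column("role_id", Integer, primary_key=True),
    Column("user_id", Integer, primary_key=True),
)

engine = create_engine("sqlite:///:memory:")
metadata.create_all(engine)


def add_user(conn, role_id, user_id):
    """Insert the link if it is missing; return True only when a row was added."""
    stmt = select(user_role).where(
        and_(
            user_role.c.role_id == role_id,
            user_role.c.user_id == user_id,
        )
    )
    if conn.execute(stmt).first() is None:
        conn.execute(user_role.insert().values(role_id=role_id, user_id=user_id))
        return True
    return False


with engine.begin() as conn:
    first_call = add_user(conn, role_id=1, user_id=42)
    second_call = add_user(conn, role_id=1, user_id=42)
    link_count = len(conn.execute(select(user_role)).all())

print(first_call, second_call, link_count)
```

With concurrent writers the check and the insert can race, so a primary key or unique constraint on (role_id, user_id), as assumed here, remains the final safeguard.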

Of course, if we do not go through the association table, we can also add the link through the regular many-to-many relationship. This requires loading a bit more data, however, so performance may be somewhat worse.

    async def add_user(self, db: AsyncSession, ou_id: int, user_id: int) -> bool:
        """Adding users to an organization"""
        # This can be done as follows, or by using the association table
        # First check that the user exists
        user = await db.get(User, user_id)
        if not user:
            return False

        # Then check that the organization exists, loading its users eagerly
        result = await db.execute(
            select(Ou).options(selectinload(Ou.users)).filter_by(id=ou_id)
        )
        # await db.get(Ou, ou_id)  # users is not loaded this way, because the relationship is configured with lazy="select"
        # await db.get(Ou, ou_id, options=[selectinload(Ou.users)])  # this way users is loaded
        ou = result.scalars().first()
        if not ou:
            return False

        # Then check whether the user already belongs to the organization
        if user in ou.users:
            return False

        # Add the user to the organization
        ou.users.append(user)
        await db.commit()
        return True