Streamlit
When a Streamlit application is running, each user interaction triggers a re-execution of the entire script. This means that time-consuming operations, such as data loading, complex calculations, and model training, may be repeated, seriously slowing the application's response.
This article introduces the cache mechanism, which helps solve these problems and improve the performance of Streamlit applications.
Streamlit's cache mechanism is like equipping the app with a "memory assistant": it lets developers save the result of a specific function, so that when the function is called again with the same input, it is not re-executed and the cached result is returned directly. This greatly improves efficiency and reduces waiting time.
1. Why caching is needed
Streamlit re-runs the entire script on every user interaction or code change, which leads to:
- Repeated calculations: long-running functions may be called many times, slowing the application's response
- Wasted resources: repeatedly loading and processing large amounts of data consumes a lot of memory and computing power
- Poor user experience: long loading times hurt the interactive experience
To solve these problems, Streamlit provides a caching mechanism. A cached function is not re-executed when it is called again with the same input; the stored result is returned directly.
By caching a function's output and avoiding repeated calculations, the cache mechanism can significantly improve the performance and responsiveness of the application.
2. Two cache decorators
Streamlit provides two cache decorators: `st.cache_data` and `st.cache_resource`. Their main differences lie in the type of object cached and the usage scenarios.
2.1. st.cache_data
`st.cache_data` is the decorator for caching data.
It is suitable for caching a function's output, especially for functions that return serializable data objects (e.g. Pandas DataFrames, NumPy arrays, strings, integers).
Its main parameters are:
- `ttl`: the cache's time to live, in seconds. After this time the entry expires and the result is recalculated.
- `max_entries`: the maximum number of entries allowed in the cache. Beyond this number, the oldest entry is evicted.
- `persist`: whether to persist the cache to disk. Defaults to `False`.
- `show_spinner`: whether to display a loading spinner while the function runs. Defaults to `True`.

Note that `allow_output_mutation`, which appears in some older tutorials, belongs to the deprecated `st.cache` decorator and is not a parameter of `st.cache_data`.
2.2. st.cache_resource
`st.cache_resource` is the decorator for caching resources.
It is suitable for caching objects that are expensive to initialize but do not need frequent recomputation, such as database connections and loaded models.
Its main parameters are:
- `ttl` and `max_entries`: same as in `st.cache_data`.
- `show_spinner`: whether to display a loading spinner. Defaults to `True`.
2.3. Summary of the differences between the two
| | st.cache_data | st.cache_resource |
|---|---|---|
| Use scenarios | Caching a function's output, especially serializable data objects | Caching objects that are expensive to initialize but rarely need recomputation, such as database connections or loaded models |
| Characteristics | Caches the function's output result; suited to frequent calls whose result may change | Caches the resource object itself; suited to time-consuming initialization that rarely needs updating |
| Example cached content | Fetching data from an API, loading CSV files, data processing | Loading pretrained models, establishing database connections |
3. Cache usage example
The following examples demonstrate how to use the two cache decorators.
3.1. st.cache_data example
Suppose we have an application that needs to fetch data from an API and display it to the user.
Since the data may take a long time to load, we can use `st.cache_data` to cache the result.
```python
import streamlit as st
import requests
import pandas as pd

# Use st.cache_data to cache data loading
@st.cache_data(ttl=3600)  # cache for 1 hour
def fetch_data(api_url):
    response = requests.get(api_url)
    data = response.json()
    df = pd.DataFrame(data)
    return df

# User interface
st.title("Cached data loading with st.cache_data")
api_url = "/posts"
df = fetch_data(api_url)
st.dataframe(df)
```
In this example, the `fetch_data` function is decorated with `@st.cache_data`.
On the first call, the data is loaded and cached. Subsequent calls read directly from the cache instead of requesting the API again, until the cache expires after 1 hour, at which point the data is re-requested.
3.2. st.cache_resource example
Suppose we have a machine learning application that needs to load a pretrained model.
Since the model can take a long time to load, we can use `st.cache_resource` to cache the model object.
```python
import streamlit as st
import joblib

# Use st.cache_resource to cache model loading
@st.cache_resource
def load_model(model_path):
    model = joblib.load(model_path)
    return model

# User interface
st.title("Cached model loading with st.cache_resource")
model_path = "path/to/your/"
model = load_model(model_path)
st.write("The model is loaded and ready to predict!")
```
In this example, the `load_model` function is decorated with `@st.cache_resource`.
The model is cached after it is first loaded; subsequent calls read it directly from the cache instead of reloading it.
4. Summary
Through `st.cache_data` and `st.cache_resource`, Streamlit's cache mechanism provides powerful performance optimization capabilities.
They help developers avoid duplicate calculations, save resources, and significantly improve application responsiveness.
In actual development, choose the appropriate cache decorator according to your needs:
- to cache a function's output result, use `st.cache_data`
- to cache an initialized resource object, use `st.cache_resource`
Used sensibly, the cache mechanism makes a Streamlit application more efficient and smooth, improving the user experience.