Streamlit
When a Streamlit application is running, each user interaction triggers a re-execution of the entire script. This means that time-consuming operations, such as data loading, complex calculations, and model training, may be repeated, seriously slowing the application's response.
This article introduces the cache mechanism, which helps solve these problems and improve the performance of Streamlit applications.
Streamlit's cache mechanism is like equipping the app with a "memory assistant": it lets developers save the result of a specific function, so that when the function is called again with the same input, it is not re-executed and the cached result is returned directly. This greatly improves efficiency and reduces waiting time.
1. Why caching is needed
Streamlit re-runs the entire script on every user interaction or code change, which leads to:
- Repeated calculations: long-running functions may be called many times, slowing the application's response
- Wasted resources: repeatedly loading and processing large amounts of data consumes a lot of memory and computing power
- Poor user experience: long loading times hurt the interactive experience
To solve these problems, Streamlit provides a caching mechanism. A cached function is not re-executed when it is called again with the same input; the stored result is returned directly.
By caching a function's output and avoiding repeated calculations, the cache mechanism can significantly improve the performance and responsiveness of the application.
2. Two cache decorators
Streamlit provides two cache decorators: `st.cache_data` and `st.cache_resource`. Their main differences lie in the type of object cached and the usage scenarios.
2.1. st.cache_data
`st.cache_data` is the decorator for caching data.
It is suitable for caching a function's output, especially for functions that return serializable data objects (e.g. Pandas DataFrames, NumPy arrays, strings, integers).
Its main parameters are:
- `ttl`: the cache's time to live, in seconds. After this time the entry expires and the result is recalculated.
- `max_entries`: the maximum number of entries allowed in the cache. Beyond this number, the oldest entry is evicted.
- `persist`: whether to persist the cache to disk. Defaults to `False`.
- `show_spinner`: whether to display a loading spinner while the function runs. Defaults to `True`.

Note that `allow_output_mutation`, which appears in some older tutorials, belongs to the deprecated `st.cache` decorator and is not a parameter of `st.cache_data`.
2.2. st.cache_resource
`st.cache_resource` is the decorator for caching resources.
It is suitable for caching objects that are expensive to initialize but do not need frequent recomputation, such as database connections and loaded models.
Its main parameters are:
- `ttl` and `max_entries`: same as in `st.cache_data`.
- `show_spinner`: whether to display a loading spinner. Defaults to `True`.
2.3. Summary of the differences between the two
| | st.cache_data | st.cache_resource |
|---|---|---|
| Use scenarios | Caching a function's output, especially serializable data objects | Caching objects that are expensive to initialize but rarely need recomputation, such as database connections or loaded models |
| Characteristics | Caches the function's output result; suited to frequent calls whose result may change | Caches the resource object itself; suited to time-consuming initialization that rarely needs updating |
| Example cached content | Fetching data from an API, loading CSV files, data processing | Loading pretrained models, establishing database connections |
3. Cache usage example
The following examples demonstrate how to use the two cache decorators.
3.1. st.cache_data example
Suppose we have an application that needs to fetch data from an API and display it to the user.
Since the data may take a long time to load, we can use `st.cache_data` to cache the result.
```python
import streamlit as st
import requests
import pandas as pd

# Use st.cache_data to cache data loading
@st.cache_data(ttl=3600)  # cache for 1 hour
def fetch_data(api_url):
    response = requests.get(api_url)
    data = response.json()
    df = pd.DataFrame(data)
    return df

# User interface
st.title("Cached data loading with st.cache_data")
api_url = "/posts"
df = fetch_data(api_url)
st.dataframe(df)
```
In this example, the `fetch_data` function is decorated with `@st.cache_data`.
On the first call, the data is loaded and cached. Subsequent calls read directly from the cache instead of requesting the API again, until the cache expires after 1 hour, at which point the data is re-requested.
3.2. st.cache_resource example
Suppose we have a machine learning application that needs to load a pretrained model.
Since the model can take a long time to load, we can use `st.cache_resource` to cache the model object.
```python
import streamlit as st
import joblib

# Use st.cache_resource to cache model loading
@st.cache_resource
def load_model(model_path):
    model = joblib.load(model_path)
    return model

# User interface
st.title("Cached model loading with st.cache_resource")
model_path = "path/to/your/"
model = load_model(model_path)
st.write("The model is loaded and ready to predict!")
```
In this example, the `load_model` function is decorated with `@st.cache_resource`.
The model is cached after it is first loaded; subsequent calls read it directly from the cache instead of reloading it.
4. Summary
Through `st.cache_data` and `st.cache_resource`, Streamlit's cache mechanism provides powerful performance optimization capabilities.
They help developers avoid duplicate calculations, save resources, and significantly improve application responsiveness.
In actual development, choose the appropriate cache decorator according to your needs:
- to cache a function's output result, use `st.cache_data`
- to cache an initialized resource object, use `st.cache_resource`
Used sensibly, the cache mechanism makes a Streamlit application more efficient and smooth, improving the user experience.