1. Core upgrade highlights at a glance
🚀 Multi-context engine sharing: efficient inference, maximum hardware utilization
TensorRT-YOLO 6.0 introduces an innovative multi-context engine-sharing mechanism that lets multiple threads run inference on the same engine, maximizing hardware utilization and significantly reducing memory footprint. This design makes concurrent multi-task inference far more efficient, and is especially well suited to scenarios such as multi-stream video processing or large-scale batch inference.
Core advantages:
- Weight sharing: multiple contexts can share the same ICudaEngine, so the model's weights and parameters are kept only once in host or device memory, greatly reducing the memory footprint.
- Memory optimization: although each context allocates its own input/output buffers, the overall memory footprint does not grow linearly with the number of contexts, optimizing resource utilization.
- Multi-threaded inference: multiple threads can use the same ICudaEngine simultaneously, each creating its own IExecutionContext and running inference independently, making full use of the GPU's parallel compute capability.
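The clone-per-thread pattern above can be sketched with a minimal mock. The `SharedEngine` and `Context` classes here are hypothetical stand-ins for `ICudaEngine` and `IExecutionContext`, not the TensorRT API; they only illustrate the sharing pattern:

```python
import threading

# Hypothetical stand-ins for ICudaEngine / IExecutionContext -- NOT the
# TensorRT API; they only illustrate the sharing pattern described above.
class SharedEngine:
    def __init__(self, weights):
        self.weights = weights  # weights are loaded once and shared


class Context:
    """Per-thread execution context: private I/O buffers, shared engine."""
    def __init__(self, engine):
        self.engine = engine   # shared, no extra weight copy
        self.buffers = []      # independent input/output buffers

    def predict(self, x):
        self.buffers.append(x)           # uses its private buffers
        return x * self.engine.weights   # uses the shared weights


engine = SharedEngine(weights=2)
contexts = [Context(engine) for _ in range(4)]  # 4 threads, 1 engine

results = [None] * 4

def worker(i):
    results[i] = contexts[i].predict(i + 1)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)  # each context computed independently against shared weights
```

Because every `Context` holds a reference to the same `SharedEngine`, only one copy of the weights exists no matter how many threads run, which is the source of the memory savings measured below.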
📊 Memory footprint comparison test
Model instances | Clone mode | Native mode | Resource savings |
---|---|---|---|
1 | 408MB | 408MB | - |
2 | 536MB | 716MB | 25.1% |
3 | 662MB | 1092MB | 39.4% |
4 | 790MB | 1470MB | 46.3% |
Test environment: AMD Ryzen 7 5700X + RTX 2080 Ti 22GB + YOLO11x
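As a sanity check, the "Resource savings" column follows directly from savings = 1 − clone/native, using the MB figures in the table:

```python
# Recompute the "Resource savings" column from the table's MB figures
rows = [(2, 536, 716), (3, 662, 1092), (4, 790, 1470)]  # (instances, clone, native)
savings = [round((1 - clone / native) * 100, 1) for _, clone, native in rows]
print(savings)  # -> [25.1, 39.4, 46.3], matching the table
```

The savings percentage grows with the instance count because native mode duplicates the full engine each time, while clone mode pays only the per-context buffer cost.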
💾 Fine-grained memory management: three modes precisely matched to the hardware, unlocking its potential
TensorRT-YOLO 6.0 deeply optimizes memory management. The BaseBuffer base class is designed around three memory-management modes that precisely match different hardware platforms and application scenarios, maximizing hardware performance. The library detects the hardware type and selects the optimal mode by default, while also supporting manual configuration for specialized needs.
📋 The three memory-management modes compared
Mode | DiscreteBuffer | MappedBuffer | UnifiedBuffer |
---|---|---|---|
Applicable scenario | 🖥️ Desktop GPU | 📱 Edge device | ⚙️ Explicit user configuration |
Trigger condition | Automatic selection | Automatic selection | enable_managed_memory() |
Core technology | Explicit PCIe copies | Zero-copy | CUDA unified memory |
Memory efficiency | High throughput | Ultra-low latency | Flexible balance |
🔀 Smart switching logic: by default the library picks DiscreteBuffer on discrete desktop GPUs and MappedBuffer on edge devices with integrated memory; UnifiedBuffer is used only when enable_managed_memory() is called explicitly.
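The selection logic implied by the table can be sketched as follows. The function name and its flags are illustrative reconstructions, not the library's actual API:

```python
def select_buffer_mode(is_edge_device: bool, managed_memory_enabled: bool) -> str:
    """Illustrative reconstruction of the mode selection implied by the table."""
    if managed_memory_enabled:      # user called enable_managed_memory()
        return "UnifiedBuffer"      # CUDA unified memory
    if is_edge_device:              # e.g. integrated-memory edge hardware
        return "MappedBuffer"       # zero-copy
    return "DiscreteBuffer"         # desktop GPU, explicit PCIe copies
```

The explicit opt-in always wins; otherwise the choice reduces to whether the device shares physical memory between CPU and GPU.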
🎛️ Free customization: flexible adaptation to diverse scenarios
TensorRT-YOLO 6.0 gives developers highly flexible inference configuration through the InferOption structure, supporting multi-dimensional parameter tuning. The table below lays out the core options:
Category | Configuration item | Description |
---|---|---|
Hardware resource management | ⚙️ set_device_id(id) | Runs the inference task on the specified GPU device ID, ensuring the task executes on the intended device. |
Memory optimization | 💾 enable_cuda_memory() | When the input data already resides in CUDA memory, it is reused directly, avoiding extra data-transfer overhead and improving inference efficiency. |
Memory optimization | 🌐 enable_managed_memory() | Enables CUDA unified memory management, optimizing data access between host and device memory and reducing memory-copy overhead. |
Preprocessing | 🔄 set_swap_rb() | Automatically swaps the RGB/BGR channel order of the input data to match the input-format requirements of different frameworks. |
Preprocessing | 📏 set_normalize_params(mean, std) | Sets the mean and standard deviation used to normalize input data, adapting to non-standard datasets. |
Preprocessing | 🖼️ set_border_value(value) | Sets the border value used when padding images, ensuring the input size meets the model's requirements. |
Performance tuning | 🚀 enable_performance_report() | Generates a detailed inference report to aid performance analysis and optimization. |
Input control | 📐 set_input_dimensions(width, height) | Forces the width and height of the input data, suited to fixed-resolution tasks (e.g. game AI, surveillance-video analysis). |
📦 Minimalist deployment interface: one unified API, no more decision fatigue
TensorRT-YOLO 6.0 consolidates the five task types into intuitive API interfaces, simplifying deployment and improving development efficiency:
Task type | New interface | Old interfaces |
---|---|---|
🏷️ Image classification | ClassifyModel | DeployCls, DeployCGCls |
🎯 Object detection | DetectModel | DeployDet, DeployCGDet |
🧭 Oriented (rotated) object detection | OBBModel | DeployOBB, DeployCGOBB |
✂️ Instance segmentation | SegmentModel | DeploySeg, DeployCGSeg |
💃 Keypoint (pose) detection | PoseModel | DeployPose, DeployCGPose |
2. Hands-on code walkthrough
🐍 Python demo
```python
import cv2
from tensorrt_yolo.infer import InferOption, DetectModel, generate_labels, visualize


def main():
    # -------------------- Configure inference options --------------------
    option = InferOption()
    option.enable_swap_rb()  # OpenCV loads images as BGR; convert to RGB
    # Special model configuration (e.g. for the PP-YOLOE series, uncomment below)
    # option.set_normalize_params([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])

    # -------------------- Load the TensorRT engine file --------------------
    # Note: the first engine load may take a while due to optimization
    model = DetectModel(engine_path="", option=option)

    # -------------------- Load the test image --------------------
    input_img = cv2.imread("test_image.jpg")
    if input_img is None:
        raise FileNotFoundError("Failed to load test image; please check the file path")

    # ------ Run detection (results include boxes, confidences, classes) ------
    detection_result = model.predict(input_img)
    print(f"==> detection_result: {detection_result}")

    # ---------- Load class labels (must match the model) ----------
    class_labels = generate_labels(labels_file="")

    # -------------------- Visualize and save the result --------------------
    visualized_img = visualize(
        image=input_img,
        result=detection_result,
        labels=class_labels,
    )
    cv2.imwrite("vis_image.jpg", visualized_img)

    # ---------- Clone the model (for multi-threaded scenarios) ----------
    cloned_model = model.clone()  # independent copy, avoids resource contention
    # Verify that the cloned model produces consistent results
    cloned_result = cloned_model.predict(input_img)
    print(f"==> cloned_result: {cloned_result}")


if __name__ == "__main__":
    main()
```
⚙️ C++ demo
```cpp
#include <iostream>
#include <memory>
#include <opencv2/opencv.hpp>

// Apart from CUDA and TensorRT, all other modules use only the standard library
#include "deploy/model.hpp"   // model interfaces
#include "deploy/option.hpp"  // inference options
#include "deploy/result.hpp"  // inference results

int main() {
    try {
        // -------------------- Configure inference options --------------------
        deploy::InferOption option;
        option.enableSwapRB();  // BGR -> RGB conversion
        // Special model parameter example:
        // const std::vector<float> mean{0.485F, 0.456F, 0.406F};
        // const std::vector<float> std{0.229F, 0.224F, 0.225F};
        // option.setNormalizeParams(mean, std);

        // -------------------- Create the detector --------------------
        auto detector = std::make_unique<deploy::DetectModel>(
            "",     // model path
            option  // inference options
        );

        // -------------------- Load the test image --------------------
        cv::Mat cv_image = cv::imread("test_image.jpg");
        if (cv_image.empty()) {
            throw std::runtime_error("Failed to load test image");
        }

        // Wrap the image data (the pixel data is not copied)
        deploy::Image input_image(
            cv_image.data,  // pixel data pointer
            cv_image.cols,  // image width
            cv_image.rows   // image height
        );

        // -------------------- Run detection --------------------
        deploy::DetResult result = detector->predict(input_image);
        std::cout << result << std::endl;

        // -------------------- Visualization (sketch) --------------------
        // Real projects add their own drawing logic, for example:
        // cv::Mat vis_image = visualize_detection(cv_image, result);
        // cv::imwrite("vis_result.jpg", vis_image);

        // -------- Clone the model (for multi-threaded scenarios) --------
        auto cloned_detector = detector->clone();  // independent instance
        deploy::DetResult cloned_result = cloned_detector->predict(input_image);
        // Verify result consistency
        std::cout << cloned_result << std::endl;
    } catch (const std::exception& e) {
        std::cerr << "Program exception: " << e.what() << std::endl;
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
```
3. A panoramic outlook on application scenarios
🏭 Industrial Quality Inspection 4.0 solutions
- Microsecond-level defect detection: 0.1 mm-accuracy inspection on a 200 m/s production line
- Multi-camera synchronous processing: real-time analysis of 8-channel 4K camera data
🌆 Smart city hub
- Real-time analysis of 400 video streams: supports city-scale AI supervision
- Dynamic resource scheduling: computing resources adjust automatically during morning and evening rush hours
🚗 Autonomous-driving perception upgrade
- Multi-modal data fusion: joint LiDAR + camera inference
- Safety redundancy by design: dual-context cross-verification mechanism
4. Ecosystem: a panorama of developer resources
Resource type | How to get it | Contents |
---|---|---|
Supported models | View the supported-model list | Supports the full YOLOv3 through YOLOv11 series, plus PP-YOLOE and PP-YOLOE+, covering object detection, instance segmentation, image classification, pose estimation, oriented object detection, and other task scenarios. |
Toolchain | Get the Dockerfile | Provides an integrated development-environment image, simplifying environment setup and accelerating project startup. |
Enterprise support | Contact by email: LAUGH12321@ | Offers a customized SDK and technical white paper to help enterprises integrate and deploy quickly. |
Community forum | Join the discussion | Real-time technical Q&A and case sharing to solve problems together and accelerate project progress. |
Try it now: GitHub repository | Examples | Quick start