In the previous article we deployed ArgoWorkflow and created a simple pipeline as a demo. This article analyzes the pipeline-related concepts in ArgoWorkflow, because understanding them is the key to using ArgoWorkflow well.
This article covers the following questions:
- 1) How do you create a pipeline, and what does each parameter in a Workflow mean?
- 2) How is a WorkflowTemplate used?
- 3) How do Workflow, WorkflowTemplate, and template reference one another?
- 4) What are the best practices for ArgoWorkflow pipelines?
1. Basic concepts
The following concepts are included in ArgoWorkflow:
- Workflow: a pipeline, or more precisely a real running instance of a pipeline, similar to a PipelineRun in Tekton.
- WorkflowTemplate: a pipeline template from which pipelines can be created, similar to a Pipeline in Tekton.
- ClusterWorkflowTemplate: a cluster-level pipeline template; its relationship to WorkflowTemplate is like that of ClusterRole to Role in K8s.
- templates: the smallest building block of a Workflow or WorkflowTemplate/ClusterWorkflowTemplate. A pipeline consists of a number of templates, each of which can be understood as a step in the pipeline.
For now, WorkflowTemplate and ClusterWorkflowTemplate are collectively referred to as Template (uppercase).
The relationship between Workflow, Template (uppercase), and template (lowercase) is complicated, and the official documentation also mentions that the naming in this area is confusing for historical reasons.
Personally, I find the following the easiest way to understand it:
- template (lowercase): the basic unit of a Template (uppercase); it can be understood as a step in the pipeline.
- Template (uppercase): a complete pipeline, typically consisting of multiple templates (lowercase).
- Workflow: a real running pipeline instance, generally created from a Template. It is similar to a pipeline run record: each record is a Workflow.
With the basic concepts clear, let's look at each object in detail.
Workflow
Workflow is the most important resource in Argo and serves two important functions:
- 1) Workflow definition
- 2) Workflow state storage
Let's take a look at what Workflow looks like. Here's a simple Workflow example:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: steps-
spec:
  entrypoint: hello            # We reference our first "template" here
  templates:
  - name: hello                # The first "template" in this Workflow, it is referenced by "entrypoint"
    steps:                     # The type of this "template" is "steps"
    - - name: hello
        template: whalesay     # We reference our second "template" here
        arguments:
          parameters: [{name: message, value: "hello"}]
  - name: whalesay             # The second "template" in this Workflow, it is referenced by "hello"
    inputs:
      parameters:
      - name: message
    container:                 # The type of this "template" is "container"
      image: docker/whalesay
      command: [cowsay]
      args: ["{{inputs.parameters.message}}"]
The core content of a Workflow object is divided into the following three parts:
- templates: a list of templates, where all the steps in the pipeline and the order between them are defined.
- entrypoint: the pipeline entry, similar to the main method in code; it generally references one of the template invocators.
- parameters: the parameters used in the pipeline, including global parameters in the arguments block and local parameters in the inputs block.
entrypoint
The entrypoint must be specified in a Workflow. It is the starting point of execution, similar to the main method in a program.
templates
ArgoWorkflow's six core template types are analyzed one by one below.
container
Its spec is the same as the Kubernetes container spec. This type of template starts a container, and the user can specify the image, command, args, and so on to control what it does.
- name: whalesay
  container:
    image: docker/whalesay
    command: [cowsay]
    args: ["hello world"]
script
script is actually a wrapper around container: the spec is the same as container, plus a source field for defining a script. The script's standard output is captured in {{tasks.<NAME>.outputs.result}} or {{steps.<NAME>.outputs.result}}, depending on how the template is invoked.
In short, script simplifies configuring a container that executes a script.
- name: gen-random-int
  script:
    image: python:alpine3.6
    command: [python]
    source: |
      import random
      i = random.randint(1, 100)
      print(i)
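As a sketch of how a script result can be consumed, the Workflow below passes the output of a gen-random-int step to a second step through {{steps.generate.outputs.result}}. The names main, generate, print, and print-message, the generateName, and the alpine image are assumptions made for this illustration.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: scripts-result-      # hypothetical name
spec:
  entrypoint: main
  templates:
  - name: main
    steps:
    - - name: generate               # runs the script template
        template: gen-random-int
    - - name: print                  # runs after "generate" and receives its result
        template: print-message
        arguments:
          parameters:
          - name: message
            value: "{{steps.generate.outputs.result}}"
  - name: gen-random-int
    script:
      image: python:alpine3.6
      command: [python]
      source: |
        import random
        i = random.randint(1, 100)
        print(i)
  - name: print-message              # hypothetical consumer template
    inputs:
      parameters:
      - name: message
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["echo result was: {{inputs.parameters.message}}"]
```

If main were a dag template instead, the same value would be referenced as {{tasks.generate.outputs.result}}.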
resource
A template of type resource operates on resources in the cluster. The action parameter specifies the operation and supports get, create, apply, delete, replace, and patch.
- name: k8s-owner-reference
  resource:
    action: create
    manifest: |
      apiVersion: v1
      kind: ConfigMap
      metadata:
        generateName: owned-eg-
      data:
        some: value
suspend
A template of type suspend is simple: it pauses pipeline execution.
By default it blocks until the user resumes it manually with the argo resume command. Alternatively, the duration parameter specifies a pause time, after which the pipeline resumes automatically.
- name: delay
  suspend:
    duration: "20s"
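For the manual case, a suspend template with no duration can act as an approval gate between two steps. The following is only a sketch: the step names build, approve, and release and the generateName are hypothetical.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: suspend-approval-    # hypothetical name
spec:
  entrypoint: main
  templates:
  - name: main
    steps:
    - - name: build
        template: whalesay
    - - name: approve                # the pipeline pauses here
        template: approve
    - - name: release                # runs only after the workflow is resumed
        template: whalesay
  - name: approve
    suspend: {}                      # no duration: blocks until resumed manually
  - name: whalesay
    container:
      image: docker/whalesay
      command: [cowsay]
      args: ["hello world"]
```

Once the workflow reaches the approve step, running argo resume with the workflow's name lets it continue to release.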
steps
steps handles the relationship between templates, which comes down to two things:
- 1) Which tasks need to be run
- 2) In what order these tasks run
Look at the example below:
- name: hello-hello-hello
  steps:
  - - name: step1
      template: prepare-data
  - - name: step2a
      template: run-data-first-half
    - name: step2b
      template: run-data-second-half
What tasks need to be run?
The steps field defines 3 steps to run: step1, step2a, and step2b.
In what order do these tasks run?
The order in which the elements of steps are defined is the order in which they are executed. In this case, step1 runs first, then step2a and step2b run in parallel.
Note: look closely at the YAML and you will see that step2a and step2b are in the same element. steps is a two-dimensional array, defined as follows:
type Template struct {
    // Steps is the outer list; its elements run one after another.
    Steps []ParallelSteps `json:"steps,omitempty" protobuf:"bytes,11,opt,name=steps"`
}
type ParallelSteps struct {
    // Steps is the inner list; its elements run in parallel.
    Steps []WorkflowStep `json:"-" protobuf:"bytes,1,rep,name=steps"`
}
Simplified to JSON, with only the step names kept, it looks like this:
{
  "steps": [
    ["step1"],
    ["step2a", "step2b"]
  ]
}
This makes the execution order clear: the outer elements run one after another, and the steps inside one element run in parallel.
dag
dag templates serve the same purpose as steps.
DAG here stands for Directed Acyclic Graph.
The difference between dag and steps lies in how task order is defined:
- steps uses the order of definition as the order of execution.
- dag defines the dependencies between tasks, and Argo derives the execution order from those dependencies.
Look at the example below:
- name: diamond
  dag:
    tasks:
    - name: A
      template: echo
    - name: B
      dependencies: [A]
      template: echo
    - name: C
      dependencies: [A]
      template: echo
    - name: D
      dependencies: [B, C]
      template: echo
Compared with steps, each task in a dag has an extra dependencies field, which lists the tasks that the current task depends on.
What tasks need to be run?
The tasks field defines 4 tasks to run: A, B, C, and D.
In what order do these tasks run?
This is not as obvious as with steps; you have to work the order out from the dependencies.
A has no dependencies, so it runs first. B and C both depend only on A, so they run in parallel after A. D depends on B and C, so it runs after both B and C have finished.
Expressed in the same simplified JSON form as the steps example, the order is:
{
  "steps": [
    ["A"],
    ["B", "C"],
    ["D"]
  ]
}
P.S. steps is the more direct of the two: the order of tasks is clear at a glance. If the order of all tasks in the workflow is clear, steps is recommended. If the workflow is complicated and you only know the dependencies between tasks, use dag and let ArgoWorkflow work out the order.
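To make the comparison concrete, here is a sketch of the same diamond written as a steps template. The name diamond-steps is hypothetical, and the echo template is assumed to be defined elsewhere in the same Workflow, as in the dag example.

```yaml
- name: diamond-steps      # hypothetical name for this sketch
  steps:
  - - name: A              # first group: A runs alone
      template: echo
  - - name: B              # second group: B and C share one inner list,
      template: echo       # so they run in parallel after A
    - name: C
      template: echo
  - - name: D              # third group: D runs after B and C have finished
      template: echo
```

With steps the order is read straight off the nesting, whereas the dag version states only the dependencies and leaves the grouping to Argo.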
template definitions & template invocators
As you can see, the steps and dag templates differ from the other four in that they can reference multiple templates.
The six templates introduced above can be divided by role into two kinds: template definitions and template invocators.
- template definitions: templates of this kind define the specific work a step performs, like the whalesay template in the example.
  - Includes the container, script, resource, and suspend types
- template invocators: templates of this kind combine other template definitions and define the execution order between steps, like the hello template in the example.
  - Normally an entrypoint points to a template of this kind.
  - Includes the dag and steps types; the hello template in the example is of type steps.
An aside: the template concept is a bit convoluted, and it would be clearer if template definitions and template invocators were split into two different objects.
With the template classification understood, the earlier Workflow example becomes clearer on a second look:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: steps-
spec:
  entrypoint: hello            # We reference our first "template" here
  templates:
  - name: hello                # The first "template" in this Workflow, it is referenced by "entrypoint"
    steps:                     # The type of this "template" is "steps"
    - - name: hello
        template: whalesay     # We reference our second "template" here
        arguments:
          parameters: [{name: message, value: "hello"}]
  - name: whalesay             # The second "template" in this Workflow, it is referenced by "hello"
    inputs:
      parameters:
      - name: message
    container:                 # The type of this "template" is "container"
      image: docker/whalesay
      command: [cowsay]
      args: ["{{inputs.parameters.message}}"]
- 1) First, whalesay is a template of type container, i.e. a template definition.
- 2) Second, hello is a template of type steps, i.e. a template invocator.
  - The steps field in this invocator defines a step named hello, which references the whalesay template.
- 3) entrypoint specifies hello, the template invocator.
Next is another important field in Workflow, entrypoint.
entrypoint
entrypoint is the starting point of execution, similar to the main method in a program. It must be specified in every Workflow.
Attention: only the tasks reachable from entrypoint will run. For this reason, entrypoint generally points to a template of type steps or dag, i.e. a template invocator, which then specifies multiple tasks through steps or through tasks in the dag.
Therefore, not every template written in a Workflow will be executed.
Look at the example below:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: steps-
spec:
  entrypoint: hello            # We reference our first "template" here
  templates:
  - name: hello                # The first "template" in this Workflow, it is referenced by "entrypoint"
    steps:                     # The type of this "template" is "steps"
    - - name: hello
        template: whalesay     # We reference our second "template" here
        arguments:
          parameters: [{name: message, value: "hello"}]
  - name: whalesay             # The second "template" in this Workflow, it is referenced by "hello"
    inputs:
      parameters:
      - name: message
    container:                 # The type of this "template" is "container"
      image: docker/whalesay
      command: [cowsay]
      args: ["{{inputs.parameters.message}}"]
entrypoint specifies hello, a template of type steps, i.e. a template invocator. The steps of the hello template then reference the whalesay template, which is of type container, i.e. a template definition. That is the task that finally runs.
Of course, entrypoint can also point to a template definition, but then only that one task runs, like this:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: steps-
spec:
  entrypoint: whalesay
  templates:
  - name: whalesay
    container:
      image: docker/whalesay
      command: [cowsay]
      args: ["hello"]
At this point, we should have basically figured out the Workflow object, except for parameters, which is the last part. Before that, a demo.
Demo
Here is a slightly more complex Workflow to check whether you really understand Workflow.
It is a Workflow with 4 tasks:
- 1) First print hello
- 2) Then execute a Python script to generate a random number
- 3) Sleep for 20s
- 4) Create a ConfigMap
Both a steps version and a dag version (commented out) are provided for comparison.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: steps-
spec:
  entrypoint: hello
  templates:
  - name: hello
    steps:
    - - name: hello
        template: whalesay
        arguments:
          parameters: [{name: message, value: "hello"}]
    - - name: runscript
        template: gen-random-int
    - - name: sleep
        template: delay
    - - name: create-cm
        template: k8s-owner-reference
  # - name: diamond
  #   dag:
  #     tasks:
  #     - name: hello
  #       template: whalesay
  #       arguments:
  #         parameters: [{name: message, value: "hello"}]
  #     - name: runscript
  #       dependencies: [hello]
  #       template: gen-random-int
  #     - name: sleep
  #       template: delay
  #       dependencies: [runscript]
  #     - name: create-cm
  #       template: k8s-owner-reference
  #       dependencies: [sleep]
  - name: whalesay
    inputs:
      parameters:
      - name: message
    container:
      image: docker/whalesay
      command: [cowsay]
      args: ["{{inputs.parameters.message}}"]
  - name: gen-random-int
    script:
      image: python:alpine3.6
      command: [python]
      source: |
        import random
        i = random.randint(1, 100)
        print(i)
  - name: k8s-owner-reference
    resource:
      action: create
      manifest: |
        apiVersion: v1
        kind: ConfigMap
        metadata:
          generateName: owned-eg-
        data:
          host:
          wx: Explore Cloud Native
  - name: delay
    suspend:
      duration: "20s"
parameters
Parameters in a Workflow can be divided into the following two kinds:
- formal parameters: declared in a template (template definitions) through the inputs field, which states what parameters the template requires; default values can be specified.
- actual arguments: supplied in a template (template invocators) through the arguments field, which assigns values to the parameters and overrides the defaults in inputs.
The "formal" and "actual" terminology above is only my personal understanding.
inputs: formal parameters
A template declares its formal parameters with the templates[*].inputs field, and inside the template they are referenced with the {{inputs.parameters.$name}} syntax.
The following example declares that the template takes a parameter called message, so that the caller knows what to pass when using the template.
templates:
- name: whalesay-template
  inputs:
    parameters:
    - name: message
  container:
    image: docker/whalesay
    command: [cowsay]
    args: ["{{inputs.parameters.message}}"]
Of course, you can also specify a default value:
templates:
- name: whalesay-template
  inputs:
    parameters:
    - name: message
      value: "default message"
  container:
    image: docker/whalesay
    command: [cowsay]
    args: ["{{inputs.parameters.message}}"]
Note: if no default value is specified, the parameter must be supplied when the template is called; if there is a default value, it can be omitted.
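The two call styles can be sketched as follows, assuming the whalesay-template with the "default message" default shown above. The invocator name main and the step names are hypothetical.

```yaml
- name: main                          # hypothetical invocator for this sketch
  steps:
  - - name: use-default
      template: whalesay-template     # no arguments, so "default message" is used
  - - name: override-default
      template: whalesay-template
      arguments:
        parameters:
        - name: message
          value: "custom message"     # replaces "default message" for this call only
```

Without the default in inputs, the first step would have to pass message explicitly, as the note above describes.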
arguments: actual arguments
spec.arguments defines the actual arguments to pass in. These are available in every template of the current Workflow and can be referenced with the {{workflow.parameters.$name}} syntax.
For example, the following specifies a parameter named message and assigns it the value hello world.
arguments:
  parameters:
  - name: message
    value: hello world
parameter reuse
In addition to specifying arguments in steps/dag, you can specify them directly at the Workflow level and then reference them in steps/dag with the {{workflow.parameters.$name}} syntax. This allows parameters to be reused: defined once in the Workflow and referenced multiple times in steps/dag.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: example-
spec:
  entrypoint: main
  arguments:
    parameters:
    - name: workflow-param-1
  templates:
  - name: main
    dag:
      tasks:
      - name: step-A
        template: step-template-A
        arguments:
          parameters:
          - name: template-param-1
            value: "{{workflow.parameters.workflow-param-1}}"
  - name: step-template-A
    inputs:
      parameters:
      - name: template-param-1
    script:
      image: alpine
      command: [/bin/sh]
      source: |
        echo "{{inputs.parameters.template-param-1}}"
Demo
Understand parameter passing with the following demo:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: steps-
spec:
  entrypoint: hello            # We reference our first "template" here
  templates:
  - name: hello                # The first "template" in this Workflow, it is referenced by "entrypoint"
    steps:                     # The type of this "template" is "steps"
    - - name: hello
        template: whalesay     # We reference our second "template" here
        arguments:
          parameters: [{name: message, value: "hello"}]
  - name: whalesay             # The second "template" in this Workflow, it is referenced by "hello"
    inputs:
      parameters:
      - name: message
    container:                 # The type of this "template" is "container"
      image: docker/whalesay
      command: [cowsay]
      args: ["{{inputs.parameters.message}}"]
In the above example, the whalesay template declares that it requires a parameter named message. When the steps template references whalesay, it sets message to hello through arguments, so hello is what finally gets printed.
From the official documentation:
A WorkflowTemplate is a definition of a Workflow that lives in your cluster.
In other words, a WorkflowTemplate is the definition of a Workflow: it describes the details of the pipeline, including which tasks it contains, their order, and so on.
As the earlier description of Workflow shows, we can create Workflow objects directly to run a pipeline, but this approach has some problems:
- 1) With many templates, the Workflow object becomes very large and cumbersome to modify.
- 2) Templates cannot be shared: different Workflows that need the same template must each repeat it in their own YAML.
So the best practice for Workflow and WorkflowTemplate is to store the templates in a WorkflowTemplate, and have the Workflow just reference the Template and provide parameters.
By scope, workflow templates in ArgoWorkflow come in two kinds, WorkflowTemplate and ClusterWorkflowTemplate:
- WorkflowTemplate: namespace-scoped; it can only be referenced from within the same namespace.
- ClusterWorkflowTemplate: cluster-scoped; it can be referenced from any namespace.
WorkflowTemplate
Here is a simple WorkflowTemplate:
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: workflow-template-submittable
  namespace: default
spec:
  entrypoint: whalesay-template
  arguments:
    parameters:
    - name: message
      value: tpl-argument-default
  templates:
  - name: whalesay-template
    inputs:
      parameters:
      - name: message
        value: tpl-input-default
    container:
      image: docker/whalesay
      command: [cowsay]
      args: ["{{inputs.parameters.message}}"]
You can see that the fields of a WorkflowTemplate are exactly the same as those of a Workflow, so I won't repeat them here.
Simply changing kind from Workflow to WorkflowTemplate performs the conversion.
workflowMetadata
workflowMetadata is a field unique to Template, used mainly for storing metadata. Every Workflow subsequently created from the Template automatically carries this information.
With this information you can trace which Template a Workflow was created from.
It is used as follows, here with a label specified in workflowMetadata:
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: workflow-template-submittable
spec:
  workflowMetadata:
    labels:
      example-label: example-value
Workflow objects created by this Template will then carry this label:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  annotations:
    workflows.argoproj.io/pod-name-format: v2
  creationTimestamp: "2023-10-27T06:26:13Z"
  generateName: workflow-template-hello-world-
  generation: 2
  labels:
    example-label: example-value
  name: workflow-template-hello-world-5w7ss
  namespace: default
ClusterWorkflowTemplate
ClusterWorkflowTemplate is similar to WorkflowTemplate; the relationship is like that between ClusterRole and Role in K8s, and the only difference is scope.
All of its fields are the same as those of WorkflowTemplate; just change kind to ClusterWorkflowTemplate in the YAML.
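As an illustration, a cluster-scoped counterpart of the WorkflowTemplate shown earlier might look like the sketch below. The parameter value is an assumption; the object name is chosen to match the cluster-workflow-template-submittable reference used later in this article.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ClusterWorkflowTemplate         # the only structural change from WorkflowTemplate
metadata:
  name: cluster-workflow-template-submittable   # cluster-scoped, so no namespace
spec:
  entrypoint: whalesay-template
  arguments:
    parameters:
    - name: message
      value: hello world              # assumed default for this sketch
  templates:
  - name: whalesay-template
    inputs:
      parameters:
      - name: message
    container:
      image: docker/whalesay
      command: [cowsay]
      args: ["{{inputs.parameters.message}}"]
```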
Once a WorkflowTemplate exists, a Workflow can reference the templates in it directly, which keeps the Workflow object much cleaner.
There are two ways to reference a WorkflowTemplate:
- 1) workflowTemplateRef: references the entire WorkflowTemplate; the Workflow only needs to specify the global parameters.
- 2) templateRef: references a single template; the Workflow can still specify other templates, the entrypoint, and so on.
workflowTemplateRef
A WorkflowTemplate can be referenced directly through the workflowTemplateRef field.
Note 📢: the Workflow and the WorkflowTemplate must be in the same namespace.
It looks like this:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: workflow-template-hello-world-
spec:
  arguments:
    parameters:
    - name: message
      value: "from workflow"
  workflowTemplateRef:
    name: workflow-template-submittable
workflowTemplateRef specifies the name of the Template to reference. It is equivalent to copying over everything under the spec field of that Template, including entrypoint, templates, and so on.
Generally the Workflow only needs to override parameters through the arguments field, and even that is optional: if the Workflow specifies no arguments, the default values provided in the Template are used.
A minimalist Workflow looks like this:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: workflow-template-hello-world-
spec:
  workflowTemplateRef:
    name: workflow-template-submittable
It contains only the workflowTemplateRef field and provides no arguments.
Compared with the earlier Workflow, there is even less content, because most of it lives in the WorkflowTemplate; the Workflow usually only needs to specify parameters.
clusterWorkflowTemplateRef
It is similar to workflowTemplateRef; just add the clusterScope: true setting.
The default is false, which means a WorkflowTemplate is referenced.
For example, a templateRef that points at a ClusterWorkflowTemplate looks like this:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: workflow-template-hello-world-
spec:
  entrypoint: whalesay
  templates:
  - name: whalesay
    steps:                              # You should only reference external "templates" in a "steps" or "dag" "template".
    - - name: call-whalesay-template
        templateRef:                    # You can reference a "template" from another "WorkflowTemplate or ClusterWorkflowTemplate" using this field
          name: cluster-workflow-template-whalesay-template   # This is the name of the "WorkflowTemplate or ClusterWorkflowTemplate" CRD that contains the "template" you want
          template: whalesay-template   # This is the name of the "template" you want to reference
          clusterScope: true            # This field indicates this templateRef is pointing ClusterWorkflowTemplate
        arguments:                      # You can pass in arguments as normal
          parameters:
          - name: message
            value: "hello world"
When referencing an entire ClusterWorkflowTemplate with workflowTemplateRef, the key part is:
workflowTemplateRef:
  name: cluster-workflow-template-submittable
  clusterScope: true
When clusterScope is true, ArgoWorkflow looks for a ClusterWorkflowTemplate object; otherwise it looks for a WorkflowTemplate in the current namespace.
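Put together, a complete Workflow that runs an entire ClusterWorkflowTemplate could be sketched as follows. It assumes a ClusterWorkflowTemplate named cluster-workflow-template-submittable already exists in the cluster; the generateName is hypothetical.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: cluster-workflow-template-hello-world-   # hypothetical name
spec:
  workflowTemplateRef:
    name: cluster-workflow-template-submittable
    clusterScope: true        # look up a ClusterWorkflowTemplate instead of a WorkflowTemplate
```

Because the template is cluster-scoped, a Workflow like this can be submitted from any namespace.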
templateRef
In addition to using workflowTemplateRef (with or without clusterScope) to reference an entire WorkflowTemplate, you can use the templateRef parameter to reference a single template in a WorkflowTemplate.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: workflow-template-hello-world-
spec:
  entrypoint: whalesay
  templates:
  - name: whalesay
    steps:                              # You should only reference external "templates" in a "steps" or "dag" "template".
    - - name: call-whalesay-template
        templateRef:                    # You can reference a "template" from another "WorkflowTemplate" using this field
          name: workflow-template-1     # This is the name of the "WorkflowTemplate" CRD that contains the "template" you want
          template: whalesay-template   # This is the name of the "template" you want to reference
        arguments:                      # You can pass in arguments as normal
          parameters:
          - name: message
            value: "hello world"
The key part:
templateRef:
  name: workflow-template-1
  template: whalesay-template
Parameter analysis:
- name: the name of the WorkflowTemplate, here workflow-template-1.
- template: the name of the template, here the template named whalesay-template inside the WorkflowTemplate workflow-template-1.
Note 📢: according to the official documentation, templateRef should not be used anywhere other than in the two template invocators, steps and dag.
The original text follows:
You should never reference another template directly on a template object (outside of a steps or dag template). This behavior is deprecated, no longer supported, and will be removed in a future version.
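The steps form is shown above; the dag form is analogous. The sketch below assumes the same WorkflowTemplate workflow-template-1 containing whalesay-template; the Workflow's generateName and the template name main are hypothetical.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: workflow-template-dag-   # hypothetical name
spec:
  entrypoint: main
  templates:
  - name: main
    dag:
      tasks:
      - name: call-whalesay-template
        templateRef:                     # used inside a dag task, which is a supported place
          name: workflow-template-1
          template: whalesay-template
        arguments:
          parameters:
          - name: message
            value: "hello world"
```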
Parameters
The parameters used in a WorkflowTemplate are basically the same as those in a Workflow:
- The required parameters are declared in template definitions through the inputs field.
- The arguments field in template invocators assigns values to those parameters.
When a Workflow references a WorkflowTemplate, it can redefine arguments to override the default arguments in the WorkflowTemplate.
best practice
The following best practices can be drawn from the descriptions of workflowTemplateRef and templateRef:
- 1) Using workflowTemplateRef: the WorkflowTemplate defines the complete pipeline, including entrypoint and templates; the Workflow references it through workflowTemplateRef and uses arguments to override the default values.
- 2) Using templateRef: the WorkflowTemplate does not define a complete pipeline, only commonly used templates; a Workflow that needs one of them references it through templateRef instead of defining it again.
In the first case the WorkflowTemplate can be called a pipeline template; in the second case it acts as template-utils.
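A template-utils style WorkflowTemplate might look like the sketch below: it has no entrypoint and only holds reusable templates. The names workflow-template-1 and whalesay-template follow the templateRef example earlier in the article; the rest is an assumption for illustration.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: workflow-template-1
spec:
  # No entrypoint or arguments: this object is a library of templates
  # to be pulled into Workflows through templateRef.
  templates:
  - name: whalesay-template
    inputs:
      parameters:
      - name: message
    container:
      image: docker/whalesay
      command: [cowsay]
      args: ["{{inputs.parameters.message}}"]
```

A pipeline template, by contrast, also sets entrypoint (and usually arguments) so that it can be run as a whole through workflowTemplateRef.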
5. Summary
This article analyzed the Workflow, WorkflowTemplate, and template objects in ArgoWorkflow and the connections between them.
1) Workflow object: composed of templates and an entrypoint.
- templates: divided by role into template definitions and template invocators
  - template definitions: define the content of a specific step, including container, script, resource, suspend
  - template invocators: combine other templates and control the order of tasks, including steps, dag
- entrypoint: the starting point of execution
2) WorkflowTemplate/ClusterWorkflowTemplate objects: structured the same as the Workflow object, but used only to define a pipeline; they cannot run by themselves.
- An entire WorkflowTemplate can be referenced in a Workflow via workflowTemplateRef (with clusterScope: true for a ClusterWorkflowTemplate).
- Or a single template can be referenced via templateRef
3) Parameters: divided into formal parameters and actual arguments
- inputs: formal parameters, declaring which inputs the template requires; default values can be specified.
- arguments: actual arguments, assigning values to the parameters of the corresponding template and overriding the defaults provided by inputs.
4) Parameter priority:
- Workflow arguments > WorkflowTemplate arguments > WorkflowTemplate inputs
Finally, there is the relationship between Workflow, WorkflowTemplate, and template.
5) Some best practices:
WorkflowTemplate:
- 1) As a pipeline template: the WorkflowTemplate defines the entrypoint and all the templates, and a Workflow brings them in through workflowTemplateRef.
- 2) As template-utils: the WorkflowTemplate contains only commonly used templates, which a Workflow brings in through templateRef when needed, avoiding repeated template definitions.
Choosing between steps and dag:
- steps is more straightforward: the order of tasks is clear at a glance, but you need to work out the order of all the tasks yourself.
- dag is less intuitive, but easier to write: to add a new task you only need to know its dependencies.
Suggestion: if the order of all tasks in the Workflow is clear, use steps; if the workflow is complicated and you only know the dependencies between tasks, use dag and let ArgoWorkflow work out the order.