Location>code7788 >text

ArgoWorkflow Tutorial (VII) - Efficient Inter-Step File Sharing Strategies

Popularity:546 ℃/2024-10-22 13:39:06

Previously we analyzed the use of artifact to achieve inter-step file sharing, today we share how to use PVC to achieve efficient inter-step file sharing.

1. General

Previously in the artifact section we demonstrated how to use artifact to achieve inter-step file transfer, today we introduce a simpler way of file transfer:PVC Shared

After all, artifact uses S3 for transit, which is definitely less efficient than sharing a PVC directly, and artifact is generally used for outputting results and saving the final result to S3, not just for sharing files.

2. Sharing files using artifact

I've already shared how to pass files between steps via artifact, so here's a recap.

apiVersion: /v1alpha1
kind: Workflow
metadata:
  generateName: artifact-passing-
spec:
  entrypoint: artifact-example
  templates:
  - name: artifact-example
    steps:
    - - name: generate-artifact
        template: whalesay
    - - name: consume-artifact
        template: print-message
        arguments:
          artifacts:
          # bind message to the hello-art artifact
          # generated by the generate-artifact step
          - name: message
            from: "{{-art}}"

  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["cowsay hello world | tee /tmp/hello_world.txt"]
    outputs:
      artifacts:
      # generate hello-art artifact from /tmp/hello_world.txt
      # artifacts can be directories as well as files
      - name: hello-art
        path: /tmp/hello_world.txt

  - name: print-message
    inputs:
      artifacts:
      # unpack the message input artifact
      # and put it at /tmp/message
      - name: message
        path: /tmp/message
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["cat /tmp/message"]

As you can see, the artifact method of passing parameters between shared file steps is relatively similar.

Export artifact

outputs:
  artifacts:
  # generate hello-art artifact from /tmp/hello_world.txt
  # artifacts can be directories as well as files
  - name: hello-art
    path: /tmp/hello_world.txt

Subsequent steps refer to the exported artifact

arguments:
  artifacts:
  # bind message to the hello-art artifact
  # generated by the generate-artifact step
  - name: message
    from: "{{-art}}"

And how to bring in the artifact during the steps, for example, the demo below mounts the artifact as /tmp/message in the Pod.

inputs:
  artifacts:
  # unpack the message input artifact
  # and put it at /tmp/message
  - name: message
    path: /tmp/message

3. Efficient file sharing using PVCs

As the name suggests, the same PVC is mounted at different steps, which naturally enables file sharing.

Each step in ArgoWorkflow starts a separate Pod to run the

There are also two ways:

  • (1) Dynamically created PVCs: PVCs are created when Workflow is running, and deleted at the end of the run.
  • 2) Using an existing PVC: it will not be created or deleted.

Dynamic creation of PVCs

The full demo is below:

apiVersion: /v1alpha1
kind: Workflow
metadata:
  generateName: volumes-pvc-
spec:
  entrypoint: volumes-pvc-example
  volumeClaimTemplates:                 # define volume, same syntax as k8s Pod spec
  - metadata:
      name: workdir                     # name of volume claim
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi                  # Gi => 1024 * 1024 * 1024

  templates:
  - name: volumes-pvc-example
    steps:
    - - name: generate
        template: whalesay
    - - name: print
        template: print-message

  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["echo generating message in volume; cowsay hello world | tee /mnt/vol/hello_world.txt"]
      # Mount workdir volume at /mnt/vol before invoking docker/whalesay
      volumeMounts:                     # same syntax as k8s Pod spec
      - name: workdir
        mountPath: /mnt/vol

  - name: print-message
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["echo getting message from volume; find /mnt/vol; cat /mnt/vol/hello_world.txt"]
      # Mount workdir volume at /mnt/vol before invoking docker/whalesay
      volumeMounts:                     # same syntax as k8s Pod spec
      - name: workdir
        mountPath: /mnt/vol

First, a PVC template is defined, which is used by the Workflow runtime to create a PVC.

spec:
  entrypoint: volumes-pvc-example
  volumeClaimTemplates:                 # define volume, same syntax as k8s Pod spec
  - metadata:
      name: workdir                     # name of volume claim
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi                  # Gi => 1024 * 1024 * 1024

The other steps then require that the PVC be mounted in the corresponding directory

  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["echo generating message in volume; cowsay hello world | tee /mnt/vol/hello_world.txt"]
      # Mount workdir volume at /mnt/vol before invoking docker/whalesay
      volumeMounts:                     # same syntax as k8s Pod spec
      - name: workdir
        mountPath: /mnt/vol

  - name: print-message
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["echo getting message from volume; find /mnt/vol; cat /mnt/vol/hello_world.txt"]
      # Mount workdir volume at /mnt/vol before invoking docker/whalesay
      volumeMounts:                     # same syntax as k8s Pod spec
      - name: workdir
        mountPath: /mnt/vol

This makes file sharing very simple.

Argo automatically deletes the created PVC when the workflow has finished running.

Use of existing PVC

In some cases, we can want to access volumes that already exist instead of dynamically creating/destroying them.

The full demo is below:

# Define Kubernetes PVC
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-existing-volume
spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 1Gi

---
apiVersion: /v1alpha1
kind: Workflow
metadata:
  generateName: volumes-existing-
spec:
  entrypoint: volumes-existing-example
  volumes:
  # Pass my-existing-volume as an argument to the volumes-existing-example template
  # Same syntax as k8s Pod spec
  - name: workdir
    persistentVolumeClaim:
      claimName: my-existing-volume

  templates:
  - name: volumes-existing-example
    steps:
    - - name: generate
        template: whalesay
    - - name: print
        template: print-message

  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["echo generating message in volume; cowsay hello world | tee /mnt/vol/hello_world.txt"]
      volumeMounts:
      - name: workdir
        mountPath: /mnt/vol

  - name: print-message
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["echo getting message from volume; find /mnt/vol; cat /mnt/vol/hello_world.txt"]
      volumeMounts:
      - name: workdir
        mountPath: /mnt/vol

The first step is to manually create a PVC

# Define Kubernetes PVC
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-existing-volume
spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 1Gi

Then define in Workflow that you want to use this PVC.

apiVersion: /v1alpha1
kind: Workflow
metadata:
  generateName: volumes-existing-
spec:
  entrypoint: volumes-existing-example
  volumes:
  # Pass my-existing-volume as an argument to the volumes-existing-example template
  # Same syntax as k8s Pod spec
  - name: workdir
    persistentVolumeClaim:
      claimName: my-existing-volume

Think of it as replacing the previous volumeClaimTemplates with persistentVolumeClaims.

Then there are steps to mount the PVC to the corresponding directory

  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["echo generating message in volume; cowsay hello world | tee /mnt/vol/hello_world.txt"]
      volumeMounts:
      - name: workdir
        mountPath: /mnt/vol

  - name: print-message
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["echo getting message from volume; find /mnt/vol; cat /mnt/vol/hello_world.txt"]
      volumeMounts:
      - name: workdir
        mountPath: /mnt/vol

This step is unchanged from when the PVC was created dynamically.


[ArgoWorkflow Series]Continuously updated, search the public number [Explore Cloud Native]Subscribe to read more articles.


4. Summary

This article analyzes how to share files using PVCs in Workflow in Argo.

  • 1) Define a PVC template or specify the use of an existing PVC.
  • 2) Mount the PVC to the corresponding directory in step