
Troubleshooting Image Network Initialization Failures under OpenStack

2024-07-20 23:36:53

Symptoms

After migrating a batch of old images (from other third-party cloud platforms) onto my OpenStack cluster, I found that instances booted from these images could not initialize any NIC other than the first one when configuration was injected through ConfigDrive. Testing with a non-ConfigDrive datasource gave the same result.

Troubleshooting path

First, check that cloud-init is working.
Boot an instance from the image and check the cloud-init services and their logs.

systemctl status cloud-init
systemctl status cloud-init-local

Both services are enabled and running normally.
Next, look at the cloud-init initialization log:

[   19.254076] cloud-init[1483]: Cloud-init v. 0.7.5 finished at Tue, 02 Jul 2024 06:28:30 +0000. Datasource DataSourceConfigDriveNet [net,ver=2][source=/dev/sr0].  Up 19.24 seconds

The log shows that the data source was read and instantiated, so we can basically rule out cloud-init failing to run.
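When comparing many instances, it can help to pull the version and data source out of that "finished" line automatically. The following is a minimal sketch, assuming the line always has the shape shown above; `parse_finished_line` and the regular expression are illustrative names of my own, not part of cloud-init.

```python
import re

# Pattern modeled on the single log line shown above (an assumption:
# other cloud-init versions may format this line differently).
FINISHED_RE = re.compile(
    r"Cloud-init v\. (?P<version>\S+) finished at (?P<finished>.+?)\. "
    r"Datasource (?P<datasource>\S+) \[(?P<flags>[^\]]*)\]"
    r"\[source=(?P<source>[^\]]*)\]\.\s+Up (?P<uptime>[\d.]+) seconds"
)


def parse_finished_line(line):
    """Return the fields of a 'Cloud-init ... finished' line, or None."""
    match = FINISHED_RE.search(line)
    return match.groupdict() if match else None
```

Calling `parse_finished_line` on the log line above yields a dictionary whose `version`, `datasource`, and `source` entries can be compared across working and failing images.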

cloud-init debugging

The only option left is to step through the cloud-init initialization process in more detail.
For the cloud-init boot stages, I recommend this article, so I won't go into them here: /frankming/p/
Scripts for quickly rerunning the initialization are given below.

# CentOS 7
#!/bin/bash
cloud-init clean
rm -rf /var/run/cloud-init/
rm -rf /var/lib/cloud/
# note: this glob also matches ifcfg-lo
rm -rf /etc/sysconfig/network-scripts/ifcfg-*

# local stage: prepare the data source
cloud-init init --local

# network stage: render the data
cloud-init init

# run the config modules
cloud-init modules --mode=config

# CentOS 6
#!/bin/bash
rm -rf /var/run/cloud-init/
rm -rf /var/lib/cloud/
# note: this glob also matches ifcfg-lo
rm -rf /etc/sysconfig/network-scripts/ifcfg-*

# local stage: prepare the data source
cloud-init init --local

# network stage: render the data
cloud-init init

# run the config modules
cloud-init modules --mode=config

Unfortunately, rerunning the initialization did not reveal the cause. Compared with the logs from a system where multiple NICs initialize correctly (CentOS 7), it looked as though CentOS 6 did nothing at all during the NIC configuration stage. So I pulled the cloud-init source code and went through it with static review plus print-statement debugging.
Source code path:

/usr/lib/python2.6/site-packages/cloudinit

Locate the following code in cloud-init 0.7.5:

...
# sources/ +166
def read_config_drive(source_dir, version="2012-08-10"):
    reader = openstack.ConfigDriveReader(source_dir)
    finders = [
        (reader.read_v2, [], {'version': version}),
        (reader.read_v1, [], {}),
    ]
    excps = []
    for (functor, args, kwargs) in finders:
        try:
            return functor(*args, **kwargs)
        except openstack.NonReadable as e:
            excps.append(e)
    raise excps[-1]
...
...
# sources/ +59
    def get_data(self):
        found = None
        md = {}
        results = {}
        if os.path.isdir(self.seed_dir):
            try:
                results = read_config_drive(self.seed_dir)
                found = self.seed_dir
            except openstack.NonReadable:
                util.logexc(LOG, "Failed reading config drive from %s",
                            self.seed_dir)
        if not found:
            for dev in find_candidate_devs():
                try:
                    results = util.mount_cb(dev, read_config_drive)
                    found = dev
                except openstack.NonReadable:
...

You can see that after mounting the /dev/sr0 device, cloud-init 0.7.5 reads the 2012-08-10 version of the metadata.
Mount the device manually and take a look:

[root@aa home]# mount /dev/sr0 /mnt/
mount: /dev/sr0 is write-protected, mounting read-only
[root@aa home]# ls /mnt/
ec2  openstack
[root@aa home]# ls /mnt/openstack/2012-08-10/
meta_data.json  user_data

As it turns out, there is no network_data.json in that directory at all. Reading the related network configuration code also confirmed that the network initialization logic is only adapted for Ubuntu. In short, cloud-init 0.7.5 is too old, and its CentOS 7 support is poor.
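To repeat this check across several images without listing directories by hand, something like the following could be used. It is a minimal sketch, assuming the config drive is already mounted and follows the `openstack/<version>/` layout seen above; `find_network_data` is a hypothetical helper name, and whether other version directories exist on a given drive depends on the OpenStack release that built it.

```python
import os


def find_network_data(mount_point):
    """Map each metadata version directory under <mount_point>/openstack
    to the path of its network_data.json, or None if the file is absent."""
    base = os.path.join(mount_point, "openstack")
    found = {}
    if not os.path.isdir(base):
        return found  # not a config drive, or not mounted here
    for version in sorted(os.listdir(base)):
        version_dir = os.path.join(base, version)
        if not os.path.isdir(version_dir):
            continue
        candidate = os.path.join(version_dir, "network_data.json")
        found[version] = candidate if os.path.isfile(candidate) else None
    return found
```

For the mount shown above, `find_network_data("/mnt")` would report `None` for `2012-08-10`, which matches the manual listing.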

Solutions

There are two broad types of solutions:

1. Upgrade cloud-init.
2. Manually reimplement the logic of cloud-init's network initialization.

Solution 1

Upgrading cloud-init first requires upgrading Python. I have not used this approach myself, so I won't go into detail, but it is certainly feasible. I recommend upgrading Python manually and installing cloud-init from source.

Solution 2

This is the approach I recommend. Following the logic of the corresponding driver in a newer version of cloud-init, I reimplemented it by hand as a patch written in Go or C; it has been tested and works. Because this component was developed for my company, it is not convenient to open source it, but I welcome discussion.
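To give a rough idea of what such a reimplementation involves, here is a minimal Python sketch. It is not the patch described above. It assumes a network_data.json is available from the config drive in the usual OpenStack shape (`links` carrying `ethernet_mac_address`, `networks` of type `ipv4` or `ipv4_dhcp`), and it only renders file contents rather than writing them. `render_ifcfg` and the `mac_to_dev` mapping are hypothetical names; on a real guest the mapping could be built by reading `/sys/class/net/*/address`.

```python
def render_ifcfg(network_data, mac_to_dev):
    """Return {filename: content} with one ifcfg file per configured NIC.

    network_data: parsed network_data.json (dict with "links", "networks").
    mac_to_dev:   hypothetical mapping of lowercase MAC -> device name.
    """
    # Index links by id so each network entry can find its MAC address.
    links = {link["id"]: link for link in network_data.get("links", [])}
    files = {}
    for net in network_data.get("networks", []):
        link = links.get(net.get("link"))
        if link is None:
            continue  # network refers to an unknown link
        mac = link.get("ethernet_mac_address", "").lower()
        dev = mac_to_dev.get(mac)
        if dev is None:
            continue  # no NIC with this MAC inside the guest
        lines = [
            "DEVICE=%s" % dev,
            "HWADDR=%s" % mac,
            "ONBOOT=yes",
            "NM_CONTROLLED=no",
        ]
        if net.get("type") == "ipv4_dhcp":
            lines.append("BOOTPROTO=dhcp")
        elif net.get("type") == "ipv4":
            lines.append("BOOTPROTO=static")
            lines.append("IPADDR=%s" % net["ip_address"])
            lines.append("NETMASK=%s" % net["netmask"])
            # Use the default route (0.0.0.0), if present, as the gateway.
            for route in net.get("routes", []):
                if route.get("network") == "0.0.0.0":
                    lines.append("GATEWAY=%s" % route["gateway"])
                    break
        else:
            continue  # IPv6 and other types are left out of this sketch
        # A later network on the same device overwrites an earlier one.
        files["ifcfg-%s" % dev] = "\n".join(lines) + "\n"
    return files
```

The returned contents would then be written under /etc/sysconfig/network-scripts/ before the network service starts. Matching by MAC address rather than by position is the point of the design, since it is the non-first NICs that fail to come up.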