
GaussDB: Handling a CN repair failure caused by an openssl version dependency

2024-09-05 13:27:52

1. Background to the issue

The openssh and openssl upgrade was performed after the cluster installation was complete. With the existing environment's openssh-8.2p1-9.p03.ky10.x86_64 and openssl-1.1.1f-2.ky10.x86_64, the database can be installed; the two packages were then upgraded to openssh-8.2p1-9.p15.ky10.x86_64 and openssl-1.1.1f-4.p17.ky10.x86_64.

After the upgrade was installed, command tests on the cluster, including starting and stopping cluster nodes, showed no problems. However, after a coordinator node (CN) was removed, repairing that fault failed with an error, the same problem seen when installing the cluster for the first time. The error screenshot is as follows:

 

The cluster state is as follows: one CN is shown as dropped and the cluster state changes to degraded, while the DNs are normal and the cluster remains available.

 

 

2. Working around the openssh and openssl version issue

 

Modification Notes:

1. Modify the GaussDB(DWS) environment variable file /opt/huawei/Bigdata/mppdb/.mppdbgs_profile, adjusting the order in which LD_LIBRARY_PATH is assembled so that the database library directories come after the existing value.
Before modifying:
[omm@redhat-4 ~]$ cat  /opt/huawei/Bigdata/mppdb/.mppdbgs_profile  | grep -in LD_LIBRARY_PATH
5:export LD_LIBRARY_PATH=$GPHOME/lib:$LD_LIBRARY_PATH
7:export LD_LIBRARY_PATH=$GPHOME/lib/libsimsearch:$LD_LIBRARY_PATH
11:export LD_LIBRARY_PATH=$GAUSSHOME/lib:$LD_LIBRARY_PATH
12:export LD_LIBRARY_PATH=$GAUSSHOME/lib/libsimsearch:$LD_LIBRARY_PATH

After modifying:

[omm@redhat-4 ~]$ cat  /opt/huawei/Bigdata/mppdb/.mppdbgs_profile  | grep -in LD_LIBRARY_PATH
5:export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$GPHOME/lib
7:export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$GPHOME/lib/libsimsearch
11:export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$GAUSSHOME/lib
12:export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$GAUSSHOME/lib/libsimsearch
Add the following:
export LD_LIBRARY_PATH=/lib64:$LD_LIBRARY_PATH
2. Add the LD_LIBRARY_PATH variable to /etc/profile. Here, /lib64 is the path to the libraries that the ssh binary depends on.
Add the following:
export LD_LIBRARY_PATH=/lib64:$LD_LIBRARY_PATH
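
Both modifications can also be scripted. The following is a minimal sketch, not the author's procedure: the helper names `ensure_line` and `reorder_ld_library_path` and the `.bak` backup suffix are assumptions made for illustration, and the in-place edit relies on GNU `sed -i`. It is demonstrated on a scratch file rather than on the real profile.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names and the ".bak" suffix are assumptions.

# Append a line to a file unless an identical line is already present.
ensure_line() {
    local file="$1" line="$2"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Step 1: rewrite "export LD_LIBRARY_PATH=<dir>:$LD_LIBRARY_PATH" into
# "export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<dir>" for $GPHOME/$GAUSSHOME
# entries, then make sure the /lib64 line is present exactly once.
reorder_ld_library_path() {
    local profile="$1"
    cp -p "$profile" "$profile.bak"   # keep a backup before editing
    sed -E -i 's#^export LD_LIBRARY_PATH=(\$(GPHOME|GAUSSHOME)/[^:]*):\$LD_LIBRARY_PATH$#export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:\1#' "$profile"
    ensure_line "$profile" 'export LD_LIBRARY_PATH=/lib64:$LD_LIBRARY_PATH'
}

# Demonstration on a scratch file, not on the real profile.
demo_profile="$(mktemp)"
cat > "$demo_profile" <<'EOF'
export LD_LIBRARY_PATH=$GPHOME/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$GAUSSHOME/lib/libsimsearch:$LD_LIBRARY_PATH
EOF
reorder_ld_library_path "$demo_profile"
cat "$demo_profile"

# Step 2 on a real node would need root privileges (not executed here):
#   ensure_line /etc/profile 'export LD_LIBRARY_PATH=/lib64:$LD_LIBRARY_PATH'
```

Running the function a second time leaves the file unchanged, which makes it reasonably safe to apply on every node.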

3. Repairing the CN

3.1 Rerun gs_replace to repair the coordinator node; a different error is reported

 

[omm@DN01 ~]$ gs_replace -t config -h DN02
Checking all the cm_agent instances.
There are [0] cm_agents need to be repaired in cluster.
Fixing all the CMAgents instances.
Checking and restoring the secondary standby instance.
The secondary standby instance does not need to be restored.
Configuring
Waiting for promote peer instances.
.
Successfully upgraded standby instances.
Configuring replacement instances.
Successfully configured replacement instances.
Deleting abnormal CN from pgxc_node on the normal CN.
No abnormal CN needs to be deleted.
Unlocking cluster.
Successfully unlocked cluster.
Locking cluster.
Successfully locked cluster.
Unlocking cluster.
Successfully unlocked cluster.
Creating all fixed CN on the normal CN.
No CN needs to be created.
Warning: failed to turn off O&M management. Please re-execute "cm_ctl set --maintenance=off" once again.
[GAUSS-51400] : Failed to execute the command: source /opt/huawei/Bigdata/mppdb/.mppdbgs_profile ; cm_ctl set --maintenance=on  -n 2. Error:
cm_ctl: Starting to enable the maintenance mode.
cm_ctl: Close maintenance mode on cm instances.
cm_ctl: Close maintenance mode on cm instances failed.
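
The [GAUSS-51400] line embeds the exact command that failed, which is the command rerun by hand in the next step. A small sketch for pulling that command out of saved gs_replace output follows; the function name `failed_command` is hypothetical, and the pattern is based only on the single message format shown in this output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read gs_replace output on stdin and print the command
# embedded in a "[GAUSS-51400] : Failed to execute the command: ... Error:" line.
failed_command() {
    sed -n 's/^\[GAUSS-51400] : Failed to execute the command: \(.*\)\. Error:.*$/\1/p'
}

sample='[GAUSS-51400] : Failed to execute the command: source /opt/huawei/Bigdata/mppdb/.mppdbgs_profile ; cm_ctl set --maintenance=on  -n 2. Error:'
printf '%s\n' "$sample" | failed_command
```

Lines that do not match the pattern produce no output, so the helper can be fed the whole transcript.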

 

3.2 Run the command from the error message above

[omm@DN01 ~]$ source /opt/huawei/Bigdata/mppdb/.mppdbgs_profile
[omm@DN01 ~]$
[omm@DN01 ~]$ cm_ctl set --maintenance=on  -n 2
cm_ctl: Starting to enable the maintenance mode.
cm_ctl: Close maintenance mode on cm instances.
cm_ctl: Close maintenance mode on cm instances failed.

3.3 View the logs

[omm@DN01 ~]$ cd $GAUSSLOG/bin/cm_ctl
[omm@DN01 cm_ctl]$ less cm_ctl-2024-07-13_191612-
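
Because the cm_ctl log file names carry a timestamp, it can help to pick the most recently modified one automatically. This is a sketch under the assumption that the files match the glob `cm_ctl-*` in `$GAUSSLOG/bin/cm_ctl`; the helper name `latest_log` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the most recently modified file in a directory
# whose name starts with the given prefix.
latest_log() {
    local dir="$1" prefix="$2"
    # ls -t sorts by modification time, newest first.
    ls -t "$dir"/"$prefix"* 2>/dev/null | head -n 1
}

# Demonstration with scratch files standing in for real logs.
logdir="$(mktemp -d)"
touch -t 202407131900 "$logdir/cm_ctl-older.log"
touch -t 202407131916 "$logdir/cm_ctl-newer.log"
latest_log "$logdir" "cm_ctl-"

# On a cluster node one might then run (not executed here):
#   grep -inE 'error|fail' "$(latest_log "$GAUSSLOG/bin/cm_ctl" cm_ctl-)"
```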

A screenshot of the reported error is below:

 

3.4 Move the pssh file out of the way on all three nodes

[omm@DN01 cm_ctl]$ sudo mv /usr/bin/pssh /usr/bin/
[omm@DN02 cm_ctl]$ sudo mv /usr/bin/pssh /usr/bin/
[omm@DN03 cm_ctl]$ sudo mv /usr/bin/pssh /usr/bin/
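
As written, the source and destination of the mv commands resolve to the same path, so the new file name seems to have been lost from the text. The sketch below shows one way to move a binary aside; the `.bak` suffix and the helper name `move_aside` are assumptions, not the author's values. A plausible reason the step helps, though the article does not state it, is that .mppdbgs_profile appends `$GPHOME/script/gspylib/pssh/bin` to the end of PATH, so a system-wide /usr/bin/pssh would be found first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rename a file to "<path>.bak" if it exists.
# The ".bak" suffix is an assumption; the original target name is unknown.
move_aside() {
    local path="$1"
    if [ -e "$path" ]; then
        mv -- "$path" "$path.bak"
    fi
}

# Demonstration on a scratch file instead of /usr/bin/pssh.
scratch_bin="$(mktemp -d)"
touch "$scratch_bin/pssh"
move_aside "$scratch_bin/pssh"
ls "$scratch_bin"

# On each node this would need root privileges, e.g. (not executed here):
#   sudo mv /usr/bin/pssh /usr/bin/pssh.bak
#   type -a pssh    # lists the pssh executables still reachable via PATH
```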

 

3.5 Rerun the command from the error message

[omm@DN01 cm_ctl]$ cm_ctl set --maintenance=on  -n 2
cm_ctl: Starting to enable the maintenance mode.
cm_ctl: Close maintenance mode on cm instances.
cm_ctl: Close maintenance mode on cm instances successfully.
cm_ctl: Generate and distribute the maintenance white-list file.
cm_ctl: Generate and distribute the maintenance white-list file successfully.
cm_ctl: Set maintenance mode on related cm instances.
cm_ctl: Set maintenance mode on related cm instances successfully.
cm_ctl: Reload configuration on related cm instances.
cm_ctl: Reload configuration on related cm instances successfully.
cm_ctl: Query the maintenance mode from the primary cm server.
cm_ctl: Enable the maintenance mode successfully.

The following nodes enter the maintenance mode:
node_2

 

3.6 Rerun gs_replace

[omm@DN01 cm_ctl]$ gs_replace -t config -h DN02
Checking all the cm_agent instances.
There are [0] cm_agents need to be repaired in cluster.
Fixing all the CMAgents instances.
Checking and restoring the secondary standby instance.
The secondary standby instance does not need to be restored.
Configuring
Waiting for promote peer instances.
.
Successfully upgraded standby instances.
Configuring replacement instances.
Successfully configured replacement instances.
Deleting abnormal CN from pgxc_node on the normal CN.
No abnormal CN needs to be deleted.
Unlocking cluster.
Successfully unlocked cluster.
Locking cluster.
Successfully locked cluster.
Incremental building CN from the Normal CN.
Successfully incremental built CN from the Normal CN.
Creating fixed CN on the normal CN.
Successfully created fixed CN on the normal CN.
Starting the fixed cns.
Successfully started the fixed cns.
Creating fixed CN on the fixed CN.
Successfully created fixed CN on the fixed CN.
Unlocking cluster.
Successfully unlocked cluster.
Creating unfixed CN on the fixed and normal CN.
No CN needs to be created.
Configuration succeeded.

 

3.7 Start the CN with gs_replace

[omm@DN01 cm_ctl]$ gs_replace -t start -h DN02
Starting.
======================================================================
.
Successfully started instance process. Waiting to become Normal.
======================================================================

======================================================================
Start succeeded.

 

3.8 Rebalance the cluster

[omm@DN01 cm_ctl]$ gs_om -t switch --reset
Operating: Switch reset.
cm_ctl: cmserver is rebalancing the cluster automatically.
.......
cm_ctl: switchover successfully.
Operation succeeded: Switch reset.

 

3.9 Cluster status

The cluster has been repaired:

[omm@DN01 cm_ctl]$ gs_om -t status --detail
[  CMServer State   ]

node    node_ip         instance                                    state
---------------------------------------------------------------------------
1  DN01 10.254.21.75    1    /opt/huawei/Bigdata/mppdb/cm/cm_server Primary
3  DN03 10.254.21.77    2    /opt/huawei/Bigdata/mppdb/cm/cm_server Standby

[   Cluster State   ]

cluster_state   : Normal
redistributing  : No
balanced        : Yes

[ Coordinator State ]

node    node_ip         instance                                   state
--------------------------------------------------------------------------
1  DN01 10.254.21.75    5001 /srv/BigData/mppdb/data1/coordinator Normal
2  DN02 10.254.21.76    5002 /srv/BigData/mppdb/data1/coordinator Normal
3  DN03 10.254.21.77    5003 /srv/BigData/mppdb/data1/coordinator Normal

[ Central Coordinator State ]

node    node_ip         instance                                  state
-------------------------------------------------------------------------
3  DN03 10.254.21.77    5003 /srv/BigData/mppdb/data1/coordinator Normal

[     GTM State     ]

node    node_ip         instance                           state                    sync_state
---------------------------------------------------------------
3  DN03 10.254.21.77    1001 /opt/huawei/Bigdata/mppdb/gtm P Primary Connection ok  Sync
1  DN01 10.254.21.75    1002 /opt/huawei/Bigdata/mppdb/gtm S Standby Connection ok  Sync

[  Datanode State   ]

node    node_ip         instance                                  state            | node    node_ip         instance                                  state            | node    node_ip         instance                                  state
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1  DN01 10.254.21.75    6001 /srv/BigData/mppdb/data1/master1     P Primary Normal | 2  DN02 10.254.21.76    6002 /srv/BigData/mppdb/data1/slave1      S Standby Normal | 3  DN03 10.254.21.77    3002 /srv/BigData/mppdb/data1/dummyslave1 R Secondary Normal
1  DN01 10.254.21.75    6003 /srv/BigData/mppdb/data2/master2     P Primary Normal | 3  DN03 10.254.21.77    6004 /srv/BigData/mppdb/data1/slave2      S Standby Normal | 2  DN02 10.254.21.76    3003 /srv/BigData/mppdb/data1/dummyslave2 R Secondary Normal
2  DN02 10.254.21.76    6005 /srv/BigData/mppdb/data1/master1     P Primary Normal | 3  DN03 10.254.21.77    6006 /srv/BigData/mppdb/data2/slave1      S Standby Normal | 1  DN01 10.254.21.75    3004 /srv/BigData/mppdb/data1/dummyslave1 R Secondary Normal
2  DN02 10.254.21.76    6007 /srv/BigData/mppdb/data2/master2     P Primary Normal | 1  DN01 10.254.21.75    6008 /srv/BigData/mppdb/data1/slave2      S Standby Normal | 3  DN03 10.254.21.77    3005 /srv/BigData/mppdb/data2/dummyslave2 R Secondary Normal
3  DN03 10.254.21.77    6009 /srv/BigData/mppdb/data1/master1     P Primary Normal | 1  DN01 10.254.21.75    6010 /srv/BigData/mppdb/data2/slave1      S Standby Normal | 2  DN02 10.254.21.76    3006 /srv/BigData/mppdb/data2/dummyslave1 R Secondary Normal
3  DN03 10.254.21.77    6011 /srv/BigData/mppdb/data2/master2     P Primary Normal | 2  DN02 10.254.21.76    6012 /srv/BigData/mppdb/data2/slave2      S Standby Normal | 1  DN01 10.254.21.75    3007 /srv/BigData/mppdb/data2/dummyslave2 R Secondary Normal
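
For scripted checks, the summary fields in the [ Cluster State ] block can be read from the status output. A minimal sketch, assuming the `name : value` layout shown in this output; the helper name `cluster_field` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "gs_om -t status --detail" output on stdin and
# print the value of one "name : value" field such as cluster_state.
cluster_field() {
    awk -F':' -v key="$1" '
        { name = $1; gsub(/[ \t]/, "", name) }
        name == key { value = $2; gsub(/^[ \t]+|[ \t]+$/, "", value); print value; exit }
    '
}

sample_status='[   Cluster State   ]

cluster_state   : Normal
redistributing  : No
balanced        : Yes'

printf '%s\n' "$sample_status" | cluster_field cluster_state
printf '%s\n' "$sample_status" | cluster_field balanced

# On a cluster node (not executed here):
#   [ "$(gs_om -t status --detail | cluster_field cluster_state)" = "Normal" ]
```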

 

 

3.10 Database environment variables in the normal state

[root@DN01 ~]# tail -5f /etc/profile
fi
#TMOUT=600
export TMOUT=0
#LD_LIBRARY_PATH=/usr/local/lib/
export LD_LIBRARY_PATH=/lib64:$LD_LIBRARY_PATH
[omm@DN01 ~]$ cat .bash_profile
# Source /root/.bashrc if user has one
[ -f ~/.bashrc ] && . ~/.bashrc
source /home/omm/.profile

LD_LIBRARY_PATH=/usr/local/lib/
export LD_LIBRARY_PATH=/lib64:$LD_LIBRARY_PATH
[omm@DN01 ~]$ cat /opt/huawei/Bigdata/mppdb/.mppdbgs_profile
#LD_LIBRARY_PATH=/usr/local/lib
export MPPDB_ENV_SEPARATE_PATH=/opt/huawei/Bigdata/mppdb/.mppdbgs_profile
export LDAPCONF=/opt/huawei/Bigdata/mppdb/
export GPHOME=/opt/huawei/Bigdata/mppdb/wisequery
export PATH=$PATH:$GPHOME/script/gspylib/pssh/bin:$GPHOME/script
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$GPHOME/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$GPHOME/lib/libsimsearch
export PYTHONPATH=$GPHOME/lib
export GAUSS_WARNING_TYPE=1
export GAUSSHOME=/opt/huawei/Bigdata/mppdb/core
export PATH=$GAUSSHOME/bin:$PATH
export S3_CLIENT_CRT_FILE=$GAUSSHOME/lib/
export GAUSS_VERSION=8.2.1
export PGHOST=/opt/huawei/Bigdata/mppdb/mppdb_tmp
export GS_CLUSTER_NAME=FI-MPPDB
export GAUSSLOG=/var/log/Bigdata/mpp/omm
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$GAUSSHOME/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$GAUSSHOME/lib/libsimsearch
export ETCD_UNSUPPORTED_ARCH=386
if [ -f '/opt/huawei/Bigdata/mppdb/core/utilslib/env_ec' ] && [ `id -u` -ne 0 ]; then source '/opt/huawei/Bigdata/mppdb/core/utilslib/env_ec'; fi
export GAUSS_ENV=2
export LD_LIBRARY_PATH=/lib64:$LD_LIBRARY_PATH
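
The final profiles put /lib64 at the front of LD_LIBRARY_PATH. To illustrate why the order matters: the dynamic loader searches the listed directories from left to right and uses the first match for a given library name. The sketch below imitates that lookup with plain files in temporary directories. `first_match` and the directory names are hypothetical, and the idea that the database lib directory ships its own libssl copy is an inference from the symptom described here, not something the article states.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first directory in a colon-separated search
# path that contains the named file, mimicking left-to-right library lookup.
# Empty path entries are skipped in this simplified model.
first_match() {
    local name="$1" search_path="$2" dir
    local old_ifs="$IFS"
    IFS=':'
    for dir in $search_path; do
        if [ -n "$dir" ] && [ -e "$dir/$name" ]; then
            IFS="$old_ifs"
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS="$old_ifs"
    return 1
}

# Stand-ins for the system and database library directories.
scratch="$(mktemp -d)"
mkdir -p "$scratch/lib64" "$scratch/gausshome_lib"
touch "$scratch/lib64/libssl.so.1.1" "$scratch/gausshome_lib/libssl.so.1.1"

# Database directory first: its copy shadows the system one.
first_match libssl.so.1.1 "$scratch/gausshome_lib:$scratch/lib64"
# /lib64 stand-in first: the system copy wins.
first_match libssl.so.1.1 "$scratch/lib64:$scratch/gausshome_lib"
```

On a real node, `ldd "$(command -v ssh)"` run with and without the profile sourced shows which library files the ssh binary actually resolves.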