[develop] Add logic to assert that the output of the diff command must not contain lines starting with "Only in /tmp/home" #2748
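
The assertion named in the title can be sketched roughly as below. This is an illustration only, not the code merged in this PR: the function name `assert_no_backup_only_entries` is hypothetical, and the sketch assumes GNU `diff -r`, which reports entries present on one side only as `Only in <dir>: <name>`. It deliberately ignores content differences and entries that exist only in the target, since the title speaks only of lines starting with "Only in /tmp/home".

```shell
#!/bin/bash
# Hypothetical helper (name and arguments are assumptions, not from the PR).
# Fails if any entry exists only in the backup tree, i.e. if the diff output
# has a line that starts with "Only in <backup_dir>".
assert_no_backup_only_entries() {
  local backup_dir="$1" target_dir="$2" diff_output line
  # diff exits non-zero whenever the trees differ; that is expected here
  # because the target may hold extra files, so the exit status is ignored.
  diff_output=$(diff -r "$backup_dir" "$target_dir" 2>&1 || true)
  while IFS= read -r line; do
    # Quoted right-hand side is matched literally; the trailing * is a glob.
    if [[ "$line" == "Only in ${backup_dir}"* ]]; then
      echo "Data integrity check failed: $line" >&2
      return 1
    fi
  done <<< "$diff_output"
  return 0
}
```

Under these assumptions, a call such as `assert_no_backup_only_entries /tmp/home /home || exit 1` would stop the restore before the backup is deleted.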

Merged on Jun 18, 2024 (16 commits)
Changes from 8 commits
@@ -17,20 +17,65 @@
if node['cluster']['node_type'] == 'HeadNode'
# Restore the shared storage home data if it doesn't already exist
# This is necessary to preserve any data in these directories that was
# generated during the build of ParallelCluster AMIs after converting to
# shared storage and backed up to a temporary location previously
# generated during the node bootstrap after converting to shared storage.
# Before removing the backup, ensure the data in the new home is the same
# as the original to avoid any data loss or inconsistency. This is done
# by using rsync to copy the data and diff to check for differences.
# The diff command excludes files that were originally in /home to focus on
# the newly synchronized files. This approach is necessary because the /home
# directory may contain pre-existing files such as slurm-*.out generated by
# running SLURM jobs and the automatically created lost+found directory.
# Remove the backup after the copy is done and the data integrity is verified.
bash "Restore /home" do
user 'root'
group 'root'
code <<-EOH
# Generate a list of existing files and dirs in /home before the sync
find /home -mindepth 1 > /tmp/home_existing_files.txt

# Initialize an empty set for exclude options and directories to exclude
declare -A exclude_dirs

touch /tmp/exclude_options.txt

# Process each file and directory in the list to determine which paths should be excluded from the diff check
while IFS= read -r file; do
# Remove the /home/ prefix
relative_path=${file#/home/}
current_path="/tmp/home"

# Split the relative path by /
IFS='/' read -ra parts <<< "$relative_path"

for part in "${parts[@]}"; do
current_path="$current_path/$part"
if [ ! -e "$current_path" ]; then
# If the path does not exist in /tmp/home, add the last part of path to the exclude list
if [ -z "${exclude_dirs[$part]}" ]; then
exclude_dirs[$part]=1
echo "$part" >> /tmp/exclude_options.txt
fi
break
else
if [ -f "$current_path" ]; then
# If the path is a file, add it to the exclude list
echo "$part" >> /tmp/exclude_options.txt
break
fi
# If the path is a directory, continue checking subdirectories
fi
done
done < /tmp/home_existing_files.txt

# Sync data from /tmp/home to /home
rsync -a --ignore-existing /tmp/home/ /home
diff_output=$(diff -r /tmp/home/ /home)

# Perform the diff check, excluding the original files
diff_output=$(diff -r --exclude-from=/tmp/exclude_options.txt /tmp/home /home)
if [ $? -eq 0 ]; then
rm -rf /tmp/home/
rm -rf /tmp/home_existing_files.txt
rm -rf /tmp/exclude_options.txt
else
echo "Data integrity check failed comparing /home and /tmp/home: $diff_output"
exit 1
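
The effect of the `--exclude-from` list used above can be tried on throwaway directories. The sketch below is an illustration under stated assumptions, not part of the recipe: the function name `demo_exclude_from` and all paths are hypothetical, and it relies on GNU `diff`, where exclude patterns are matched against base names at any depth. The last step shows a consequence of that matching: a file that exists only in the backup goes unreported if its base name is on the exclude list.

```shell
#!/bin/bash
# Hypothetical demo; directory and file names are illustrative only.
demo_exclude_from() {
  local work backup target excludes status
  work=$(mktemp -d)
  backup="$work/backup"
  target="$work/home"
  excludes="$work/exclude_options.txt"
  mkdir -p "$backup/user" "$target/user"
  echo "data" > "$backup/user/notes.txt"
  cp "$backup/user/notes.txt" "$target/user/notes.txt"
  # A pre-existing file that lives only in the target tree
  echo "job log" > "$target/slurm-1.out"

  # Without an exclude list, the extra file makes diff exit non-zero
  status=0
  diff -r "$backup" "$target" > /dev/null || status=$?
  echo "no exclude list: exit $status"

  # With the base name excluded, the two trees compare as equal
  echo "slurm-1.out" > "$excludes"
  status=0
  diff -r --exclude-from="$excludes" "$backup" "$target" > /dev/null || status=$?
  echo "with exclude list: exit $status"

  # A backup-only file with the same base name is also skipped
  echo "job log" > "$backup/user/slurm-1.out"
  status=0
  diff -r --exclude-from="$excludes" "$backup" "$target" > /dev/null || status=$?
  echo "backup-only file with excluded name: exit $status"

  rm -rf "$work"
}
demo_exclude_from
```

With GNU diffutils the three reported exit codes should be 1, 0 and 0. The final 0 is the case to watch for when excluding by base name.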