feat: move Java Owlbot into this repository for postprocessing (#2282)
* use local synthtool and owlbot

* remove unused files

* remove more unused files

* remove cache files in owlbot

* use java 11 for it

* remove kokoro files

* use glob in owlbot entrypoint

* remove unused files

* do not do post-process IT on mac

* concise entrypoint logic

* cleanup i

* cleanup ii

* cleanup iii

* cleanup iv

* remove templates

* remove protos folder

* remove synthtool

* connect image to owlbot entrypoint

* simplify synthtool docker run command

* install synthtool locally

* install synthtool only once

* use virtualenvs to run python scripts

* install pyenv in action

* remove jar from history

* download google-java-format

* fix pyenv init

* attempt to fix pyenv installation in gh action

* fix manual pyenv installation

* install pyenv in profile

* install pyenv in bashrc as well

* use bash shell explicitly in gh action

* install pyenv in same step as IT

* do not restart shell

* set pyenv path manually

* install pyenv in its own step

* propagate environment variables to other steps

* fix global env var setup

* remove wrong env settings

* explain usage of pyenv in README

* simplify pyenv setup

* add comment to owlbot entrypoint

* rename destination_path to preprocessed_libraries_path

* infer scripts_root in postprocess_library.sh

* use temporary folder for preprocess step

* use owlbot files from workspace

* get rid of output_folder argument

* use common temp dir to clone synthtool into

* lock synthtool to a specific commitish

* fix file transfer

* fix owl-bot-staging unpacking

* remove unnecessary workspace variable

* rename workspace to postprocessing_target

* remove owlbot sha logic

* remove repository_root variable

* cleanup

* correct pyenv comment

* clean temp sources folder on each run

* safety checks for get_proto_path_from_preprocessed_sources

* fix integration test

* disable compute and asset/v1p2beta1 temporarily

they have changes in googleapis that have not been reflected yet in
google-cloud-java

* fix unit tests

* correct comment

* do not install docker for macos

* fix owlbot files check

* fix license headers

* remove unnecessary owlbot_sha

* add explanation on why are there no macos + postprocess ITs

* use `fmt:format` instead of google formatter

* clean templates

* remove more unnecessary elements

* add README entry explaining owlbot maintenance

* remove unnecessary java format version
diegomarquezp authored Dec 18, 2023
1 parent c9cf66b commit f8969d2
Showing 30 changed files with 1,828 additions and 203 deletions.
29 changes: 19 additions & 10 deletions .github/workflows/verify_library_generation.yaml
@@ -12,7 +12,7 @@ jobs:
integration_tests:
strategy:
matrix:
java: [ 8 ]
java: [ 11 ]
os: [ ubuntu-22.04, macos-12 ]
post_processing: [ 'true', 'false' ]
runs-on: ${{ matrix.os }}
@@ -26,8 +26,23 @@ jobs:
- uses: actions/setup-python@v4
with:
python-version: '3.11'
- name: install pyenv
shell: bash
run: |
set -ex
curl https://pyenv.run | bash
# setup environment
export PYENV_ROOT="$HOME/.pyenv"
export PATH="$PYENV_ROOT/bin:$PATH"
echo "PYENV_ROOT=${PYENV_ROOT}" >> $GITHUB_ENV
echo "PATH=${PATH}" >> $GITHUB_ENV
# init pyenv
eval "$(pyenv init --path)"
eval "$(pyenv init -)"
set +ex
- name: install docker (ubuntu)
if: matrix.os == 'ubuntu-22.04'
shell: bash
run: |
set -x
# install docker
@@ -36,17 +51,11 @@ jobs:
# launch docker
sudo systemctl start docker
- name: install docker (macos)
if: matrix.os == 'macos-12'
run: |
brew update --preinstall
brew install docker docker-compose qemu
brew upgrade qemu
colima start
docker run --user $(id -u):$(id -g) --rm hello-world
- name: Run integration tests
# we don't run ITs with postprocessing on macos because one of its dependencies "synthtool" is designed to run on linux only
if: matrix.os == 'ubuntu-22.04' || matrix.post_processing == 'false'
shell: bash
run: |
set -x
git config --global user.email "[email protected]"
git config --global user.name "Github Workflow"
library_generation/test/generate_library_integration_test.sh \
31 changes: 30 additions & 1 deletion library_generation/README.md
@@ -1,4 +1,4 @@
# Generate GAPIC Client Library without post-processing
# Generate GAPIC Client Library with and without post-processing

The script, `generate_library.sh`, allows you to generate a GAPIC client library from proto files.

@@ -28,6 +28,12 @@ This repository will be the source of truth for pre-existing
pom.xml files, owlbot.py and .OwlBot.yaml files. See the options below for
custom postprocessed generations (e.g. custom `versions.txt` file).

Post-processing makes use of Python scripts. The script automatically uses
`pyenv` to select the Python version specified in
`library_generation/configuration/python-version`, so `pyenv` must be
installed in the environment.
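
As a rough illustration of the requirement above, the following sketch shows one way a script could read the pinned version and select it through `pyenv`. The helper names `read_python_version` and `use_pinned_python` are hypothetical and are not taken from this repository; the sketch only assumes that the version file holds a single version string and that `pyenv` honors the `PYENV_VERSION` environment variable.

```shell
#!/bin/bash
# Hypothetical helper (not part of this repository): reads the pinned Python
# version from a file such as library_generation/configuration/python-version.
read_python_version() {
  local version_file="$1"
  if [ ! -f "${version_file}" ]; then
    echo "python version file not found: ${version_file}" >&2
    return 1
  fi
  # drop whitespace and newlines so the value can be passed to pyenv as-is
  tr -d '[:space:]' < "${version_file}"
}

# Hypothetical helper: selects the pinned version for the current shell.
# PYENV_VERSION is the environment variable pyenv consults to pick a version.
use_pinned_python() {
  local version
  version=$(read_python_version "$1") || return 1
  if ! command -v pyenv > /dev/null; then
    echo "pyenv is required for post-processing" >&2
    return 1
  fi
  # install the version only if it is not already available
  pyenv install --skip-existing "${version}"
  export PYENV_VERSION="${version}"
}
```

A caller would run `use_pinned_python library_generation/configuration/python-version` before invoking any Python script. Failing early when `pyenv` is missing keeps the error close to its cause.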


## Parameters to run `generate_library.sh`

You need to run the script with the following parameters.
@@ -225,3 +231,26 @@ library_generation/generate_library.sh \
--versions_file "path/to/versions.txt" \
--include_samples true
```

# Owlbot Java Postprocessor

We have transferred the
[implementation](https://github.com/googleapis/synthtool/tree/59fe44fde9866a26e7ee4e4450fd79f67f8cf599/docker/owlbot/java)
of the Java Owlbot Postprocessor into `sdk-platform-java/library_generation`. The
implementation in synthtool is still valid and used by other services, so we
have two versions during a transition period.

## Reflecting changes in synthtool/docker/owlbot/java into this repository
The transfer was not a verbatim copy; it included the following modifications:
* `format-source.sh` was replaced by a call to `mvn fmt:format`.
* `entrypoint.sh` was modified to take input arguments, and the way it calls
the helper scripts was slightly changed.
* Other helper scripts were modified to take input arguments.
* `fix-poms.py` was modified in the way it detects the monorepo.

Because of these modifications, whenever we want to reflect a change from the
original owlbot in synthtool, we may be better off modifying the affected source
files one by one. The mapping is from
[`synthtool/docker/owlbot/java`](https://github.com/googleapis/synthtool/tree/59fe44fde9866a26e7ee4e4450fd79f67f8cf599/docker/owlbot/java)
to
[`sdk-platform-java/library_generation/owlbot`](https://github.com/googleapis/sdk-platform-java/tree/move-java-owlbot/library_generation/owlbot)
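
One possible way to find the files that need a manual port is to compare a local synthtool checkout against the local copy of the postprocessor. The sketch below is only an illustration: the helper name `list_owlbot_differences` is hypothetical, and it assumes that both directory trees share a comparable layout.

```shell
#!/bin/bash
# Hypothetical helper (not part of this repository): lists files that differ
# between a synthtool checkout and the local copy of the postprocessor.
#   $1: path to docker/owlbot/java inside a local synthtool clone
#   $2: path to library_generation/owlbot inside sdk-platform-java
list_owlbot_differences() {
  local upstream_dir="$1"
  local local_dir="$2"
  # diff exits with 1 when the trees differ, which is the expected case here,
  # so only exit codes above 1 (real errors) are treated as failures.
  diff -rq "${upstream_dir}" "${local_dir}" || [ $? -eq 1 ]
}

# Example usage, assuming synthtool was cloned into /tmp/synthtool:
#   git -C /tmp/synthtool checkout \
#     "$(cat library_generation/configuration/synthtool-commitish)"
#   list_owlbot_differences /tmp/synthtool/docker/owlbot/java \
#     library_generation/owlbot
```

Since the transfer was not verbatim, some differences are expected even when both sides are in sync, so the output is a starting point for review and not a list of required changes.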
1 change: 1 addition & 0 deletions library_generation/configuration/python-version
@@ -0,0 +1 @@
3.11.2
1 change: 1 addition & 0 deletions library_generation/configuration/synthtool-commitish
@@ -0,0 +1 @@
59fe44fde9866a26e7ee4e4450fd79f67f8cf599
70 changes: 35 additions & 35 deletions library_generation/generate_library.sh
@@ -133,7 +133,13 @@ if [ -z "${os_architecture}" ]; then
os_architecture=$(detect_os_architecture)
fi

temp_destination_path="${output_folder}/temp_preprocessed"
mkdir -p "${output_folder}/${destination_path}"
if [ -d "${temp_destination_path}" ]; then
# we don't want the preprocessed sources of a previous run
rm -rd "${temp_destination_path}"
fi
mkdir -p "${temp_destination_path}"
##################### Section 0 #####################
# prepare tooling
#####################################################
@@ -185,30 +191,30 @@ download_tools "${gapic_generator_version}" "${protobuf_version}" "${grpc_versio
if [[ ! "${transport}" == "rest" ]]; then
# do not need to generate grpc-* if the transport is `rest`.
"${protoc_path}"/protoc "--plugin=protoc-gen-rpc-plugin=protoc-gen-grpc-java-${grpc_version}-${os_architecture}.exe" \
"--rpc-plugin_out=:${destination_path}/java_grpc.jar" \
"--rpc-plugin_out=:${temp_destination_path}/java_grpc.jar" \
${proto_files} # Do not quote because this variable should not be treated as one long string.
# unzip java_grpc.jar to grpc-*/src/main/java
unzip_src_files "grpc"
unzip_src_files "grpc" "${temp_destination_path}"
# remove empty files in grpc-*/src/main/java
remove_empty_files "grpc"
remove_empty_files "grpc" "${temp_destination_path}"
# remove grpc version in *ServiceGrpc.java file so the content is identical with bazel build.
remove_grpc_version
remove_grpc_version "${temp_destination_path}"
fi
###################### Section 2 #####################
## generate gapic-*/, part of proto-*/, samples/
######################################################
if [[ "${proto_only}" == "false" ]]; then
"$protoc_path"/protoc --experimental_allow_proto3_optional \
"--plugin=protoc-gen-java_gapic=${script_dir}/gapic-generator-java-wrapper" \
"--java_gapic_out=metadata:${destination_path}/java_gapic_srcjar_raw.srcjar.zip" \
"--java_gapic_out=metadata:${temp_destination_path}/java_gapic_srcjar_raw.srcjar.zip" \
"--java_gapic_opt=$(get_gapic_opts "${transport}" "${rest_numeric_enums}" "${gapic_yaml}" "${service_config}" "${service_yaml}")" \
${proto_files} ${gapic_additional_protos}

unzip -o -q "${destination_path}/java_gapic_srcjar_raw.srcjar.zip" -d "${destination_path}"
unzip -o -q "${temp_destination_path}/java_gapic_srcjar_raw.srcjar.zip" -d "${temp_destination_path}"
# Sync'\''d to the output file name in Writer.java.
unzip -o -q "${destination_path}/temp-codegen.srcjar" -d "${destination_path}/java_gapic_srcjar"
unzip -o -q "${temp_destination_path}/temp-codegen.srcjar" -d "${temp_destination_path}/java_gapic_srcjar"
# Resource name source files.
proto_dir=${destination_path}/java_gapic_srcjar/proto/src/main/java
proto_dir=${temp_destination_path}/java_gapic_srcjar/proto/src/main/java
if [ ! -d "${proto_dir}" ]; then
# Some APIs don't have resource name helpers, like BigQuery v2.
# Create an empty file so we can finish building. Gating the resource name rule definition
@@ -218,14 +224,14 @@ if [[ "${proto_only}" == "false" ]]; then
touch "${proto_dir}"/PlaceholderFile.java
fi
# move java_gapic_srcjar/src/main to gapic-*/src.
mv_src_files "gapic" "main"
mv_src_files "gapic" "main" "${temp_destination_path}"
# remove empty files in gapic-*/src/main/java
remove_empty_files "gapic"
remove_empty_files "gapic" "${temp_destination_path}"
# move java_gapic_srcjar/src/test to gapic-*/src
mv_src_files "gapic" "test"
mv_src_files "gapic" "test" "${temp_destination_path}"
if [ "${include_samples}" == "true" ]; then
# move java_gapic_srcjar/samples/snippets to samples/snippets
mv_src_files "samples" "main"
mv_src_files "samples" "main" "${temp_destination_path}"
fi
fi
##################### Section 3 #####################
@@ -247,16 +253,16 @@ case "${proto_path}" in
proto_files="${proto_files//${removed_proto}/}"
;;
esac
"$protoc_path"/protoc "--java_out=${destination_path}/java_proto.jar" ${proto_files}
"$protoc_path"/protoc "--java_out=${temp_destination_path}/java_proto.jar" ${proto_files}
if [[ "${proto_only}" == "false" ]]; then
# move java_gapic_srcjar/proto/src/main/java (generated resource name helper class)
# to proto-*/src/main
mv_src_files "proto" "main"
mv_src_files "proto" "main" "${temp_destination_path}"
fi
# unzip java_proto.jar to proto-*/src/main/java
unzip_src_files "proto"
unzip_src_files "proto" "${temp_destination_path}"
# remove empty files in proto-*/src/main/java
remove_empty_files "proto"
remove_empty_files "proto" "${temp_destination_path}"
case "${proto_path}" in
"google/cloud/aiplatform/v1beta1"*)
prefix="google/cloud/aiplatform/v1beta1/schema"
@@ -282,14 +288,14 @@ for proto_src in ${proto_files}; do
if [[ "${proto_src}" == "google/cloud/common/operation_metadata.proto" ]]; then
continue
fi
mkdir -p "${destination_path}/proto-${folder_name}/src/main/proto"
rsync -R "${proto_src}" "${destination_path}/proto-${folder_name}/src/main/proto"
mkdir -p "${temp_destination_path}/proto-${folder_name}/src/main/proto"
rsync -R "${proto_src}" "${temp_destination_path}/proto-${folder_name}/src/main/proto"
done
popd # output_folder
##################### Section 4 #####################
# rm tar files
#####################################################
pushd "${output_folder}/${destination_path}"
pushd "${temp_destination_path}"
rm -rf java_gapic_srcjar java_gapic_srcjar_raw.srcjar.zip java_grpc.jar java_proto.jar temp-codegen.srcjar
popd # destination path
##################### Section 5 #####################
@@ -298,6 +304,8 @@ popd # destination path
if [ "${enable_postprocessing}" != "true" ];
then
echo "post processing is disabled"
cp -r ${temp_destination_path}/* "${output_folder}/${destination_path}"
rm -rdf "${temp_destination_path}"
exit 0
fi
if [ -z "${versions_file}" ];then
@@ -311,21 +319,13 @@ fi

mkdir -p "${workspace}"

# if destination_path is not empty, it will be used as a starting workspace for
# postprocessing
if [[ $(find "${output_folder}/${destination_path}" -mindepth 1 -maxdepth 1 -type d,f | wc -l) -gt 0 ]];then
workspace="${output_folder}/${destination_path}"
fi

bash -x "${script_dir}/postprocess_library.sh" "${workspace}" \
"${script_dir}" \
"${destination_path}" \
"${proto_path}" \
"${versions_file}" \
"${output_folder}"
"${temp_destination_path}" \
"${versions_file}"

# for post-processed libraries, remove pre-processed folders
pushd "${output_folder}/${destination_path}"
rm -rdf "proto-${folder_name}"
rm -rdf "grpc-${folder_name}"
rm -rdf "gapic-${folder_name}"
if [ "${include_samples}" == "false" ]; then
rm -rdf "samples"
fi
popd # output_folder
# move contents of the post-processed library into destination_path
cp -r ${workspace}/* "${output_folder}/${destination_path}"
113 changes: 113 additions & 0 deletions library_generation/owlbot/bin/entrypoint.sh
@@ -0,0 +1,113 @@
#!/bin/bash
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# This is the entrypoint script for java owlbot. This is not intended to be
# called directly but rather be called from postprocess_library.sh
# For reference, the positional arguments are
# 1: scripts_root: location of postprocess_library.sh
# 2: versions_file: points to a versions.txt containing versions to be applied
# both to README and pom.xml files

# The script assumes the CWD is the folder where postprocessing is going to be
# applied

set -ex
scripts_root=$1
versions_file=$2

# Runs the templates and the other postprocessing steps in the current working directory
function processModule() {
# generate templates and retrieve files from owl-bot-staging
echo "Generating templates and retrieving files from owl-bot-staging directory..."
if [ -f "owlbot.py" ]
then
# defaults to run owlbot.py
python3 owlbot.py
fi
echo "...done"

# write or restore pom.xml files
echo "Generating missing pom.xml..."
python3 "${scripts_root}/owlbot/src/fix-poms.py" "${versions_file}" "true"
echo "...done"

# write or restore clirr-ignored-differences.xml
echo "Generating clirr-ignored-differences.xml..."
${scripts_root}/owlbot/bin/write_clirr_ignore.sh "${scripts_root}"
echo "...done"

# fix license headers
echo "Fixing missing license headers..."
python3 "${scripts_root}/owlbot/src/fix-license-headers.py"
echo "...done"

# TODO: re-enable this once we resolve thrashing
# restore license headers years
# echo "Restoring copyright years..."
# /owlbot/bin/restore_license_headers.sh
# echo "...done"

# ensure formatting on all .java files in the repository
echo "Reformatting source..."
mvn fmt:format
echo "...done"
}

if [ "$(ls */.OwlBot.yaml|wc -l)" -gt 1 ];then
# Monorepo (googleapis/google-cloud-java) has multiple OwlBot.yaml config
# files in the modules.
echo "Processing monorepo"
if [ -d owl-bot-staging ]; then
# The content of owl-bot-staging is controlled by Owlbot.yaml files in
# each module in the monorepo
echo "Extracting contents from owl-bot-staging"
for module in owl-bot-staging/* ; do
if [ ! -d "$module" ]; then
continue
fi
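# This relocation allows us to continue to use owlbot.py without modification
# after monorepo migration.
# "$module" already carries the "owl-bot-staging/" prefix from the glob, so
# strip it to obtain the module directory in the monorepo root.
module_name="${module#owl-bot-staging/}"
mv "$module" "${module_name}/owl-bot-staging"
pushd "${module_name}"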
processModule
popd
done
rm -r owl-bot-staging
else
echo "In monorepo but no owl-bot-staging." \
"Formatting changes in the last commit"
# Find the files that were touched by the last commit.
last_commit=$(git log -1 --format=%H)
# [A]dded, [C]opied, [M]odified, and [R]enamed
changed_files=$(git show --name-only --no-renames --diff-filter=ACMR \
"${last_commit}")
changed_modules=$(echo "$changed_files" |grep -E '.java$' |cut -d '/' -f 1 \
|sort -u)
for module in ${changed_modules}; do
if [ ! -f "$module/.OwlBot.yaml" ]; then
# Changes irrelevant to Owlbot-generated module (such as .github) do not
# need formatting
continue
fi
pushd "$module"
processModule
popd
done
fi
else
# Split repository
echo "Processing a split repo"
processModule
fi