Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Implement delete expired index job #2702

Merged
merged 8 commits into from
Oct 26, 2024

Conversation

highpon
Copy link
Contributor

@highpon highpon commented Oct 14, 2024

Description

Delete expired index from Agent using k8s Job.
The user sets the index_id in config and deletes the index.

This feature is currently implementing.

Related Issue

close: #2647

Versions

  • Vald Version: v1.7.13
  • Go Version: v1.23.2
  • Rust Version: v1.81.0
  • Docker Version: v27.3.1
  • Kubernetes Version: v1.31.1
  • Helm Version: v3.16.2
  • NGT Version: v2.2.4
  • Faiss Version: v1.9.0

Checklist

Special notes for your reviewer

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced a new index deletion job and associated configurations, including a Kubernetes CronJob and ConfigMap for managing deletion operations.
    • Added new GitHub Actions workflows for building and scanning Docker images related to index deletion.
  • Improvements

    • Enhanced spell-checking configuration by ignoring specific paths.
    • Improved Docker image build process with new targets and configurations.
    • Added comprehensive configuration details for the index deletion service, including logging, server settings, and observability options.
  • Bug Fixes

    • Minor formatting corrections in workflow files for consistency.

@kpango
Copy link
Collaborator

kpango commented Oct 16, 2024

Thank you for your contribution to Vald, and I am very happy that you tried to create the DeleteExpired Index Job.

After reading the code, you understood about Docker and Workflow auto-generation, and you wrote the code in Vald's typical way.
Meanwhile, as mentioned in the original Issue, we think it would be better to use Remove or RemoveByTimestamp to remove items.
So it would be good if you can change the implementation to use Remove or RemoveByTimestamp in Vald.proto without adding the DeleteIndex RPC in Agent.proto.

I apologize if the information may not be enough.

Vald.proto: Vald Gateways (LB, Filter, Mirror) and Agents (Faiss, NGT, Rust(future work)) implement this.
Agent.proto: implemented by Agents (Faiss, NGT, Rust(future work)).

This means the basic CRUD processing has all been standardized in Vald.proto, so you can use Vald.Proto's Remove or RemoveByTimestamp.

@kpango kpango self-assigned this Oct 16, 2024
@highpon
Copy link
Contributor Author

highpon commented Oct 17, 2024

@kpango
Thank you very much!
I was mistaken, I will fix it to be implemented using the RPC already provided in Vald.proto.

@highpon highpon force-pushed the feat/implement-delete-expired-index branch from 019bf55 to 4093d71 Compare October 20, 2024 06:56
@highpon highpon changed the title [WIP] feat: Implement delete expired index job feat: Implement delete expired index job Oct 20, 2024
@highpon highpon marked this pull request as ready for review October 20, 2024 06:59
Copy link
Contributor

coderabbitai bot commented Oct 20, 2024

📝 Walkthrough

Walkthrough

The pull request introduces several changes primarily focused on implementing a new index deletion feature within a Kubernetes environment. Key modifications include the addition of new workflow files for Docker image management, a Kubernetes CronJob for periodic index deletion, and various configuration files to support the new functionality. Additionally, updates to existing files enhance the build process and spell-checking configurations. Overall, the changes aim to automate the deletion of expired indexes without user intervention.

Changes

File Change Summary
.cspell.json Added "**/cmd/index/job/deletion/index-deletion" to ignorePaths.
.gitfiles New workflow file: .github/workflows/dockers-index-deletion-image.yaml.
.github/actions/detect-docker-image-tags/action.yaml Added mapping for vdaas/vald-index-deletion to manager.index.delete.image.tag.
.github/workflows/dockers-image-scan.yaml Added new job index-deletion for Docker image scanning.
.github/workflows/dockers-index-deletion-image.yaml New workflow for building Docker image for index deletion.
.github/workflows/dockers-index-deletion.yaml New workflow for automating Docker image building for index deletion.
.github/workflows/dockers-release-branch-image.yaml Added job index-deletion, dependent on dump-contexts-to-log.
Makefile Added variable INDEX_DELETION_IMAGE = $(NAME)-index-deletion.
Makefile.d/build.mk Added target for cmd/index/job/deletion/index-deletion and corresponding artifact zip target.
Makefile.d/docker.mk Added .PHONY targets for docker/name/index-deletion and docker/build/index-deletion.
cmd/index/job/deletion/main.go New file implementing the main entry point for index deletion job.
cmd/index/job/deletion/sample.yaml New configuration file for job deletion service.
dockers/index/job/deletion/Dockerfile New Dockerfile for building the index deletion image.
hack/cspell/main.go Added "**/cmd/index/job/creation/index-deletion" to mandatoryIgnorePaths.
k8s/index/job/deletion/configmap.yaml New Kubernetes ConfigMap for index deletion configuration.
k8s/index/job/deletion/cronjob.yaml New Kubernetes CronJob for periodic index deletion.
pkg/index/job/deletion/config/config.go New configuration management for index deletion.
pkg/index/job/deletion/service/deleter.go New service implementation for managing deletion operations.
pkg/index/job/deletion/service/options.go New functional options for configuring deletion service.
pkg/index/job/deletion/usecase/deletion.go New struct and methods for managing deletion job lifecycle.
internal/config/index_deleter.go New struct for managing index deletion configurations.

Assessment against linked issues

Objective Addressed Explanation
Implement index deletion via Kubernetes Job (2647)
Automate deletion of expired indexes without user intervention (2647)

Possibly related PRs

Suggested labels

priority/low, area/index/job/save, type/ci, area/internal

Suggested reviewers

  • kpango

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@highpon highpon force-pushed the feat/implement-delete-expired-index branch from 4093d71 to f80bdc5 Compare October 20, 2024 07:00
@highpon
Copy link
Contributor Author

highpon commented Oct 20, 2024

@kpango
Fixed implementation using Vald.proto's Remove rpc!
Ready for review. Please review when you have time!

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 19

🧹 Outside diff range and nitpick comments (24)
pkg/index/job/deletion/service/options.go (2)

33-40: LGTM with suggestion: Consider consistent error handling

The WithTargetAddrs function is well-implemented, efficiently handling the addition of target addresses. It correctly follows the functional options pattern and handles the case of no addresses being provided.

However, for consistency with other option functions, consider adding error handling for invalid inputs (e.g., empty strings). This would align with the error handling approach used in other options.


43-48: Add input validation for robustness

While the WithTargetIndexID function follows the functional options pattern, it lacks input validation. Consider adding a check for empty strings or invalid index ID formats. This would improve robustness and maintain consistency with the error handling approach used in other option functions.

Example implementation:

func WithTargetIndexID(indexID string) Option {
	return func(idx *index) error {
		if indexID == "" {
			return errors.NewErrInvalidOption("targetIndexID", indexID)
		}
		idx.targetIndexID = indexID
		return nil
	}
}
internal/config/index_deletor.go (3)

17-47: LGTM! Consider adding validation for the Concurrency field.

The IndexDeletor struct is well-defined with clear field names, appropriate types, and helpful comments. The use of JSON and YAML tags is consistent with Go best practices for configuration structs.

Consider adding validation for the Concurrency field to ensure it's always a positive integer. This could be done in the Bind() method or in a separate Validate() method.


49-61: LGTM! Consider adding error handling and validation.

The Bind() method effectively processes the struct fields using GetActualValue and GetActualValues. The method allows for method chaining by returning the receiver, which is a good practice.

Consider the following improvements:

  1. Add error handling to the Bind() method. If GetActualValue or GetActualValues can return errors, these should be propagated.
  2. Implement validation for numeric fields like AgentPort, Concurrency, and CreationPoolSize.
  3. Consider adding a Validate() method to perform more complex validations, such as checking if TargetAddrs is empty when Discoverer is nil.

Example implementation:

func (ic *IndexDeletor) Bind() (*IndexDeletor, error) {
    var err error
    ic.IndexID, err = GetActualValue(ic.IndexID)
    if err != nil {
        return nil, fmt.Errorf("failed to bind IndexID: %w", err)
    }
    // ... (repeat for other fields)

    if ic.Discoverer != nil {
        if err := ic.Discoverer.Bind(); err != nil {
            return nil, fmt.Errorf("failed to bind Discoverer: %w", err)
        }
    }

    return ic, ic.Validate()
}

func (ic *IndexDeletor) Validate() error {
    if ic.Concurrency <= 0 {
        return fmt.Errorf("Concurrency must be a positive integer")
    }
    if ic.CreationPoolSize == 0 {
        return fmt.Errorf("CreationPoolSize must be greater than zero")
    }
    if ic.Discoverer == nil && len(ic.TargetAddrs) == 0 {
        return fmt.Errorf("TargetAddrs must not be empty when Discoverer is nil")
    }
    return nil
}

1-61: Overall, the implementation looks good with room for minor improvements.

The IndexDeletor struct and its Bind() method provide a solid foundation for managing index deletion configurations. The implementation aligns well with the PR objectives of implementing a feature for deleting expired indexes.

To further improve the robustness of this implementation:

  1. Consider adding more comprehensive error handling and validation as suggested in the previous comments.
  2. Ensure that this configuration is properly integrated with the Kubernetes Job mentioned in the PR objectives.
  3. Document how this configuration interacts with the existing Remove or RemoveByTimestamp methods from Vald.proto, as suggested in the PR comments.
  4. Consider adding unit tests for the Bind() method and any additional validation logic you implement.
.github/workflows/dockers-index-deletion-image.yaml (2)

19-74: LGTM: Well-defined workflow triggers with a suggestion.

The workflow name and trigger events are well-defined, covering all relevant scenarios for building the index-deletion Docker image. The file path filters for pull requests are comprehensive, ensuring the workflow runs only when necessary.

Consider adding a brief comment explaining the purpose of the pull_request_target event, as it's less common and has different security implications compared to the regular pull_request event.


1-80: Overall: Excellent implementation of the index deletion workflow.

This workflow file is well-structured and aligns perfectly with the PR objectives of implementing the delete expired index job. Key points:

  1. It ensures the Docker image for the new index deletion feature is built under appropriate conditions.
  2. The use of reusable workflows promotes consistency and maintainability.
  3. Specific trigger conditions and file path filters demonstrate thoughtful CI/CD practices.

The implementation shows a good understanding of GitHub Actions and integrates well with the existing CI/CD pipeline. This will greatly facilitate the automation of deleting old indexes as outlined in the PR objectives and linked issue #2647.

As the project grows, consider creating a document or README that explains the overall CI/CD architecture, including how this workflow fits into the bigger picture. This will help new contributors understand the build and deployment processes more easily.

k8s/index/job/deletion/cronjob.yaml (3)

29-32: Consider adjusting the CronJob schedule frequency

The current schedule "* * * * *" runs the job every minute, which might be unnecessarily frequent and could lead to resource contention, especially if the job takes longer than a minute to complete.

Consider adjusting the schedule based on your actual needs. For example, if hourly runs are sufficient, you could use "0 * * * *".

Also, note that:

  1. The concurrencyPolicy: Forbid ensures that new jobs won't start if the previous one is still running.
  2. The startingDeadlineSeconds: 43200 (12 hours) allows the job to start if the CronJob controller recovers within this period after being down.

53-77: Approve init containers with suggestions for improvement

The use of init containers to ensure the readiness of dependent services (vald-agent and vald-discoverer) is a good practice. It helps prevent the main container from starting before the required services are available.

Suggestions for improvement:

  1. Consider adding a timeout to the wait loops to prevent infinite waiting if a service never becomes ready.
  2. Add error messages to stderr for better debugging.
  3. Use more descriptive variable names for better readability.

Here's an example of how you could improve the wait-for-agent script:

TIMEOUT=300  # 5 minutes timeout
AGENT_URL="http://vald-agent.default.svc.cluster.local:3001/readiness"

start_time=$(date +%s)
while true; do
  response_code=$(wget --server-response --spider --quiet "$AGENT_URL" 2>&1 | awk 'NR==1{print $2}')
  if [ "$response_code" = "200" ]; then
    echo "Agent is ready" >&2
    exit 0
  fi
  
  current_time=$(date +%s)
  elapsed_time=$((current_time - start_time))
  if [ $elapsed_time -ge $TIMEOUT ]; then
    echo "Timeout waiting for agent to be ready" >&2
    exit 1
  fi
  
  echo "Waiting for agent to be ready... (elapsed time: ${elapsed_time}s)" >&2
  sleep 2
done

Apply similar improvements to the wait-for-discoverer script.


79-140: Approve container configuration with a suggestion for image versioning

The main container configuration is well-structured with several good practices:

  1. Proper naming and image specification.
  2. Configuration mounted as a volume from a ConfigMap.
  3. Well-configured liveness, readiness, and startup probes with appropriate timing and thresholds.
  4. Necessary ports exposed for various functions (liveness, readiness, gRPC, pprof).
  5. Environment variables correctly set using the Kubernetes downward API.

However, there's one important suggestion:

Consider using a specific version tag for the image in production environments.

The current image specification uses the "nightly" tag:

image: "vdaas/vald-index-deletion:nightly"

While this is fine for development or staging environments, for production, it's recommended to use a specific version tag to ensure consistency and reproducibility. For example:

image: "vdaas/vald-index-deletion:v1.7.14"

This practice helps in maintaining a stable and predictable deployment, making it easier to track which version is running and to roll back if necessary.

cmd/index/job/deletion/sample.yaml (1)

16-21: Consider updating version and reviewing time zone and log level settings.

  1. The version is set to v0.0.0, which appears to be a placeholder. Consider updating it to reflect the actual version of the software.
  2. The time zone is set to JST. Ensure this aligns with your project's requirements, especially if you have a globally distributed system.
  3. The log level is set to info. For production environments, you might want to consider using warn or error to reduce log volume.
hack/cspell/main.go (1)

93-93: LGTM! Consider sorting the mandatoryIgnorePaths slice for consistency.

The addition of "**/cmd/index/job/creation/index-deletion" to the mandatoryIgnorePaths slice is appropriate and aligns with the PR's objective of implementing a delete expired index job. This change will prevent unnecessary spell-checking errors in the newly added feature.

For improved readability and easier maintenance, consider sorting the mandatoryIgnorePaths slice alphabetically. This can be done by adding a sort.Strings(mandatoryIgnorePaths) call after all the paths have been added to the slice.

Makefile.d/docker.mk (1)

348-353: LGTM: Correct implementation of index-deletion build target

The new docker/build/index-deletion target is correctly implemented, following the established pattern for other image build targets in the Makefile. It properly specifies the Dockerfile path and uses the $(INDEX_DELETION_IMAGE) variable.

For consistency with other similar targets, consider adding a comment describing the purpose of this target:

 .PHONY: docker/build/index-deletion
-## build index-deletion image
+## Build index-deletion image for deleting expired indexes
 docker/build/index-deletion:

This minor change would align the comment style with other build targets in the file.

k8s/index/job/deletion/configmap.yaml (4)

30-35: Consider updating the version and reviewing logging level

The version is set to v0.0.0, which might need to be updated to reflect the actual version of the service. Additionally, the logging level is set to debug. While this is fine for development, consider setting it to a less verbose level (e.g., info) for production environments to reduce log volume.


36-190: Server configurations look good, consider adjusting health check timeouts

The server configurations for gRPC, health checks, and metrics are well-structured and comprehensive. However, for the health check servers (liveness and readiness), consider setting explicit timeouts for handler, read, and write operations to ensure proper behavior in case of issues.

You might want to add the following timeouts to both liveness and readiness server configurations:

http:
  handler_timeout: 5s
  read_timeout: 5s
  write_timeout: 5s

191-201: Consider reducing the full shutdown duration

The startup and shutdown strategies look good, ensuring proper order of operations. However, the full_shutdown_duration is set to 600s (10 minutes), which seems quite long. Consider reducing this duration to a more reasonable value (e.g., 2-5 minutes) unless there's a specific reason for such a long shutdown period.


208-238: Consider enabling observability for better monitoring

The observability configuration is currently disabled (enabled: false). For production environments, it's recommended to enable observability to facilitate monitoring, debugging, and performance analysis. When enabled, the configuration for OTLP, metrics, and tracing looks comprehensive.

Consider enabling observability:

observability:
  enabled: true

Also, verify that the OTLP collector endpoint is correctly set and that the attribute values (namespace, pod_name, node_name) are properly injected at runtime.

pkg/index/job/deletion/config/config.go (1)

21-22: Unnecessary Type Alias for GlobalConfig.

The type alias type GlobalConfig = config.GlobalConfig may be redundant since it's not adding any new functionality or abstraction.

Consider removing the type alias to simplify the code:

-// GlobalConfig is a type alias of config.GlobalConfig representing application base configurations.
-type GlobalConfig = config.GlobalConfig

This reduces confusion and maintains clarity by referring directly to config.GlobalConfig.

dockers/index/job/deletion/Dockerfile (1)

43-43: Environment variable USER is redundant.

The USER environment variable is set to root, but this is the default in the builder stage. Unless it's required explicitly, it can be removed.

Apply this diff to remove the redundant environment variable:

 ENV TZ=Etc/UTC
-ENV USER=root
 ENV PATH=${GOPATH}/bin:${GOROOT}/bin:/usr/local/bin:${PATH}
pkg/index/job/deletion/service/deleter.go (5)

48-52: Ensure the correct error is logged for option failures

When an option fails to apply, you are logging err instead of the wrapped error oerr, which may contain more context. Consider logging oerr to provide detailed error information.

Apply this diff to log the wrapped error:

if errors.As(oerr, &e) {
-    log.Error(err)
+    log.Error(oerr)
    return nil, oerr
}

79-128: Consider adding context to returned errors in Start method

In the Start method, when idx.doDeleteIndex returns an error, consider wrapping it with additional context to provide more clarity about where the error originated.

Example modification:

if err != nil {
    // Existing error handling
    ...
+   return fmt.Errorf("failed to start deletion process: %w", err)
}
return nil

35-41: Export the index struct or rename it for clarity

The struct index is unexported but implements the exported Deleter interface. To align with Go naming conventions and improve clarity, consider renaming the struct to deleter or exporting it as Index.

Option 1: Rename to deleter

-type index struct {
+type deleter struct {
    ...
}

Option 2: Export as Index

-type index struct {
+type Index struct {
    ...
}

74-77: Provide more descriptive documentation for StartClient

The comment for StartClient is minimal. Adding a more detailed description of what the method does can improve code readability and maintainability.

Update the comment to describe the method's purpose:

-// StartClient starts the gRPC client.
+// StartClient initializes and starts the gRPC client, returning a channel to receive any startup errors.

61-72: Simplify the delDuplicateAddrs function using a set

Consider simplifying the delDuplicateAddrs function by using a map as a set to more idiomatically remove duplicates from the slice.

Example:

func delDuplicateAddrs(targetAddrs []string) []string {
-    addrs := make([]string, 0, len(targetAddrs))
+    exist := make(map[string]struct{}, len(targetAddrs))
+    for _, addr := range targetAddrs {
+        exist[addr] = struct{}{}
+    }
+    addrs := make([]string, 0, len(exist))
+    for addr := range exist {
+        addrs = append(addrs, addr)
+    }
     return addrs
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 0a8b332 and f80bdc5.

⛔ Files ignored due to path filters (2)
  • hack/actions/gen/main.go is excluded by !**/gen/**
  • hack/docker/gen/main.go is excluded by !**/gen/**
📒 Files selected for processing (21)
  • .cspell.json (1 hunks)
  • .gitfiles (1 hunks)
  • .github/actions/detect-docker-image-tags/action.yaml (1 hunks)
  • .github/workflows/dockers-image-scan.yaml (1 hunks)
  • .github/workflows/dockers-index-deletion-image.yaml (1 hunks)
  • .github/workflows/dockers-index-deletion.yaml (1 hunks)
  • .github/workflows/dockers-release-branch-image.yaml (1 hunks)
  • Makefile (1 hunks)
  • Makefile.d/build.mk (4 hunks)
  • Makefile.d/docker.mk (3 hunks)
  • cmd/index/job/deletion/main.go (1 hunks)
  • cmd/index/job/deletion/sample.yaml (1 hunks)
  • dockers/index/job/deletion/Dockerfile (1 hunks)
  • hack/cspell/main.go (1 hunks)
  • internal/config/index_deletor.go (1 hunks)
  • k8s/index/job/deletion/configmap.yaml (1 hunks)
  • k8s/index/job/deletion/cronjob.yaml (1 hunks)
  • pkg/index/job/deletion/config/config.go (1 hunks)
  • pkg/index/job/deletion/service/deleter.go (1 hunks)
  • pkg/index/job/deletion/service/options.go (1 hunks)
  • pkg/index/job/deletion/usecase/deletion.go (1 hunks)
🧰 Additional context used
🔇 Additional comments (31)
pkg/index/job/deletion/service/options.go (3)

11-19: LGTM: Well-implemented functional option for discoverer client

The WithDiscoverer function is well-implemented. It follows the functional options pattern, includes proper nil checking, and uses custom error types. This approach enhances the configurability and robustness of the index structure.


22-30: LGTM: Proper implementation of indexing concurrency option

The WithIndexingConcurrency function is correctly implemented. It includes appropriate input validation, uses consistent error handling, and follows the functional options pattern. This ensures that only valid concurrency values are set for the index.


1-9: LGTM: Well-structured file with appropriate imports

The file is well-organized with appropriate package naming and imports. The Option type is correctly defined, setting up the functional options pattern used throughout the file. This structure promotes clean and maintainable code.

.github/workflows/dockers-index-deletion-image.yaml (2)

1-17: LGTM: Proper license and file generation information.

The file includes the correct Apache 2.0 license header and a note indicating that it's auto-generated. This is good practice for maintaining consistency and clarity in the project.


75-80: LGTM: Efficient use of reusable workflow.

The job definition is concise and makes good use of a reusable workflow, which promotes consistency and reduces duplication in the CI/CD pipeline.

To ensure the referenced workflow file exists and is correctly configured, please run the following command:

.github/workflows/dockers-index-deletion.yaml (4)

1-18: LGTM: File header and licensing information.

The file header contains appropriate licensing information (Apache License 2.0) and a note indicating that this file is auto-generated. This is good practice for maintaining consistency and compliance.


75-80: LGTM: Job definition is concise and utilizes reusable workflow.

The job definition is well-structured:

  1. It uses a reusable workflow (_docker-image.yaml), which promotes code reuse and maintainability.
  2. It passes "index-deletion" as the target, allowing for specific configuration.
  3. It inherits secrets from the calling workflow, which is a secure way to pass sensitive information.

This approach is efficient and follows good practices for GitHub Actions workflows.


1-80: Overall, the workflow file is well-structured and follows good practices.

The file .github/workflows/dockers-index-deletion.yaml is a well-designed GitHub Actions workflow for building a Docker image for the index-deletion functionality. It includes appropriate licensing information, clear naming, comprehensive trigger events, and efficient job definition using a reusable workflow.

The only point of consideration is the use of the pull_request_target event, which should be handled with care to avoid potential security risks. Ensure that proper security measures are in place when working with this event type.


19-74: LGTM: Workflow name and trigger events are well-defined.

The workflow name clearly describes its purpose, and the trigger events are comprehensive, covering both direct pushes and pull requests. The path filters are detailed and appear to include all relevant files for the index-deletion functionality.

However, please note the use of the pull_request_target event. While it's useful for accessing secrets in pull requests from forks, it can pose security risks if not handled carefully. Ensure that you're aware of these risks and have implemented appropriate safeguards.

To verify the security measures for pull_request_target, you can run the following script:

✅ Verification successful

To ensure the pull_request_target event is securely configured, please run the following script to check for any usage of secrets or environment variables within the workflow steps:


LGTM: Security measures for pull_request_target appear properly configured.

The workflow uses secrets: inherit, and no additional usage of secrets or environment variables was detected within the workflow steps. This indicates that secrets are not being exposed or misused beyond inheriting them securely.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for potential security risks in pull_request_target usage

# Test: Look for any write operations or secret usage in the workflow
rg -i 'secrets:|env:' .github/workflows/dockers-index-deletion.yaml

Length of output: 90


Script:

#!/bin/bash
# Description: Check for usage of secrets or environment variables in workflow steps, excluding 'secrets: inherit'

# Search for 'secrets:' or 'env:' in the workflow file, excluding lines with 'secrets: inherit'
rg -i 'secrets:|env:' .github/workflows/dockers-index-deletion.yaml | grep -v 'secrets: inherit'

Length of output: 98

.github/workflows/dockers-image-scan.yaml (1)

73-76: LGTM! New job added correctly.

The addition of the index-deletion job is well-implemented and consistent with the existing structure of the workflow. It correctly uses the _docker-image-scan.yaml workflow and targets the new index-deletion component.

To ensure consistency, let's verify that the new job follows the same pattern as existing ones:

✅ Verification successful

Verification successful: Job definitions are consistent.

All job definitions in .github/workflows/dockers-image-scan.yaml adhere to the established structure, ensuring consistency across the workflow.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify consistency of job definitions in the workflow file

# Test: Check if all jobs follow the same structure
# Expect: All job definitions to have the same structure (3 lines, uses and with sections)
rg -A 3 '^\s{2}\w+(-\w+)*:$' .github/workflows/dockers-image-scan.yaml

Length of output: 2226

.github/actions/detect-docker-image-tags/action.yaml (1)

51-51: LGTM: New Docker image mapping added correctly

The addition of the mapping for vdaas/vald-index-deletion to manager.index.delete.image.tag is correct and consistent with the existing structure. This change aligns well with the PR objective of implementing a feature for deleting expired indexes using a Kubernetes Job.

.github/workflows/dockers-release-branch-image.yaml (1)

91-96: LGTM: New index-deletion job added correctly

The new index-deletion job has been added with the correct structure and configuration. It follows the same pattern as other jobs in the workflow:

  • Depends on the dump-contexts-to-log job
  • Uses the _docker-image.yaml workflow file
  • Specifies the appropriate target
  • Inherits secrets from the workflow context

The job is also placed in a logical order within the file, near other index-related jobs. This addition is consistent with the overall structure of the workflow and aligns with the PR objectives of implementing a feature for deleting expired indexes.

k8s/index/job/deletion/cronjob.yaml (2)

35-51: Well-configured job template with proper labeling and profiling

The job template is well-structured with several beneficial configurations:

  1. The ttlSecondsAfterFinished: 86400 ensures that completed jobs are automatically cleaned up after 24 hours, which is good for resource management.
  2. The labels are consistent with the CronJob metadata, following Kubernetes best practices for resource identification and management.
  3. The Pyroscope annotations enable CPU and memory profiling, which will be valuable for performance analysis and optimization of the index deletion process.

These configurations contribute to better manageability and observability of the index deletion jobs.


141-146: Approve pod spec configuration

The pod spec configuration is well-defined and follows Kubernetes best practices:

  1. The restartPolicy: OnFailure is appropriate for a job. It ensures that the container will be restarted if it fails, allowing for automatic recovery from transient errors.

  2. The use of a ConfigMap for the vald-index-deletion-config volume is an excellent practice. It allows for externalized configuration that can be updated independently of the pod spec. This separation of configuration from the application code enhances maintainability and flexibility.

These configurations contribute to the robustness and manageability of the index deletion job.

Makefile.d/build.mk (4)

27-27: LGTM: New index deletion job added to build target

The addition of cmd/index/job/deletion/index-deletion to the binary/build target is consistent with the PR objective and follows the existing pattern for other components.


83-85: LGTM: Build command for index deletion job is correct

The build command for the index deletion job is correctly implemented:

  • It follows the same pattern as other similar jobs.
  • CGO_ENABLED is set to 0, which is appropriate for a static build.
  • The go-build function is called with the correct parameters.

130-130: LGTM: Zip target for index deletion job artifact correctly implemented

The changes related to zipping the index deletion job artifact are correctly implemented:

  • The artifact is properly added to the binary/build/zip target.
  • The zip command follows the same pattern as other artifacts.
  • The use of $(GOOS) and $(GOARCH) ensures correct naming for the target platform.

Also applies to: 190-193


27-27: Overall assessment: Changes are well-implemented and consistent

The additions for the index deletion job in this Makefile are well-implemented and consistent with the existing patterns. They correctly integrate the new job into the build and packaging process, aligning with the PR objective of implementing an index deletion feature.

Also applies to: 83-85, 130-130, 190-193

cmd/index/job/deletion/sample.yaml (1)

71-140: Review deletor configuration and environment-specific settings.

  1. The index_id is set to "sample" (line 72), which appears to be a placeholder. Ensure this is updated with the correct index ID for your use case.
  2. The agent and discoverer configurations (lines 73-140) are comprehensive. However, some settings might need adjustment based on your specific environment:
    • Review the agent_dns and agent_namespace (lines 75-76) to ensure they match your Kubernetes cluster configuration.
    • The DNS cache settings (lines 127-128) might need tuning based on your environment's DNS stability and change frequency.
  3. The concurrency setting is set to 1 (line 78). Depending on your system's capacity and requirements, you might want to increase this for better performance.

To ensure the configuration is correct for your environment, you can run the following script to check the existence of the specified services:

#!/bin/bash
# Verify the existence of specified services
echo "Checking service DNS resolution:"
services=("vald-agent-ngt.default.svc.cluster.local" "vald-discoverer.default.svc.cluster.local")
for service in "${services[@]}"; do
  if nslookup "$service" >/dev/null 2>&1; then
    echo "$service resolves correctly"
  else
    echo "$service does not resolve"
  fi
done
Makefile.d/docker.mk (2)

37-37: LGTM: Correct integration of index-deletion build target

The addition of docker/build/index-deletion to the docker/build target is correct and consistent with the existing structure. This ensures that the new index-deletion image will be built along with other images in the project.


344-346: LGTM: Correct implementation of index-deletion name target

The new docker/name/index-deletion target is correctly implemented, following the established pattern for other image name targets in the Makefile. It properly uses the $(ORG) and $(INDEX_DELETION_IMAGE) variables to echo the full image name.

k8s/index/job/deletion/configmap.yaml (1)

1-415: Overall, a well-structured configuration with room for production-ready improvements

The ConfigMap provides a comprehensive configuration for the Vald index deletion service. However, several areas need attention before deploying to production:

  1. Update the version number from v0.0.0.
  2. Enable and properly configure TLS for all components.
  3. Enable observability for better monitoring and debugging.
  4. Review and adjust timeouts, especially for health checks and shutdown duration.
  5. Ensure the deletor configuration, particularly the index_id and concurrency, is set correctly.
  6. Populate empty address arrays and review security settings in client configurations.

Please address these points to enhance the security, reliability, and production-readiness of the service.

Makefile (1)

43-43: LGTM: New variable for index deletion image added.

The addition of INDEX_DELETION_IMAGE = $(NAME)-index-deletion is consistent with the naming convention and structure of other image-related variables in the Makefile. This new variable aligns well with the PR objective of implementing a delete expired index job.

.cspell.json (1)

44-44: Appropriate addition to spell-check ignore paths.

The new path "**/cmd/index/job/deletion/index-deletion" has been added to the ignorePaths array. This is consistent with the pull request's objective of implementing a delete expired index job. By ignoring this path during spell-checking, it prevents potential false positives for technical terms or identifiers specific to the new feature.

.gitfiles (1)

99-99: New workflow file added for index deletion image.

The addition of dockers-index-deletion-image.yaml aligns with the PR objective of implementing a feature for deleting expired indexes. This workflow file is likely responsible for building and managing the Docker image for the index deletion process.

To ensure this new workflow file is properly integrated and doesn't introduce any issues, please run the following verification script:

✅ Verification successful

New workflow file for index deletion image verified successfully.

The dockers-index-deletion-image.yaml has been properly added and referenced only within relevant workflow configurations. No unintended references were found, ensuring smooth integration within the CI/CD pipeline.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify the new workflow file and its references
echo "Checking the content of the new workflow file:"
cat .github/workflows/dockers-index-deletion-image.yaml

echo "Checking for references to the new workflow in other files:"
grep -R "dockers-index-deletion-image.yaml" .

Length of output: 3129

cmd/index/job/deletion/main.go (4)

1-13: Proper inclusion of Apache 2.0 License Header

The file includes the standard Apache 2.0 License header, ensuring compliance with open-source licensing requirements.


14-26: Imports are well-organized and necessary

All imported packages are appropriately grouped and utilized within the code. Standard library imports are separated from third-party and internal packages, enhancing readability.


28-32: Version constants and job name are correctly defined

The constants maxVersion, minVersion, and name are properly set to manage version constraints and identify the job.


34-59: Main function initializes the job runner correctly

The main function uses safety.RecoverFunc to handle panics gracefully. It sets up the runner with the correct context, name, version constraints, configuration loader, and daemon initializer. Error handling is appropriately managed, and any initialization errors are logged using log.Fatal.

dockers/index/job/deletion/Dockerfile (2)

68-68: Run ldconfig may not be necessary.

Running ldconfig is typically not needed in container images unless shared libraries have been added to custom locations. Verify if this step is required.

Please confirm whether ldconfig is necessary in the builder stage. If it's not required, consider removing it to streamline the build process.


86-86: Confirm non-root user permissions.

The final stage sets the user to nonroot:nonroot. Ensure that the application does not require elevated permissions and that all necessary files and directories are accessible to the non-root user.

If the application needs access to specific resources, confirm that permissions are correctly set.

cmd/index/job/deletion/sample.yaml Outdated Show resolved Hide resolved
cmd/index/job/deletion/sample.yaml Show resolved Hide resolved
cmd/index/job/deletion/sample.yaml Outdated Show resolved Hide resolved
cmd/index/job/deletion/sample.yaml Show resolved Hide resolved
k8s/index/job/deletion/configmap.yaml Show resolved Hide resolved
pkg/index/job/deletion/usecase/deletion.go Outdated Show resolved Hide resolved
pkg/index/job/deletion/service/deleter.go Outdated Show resolved Hide resolved
pkg/index/job/deletion/service/deleter.go Outdated Show resolved Hide resolved
pkg/index/job/deletion/service/deleter.go Show resolved Hide resolved
pkg/index/job/deletion/service/deleter.go Show resolved Hide resolved
@highpon highpon force-pushed the feat/implement-delete-expired-index branch 2 times, most recently from d4fbcf9 to 3bdb7cf Compare October 20, 2024 12:11
@kpango kpango requested review from kpango and vankichi October 24, 2024 04:38
vankichi
vankichi previously approved these changes Oct 24, 2024
Copy link
Contributor

@vankichi vankichi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM :) Thanks for your great job 🎉

Copy link
Collaborator

@kpango kpango left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, it looks good, but I commented that there were a few package references that were different from other components.

import (
"context"
"reflect"
"strings"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use internal/strings instead of official string package.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review!
fix: c7172a1


import (
"context"
"log"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use internal/log instead of official logger.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review!
fix: c7172a1

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (2)
internal/config/index_deleter.go (2)

17-47: Add field validation and improve documentation.

The struct fields would benefit from additional validation and clearer documentation:

  1. AgentPort should be validated for the valid port range (1-65535)
  2. Concurrency and CreationPoolSize should have documented default values
  3. Required fields should be clearly marked in the documentation

Consider adding validation in the Bind() method:

 func (ic *IndexDeleter) Bind() *IndexDeleter {
+    if ic.AgentPort <= 0 || ic.AgentPort > 65535 {
+        ic.AgentPort = 8080 // or another suitable default
+    }
+    if ic.Concurrency <= 0 {
+        ic.Concurrency = 1 // or another suitable default
+    }
+    if ic.CreationPoolSize == 0 {
+        ic.CreationPoolSize = 1 // or another suitable default
+    }

17-47: Consider adding expiry time configuration.

Given that this PR implements deletion of expired indexes, consider adding fields to support expiry time configuration:

 type IndexDeleter struct {
     // ... existing fields ...
+
+    // ExpiryDuration represents the duration after which indexes are considered expired
+    ExpiryDuration string `json:"expiry_duration" yaml:"expiry_duration"`
+
+    // DeleteBatchSize represents the maximum number of indexes to delete in one batch
+    DeleteBatchSize int `json:"delete_batch_size" yaml:"delete_batch_size"`
 }

This would provide more control over the deletion process and help prevent overloading the system when deleting multiple expired indexes.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between f80bdc5 and ab8671f.

⛔ Files ignored due to path filters (1)
  • hack/docker/gen/main.go is excluded by !**/gen/**
📒 Files selected for processing (9)
  • Makefile (1 hunks)
  • Makefile.d/build.mk (4 hunks)
  • Makefile.d/docker.mk (3 hunks)
  • cmd/index/job/deletion/sample.yaml (1 hunks)
  • internal/config/index_deleter.go (1 hunks)
  • k8s/index/job/deletion/configmap.yaml (1 hunks)
  • pkg/index/job/deletion/config/config.go (1 hunks)
  • pkg/index/job/deletion/service/deleter.go (1 hunks)
  • pkg/index/job/deletion/usecase/deletion.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • Makefile
  • Makefile.d/build.mk
  • k8s/index/job/deletion/configmap.yaml
  • pkg/index/job/deletion/config/config.go
  • pkg/index/job/deletion/usecase/deletion.go
🔇 Additional comments (9)
cmd/index/job/deletion/sample.yaml (2)

210-214: ⚠️ Potential issue

Update observability attribute placeholders.

The attribute configuration uses placeholder values that need to be replaced:

    attribute:
      namespace: "_MY_POD_NAMESPACE_"
      pod_name: "_MY_POD_NAME_"
      node_name: "_MY_NODE_NAME_"

These values should be replaced with actual Kubernetes downward API references in your deployment configuration.


72-78: Validate agent configuration and DNS settings.

  • The index_id: "sample" is a placeholder that needs to be replaced with your actual index ID
  • Verify that the agent DNS vald-agent-ngt.default.svc.cluster.local is correctly configured in your cluster
  • Consider setting node_name if you need to target specific nodes
✅ Verification successful

Let me try a different approach to verify the agent configuration and DNS settings.

This script will help us:

  1. Find all agent-related configurations to compare with the sample
  2. Look for service definitions that might define the DNS
  3. Check deployment configurations for agent components

Let me know the results, and I'll provide a comprehensive verification of the configuration.


Based on the search results, I can now provide a comprehensive verification of the agent configuration. The results show that:

  1. The agent DNS pattern vald-agent-ngt.default.svc.cluster.local is consistent with Kubernetes DNS naming conventions
  2. The configuration matches other similar configurations in the codebase (e.g., in k8s/gateway/gateway/lb/configmap.yaml)
  3. The agent service definitions exist in the codebase (k8s/agent/svc.yaml)
  4. The port 8081 is consistently used across agent configurations

Let me generate the final response:

Agent configuration is valid but requires environment-specific adjustments

  • The index_id: "sample" needs to be replaced with your actual index ID as noted
  • The agent configuration (agent_port: 8081, agent_name: "vald-agent-ngt", agent_dns: vald-agent-ngt.default.svc.cluster.local) follows the standard Vald deployment pattern
  • The agent_namespace: "default" is valid but should be adjusted if you're deploying to a different namespace
  • The node_name can remain empty unless you need node-specific targeting
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify DNS configuration and service existence
echo "Checking for agent service definition:"
fd -t f "vald-agent-ngt" -e yaml -e yml

Length of output: 180


Script:

#!/bin/bash
# Search for agent-related configurations and service definitions
echo "Searching for agent configurations:"
rg -A 5 "agent_dns|agent_name|agent_port" --type yaml

echo -e "\nSearching for service definitions:"
rg -A 5 "kind:\s+Service" --type yaml

echo -e "\nSearching for agent deployments:"
rg -A 5 "kind:\s+Deployment.*agent" --type yaml

Length of output: 70641

Makefile.d/docker.mk (2)

39-39: LGTM! Build target dependencies properly updated.

The index-deletion image has been correctly added to both the sequential and parallel build targets while maintaining alphabetical order.

Also applies to: 67-67


346-356: LGTM! Index deletion targets follow established patterns.

The new targets for building the index-deletion image are well-structured and consistent with other index-related image targets in the Makefile:

  • Proper .PHONY declarations
  • Clear documentation comments
  • Consistent naming conventions
  • Correct Dockerfile path pattern
  • Standard build parameters
pkg/index/job/deletion/service/deleter.go (5)

45-59: Well-structured New function with appropriate option handling

The New function correctly initializes the Deleter with default and provided options, handling errors appropriately and ensuring that critical errors halt the initialization process.


61-72: Efficient removal of duplicate addresses in delDuplicateAddrs

The delDuplicateAddrs function effectively removes duplicates from the list of target addresses, ensuring that each address is unique before proceeding.


79-128: Robust error handling and tracing in the Start method

The Start method effectively initiates the deletion process with comprehensive error handling and integration with tracing for observability.


146-148: ⚠️ Potential issue

Handle connection errors without halting the process

Currently, if a connection to any target address fails, the function returns an error immediately, stopping the deletion process for other targets. Consider modifying the error handling to log the error and continue processing the remaining addresses.

Apply this diff to handle connection errors gracefully:

for _, addr := range idx.targetAddrs {
    log.Infof("connect to target agent (%s)", addr)
    if _, err := idx.client.GetClient().Connect(ctx, addr); err != nil {
-       return err
+       log.Errorf("failed to connect to target agent (%s): %v", addr, err)
+       continue
    }
}

This change allows the process to attempt to connect to all specified addresses, even if some connections fail.

Likely invalid or redundant comment.


172-177: ⚠️ Potential issue

Use the correct status wrapper for context.DeadlineExceeded errors

When handling context.DeadlineExceeded, you should use status.WrapWithDeadlineExceeded instead of status.WrapWithCanceled to return the appropriate gRPC status code.

Apply this diff to fix the error handling:

case errors.Is(err, context.DeadlineExceeded):
-   err = status.WrapWithCanceled(
+   err = status.WrapWithDeadlineExceeded(
        vald.RemoveRPCName+" API deadline exceeded", err,
    )
    attrs = trace.StatusCodeDeadlineExceeded(err.Error())

This adjustment ensures that clients receive the correct status code, accurately reflecting the nature of the timeout error.

Likely invalid or redundant comment.

internal/config/index_deleter.go Show resolved Hide resolved
cmd/index/job/deletion/sample.yaml Show resolved Hide resolved
cmd/index/job/deletion/sample.yaml Show resolved Hide resolved
pkg/index/job/deletion/service/deleter.go Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Outside diff range and nitpick comments (4)
internal/config/index_deleter.go (2)

36-37: Clarify the purpose of Concurrency field.

The comment "represents indexing concurrency" is unclear in the context of an index deletion job. Consider updating it to explicitly state its role in the deletion process.

-	// Concurrency represents indexing concurrency.
+	// Concurrency represents the number of concurrent deletion operations.

17-47: Consider adding expiration and scheduling configuration.

Based on the PR objectives to support both one-time and periodic deletion of expired indexes, consider adding fields for:

  1. Expiration criteria (e.g., age threshold, last access time)
  2. Schedule configuration for periodic deletion

Example additions:

type IndexDeleter struct {
    // ... existing fields ...

    // ExpirationCriteria defines when an index is considered expired
    ExpirationCriteria struct {
        // AgeThreshold represents the minimum age (in days) for an index to be considered expired
        AgeThreshold int `json:"age_threshold" yaml:"age_threshold"`
        // LastAccessThreshold represents the minimum time (in days) since last access
        LastAccessThreshold int `json:"last_access_threshold" yaml:"last_access_threshold"`
    } `json:"expiration_criteria" yaml:"expiration_criteria"`

    // Schedule configuration for periodic deletion (optional, used by CronJob)
    Schedule string `json:"schedule,omitempty" yaml:"schedule,omitempty"`
}
cmd/index/job/deletion/sample.yaml (1)

16-17: Review version and timezone settings.

  • The version v0.0.0 appears to be a placeholder. Consider setting this to match your actual release version.
  • The hardcoded timezone JST might not be suitable for all deployments. Consider making this configurable or using UTC as a default.
pkg/index/job/deletion/service/deleter.go (1)

154-154: Rename emu to errsMu for clarity

The variable emu is used as a mutex to protect access to the errs variable. Renaming it to errsMu or errsMutex will make its purpose clearer.

Apply this diff to rename the mutex variable:

-var emu sync.Mutex
+var errsMu sync.Mutex
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between f80bdc5 and ab8671f.

⛔ Files ignored due to path filters (1)
  • hack/docker/gen/main.go is excluded by !**/gen/**
📒 Files selected for processing (9)
  • Makefile (1 hunks)
  • Makefile.d/build.mk (4 hunks)
  • Makefile.d/docker.mk (3 hunks)
  • cmd/index/job/deletion/sample.yaml (1 hunks)
  • internal/config/index_deleter.go (1 hunks)
  • k8s/index/job/deletion/configmap.yaml (1 hunks)
  • pkg/index/job/deletion/config/config.go (1 hunks)
  • pkg/index/job/deletion/service/deleter.go (1 hunks)
  • pkg/index/job/deletion/usecase/deletion.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • Makefile
  • Makefile.d/build.mk
  • k8s/index/job/deletion/configmap.yaml
  • pkg/index/job/deletion/config/config.go
  • pkg/index/job/deletion/usecase/deletion.go
🔇 Additional comments (9)
internal/config/index_deleter.go (1)

1-14: LGTM: License header and package declaration are correct.

cmd/index/job/deletion/sample.yaml (3)

28-44: Review gRPC server configuration defaults.

Several gRPC settings are left at their zero/empty values. Consider setting appropriate values for:

  • connection_timeout
  • keepalive settings
  • Buffer and message sizes
    These settings are crucial for production deployments to handle connection management and resource utilization effectively.

61-61: Empty metrics_servers configuration.

The metrics_servers section is empty. Consider configuring this section if you plan to expose metrics for monitoring the deletion job's performance.


72-78: Review deleter configuration for production readiness.

  • The index_id: "sample" needs to be replaced with your actual index ID.
  • Consider if the default namespace is appropriate for your deployment.
  • The empty node_name might need to be set depending on your scheduling requirements.
Makefile.d/docker.mk (3)

39-39: LGTM: Correctly added index-deletion target

The addition of docker/build/index-deletion to the build targets follows the established pattern and maintains alphabetical ordering.


67-67: LGTM: Correctly added index-deletion to xpanes build

The addition of docker/build/index-deletion to the xpanes build list follows the established pattern and maintains alphabetical ordering.


346-356: LGTM with verification needed

The implementation of the index deletion targets follows the established patterns correctly. However, please verify:

  1. The existence of the Dockerfile at dockers/index/job/deletion/Dockerfile
  2. The definition of INDEX_DELETION_IMAGE variable in the Makefile
✅ Verification successful

Since both verification checks have passed successfully:

  1. The Dockerfile exists at the specified path dockers/index/job/deletion/Dockerfile
  2. The INDEX_DELETION_IMAGE variable is properly defined in the Makefile as $(NAME)-index-deletion

I can now generate the final response:

All required dependencies verified successfully

The implementation of the index deletion targets is correct and all dependencies are in place:

  • ✓ Dockerfile exists at the specified path
  • INDEX_DELETION_IMAGE variable is properly defined
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the existence of required files and variables

# Test 1: Check if the Dockerfile exists
echo "Checking for Dockerfile..."
if [ -f "dockers/index/job/deletion/Dockerfile" ]; then
    echo "✓ Dockerfile exists"
else
    echo "✗ Dockerfile not found at dockers/index/job/deletion/Dockerfile"
fi

# Test 2: Check if INDEX_DELETION_IMAGE is defined in Makefile
echo "Checking for INDEX_DELETION_IMAGE variable..."
rg "INDEX_DELETION_IMAGE.*=.*" -g "Makefile*"

Length of output: 377

pkg/index/job/deletion/service/deleter.go (2)

79-128: Start method implementation is correct

The Start method properly initiates the deletion process, handles error wrapping, and incorporates tracing effectively.


88-94: Verify the correctness of targetIndexID usage

The Remove method is called with idx.targetIndexID. Ensure that targetIndexID is correctly set and that all components that rely on this value are appropriately handled.

Run the following script to check the usage of targetIndexID:

✅ Verification successful

Let me analyze the current findings and generate additional verification scripts to ensure complete understanding of the targetIndexID initialization and usage flow.

This will help us understand:

  1. How the field is defined in the struct
  2. If there are any other places where it might be modified
  3. The complete initialization context in options.go

Let me generate one final verification to check the initialization flow and usage in the constructor:

This will help us understand how the index struct is created and if there are any validation checks for the targetIndexID field before usage.


Based on the comprehensive analysis of the code, I can now generate the final response:

targetIndexID initialization and usage is properly handled

The verification shows that:

  1. targetIndexID is properly defined as a string field in the index struct
  2. It's initialized through the WithTargetIndexID option function, which is part of the standard options pattern
  3. The initialization happens before usage, as the New function applies all options (including WithTargetIndexID) before returning the instance
  4. The usage in the Remove call is consistent with its initialization
  5. Error handling is in place with appropriate logging when deletion operations fail
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that `targetIndexID` is set and used correctly across the codebase.

# Test: Search for assignments to `idx.targetIndexID`. Expect: Proper initialization before use.
rg -A 3 'idx\.targetIndexID\s*='

# Test: Search for usage of `idx.targetIndexID` in function calls. Expect: Consistent usage.
rg -A 2 'idx\.targetIndexID'

Length of output: 1511


Script:

#!/bin/bash
# Let's verify the initialization flow and struct definition

# Check the struct definition and constructor
ast-grep --pattern 'type $_ struct {
  $$$
  targetIndexID $_
  $$$
}'

# Check if there are any other methods that might modify targetIndexID
ast-grep --pattern 'func ($_ *$_) $_($$_) {
  $$$
  targetIndexID = $_
  $$$
}'

# Check the initialization flow in options.go
rg -B 5 -A 2 'targetIndexID = indexID'

Length of output: 1167


Script:

#!/bin/bash
# Check the New function and initialization flow
ast-grep --pattern 'func New($$$) ($_, error) {
  $$$
}'

# Check if there are any validation checks for targetIndexID
rg -B 2 -A 5 'func New' pkg/index/job/deletion/service/deleter.go

Length of output: 78055

internal/config/index_deleter.go Outdated Show resolved Hide resolved
internal/config/index_deleter.go Outdated Show resolved Hide resolved
internal/config/index_deleter.go Show resolved Hide resolved
cmd/index/job/deletion/sample.yaml Show resolved Hide resolved
pkg/index/job/deletion/service/deleter.go Show resolved Hide resolved
pkg/index/job/deletion/service/deleter.go Show resolved Hide resolved
Copy link

codecov bot commented Oct 24, 2024

Codecov Report

Attention: Patch coverage is 0% with 371 lines in your changes missing coverage. Please review.

Project coverage is 23.92%. Comparing base (d5df286) to head (3ee69c0).
Report is 7 commits behind head on main.

Files with missing lines Patch % Lines
pkg/index/job/deletion/service/deleter.go 0.00% 146 Missing ⚠️
pkg/index/job/deletion/usecase/deletion.go 0.00% 131 Missing ⚠️
pkg/index/job/deletion/config/config.go 0.00% 27 Missing ⚠️
pkg/index/job/deletion/service/options.go 0.00% 25 Missing ⚠️
cmd/index/job/deletion/main.go 0.00% 22 Missing ⚠️
internal/config/index_deleter.go 0.00% 12 Missing ⚠️
hack/actions/gen/main.go 0.00% 4 Missing ⚠️
hack/docker/gen/main.go 0.00% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2702      +/-   ##
==========================================
- Coverage   24.11%   23.92%   -0.19%     
==========================================
  Files         539      545       +6     
  Lines       53953    54324     +371     
==========================================
- Hits        13009    12998      -11     
- Misses      40170    40551     +381     
- Partials      774      775       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

coderabbitai[bot]
coderabbitai bot previously approved these changes Oct 26, 2024
@highpon highpon dismissed stale reviews from coderabbitai[bot] and vankichi via 06cdd2b October 26, 2024 09:55
@highpon highpon force-pushed the feat/implement-delete-expired-index branch from ab8671f to 06cdd2b Compare October 26, 2024 09:55
@highpon highpon force-pushed the feat/implement-delete-expired-index branch from 06cdd2b to 3ee69c0 Compare October 26, 2024 09:56
@highpon highpon requested a review from kpango October 26, 2024 09:57
@highpon
Copy link
Contributor Author

highpon commented Oct 26, 2024

@kpango
I fixed all your points.
When you have time, please check review!

Copy link
Collaborator

@kpango kpango left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Thank you for your Great Contribution.
We needs 2 approvals before merging, our team member will review this PR in few days.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (2)
pkg/index/job/deletion/service/options.go (2)

8-9: Consider adding documentation for the Option type.

While the functional options pattern is well implemented, adding documentation about the unexported index type would improve API clarity. Consider adding a comment explaining what fields can be configured through these options.

+// Option is a function that configures an index instance.
+// The index type supports configuration of:
+// - discoverer client
+// - indexing concurrency
+// - target addresses
+// - target index ID
 type Option func(_ *index) error

10-30: Consider using a named constant for minimum concurrency.

The implementation looks good with proper error handling. Consider defining a constant for the minimum concurrency value to improve maintainability and documentation.

+const (
+    // MinIndexingConcurrency defines the minimum allowed value for indexing concurrency
+    MinIndexingConcurrency = 1
+)

 func WithIndexingConcurrency(num int) Option {
     return func(idx *index) error {
-        if num <= 0 {
+        if num < MinIndexingConcurrency {
             return errors.NewErrInvalidOption("indexingConcurrency", num)
         }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between ab8671f and 3ee69c0.

⛔ Files ignored due to path filters (2)
  • hack/actions/gen/main.go is excluded by !**/gen/**
  • hack/docker/gen/main.go is excluded by !**/gen/**
📒 Files selected for processing (21)
  • .cspell.json (1 hunks)
  • .gitfiles (1 hunks)
  • .github/actions/detect-docker-image-tags/action.yaml (1 hunks)
  • .github/workflows/dockers-image-scan.yaml (1 hunks)
  • .github/workflows/dockers-index-deletion-image.yaml (1 hunks)
  • .github/workflows/dockers-index-deletion.yaml (1 hunks)
  • .github/workflows/dockers-release-branch-image.yaml (1 hunks)
  • Makefile (1 hunks)
  • Makefile.d/build.mk (4 hunks)
  • Makefile.d/docker.mk (3 hunks)
  • cmd/index/job/deletion/main.go (1 hunks)
  • cmd/index/job/deletion/sample.yaml (1 hunks)
  • dockers/index/job/deletion/Dockerfile (1 hunks)
  • hack/cspell/main.go (1 hunks)
  • internal/config/index_deleter.go (1 hunks)
  • k8s/index/job/deletion/configmap.yaml (1 hunks)
  • k8s/index/job/deletion/cronjob.yaml (1 hunks)
  • pkg/index/job/deletion/config/config.go (1 hunks)
  • pkg/index/job/deletion/service/deleter.go (1 hunks)
  • pkg/index/job/deletion/service/options.go (1 hunks)
  • pkg/index/job/deletion/usecase/deletion.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (20)
  • .cspell.json
  • .gitfiles
  • .github/actions/detect-docker-image-tags/action.yaml
  • .github/workflows/dockers-image-scan.yaml
  • .github/workflows/dockers-index-deletion-image.yaml
  • .github/workflows/dockers-index-deletion.yaml
  • .github/workflows/dockers-release-branch-image.yaml
  • Makefile
  • Makefile.d/build.mk
  • Makefile.d/docker.mk
  • cmd/index/job/deletion/main.go
  • cmd/index/job/deletion/sample.yaml
  • dockers/index/job/deletion/Dockerfile
  • hack/cspell/main.go
  • internal/config/index_deleter.go
  • k8s/index/job/deletion/configmap.yaml
  • k8s/index/job/deletion/cronjob.yaml
  • pkg/index/job/deletion/config/config.go
  • pkg/index/job/deletion/service/deleter.go
  • pkg/index/job/deletion/usecase/deletion.go
🔇 Additional comments (2)
pkg/index/job/deletion/service/options.go (2)

1-7: LGTM! Clean package structure with focused imports.

The package structure follows Go conventions and imports only the necessary dependencies.


1-48: Verify integration with Remove/RemoveByTimestamp RPCs.

As per PR comments, the implementation should use existing Remove or RemoveByTimestamp RPCs from Vald.proto. Let's verify the integration of these options with the RPC calls.

✅ Verification successful

Options are correctly integrated with Remove RPC

The implementation correctly uses the Remove RPC from Vald.proto, and the options are properly integrated:

  • WithDiscoverer provides the client for RPC communication
  • WithIndexingConcurrency controls parallel deletion operations
  • WithTargetAddrs specifies target agents for deletion
  • WithTargetIndexID sets the ID to be deleted via Remove RPC

The deleter implementation shows proper error handling, tracing, and concurrent execution across multiple agents. While there are no tests, the core functionality aligns with the PR objectives.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for Remove/RemoveByTimestamp RPC usage in the codebase
echo "Searching for Remove RPC usage..."
rg -A 5 "Remove\(" pkg/index/job/deletion/

echo "Searching for RemoveByTimestamp RPC usage..."
rg -A 5 "RemoveByTimestamp\(" pkg/index/job/deletion/

Length of output: 662


Script:

#!/bin/bash
# Let's check the full context of the deleter implementation
echo "Checking deleter implementation..."
cat pkg/index/job/deletion/service/deleter.go

# Also check if there are any tests to verify the integration
echo -e "\nChecking deleter tests..."
cat pkg/index/job/deletion/service/deleter_test.go 2>/dev/null || echo "No test file found"

Length of output: 6790

pkg/index/job/deletion/service/options.go Show resolved Hide resolved
@kmrmt
Copy link
Contributor

kmrmt commented Oct 26, 2024

@all-contributors please add @highpon for code

Copy link
Contributor

@kmrmt

I've put up a pull request to add @highpon! 🎉

@kmrmt kmrmt merged commit 8fa3b7f into vdaas:main Oct 26, 2024
251 of 259 checks passed
@highpon highpon deleted the feat/implement-delete-expired-index branch October 26, 2024 15:41
vdaas-ci pushed a commit that referenced this pull request Oct 26, 2024
* feat: Implement delete expired index job

* fix: Replace agent Remove RPC to Vald Remove RPC

* fix: service name

* fix: typo

* fix: log method

* fix: variable name

* fix: use internal package

* fix: Change struct field name
kpango pushed a commit that referenced this pull request Oct 27, 2024
* feat: Implement delete expired index job

* fix: Replace agent Remove RPC to Vald Remove RPC

* fix: service name

* fix: typo

* fix: log method

* fix: variable name

* fix: use internal package

* fix: Change struct field name

Co-authored-by: s-shiraki <[email protected]>
takuyaymd pushed a commit to takuyaymd/vald that referenced this pull request Dec 2, 2024
* feat: Implement delete expired index job

* fix: Replace agent Remove RPC to Vald Remove RPC

* fix: service name

* fix: typo

* fix: log method

* fix: variable name

* fix: use internal package

* fix: Change struct field name
takuyaymd pushed a commit to takuyaymd/vald that referenced this pull request Dec 2, 2024
* feat: Implement delete expired index job

* fix: Replace agent Remove RPC to Vald Remove RPC

* fix: service name

* fix: typo

* fix: log method

* fix: variable name

* fix: use internal package

* fix: Change struct field name
This was referenced Dec 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Impl] Delete expired index from Agent using k8s Job
4 participants