Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data quality assistant #2657

Merged
merged 39 commits into from
Oct 30, 2024
Merged

Data quality assistant #2657

merged 39 commits into from
Oct 30, 2024

Conversation

gaspergrom
Copy link
Contributor

@gaspergrom gaspergrom commented Oct 24, 2024

Changes proposed ✍️

What

copilot:summary

copilot:poem

Why

How

copilot:walkthrough

Checklist ✅

  • Label appropriately with Feature, Improvement, or Bug.
  • Add screenshots to the PR description for relevant FE changes
  • New backend functionality has been unit-tested.
  • API documentation has been updated (if necessary) (see docs on API documentation).
  • Quality standards are met.

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced asynchronous endpoints for retrieving data quality issues related to members and organizations.
    • Added a Data Quality Assistant page with tabbed navigation for managing member and organization data quality.
    • New Vue components for handling data quality issues, merge suggestions, and dropdowns for project groups and issue types.
  • Permissions

    • Added new permissions for data quality read and edit operations.
  • Styling

    • New badge styles for "Danger" and "Warning" types.
  • Enhancements

    • Improved routing and error handling for data quality features.
    • Expanded API service for managing data quality issues.
    • Enhanced dropdown components for selecting project groups and data issue types.
    • Added detailed data context in member merge suggestions.
    • Integrated new constants for various data quality issue types, improving configuration and management.

@gaspergrom gaspergrom self-assigned this Oct 24, 2024
Copy link

coderabbitai bot commented Oct 24, 2024

Caution

Review failed

The pull request is closed.

Walkthrough

The changes introduce a comprehensive data quality management feature across both backend and frontend components. New asynchronous endpoints are added for retrieving member and organization data quality issues, along with a routing structure to integrate these endpoints into the existing Express application. The frontend is enhanced with multiple Vue components for displaying and managing data quality issues, including dropdowns for selecting project groups and issue types. Additionally, new permissions and badge styles are defined, along with a set of utility functions for handling data quality checks.

Changes

File Change Summary
backend/src/api/dataQuality/dataQualityMember.ts Added asynchronous GET endpoint for member data quality issues, validating permissions and handling segment IDs.
backend/src/api/dataQuality/dataQualityOrganization.ts Added asynchronous GET endpoint for organization data quality issues, validating permissions and handling segment IDs.
backend/src/api/dataQuality/index.ts Introduced routing for data quality endpoints in Express, wrapping route handlers with error middleware.
backend/src/api/index.ts Integrated data quality routes into the existing API structure.
backend/src/database/repositories/dataQualityRepository.ts Created DataQualityRepository class with methods for querying member data based on various criteria.
backend/src/services/dataQualityService.ts Introduced DataQualityService class with methods for finding member and organization data quality issues.
backend/src/types/data-quality/data-quality-filters.ts Defined IDataQualityType enum and IDataQualityParams interface for data quality checks.
frontend/config/styles/components/badge.scss Added CSS custom properties for new "Danger" and "Warning" badge styles.
frontend/src/config/permissions/admin.ts Added new permissions for data quality read and edit actions.
frontend/src/config/permissions/projectAdmin.ts Added new permissions for data quality read and edit actions.
frontend/src/config/permissions/readonly.ts Added permissions for data quality read and edit, with editing set to false.
frontend/src/modules/data-quality/components/data-quality-member.vue Created component for managing data quality member views with dropdowns for project groups and issue types.
frontend/src/modules/data-quality/components/data-quality-organization.vue Created component for managing data quality organization views with dropdowns for project groups and issue types.
frontend/src/modules/data-quality/components/member/data-quality-member-issues-item.vue Introduced component to display individual member data quality issues.
frontend/src/modules/data-quality/components/member/data-quality-member-issues.vue Created component to manage and display lists of member data quality issues.
frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions-item.vue Added component for displaying member merge suggestions.
frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions.vue Created component for managing member merge suggestions with loading and pagination features.
frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions-item.vue Added component for displaying organization merge suggestions.
frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions.vue Created component for managing organization merge suggestions with loading and pagination features.
frontend/src/modules/data-quality/config/data-issue-types.ts Defined configurations for data issue types and associated constants.
frontend/src/modules/data-quality/config/types/no-work-experience.ts Added constant for representing the "No Work Experience" data issue type.
frontend/src/modules/data-quality/data-quality.module.ts Introduced module for data quality routing configurations.
frontend/src/modules/data-quality/data-quality.routes.ts Defined routing configurations for the Data Quality Assistant feature.
frontend/src/modules/data-quality/pages/data-quality.page.vue Created main page component for the Data Quality Assistant with tabbed navigation.
frontend/src/modules/data-quality/services/data-quality.api.service.ts Introduced DataQualityApiService class for handling API requests related to data quality issues.
frontend/src/modules/data-quality/types/DataIssueType.ts Updated enumeration for various data issue types.
frontend/src/modules/index.ts Integrated the new data quality module into the main application modules.
frontend/src/modules/layout/components/menu/menu.vue Added a new menu link for the Data Quality Assistant.
frontend/src/shared/modules/monitoring/types/event.ts Added new event key for tracking Data Quality Assistant interactions.
frontend/src/shared/modules/permissions/types/Permissions.ts Introduced new permissions for data quality management.
frontend/src/ui-kit/badge/badge.scss Updated badge styles to include new types for "Danger" and "Warning."
frontend/src/ui-kit/badge/types/BadgeType.ts Expanded badge type definitions to include "danger" and "warning."
services/libs/data-access-layer/src/data-quality/index.ts Exported all entities from members and organizations modules for consolidated access.
services/libs/data-access-layer/src/data-quality/members.ts Implemented various functions for retrieving member data based on specific criteria.
services/libs/data-access-layer/src/data-quality/organizations.ts Introduced functions for managing member organizations in the data access layer.
frontend/src/modules/data-quality/config/types/conflicting-work-experience.ts Defined constant for the "Conflicting Work Experience" data issue type.
frontend/src/modules/data-quality/config/types/too-many-emails.ts Defined constant for the "Too Many Emails" data issue type.
frontend/src/modules/data-quality/config/types/too-many-identities-per-platform.ts Defined constant for the "Too Many Identities Per Platform" data issue type.
frontend/src/modules/data-quality/components/shared/data-quality-project-dropdown.vue Created dropdown component for selecting project groups.
frontend/src/modules/data-quality/components/shared/data-quality-type-dropdown.vue Created dropdown component for selecting data quality issue types.
frontend/src/ui-kit/tabs/Tab.vue Added optional property to preserve query parameters when selecting tabs.
frontend/src/modules/data-quality/config/types/too-many-identities.ts Defined constant for the "Too Many Identities" data issue type.
frontend/src/modules/data-quality/config/types/work-experience-missing-info.ts Defined constant for the "Work Experience Missing Info" data issue type.
frontend/src/modules/data-quality/config/types/work-experience-missing-period.ts Defined constant for the "Work Experience Missing Period" data issue type.
frontend/src/modules/member/components/suggestions/member-merge-similarity.vue Updated component to enhance visual representation and flexibility.
frontend/src/modules/member/components/suggestions/member-merge-suggestion-dropdown.vue Created dropdown component for handling member merge suggestions.
frontend/src/modules/member/pages/member-merge-suggestions-page.vue Replaced dropdown with new component for managing member merge suggestions.
scripts/cli Added new commands for service management in development environments.

Possibly related PRs

  • Org and member view updates (LFX-1664) #2660: The changes in this PR involve modifications to the backend for both organizations and members views, specifically returning members and organizations without activities, which aligns with the new asynchronous endpoint introduced in the main PR for retrieving member data quality issues.

Poem

In the fields where data hops,
New quality checks, no more flops!
With badges bright in red and gold,
We manage issues, brave and bold.
So gather 'round, let's take a leap,
For data's health, our promise we keep! 🐇✨


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 59

🧹 Outside diff range and nitpick comments (28)
backend/src/types/data-quality/data-quality-filters.ts (1)

1-11: Consider expanding data quality types for future use cases.

As this is part of a larger data quality management feature, consider:

  1. Adding an extensible way to define severity levels for different issue types
  2. Including metadata fields that could help in issue resolution (e.g., suggestedAction, impact)
  3. Adding a namespace or grouping mechanism for when more issue types are added

Example structure:

export interface DataQualityMetadata {
  severity: 'low' | 'medium' | 'high';
  suggestedAction: string;
  impact: string;
}

export const DATA_QUALITY_METADATA: Record<DataQualityType, DataQualityMetadata> = {
  // ... metadata for each type
};
frontend/src/modules/data-quality/config/types/more-than-10-identities.ts (1)

7-7: Enhance the warning message with more context.

The current description could be more informative about why having many identities might be problematic and what actions users should take.

- description: (member: any) => `This profile has ${member.identityCount} identities, please review them if they are all necessary.`,
+ description: (member: any) => `This profile has ${member.identityCount} identities. Having too many identities may indicate duplicate or obsolete entries. Please review and consider merging or removing unnecessary ones to maintain data quality.`,
frontend/src/modules/data-quality/config/types/more-than-1-identity-per-platform.ts (1)

4-13: Consider adding JSDoc documentation.

Adding documentation would help other developers understand the purpose and usage of this configuration.

+/**
+ * Configuration for detecting and displaying issues related to multiple identities
+ * on a single platform.
+ *
+ * @type {DataIssueTypeConfig}
+ */
 const moreThan1IdentityPerPlatform: DataIssueTypeConfig = {
frontend/src/modules/data-quality/config/data-issue-types.ts (1)

8-13: Add JSDoc documentation to the interface.

While the interface is well-structured, adding JSDoc documentation would improve code maintainability by clearly describing the purpose of each property.

+/**
+ * Configuration for a data quality issue type.
+ */
 export interface DataIssueTypeConfig {
+  /** Display label for the issue type */
   label: string;
+  /** Type of badge to display (affects styling) */
   badgeType: string;
+  /** Function to generate badge text based on the entity */
   badgeText: (entity: Contributor | Organization) => string;
+  /** Function to generate detailed description based on the entity */
   description: (entity: Contributor | Organization) => string;
 }
frontend/src/modules/data-quality/data-quality.routes.ts (1)

8-35: Consider these architectural improvements for better routing structure.

  1. The parent route has an empty name which could potentially cause naming conflicts. Consider giving it a descriptive name like 'data-quality-layout'.

  2. The URL structure could be more RESTful. Consider using:

    • /data-quality instead of /data-quality-assistant
    • This leaves room for sub-resources like /data-quality/members or /data-quality/organizations
  3. Consider adding a redirect route for unauthorized access.

Here's a suggested improvement:

 export default [
   {
-    name: '',
+    name: 'data-quality-layout',
     path: '',
     component: Layout,
     meta: {
       auth: true,
       title: 'Data Quality Assistant',
       segments: {
         requireSelectedProjectGroup: true,
       },
     },
     children: [
       {
         name: 'data-quality-assistant',
-        path: '/data-quality-assistant',
+        path: '/data-quality',
         component: DataQualityPage,
         meta: {
           auth: true,
           eventKey: PageEventKey.DATA_QUALITY_ASSISTANT,
         },
         beforeEnter: [
           PermissionGuard(LfPermission.dataQualityRead),
         ],
       },
+      {
+        path: '/data-quality-assistant',
+        redirect: '/data-quality'
+      }
     ],
   },
 ];
frontend/src/modules/data-quality/components/data-quality-organization.vue (2)

14-14: Consider moving decorative elements to CSS.

The border line should be handled through CSS rather than an empty div.

-    <div class="border-b border-gray-200 w-full mb-1" />
+    <style scoped>
+    .flex {
+      border-bottom: 1px solid theme('colors.gray.200');
+      margin-bottom: 0.25rem;
+    }
+    </style>

30-34: Consider consolidating script tags.

The component name can be defined in the setup script, eliminating the need for a separate script block.

<script lang="ts" setup>
import LfButton from '@/ui-kit/button/Button.vue';
import { ref } from 'vue';
import LfDataQualityOrganizationMergeSuggestions
  from '@/modules/data-quality/components/organization/data-quality-organization-merge-suggestions.vue';

const tab = ref('merge-suggestions');
+ defineOptions({
+   name: 'LfDataQualityOrganization',
+ });
</script>

- <script lang="ts">
- export default {
-   name: 'LfDataQualityOrganization',
- };
- </script>
frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions-item.vue (1)

1-26: Add accessibility attributes to improve user experience.

The template should include proper ARIA attributes for better screen reader support.

Apply these changes:

-<article class="border-b border-gray-100 py-5 flex items-center justify-between">
+<article 
+  class="border-b border-gray-100 py-5 flex items-center justify-between"
+  role="listitem"
+  aria-label="Member merge suggestion">
   <div class="flex items-center gap-2">
backend/src/api/dataQuality/dataQualityMember.ts (3)

1-5: Consider using consistent import paths.

The file mixes relative imports (../../security/permissions) with absolute imports (@/services/dataQualityService). Consider standardizing to absolute imports for better maintainability and consistency.

-import Permissions from '../../security/permissions'
-import PermissionChecker from '../../services/user/permissionChecker'
-import isFeatureEnabled from '../../feature-flags/isFeatureEnabled'
+import Permissions from '@/security/permissions'
+import PermissionChecker from '@/services/user/permissionChecker'
+import isFeatureEnabled from '@/feature-flags/isFeatureEnabled'

7-19: Enhance API documentation completeness.

The API documentation is missing:

  1. Query parameter documentation for segments
  2. Import or definition of the DataQualityResponse type
  3. Description of the 400 error response for missing segment ID

Add the following to the documentation:

 * @pathParam {string} tenantId - Your workspace/tenant ID
+* @queryParam {string[]} segments - Optional segment IDs to filter data quality issues
 * @response 200 - Ok
 * @responseContent {DataQualityResponse} 200.application/json
+* @response 400 - Bad Request
 * @response 401 - Unauthorized

20-21: Add type definitions for request and response parameters.

Consider adding TypeScript type definitions for better type safety and documentation.

-export default async (req, res) => {
+import { Request, Response } from 'express'
+
+export default async (req: Request, res: Response) => {
frontend/src/modules/data-quality/pages/data-quality.page.vue (2)

4-12: Enhance header accessibility and user context.

Consider adding:

  1. ARIA attributes for better screen reader support
  2. A brief description explaining the purpose of the Data Quality Assistant
 <!-- Header -->
 <div class="mb-6">
   <div class="flex justify-between items-center">
     <div>
-      <app-lf-page-header text-class="text-tiny text-secondary-400 mb-1" />
-      <h4 class="font-semibold">
+      <app-lf-page-header text-class="text-tiny text-secondary-400 mb-1" role="banner" />
+      <h4 class="font-semibold" aria-label="Data Quality Assistant">
         Data Quality Assistant
       </h4>
+      <p class="text-sm text-gray-600 mt-1">
+        Identify and resolve data quality issues across people and organizations.
+      </p>
     </div>

15-22: Improve tab navigation accessibility.

The tab implementation should include proper ARIA attributes and keyboard navigation support.

-          <lf-tabs v-model="tab">
+          <lf-tabs 
+            v-model="tab" 
+            role="tablist" 
+            aria-label="Data quality sections"
+          >
             <lf-tab v-model="tab" name="member">
               People
             </lf-tab>
             <lf-tab v-model="tab" name="organization">
               Organizations
             </lf-tab>
           </lf-tabs>
backend/src/services/dataQualityService.ts (3)

1-1: Remove unnecessary ESLint disable comment.

The /* eslint-disable no-continue */ comment appears to be unnecessary as there are no continue statements in this file.

-/* eslint-disable no-continue */

46-49: Document placeholder method and consider architectural improvements.

The findOrganizationIssues method appears to be a placeholder. Consider either:

  1. Adding proper documentation explaining its status and future implementation plans
  2. Making the method abstract if other classes will implement it
  3. Implementing the actual functionality as suggested by the method name

Add documentation and proper typing:

-  // eslint-disable-next-line class-methods-use-this
-  async findOrganizationIssues() {
-    return Promise.resolve([])
-  }
+  /**
+   * TODO: Implement organization data quality checks
+   * This method will be implemented to find quality issues in organization data
+   * @returns {Promise<Array<IOrganizationQualityIssue>>} Currently returns empty array
+   */
+  async findOrganizationIssues(): Promise<IOrganizationQualityIssue[]> {
+    return [];
+  }

7-50: Consider adding monitoring and pagination metadata.

To improve the service's operational characteristics, consider:

  1. Adding metrics to track data quality issues over time
  2. Including pagination metadata in the response
  3. Implementing rate limiting for resource-intensive checks

Example implementation:

interface IDataQualityResponse<T> {
  data: T[];
  pagination: {
    total: number;
    limit: number;
    offset: number;
    hasMore: boolean;
  };
  metrics?: {
    scanDuration: number;
    issuesFound: number;
  };
}

This would provide valuable insights into data quality trends and improve the API's usability.

frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions-item.vue (1)

35-37: Consider creating a dedicated UI components barrel file

Instead of importing UI components from different locations, consider creating an index.ts barrel file in the ui-kit directory to centralize and simplify imports.

This would allow you to simplify the imports to:

import { LfAvatar, LfIcon, LfBadge } from '@/ui-kit';
backend/src/database/repositories/dataQualityRepository.ts (1)

31-40: Consider making the identity threshold configurable and add documentation.

Two improvements needed:

  1. The hardcoded threshold of 10 should be configurable, preferably through environment variables or configuration
  2. Missing JSDoc documentation for the method

Add documentation similar to the first method:

/**
 * Finds members with too many identities.
 *
 * @param {IRepositoryOptions} options - The repository options for executing the query.
 * @param {string} tenantId - The ID of the tenant.
 * @param {number} limit - The maximum number of results to return.
 * @param {number} offset - The offset from which to begin returning results.
 * @param {string} segmentId - The ID of the segment.
 * @return {Promise<Members[]>} - A promise that resolves with an array of members having too many identities.
 */
frontend/src/shared/modules/permissions/types/Permissions.ts (1)

107-110: Consider creating a dedicated "Data Quality" section for these permissions.

While the permission naming follows the established pattern, these permissions would be better organized under a new dedicated "Data Quality" section rather than being placed under "Custom views". This would improve code organization and make it clearer for future maintenance.

Apply this diff to reorganize the sections:

  // Custom views
  customViewsCreate = 'customViewsCreate',
  customViewsTenantManage = 'customViewsTenantManage',

-  // Custom views
+  // Data Quality
  dataQualityRead = 'dataQualityRead',
  dataQualityEdit = 'dataQualityEdit',
frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions.vue (2)

2-2: Remove unnecessary wrapper div.

The outer div wrapper is not providing any styling or structural benefit and can be removed to reduce DOM nesting.

-<div>
  <div v-if="loading" class="flex justify-center py-20">
    ...
  </div>
-</div>

19-21: Enhance button accessibility.

Add an aria-label to the load more button for better screen reader support.

-<lf-button type="primary-ghost" size="small" :loading="loading" @click="loadMore()">
+<lf-button 
+  type="primary-ghost" 
+  size="small" 
+  :loading="loading" 
+  aria-label="Load more merge suggestions"
+  @click="loadMore()"
+>
frontend/src/modules/data-quality/components/member/data-quality-member-issues.vue (2)

30-40: Improve empty state message clarity.

The current empty state message is generic and doesn't provide context about the selected issue type or potential next steps.

       <p class="text-sm text-center text-gray-600 leading-5">
-        We couldn't find any issues with members
+        No {{ dataIssueTypes[selectedType].label.toLowerCase() }} issues found for members. This could mean all members have complete and accurate data for this category.
       </p>

25-27: Make load more button more descriptive.

The current label doesn't indicate what type of content will be loaded.

-          <i class="ri-arrow-down-line" />Load more
+          <i class="ri-arrow-down-line" />Load more member issues
frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions.vue (4)

26-26: Replace non-standard text size class.

The class text-10xl appears to be a custom size that may not be defined in your design system. Consider using standard Tailwind text sizes or defining this in your design system if it's a recurring pattern.

-class="ri-shuffle-line text-gray-200 text-10xl h-40 flex items-center mb-8"
+class="ri-shuffle-line text-gray-200 text-9xl h-40 flex items-center mb-8"

13-15: Add accessibility attributes to the button.

The button lacks proper accessibility attributes. Consider adding an aria-label for better screen reader support.

-<lf-button type="secondary" size="small" @click="isModalOpen = true; detailsOffset = si">
+<lf-button 
+  type="secondary" 
+  size="small" 
+  aria-label="View merge suggestion details"
+  @click="isModalOpen = true; detailsOffset = si">

40-40: Use kebab-case for prop names in template.

Vue style guide recommends using kebab-case for prop names in templates.

-orderBy: ['activityCount_DESC'],
+order-by: ['activityCount_DESC'],

57-57: Extract pagination limit as a constant.

The pagination limit is hardcoded. Consider extracting it as a named constant for better maintainability.

+const SUGGESTIONS_PAGE_SIZE = 20;
-const limit = ref(20);
+const limit = ref(SUGGESTIONS_PAGE_SIZE);
backend/src/api/index.ts (1)

242-242: Document the new data quality endpoints in API documentation.

Since this introduces new endpoints for data quality checks, ensure they are properly documented in the API documentation.

Consider adding Swagger/OpenAPI documentation for the new endpoints:

  • /tenant/{tenantId}/data-quality/member
  • /tenant/{tenantId}/data-quality/organization

This will help maintain consistency with the existing API documentation setup in setupSwaggerUI.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between e0de6d5 and 2d06e22.

📒 Files selected for processing (37)
  • backend/src/api/dataQuality/dataQualityMember.ts (1 hunks)
  • backend/src/api/dataQuality/dataQualityOrganization.ts (1 hunks)
  • backend/src/api/dataQuality/index.ts (1 hunks)
  • backend/src/api/index.ts (1 hunks)
  • backend/src/database/repositories/dataQualityRepository.ts (1 hunks)
  • backend/src/services/dataQualityService.ts (1 hunks)
  • backend/src/types/data-quality/data-quality-filters.ts (1 hunks)
  • frontend/config/styles/components/badge.scss (1 hunks)
  • frontend/src/config/permissions/admin.ts (1 hunks)
  • frontend/src/config/permissions/projectAdmin.ts (1 hunks)
  • frontend/src/config/permissions/readonly.ts (1 hunks)
  • frontend/src/modules/data-quality/components/data-quality-member.vue (1 hunks)
  • frontend/src/modules/data-quality/components/data-quality-organization.vue (1 hunks)
  • frontend/src/modules/data-quality/components/member/data-quality-member-issues-item.vue (1 hunks)
  • frontend/src/modules/data-quality/components/member/data-quality-member-issues.vue (1 hunks)
  • frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions-item.vue (1 hunks)
  • frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions.vue (1 hunks)
  • frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions-item.vue (1 hunks)
  • frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions.vue (1 hunks)
  • frontend/src/modules/data-quality/config/data-issue-types.ts (1 hunks)
  • frontend/src/modules/data-quality/config/types/more-than-1-identity-per-platform.ts (1 hunks)
  • frontend/src/modules/data-quality/config/types/more-than-10-identities.ts (1 hunks)
  • frontend/src/modules/data-quality/config/types/no-work-experience.ts (1 hunks)
  • frontend/src/modules/data-quality/data-quality.module.ts (1 hunks)
  • frontend/src/modules/data-quality/data-quality.routes.ts (1 hunks)
  • frontend/src/modules/data-quality/pages/data-quality.page.vue (1 hunks)
  • frontend/src/modules/data-quality/services/data-quality.api.service.ts (1 hunks)
  • frontend/src/modules/data-quality/types/DataIssueType.ts (1 hunks)
  • frontend/src/modules/index.ts (2 hunks)
  • frontend/src/modules/layout/components/menu/menu.vue (1 hunks)
  • frontend/src/shared/modules/monitoring/types/event.ts (1 hunks)
  • frontend/src/shared/modules/permissions/types/Permissions.ts (1 hunks)
  • frontend/src/ui-kit/badge/badge.scss (1 hunks)
  • frontend/src/ui-kit/badge/types/BadgeType.ts (1 hunks)
  • services/libs/data-access-layer/src/data-quality/index.ts (1 hunks)
  • services/libs/data-access-layer/src/data-quality/members.ts (1 hunks)
  • services/libs/data-access-layer/src/data-quality/organizations.ts (1 hunks)
✅ Files skipped from review due to trivial changes (4)
  • frontend/src/modules/data-quality/data-quality.module.ts
  • frontend/src/modules/data-quality/types/DataIssueType.ts
  • frontend/src/modules/layout/components/menu/menu.vue
  • services/libs/data-access-layer/src/data-quality/index.ts
🧰 Additional context used
🪛 Biome
backend/src/database/repositories/dataQualityRepository.ts

[error] 8-52: Avoid classes that contain only static members.

Prefer using simple functions instead of classes with only static members.

(lint/complexity/noStaticOnlyClass)

frontend/src/modules/data-quality/services/data-quality.api.service.ts

[error] 6-38: Avoid classes that contain only static members.

Prefer using simple functions instead of classes with only static members.

(lint/complexity/noStaticOnlyClass)

🪛 GitHub Check: lint-format-services
services/libs/data-access-layer/src/data-quality/members.ts

[failure] 2-2:
Insert

services/libs/data-access-layer/src/data-quality/organizations.ts

[failure] 2-2:
Insert

🔇 Additional comments (41)
frontend/src/ui-kit/badge/types/BadgeType.ts (1)

5-6: LGTM! The new badge types are well integrated.

The addition of 'danger' and 'warning' badge types follows best practices and maintains type safety through the as const assertion.

Let's verify the CSS integration:

✅ Verification successful

CSS styles for new badge types are properly integrated

The verification confirms that the new badge types 'danger' and 'warning' have their corresponding CSS styles properly defined in frontend/config/styles/components/badge.scss:

  • Both types have complete style definitions including background, border, and text colors
  • The styles use semantic color variables from _colors.scss:
    • Danger badge uses red color palette (--lf-color-red-*)
    • Warning badge uses yellow color palette (--lf-color-yellow-*)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that corresponding CSS styles exist for the new badge types

# Check for danger and warning styles in badge-related CSS files
rg -l "(?:--danger|--warning)" "frontend/config/styles/components/badge.scss" "frontend/ui-kit/badge/badge.scss"

# Verify color variables used by these badges
rg "(?:--lf-color-(?:red|yellow)-(?:100|200|800))" "frontend/config/styles"

Length of output: 1334

backend/src/api/dataQuality/index.ts (1)

3-9: Verify tenant isolation and permission checks.

The routes are properly structured with tenant isolation through the :tenantId parameter. However, ensure that the imported handlers implement proper permission checks to prevent unauthorized access to tenant data.

Additionally, consider adding TypeScript type annotations for the Express app parameter:

-export default (app) => {
+export default (app: Express.Application) => {
✅ Verification successful

Based on the shell script results, I can now generate the final response as we have sufficient information about the permission checks and tenant validation in both handler files.

Permission checks and tenant isolation are properly implemented

Both handler files implement proper security measures:

  • Permission checks are enforced using PermissionChecker:
    • dataQualityMember.ts validates memberRead permission
    • dataQualityOrganization.ts validates organizationRead permission
  • Tenant isolation is maintained by using the tenantId from request parameters in both handlers
  • API documentation properly describes security requirements with @security Bearer and includes proper error responses (401, 404, 429)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify permission checks in the handler files

# Test: Search for PermissionChecker usage in handler files
echo "Checking for permission validation in handlers:"
rg -A 5 "PermissionChecker" "./backend/src/api/dataQuality/dataQualityMember.ts" "./backend/src/api/dataQuality/dataQualityOrganization.ts"

# Test: Search for tenant validation in handlers
echo "Checking for tenant validation in handlers:"
rg -A 5 "tenantId" "./backend/src/api/dataQuality/dataQualityMember.ts" "./backend/src/api/dataQuality/dataQualityOrganization.ts"

Length of output: 5782

backend/src/types/data-quality/data-quality-filters.ts (1)

7-11: 🛠️ Refactor suggestion

Add documentation and type constraints.

The interface would benefit from JSDoc documentation and more specific types for pagination parameters.

+/**
+ * Parameters for querying data quality issues.
+ */
export interface IDataQualityParams {
-  type: IDataQualityType
-  limit?: number
-  offset?: number
+  /** The type of data quality issue to query */
+  type: DataQualityType
+  /** Maximum number of results to return (1-100) */
+  limit?: number & { __brand: 'LimitConstraint' }
+  /** Number of results to skip for pagination */
+  offset?: number & { __brand: 'OffsetConstraint' }
}

+/**
+ * Type guard to ensure limit is within valid range
+ */
+export function isValidLimit(limit: number): limit is IDataQualityParams['limit'] {
+  return limit >= 1 && limit <= 100;
+}
frontend/src/modules/data-quality/config/types/no-work-experience.ts (3)

1-1: LGTM! Import statement is well-structured.

The import statement correctly uses path aliasing and properly imports the type definition.


10-10: LGTM! Export statement follows best practices.

The default export is appropriate for a single configuration object.


3-8: Verify consistency with other data quality configurations.

Let's ensure this configuration follows the same patterns as other data quality types.

✅ Verification successful

Based on the shell script results, I can now provide the final response as the verification is complete. The results show all data quality configurations in the codebase.

Configuration structure is consistent, badge types are appropriate

The verification confirms that:

  • All configurations follow the same structure with required properties: label, badgeType, badgeText, and description
  • Badge types are consistently used:
    • danger for critical issues (no work experience)
    • warning for less critical issues (multiple identities)

The no-work-experience configuration aligns well with the established patterns, using appropriate severity level and maintaining structural consistency with other configurations.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check consistency across data quality configurations

# Test 1: Check for consistent badge types
echo "Checking badge types across configurations..."
rg "badgeType: '[^']*'" frontend/src/modules/data-quality/config/types/

# Test 2: Check for consistent property structure
echo "Checking property structure..."
ast-grep --pattern 'const $_: DataIssueTypeConfig = {
  label: $_,
  badgeType: $_,
  badgeText: $_,
  description: $_,
}'

Length of output: 3399

frontend/src/modules/data-quality/config/types/more-than-10-identities.ts (1)

3-8: Verify the threshold of 10 identities.

The choice of 10 as the threshold for triggering this warning seems arbitrary. Could you provide context on why this specific number was chosen? This would help validate if the threshold aligns with business requirements and user behavior patterns.

frontend/src/modules/data-quality/config/types/more-than-1-identity-per-platform.ts (1)

1-2: Verify the integration configuration usage.

Let's ensure that CrowdIntegrations.getPlatformsLabel is being used consistently across the codebase.

✅ Verification successful

Based on the shell script results, I can now provide a final response:

getPlatformsLabel usage is consistent across the codebase

The method CrowdIntegrations.getPlatformsLabel() is used consistently throughout the frontend codebase for platform label generation in various components and modules. It's primarily used for:

  • Displaying platform sources in organization and contributor components
  • Generating labels for data quality issues
  • Showing platform information in attribute helpers

The implementation in frontend/src/integrations/integrations-config.js shows it properly filters out system-specific platforms, ensuring consistent platform label generation across all usage points.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check usage of getPlatformsLabel method across the codebase
# Expected: Consistent usage pattern for platform label generation

# Search for all usages of getPlatformsLabel
rg -A 2 "getPlatformsLabel" 

# Check if there are any other similar platform label generation methods
ast-grep --pattern 'CrowdIntegrations.$_($$$)'

Length of output: 10687

frontend/src/modules/index.ts (1)

14-14: LGTM! Clean integration of the data quality module.

The changes follow the established module integration pattern and are properly positioned within the existing structure.

Also applies to: 31-31

frontend/src/modules/data-quality/config/data-issue-types.ts (2)

1-7: LGTM! Well-organized imports.

The imports are logically structured and each import has a clear purpose in the implementation.


15-19: Verify complete enum coverage in configuration mapping.

Let's ensure all DataIssueType enum values are mapped in the configuration.

✅ Verification successful

Based on the results from the shell scripts, I can now generate the final response since I have all the necessary information to verify the enum coverage.

All DataIssueType enum values are properly mapped in configuration ✓

The verification shows perfect alignment between the enum definition and its usage in the configuration:

Enum values defined in DataIssueType:

  • MORE_THAN_10_IDENTITIES
  • MORE_THAN_1_IDENTITY_PER_PLATFORM
  • NO_WORK_EXPERIENCE

All these values are correctly mapped in the dataIssueTypes configuration object with their corresponding configurations.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if all DataIssueType enum values are mapped in dataIssueTypes

# Search for enum definition and its usage
echo "DataIssueType enum values:"
ast-grep --pattern 'export enum DataIssueType {
  $$$
}'

echo -e "\nEnum values used in configuration:"
rg '\[DataIssueType\.' frontend/src/modules/data-quality/config/data-issue-types.ts

Length of output: 996

frontend/src/modules/data-quality/data-quality.routes.ts (3)

1-4: LGTM! Well-organized imports.

The imports follow consistent patterns using absolute paths and import only the necessary components and types.


6-6: LGTM! Proper lazy loading implementation.

The component is correctly configured for lazy loading, which will help with initial bundle size and load performance.


30-30: Verify permission configuration across files.

The route uses LfPermission.dataQualityRead. Let's ensure this permission is consistently configured.

✅ Verification successful

Let me verify the actual permission configuration in these files to ensure consistency.


Permission configuration is properly implemented and consistent

The verification shows that LfPermission.dataQualityRead is:

  • Correctly defined in the permissions enum (Permissions.ts)
  • Consistently configured across all role-based permission files:
    • readonly.ts: read-only access (dataQualityRead: true, dataQualityEdit: false)
    • admin.ts and projectAdmin.ts: full access (both read and edit: true)
  • Properly used in the route guard
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify dataQualityRead permission configuration

# Test: Check if the permission is properly defined in all required files
echo "Checking permission configuration in permission files..."
rg -l "dataQualityRead" "frontend/src/config/permissions/"

Length of output: 312


Script:

#!/bin/bash
# Check the actual permission configuration in each file
echo "=== Checking readonly.ts ==="
rg "dataQualityRead" frontend/src/config/permissions/readonly.ts -C 2

echo -e "\n=== Checking admin.ts ==="
rg "dataQualityRead" frontend/src/config/permissions/admin.ts -C 2

echo -e "\n=== Checking projectAdmin.ts ==="
rg "dataQualityRead" frontend/src/config/permissions/projectAdmin.ts -C 2

# Also check the permission enum definition
echo -e "\n=== Checking LfPermission enum definition ==="
rg "dataQualityRead" -A 1 -B 1

Length of output: 2733

frontend/src/modules/data-quality/components/data-quality-organization.vue (1)

3-13: 🛠️ Refactor suggestion

Consider simplifying the UI if only one tab is needed.

The current implementation shows a tab interface with only one option "Merge suggestions". This might be confusing for users as tabs typically imply multiple options.

Let's check if there are plans for additional tabs:

#!/bin/bash
# Search for any TODO comments or references to additional tabs
rg -i "todo.*tab" frontend/src/modules/data-quality/

If no additional tabs are planned, consider simplifying the UI by removing the tab structure:

-    <div class="flex">
-      <div class="border border-gray-200 rounded-lg p-0.5 flex gap-1 mb-6">
-        <lf-button
-          type="secondary-ghost"
-          :class="tab === 'merge-suggestions' ? '!font-semibold !bg-gray-200' : '!font-normal'"
-          @click="tab = 'merge-suggestions'"
-        >
-          Merge suggestions
-        </lf-button>
-      </div>
-    </div>
+    <h2 class="text-lg font-semibold mb-6">Merge Suggestions</h2>
frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions-item.vue (2)

38-42: LGTM!

The component definition follows Vue 3 naming conventions correctly.


1-42: Verify component registration and usage.

Let's verify that this component is properly imported and used in parent components.

✅ Verification successful

Let me try a different approach to verify the component usage by searching for both the component name and its usage patterns without relying on file type filtering.

The previous script failed because the vue file type wasn't recognized. This new script searches across all files and specifically looks in the data-quality module directory to understand the component hierarchy. This will help us verify the component's integration within the codebase.


Let me check the actual usage context in the parent component to ensure proper integration.

This will help us verify how the component is being used in its parent component and confirm proper integration.


Component is properly integrated and used

The component is correctly integrated within the codebase:

  • It's properly imported and used in its parent component data-quality-member-merge-suggestions.vue
  • The parent component correctly passes the required suggestion prop
  • The component's slot action is being utilized appropriately
  • The component is part of a well-structured data quality module hierarchy
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check component registration and usage

# Check for component imports
echo "Checking component imports..."
rg -l "import.*LfDataQualityMemberMergeSuggestionsItem" --type vue

# Check for component usage in templates
echo "Checking component usage..."
rg -l "<lf-data-quality-member-merge-suggestions-item" --type vue

Length of output: 326


Script:

#!/bin/bash
# Search for component imports and usage without file type filtering
echo "Checking component imports and registration..."
rg -l "LfDataQualityMemberMergeSuggestionsItem"

echo -e "\nChecking component usage in templates..."
rg -l "<lf-data-quality-member-merge-suggestions-item"

echo -e "\nChecking for potential parent components in data-quality directory..."
fd . "frontend/src/modules/data-quality" -e .vue

Length of output: 1582


Script:

#!/bin/bash
# Check the implementation details of the parent component
echo "Checking parent component implementation..."
cat "frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions.vue"

Length of output: 3244

frontend/config/styles/components/badge.scss (1)

38-46: Well-structured implementation of new badge styles!

The new danger and warning badge styles are well-implemented with:

  • Consistent naming conventions
  • Appropriate use of semantic color variables
  • Clear documentation
  • Proper visual hierarchy using different color intensities for background, border, and text

The implementation aligns well with the data quality management feature by providing clear visual indicators for different states.

backend/src/api/dataQuality/dataQualityMember.ts (1)

35-41: Verify DataQualityService implementation.

Let's verify the service implementation and error handling patterns.

✅ Verification successful

Let me generate another verification script to check the DataQualityService implementation since it wasn't found in the first search.

The first search didn't find the DataQualityService class implementation, so I'm trying a broader search pattern to locate it and understand its implementation. I'm also checking where this service is imported in the file under review.


Based on the verification results, I can now generate the final response since we have found both the service implementation and its usage pattern:

Implementation verified successfully

The code correctly uses the DataQualityService:

  • Service is properly imported from '@/services/dataQualityService'
  • The service implementation in backend/src/services/dataQualityService.ts shows the findMemberIssues method with matching parameters
  • Response handling follows the consistent pattern used throughout the codebase with responseHandler.success
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check DataQualityService implementation
echo "Checking DataQualityService implementation..."
ast-grep --pattern 'class DataQualityService {
  $$$
  findMemberIssues($$$) {
    $$$
  }
  $$$
}'

# Check response handler usage pattern
echo "Checking response handler usage pattern..."
rg -A 2 'responseHandler\.(success|error)' --type ts

Length of output: 44819


Script:

#!/bin/bash
# Search for DataQualityService implementation with a broader pattern
echo "Searching for DataQualityService class..."
rg -A 10 "class DataQualityService" --type ts

# Search for findMemberIssues method specifically
echo "Searching for findMemberIssues method..."
rg -A 5 "findMemberIssues" --type ts

# Check the imports in the file under review
echo "Checking imports in the file under review..."
rg -A 10 "^import.*DataQualityService" backend/src/api/dataQuality/dataQualityMember.ts

Length of output: 3373

frontend/src/modules/data-quality/pages/data-quality.page.vue (1)

41-42: Verify child component implementations.

Let's ensure that the required child components are properly implemented with necessary props and events.

✅ Verification successful

Let me try a different approach to verify these components using fd to first confirm their existence and then examine their content.


Child components are properly implemented and integrated

The verification shows that both required components (data-quality-member.vue and data-quality-organization.vue) exist and are properly implemented with:

  • Correct component structure and naming
  • Proper child components integration
  • Well-defined props and events
  • Complete functionality for data quality features:
    • Member component handles merge suggestions, identities, and work history
    • Organization component handles merge suggestions
  • Proper integration with UI components and services
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the implementation of child components

# Check for the existence and implementation of data quality components
echo "Checking data-quality-member component..."
rg -A 10 "export default.*data-quality-member" frontend/src/modules/data-quality/components/

echo "Checking data-quality-organization component..."
rg -A 10 "export default.*data-quality-organization" frontend/src/modules/data-quality/components/

Length of output: 393


Script:

#!/bin/bash
# First, let's verify the existence of these files
echo "Checking for component files..."
fd -e vue "data-quality-member|data-quality-organization" frontend/src/modules/data-quality/components/

# If files exist, let's examine their content
echo -e "\nChecking component implementations..."
for file in $(fd -e vue "data-quality-member|data-quality-organization" frontend/src/modules/data-quality/components/); do
    echo -e "\n=== Content of $file ==="
    cat "$file"
done

Length of output: 21648

frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions-item.vue (2)

44-48: LGTM! Component definition follows Vue.js naming conventions.

The component name is properly defined and follows the recommended naming pattern for Vue components.


9-23: 🛠️ Refactor suggestion

Add error handling for missing organization properties

The component assumes organization.logo and organization.displayName will always be available. Consider adding null checks or default values.

 <lf-avatar
-  :src="organization.logo"
-  :name="organization.displayName"
+  :src="organization.logo ?? ''"
+  :name="organization.displayName ?? 'Unknown Organization'"
   :size="32"
 >
frontend/src/ui-kit/badge/badge.scss (3)

32-36: LGTM! Danger badge type follows established patterns.

The implementation correctly follows the existing badge type pattern and uses appropriate CSS custom properties.


38-42: LGTM! Warning badge type follows established patterns.

The implementation correctly follows the existing badge type pattern and uses appropriate CSS custom properties.


32-42: Verify custom properties are defined.

Let's ensure the new CSS custom properties for danger and warning badges are properly defined in the styles configuration.

✅ Verification successful

Custom properties are properly defined and configured

The verification shows that all the required CSS custom properties for both danger and warning badges are correctly defined:

  • The custom properties are properly defined in frontend/config/styles/components/badge.scss:
    • Danger badge properties using red color variables
    • Warning badge properties using yellow color variables
  • These properties are correctly referenced in frontend/src/ui-kit/badge/badge.scss
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if the new CSS custom properties are defined in the styles configuration

# Test: Search for the new custom properties definitions
echo "Checking for danger badge custom properties..."
rg --type css "lf-badge-danger-(background|border|text)"

echo "Checking for warning badge custom properties..."
rg --type css "lf-badge-warning-(background|border|text)"

Length of output: 1487

backend/src/database/repositories/dataQualityRepository.ts (2)

10-29: LGTM! Well-documented and properly implemented.

The method is well-documented with JSDoc comments and follows a clean implementation pattern.


42-51: Verify the per-platform identity threshold and add documentation.

The threshold of 1 identity per platform seems very restrictive. Please verify if this is the intended business logic, as it would flag any member with more than one identity on a single platform.

Additionally:

  1. The threshold should be configurable
  2. Missing JSDoc documentation

Let's verify the current implementation's impact:

frontend/src/modules/data-quality/components/member/data-quality-member-issues-item.vue (2)

34-45: LGTM!

The imports are well-organized and follow Vue 3 composition API best practices.


57-61: LGTM!

The component definition follows Vue naming conventions.

frontend/src/config/permissions/admin.ts (2)

78-79: LGTM! The new data quality permissions are properly configured.

The addition of read and edit permissions for data quality features follows the established pattern and provides appropriate access levels for admin users.

Let's verify that these permissions are properly defined in the enum:

#!/bin/bash
# Description: Verify that the new permissions are defined in the LfPermission enum
# and consistently applied across different role configurations

# Check the enum definition
rg "enum LfPermission" -A 100

# Check permission usage across role configurations
rg "dataQualityRead|dataQualityEdit" --type ts

78-79: Verify permission consistency across roles.

Ensure these permissions are appropriately configured for other roles (e.g., readonly users might need read-only access).

✅ Verification successful

Let me try a different approach to verify the permission configurations.

This should help us find all files that use the permission system and specifically how dataQuality permissions are configured across different roles or contexts.


Permission configuration is correctly implemented across roles

The verification shows that data quality permissions are properly configured across different role types:

  • Admin role: Has both read and edit permissions (dataQualityRead: true, dataQualityEdit: true)
  • Project Admin role: Has both read and edit permissions (dataQualityRead: true, dataQualityEdit: true)
  • Readonly role: Has read-only access (dataQualityRead: true, dataQualityEdit: false)

The permissions are also properly used in the codebase:

  • Defined in the permissions types (Permissions.ts)
  • Used in route guards (data-quality.routes.ts)
  • Referenced in menu visibility controls (menu.vue)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check permission configurations across different role files

# Find all permission configuration files
fd "permissions.*\.ts$" frontend/src/config/permissions

# Compare permission settings across roles
for file in $(fd "permissions.*\.ts$" frontend/src/config/permissions); do
  echo "=== $file ==="
  rg "dataQuality(Read|Edit)" "$file" || echo "No data quality permissions found"
done

Length of output: 116


Script:

#!/bin/bash
# First, let's find all TypeScript files in the frontend directory that might contain permission configurations
echo "=== Finding permission-related files ==="
rg -l "LfPermission" frontend/src

# Then, let's examine the content of these files for dataQuality permissions
echo -e "\n=== Checking dataQuality permissions usage ==="
rg "dataQuality(Read|Edit)" frontend/src -A 1 -B 1

Length of output: 7725

frontend/src/config/permissions/projectAdmin.ts (1)

78-79: LGTM! Verify permission consistency across roles.

The new data quality permissions are correctly added and align with the project admin's role responsibilities.

Let's verify the consistency of these permissions across other role configurations:

✅ Verification successful

Permissions are consistently configured across all roles

The verification confirms that data quality permissions are properly implemented across all role configurations with appropriate access levels:

  • Admin: Both read and edit enabled (true/true)
  • Project Admin: Both read and edit enabled (true/true)
  • Readonly: Read enabled, edit disabled (true/false)

The permission settings align with the expected role capabilities and follow a logical access control pattern.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify data quality permissions across all permission configuration files
# Expected: dataQualityRead and dataQualityEdit permissions should be present in admin.ts and readonly.ts

echo "Checking permission configurations..."
rg -A 1 "dataQualityRead|dataQualityEdit" "frontend/src/config/permissions/"

Length of output: 816

frontend/src/config/permissions/readonly.ts (1)

78-79: LGTM! Permission configuration aligns with readonly role expectations.

The new data quality permissions are correctly configured:

  • dataQualityRead: true allows readonly users to view data quality information
  • dataQualityEdit: false prevents modification of data quality settings

This maintains consistency with the readonly role's intended access level.

frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions.vue (2)

82-96: LGTM! Well-structured component implementation.

The helper functions, lifecycle hooks, and component definition are well-implemented and follow Vue best practices.


26-27: Review the icon size.

The text-10xl class results in an extremely large icon size. Consider using a more moderate size class like text-6xl or text-7xl for better visual balance.

frontend/src/modules/data-quality/components/member/data-quality-member-issues.vue (2)

106-110: LGTM!

The component definition follows Vue.js naming conventions and is properly exported.


1-110: Verify component integration and dependencies.

Let's verify the component's integration with the data quality feature and the existence of its dependencies.

✅ Verification successful

Let me gather more information with a refined search strategy.

This refined search should help us verify the component integration and dependencies more thoroughly.


Based on the comprehensive results, I can now generate the final response.

Component integration and dependencies verified successfully

The verification results show that:

  • All required UI components (Button, Radio, Spinner) exist and are properly implemented in the ui-kit directory
  • The component is properly integrated within the data quality module, used by data-quality-member.vue
  • The child component data-quality-member-issues-item.vue exists
  • The DataQualityApiService is implemented in the expected location
  • DataIssueType and data issue types configuration are properly defined with multiple type implementations
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify component integration and dependencies

# Test 1: Verify the parent component that uses this component
echo "Checking parent component usage..."
rg -l "LfDataQualityMemberIssues" --type vue

# Test 2: Verify the existence of required components
echo -e "\nChecking required components..."
components=(
    "ui-kit/spinner/Spinner.vue"
    "ui-kit/button/Button.vue"
    "ui-kit/radio/Radio.vue"
    "modules/data-quality/components/member/data-quality-member-issues-item.vue"
)
for component in "${components[@]}"; do
    fd -e vue "$component"
done

# Test 3: Verify the data issue types configuration
echo -e "\nChecking data issue types configuration..."
rg -l "dataIssueTypes" --type ts

Length of output: 948


Script:

#!/bin/bash
# Description: Verify component integration and dependencies with refined search

# Test 1: Search for vue files that might use this component with a different pattern
echo "Checking parent component usage (refined)..."
fd -e vue . | xargs rg -l "data-quality-member-issues"

# Test 2: Verify the existence of required components with a broader search
echo -e "\nChecking UI components..."
fd -e vue -e ts "Spinner|Button|Radio" frontend/src/ui-kit

# Test 3: Check data quality related components
echo -e "\nChecking data quality components..."
fd -e vue "data-quality.*member.*issues" frontend/src/modules

# Test 4: Verify DataQualityApiService implementation
echo -e "\nChecking API service implementation..."
rg -l "class DataQualityApiService" --type ts

# Test 5: Check DataIssueType definition
echo -e "\nChecking DataIssueType definition..."
rg -l "DataIssueType" --type ts

Length of output: 1917

frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions.vue (1)

95-99: LGTM!

The component export follows Vue naming conventions correctly.

frontend/src/shared/modules/monitoring/types/event.ts (1)

38-38: LGTM! Verify usage in related components.

The addition of DATA_QUALITY_ASSISTANT to PageEventKey follows the established naming patterns and integrates well with the new data quality management feature.

Let's verify its usage in the related components:

✅ Verification successful

Let me try different search patterns and file types to verify the usage.


The DATA_QUALITY_ASSISTANT enum value is properly integrated and used

The verification shows that the new enum value is correctly used in the codebase:

  • It's properly defined in frontend/src/shared/modules/monitoring/types/event.ts
  • It's used in frontend/src/modules/data-quality/data-quality.routes.ts for route configuration and event tracking
  • The string value is consistently used across these files
  • The implementation aligns with the data quality feature's structure, as evidenced by the numerous related files in both frontend and backend
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the usage of DATA_QUALITY_ASSISTANT enum value
# Expected: Should find references in data quality related components

# Search for direct usage of the enum value
rg "PageEventKey.DATA_QUALITY_ASSISTANT" -t ts -t vue

# Search for the string value in case it's used directly
rg "'Data Quality Assistant'" -t ts -t vue

Length of output: 160


Script:

#!/bin/bash
# Search for direct usage of the enum value without file type restrictions
rg "PageEventKey.DATA_QUALITY_ASSISTANT"

# Search for string literal without file type restrictions
rg "'Data Quality Assistant'"

# Search for the enum value in TypeScript files
rg "DATA_QUALITY_ASSISTANT" -t typescript

# List all files that might be related to data quality assistant
fd -e ts -e tsx -e vue "quality" 

Length of output: 1738

backend/src/api/index.ts (1)

242-242: LGTM! Verify error handling consistency.

The data quality route is correctly placed before tenant and segment middleware. Let's verify the error handling implementation matches the existing pattern.

✅ Verification successful

Let me generate a new script to check the error handling pattern with a corrected regex:


Let me try one more time with a properly escaped regex pattern:


Error handling is properly implemented and consistent with the codebase patterns

The verification shows that the data quality routes are well-implemented with proper error handling:

  • Uses safeWrap middleware for global error handling
  • Implements permission checks via PermissionChecker
  • Uses standardized response handling through req.responseHandler
  • Includes proper input validation with clear error messages
  • Has comprehensive API documentation with response codes
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify error handling consistency in the data quality routes

# Test 1: Check if the route uses safeWrap middleware like other routes
echo "Checking for safeWrap usage in data quality routes..."
ast-grep --pattern 'safeWrap($_)' backend/src/api/dataQuality/

# Test 2: Check if the route implements similar error handling as other routes
echo "Checking error handling pattern..."
rg -A 5 "catch\s*\([^)]*\)\s*{" backend/src/api/dataQuality/

Length of output: 668


Script:

#!/bin/bash
# Description: Verify error handling consistency in the data quality routes

# Check error handling implementation in data quality files
echo "Checking error handling in data quality member route..."
rg "catch.*{" -A 5 backend/src/api/dataQuality/dataQualityMember.ts

echo "Checking error handling in data quality organization route..."
rg "catch.*{" -A 5 backend/src/api/dataQuality/dataQualityOrganization.ts

# Get a sample of error handling from other routes for comparison
echo "Checking error handling pattern in other routes..."
rg "catch.*{" -A 5 "backend/src/api/!(dataQuality)/**/*.ts" -l | head -n 2 | xargs rg "catch.*{" -A 5

Length of output: 934


Script:

#!/bin/bash
# Description: Verify error handling consistency in the data quality routes

# Check error handling implementation in data quality files
echo "Checking error handling in data quality member route..."
rg "catch\s*\{" -A 5 backend/src/api/dataQuality/dataQualityMember.ts

echo "Checking error handling in data quality organization route..."
rg "catch\s*\{" -A 5 backend/src/api/dataQuality/dataQualityOrganization.ts

# Let's also verify the middleware usage and route structure
echo "Checking route structure and middleware..."
cat backend/src/api/dataQuality/index.ts

# Get implementation of the route handlers
echo "Checking route handler implementations..."
cat backend/src/api/dataQuality/dataQualityMember.ts
cat backend/src/api/dataQuality/dataQualityOrganization.ts

Length of output: 3851

services/libs/data-access-layer/src/data-quality/organizations.ts (1)

1-143: Overall, the code looks good with minor issues to address

The code is well-structured and aligns with project conventions. Addressing the minor issues will enhance data integrity and prevent potential bugs.

🧰 Tools
🪛 GitHub Check: lint-format-services

[failure] 2-2:
Insert

Comment on lines +1 to +2
import { safeWrap } from '../../middlewares/errorMiddleware'

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider importing route handlers directly.

Instead of using require in the route definitions, consider importing the handlers at the top of the file for better code organization and to leverage TypeScript's static type checking.

import { safeWrap } from '../../middlewares/errorMiddleware'
+import memberHandler from './dataQualityMember'
+import organizationHandler from './dataQualityOrganization'

Committable suggestion was skipped due to low confidence.

Comment on lines 1 to 5
export enum IDataQualityType {
MORE_THAN_10_IDENTITIES = 'more-than-10-identities',
MORE_THAN_1_IDENTITY_PER_PLATFORM = 'more-than-1-identity-per-platform',
NO_WORK_EXPERIENCE = 'no-work-experience',
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider revising enum naming and documentation.

  1. The 'I' prefix is typically used for interfaces in TypeScript, not enums. Consider renaming to DataQualityType.
  2. Consider using SCREAMING_SNAKE_CASE for enum values to follow TypeScript conventions.
  3. Add JSDoc comments to document each type's purpose.
+/**
+ * Represents different types of data quality issues that can be detected.
+ */
-export enum IDataQualityType {
+export enum DataQualityType {
+  /** Members with more than 10 identities across all platforms */
-  MORE_THAN_10_IDENTITIES = 'more-than-10-identities',
+  MORE_THAN_10_IDENTITIES = 'MORE_THAN_10_IDENTITIES',
+  /** Members with multiple identities on the same platform */
-  MORE_THAN_1_IDENTITY_PER_PLATFORM = 'more-than-1-identity-per-platform',
+  MORE_THAN_1_IDENTITY_PER_PLATFORM = 'MORE_THAN_1_IDENTITY_PER_PLATFORM',
+  /** Members without any work experience data */
-  NO_WORK_EXPERIENCE = 'no-work-experience',
+  NO_WORK_EXPERIENCE = 'NO_WORK_EXPERIENCE',
}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export enum IDataQualityType {
MORE_THAN_10_IDENTITIES = 'more-than-10-identities',
MORE_THAN_1_IDENTITY_PER_PLATFORM = 'more-than-1-identity-per-platform',
NO_WORK_EXPERIENCE = 'no-work-experience',
}
/**
* Represents different types of data quality issues that can be detected.
*/
export enum DataQualityType {
/** Members with more than 10 identities across all platforms */
MORE_THAN_10_IDENTITIES = 'MORE_THAN_10_IDENTITIES',
/** Members with multiple identities on the same platform */
MORE_THAN_1_IDENTITY_PER_PLATFORM = 'MORE_THAN_1_IDENTITY_PER_PLATFORM',
/** Members without any work experience data */
NO_WORK_EXPERIENCE = 'NO_WORK_EXPERIENCE',
}

Comment on lines 3 to 8
const noWorkExperience: DataIssueTypeConfig = {
label: 'No work experience',
badgeType: 'danger',
badgeText: () => 'Missing work experiences',
description: () => 'This profile has no work experience, please review if this is correct.',
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider enhancing the description with more actionable guidance.

While the current description is clear, it could be more helpful by providing specific actions users can take to resolve the issue.

Consider updating the description to be more actionable:

-  description: () => 'This profile has no work experience, please review if this is correct.',
+  description: () => 'This profile has no work experience. If this is incorrect, add relevant work history or mark the profile as a student/new graduate if applicable.',
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const noWorkExperience: DataIssueTypeConfig = {
label: 'No work experience',
badgeType: 'danger',
badgeText: () => 'Missing work experiences',
description: () => 'This profile has no work experience, please review if this is correct.',
};
const noWorkExperience: DataIssueTypeConfig = {
label: 'No work experience',
badgeType: 'danger',
badgeText: () => 'Missing work experiences',
description: () => 'This profile has no work experience. If this is incorrect, add relevant work history or mark the profile as a student/new graduate if applicable.',
};

label: 'More than 10 identities',
badgeType: 'warning',
badgeText: () => 'More than 10 identities',
description: (member: any) => `This profile has ${member.identityCount} identities, please review them if they are all necessary.`,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Replace any type with a proper interface.

Using any type reduces type safety. Consider creating a proper interface for the member object that explicitly defines the expected properties.

- description: (member: any) => `This profile has ${member.identityCount} identities, please review them if they are all necessary.`,
+ interface Member {
+   identityCount: number;
+ }
+ description: (member: Member) => `This profile has ${member.identityCount} identities, please review them if they are all necessary.`,

Committable suggestion was skipped due to low confidence.

const moreThan1IdentityPerPlatform: DataIssueTypeConfig = {
label: 'More than 1 identity per platform',
badgeType: 'warning',
badgeText: () => 'More than 1 identity per platform',
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider making the badge text dynamic.

The badge text could be more informative by including the number of identities or affected platforms.

-  badgeText: () => 'More than 1 identity per platform',
+  badgeText: (member: { platforms: string }) => {
+    const platforms = member.platforms?.split(',') ?? [];
+    return `${platforms.length} identities across platforms`;
+  },

Committable suggestion was skipped due to low confidence.

Comment on lines 87 to 112
`
WITH platform_identities AS (
SELECT
mi."memberId",
mi.platform,
COUNT(*) AS "identityCount"
FROM "memberIdentities" mi
JOIN "members" m ON mi."memberId" = m.id
WHERE m."tenantId" = '${tenantId}' AND mi.type = 'username'
GROUP BY mi."memberId", mi.platform
HAVING COUNT(*) > ${treshold}
)
SELECT
p."memberId",
m."displayName",
m."attributes",
m.id,
STRING_AGG(p.platform, ',') AS platforms,
msa."activityCount"
FROM platform_identities p
JOIN "members" m ON p."memberId" = m.id
LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
GROUP BY p."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Use parameterized queries to prevent SQL injection vulnerabilities.

Directly inserting variables into SQL queries can be dangerous. Use placeholders and parameter arrays to secure the query.

Apply this diff:

 return qx.select(
     `
         WITH platform_identities AS (
             SELECT
                 mi."memberId",
                 mi.platform,
                 COUNT(*) AS "identityCount"
             FROM "memberIdentities" mi
                      JOIN "members" m ON mi."memberId" = m.id
-            WHERE m."tenantId" = '${tenantId}' AND mi.type = 'username'
+            WHERE m."tenantId" = $1 AND mi.type = 'username'
             GROUP BY mi."memberId", mi.platform
-            HAVING COUNT(*) > ${treshold}
+            HAVING COUNT(*) > $2
         )
         SELECT
             p."memberId",
             m."displayName",
             m."attributes",
             m.id,
             STRING_AGG(p.platform, ',') AS platforms,
             msa."activityCount"
         FROM platform_identities p
                  JOIN "members" m ON p."memberId" = m.id
-                 LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
+                 LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $3
         GROUP BY p."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
         ORDER BY msa."activityCount" DESC
-        LIMIT ${limit} OFFSET ${offset};
+        LIMIT $4 OFFSET $5;
     `,
-    {
-      treshold,
-      tenantId,
-      limit,
-      offset,
-      segmentId,
-    },
+    [
+      tenantId,
+      threshold,
+      segmentId,
+      limit,
+      offset,
+    ],
   )

Ensure the parameter list aligns with the placeholders in your SQL query.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
`
WITH platform_identities AS (
SELECT
mi."memberId",
mi.platform,
COUNT(*) AS "identityCount"
FROM "memberIdentities" mi
JOIN "members" m ON mi."memberId" = m.id
WHERE m."tenantId" = '${tenantId}' AND mi.type = 'username'
GROUP BY mi."memberId", mi.platform
HAVING COUNT(*) > ${treshold}
)
SELECT
p."memberId",
m."displayName",
m."attributes",
m.id,
STRING_AGG(p.platform, ',') AS platforms,
msa."activityCount"
FROM platform_identities p
JOIN "members" m ON p."memberId" = m.id
LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
GROUP BY p."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
`
WITH platform_identities AS (
SELECT
mi."memberId",
mi.platform,
COUNT(*) AS "identityCount"
FROM "memberIdentities" mi
JOIN "members" m ON mi."memberId" = m.id
WHERE m."tenantId" = $1 AND mi.type = 'username'
GROUP BY mi."memberId", mi.platform
HAVING COUNT(*) > $2
)
SELECT
p."memberId",
m."displayName",
m."attributes",
m.id,
STRING_AGG(p.platform, ',') AS platforms,
msa."activityCount"
FROM platform_identities p
JOIN "members" m ON p."memberId" = m.id
LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $3
GROUP BY p."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT $4 OFFSET $5;
`,
[
tenantId,
threshold,
segmentId,
limit,
offset,
],

Comment on lines +72 to +101
export async function updateMemberOrganization(
qx: QueryExecutor,
memberId: string,
id: string,
data: Partial<IMemberOrganization>,
): Promise<void> {
return qx.result(
`
UPDATE "memberOrganizations"
SET
"organizationId" = $(organizationId),
"dateStart" = $(dateStart),
"dateEnd" = $(dateEnd),
title = $(title),
source = $(source),
"updatedAt" = $(updatedAt)
WHERE "memberId" = $(memberId) AND "id" = $(id);
`,
{
memberId,
id,
organizationId: data.organizationId,
dateStart: data.dateStart,
dateEnd: data.dateEnd,
title: data.title,
source: data.source,
updatedAt: new Date().toISOString(),
},
)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Prevent unintended overwrites by updating only defined fields in updateMemberOrganization

In the updateMemberOrganization function, updating all fields without checking for undefined values may overwrite existing data with NULL. Consider modifying the SQL query to update only the fields provided in data.

Here's how you might adjust the code:

export async function updateMemberOrganization(
  qx: QueryExecutor,
  memberId: string,
  id: string,
  data: Partial<IMemberOrganization>,
): Promise<void> {
  const fields = []
  const params: any = { memberId, id, updatedAt: new Date().toISOString() }

  if (data.organizationId !== undefined) {
    fields.push('"organizationId" = $(organizationId)')
    params.organizationId = data.organizationId
  }

  if (data.dateStart !== undefined) {
    fields.push('"dateStart" = $(dateStart)')
    params.dateStart = data.dateStart
  }

  if (data.dateEnd !== undefined) {
    fields.push('"dateEnd" = $(dateEnd)')
    params.dateEnd = data.dateEnd
  }

  if (data.title !== undefined) {
    fields.push('title = $(title)')
    params.title = data.title
  }

  if (data.source !== undefined) {
    fields.push('source = $(source)')
    params.source = data.source
  }

  fields.push('"updatedAt" = $(updatedAt)')

  const sql = `
    UPDATE "memberOrganizations"
    SET
      ${fields.join(', ')}
    WHERE "memberId" = $(memberId) AND "id" = $(id);
  `

  return qx.result(sql, params)
}

Comment on lines +50 to +70
export async function createMemberOrganization(
qx: QueryExecutor,
memberId: string,
data: Partial<IMemberOrganization>,
): Promise<void> {
return qx.result(
`
INSERT INTO "memberOrganizations"("memberId", "organizationId", "dateStart", "dateEnd", "title", "source", "createdAt", "updatedAt")
VALUES($(memberId), $(organizationId), $(dateStart), $(dateEnd), $(title), $(source), $(date), $(date))
`,
{
memberId,
organizationId: data.organizationId,
dateStart: data.dateStart,
dateEnd: data.dateEnd,
title: data.title,
source: data.source,
date: new Date().toISOString(),
},
)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ensure required fields are validated in createMemberOrganization

When creating a new member organization, if required fields are missing in data, the database may insert NULL values, which could lead to data integrity issues. Please validate that all required fields are provided before insertion.

Implement input validation to check for required fields like organizationId, title, and others before executing the SQL query.

Comment on lines +129 to +134
DELETE FROM "memberOrganizations"
WHERE "memberId" = $(memberId)
AND "organizationId" = $(organizationId)
AND (("dateStart" = $(dateStart)) OR ("dateStart" IS NULL AND $(dateStart) IS NULL))
AND (("dateEnd" = $(dateEnd)) OR ("dateEnd" IS NULL AND $(dateEnd) IS NULL))
AND "deletedAt" IS NOT NULL;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Double-check NULL handling in cleanSoftDeletedMemberOrganization conditions

The DELETE statement in cleanSoftDeletedMemberOrganization includes conditions that compare dateStart and dateEnd, handling NULL values explicitly. Ensure that this logic accurately identifies and deletes only the intended records, as incorrect handling may lead to unintended data loss.

Review and test the conditional logic to confirm it behaves as expected when dateStart or dateEnd are NULL.

@@ -0,0 +1,143 @@
import { IMemberOrganization } from '@crowd/types'
import { QueryExecutor } from '../queryExecutor'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Insert a newline after import statements for better readability

Please add a newline between the import statements to comply with the project's linting and formatting guidelines.

Apply this diff to fix the formatting issue:

 import { IMemberOrganization } from '@crowd/types'

+
 import { QueryExecutor } from '../queryExecutor'

Committable suggestion was skipped due to low confidence.

🧰 Tools
🪛 GitHub Check: lint-format-services

[failure] 2-2:
Insert

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)
backend/src/database/repositories/dataQualityRepository.ts (1)

52-52: Extract magic numbers into named constants.

The numbers 15 and 1 are used as thresholds but their meaning isn't immediately clear. Consider extracting these into named constants at the module level.

+ const MAX_TOTAL_IDENTITIES = 15;
+ const MAX_IDENTITIES_PER_PLATFORM = 1;

  static async findMembersWithTooManyIdentities(...) {
-   return fetchMembersWithTooManyIdentities(qx, 15, tenantId, limit, offset, segmentId)
+   return fetchMembersWithTooManyIdentities(qx, MAX_TOTAL_IDENTITIES, tenantId, limit, offset, segmentId)
  }

  static async findMembersWithTooManyIdentitiesPerPlatform(...) {
-   return fetchMembersWithTooManyIdentitiesPerPlatform(qx, 1, tenantId, limit, offset, segmentId)
+   return fetchMembersWithTooManyIdentitiesPerPlatform(qx, MAX_IDENTITIES_PER_PLATFORM, tenantId, limit, offset, segmentId)
  }

Also applies to: 74-74

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 2d06e22 and 9f37f66.

📒 Files selected for processing (4)
  • backend/src/api/dataQuality/dataQualityMember.ts (1 hunks)
  • backend/src/api/dataQuality/dataQualityOrganization.ts (1 hunks)
  • backend/src/api/index.ts (1 hunks)
  • backend/src/database/repositories/dataQualityRepository.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • backend/src/api/dataQuality/dataQualityMember.ts
  • backend/src/api/dataQuality/dataQualityOrganization.ts
  • backend/src/api/index.ts
🧰 Additional context used
🪛 Biome
backend/src/database/repositories/dataQualityRepository.ts

[error] 10-76: Avoid classes that contain only static members.

Prefer using simple functions instead of classes with only static members.

(lint/complexity/noStaticOnlyClass)

🔇 Additional comments (1)
backend/src/database/repositories/dataQualityRepository.ts (1)

1-5: Verify data access layer integration.

The repository relies on imported functions from @crowd/data-access-layer. Let's verify these functions exist and have matching signatures.

✅ Verification successful

Let me try a different approach to verify these functions in the data access layer.

This will help us:

  1. Find where the data-quality module is referenced
  2. Locate the actual function declarations with some context

Let me get more context about the function signatures to ensure they match completely.


Data access layer integration is verified and exists

The verification confirms that all three functions are properly defined in the data access layer at services/libs/data-access-layer/src/data-quality/members.ts with appropriate signatures:

  • fetchMembersWithoutWorkExperience(qx: QueryExecutor, tenantId: string, limit: number, offset: number)
  • fetchMembersWithTooManyIdentities(qx: QueryExecutor, treshold = 15, tenantId: string, limit: number)
  • fetchMembersWithTooManyIdentitiesPerPlatform(qx: QueryExecutor, treshold = 1, tenantId: string, limit: number)

The imports in dataQualityRepository.ts correctly reference these existing functions.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that the imported DAL functions exist and have matching signatures

# Test: Search for the function declarations in the DAL
ast-grep --pattern 'export const fetch$_(qx: any, $$$): Promise<$_>' \
  | grep -E 'fetchMembersWithoutWorkExperience|fetchMembersWithTooManyIdentities|fetchMembersWithTooManyIdentitiesPerPlatform'

Length of output: 195


Script:

#!/bin/bash
# Search for the function declarations using ripgrep for better context
rg -l "@crowd/data-access-layer/src/data-quality" --type ts

# Then search for these specific functions in the codebase
rg "export.*(fetchMembersWithTooManyIdentities|fetchMembersWithTooManyIdentitiesPerPlatform|fetchMembersWithoutWorkExperience)" --type ts -A 2

Length of output: 1107


Script:

#!/bin/bash
# Get the complete function declarations with their return types
rg "export async function (fetchMembersWithTooManyIdentities|fetchMembersWithTooManyIdentitiesPerPlatform|fetchMembersWithoutWorkExperience)" --type ts -A 4 services/libs/data-access-layer/src/data-quality/members.ts

Length of output: 631

Comment on lines 11 to 76
class DataQualityRepository {
/**
* Finds members with no work experience.
*
* @param {IRepositoryOptions} options - The repository options for executing the query.
* @param {string} tenantId - The ID of the tenant.
* @param {number} limit - The maximum number of results to return.
* @param {number} offset - The offset from which to begin returning results.
* @param {string} segmentId - The ID of the segment.
* @return {Promise<Members[]>} - A promise that resolves with an array of members having no work experience.
*/
static async findMembersWithNoWorkExperience(
options: IRepositoryOptions,
tenantId: string,
limit: number,
offset: number,
segmentId: string,
) {
const qx = SequelizeRepository.getQueryExecutor(options)
return fetchMembersWithoutWorkExperience(qx, tenantId, limit, offset, segmentId)
}

/**
* Finds and returns members with too many identities.
* Executes a query to fetch members who exceed a certain number of identities.
*
* @param {IRepositoryOptions} options - Repository options for querying the database.
* @param {string} tenantId - Identifier of the tenant whose members are being queried.
* @param {number} limit - The maximum number of members to retrieve.
* @param {number} offset - The number of members to skip before starting to collect the result set.
* @param {string} segmentId - Identifier of the segment to filter members.
* @return {Promise<Array>} A promise that resolves to an array of members with too many identities.
*/
static async findMembersWithTooManyIdentities(
options: IRepositoryOptions,
tenantId: string,
limit: number,
offset: number,
segmentId: string,
) {
const qx = SequelizeRepository.getQueryExecutor(options)
return fetchMembersWithTooManyIdentities(qx, 15, tenantId, limit, offset, segmentId)
}

/**
* Finds members with too many identities per platform.
*
* @param {IRepositoryOptions} options - The repository options for database connection and other configurations.
* @param {string} tenantId - The ID of the tenant to filter members by.
* @param {number} limit - The maximum number of members to return.
* @param {number} offset - The number of members to skip before starting to collect the result set.
* @param {string} segmentId - The ID of the segment to filter members by.
*
* @return {Promise<Array>} A promise that resolves to an array of members with too many identities per platform.
*/
static async findMembersWithTooManyIdentitiesPerPlatform(
options: IRepositoryOptions,
tenantId: string,
limit: number,
offset: number,
segmentId: string,
) {
const qx = SequelizeRepository.getQueryExecutor(options)
return fetchMembersWithTooManyIdentitiesPerPlatform(qx, 1, tenantId, limit, offset, segmentId)
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Convert class to a TypeScript module with standalone functions.

The class currently only contains static methods, which suggests it's being used as a namespace. Converting this into a module with exported functions would be more idiomatic in TypeScript and improve maintainability.

Consider refactoring to:

- class DataQualityRepository {
+ const getQueryExecutor = (options: IRepositoryOptions) => 
+   SequelizeRepository.getQueryExecutor(options);

+ export interface DataQualityQueryParams {
+   options: IRepositoryOptions;
+   tenantId: string;
+   limit: number;
+   offset: number;
+   segmentId: string;
+ }

+ export const findMembersWithNoWorkExperience = async ({
+   options,
+   tenantId,
+   limit,
+   offset,
+   segmentId,
+ }: DataQualityQueryParams) => {
+   const qx = getQueryExecutor(options);
+   return fetchMembersWithoutWorkExperience(qx, tenantId, limit, offset, segmentId);
+ };

// Similar refactoring for other methods

Committable suggestion was skipped due to low confidence.

🧰 Tools
🪛 Biome

[error] 10-76: Avoid classes that contain only static members.

Prefer using simple functions instead of classes with only static members.

(lint/complexity/noStaticOnlyClass)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 16

🧹 Outside diff range and nitpick comments (6)
frontend/src/modules/data-quality/config/types/conflicting-work-experience.ts (1)

3-8: Enhance clarity and actionability of messages

The configuration structure is good, but the messages could be more informative and actionable.

Consider this improved version:

 const conflictingWorkExperience: DataIssueTypeConfig = {
   label: 'Conflicting work experience',
   badgeType: 'danger',
-  badgeText: () => 'Conflicting work experiences',
-  description: () => 'This profile has Conflicting work experiences, please review them.',
+  badgeText: () => 'Overlapping work experiences',
+  description: () => 'This profile contains work experiences with overlapping date ranges at different organizations. Please review and resolve the timeline conflicts.',
 };
frontend/src/modules/data-quality/config/types/too-many-identities-per-platform.ts (1)

1-15: Consider adding JSDoc documentation.

The configuration object would benefit from documentation explaining its purpose and usage.

Add JSDoc comments:

+/**
+ * Configuration for detecting and displaying warnings about multiple verified identities
+ * on the same platform for a member.
+ *
+ * @type {DataIssueTypeConfig}
+ */
 const tooManyIdentitiesPerPlatform: DataIssueTypeConfig = {
backend/src/services/dataQualityService.ts (2)

1-1: Remove unnecessary ESLint disable comment.

The /* eslint-disable no-continue */ comment can be removed as there are no continue statements in this file.


74-78: Enhance TODO tracking and method implementation.

Consider the following improvements:

  1. Add tracking information to the TODO comment (e.g., issue number)
  2. Make the method static instead of disabling the ESLint rule

Apply this diff:

-  // TODO: Implement this method when there are checks available
-  // eslint-disable-next-line class-methods-use-this
-  async findOrganizationIssues() {
+  // TODO(#issue): Implement organization data quality checks
+  static async findOrganizationIssues(): Promise<[]> {
     return Promise.resolve([])
   }
frontend/src/modules/data-quality/components/data-quality-member.vue (1)

22-47: Remove commented out code.

Remove the commented-out code block as it:

  1. Clutters the file and reduces readability
  2. Can be retrieved from version control if needed
  3. May become outdated and misleading over time
backend/src/database/repositories/dataQualityRepository.ts (1)

34-43: Ensure consistency in parameter documentation

The parameter descriptions in the JSDoc comments vary across methods. For consistency and clarity, consider standardizing the descriptions for parameters like tenantId, limit, offset, and segmentId.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 9f37f66 and 4f1717c.

📒 Files selected for processing (12)
  • backend/src/database/repositories/dataQualityRepository.ts (1 hunks)
  • backend/src/services/dataQualityService.ts (1 hunks)
  • backend/src/types/data-quality/data-quality-filters.ts (1 hunks)
  • frontend/src/modules/data-quality/components/data-quality-member.vue (1 hunks)
  • frontend/src/modules/data-quality/components/member/data-quality-member-issues.vue (1 hunks)
  • frontend/src/modules/data-quality/config/data-issue-types.ts (1 hunks)
  • frontend/src/modules/data-quality/config/types/conflicting-work-experience.ts (1 hunks)
  • frontend/src/modules/data-quality/config/types/incomplete-work-experience.ts (1 hunks)
  • frontend/src/modules/data-quality/config/types/too-many-emails.ts (1 hunks)
  • frontend/src/modules/data-quality/config/types/too-many-identities-per-platform.ts (1 hunks)
  • frontend/src/modules/data-quality/types/DataIssueType.ts (1 hunks)
  • services/libs/data-access-layer/src/data-quality/members.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • frontend/src/modules/data-quality/components/member/data-quality-member-issues.vue
  • frontend/src/modules/data-quality/config/data-issue-types.ts
🧰 Additional context used
🪛 Biome
backend/src/database/repositories/dataQualityRepository.ts

[error] 10-98: Avoid classes that contain only static members.

Prefer using simple functions instead of classes with only static members.

(lint/complexity/noStaticOnlyClass)

🔇 Additional comments (11)
frontend/src/modules/data-quality/types/DataIssueType.ts (1)

1-8: LGTM! Clean and well-structured enum implementation.

The enum follows TypeScript best practices with consistent naming conventions and clear categorization of data quality issues.

frontend/src/modules/data-quality/config/types/conflicting-work-experience.ts (2)

1-10: Verify consistency with other data quality configurations

Let's ensure this configuration follows the same patterns as other data quality issue types.

#!/bin/bash
# Find and compare with other data quality type configurations
fd -e ts -e tsx --full-path "frontend/src/modules/data-quality/config/types" --exec cat {} \;

# Verify the integration in the data quality system
rg -t typescript "conflictingWorkExperience" "frontend/src/modules/data-quality/"

1-1: Verify the imported type exists

The import statement looks correct, but let's verify the existence and structure of the imported type.

✅ Verification successful

Let me generate additional scripts to verify the existence and structure of the DataIssueTypeConfig type definition, as the previous search didn't specifically locate it.

The previous results showed that multiple files are consistently importing and using the type with similar structure (all having label, badgeType, badgeText, and description properties), but we need to verify the actual type definition to ensure it exists and matches this usage pattern.


Based on the results, I found the DataIssueTypeConfig interface definition in the data-issue-types.ts file. The type exists and is properly defined with the expected structure:

export interface DataIssueTypeConfig {
  label: string;
  badgeType: string;
  badgeText: (entity: Contributor | Organization) => string;
  description: (entity: Contributor | Organization) => string;
}

Import statement is valid and type exists

The import statement is correctly referencing an existing type that matches the usage pattern seen across multiple data quality configuration files. The type is well-defined with all the required properties (label, badgeType, badgeText, and description) that are consistently used by other data quality configurations in the same directory.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify the DataIssueTypeConfig type definition exists
rg -t typescript "export.*DataIssueTypeConfig" "frontend/src/modules/data-quality/config/"

# Check the type structure to ensure it matches our usage
ast-grep --pattern 'type DataIssueTypeConfig = {
  $$$
}'

Length of output: 3038


Script:

#!/bin/bash
# Search for the DataIssueTypeConfig type definition in the data-issue-types file
fd "data-issue-types.ts" frontend/src/modules/data-quality/config/ --exec cat {}

# As backup, search for any type definition of DataIssueTypeConfig in the codebase
ast-grep --pattern 'type DataIssueTypeConfig = $_'

Length of output: 49832

backend/src/types/data-quality/data-quality-filters.ts (1)

1-8: Verify implementation of data quality checks

Let's ensure that all enum values have corresponding implementations in the service layer.

✅ Verification successful

Let me gather more information about the actual implementations since we only found the type definitions. Let's search for the service layer implementations.


Let me generate the final response based on the gathered information.

Data quality types are properly implemented across the codebase

The verification shows that all enum values from IDataQualityType are properly referenced and implemented:

  • All data quality types are consistently used in both frontend (frontend/src/modules/data-quality/) and backend (backend/src/) codebases
  • The implementation is handled through DataQualityService and DataQualityApiService classes
  • Identity-related validations are implemented in the member identity service
  • Email validations are present in the member service
  • The data quality checks are integrated with the tenant-specific endpoints
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that each data quality type has a corresponding implementation

# Search for implementations of each data quality type
echo "Searching for TOO_MANY_IDENTITIES implementations..."
rg -l "too-many-identities" --type ts

echo "Searching for TOO_MANY_IDENTITIES_PER_PLATFORM implementations..."
rg -l "too-many-identities-per-platform" --type ts

echo "Searching for TOO_MANY_EMAILS implementations..."
rg -l "too-many-emails" --type ts

echo "Searching for NO_WORK_EXPERIENCE implementations..."
rg -l "no-work-experience" --type ts

echo "Searching for INCOMPLETE_WORK_EXPERIENCE implementations..."
rg -l "incomplete-work-experience" --type ts

echo "Searching for CONFLICTING_WORK_EXPERIENCE implementations..."
rg -l "conflicting-work-experience" --type ts

# Search for the service implementation pattern
echo "Searching for potential service implementations..."
ast-grep --pattern 'class $_ implements $_ {
  $$$
  findMemberIssues($_, $_) {
    $$$
  }
  $$$
}'

Length of output: 2177


Script:

#!/bin/bash
# Search for data quality service implementations
echo "Searching for data quality service files..."
fd -e ts -e js quality service

# Search for specific implementation patterns
echo "Searching for data quality check implementations..."
rg -A 5 "class.*DataQuality.*Service" --type ts

# Look for actual validation logic
echo "Searching for validation methods..."
rg -A 10 "(validate|check|verify)(Identity|WorkExperience|Email)" --type ts

# Search for test files to verify implementation coverage
echo "Searching for related test files..."
fd -e spec.ts -e test.ts quality

Length of output: 13151

frontend/src/modules/data-quality/config/types/incomplete-work-experience.ts (2)

12-12: LGTM!

The export statement follows the module pattern correctly.


1-12: Verify integration with data quality configuration.

Let's ensure this new type is properly integrated with other data quality configurations.

✅ Verification successful

Let me gather more information about how this type is integrated in the main configuration file.

This will help us verify:

  1. How the incomplete-work-experience type is imported and used in the main configuration
  2. If the structure of this type matches other similar configurations in terms of required properties

Let me try one more verification with a modified regex pattern to check the structure of these configurations:

This should help us verify:

  1. If the new type is properly registered in the main configuration
  2. If it follows the same structure as other type configurations (with proper properties)

Integration verified and properly implemented

The incomplete work experience type is correctly integrated:

  • It's properly imported and registered in dataIssueTypes configuration
  • It follows the same structure as other type configurations with required properties (label, badgeType, badgeText, description)
  • The implementation matches the pattern used by similar types like tooManyEmails and tooManyIdentities that also use member data in their description
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the integration of incomplete work experience type with data quality configurations

# Test 1: Check if the type is imported in the main data-issue-types configuration
echo "Checking data-issue-types configuration..."
rg -l "import.*incomplete-work-experience" "frontend/src/modules/data-quality/config/"

# Test 2: Look for similar patterns in other data quality type configurations
echo "Checking consistency with other type configurations..."
rg -l "DataIssueTypeConfig" "frontend/src/modules/data-quality/config/types/"

Length of output: 880


Script:

#!/bin/bash
# Check the actual integration in data-issue-types.ts
echo "Checking how incomplete-work-experience is used in data-issue-types.ts..."
rg "incomplete-work-experience" "frontend/src/modules/data-quality/config/data-issue-types.ts" -A 2 -B 2

# Check the structure of other similar type configurations
echo -e "\nChecking structure of other type configurations for consistency..."
for file in $(fd ".*\.ts$" "frontend/src/modules/data-quality/config/types/"); do
  echo -e "\n=== $file ==="
  rg "DataIssueTypeConfig = {" "$file" -A 3
done

Length of output: 3803


Script:

#!/bin/bash
# Check how the type is used in the main configuration
echo "Checking data-issue-types.ts for type registration..."
rg "export const dataIssueTypes" -A 10 "frontend/src/modules/data-quality/config/data-issue-types.ts"

# Check structure of other configurations with fixed regex
echo -e "\nChecking structure of type configurations..."
for file in $(fd ".*\.ts$" "frontend/src/modules/data-quality/config/types/"); do
  echo -e "\n=== $file ==="
  rg "const.*: DataIssueTypeConfig = \{" "$file" -A 4
done

Length of output: 4635

frontend/src/modules/data-quality/config/types/too-many-identities-per-platform.ts (2)

1-3: LGTM! Clean and focused imports.

The imports are well-organized and specifically import only what's needed.


15-15: LGTM! Clean export statement.

The default export is appropriate for this configuration object.

backend/src/services/dataQualityService.ts (3)

9-15: LGTM! Clean class setup.

The class structure and constructor implementation follow TypeScript best practices.


25-72: Previous review suggestions are still applicable.

The implementation would benefit from the improvements suggested in the previous review regarding error handling, maintainability, and code structure.


9-9: Verify service integration.

Let's verify the proper integration of this service in the application.

Comment on lines 1 to 8
export enum DataIssueType {
TOO_MANY_IDENTITIES = 'too-many-identities',
TOO_MANY_IDENTITIES_PER_PLATFORM = 'too-many-identities-per-platform',
TOO_MANY_EMAILS = 'too-many-emails',
NO_WORK_EXPERIENCE = 'no-work-experience',
INCOMPLETE_WORK_EXPERIENCE = 'incomplete-work-experience',
CONFLICTING_WORK_EXPERIENCE = 'conflicting-work-experience',
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider enhancing documentation and type safety.

To improve maintainability and type safety, consider:

  1. Adding JSDoc comments to document each issue type
  2. Including a type guard function for runtime validation

Here's how you could enhance the implementation:

+/**
+ * Represents different types of data quality issues that can be detected in the system.
+ */
 export enum DataIssueType {
+  /** Multiple identities found for a single member across all platforms */
   TOO_MANY_IDENTITIES = 'too-many-identities',
+  /** Multiple identities found for a single member on the same platform */
   TOO_MANY_IDENTITIES_PER_PLATFORM = 'too-many-identities-per-platform',
+  /** Multiple email addresses associated with a single member */
   TOO_MANY_EMAILS = 'too-many-emails',
+  /** Member has no work experience information */
   NO_WORK_EXPERIENCE = 'no-work-experience',
+  /** Member has work experience entries with missing required information */
   INCOMPLETE_WORK_EXPERIENCE = 'incomplete-work-experience',
+  /** Member has work experience entries with conflicting information */
   CONFLICTING_WORK_EXPERIENCE = 'conflicting-work-experience',
 }

+/**
+ * Type guard to check if a value is a valid DataIssueType
+ * @param value - The value to check
+ * @returns True if the value is a valid DataIssueType
+ */
+export function isDataIssueType(value: string): value is DataIssueType {
+  return Object.values(DataIssueType).includes(value as DataIssueType);
+}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export enum DataIssueType {
TOO_MANY_IDENTITIES = 'too-many-identities',
TOO_MANY_IDENTITIES_PER_PLATFORM = 'too-many-identities-per-platform',
TOO_MANY_EMAILS = 'too-many-emails',
NO_WORK_EXPERIENCE = 'no-work-experience',
INCOMPLETE_WORK_EXPERIENCE = 'incomplete-work-experience',
CONFLICTING_WORK_EXPERIENCE = 'conflicting-work-experience',
}
/**
* Represents different types of data quality issues that can be detected in the system.
*/
export enum DataIssueType {
/** Multiple identities found for a single member across all platforms */
TOO_MANY_IDENTITIES = 'too-many-identities',
/** Multiple identities found for a single member on the same platform */
TOO_MANY_IDENTITIES_PER_PLATFORM = 'too-many-identities-per-platform',
/** Multiple email addresses associated with a single member */
TOO_MANY_EMAILS = 'too-many-emails',
/** Member has no work experience information */
NO_WORK_EXPERIENCE = 'no-work-experience',
/** Member has work experience entries with missing required information */
INCOMPLETE_WORK_EXPERIENCE = 'incomplete-work-experience',
/** Member has work experience entries with conflicting information */
CONFLICTING_WORK_EXPERIENCE = 'conflicting-work-experience',
}
/**
* Type guard to check if a value is a valid DataIssueType
* @param value - The value to check
* @returns True if the value is a valid DataIssueType
*/
export function isDataIssueType(value: string): value is DataIssueType {
return Object.values(DataIssueType).includes(value as DataIssueType);
}

label: 'More than 3 verified emails',
badgeType: 'warning',
badgeText: () => 'More than 3 verified emails',
description: (member: any) => `This profile has ${member.identityCount} verified emails, please review them if they are all necessary.`,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Improve type safety for the member parameter.

The any type bypasses TypeScript's type checking. Consider creating a proper interface for the member object to ensure type safety.

- description: (member: any) => `This profile has ${member.identityCount} verified emails, please review them if they are all necessary.`,
+ interface Member {
+   identityCount: number;
+ }
+ description: (member: Member) => `This profile has ${member.identityCount} verified emails, please review them if they are all necessary.`,

Committable suggestion was skipped due to low confidence.

Comment on lines 3 to 8
const tooManyEmails: DataIssueTypeConfig = {
label: 'More than 3 verified emails',
badgeType: 'warning',
badgeText: () => 'More than 3 verified emails',
description: (member: any) => `This profile has ${member.identityCount} verified emails, please review them if they are all necessary.`,
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

🛠️ Refactor suggestion

Based on the results from both verification scripts, I can now provide a final response:

The review comment is partially correct - both concerns are valid and verified

  1. The code incorrectly uses identityCount to represent email count. From the SQL queries, we can see that email-specific counts should use type = 'email' filter, but the current implementation uses a generic identityCount which counts all identities, not just emails.
  2. The hardcoded threshold of "3" appears in multiple places in the message strings, making it harder to maintain.

Specific issues found:

  • The identityCount property is misleading as it represents all identities, not just verified emails (verified by SQL queries in data-quality/members.ts)
  • The threshold value "3" is duplicated in three places within the component

Note: The suggested refactoring in the original review comment is appropriate, but should also include fixing the identityCount to specifically count email identities.

🔗 Analysis chain

Verify email count logic and consider extracting the threshold.

  1. The label mentions "verified emails" but uses identityCount. Verify if this property actually represents the count of verified emails.
  2. Consider extracting the threshold (3) as a named constant for better maintainability.

Consider refactoring to:

+ const EMAIL_THRESHOLD = 3;
+
 const tooManyEmails: DataIssueTypeConfig = {
-  label: 'More than 3 verified emails',
+  label: `More than ${EMAIL_THRESHOLD} verified emails`,
   badgeType: 'warning',
-  badgeText: () => 'More than 3 verified emails',
+  badgeText: () => `More than ${EMAIL_THRESHOLD} verified emails`,
   description: (member: Member) => 
-    `This profile has ${member.identityCount} verified emails, please review them if they are all necessary.`,
+    `This profile has ${member.identityCount} verified emails (threshold: ${EMAIL_THRESHOLD}), please review them if they are all necessary.`,
 };

Let me analyze the results and generate one more verification script to get complete clarity.

The previous results show that identityCount is used in SQL queries from the memberIdentities table, but we need to verify if it's specifically counting email identities in the context of the "too many emails" check. The new script will help us understand if there's a specific filter for email type identities in the relevant queries.

Let me know the results of this script and I'll provide the final response.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify if identityCount represents verified emails count
# Search for references to identityCount to understand its meaning
rg -A 5 "identityCount" 

# Search for any email verification related code
ast-grep --pattern 'email' 

Length of output: 17345


Script:

#!/bin/bash
# Let's check the SQL queries and schema related to memberIdentities to understand the relationship with emails
rg -A 10 "memberIdentities.*email" 
rg -A 5 "type = 'email'" 

Length of output: 2077

Comment on lines 1 to 2
import { DataIssueTypeConfig } from '@/modules/data-quality/config/data-issue-types';
import pluralize from 'pluralize';
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider adding proper type definition for the member parameter.

Instead of using any type, consider creating an interface that defines the expected shape of the member object with the required organizationsCount property.

interface Member {
  organizationsCount: number;
  // ... other properties
}

Comment on lines 4 to 10
const incompleteWorkExperience: DataIssueTypeConfig = {
label: 'Incomplete work experience',
badgeType: 'warning',
badgeText: () => 'Incomplete work experiences',
description: (member: any) => `This profile has ${member.organizationsCount}
incomplete work ${pluralize('experience', +member.organizationsCount)}, please review ${+member.organizationsCount > 1 ? 'them' : 'it'}.`,
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Improve string formatting and type safety.

The configuration object could benefit from better string formatting and explicit type handling.

Apply these improvements:

 const incompleteWorkExperience: DataIssueTypeConfig = {
   label: 'Incomplete work experience',
   badgeType: 'warning',
   badgeText: () => 'Incomplete work experiences',
-  description: (member: any) => `This profile has ${member.organizationsCount} 
-  incomplete work ${pluralize('experience', +member.organizationsCount)}, please review ${+member.organizationsCount > 1 ? 'them' : 'it'}.`,
+  description: (member: Member) => {
+    const count = Number(member.organizationsCount);
+    return `This profile has ${count} incomplete work ${pluralize('experience', count)}, please review ${count > 1 ? 'them' : 'it'}.`;
+  },
 };

Changes made:

  1. Removed unnecessary line breaks and spaces in the template literal
  2. Added explicit number conversion using Number() instead of the + operator
  3. Improved readability by breaking down the logic

Committable suggestion was skipped due to low confidence.

Comment on lines 4 to 20
<lf-dropdown class="mb-6">
<template #trigger>
<lf-button type="secondary">
<lf-icon name="apps-2-line" />
<span v-if="tab === 'merge-suggestions'">Merge suggestions</span>
<span v-else>{{ dataIssueTypes[tab]?.label || 'Select issue type' }}</span>
</lf-button>
</template>

<lf-dropdown-item @click="tab = 'merge-suggestions'">
Merge suggestions
</lf-dropdown-item>

<lf-dropdown-item v-for="(config, type) in dataIssueTypes" :key="type" @click="tab = type">
{{ config.label }}
</lf-dropdown-item>
</lf-dropdown>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Enhance dropdown accessibility and i18n support.

The dropdown implementation needs improvements for accessibility and internationalization:

Apply these enhancements:

-      <lf-dropdown class="mb-6">
+      <lf-dropdown 
+        class="mb-6"
+        :aria-label="$t('data-quality.issue-type-selector')"
+      >
         <template #trigger>
           <lf-button type="secondary">
             <lf-icon name="apps-2-line" />
-            <span v-if="tab === 'merge-suggestions'">Merge suggestions</span>
-            <span v-else>{{ dataIssueTypes[tab]?.label || 'Select issue type' }}</span>
+            <span v-if="tab === 'merge-suggestions'">{{ $t('data-quality.merge-suggestions') }}</span>
+            <span v-else>{{ $t(dataIssueTypes[tab]?.labelKey) || $t('data-quality.select-issue-type') }}</span>
           </lf-button>
         </template>

-        <lf-dropdown-item @click="tab = 'merge-suggestions'">
-          Merge suggestions
+        <lf-dropdown-item 
+          @click="tab = 'merge-suggestions'"
+          :aria-selected="tab === 'merge-suggestions'"
+        >
+          {{ $t('data-quality.merge-suggestions') }}
         </lf-dropdown-item>

-        <lf-dropdown-item v-for="(config, type) in dataIssueTypes" :key="type" @click="tab = type">
-          {{ config.label }}
+        <lf-dropdown-item 
+          v-for="(config, type) in dataIssueTypes" 
+          :key="type" 
+          @click="tab = type"
+          :aria-selected="tab === type"
+        >
+          {{ $t(config.labelKey) }}
         </lf-dropdown-item>

Committable suggestion was skipped due to low confidence.

Comment on lines 44 to 53
static async findMembersWithTooManyIdentities(
options: IRepositoryOptions,
tenantId: string,
limit: number,
offset: number,
segmentId: string,
) {
const qx = SequelizeRepository.getQueryExecutor(options)
return fetchMembersWithTooManyIdentities(qx, 15, tenantId, limit, offset, segmentId)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider externalizing magic numbers into constants

The hardcoded threshold value 15 in fetchMembersWithTooManyIdentities can be replaced with a named constant to improve readability and maintainability.

Apply this diff to declare a constant for the threshold:

+const MAX_IDENTITIES_THRESHOLD = 15;

...

 static async findMembersWithTooManyIdentities(
     options: IRepositoryOptions,
     tenantId: string,
     limit: number,
     offset: number,
     segmentId: string,
   ) {
     const qx = SequelizeRepository.getQueryExecutor(options)
-    return fetchMembersWithTooManyIdentities(qx, 15, tenantId, limit, offset, segmentId)
+    return fetchMembersWithTooManyIdentities(qx, MAX_IDENTITIES_THRESHOLD, tenantId, limit, offset, segmentId)
 }

Committable suggestion was skipped due to low confidence.

Comment on lines 193 to 205
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo ON m.id = mo."memberId"
LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND (mo."title" IS NULL OR mo."title" = '' OR mo."dateStart" IS NULL)
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Include all non-aggregated columns in the GROUP BY clause

The SQL query selects m."displayName" and m."attributes", but these columns are not included in the GROUP BY clause. SQL requires that all selected columns that are not aggregated must be included in the GROUP BY clause. Failing to do so can result in SQL errors or unexpected behavior.

Apply this diff to include the missing columns in the GROUP BY clause:

           GROUP BY 
-              m.id, msa."activityCount"
+              m.id, m."displayName", m."attributes", msa."activityCount"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo ON m.id = mo."memberId"
LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND (mo."title" IS NULL OR mo."title" = '' OR mo."dateStart" IS NULL)
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo ON m.id = mo."memberId"
LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND (mo."title" IS NULL OR mo."title" = '' OR mo."dateStart" IS NULL)
GROUP BY m.id, m."displayName", m."attributes", msa."activityCount"
ORDER BY msa."activityCount" DESC

Comment on lines 192 to 207
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo ON m.id = mo."memberId"
LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND (mo."title" IS NULL OR mo."title" = '' OR mo."dateStart" IS NULL)
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Use parameterized queries to prevent SQL injection vulnerabilities

Variables are directly embedded into the SQL query string, which can introduce SQL injection risks. It's recommended to use parameter placeholders and pass the parameters separately to ensure the query is safe.

Apply this diff to modify the query:

 return qx.select(
     `
           SELECT
               m.id,
               m."displayName",
               m."attributes",
               msa."activityCount",
               COUNT(mo.id) AS "organizationsCount"
           FROM "members" m
                    JOIN "memberOrganizations" mo ON m.id = mo."memberId"
-                   LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
+                   LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
           WHERE m."tenantId" = $2
             AND (mo."title" IS NULL OR mo."title" = '' OR mo."dateStart" IS NULL)
           GROUP BY m.id, msa."activityCount"
           ORDER BY msa."activityCount" DESC
           LIMIT $3 OFFSET $4;
       `,
-    {
-      tenantId,
-      limit,
-      offset,
-      segmentId,
-    },
+    [
+      segmentId,
+      tenantId,
+      limit,
+      offset,
+    ],
   )

Ensure that the parameters array aligns with the placeholders in your SQL query.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo ON m.id = mo."memberId"
LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND (mo."title" IS NULL OR mo."title" = '' OR mo."dateStart" IS NULL)
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo ON m.id = mo."memberId"
LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
WHERE m."tenantId" = $2
AND (mo."title" IS NULL OR mo."title" = '' OR mo."dateStart" IS NULL)
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT $3 OFFSET $4;
`,
[
segmentId,
tenantId,
limit,
offset,
],

Comment on lines 155 to 173
`
SELECT
mi."memberId",
m."displayName",
m."attributes",
m.id,
COUNT(*) AS "identityCount",
msa."activityCount"
FROM "memberIdentities" mi
JOIN "members" m ON mi."memberId" = m.id
LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND mi.verified = true
AND mi.type = 'email'
GROUP BY mi."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
HAVING COUNT(*) > ${threshold}
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Use parameterized queries to prevent SQL injection vulnerabilities

Variables are directly embedded into the SQL query, which can lead to SQL injection attacks. It's safer to use parameter placeholders and pass the parameters separately to ensure proper sanitization.

Apply this diff to modify the query:

 return qx.select(
     `
             SELECT
                 mi."memberId",
                 m."displayName",
                 m."attributes",
                 m.id,
                 COUNT(*) AS "identityCount",
                 msa."activityCount"
             FROM "memberIdentities" mi
                      JOIN "members" m ON mi."memberId" = m.id
-                     LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
+                     LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
             WHERE m."tenantId" = $2
               AND mi.verified = true
               AND mi.type = 'email'
             GROUP BY mi."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
             HAVING COUNT(*) > $3
             ORDER BY msa."activityCount" DESC
             LIMIT $4 OFFSET $5;
         `,
-    {
-      threshold,
-      tenantId,
-      limit,
-      offset,
-      segmentId,
-    },
+    [
+      segmentId,
+      tenantId,
+      threshold,
+      limit,
+      offset,
+    ],
   )

Ensure that your QueryExecutor supports parameterized queries with positional placeholders and adjust accordingly.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
`
SELECT
mi."memberId",
m."displayName",
m."attributes",
m.id,
COUNT(*) AS "identityCount",
msa."activityCount"
FROM "memberIdentities" mi
JOIN "members" m ON mi."memberId" = m.id
LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND mi.verified = true
AND mi.type = 'email'
GROUP BY mi."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
HAVING COUNT(*) > ${threshold}
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
`
SELECT
mi."memberId",
m."displayName",
m."attributes",
m.id,
COUNT(*) AS "identityCount",
msa."activityCount"
FROM "memberIdentities" mi
JOIN "members" m ON mi."memberId" = m.id
LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
WHERE m."tenantId" = $2
AND mi.verified = true
AND mi.type = 'email'
GROUP BY mi."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
HAVING COUNT(*) > $3
ORDER BY msa."activityCount" DESC
LIMIT $4 OFFSET $5;
`,
[
segmentId,
tenantId,
threshold,
limit,
offset,
],

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 4f1717c and f140e72.

📒 Files selected for processing (1)
  • services/libs/data-access-layer/src/data-quality/members.ts (1 hunks)
🔇 Additional comments (2)
services/libs/data-access-layer/src/data-quality/members.ts (2)

146-182: ⚠️ Potential issue

Ensure consistent use of query parameters and prevent SQL injection

Variables are directly interpolated into SQL query strings, which can lead to SQL injection vulnerabilities. Use parameterized queries with placeholders to enhance security and maintainability.

Apply this diff to modify the query:

 return qx.select(
   `
       SELECT
           mi."memberId",
           m."displayName",
           m."attributes",
           m.id,
           COUNT(DISTINCT mi.value) AS "identityCount",
           msa."activityCount"
       FROM "memberIdentities" mi
                JOIN "members" m ON mi."memberId" = m.id
-               LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
+               LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
       WHERE m."tenantId" = $2
         AND mi.verified = true
         AND mi.type = 'email'
       GROUP BY mi."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
       HAVING COUNT(DISTINCT mi.value) > $3
       ORDER BY msa."activityCount" DESC
       LIMIT $4 OFFSET $5;
       `,
-    {
-      threshold,
-      tenantId,
-      limit,
-      offset,
-      segmentId,
-    },
+    [
+      segmentId,
+      tenantId,
+      threshold,
+      limit,
+      offset,
+    ],
   )

Additionally, ensure that the parameters array aligns with the placeholders in your SQL query.

To verify that all queries are adjusted appropriately, run the following script:


184-215: ⚠️ Potential issue

Include all selected non-aggregated columns in the GROUP BY clause

In fetchMembersWithIncompleteWorkExperience, the SQL query selects m."displayName" and m."attributes", but these columns are not included in the GROUP BY clause. SQL standards require that all selected non-aggregated columns be included in the GROUP BY clause to avoid ambiguous results.

Apply this diff to include the missing columns:

   GROUP BY 
-      m.id, msa."activityCount"
+      m.id, m."displayName", m."attributes", msa."activityCount"

Likely invalid or redundant comment.

Comment on lines 56 to 81
treshold = 15,
tenantId: string,
limit: number,
offset: number,
segmentId: string,
): Promise<IMember[]> {
return qx.select(
`
SELECT
mi."memberId",
m."displayName",
m."attributes",
m.id,
COUNT(*) AS "identityCount",
msa."activityCount"
FROM "memberIdentities" mi
JOIN "members" m ON mi."memberId" = m.id
LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
GROUP BY mi."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
HAVING COUNT(*) > ${treshold}
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
treshold,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Correct the parameter name 'treshold' to 'threshold' in function definitions and usage

The parameter treshold is misspelled in both the function definition and usage within fetchMembersWithTooManyIdentities and fetchMembersWithTooManyIdentitiesPerPlatform. This should be corrected to threshold for consistency and clarity.

Apply this diff to correct the spelling:

 // In fetchMembersWithTooManyIdentities
 export async function fetchMembersWithTooManyIdentities(
   qx: QueryExecutor,
-  treshold = 15,
+  threshold = 15,
   tenantId: string,
   limit: number,
   offset: number,
   segmentId: string,
 ): Promise<IMember[]> {
   return qx.select(
     `
       ... SQL query ...
-      HAVING COUNT(*) > ${treshold}
+      HAVING COUNT(*) > ${threshold}
       ... SQL query ...
     `,
     {
-      treshold,
+      threshold,
       tenantId,
       limit,
       offset,
       segmentId,
     },
   )
 }

 // In fetchMembersWithTooManyIdentitiesPerPlatform
 export async function fetchMembersWithTooManyIdentitiesPerPlatform(
   qx: QueryExecutor,
-  treshold = 1,
+  threshold = 1,
   tenantId: string,
   limit: number,
   offset: number,
   segmentId: string,
 ): Promise<IMember[]> {
   return qx.select(
     `
       ... SQL query ...
-      HAVING COUNT(*) > ${treshold}
+      HAVING COUNT(*) > ${threshold}
       ... SQL query ...
     `,
     {
-      treshold,
+      threshold,
       tenantId,
       limit,
       offset,
       segmentId,
     },
   )
 }

Also applies to: 103-137

Comment on lines 15 to 215
FROM "memberIdentities" mi
JOIN "members" m ON mi."memberId" = m.id
WHERE m."tenantId" = '${tenantId}' AND mi.type = 'username' AND mi.verified = true
GROUP BY mi."memberId", mi.platform
HAVING COUNT(*) > ${treshold}
)
SELECT
p."memberId",
m."displayName",
m."attributes",
m.id,
STRING_AGG(p.platform, ',') AS platforms,
msa."activityCount"
FROM platform_identities p
JOIN "members" m ON p."memberId" = m.id
LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
GROUP BY p."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
treshold,
tenantId,
limit,
offset,
segmentId,
},
)
}

export async function fetchMembersWithTooManyEmails(
qx: QueryExecutor,
threshold = 3,
tenantId: string,
limit: number,
offset: number,
segmentId: string,
): Promise<IMember[]> {
return qx.select(
`
SELECT
mi."memberId",
m."displayName",
m."attributes",
m.id,
COUNT(DISTINCT mi.value) AS "identityCount",
msa."activityCount"
FROM "memberIdentities" mi
JOIN "members" m ON mi."memberId" = m.id
LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND mi.verified = true
AND mi.type = 'email'
GROUP BY mi."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
HAVING COUNT(DISTINCT mi.value) > ${threshold}
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
threshold,
tenantId,
limit,
offset,
segmentId,
},
)
}

export async function fetchMembersWithIncompleteWorkExperience(
qx: QueryExecutor,
tenantId: string,
limit: number,
offset: number,
segmentId: string,
): Promise<IMember[]> {
return qx.select(
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo ON m.id = mo."memberId"
LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND (mo."title" IS NULL OR mo."title" = '' OR mo."dateStart" IS NULL)
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
tenantId,
limit,
offset,
segmentId,
},
)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider refactoring to reduce code duplication

Multiple functions in this file share similar patterns in constructing SQL queries and handling parameters. Consider abstracting common logic into helper functions or utilities to improve maintainability and reduce redundancy.

For example, you could create a utility function to build parameterized queries with placeholders, which can then be reused across different data retrieval functions.

* Fetches members with a number of verified identities per platform exceeding a specified threshold for a given tenant.
*
* @param {QueryExecutor} qx - The query executor to run the database queries.
* @param {number} [treshold=1] - The minimum number of verified identities per platform to filter members by. Defaults to 1.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Correct the misspelling of 'treshold' to 'threshold' in JSDoc comments

In the JSDoc comments for fetchMembersWithTooManyIdentitiesPerPlatform, the parameter treshold is misspelled. It should be threshold to ensure clarity and prevent misunderstandings.

Apply this diff to correct the spelling:

  * @param {QueryExecutor} qx - The query executor to run the database queries.
- * @param {number} [treshold=1] - The minimum number of verified identities per platform to filter members by. Defaults to 1.
+ * @param {number} [threshold=1] - The minimum number of verified identities per platform to filter members by. Defaults to 1.
  * @param {string} tenantId - The ID of the tenant to filter members.
  * @param {number} limit - The maximum number of records to return.
  * @param {number} offset - The number of records to skip.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
* @param {number} [treshold=1] - The minimum number of verified identities per platform to filter members by. Defaults to 1.
* @param {number} [threshold=1] - The minimum number of verified identities per platform to filter members by. Defaults to 1.

Comment on lines 45 to 47
*
* @param {QueryExecutor} qx - The query executor to perform database operations.
* @param {number} [treshold=15] - The threshold for the number of identities a member must exceed to be included in the results.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Correct the misspelling of 'treshold' to 'threshold' in JSDoc comments

In the JSDoc comments for fetchMembersWithTooManyIdentities, the parameter treshold is misspelled. It should be threshold to adhere to standard spelling and avoid confusion.

Apply this diff to correct the spelling:

 /**
  * Fetches members with a number of identities that exceed a specified threshold.
  *
  * @param {QueryExecutor} qx - The query executor to perform database operations.
- * @param {number} [treshold=15] - The threshold for the number of identities a member must exceed to be included in the results.
+ * @param {number} [threshold=15] - The threshold for the number of identities a member must exceed to be included in the results.
  * @param {string} tenantId - The ID of the tenant whose members are being queried.
  * @param {number} limit - The maximum number of members to return.
  * @param {number} offset - The number of members to skip before starting to collect the result set.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
*
* @param {QueryExecutor} qx - The query executor to perform database operations.
* @param {number} [treshold=15] - The threshold for the number of identities a member must exceed to be included in the results.
*
* @param {QueryExecutor} qx - The query executor to perform database operations.
* @param {number} [threshold=15] - The threshold for the number of identities a member must exceed to be included in the results.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 16

🧹 Outside diff range and nitpick comments (5)
frontend/src/modules/data-quality/components/data-quality-member.vue (1)

31-31: Add component documentation.

Add JSDoc documentation to describe the component's purpose, props, and usage.

/**
 * Data Quality Member Component
 * 
 * Displays and manages data quality issues for members, including merge suggestions
 * and various issue types like identity conflicts and work experience problems.
 * 
 * @component
 * @example
 * <lf-data-quality-member />
 * 
 * @example
 * <lf-data-quality-member project-group="group-id" />
 */
frontend/src/modules/data-quality/components/shared/data-quality-type-dropdown.vue (1)

6-7: Move inline styles to class.

For better maintainability and consistency, move the inline style to a CSS class.

-    <div class="truncate" style="max-width: 40ch;">
+    <div class="truncate max-w-40">

Add to your stylesheet:

.max-w-40 {
  max-width: 40ch;
}
frontend/src/modules/data-quality/components/shared/data-quality-project-dropdown.vue (1)

38-46: Consider organizing imports by category.

For better maintainability, consider grouping imports into categories:

  1. Vue core
  2. UI components
  3. Store/utils
-import LfButton from '@/ui-kit/button/Button.vue';
-import { computed, ref } from 'vue';
-import LfIcon from '@/ui-kit/icon/Icon.vue';
-import LfDropdown from '@/ui-kit/dropdown/Dropdown.vue';
-import LfDropdownItem from '@/ui-kit/dropdown/DropdownItem.vue';
-import { storeToRefs } from 'pinia';
-import { useLfSegmentsStore } from '@/modules/lf/segments/store';
-import { getSegmentName } from '@/utils/segments';
-import LfInput from '@/ui-kit/input/Input.vue';
+// Vue core
+import { computed, ref } from 'vue';
+import { storeToRefs } from 'pinia';
+
+// UI components
+import LfButton from '@/ui-kit/button/Button.vue';
+import LfIcon from '@/ui-kit/icon/Icon.vue';
+import LfDropdown from '@/ui-kit/dropdown/Dropdown.vue';
+import LfDropdownItem from '@/ui-kit/dropdown/DropdownItem.vue';
+import LfInput from '@/ui-kit/input/Input.vue';
+
+// Store & utils
+import { useLfSegmentsStore } from '@/modules/lf/segments/store';
+import { getSegmentName } from '@/utils/segments';
frontend/src/modules/data-quality/components/member/data-quality-member-issues.vue (1)

22-24: Enhance accessibility for external link button.

The external link button should indicate that it opens in a new tab for better accessibility.

-            <lf-button type="secondary" size="small">
-              <lf-icon name="external-link-line" />Review profile
+            <lf-button type="secondary" size="small" aria-label="Review profile (opens in new tab)">
+              <lf-icon name="external-link-line" />Review profile
+              <span class="sr-only">(opens in new tab)</span>
backend/src/database/repositories/dataQualityRepository.ts (1)

35-45: Enhance JSDoc documentation with threshold information

The JSDoc comment should include information about the identity threshold to make the behavior more transparent.

Update the documentation to include:

/**
 * Finds and returns members with too many identities.
 * Executes a query to fetch members who exceed 15 identities.
 *
 * @param {IRepositoryOptions} options - Repository options for querying the database.
 * @param {string} tenantId - Identifier of the tenant whose members are being queried.
 * @param {number} limit - The maximum number of members to retrieve.
 * @param {number} offset - The number of members to skip before starting to collect the result set.
 * @param {string} segmentId - Identifier of the segment to filter members.
 * @return {Promise<Array>} A promise that resolves to an array of members with more than 15 identities.
 */
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between f140e72 and c22a76d.

📒 Files selected for processing (12)
  • backend/src/database/repositories/dataQualityRepository.ts (1 hunks)
  • backend/src/services/dataQualityService.ts (1 hunks)
  • backend/src/types/data-quality/data-quality-filters.ts (1 hunks)
  • frontend/src/modules/data-quality/components/data-quality-member.vue (1 hunks)
  • frontend/src/modules/data-quality/components/data-quality-organization.vue (1 hunks)
  • frontend/src/modules/data-quality/components/member/data-quality-member-issues-item.vue (1 hunks)
  • frontend/src/modules/data-quality/components/member/data-quality-member-issues.vue (1 hunks)
  • frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions.vue (1 hunks)
  • frontend/src/modules/data-quality/components/shared/data-quality-project-dropdown.vue (1 hunks)
  • frontend/src/modules/data-quality/components/shared/data-quality-type-dropdown.vue (1 hunks)
  • frontend/src/modules/data-quality/config/types/too-many-identities-per-platform.ts (1 hunks)
  • services/libs/data-access-layer/src/data-quality/members.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • backend/src/services/dataQualityService.ts
  • backend/src/types/data-quality/data-quality-filters.ts
  • frontend/src/modules/data-quality/components/data-quality-organization.vue
  • frontend/src/modules/data-quality/components/member/data-quality-member-issues-item.vue
  • frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions.vue
  • frontend/src/modules/data-quality/config/types/too-many-identities-per-platform.ts
🧰 Additional context used
🪛 Biome
backend/src/database/repositories/dataQualityRepository.ts

[error] 12-120: Avoid classes that contain only static members.

Prefer using simple functions instead of classes with only static members.

(lint/complexity/noStaticOnlyClass)

🔇 Additional comments (15)
frontend/src/modules/data-quality/components/shared/data-quality-type-dropdown.vue (2)

32-40: LGTM!

The imports are well-organized and follow Vue best practices.


48-51: LGTM!

The computed property correctly implements two-way binding following Vue's v-model convention.

frontend/src/modules/data-quality/components/shared/data-quality-project-dropdown.vue (2)

66-70: LGTM!

The component definition follows Vue naming conventions and best practices.


15-15: Review the use of important modifiers in classes.

The use of ! modifiers (!border-0 !rounded-none) suggests potential CSS specificity issues. Consider reviewing the base styles of lf-input to avoid using important modifiers.

✅ Verification successful

The use of important modifiers is appropriate in this specific case

The important modifiers are used intentionally here to override the default input styles when the lf-input component is used within a dropdown context. This is a common pattern for search inputs within dropdowns where you want to remove borders and rounded corners. The usage is isolated to this specific component and isn't indicative of a broader specificity issue, as evidenced by:

  1. This is the only instance of these particular important modifiers in the frontend codebase
  2. The base LfInput component has a clean implementation without competing styles
  3. The modification is scoped to a specific UI pattern (search within dropdown)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for other important modifiers in the codebase
rg --type vue '![a-zA-Z0-9-]+' frontend/src/ui-kit/input/

Length of output: 87


Script:

#!/bin/bash
# Check for other important modifiers in the codebase, adjusting the file type
rg '![a-zA-Z0-9-]+' frontend/src/ui-kit/input/

# Also check the component itself to understand the context better
cat frontend/src/modules/data-quality/components/shared/data-quality-project-dropdown.vue

# Look for lf-input component definition
fd -e vue lf-input

Length of output: 2274


Script:

#!/bin/bash
# Let's find and examine the LfInput component definition
rg -g '*.vue' -l "name.*Input" frontend/src/ui-kit/
cat frontend/src/ui-kit/input/Input.vue

# Also check for similar usage patterns in other dropdown search implementations
rg -g '*.vue' '!border-0.*!rounded-none' frontend/src/

Length of output: 1518

services/libs/data-access-layer/src/data-quality/members.ts (11)

1-4: Imports are necessary and correctly referenced

The import statements for IMember and QueryExecutor are appropriate and correctly used.


5-14: JSDoc comments for fetchMembersWithoutWorkExperience are clear and informative

The documentation provides a clear explanation of the function's purpose, parameters, and return value, enhancing maintainability.


15-41: Function fetchMembersWithoutWorkExperience correctly retrieves members without work experience

The SQL query accurately selects members who have no associated work experience records in "memberOrganizations". The use of LEFT JOIN and the condition mo."memberId" IS NULL effectively identifies such members.


43-53: JSDoc comments for fetchMembersWithTooManyIdentities are comprehensive

The comments thoroughly describe the function's parameters and return value, aiding in understanding and future maintenance.


54-88: Function fetchMembersWithTooManyIdentities effectively identifies members exceeding identity threshold

The SQL query aggregates identities per member and correctly filters those exceeding the specified threshold. The logic is sound and aligns with the intended functionality.


90-100: JSDoc comments for fetchMembersWithTooManyIdentitiesPerPlatform are detailed and helpful

The documentation clearly explains the purpose and parameters of the function, which is beneficial for developers.


101-144: Function fetchMembersWithTooManyIdentitiesPerPlatform accurately aggregates identities per platform

The use of a Common Table Expression (CTE) with platform_identities efficiently calculates identity counts per platform. The subsequent query correctly selects members exceeding the threshold.


146-156: JSDoc comments for fetchMembersWithTooManyEmails are precise

The comments provide a clear description of the function's intent and its parameters, facilitating easier understanding.


157-193: Function fetchMembersWithTooManyEmails correctly fetches members with excessive email identities

The SQL query appropriately counts distinct verified email addresses and filters members exceeding the specified threshold, ensuring accurate results.


195-205: JSDoc comments for fetchMembersWithIncompleteWorkExperience are informative

The documentation adequately explains the function's purpose and parameters, aiding in code readability.


206-237: Function fetchMembersWithIncompleteWorkExperience effectively retrieves members with incomplete work data

The function correctly identifies members with missing titles or start dates in their work experience records. The SQL query is well-structured to fulfill the requirement.

Comment on lines +17 to +27
<div>
<lf-data-quality-member-merge-suggestions
v-if="tab === 'merge-suggestions'"
:project-group="projectGroup"
/>
<lf-data-quality-member-issues
v-else
:type="tab"
:project-group="projectGroup"
/>
</div>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add loading and error states for better UX.

The component should handle loading and error states when switching between views.

-    <div>
+    <div v-if="isLoading">
+      <lf-loading />
+    </div>
+    <div v-else-if="error">
+      <lf-error-state :error="error" @retry="loadData" />
+    </div>
+    <div v-else>
       <lf-data-quality-member-merge-suggestions
         v-if="tab === 'merge-suggestions'"
         :project-group="projectGroup"
       />
       <lf-data-quality-member-issues
         v-else
         :type="tab"
         :project-group="projectGroup"
       />
     </div>

Add to script setup:

const isLoading = ref(false);
const error = ref<Error | null>(null);

async function loadData() {
  try {
    isLoading.value = true;
    // Add data fetching logic here
  } catch (e) {
    error.value = e instanceof Error ? e : new Error('Unknown error');
  } finally {
    isLoading.value = false;
  }
}

Comment on lines +43 to +44
const tab = ref('merge-suggestions');
const projectGroup = ref(route.query.projectGroup as string);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Improve type safety and state management.

The current implementation lacks proper typing and reactive state management.

-const tab = ref('merge-suggestions');
-const projectGroup = ref(route.query.projectGroup as string);
+type TabValue = 'merge-suggestions' | DataIssueType;
+
+const tab = ref<TabValue>('merge-suggestions');
+const projectGroup = ref<string | undefined>(
+  typeof route.query.projectGroup === 'string' ? route.query.projectGroup : undefined
+);
+
+watch(tab, () => {
+  loadData();
+});
+
+watch(projectGroup, () => {
+  if (projectGroup.value) {
+    loadData();
+  }
+});
+
+onMounted(() => {
+  loadData();
+});

Committable suggestion was skipped due to low confidence.

Comment on lines 8 to 9
<span v-if="selectedType === 'merge-suggestions'">Merge suggestions</span>
<span v-else>{{ dataIssueTypes[selectedType]?.label || 'Select issue type' }}</span>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Refactor hardcoded 'merge-suggestions' type.

The special handling of 'merge-suggestions' creates code duplication and reduces maintainability. Consider adding it to the dataIssueTypes configuration.

-    <span v-if="selectedType === 'merge-suggestions'">Merge suggestions</span>
-    <span v-else>{{ dataIssueTypes[selectedType]?.label || 'Select issue type' }}</span>
+    <span>{{ dataIssueTypes[selectedType]?.label || 'Select issue type' }}</span>

-    <lf-dropdown-item
-      :selected="selectedType === 'merge-suggestions'"
-      @click="selectedType = 'merge-suggestions'"
-    >
-      Merge suggestions
-    </lf-dropdown-item>

Also applies to: 14-19

Comment on lines 2 to 29
<lf-dropdown width="19rem">
<template #trigger>
<lf-button type="secondary">
<lf-icon name="stack-line" />
<div class="truncate" style="max-width: 40ch;">
<span>Issue type: </span>
<span v-if="selectedType === 'merge-suggestions'">Merge suggestions</span>
<span v-else>{{ dataIssueTypes[selectedType]?.label || 'Select issue type' }}</span>
</div>
</lf-button>
</template>

<lf-dropdown-item
:selected="selectedType === 'merge-suggestions'"
@click="selectedType = 'merge-suggestions'"
>
Merge suggestions
</lf-dropdown-item>

<lf-dropdown-item
v-for="type in props.types"
:key="type"
:selected="selectedType === type"
@click="selectedType = type"
>
{{ dataIssueTypes[type]?.label }}
</lf-dropdown-item>
</lf-dropdown>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add accessibility attributes to the dropdown.

The dropdown lacks proper accessibility attributes which could make it difficult for screen reader users.

-  <lf-dropdown width="19rem">
+  <lf-dropdown
+    width="19rem"
+    aria-label="Select issue type"
+    role="listbox"
+  >
     <template #trigger>
-      <lf-button type="secondary">
+      <lf-button
+        type="secondary"
+        aria-haspopup="listbox"
+        :aria-expanded="isOpen"
+      >

Committable suggestion was skipped due to low confidence.

Comment on lines +55 to +59
<script lang="ts">
export default {
name: 'LfDataQualityTypeDropdown',
};
</script>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Simplify component definition.

With Composition API, the component name can be defined in the setup script, eliminating the need for a separate script block.

-<script lang="ts">
-export default {
-  name: 'LfDataQualityTypeDropdown',
-};
-</script>
+<script lang="ts" setup>
+defineOptions({
+  name: 'LfDataQualityTypeDropdown'
+});
+
 // ... rest of the setup script
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<script lang="ts">
export default {
name: 'LfDataQualityTypeDropdown',
};
</script>
<script lang="ts" setup>
defineOptions({
name: 'LfDataQualityTypeDropdown'
});
// ... rest of the setup script

Comment on lines 1 to 11
import {
fetchMembersWithIncompleteWorkExperience,
fetchMembersWithTooManyEmails,
fetchMembersWithTooManyIdentities,
fetchMembersWithTooManyIdentitiesPerPlatform,
fetchMembersWithoutWorkExperience,
} from '@crowd/data-access-layer/src/data-quality'

import SequelizeRepository from '@/database/repositories/sequelizeRepository'

import { IRepositoryOptions } from './IRepositoryOptions'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider adding explicit return types for imported functions

The imported data quality functions from @crowd/data-access-layer would benefit from explicit TypeScript return type declarations to improve type safety and developer experience.

Consider creating an interface for the return types:

interface MemberDataQualityIssue {
  id: string;
  // add other relevant fields
}

// Then update the imports to include return types
import {
  fetchMembersWithoutWorkExperience: (
    qx: any,
    tenantId: string,
    limit: number,
    offset: number,
    segmentId: string
  ) => Promise<MemberDataQualityIssue[]>,
  // similar type annotations for other functions
} from '@crowd/data-access-layer/src/data-quality'

Comment on lines +24 to +33
static async findMembersWithNoWorkExperience(
options: IRepositoryOptions,
tenantId: string,
limit: number,
offset: number,
segmentId: string,
) {
const qx = SequelizeRepository.getQueryExecutor(options)
return fetchMembersWithoutWorkExperience(qx, tenantId, limit, offset, segmentId)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add input validation for method parameters

While the implementation is correct, it would benefit from parameter validation to ensure data integrity and provide clear error messages.

Consider adding validation:

static async findMembersWithNoWorkExperience(
  options: IRepositoryOptions,
  tenantId: string,
  limit: number,
  offset: number,
  segmentId: string,
) {
  if (!tenantId) throw new Error('tenantId is required')
  if (limit < 0) throw new Error('limit must be non-negative')
  if (offset < 0) throw new Error('offset must be non-negative')
  
  const qx = SequelizeRepository.getQueryExecutor(options)
  return fetchMembersWithoutWorkExperience(qx, tenantId, limit, offset, segmentId)
}

Comment on lines +68 to +77
static async findMembersWithTooManyIdentitiesPerPlatform(
options: IRepositoryOptions,
tenantId: string,
limit: number,
offset: number,
segmentId: string,
) {
const qx = SequelizeRepository.getQueryExecutor(options)
return fetchMembersWithTooManyIdentitiesPerPlatform(qx, 1, tenantId, limit, offset, segmentId)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add error handling for database operations

The method should handle potential database errors gracefully and provide meaningful error messages.

Consider adding try-catch block:

static async findMembersWithTooManyIdentitiesPerPlatform(
  options: IRepositoryOptions,
  tenantId: string,
  limit: number,
  offset: number,
  segmentId: string,
) {
  try {
    const qx = SequelizeRepository.getQueryExecutor(options)
    return await fetchMembersWithTooManyIdentitiesPerPlatform(qx, 1, tenantId, limit, offset, segmentId)
  } catch (error) {
    throw new Error(`Failed to fetch members with too many identities per platform: ${error.message}`)
  }
}

Comment on lines 89 to 98
static async findMembersWithTooManyEmails(
options: IRepositoryOptions,
tenantId: string,
limit: number,
offset: number,
segmentId: string,
) {
const qx = SequelizeRepository.getQueryExecutor(options)
return fetchMembersWithTooManyEmails(qx, 3, tenantId, limit, offset, segmentId)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add pagination interface for better type safety

Consider creating a dedicated interface for pagination parameters to ensure consistency across repository methods.

Create and use a pagination interface:

interface PaginationParams {
  limit: number;
  offset: number;
}

static async findMembersWithTooManyEmails(
  options: IRepositoryOptions,
  tenantId: string,
  pagination: PaginationParams,
  segmentId: string,
) {
  const { limit, offset } = pagination
  const qx = SequelizeRepository.getQueryExecutor(options)
  return fetchMembersWithTooManyEmails(qx, 3, tenantId, limit, offset, segmentId)
}

Comment on lines 110 to 119
static async findMembersWithIncompleteWorkExperience(
options: IRepositoryOptions,
tenantId: string,
limit: number,
offset: number,
segmentId: string,
) {
const qx = SequelizeRepository.getQueryExecutor(options)
return fetchMembersWithIncompleteWorkExperience(qx, tenantId, limit, offset, segmentId)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add type safety for query executor

The query executor type should be explicitly defined to ensure type safety when passing it to the DAL function.

Add type definition:

import { QueryTypes } from 'sequelize'

interface QueryExecutor {
  query<T = any>(
    sql: string,
    options?: { type: QueryTypes; replacements?: any }
  ): Promise<T>;
}

static async findMembersWithIncompleteWorkExperience(
  options: IRepositoryOptions,
  tenantId: string,
  limit: number,
  offset: number,
  segmentId: string,
) {
  const qx = SequelizeRepository.getQueryExecutor(options) as QueryExecutor
  return fetchMembersWithIncompleteWorkExperience(qx, tenantId, limit, offset, segmentId)
}

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (3)
frontend/src/modules/data-quality/components/shared/data-quality-project-dropdown.vue (2)

15-15: Simplify class modifiers for better maintainability.

The class syntax !border-0 !rounded-none uses multiple important modifiers. Consider using a custom class or component prop to handle these styles.

-class=" !border-0 !rounded-none shadow-none"
+class="no-border shadow-none"

58-58: Consider adding runtime validation for emitted value.

While the type is correctly defined, consider adding runtime validation to ensure the emitted value matches a valid project group ID.

-const emit = defineEmits<{(e: 'update:modelValue', value: string): void}>();
+const emit = defineEmits({
+  'update:modelValue': (value: string) => {
+    return projectGroups.value.list.some(group => group.id === value);
+  }
+});
frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions.vue (1)

57-59: Add prop validation.

Consider adding validation for the projectGroup prop to ensure it's a non-empty string.

 const props = defineProps<{
-  projectGroup: string,
+  projectGroup: string & { _brand?: 'NonEmptyString' },
 }>();
+
+const validateProjectGroup = (value: string) => {
+  if (!value?.trim()) {
+    throw new Error('projectGroup cannot be empty');
+  }
+};
+
+validateProjectGroup(props.projectGroup);
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between c22a76d and b1a3de0.

📒 Files selected for processing (6)
  • frontend/src/modules/data-quality/components/data-quality-organization.vue (1 hunks)
  • frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions.vue (1 hunks)
  • frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions.vue (1 hunks)
  • frontend/src/modules/data-quality/components/shared/data-quality-project-dropdown.vue (1 hunks)
  • frontend/src/modules/data-quality/pages/data-quality.page.vue (1 hunks)
  • frontend/src/ui-kit/tabs/Tab.vue (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • frontend/src/modules/data-quality/components/data-quality-organization.vue
  • frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions.vue
  • frontend/src/modules/data-quality/pages/data-quality.page.vue
🔇 Additional comments (5)
frontend/src/ui-kit/tabs/Tab.vue (2)

14-14: LGTM! Well-typed prop addition.

The new optional preserveQuery prop is properly typed and follows Vue naming conventions.


Line range hint 14-26: Verify impact on existing tab implementations.

Since this is a UI kit component, the routing behavior changes could affect existing implementations.

Let's verify existing usages:

✅ Verification successful

No impact on existing tab implementations

Based on the search results, the changes to the Tab component's routing behavior are safe because:

  1. The new preserveQuery prop is optional and defaults to false, maintaining backward compatibility
  2. Existing tab implementations that use route hash (found in lf-admin-panel-page.vue, activity-list-page.vue) work independently of this change
  3. The parent Tabs.vue component handles hash-based routing consistently with the existing pattern
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Find all Vue files using LfTab component
echo "Searching for LfTab usage patterns..."
rg -l "LfTab" --type vue

# Find specific tab implementations with route handling
echo "\nSearching for route handling in tab components..."
rg -A 5 "route\.(hash|query)" --type vue

Length of output: 321


Script:

#!/bin/bash
# Find all Vue files using Tab component (without file type filter)
echo "Searching for Tab usage patterns..."
rg -l "Tab"

# Find specific tab implementations with route handling
echo "\nSearching for route handling in tabs..."
rg -A 5 "route\.(hash|query)"

# Search for imports of the Tab component
echo "\nSearching for Tab component imports..."
rg "import.*Tab.*from"

Length of output: 30147

frontend/src/modules/data-quality/components/shared/data-quality-project-dropdown.vue (1)

39-50: LGTM!

Imports are well-organized and follow consistent naming conventions.

frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions.vue (2)

43-56: LGTM! Clean imports and component setup.

The imports are well-organized and all dependencies are properly declared.


107-111: LGTM! Clear component registration.

The component is properly registered with a descriptive name.

Comment on lines +23 to +26
router?.push(props.preserveQuery ? {
...route,
hash: `#${props.name}`,
} : {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider adding type safety and route property filtering.

While the implementation works, it could benefit from additional safeguards:

Consider applying these improvements:

 const selectTab = () => {
-  router?.push(props.preserveQuery ? {
-    ...route,
-    hash: `#${props.name}`,
-  } : {
-    hash: `#${props.name}`,
-  });
+  const baseRoute = props.preserveQuery
+    ? {
+        path: route.path,
+        query: route.query,
+        params: route.params,
+      }
+    : {};
+  
+  router?.push({
+    ...baseRoute,
+    hash: `#${props.name}`,
+  });
 };

This refactor:

  1. Explicitly specifies which route properties to preserve
  2. Prevents potential issues with non-serializable route properties
  3. Improves type safety
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
router?.push(props.preserveQuery ? {
...route,
hash: `#${props.name}`,
} : {
const selectTab = () => {
const baseRoute = props.preserveQuery
? {
path: route.path,
query: route.query,
params: route.params,
}
: {};
router?.push({
...baseRoute,
hash: `#${props.name}`,
});
};

Comment on lines +84 to +88
<script lang="ts">
export default {
name: 'LfDataQualityProjectDropdown',
};
</script>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Modernize component definition.

In Vue 3, you can define the component name directly in the setup script using defineOptions.

-<script lang="ts">
-export default {
-  name: 'LfDataQualityProjectDropdown',
-};
-</script>
+<script lang="ts" setup>
+defineOptions({
+  name: 'LfDataQualityProjectDropdown',
+});
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<script lang="ts">
export default {
name: 'LfDataQualityProjectDropdown',
};
</script>
<script lang="ts" setup>
defineOptions({
name: 'LfDataQualityProjectDropdown',
});
</script>

Comment on lines 71 to 81
const changeOption = (group: string) => {
projectGroup.value = group;
search.value = '';
router.push({
...route,
query: {
...route.query,
projectGroup: group,
},
});
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add error handling for route navigation.

The route navigation could fail silently. Consider handling potential navigation errors and providing user feedback.

 const changeOption = async (group: string) => {
   projectGroup.value = group;
   search.value = '';
-  router.push({
-    ...route,
-    query: {
-      ...route.query,
-      projectGroup: group,
-    },
-  });
+  try {
+    await router.push({
+      ...route,
+      query: {
+        ...route.query,
+        projectGroup: group,
+      },
+    });
+  } catch (error) {
+    console.error('Navigation failed:', error);
+    // TODO: Add user feedback mechanism
+  }
 };

Committable suggestion was skipped due to low confidence.

Comment on lines 74 to 78
const projectGroup = projectGroups.value.list.find((g) => g.id === props.projectGroup);
const segments = [
...getSegmentsFromProjectGroup(projectGroup),
props.projectGroup,
];
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add null checks for projectGroup lookup.

The segments array construction might fail if the projectGroup lookup returns undefined.

-  const projectGroup = projectGroups.value.list.find((g) => g.id === props.projectGroup);
+  const foundProjectGroup = projectGroups.value.list.find((g) => g.id === props.projectGroup);
+  if (!foundProjectGroup) {
+    console.error(`Project group ${props.projectGroup} not found`);
+    loading.value = false;
+    return;
+  }
   const segments = [
-    ...getSegmentsFromProjectGroup(projectGroup),
+    ...getSegmentsFromProjectGroup(foundProjectGroup),
     props.projectGroup,
   ];

Committable suggestion was skipped due to low confidence.

:suggestion="suggestion"
>
<template #action>
<lf-button type="secondary" size="small" @click="isModalOpen = true; detailsOffset = si">
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Refactor the button click handler.

Multiple state updates are combined in the template. Consider extracting this into a method for better maintainability and reusability.

+const handleViewDetails = (index: number) => {
+  isModalOpen.value = true;
+  detailsOffset.value = index;
+};

-<lf-button type="secondary" size="small" @click="isModalOpen = true; detailsOffset = si">
+<lf-button type="secondary" size="small" @click="handleViewDetails(si)">

Committable suggestion was skipped due to low confidence.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between b1a3de0 and 800d3aa.

📒 Files selected for processing (1)
  • services/libs/data-access-layer/src/data-quality/members.ts (1 hunks)
🔇 Additional comments (5)
services/libs/data-access-layer/src/data-quality/members.ts (5)

15-41: Approved: fetchMembersWithoutWorkExperience is well-implemented

The function fetchMembersWithoutWorkExperience correctly retrieves members without any associated work experience. The SQL query is accurately constructed, and the parameters are appropriately used.


54-88: Approved: fetchMembersWithTooManyIdentities functions as intended

The fetchMembersWithTooManyIdentities function effectively identifies members exceeding the specified identity threshold. The SQL query is well-structured, and the grouping and ordering are correctly applied.


101-144: Approved: fetchMembersWithTooManyIdentitiesPerPlatform is accurately implemented

The function correctly fetches members with too many verified identities per platform. The use of a Common Table Expression (CTE) enhances the query's readability and efficiency.


157-193: Approved: fetchMembersWithTooManyEmails retrieves members correctly

The function effectively retrieves members who have more verified email addresses than the specified threshold. The SQL query appropriately uses COUNT(DISTINCT mi.value) to ensure unique email counts.


206-237: Approved: fetchMembersWithIncompleteWorkExperience is properly implemented

The function successfully identifies members with incomplete work experience records. The SQL query correctly filters based on missing title or dateStart fields, ensuring only relevant members are fetched.

Comment on lines 15 to 237
tenantId,
limit,
offset,
segmentId,
},
)
}

/**
* Fetches members who have more than a specified number of verified email addresses.
*
* @param {QueryExecutor} qx - The query executor to run database queries.
* @param {number} [threshold=3] - The threshold number of email addresses a member must exceed to be included.
* @param {string} tenantId - The ID of the tenant to which the members belong.
* @param {number} limit - The maximum number of members to retrieve.
* @param {number} offset - The number of members to skip before starting to collect the result set.
* @param {string} segmentId - The ID of the segment to which the members belong.
* @return {Promise<IMember[]>} - A promise that resolves to an array of members who have more than the specified number of verified email addresses.
*/
export async function fetchMembersWithTooManyEmails(
qx: QueryExecutor,
threshold = 3,
tenantId: string,
limit: number,
offset: number,
segmentId: string,
): Promise<IMember[]> {
return qx.select(
`
SELECT
mi."memberId",
m."displayName",
m."attributes",
m.id,
COUNT(DISTINCT mi.value) AS "identityCount",
msa."activityCount"
FROM "memberIdentities" mi
JOIN "members" m ON mi."memberId" = m.id
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND mi.verified = true
AND mi.type = 'email'
GROUP BY mi."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
HAVING COUNT(DISTINCT mi.value) > ${threshold}
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
threshold,
tenantId,
limit,
offset,
segmentId,
},
)
}

/**
* Fetch members with incomplete work experience within a specified tenant and segment.
*
* @param {QueryExecutor} qx - The query executor instance used to perform database operations.
* @param {string} tenantId - The ID of the tenant to filter members by.
* @param {number} limit - The maximum number of members to retrieve.
* @param {number} offset - The starting point in the list of members to retrieve.
* @param {string} segmentId - The ID of the segment to filter members by.
*
* @return {Promise<IMember[]>} A promise that resolves to an array of members with incomplete work experience.
*/
export async function fetchMembersWithIncompleteWorkExperience(
qx: QueryExecutor,
tenantId: string,
limit: number,
offset: number,
segmentId: string,
): Promise<IMember[]> {
return qx.select(
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo ON m.id = mo."memberId"
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND (mo."title" IS NULL OR mo."title" = '' OR mo."dateStart" IS NULL)
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
tenantId,
limit,
offset,
segmentId,
},
)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add unit tests for the new data quality functions

To ensure the robustness and correctness of these new functions, consider adding unit tests that cover various scenarios and edge cases. This will help prevent future regressions and maintain code quality.

Would you like assistance in creating unit tests for these functions?

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Outside diff range and nitpick comments (5)
frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions-item.vue (1)

25-28: Enhance badge accessibility

The merge suggestion badge should have proper ARIA attributes for better screen reader support.

-      <lf-badge type="primary" size="small">
+      <lf-badge 
+        type="primary" 
+        size="small"
+        role="status"
+        aria-label="Merge suggestion confidence level"
+      >
frontend/src/modules/data-quality/components/shared/data-quality-type-dropdown.vue (2)

6-6: Move inline styles to component's style block.

Inline styles reduce maintainability and reusability. Consider moving the max-width style to the component's style block.

-<div class="truncate" style="max-width: 40ch;">
+<div class="truncate max-w-40">

+<style>
+.max-w-40 {
+  max-width: 40ch;
+}
+</style>

46-46: Improve emit type definition.

Consider using a union type for the value parameter to match the possible values.

-const emit = defineEmits<{(e: 'update:modelValue', value: string): void}>();
+const emit = defineEmits<{(e: 'update:modelValue', value: DataIssueType | 'merge-suggestions'): void}>();
frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions.vue (2)

24-34: Enhance empty state accessibility.

The empty state uses icons and text to communicate the absence of merge suggestions. Consider improving accessibility by:

  1. Adding aria-label to the icon div
  2. Using semantic HTML for the message
-      <div
+      <div
+        aria-label="No merge suggestions icon"
         class="ri-shuffle-line text-gray-200 text-10xl h-40 flex items-center mb-8"
       />
-      <h5 class="text-center text-lg font-semibold mb-4">
+      <h2 class="text-center text-lg font-semibold mb-4">
         No merge suggestions
-      </h5>
+      </h2>

57-59: Add prop validation.

Consider adding validation for the projectGroup prop to ensure it's not empty.

 const props = defineProps<{
-  projectGroup: string,
+  projectGroup: string,
 }>();
+
+if (props.projectGroup.trim() === '') {
+  console.warn('projectGroup prop should not be empty');
+}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 47f33c5 and dfaf4b1.

📒 Files selected for processing (8)
  • frontend/src/modules/data-quality/components/member/data-quality-member-issues.vue (1 hunks)
  • frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions-item.vue (1 hunks)
  • frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions.vue (1 hunks)
  • frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions-item.vue (1 hunks)
  • frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions.vue (1 hunks)
  • frontend/src/modules/data-quality/components/shared/data-quality-project-dropdown.vue (1 hunks)
  • frontend/src/modules/data-quality/components/shared/data-quality-type-dropdown.vue (1 hunks)
  • services/libs/data-access-layer/src/data-quality/members.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • frontend/src/modules/data-quality/components/member/data-quality-member-issues.vue
  • frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions-item.vue
  • frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions.vue
  • frontend/src/modules/data-quality/components/shared/data-quality-project-dropdown.vue
🔇 Additional comments (6)
frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions-item.vue (1)

1-48: Verify component integration in the data quality feature

Let's ensure this component is properly integrated within the data quality module.

✅ Verification successful

Component is properly integrated within the data quality feature

The verification shows that the component is correctly integrated:

  • It's used by its parent component data-quality-organization-merge-suggestions.vue which properly passes the required suggestion prop
  • The parent handles loading states, empty states, and pagination
  • The component is part of a well-structured data quality module with related components for both organization and member merge suggestions
  • The suggestion prop matches the backend types, particularly IOrganizationMergeSuggestion which contains the expected similarity and organizations fields used in the template
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check component integration and usage patterns

# Test 1: Find parent components using this component
echo "Finding parent components using LfDataQualityOrganizationMergeSuggestionsItem:"
rg -l "LfDataQualityOrganizationMergeSuggestionsItem" --type vue

# Test 2: Check for consistent prop usage
echo "\nChecking prop usage patterns:"
rg "v-bind:suggestion|:suggestion" --type vue -A 2

# Test 3: Verify route configuration
echo "\nVerifying route configuration for data quality feature:"
rg "data-quality" --type ts --glob "**/router/**"

Length of output: 588


Script:

#!/bin/bash
# Let's try different search approaches without relying on file type detection

# Test 1: Find parent components using this component
echo "Finding parent components using LfDataQualityOrganizationMergeSuggestionsItem:"
fd . -e vue -x rg -l "LfDataQualityOrganizationMergeSuggestionsItem" {}

# Test 2: Check for consistent prop usage
echo "\nChecking prop usage patterns:"
fd . -e vue -x rg "v-bind:suggestion|:suggestion" -A 2 {}

# Test 3: Verify route configuration
echo "\nVerifying route configuration for data quality feature:"
fd . -e ts -p "**/router/**" -x rg "data-quality" {}

# Test 4: Check for any data quality related components
echo "\nChecking for data quality related components:"
fd . -e vue -x rg -l "data-quality" {}

Length of output: 1889


Script:

#!/bin/bash
# Let's check the parent component's implementation and routing setup

# Check the parent component's implementation
echo "Checking parent component implementation:"
cat "./frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions.vue"

# Check router setup in a different way
echo "\nChecking router configuration:"
fd router -e ts -x cat {}

# Check how the suggestion prop is structured
echo "\nChecking suggestion prop type definitions:"
rg -A 5 "interface.*Suggestion" 

Length of output: 13696

frontend/src/modules/data-quality/components/shared/data-quality-type-dropdown.vue (3)

32-39: LGTM!

The imports are well-organized and all imported components are properly utilized.


48-51: LGTM!

The computed property correctly implements the v-model pattern for two-way binding.


1-59: Verify component registration and usage.

Let's verify that this component is properly registered and used in parent components.

✅ Verification successful

Component is properly integrated and used

The component is correctly imported and used in two parent components:

  • data-quality-member.vue
  • data-quality-organization.vue

Both parent components follow the correct import path from the shared components directory, indicating proper component organization and usage within the data quality module.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check component registration and usage

# Test 1: Find parent components using this dropdown
echo "Looking for parent components using LfDataQualityTypeDropdown..."
rg -l "LfDataQualityTypeDropdown" --type vue

# Test 2: Check if component is registered in any index files
echo "Checking component registration..."
rg -l "LfDataQualityTypeDropdown" --glob "*/index.ts"

Length of output: 493


Script:

#!/bin/bash
# Description: Check component registration and usage with corrected search

# Test 1: Find parent components using this dropdown (using file extension)
echo "Looking for parent components using LfDataQualityTypeDropdown..."
rg -l "LfDataQualityTypeDropdown" --glob "*.vue"

# Test 2: Check if component is registered in any index files
echo "Checking component registration..."
rg -l "LfDataQualityTypeDropdown" frontend/src/modules/data-quality

# Test 3: Check actual usage pattern
echo "Checking usage pattern..."
rg "LfDataQualityTypeDropdown" --glob "*.vue" -C 2

Length of output: 3029

frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions.vue (1)

43-56: LGTM! Well-structured imports and setup.

The component setup follows Vue 3 best practices with TypeScript integration.

services/libs/data-access-layer/src/data-quality/members.ts (1)

64-78: Use parameterized queries to prevent SQL injection vulnerabilities

Variables are directly embedded into the SQL query strings, which presents a security risk. It's essential to use parameterized queries with placeholders and pass the parameters separately to prevent SQL injection attacks.

This issue has been previously raised in past review comments. Please refer to the earlier feedback and ensure that parameterized queries are implemented throughout the code.

Also applies to: 111-143, 175-193, 224-237

@@ -0,0 +1,48 @@
<template>
<article class="border-b border-gray-100 py-5 flex items-center justify-between">
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Use design system tokens for consistent styling

Replace hardcoded color classes with design system tokens for better maintainability:

-  <article class="border-b border-gray-100 py-5 flex items-center justify-between">
+  <article class="border-b border-primary-100 py-5 flex items-center justify-between">
...
-          <lf-icon-old name="subtract-line" :size="24" class="text-gray-400" />
+          <lf-icon-old name="subtract-line" :size="24" class="text-neutral-400" />
...
-              <div class="w-full h-full bg-gray-50 flex items-center justify-center">
-                <lf-icon-old name="community-line" :size="16" class="text-gray-400" />
+              <div class="w-full h-full bg-neutral-50 flex items-center justify-center">
+                <lf-icon-old name="community-line" :size="16" class="text-neutral-400" />

Also applies to: 6-6, 15-17

Comment on lines +44 to +48
<script lang="ts">
export default {
name: 'LfDataQualityOrganizationMergeSuggestionsItem',
};
</script>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Simplify component definition using script setup

The separate script block for the component name is unnecessary when using script setup.

-<script lang="ts">
-export default {
-  name: 'LfDataQualityOrganizationMergeSuggestionsItem',
-};
-</script>
+<script lang="ts" setup>
+defineOptions({
+  name: 'LfDataQualityOrganizationMergeSuggestionsItem',
+});
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<script lang="ts">
export default {
name: 'LfDataQualityOrganizationMergeSuggestionsItem',
};
</script>
<script lang="ts" setup>
defineOptions({
name: 'LfDataQualityOrganizationMergeSuggestionsItem',
});
</script>

Comment on lines +39 to +41
const props = defineProps<{
suggestion: any,
}>();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add prop validation for similarity value

The similarity value is used for percentage calculation and should be validated.

+import { z } from 'zod';
+
+const suggestionSchema = z.object({
+  organizations: z.array(z.object({
+    id: z.string(),
+    logo: z.string().optional(),
+    displayName: z.string()
+  })),
+  similarity: z.number().min(0).max(1)
+});
+
 const props = defineProps<{
-  suggestion: any,
+  suggestion: z.infer<typeof suggestionSchema>
 }>();
+
+// Validate props at runtime
+watch(() => props.suggestion, (newValue) => {
+  const result = suggestionSchema.safeParse(newValue);
+  if (!result.success) {
+    console.error('Invalid suggestion prop:', result.error);
+  }
+}, { immediate: true });

Committable suggestion was skipped due to low confidence.

Comment on lines +107 to +111
<script lang="ts">
export default {
name: 'LfDataQualityOrganizationMergeSuggestions',
};
</script>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Modernize component registration.

In Vue 3 with script setup, the component name can be defined using a single script tag.

-<script lang="ts">
-export default {
-  name: 'LfDataQualityOrganizationMergeSuggestions',
-};
-</script>
+<script lang="ts" setup>
+defineOptions({
+  name: 'LfDataQualityOrganizationMergeSuggestions',
+});
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<script lang="ts">
export default {
name: 'LfDataQualityOrganizationMergeSuggestions',
};
</script>
<script lang="ts" setup>
defineOptions({
name: 'LfDataQualityOrganizationMergeSuggestions',
});
</script>

FROM "memberIdentities" mi
JOIN "members" m ON mi."memberId" = m.id
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ensure consistency in filtering out deleted members

In the queries of fetchMembersWithTooManyIdentities (line 74), fetchMembersWithTooManyEmails (line 186), fetchMembersWithTooManyIdentitiesPerPlatform (line 141), and fetchMembersWithIncompleteWorkExperience (line 233), the condition AND m."deletedAt" IS NULL is missing from the WHERE clause. To prevent including deleted members in the results, consider adding this condition, as you have done in fetchMembersWithoutWorkExperience.

Apply this diff to add the condition:

 WHERE m."tenantId" = '${tenantId}'
+  AND m."deletedAt" IS NULL

Also applies to: 186-186, 141-141, 233-233

SELECT m.id, m."displayName", m.attributes, msa."activityCount"
FROM members m
LEFT JOIN "memberOrganizations" mo ON m.id = mo."memberId"
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider using LEFT JOIN for 'memberSegmentsAgg' to include members with no activities

Using INNER JOIN with "memberSegmentsAgg" msa excludes members who have no activities in the specified segment. If the goal is to include all members regardless of their activity count, consider changing the INNER JOIN to a LEFT JOIN.

Apply this diff:

- INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
+ LEFT JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'

This modification will include members without activities, with msa."activityCount" being NULL.

Also applies to: 73-73, 185-185, 232-232

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 22

🧹 Outside diff range and nitpick comments (17)
frontend/src/modules/data-quality/config/types/no-work-experience.ts (1)

1-10: Overall structure looks good with room for improvement.

The configuration follows the correct type structure and uses appropriate badge styling for a critical data issue. Consider adding additional properties in the future such as:

  • Severity level for prioritization
  • Suggested resolution steps
  • Links to relevant documentation
frontend/src/modules/data-quality/config/types/too-many-identities.ts (2)

6-6: Consider simplifying the redundant badgeText function.

The badgeText function returns the same static string as the label property. This creates unnecessary duplication.

You could either:

  1. Reference the label:
-  badgeText: () => 'More than 15 identities',
+  badgeText: () => label,
  1. Or if the function wrapper isn't needed:
-  badgeText: () => 'More than 15 identities',
+  badgeText: 'More than 15 identities',

10-10: Consider using named export for better maintainability.

Using named exports makes it easier to import specific entities and helps with code maintenance.

-export default tooManyIdentities;
+export { tooManyIdentities };
frontend/src/modules/data-quality/config/types/work-experience-missing-period.ts (1)

3-8: Add JSDoc documentation for better maintainability.

While the configuration is well-structured, adding JSDoc documentation would improve maintainability by explaining:

  • The purpose of this configuration
  • Why the description returns null
  • The impact of the 'danger' badge type
+/**
+ * Configuration for identifying work experiences that lack period information.
+ * This is considered a critical data quality issue as it affects the accuracy
+ * of a member's professional timeline.
+ */
 const workExperienceMissingPeriod: DataIssueTypeConfig = {
frontend/src/modules/data-quality/config/types/conflicting-work-experience.ts (1)

7-7: Consider adding a meaningful description instead of returning null.

The description function currently returns null, but it would be more helpful to provide actual guidance or explanation about the overlapping work experiences issue and how to resolve it.

Example implementation:

-  description: () => null,
+  description: () => 'Multiple work experiences have overlapping date ranges. This might indicate incorrect data entry or need for date validation.',
frontend/src/modules/data-quality/config/types/work-experience-missing-info.ts (1)

3-8: Consider providing a meaningful description instead of null.

The description property returns null, which might miss an opportunity to provide helpful context to users about what information is missing and how to resolve it.

Consider adding a descriptive message, for example:

-  description: () => null,
+  description: () => 'Work experiences should include essential information such as company name, role, and dates. Please review and complete the missing details.',
frontend/src/modules/member/components/suggestions/member-merge-similarity.vue (1)

23-23: Consider documenting the similarity thresholds.

The icon updates look good and follow a consistent naming pattern. Consider adding a comment explaining the similarity thresholds and their corresponding visual representations for better maintainability.

Add documentation above the style computed property:

+/**
+ * Computes the visual style based on similarity thresholds:
+ * - ≥ 0.9: Strong similarity (green, strong signal)
+ * - ≥ 0.7: Good similarity (primary color, good signal)
+ * - < 0.7: Fair similarity (yellow, fair signal)
+ */
const style = computed<{

Also applies to: 28-28, 34-34, 39-39

frontend/src/modules/data-quality/components/shared/data-quality-project-dropdown.vue (1)

15-15: Avoid using force override in class names.

The ! prefix in the class name forces style overrides, which can lead to maintenance issues. Consider creating a proper CSS class or modifying the base component styles.

-class=" !border-0 !rounded-none shadow-none"
+class="input-borderless shadow-none"
frontend/src/modules/data-quality/components/shared/data-quality-type-dropdown.vue (3)

43-50: Remove commented code.

The commented-out dropdown item implementation should be removed as it's no longer needed and adds unnecessary noise to the codebase.


6-6: Move inline style to a CSS class.

Instead of using inline styles, define a CSS class for better maintainability and consistency.

-        <div class="truncate" style="max-width: 40ch;">
+        <div class="truncate max-w-[40ch]">

62-65: Add default value for config prop.

The config prop should have a default empty array to prevent undefined errors when the prop is not provided.

-const props = defineProps<{
+const props = withDefaults(defineProps<{
   modelValue: string;
   config: DataIssueTypeMenu[]
-}>();
+}>(), {
+  config: () => []
+});
frontend/src/modules/member/components/suggestions/member-merge-suggestion-dropdown.vue (1)

4-11: Add accessibility attributes to the button.

The icon-only button should include an aria-label to improve accessibility for screen readers.

 <lf-button
   size="small"
   type="secondary-ghost"
   :icon-only="true"
+  aria-label="Member merge suggestion options"
 >
frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions.vue (3)

28-30: Consider using responsive sizing for the empty state icon.

The hardcoded text-10xl class might not scale well across different screen sizes.

-        class="ri-shuffle-line text-gray-200 text-10xl h-40 flex items-center mb-8"
+        class="ri-shuffle-line text-gray-200 text-7xl md:text-8xl lg:text-10xl h-40 flex items-center mb-8"

45-45: Separate state update and method call for better readability.

Combining multiple operations in a single line reduces readability.

-    @reload="offset = 0; loadMergeSuggestions()"
+    @reload="reload"

14-16: Maintain consistency in icon usage.

The component mixes ri-* icons with LfIconOld component. Consider using LfIconOld consistently throughout the template.

-            <lf-button type="secondary" size="small" @click="isModalOpen = true; detailsOffset = si">
-              <lf-icon-old name="eye-line" />View suggestion
-            </lf-button>
+            <lf-button type="secondary" size="small" @click="showDetails(si)">
+              <lf-icon-old name="eye-line" class="mr-1" />View suggestion
+            </lf-button>

And add the method:

const showDetails = (index: number) => {
  detailsOffset.value = index;
  isModalOpen.value = true;
};
services/libs/data-access-layer/src/data-quality/members.ts (2)

210-219: Correct the JSDoc comment to match the function's purpose

The JSDoc comment for fetchMembersWithMissingInfoOnWorkExperience incorrectly mentions "missing period information" instead of "missing information" on work experience, which could cause confusion.


299-335: Add JSDoc comment for fetchMembersWithConflictingWorkExperiences

The function fetchMembersWithConflictingWorkExperiences lacks a JSDoc comment describing its purpose, parameters, and return value. Including a JSDoc comment would improve code readability and maintainability.

Would you like assistance in generating the JSDoc comment for this function?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between ff932dd and 14b3e83.

📒 Files selected for processing (25)
  • backend/src/database/repositories/dataQualityRepository.ts (1 hunks)
  • backend/src/services/dataQualityService.ts (1 hunks)
  • backend/src/types/data-quality/data-quality-filters.ts (1 hunks)
  • frontend/src/modules/data-quality/components/data-quality-member.vue (1 hunks)
  • frontend/src/modules/data-quality/components/data-quality-organization.vue (1 hunks)
  • frontend/src/modules/data-quality/components/member/data-quality-member-issues-item.vue (1 hunks)
  • frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions-item.vue (1 hunks)
  • frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions.vue (1 hunks)
  • frontend/src/modules/data-quality/components/shared/data-quality-project-dropdown.vue (1 hunks)
  • frontend/src/modules/data-quality/components/shared/data-quality-type-dropdown.vue (1 hunks)
  • frontend/src/modules/data-quality/config/data-issue-types.ts (1 hunks)
  • frontend/src/modules/data-quality/config/types/conflicting-work-experience.ts (1 hunks)
  • frontend/src/modules/data-quality/config/types/no-work-experience.ts (1 hunks)
  • frontend/src/modules/data-quality/config/types/too-many-emails.ts (1 hunks)
  • frontend/src/modules/data-quality/config/types/too-many-identities-per-platform.ts (1 hunks)
  • frontend/src/modules/data-quality/config/types/too-many-identities.ts (1 hunks)
  • frontend/src/modules/data-quality/config/types/work-experience-missing-info.ts (1 hunks)
  • frontend/src/modules/data-quality/config/types/work-experience-missing-period.ts (1 hunks)
  • frontend/src/modules/data-quality/types/DataIssueType.ts (1 hunks)
  • frontend/src/modules/layout/components/menu/menu.vue (2 hunks)
  • frontend/src/modules/member/components/suggestions/member-merge-similarity.vue (1 hunks)
  • frontend/src/modules/member/components/suggestions/member-merge-suggestion-dropdown.vue (1 hunks)
  • frontend/src/modules/member/pages/member-merge-suggestions-page.vue (2 hunks)
  • scripts/cli (1 hunks)
  • services/libs/data-access-layer/src/data-quality/members.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
  • backend/src/services/dataQualityService.ts
  • frontend/src/modules/data-quality/components/data-quality-member.vue
  • frontend/src/modules/data-quality/components/data-quality-organization.vue
  • frontend/src/modules/data-quality/components/member/data-quality-member-issues-item.vue
  • frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions-item.vue
  • frontend/src/modules/data-quality/config/types/too-many-emails.ts
  • frontend/src/modules/data-quality/config/types/too-many-identities-per-platform.ts
🧰 Additional context used
🪛 Biome
backend/src/database/repositories/dataQualityRepository.ts

[error] 13-164: Avoid classes that contain only static members.

Prefer using simple functions instead of classes with only static members.

(lint/complexity/noStaticOnlyClass)

🔇 Additional comments (17)
frontend/src/modules/data-quality/config/types/no-work-experience.ts (1)

3-8: ⚠️ Potential issue

Implement an actionable description instead of null.

Setting the description to null removes valuable guidance that could help users resolve data quality issues. Users need clear instructions on how to address missing work experience data.

Apply this diff to add an actionable description:

 const noWorkExperience: DataIssueTypeConfig = {
   label: 'Missing work experiences',
   badgeType: 'danger',
   badgeText: () => 'Missing work experiences',
-  description: () => null,
+  description: () => 'This profile has no work experience. If this is incorrect, add relevant work history or mark the profile as a student/new graduate if applicable.',
 };
frontend/src/modules/data-quality/config/types/conflicting-work-experience.ts (1)

1-1: LGTM! Well-structured configuration.

The configuration follows the TypeScript interface correctly and maintains a consistent pattern with other data quality issue types. The export is properly defined as the default export.

Also applies to: 3-8, 10-10

frontend/src/modules/data-quality/config/types/work-experience-missing-info.ts (1)

1-10: Verify integration with other data quality configurations.

The configuration structure looks good and follows a consistent pattern. Let's verify it matches other similar configurations in the codebase.

✅ Verification successful

Configuration structure matches other data quality configurations

The implementation follows the same consistent pattern as other data quality configurations in the codebase. All configurations:

  • Export a constant of type DataIssueTypeConfig
  • Include the required fields: label, badgeType, badgeText, and description
  • Use similar structure for badge text and null descriptions
  • Use appropriate badge types ('warning' or 'danger') based on severity
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for consistency across data quality configurations
# Expected: Similar structure in other config files

# Look for other DataIssueTypeConfig implementations
ast-grep --pattern 'const $_: DataIssueTypeConfig = {
  $$$
}'

# Check for any other files in the same directory
fd . "$(dirname "frontend/src/modules/data-quality/config/types/work-experience-missing-info.ts")"

Length of output: 5442

frontend/src/modules/data-quality/types/DataIssueType.ts (1)

1-9: LGTM! Consider internationalization for issue messages.

The enum implementation is clean and well-organized, with logical grouping of identity and work experience related issues. The naming is clear and follows consistent conventions.

The previous review comment about enhancing documentation with JSDoc and adding a type guard is still valid and would improve maintainability.

Consider using translation keys instead of hardcoded English strings to support internationalization:

 export enum DataIssueType {
-  TOO_MANY_IDENTITIES = 'too-many-identities',
+  TOO_MANY_IDENTITIES = 'data_quality.issues.too_many_identities',
-  TOO_MANY_IDENTITIES_PER_PLATFORM = 'too-many-identities-per-platform',
+  TOO_MANY_IDENTITIES_PER_PLATFORM = 'data_quality.issues.too_many_identities_per_platform',
   // ... similar changes for other enum values
 }

This would allow for easier localization of the messages displayed to users in different languages.

frontend/src/modules/member/components/suggestions/member-merge-similarity.vue (2)

12-12: LGTM! Props and imports are well-structured.

The optional percentageOnly prop is properly typed and provides good flexibility for different usage contexts.

Also applies to: 15-16


3-5: LGTM! Verify icon component migration.

The changes look good and improve component flexibility. The text styling update aligns with the design system.

Let's verify the icon component migration across the codebase:

✅ Verification successful

Icon component migration is part of a broader UI update

The codebase shows a clear pattern of migration from LfSvg to LfIcon components, with both old and new icon components coexisting:

  • LfSvg is defined in frontend/src/shared/svg/svg.vue
  • LfIcon is defined in frontend/src/ui-kit/icon/Icon.vue
  • Many components are already using the new LfIcon component
  • Some components still use LfIconOld (previously named LfSvg)

The change in member-merge-similarity.vue aligns with this migration pattern and is correct.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any remaining instances of LfSvg that might need migration

# Test: Search for LfSvg usage
rg "LfSvg" --type vue

Length of output: 49


Script:

#!/bin/bash
# Let's try different approaches to find Vue files and LfSvg usage

# First, find all Vue files
echo "=== Vue files ==="
fd "\.vue$"

# Then search for LfSvg in all files (not just Vue)
echo -e "\n=== LfSvg occurrences ==="
rg "LfSvg"

# Also search for LfIcon to understand the migration pattern
echo -e "\n=== LfIcon occurrences ==="
rg "LfIcon"

Length of output: 57272

frontend/src/modules/data-quality/config/data-issue-types.ts (2)

1-11: LGTM! Well-organized imports.

The imports are logically grouped and all are being utilized within the file.


24-32: Verify completeness of data issue type mapping.

The mapping looks well-structured. Let's verify that all enum values are properly mapped.

✅ Verification successful

All DataIssueType enum values are properly mapped

Comparing the enum values from DataIssueType.ts with the mappings in data-issue-types.ts, all enum values are correctly mapped to their corresponding configurations in the same order:

  1. TOO_MANY_IDENTITIES
  2. TOO_MANY_IDENTITIES_PER_PLATFORM
  3. TOO_MANY_EMAILS
  4. NO_WORK_EXPERIENCE
  5. WORK_EXPERIENCE_MISSING_INFO
  6. WORK_EXPERIENCE_MISSING_PERIOD
  7. CONFLICTING_WORK_EXPERIENCE
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that all DataIssueType enum values are mapped in dataIssueTypes

# Get all enum values
echo "Checking for unmapped enum values..."
ast-grep --pattern 'export enum DataIssueType {
  $$$
}'

# Get all mapped values
echo -e "\nChecking current mappings..."
rg '\[DataIssueType\.' -A 1

Length of output: 2171

frontend/src/modules/data-quality/components/shared/data-quality-project-dropdown.vue (1)

37-44: 🛠️ Refactor suggestion

Add spinner component import for loading states.

To support the loading state implementation, import the spinner component.

 import LfButton from '@/ui-kit/button/Button.vue';
+import LfSpinner from '@/ui-kit/spinner/Spinner.vue';
 import { computed, ref } from 'vue';

Likely invalid or redundant comment.

frontend/src/modules/data-quality/components/shared/data-quality-type-dropdown.vue (2)

54-61: LGTM!

The imports are well-organized and all are being used in the component.


69-72: LGTM!

The computed property implementation correctly handles two-way binding for v-model.

frontend/src/modules/member/components/suggestions/member-merge-suggestion-dropdown.vue (1)

50-56: LGTM! Event tracking implementation is solid.

The event tracking includes relevant properties and is properly placed before operations.

Also applies to: 87-93

frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions.vue (1)

125-129: LGTM!

The component name follows the established naming convention.

frontend/src/modules/member/pages/member-merge-suggestions-page.vue (2)

153-154: LGTM: Clean import statement.

The import statement follows the project's conventions and is properly aligned.


101-101: Verify the new dropdown component integration.

The dropdown component integration looks good, following Vue.js best practices by properly passing the suggestion prop and handling the reload event. However, let's verify the component's functionality.

✅ Verification successful

Let me generate another verification script to check the implementation of the new member merge suggestion dropdown component.


The dropdown component integration is properly implemented

The new lf-member-merge-suggestion-dropdown component is correctly implemented with:

  • Proper props and emits definitions
  • Required merge and ignore functionality
  • Consistent UI with other dropdowns in the codebase
  • Proper event tracking and error handling

The refactoring successfully moved the merge/ignore logic into a dedicated component while maintaining the same functionality.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the new dropdown component implementation and its dependencies

# Test 1: Check if the component exists and exports required props/emits
ast-grep --pattern 'export default {
  $$$
  props: {
    suggestion: {
      $$$
    }
  },
  $$$
  emits: [$$$"reload"$$$],
  $$$
}'

# Test 2: Check for any remaining old dropdown usage that might have been missed
rg "lf-dropdown" -g "*.vue"

Length of output: 23858


Script:

#!/bin/bash
# Check the implementation of the new member merge suggestion dropdown component
cat frontend/src/modules/member/components/suggestions/member-merge-suggestion-dropdown.vue

# Check if there are any references to the old merge and ignore methods in the codebase
rg "merge\(suggestion\)" -g "*.vue"
rg "ignore\(suggestion\)" -g "*.vue"

Length of output: 3481

scripts/cli (1)

865-878: LGTM! The new commands enhance frontend development workflow.

The addition of service-restart-fe-dev and clean-start-fe-dev commands provides developers with targeted options for managing frontend development services. The commands appropriately ignore backend services that aren't typically needed during frontend development.

backend/src/database/repositories/dataQualityRepository.ts (1)

14-164: Duplicate Comment: Refactor the class into a module with standalone functions

As previously noted, the DataQualityRepository class contains only static methods. Refactoring it into a module with exported functions would be more idiomatic in TypeScript and improve maintainability.

🧰 Tools
🪛 Biome

[error] 13-164: Avoid classes that contain only static members.

Prefer using simple functions instead of classes with only static members.

(lint/complexity/noStaticOnlyClass)

label: 'More than 15 identities',
badgeType: 'danger',
badgeText: () => 'More than 15 identities',
description: () => null,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add a meaningful description to help users understand and resolve the issue.

The current implementation returns null for the description, which misses an opportunity to:

  • Explain why having too many identities is problematic
  • Guide users on how to resolve the issue
  • Provide context about the threshold

Consider adding a descriptive message:

-  description: () => null,
+  description: () => 'Having more than 15 identities for a single member may indicate duplicate or incorrect data. Consider reviewing and merging duplicate identities or validating the data source.',
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
description: () => null,
description: () => 'Having more than 15 identities for a single member may indicate duplicate or incorrect data. Consider reviewing and merging duplicate identities or validating the data source.',

Comment on lines 3 to 8
const tooManyIdentities: DataIssueTypeConfig = {
label: 'More than 15 identities',
badgeType: 'danger',
badgeText: () => 'More than 15 identities',
description: () => null,
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Extract the identity threshold to a constant.

The threshold of 15 identities is hardcoded in multiple strings. This makes maintenance difficult if the threshold needs to change.

Consider refactoring to use a constant:

+const MAX_IDENTITIES_THRESHOLD = 15;
+
 const tooManyIdentities: DataIssueTypeConfig = {
-  label: 'More than 15 identities',
+  label: `More than ${MAX_IDENTITIES_THRESHOLD} identities`,
   badgeType: 'danger',
-  badgeText: () => 'More than 15 identities',
+  badgeText: () => `More than ${MAX_IDENTITIES_THRESHOLD} identities`,
   description: () => 'Having more than 15 identities for a single member may indicate duplicate or incorrect data. Consider reviewing and merging duplicate identities or validating the data source.',
 };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const tooManyIdentities: DataIssueTypeConfig = {
label: 'More than 15 identities',
badgeType: 'danger',
badgeText: () => 'More than 15 identities',
description: () => null,
};
const MAX_IDENTITIES_THRESHOLD = 15;
const tooManyIdentities: DataIssueTypeConfig = {
label: `More than ${MAX_IDENTITIES_THRESHOLD} identities`,
badgeType: 'danger',
badgeText: () => `More than ${MAX_IDENTITIES_THRESHOLD} identities`,
description: () => 'Having more than 15 identities for a single member may indicate duplicate or incorrect data. Consider reviewing and merging duplicate identities or validating the data source.',
};

Comment on lines +4 to +6
label: 'Work experience(s) without period',
badgeType: 'danger',
badgeText: () => 'Work experience(s) without period',
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider extracting the duplicate string to a constant.

The label text is duplicated in both label and badgeText. Consider extracting it to a constant to follow DRY principles.

+const MISSING_PERIOD_TEXT = 'Work experience(s) without period';
+
 const workExperienceMissingPeriod: DataIssueTypeConfig = {
-  label: 'Work experience(s) without period',
+  label: MISSING_PERIOD_TEXT,
   badgeType: 'danger',
-  badgeText: () => 'Work experience(s) without period',
+  badgeText: () => MISSING_PERIOD_TEXT,
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
label: 'Work experience(s) without period',
badgeType: 'danger',
badgeText: () => 'Work experience(s) without period',
const MISSING_PERIOD_TEXT = 'Work experience(s) without period';
label: MISSING_PERIOD_TEXT,
badgeType: 'danger',
badgeText: () => MISSING_PERIOD_TEXT,

Comment on lines +4 to +6
label: 'Work experiences with overlapping periods',
badgeType: 'warning',
badgeText: () => 'Work experiences with overlapping periods',
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider deduplicating the badge text.

The badgeText is identical to the label. Consider either:

  1. Using the label value directly
  2. Creating a more concise badge text
   label: 'Work experiences with overlapping periods',
   badgeType: 'warning',
-  badgeText: () => 'Work experiences with overlapping periods',
+  badgeText: () => 'Overlapping periods',
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
label: 'Work experiences with overlapping periods',
badgeType: 'warning',
badgeText: () => 'Work experiences with overlapping periods',
label: 'Work experiences with overlapping periods',
badgeType: 'warning',
badgeText: () => 'Overlapping periods',

import { DataIssueTypeConfig } from '@/modules/data-quality/config/data-issue-types';

const workExperienceMissingInfo: DataIssueTypeConfig = {
label: 'Work experience(s) with missing information',
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Avoid string duplication between label and badgeText.

The same string is duplicated in both label and badgeText. Consider extracting it to a constant to maintain DRY principles and ease future updates.

+const MISSING_INFO_TEXT = 'Work experience(s) with missing information';
+
 const workExperienceMissingInfo: DataIssueTypeConfig = {
-  label: 'Work experience(s) with missing information',
+  label: MISSING_INFO_TEXT,
   badgeType: 'warning',
-  badgeText: () => 'Work experience(s) with missing information',
+  badgeText: () => MISSING_INFO_TEXT,

Also applies to: 6-6

Comment on lines +48 to +50
<div class="px-3 py-2 opacity-0">
<div class="border-b border-gray-200" />
</div>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Remove unnecessary invisible separator.

The separator div is set to be completely invisible (opacity-0). If this element is meant to provide visual separation, it should be visible. If it's not needed, consider removing it entirely.

-    <div class="px-3 py-2 opacity-0">
-      <div class="border-b border-gray-200" />
-    </div>
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<div class="px-3 py-2 opacity-0">
<div class="border-b border-gray-200" />
</div>

Comment on lines 51 to 59
<lf-menu-link
v-if="hasPermission(LfPermission.dataQualityRead)"
class=" !opacity-0"
title="Data Quality Assistant"
icon="message-exclamation"
:is-collapsed="isCollapsed"
:disabled="!selectedProjectGroup"
:to="{ path: '/data-quality-assistant' }"
/>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix invisible Data Quality Assistant menu item.

The menu link for Data Quality Assistant is set to be completely invisible (opacity-0), which prevents users from accessing this new feature. This appears to be unintentional as it defeats the purpose of adding this navigation item.

     <lf-menu-link
       v-if="hasPermission(LfPermission.dataQualityRead)"
-      class=" !opacity-0"
       title="Data Quality Assistant"
       icon="message-exclamation"
       :is-collapsed="isCollapsed"
       :disabled="!selectedProjectGroup"
       :to="{ path: '/data-quality-assistant' }"
     />
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<lf-menu-link
v-if="hasPermission(LfPermission.dataQualityRead)"
class=" !opacity-0"
title="Data Quality Assistant"
icon="message-exclamation"
:is-collapsed="isCollapsed"
:disabled="!selectedProjectGroup"
:to="{ path: '/data-quality-assistant' }"
/>
<lf-menu-link
v-if="hasPermission(LfPermission.dataQualityRead)"
title="Data Quality Assistant"
icon="message-exclamation"
:is-collapsed="isCollapsed"
:disabled="!selectedProjectGroup"
:to="{ path: '/data-quality-assistant' }"
/>

@@ -862,6 +862,20 @@ while test $# -gt 0; do
service_start
exit
;;
service-restart-fe-dev)
IGNORED_SERVICES=("frontend" "python-worker" "job-generator" "discord-ws" "webhook-api" "profiles-worker" "organizations-enrichment-worker" "merge-suggestions-worker" "members-enrichment-worker" "exports-worker" "emails-worker" "entity-merging-worker" "automations-worker")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider consolidating the ignored services list.

Both service-restart-fe-dev and clean-start-fe-dev commands use the same list of ignored services. Consider defining this list as a constant at the top of the script to avoid duplication and make maintenance easier.

Add this at the top of the script with other constants:

FRONTEND_DEV_IGNORED_SERVICES=("frontend" "python-worker" "job-generator" "discord-ws" "webhook-api" "profiles-worker" "organizations-enrichment-worker" "merge-suggestions-worker" "members-enrichment-worker" "exports-worker" "emails-worker" "entity-merging-worker" "automations-worker")

Then use it in both commands:

-        IGNORED_SERVICES=("frontend" "python-worker" "job-generator" "discord-ws" "webhook-api" "profiles-worker" "organizations-enrichment-worker" "merge-suggestions-worker" "members-enrichment-worker" "exports-worker" "emails-worker" "entity-merging-worker" "automations-worker")
+        IGNORED_SERVICES=("${FRONTEND_DEV_IGNORED_SERVICES[@]}")

Also applies to: 873-873

exit
;;
clean-start-fe-dev)
INGORED_SERVICES=("frontend" "python-worker" "job-generator" "discord-ws" "webhook-api" "profiles-worker" "organizations-enrichment-worker" "merge-suggestions-worker" "members-enrichment-worker" "exports-worker" "emails-worker" "entity-merging-worker" "automations-worker")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix typo in variable name.

There's a typo in the variable name INGORED_SERVICES. It should be IGNORED_SERVICES to match the variable name used in other parts of the script.

-        INGORED_SERVICES=("frontend" "python-worker" "job-generator" "discord-ws" "webhook-api" "profiles-worker" "organizations-enrichment-worker" "merge-suggestions-worker" "members-enrichment-worker" "exports-worker" "emails-worker" "entity-merging-worker" "automations-worker")
+        IGNORED_SERVICES=("frontend" "python-worker" "job-generator" "discord-ws" "webhook-api" "profiles-worker" "organizations-enrichment-worker" "merge-suggestions-worker" "members-enrichment-worker" "exports-worker" "emails-worker" "entity-merging-worker" "automations-worker")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
INGORED_SERVICES=("frontend" "python-worker" "job-generator" "discord-ws" "webhook-api" "profiles-worker" "organizations-enrichment-worker" "merge-suggestions-worker" "members-enrichment-worker" "exports-worker" "emails-worker" "entity-merging-worker" "automations-worker")
IGNORED_SERVICES=("frontend" "python-worker" "job-generator" "discord-ws" "webhook-api" "profiles-worker" "organizations-enrichment-worker" "merge-suggestions-worker" "members-enrichment-worker" "exports-worker" "emails-worker" "entity-merging-worker" "automations-worker")

Comment on lines +255 to +264
* Fetches members with missing information on work experience.
*
* @param {QueryExecutor} qx - The query executor instance used to perform database operations.
* @param {string} tenantId - The ID of the tenant to filter members by.
* @param {number} limit - The maximum number of members to retrieve.
* @param {number} offset - The starting point in the list of members to retrieve.
* @param {string} segmentId - The ID of the segment to filter members by.
* @return {Promise<IMember[]>} A promise that resolves to an array of members
* with missing period information on their work experience.
*/
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Update the JSDoc comment to accurately describe the function

The JSDoc comment for fetchMembersWithMissingPeriodOnWorkExperience is identical to the previous function's comment and does not reflect its specific purpose of fetching members with missing period information in their work experience.

Apply this diff to correct the JSDoc comment:

 /**
- * Fetches members with missing information on work experience.
+ * Fetches members with missing period information on work experience.
  *
  * @param {QueryExecutor} qx - The query executor instance used to perform database operations.
  * @param {string} tenantId - The ID of the tenant to filter members by.
  * @param {number} limit - The maximum number of members to retrieve.
  * @param {number} offset - The starting point in the list of members to retrieve.
  * @param {string} segmentId - The ID of the segment to filter members by.
- * @return {Promise<IMember[]>} A promise that resolves to an array of members
- * with missing period information on their work experience.
+ * @return {Promise<IMember[]>} A promise that resolves to an array of members
+ * with missing period information on their work experience.
  */
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
* Fetches members with missing information on work experience.
*
* @param {QueryExecutor} qx - The query executor instance used to perform database operations.
* @param {string} tenantId - The ID of the tenant to filter members by.
* @param {number} limit - The maximum number of members to retrieve.
* @param {number} offset - The starting point in the list of members to retrieve.
* @param {string} segmentId - The ID of the segment to filter members by.
* @return {Promise<IMember[]>} A promise that resolves to an array of members
* with missing period information on their work experience.
*/
/**
* Fetches members with missing period information on work experience.
*
* @param {QueryExecutor} qx - The query executor instance used to perform database operations.
* @param {string} tenantId - The ID of the tenant to filter members by.
* @param {number} limit - The maximum number of members to retrieve.
* @param {number} offset - The starting point in the list of members to retrieve.
* @param {string} segmentId - The ID of the segment to filter members by.
* @return {Promise<IMember[]>} A promise that resolves to an array of members
* with missing period information on their work experience.
*/

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (1)
backend/src/database/repositories/dataQualityRepository.ts (1)

26-163: Consider architectural improvements for scalability

The current implementation could benefit from the following architectural improvements:

  1. Batch processing capabilities for handling large datasets efficiently
  2. Caching strategy for frequently accessed data quality issues
  3. Rate limiting to prevent API abuse

Consider implementing:

  1. Batch processing:
interface BatchOptions {
  batchSize: number;
  processingFn: (batch: any[]) => Promise<void>;
}

const processBatchedResults = async (
  fetchFn: Function,
  options: IRepositoryOptions,
  batchOptions: BatchOptions,
  ...args: any[]
) => {
  let offset = 0;
  while (true) {
    const batch = await fetchFn(options, ...args, batchOptions.batchSize, offset);
    if (batch.length === 0) break;
    await batchOptions.processingFn(batch);
    offset += batchOptions.batchSize;
  }
};
  1. Caching layer:
import { caching } from '@crowd/cache';

const cache = caching({
  ttl: 300, // 5 minutes
  store: 'redis'
});

const getCachedResults = async (key: string, fetchFn: Function, ...args: any[]) => {
  const cached = await cache.get(key);
  if (cached) return cached;
  
  const results = await fetchFn(...args);
  await cache.set(key, results);
  return results;
};

Would you like me to help implement these improvements?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 14b3e83 and b355721.

📒 Files selected for processing (2)
  • backend/src/database/repositories/dataQualityRepository.ts (1 hunks)
  • services/libs/data-access-layer/src/data-quality/members.ts (1 hunks)
🧰 Additional context used
🪛 Biome
backend/src/database/repositories/dataQualityRepository.ts

[error] 14-164: Avoid classes that contain only static members.

Prefer using simple functions instead of classes with only static members.

(lint/complexity/noStaticOnlyClass)

🔇 Additional comments (1)
services/libs/data-access-layer/src/data-quality/members.ts (1)

1-337: Well-structured implementation of data quality functions

The newly added data quality functions are clearly defined with comprehensive JSDoc comments. The SQL queries are constructed to retrieve the intended datasets effectively, and the code aligns with the project's standards.

Comment on lines +1 to +9
import {
fetchMembersWithConflictingWorkExperiences,
fetchMembersWithMissingInfoOnWorkExperience,
fetchMembersWithMissingPeriodOnWorkExperience,
fetchMembersWithTooManyEmails,
fetchMembersWithTooManyIdentities,
fetchMembersWithTooManyIdentitiesPerPlatform,
fetchMembersWithoutWorkExperience,
} from '@crowd/data-access-layer/src/data-quality'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add type definitions for data quality issues

Import and define explicit types for the data quality issues to improve type safety and developer experience.

Add type definitions at the top of the file:

import type { 
  MemberWithNoWorkExperience,
  MemberWithTooManyIdentities,
  MemberWithTooManyEmails,
  MemberWithMissingWorkExperience,
  MemberWithConflictingWorkExperience
} from '@crowd/data-access-layer/src/data-quality/types'

segmentId: string,
) {
const qx = SequelizeRepository.getQueryExecutor(options)
return fetchMembersWithTooManyIdentities(qx, 15, tenantId, limit, offset, segmentId)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Extract magic numbers into named constants

The hardcoded threshold values should be extracted into named constants at the module level for better maintainability and documentation.

Add these constants at the top of the file:

const THRESHOLDS = {
  MAX_IDENTITIES: 15,
  MAX_IDENTITIES_PER_PLATFORM: 1,
  MAX_EMAILS: 3
} as const

// Then update the method calls:
// Line 56: fetchMembersWithTooManyIdentities(qx, THRESHOLDS.MAX_IDENTITIES, ...)
// Line 78: fetchMembersWithTooManyIdentitiesPerPlatform(qx, THRESHOLDS.MAX_IDENTITIES_PER_PLATFORM, ...)
// Line 99: fetchMembersWithTooManyEmails(qx, THRESHOLDS.MAX_EMAILS, ...)

Also applies to: 78-78, 99-99

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between b355721 and d7ff1f4.

📒 Files selected for processing (3)
  • frontend/src/modules/data-quality/components/member/data-quality-member-issues.vue (1 hunks)
  • frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions.vue (1 hunks)
  • frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions.vue (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • frontend/src/modules/data-quality/components/member/data-quality-member-issues.vue
  • frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions.vue

Comment on lines 74 to 75
const projectGroup = projectGroups.value.list.find((g) => g.id === props.projectGroup);
console.log(projectGroup);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Remove console.log statement.

Remove debugging console.log statement from production code.

 const projectGroup = projectGroups.value.list.find((g) => g.id === props.projectGroup);
-console.log(projectGroup);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const projectGroup = projectGroups.value.list.find((g) => g.id === props.projectGroup);
console.log(projectGroup);
const projectGroup = projectGroups.value.list.find((g) => g.id === props.projectGroup);

Comment on lines +57 to +59
const props = defineProps<{
projectGroup: string,
}>();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add props validation for projectGroup.

Consider adding validation to ensure the projectGroup prop is not empty.

 const props = defineProps<{
   projectGroup: string,
-}>();
+}>();
+
+const validateProps = () => {
+  if (!props.projectGroup?.trim()) {
+    throw new Error('projectGroup prop is required and cannot be empty');
+  }
+};
+
+onMounted(() => {
+  validateProps();
+  loadMergeSuggestions();
+});

Committable suggestion was skipped due to low confidence.

Comment on lines 43 to 56
<script lang="ts" setup>
import { onMounted, ref } from 'vue';
import LfSpinner from '@/ui-kit/spinner/Spinner.vue';
import LfButton from '@/ui-kit/button/Button.vue';
import LfIconOld from '@/ui-kit/icon/IconOld.vue';
import { OrganizationService } from '@/modules/organization/organization-service';
import LfDataQualityOrganizationMergeSuggestionsItem
from '@/modules/data-quality/components/organization/data-quality-organization-merge-suggestions-item.vue';
import AppOrganizationMergeSuggestionsDialog
from '@/modules/organization/components/organization-merge-suggestions-dialog.vue';
import { getSegmentsFromProjectGroup } from '@/utils/segments';
import { storeToRefs } from 'pinia';
import { useLfSegmentsStore } from '@/modules/lf/segments/store';

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Import types from OrganizationService for better type safety.

Consider importing the response type from the OrganizationService to improve type safety when handling the API response.

 import { OrganizationService } from '@/modules/organization/organization-service';
+import type { MergeSuggestionsResponse } from '@/modules/organization/organization-service';
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<script lang="ts" setup>
import { onMounted, ref } from 'vue';
import LfSpinner from '@/ui-kit/spinner/Spinner.vue';
import LfButton from '@/ui-kit/button/Button.vue';
import LfIconOld from '@/ui-kit/icon/IconOld.vue';
import { OrganizationService } from '@/modules/organization/organization-service';
import LfDataQualityOrganizationMergeSuggestionsItem
from '@/modules/data-quality/components/organization/data-quality-organization-merge-suggestions-item.vue';
import AppOrganizationMergeSuggestionsDialog
from '@/modules/organization/components/organization-merge-suggestions-dialog.vue';
import { getSegmentsFromProjectGroup } from '@/utils/segments';
import { storeToRefs } from 'pinia';
import { useLfSegmentsStore } from '@/modules/lf/segments/store';
<script lang="ts" setup>
import { onMounted, ref } from 'vue';
import LfSpinner from '@/ui-kit/spinner/Spinner.vue';
import LfButton from '@/ui-kit/button/Button.vue';
import LfIconOld from '@/ui-kit/icon/IconOld.vue';
import { OrganizationService } from '@/modules/organization/organization-service';
import type { MergeSuggestionsResponse } from '@/modules/organization/organization-service';
import LfDataQualityOrganizationMergeSuggestionsItem
from '@/modules/data-quality/components/organization/data-quality-organization-merge-suggestions-item.vue';
import AppOrganizationMergeSuggestionsDialog
from '@/modules/organization/components/organization-merge-suggestions-dialog.vue';
import { getSegmentsFromProjectGroup } from '@/utils/segments';
import { storeToRefs } from 'pinia';
import { useLfSegmentsStore } from '@/modules/lf/segments/store';

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (1)
frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions.vue (1)

24-34: Enhance accessibility for the empty state message.

The empty state message should be more accessible to screen readers.

-      <div
-        class="ri-shuffle-line text-gray-200 text-10xl h-40 flex items-center mb-8"
+      <div
+        class="ri-shuffle-line text-gray-200 text-10xl h-40 flex items-center mb-8"
+        role="img"
+        aria-label="No duplicates found"
       />
       <h5 class="text-center text-lg font-semibold mb-4">
         No merge suggestions
       </h5>
-      <p class="text-sm text-center text-gray-600 leading-5">
+      <p class="text-sm text-center text-gray-600 leading-5" aria-live="polite">
         We couldn't find any duplicated organizations
       </p>
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between d7ff1f4 and 02e5e9a.

📒 Files selected for processing (4)
  • backend/src/database/repositories/memberRepository.ts (1 hunks)
  • frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions-item.vue (1 hunks)
  • frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions-item.vue (1 hunks)
  • frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions.vue (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions-item.vue
🔇 Additional comments (2)
frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions-item.vue (2)

42-44: ⚠️ Potential issue

Add TypeScript types for props

Replace any type with proper TypeScript interface.

 const props = defineProps<{
-  suggestion: any,
+  suggestion: MergeSuggestion,
 }>();

Likely invalid or redundant comment.


32-32: Verify action slot usage

The component provides an action slot but its usage and required content should be verified.

✅ Verification successful

Action slot is properly implemented and used

The action slot is correctly used in the parent component LfDataQualityOrganizationMergeSuggestions.vue. It provides a "View merge suggestion" button that triggers a modal dialog when clicked. The implementation is complete and follows the expected pattern for slot usage.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for usage of this component and its action slot
echo "Searching for component usage..."
rg -l "LfDataQualityOrganizationMergeSuggestionsItem" 

echo "Checking action slot implementation..."
rg -A 5 "<LfDataQualityOrganizationMergeSuggestionsItem" 

Length of output: 491


Script:

#!/bin/bash
# Search for the usage of the component with more context
echo "Checking component usage with context..."
rg -B 2 -A 10 "<LfDataQualityOrganizationMergeSuggestionsItem" frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions.vue

# Check the template structure of the parent component
echo "Checking parent component template..."
cat frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions.vue

Length of output: 4236

Comment on lines 25 to 29
<lf-badge type="primary" size="small">
<lf-badge type="primary" size="small" class="!font-semibold">
Merge suggestion
</lf-badge>
<app-member-merge-similarity :similarity="+props.suggestion.similarity" percentage-only />
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Simplify redundant badge structure

The nested badge structure appears redundant and could be simplified.

-      <lf-badge type="primary" size="small">
-        <lf-badge type="primary" size="small" class="!font-semibold">
-          Merge suggestion
-        </lf-badge>
+      <lf-badge type="primary" size="small" class="font-semibold">
+        Merge suggestion
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<lf-badge type="primary" size="small">
<lf-badge type="primary" size="small" class="!font-semibold">
Merge suggestion
</lf-badge>
<app-member-merge-similarity :similarity="+props.suggestion.similarity" percentage-only />
<lf-badge type="primary" size="small" class="font-semibold">
Merge suggestion
<app-member-merge-similarity :similarity="+props.suggestion.similarity" percentage-only />

Comment on lines 2 to 33
<article class="border-b border-gray-100 py-5 flex items-center justify-between">
<div class="flex items-center gap-4">
<template v-for="(organization, mi) of props.suggestion.organizations" :key="organization.id">
<div v-if="mi > 0">
<lf-icon-old name="arrow-left-line" :size="16" class="text-gray-400" />
</div>
<div class="flex items-center gap-3">
<lf-avatar
:src="organization.logo"
:name="organization.displayName"
:size="32"
>
<template #placeholder>
<div class="w-full h-full bg-gray-50 flex items-center justify-center">
<lf-icon-old name="community-line" :size="16" class="text-gray-400" />
</div>
</template>
</lf-avatar>
<p class="text-medium font-semibold">
{{ organization.displayName }}
</p>
</div>
</template>
<lf-badge type="primary" size="small">
<lf-badge type="primary" size="small" class="!font-semibold">
Merge suggestion
</lf-badge>
<app-member-merge-similarity :similarity="+props.suggestion.similarity" percentage-only />
</lf-badge>
</div>
<slot name="action" />
</article>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add accessibility attributes

The component lacks proper accessibility attributes for better screen reader support.

Add these attributes:

-  <article class="border-b border-gray-100 py-5 flex items-center justify-between">
+  <article 
+    class="border-b border-gray-100 py-5 flex items-center justify-between"
+    role="listitem"
+    aria-label="Organization merge suggestion">
...
-          <p class="text-medium font-semibold">
+          <p class="text-medium font-semibold" aria-label="Organization name">

Committable suggestion was skipped due to low confidence.

Comment on lines +36 to +41
<script lang="ts" setup>
import LfAvatar from '@/ui-kit/avatar/Avatar.vue';
import LfIconOld from '@/ui-kit/icon/IconOld.vue';
import LfBadge from '@/ui-kit/badge/Badge.vue';
import AppMemberMergeSimilarity from '@/modules/member/components/suggestions/member-merge-similarity.vue';

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add type imports for component props

Consider adding type definitions for better type safety and documentation.

 <script lang="ts" setup>
+import type { Organization } from '@/types/organization';
+
+interface MergeSuggestion {
+  organizations: Organization[];
+  similarity: number;
+}
+
 import LfAvatar from '@/ui-kit/avatar/Avatar.vue';

Committable suggestion was skipped due to low confidence.

Comment on lines 73 to 87
const segments: string[] = selectedProjectGroup.value?.id === props.projectGroup
? [
selectedProjectGroup.value?.id,
...selectedProjectGroup.value.projects.map((p) => [
...p.subprojects.map((sp) => sp.id),
]).flat(),
]
: [
props.projectGroup,
...selectedProjectGroup.value.projects
.filter((p) => p.id === props.projectGroup)
.map((p) => [
...p.subprojects.map((sp) => sp.id),
]).flat(),
];
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Simplify segments array construction logic.

The current implementation is complex and could be simplified for better maintainability.

-  const segments: string[] = selectedProjectGroup.value?.id === props.projectGroup
-    ? [
-      selectedProjectGroup.value?.id,
-      ...selectedProjectGroup.value.projects.map((p) => [
-        ...p.subprojects.map((sp) => sp.id),
-      ]).flat(),
-    ]
-    : [
-      props.projectGroup,
-      ...selectedProjectGroup.value.projects
-        .filter((p) => p.id === props.projectGroup)
-        .map((p) => [
-          ...p.subprojects.map((sp) => sp.id),
-        ]).flat(),
-    ];
+  const getSubprojectIds = (projects: any[]) => 
+    projects.map(p => p.subprojects.map(sp => sp.id)).flat();
+
+  const segments: string[] = [];
+  
+  if (selectedProjectGroup.value?.id === props.projectGroup) {
+    segments.push(selectedProjectGroup.value.id);
+    segments.push(...getSubprojectIds(selectedProjectGroup.value.projects));
+  } else {
+    segments.push(props.projectGroup);
+    const filteredProjects = selectedProjectGroup.value.projects
+      .filter(p => p.id === props.projectGroup);
+    segments.push(...getSubprojectIds(filteredProjects));
+  }

Committable suggestion was skipped due to low confidence.

Comment on lines +537 to +543
activityCount: i.primaryActivityCount,
avatarUrl: i.primaryAvatarUrl,
},
{
id: i.toMergeId,
displayName: i.toMergeDisplayName,
activityCount: i.toActivityCount,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

'primaryActivityCount' and 'toActivityCount' might be undefined

In the mapping of mems to result, the properties i.primaryActivityCount and i.toActivityCount are being assigned to activityCount. However, these properties are not included in the SQL query that populates mems. This will result in activityCount being undefined in the output.

To fix this, modify the SQL query to include primaryActivityCount and toActivityCount by selecting the appropriate activityCount fields from the member_segments_mv table:

 SELECT
     DISTINCT
     mtm."memberId" AS id,
     mtm."toMergeId",
     mtm.similarity,
     mtm."activityEstimate",
     m."displayName" as "primaryDisplayName",
+    ms."activityCount" as "primaryActivityCount",
     m.attributes->'avatarUrl'->>'default' as "primaryAvatarUrl",
     m2."displayName" as "toMergeDisplayName",
+    ms2."activityCount" as "toActivityCount",
     m2.attributes->'avatarUrl'->>'default' as "toMergeAvatarUrl"
 FROM "memberToMerge" mtm
 JOIN member_segments_mv ms ON ms."memberId" = mtm."memberId"
 JOIN member_segments_mv ms2 ON ms2."memberId" = mtm."toMergeId"
 JOIN members m on m.id = mtm."memberId"
 JOIN members m2 on m2.id = mtm."toMergeId"

This ensures that primaryActivityCount and toActivityCount are available in i when constructing the result.

Committable suggestion was skipped due to low confidence.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (1)
frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions.vue (1)

13-15: Enhance button accessibility.

The button lacks proper ARIA attributes for better screen reader support.

-          <lf-button type="secondary" size="small" @click="isModalOpen = true; detailsOffset = si">
-            <lf-icon-old name="eye-line" />View merge suggestion
+          <lf-button 
+            type="secondary" 
+            size="small" 
+            @click="isModalOpen = true; detailsOffset = si"
+            aria-label="View merge suggestion details"
+          >
+            <lf-icon-old name="eye-line" aria-hidden="true" />View merge suggestion
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 3719ed2 and 5a1ffeb.

📒 Files selected for processing (5)
  • frontend/src/modules/data-quality/components/data-quality-member.vue (1 hunks)
  • frontend/src/modules/data-quality/components/data-quality-organization.vue (1 hunks)
  • frontend/src/modules/data-quality/components/member/data-quality-member-issues-item.vue (1 hunks)
  • frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions.vue (1 hunks)
  • frontend/src/modules/data-quality/components/organization/data-quality-organization-merge-suggestions.vue (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • frontend/src/modules/data-quality/components/data-quality-member.vue
  • frontend/src/modules/data-quality/components/data-quality-organization.vue
  • frontend/src/modules/data-quality/components/member/data-quality-member-issues-item.vue
  • frontend/src/modules/data-quality/components/member/data-quality-member-merge-suggestions.vue

Comment on lines +112 to +115
const loadMore = () => {
offset.value = mergeSuggestions.value.length;
loadMergeSuggestions();
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Handle edge cases in pagination.

The loadMore function should handle edge cases where the offset might exceed the total count.

 const loadMore = () => {
+  if (mergeSuggestions.value.length >= total.value) {
+    console.warn('All suggestions have been loaded');
+    return;
+  }
   offset.value = mergeSuggestions.value.length;
   loadMergeSuggestions();
 };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const loadMore = () => {
offset.value = mergeSuggestions.value.length;
loadMergeSuggestions();
};
const loadMore = () => {
if (mergeSuggestions.value.length >= total.value) {
console.warn('All suggestions have been loaded');
return;
}
offset.value = mergeSuggestions.value.length;
loadMergeSuggestions();
};

Comment on lines +65 to +66
const limit = ref(20);
const offset = ref(0);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add validation for pagination parameters.

The limit and offset refs lack validation to ensure they remain within reasonable bounds.

-const limit = ref(20);
-const offset = ref(0);
+const limit = ref<number>(20);
+const offset = ref<number>(0);
+
+watch([limit, offset], ([newLimit, newOffset]) => {
+  if (newLimit <= 0) limit.value = 20;
+  if (newOffset < 0) offset.value = 0;
+});
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const limit = ref(20);
const offset = ref(0);
const limit = ref<number>(20);
const offset = ref<number>(0);
watch([limit, offset], ([newLimit, newOffset]) => {
if (newLimit <= 0) limit.value = 20;
if (newOffset < 0) offset.value = 0;
});

Comment on lines +75 to +89
const segments = computed(() => (selectedProjectGroup.value?.id === props.projectGroup
? [
selectedProjectGroup.value?.id,
...selectedProjectGroup.value.projects.map((p) => [
...p.subprojects.map((sp) => sp.id),
]).flat(),
]
: [
props.projectGroup,
...selectedProjectGroup.value.projects
.filter((p) => p.id === props.projectGroup)
.map((p) => [
...p.subprojects.map((sp) => sp.id),
]).flat(),
]));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add type safety for project group data structures.

The computed property lacks proper typing for projects and subprojects, which could lead to runtime errors.

+interface Project {
+  id: string;
+  subprojects: Array<{
+    id: string;
+  }>;
+}
+
+interface ProjectGroup {
+  id: string;
+  projects: Project[];
+}
+
-const segments = computed(() => (selectedProjectGroup.value?.id === props.projectGroup
+const segments = computed<string[]>(() => {
+  const currentGroup = selectedProjectGroup.value as ProjectGroup | undefined;
+  if (!currentGroup) return [props.projectGroup];
+
+  return (currentGroup.id === props.projectGroup

Committable suggestion was skipped due to low confidence.

Comment on lines +91 to +110
const loadMergeSuggestions = () => {
loading.value = true;
OrganizationService.fetchMergeSuggestions(limit.value, offset.value, {
detail: false,
segments: segments.value,
orderBy: ['similarity_DESC', 'activityCount_DESC'],
})
.then((res) => {
total.value = +res.count;
const rows = res.rows.filter((s: any) => s.similarity > 0);
if (+res.offset > 0) {
mergeSuggestions.value = [...mergeSuggestions.value, ...rows];
} else {
mergeSuggestions.value = rows;
}
})
.finally(() => {
loading.value = false;
});
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add retry logic for failed requests.

The loadMergeSuggestions function should handle network failures gracefully with retry logic.

+const MAX_RETRIES = 3;
+const RETRY_DELAY = 1000;
+
+const delay = (ms: number) => new Promise(resolve => setTimeout(resolve, ms));
+
+const loadMergeSuggestions = async (retryCount = 0) => {
   loading.value = true;
-  OrganizationService.fetchMergeSuggestions(limit.value, offset.value, {
-    detail: false,
-    segments: segments.value,
-    orderBy: ['similarity_DESC', 'activityCount_DESC'],
-  })
-    .then((res) => {
+  try {
+    const res = await OrganizationService.fetchMergeSuggestions(
+      limit.value,
+      offset.value,
+      {
+        detail: false,
+        segments: segments.value,
+        orderBy: ['similarity_DESC', 'activityCount_DESC'],
+      }
+    );
       total.value = +res.count;
       const rows = res.rows.filter((s: any) => s.similarity > 0);
       if (+res.offset > 0) {
         mergeSuggestions.value = [...mergeSuggestions.value, ...rows];
       } else {
         mergeSuggestions.value = rows;
       }
-    })
-    .finally(() => {
+  } catch (error) {
+    if (retryCount < MAX_RETRIES) {
+      await delay(RETRY_DELAY);
+      return loadMergeSuggestions(retryCount + 1);
+    }
+    console.error('Failed to load merge suggestions after retries:', error);
+  } finally {
       loading.value = false;
-    });
+  }
 };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const loadMergeSuggestions = () => {
loading.value = true;
OrganizationService.fetchMergeSuggestions(limit.value, offset.value, {
detail: false,
segments: segments.value,
orderBy: ['similarity_DESC', 'activityCount_DESC'],
})
.then((res) => {
total.value = +res.count;
const rows = res.rows.filter((s: any) => s.similarity > 0);
if (+res.offset > 0) {
mergeSuggestions.value = [...mergeSuggestions.value, ...rows];
} else {
mergeSuggestions.value = rows;
}
})
.finally(() => {
loading.value = false;
});
};
const MAX_RETRIES = 3;
const RETRY_DELAY = 1000;
const delay = (ms: number) => new Promise(resolve => setTimeout(resolve, ms));
const loadMergeSuggestions = async (retryCount = 0) => {
loading.value = true;
try {
const res = await OrganizationService.fetchMergeSuggestions(
limit.value,
offset.value,
{
detail: false,
segments: segments.value,
orderBy: ['similarity_DESC', 'activityCount_DESC'],
}
);
total.value = +res.count;
const rows = res.rows.filter((s: any) => s.similarity > 0);
if (+res.offset > 0) {
mergeSuggestions.value = [...mergeSuggestions.value, ...rows];
} else {
mergeSuggestions.value = rows;
}
} catch (error) {
if (retryCount < MAX_RETRIES) {
await delay(RETRY_DELAY);
return loadMergeSuggestions(retryCount + 1);
}
console.error('Failed to load merge suggestions after retries:', error);
} finally {
loading.value = false;
}
};

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 5a1ffeb and ba00a7e.

📒 Files selected for processing (1)
  • services/libs/data-access-layer/src/data-quality/members.ts (1 hunks)

Comment on lines +24 to +34
SELECT m.id, m."displayName", m.attributes, msa."activityCount"
FROM members m
LEFT JOIN "memberOrganizations" mo ON m.id = mo."memberId" AND mo."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE mo."memberId" IS NULL
AND m."tenantId" = '${tenantId}'
AND m."deletedAt" IS NULL
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Use parameterized queries to prevent SQL injection vulnerabilities.

Variables are directly embedded into the SQL query string, which can lead to SQL injection attacks. Use placeholders and pass the parameters separately to ensure proper sanitization.

Apply this diff to modify the query:

 return qx.select(
   `
           SELECT m.id, m."displayName", m.attributes, msa."activityCount"
           FROM members m
                    LEFT JOIN "memberOrganizations" mo ON m.id = mo."memberId" AND mo."deletedAt" IS NULL
-                   INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
+                   INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
           WHERE mo."memberId" IS NULL
-            AND m."tenantId" = '${tenantId}'
+            AND m."tenantId" = $2
             AND m."deletedAt" IS NULL
             AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
           ORDER BY msa."activityCount" DESC
-          LIMIT ${limit} OFFSET ${offset};
+          LIMIT $3 OFFSET $4;
       `,
-    {
-      segmentId,
-      tenantId,
-      limit,
-      offset,
-    },
+    [
+      segmentId,
+      tenantId,
+      limit,
+      offset,
+    ],
     )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
SELECT m.id, m."displayName", m.attributes, msa."activityCount"
FROM members m
LEFT JOIN "memberOrganizations" mo ON m.id = mo."memberId" AND mo."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE mo."memberId" IS NULL
AND m."tenantId" = '${tenantId}'
AND m."deletedAt" IS NULL
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
SELECT m.id, m."displayName", m.attributes, msa."activityCount"
FROM members m
LEFT JOIN "memberOrganizations" mo ON m.id = mo."memberId" AND mo."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
WHERE mo."memberId" IS NULL
AND m."tenantId" = $2
AND m."deletedAt" IS NULL
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
ORDER BY msa."activityCount" DESC
LIMIT $3 OFFSET $4;
`,
[
segmentId,
tenantId,
limit,
offset,
],

Comment on lines +228 to +244
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo ON m.id = mo."memberId" AND mo."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND (mo."title" IS NULL OR mo."title"='')
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Use parameterized queries to prevent SQL injection vulnerabilities.

Variables are directly embedded into the SQL query string, which can lead to SQL injection attacks. Use placeholders and pass the parameters separately to ensure proper sanitization.

Apply this diff to modify the query:

 return qx.select(
   `
         SELECT
             m.id,
             m."displayName",
             m."attributes",
             msa."activityCount",
             COUNT(mo.id) AS "organizationsCount"
         FROM "members" m
                  JOIN "memberOrganizations" mo ON m.id = mo."memberId" AND mo."deletedAt" IS NULL
-                 INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
+                 INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
         WHERE m."tenantId" = $2
           AND (mo."title" IS NULL OR mo."title" = '')
           AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
         GROUP BY m.id, msa."activityCount"
         ORDER BY msa."activityCount" DESC
-        LIMIT ${limit} OFFSET ${offset};
+        LIMIT $3 OFFSET $4;
     `,
-    {
-      tenantId,
-      limit,
-      offset,
-      segmentId,
-    },
+    [
+      segmentId,
+      tenantId,
+      limit,
+      offset,
+    ],
     )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo ON m.id = mo."memberId" AND mo."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND (mo."title" IS NULL OR mo."title"='')
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo ON m.id = mo."memberId" AND mo."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
WHERE m."tenantId" = $2
AND (mo."title" IS NULL OR mo."title"='')
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT $3 OFFSET $4;
`,

Comment on lines 308 to 328
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo1.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo1 ON m.id = mo1."memberId" AND mo1."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
JOIN "memberOrganizations" mo2 ON mo1."memberId" = mo2."memberId" AND mo1."deletedAt" IS NULL
AND mo1.id != mo2.id
AND (
(mo1."dateStart" < COALESCE(mo2."dateEnd", 'infinity'::timestamp)
AND COALESCE(mo1."dateEnd", 'infinity'::timestamp) > mo2."dateStart") OR
(mo2."dateStart" < COALESCE(mo1."dateEnd", 'infinity'::timestamp)
AND COALESCE(mo2."dateEnd", 'infinity'::timestamp) > mo1."dateStart")
)
WHERE m."tenantId" = '${tenantId}'
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Use parameterized queries to prevent SQL injection vulnerabilities.

Variables are directly embedded into the SQL query string, which can lead to SQL injection attacks. Use placeholders and pass the parameters separately to ensure proper sanitization.

Apply this diff to modify the query:

 return qx.select(
   `
       SELECT
           m.id,
           m."displayName",
           m."attributes",
           msa."activityCount",
           COUNT(mo1.id) AS "organizationsCount"
       FROM "members" m
               JOIN "memberOrganizations" mo1 ON m.id = mo1."memberId" AND mo1."deletedAt" IS NULL
-              INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
+              INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
               JOIN "memberOrganizations" mo2 ON mo1."memberId" = mo2."memberId" AND mo1."deletedAt" IS NULL
               AND mo1.id != mo2.id
               AND (
                  (mo1."dateStart" < COALESCE(mo2."dateEnd", 'infinity'::timestamp)
                      AND COALESCE(mo1."dateEnd", 'infinity'::timestamp) > mo2."dateStart") OR
                  (mo2."dateStart" < COALESCE(mo1."dateEnd", 'infinity'::timestamp)
                      AND COALESCE(mo2."dateEnd", 'infinity'::timestamp) > mo1."dateStart")
              )
       WHERE m."tenantId" = $2
       GROUP BY m.id, msa."activityCount"
       ORDER BY msa."activityCount" DESC
-      LIMIT ${limit} OFFSET ${offset};
+      LIMIT $3 OFFSET $4;
     `,
-    {
-      tenantId,
-      limit,
-      offset,
-      segmentId,
-    },
+    [
+      segmentId,
+      tenantId,
+      limit,
+      offset,
+    ],
     )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo1.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo1 ON m.id = mo1."memberId" AND mo1."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
JOIN "memberOrganizations" mo2 ON mo1."memberId" = mo2."memberId" AND mo1."deletedAt" IS NULL
AND mo1.id != mo2.id
AND (
(mo1."dateStart" < COALESCE(mo2."dateEnd", 'infinity'::timestamp)
AND COALESCE(mo1."dateEnd", 'infinity'::timestamp) > mo2."dateStart") OR
(mo2."dateStart" < COALESCE(mo1."dateEnd", 'infinity'::timestamp)
AND COALESCE(mo2."dateEnd", 'infinity'::timestamp) > mo1."dateStart")
)
WHERE m."tenantId" = '${tenantId}'
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo1.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo1 ON m.id = mo1."memberId" AND mo1."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
JOIN "memberOrganizations" mo2 ON mo1."memberId" = mo2."memberId" AND mo1."deletedAt" IS NULL
AND mo1.id != mo2.id
AND (
(mo1."dateStart" < COALESCE(mo2."dateEnd", 'infinity'::timestamp)
AND COALESCE(mo1."dateEnd", 'infinity'::timestamp) > mo2."dateStart") OR
(mo2."dateStart" < COALESCE(mo1."dateEnd", 'infinity'::timestamp)
AND COALESCE(mo2."dateEnd", 'infinity'::timestamp) > mo1."dateStart")
)
WHERE m."tenantId" = $2
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT $3 OFFSET $4;

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Outside diff range and nitpick comments (1)
services/libs/data-access-layer/src/data-quality/members.ts (1)

299-305: Add JSDoc documentation

The function is missing JSDoc documentation to describe its purpose and parameters.

Add this documentation:

+/**
+ * Fetches members who have overlapping work experience dates.
+ *
+ * @param {QueryExecutor} qx - The query executor instance used to perform database operations.
+ * @param {string} tenantId - The ID of the tenant to filter members by.
+ * @param {number} limit - The maximum number of members to retrieve.
+ * @param {number} offset - The starting point in the list of members to retrieve.
+ * @param {string} segmentId - The ID of the segment to filter members by.
+ * @return {Promise<IMember[]>} A promise that resolves to an array of members
+ * who have work experiences with overlapping date ranges.
+ */
 export async function fetchMembersWithConflictingWorkExperiences(
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between ba00a7e and 6ad0f9e.

📒 Files selected for processing (1)
  • services/libs/data-access-layer/src/data-quality/members.ts (1 hunks)
🔇 Additional comments (2)
services/libs/data-access-layer/src/data-quality/members.ts (2)

254-264: ⚠️ Potential issue

Update JSDoc comment to match function purpose

The JSDoc comment is duplicated from the previous function and doesn't accurately describe this function's purpose.

Apply this diff:

 /**
- * Fetches members with missing information on work experience.
+ * Fetches members with missing start dates in their work experience records.
  *
  * @param {QueryExecutor} qx - The query executor instance used to perform database operations.
  * @param {string} tenantId - The ID of the tenant to filter members by.
  * @param {number} limit - The maximum number of members to retrieve.
  * @param {number} offset - The starting point in the list of members to retrieve.
  * @param {string} segmentId - The ID of the segment to filter members by.
  * @return {Promise<IMember[]>} A promise that resolves to an array of members
- * with missing period information on their work experience.
+ * who have work experience records with missing start dates.
  */

Likely invalid or redundant comment.


22-41: ⚠️ Potential issue

Convert to parameterized query to prevent SQL injection

The query is vulnerable to SQL injection attacks due to direct string interpolation. Use parameterized queries instead.

Apply this diff:

 return qx.select(
   `
            SELECT m.id, m."displayName", m.attributes, msa."activityCount"
            FROM members m
                     LEFT JOIN "memberOrganizations" mo ON m.id = mo."memberId" AND mo."deletedAt" IS NULL
-                    INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
+                    INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
            WHERE mo."memberId" IS NULL
-              AND m."tenantId" = '${tenantId}'
+              AND m."tenantId" = $2
              AND m."deletedAt" IS NULL
              AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
            ORDER BY msa."activityCount" DESC
-            LIMIT ${limit} OFFSET ${offset};
+            LIMIT $3 OFFSET $4;
        `,
-    {
-      segmentId,
-      tenantId,
-      limit,
-      offset,
-    },
+    [segmentId, tenantId, limit, offset],
   )

Likely invalid or redundant comment.

Comment on lines +272 to +296
return qx.select(
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo ON m.id = mo."memberId" AND mo."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND (mo."dateStart" IS NULL)
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
tenantId,
limit,
offset,
segmentId,
},
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix GROUP BY clause and prevent SQL injection

Two issues need to be addressed:

  1. Missing columns in GROUP BY clause
  2. SQL injection vulnerability

Apply this diff:

 return qx.select(
   `
          SELECT
              m.id,
              m."displayName",
              m."attributes",
              msa."activityCount",
              COUNT(mo.id) AS "organizationsCount"
          FROM "members" m
                   JOIN "memberOrganizations" mo ON m.id = mo."memberId" AND mo."deletedAt" IS NULL
-                  INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
+                  INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
-          WHERE m."tenantId" = '${tenantId}'
+          WHERE m."tenantId" = $2
            AND (mo."dateStart" IS NULL)
            AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
-          GROUP BY m.id, msa."activityCount"
+          GROUP BY m.id, m."displayName", m."attributes", msa."activityCount"
          ORDER BY msa."activityCount" DESC
-          LIMIT ${limit} OFFSET ${offset};
+          LIMIT $3 OFFSET $4;
      `,
-    {
-      tenantId,
-      limit,
-      offset,
-      segmentId,
-    },
+    [segmentId, tenantId, limit, offset],
   )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
return qx.select(
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo ON m.id = mo."memberId" AND mo."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND (mo."dateStart" IS NULL)
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
tenantId,
limit,
offset,
segmentId,
},
)
return qx.select(
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo ON m.id = mo."memberId" AND mo."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
WHERE m."tenantId" = $2
AND (mo."dateStart" IS NULL)
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
GROUP BY m.id, m."displayName", m."attributes", msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT $3 OFFSET $4;
`,
[segmentId, tenantId, limit, offset],
)

Comment on lines +178 to +206
return qx.select(
`
SELECT
mi."memberId",
m."displayName",
m."attributes",
m.id,
COUNT(DISTINCT mi.value) AS "identityCount",
msa."activityCount"
FROM "memberIdentities" mi
JOIN "members" m ON mi."memberId" = m.id
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND mi.verified = true
AND mi.type = 'email'
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
GROUP BY mi."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
HAVING COUNT(DISTINCT mi.value) > ${threshold}
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
threshold,
tenantId,
limit,
offset,
segmentId,
},
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add deletedAt filter and prevent SQL injection

The function needs:

  1. Filter for deleted members
  2. Protection against SQL injection

Apply this diff:

 return qx.select(
   `
        SELECT
            mi."memberId",
            m."displayName",
            m."attributes",
            m.id,
            COUNT(DISTINCT mi.value) AS "identityCount",
            msa."activityCount"
        FROM "memberIdentities" mi
                 JOIN "members" m ON mi."memberId" = m.id
-                INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
+                INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
-        WHERE m."tenantId" = '${tenantId}'
+        WHERE m."tenantId" = $2
+          AND m."deletedAt" IS NULL
          AND mi.verified = true
          AND mi.type = 'email'
          AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
        GROUP BY mi."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
-        HAVING COUNT(DISTINCT mi.value) > ${threshold}
+        HAVING COUNT(DISTINCT mi.value) > $3
        ORDER BY msa."activityCount" DESC
-        LIMIT ${limit} OFFSET ${offset};
+        LIMIT $4 OFFSET $5;
        `,
-    {
-      threshold,
-      tenantId,
-      limit,
-      offset,
-      segmentId,
-    },
+    [segmentId, tenantId, threshold, limit, offset],
   )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
return qx.select(
`
SELECT
mi."memberId",
m."displayName",
m."attributes",
m.id,
COUNT(DISTINCT mi.value) AS "identityCount",
msa."activityCount"
FROM "memberIdentities" mi
JOIN "members" m ON mi."memberId" = m.id
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND mi.verified = true
AND mi.type = 'email'
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
GROUP BY mi."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
HAVING COUNT(DISTINCT mi.value) > ${threshold}
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
threshold,
tenantId,
limit,
offset,
segmentId,
},
)
return qx.select(
`
SELECT
mi."memberId",
m."displayName",
m."attributes",
m.id,
COUNT(DISTINCT mi.value) AS "identityCount",
msa."activityCount"
FROM "memberIdentities" mi
JOIN "members" m ON mi."memberId" = m.id
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
WHERE m."tenantId" = $2
AND m."deletedAt" IS NULL
AND mi.verified = true
AND mi.type = 'email'
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
GROUP BY mi."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
HAVING COUNT(DISTINCT mi.value) > $3
ORDER BY msa."activityCount" DESC
LIMIT $4 OFFSET $5;
`,
[segmentId, tenantId, threshold, limit, offset],
)

Comment on lines +63 to +89
return qx.select(
`
SELECT
mi."memberId",
m."displayName",
m."attributes",
m.id,
COUNT(DISTINCT CASE WHEN mi."type" = 'email' THEN mi."value"::text ELSE mi.id::text END) AS "identityCount",
msa."activityCount"
FROM "memberIdentities" mi
JOIN "members" m ON mi."memberId" = m.id
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
GROUP BY mi."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
HAVING COUNT(DISTINCT CASE WHEN mi."type" = 'email' THEN mi."value"::text ELSE mi.id::text END) > ${threshold}
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
threshold,
tenantId,
limit,
offset,
segmentId,
},
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add deletedAt filter and convert to parameterized query

Two issues need to be addressed:

  1. Missing filter for deleted members
  2. SQL injection vulnerability

Apply this diff:

 return qx.select(
   `
            SELECT
                mi."memberId",
                m."displayName",
                m."attributes",
                m.id,
                COUNT(DISTINCT CASE WHEN mi."type" = 'email' THEN mi."value"::text ELSE mi.id::text END) AS "identityCount",
                msa."activityCount"
            FROM "memberIdentities" mi
                     JOIN "members" m ON mi."memberId" = m.id
-                    INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
+                    INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
-            WHERE m."tenantId" = '${tenantId}'
+            WHERE m."tenantId" = $2
+              AND m."deletedAt" IS NULL
              AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
            GROUP BY mi."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
-            HAVING COUNT(DISTINCT CASE WHEN mi."type" = 'email' THEN mi."value"::text ELSE mi.id::text END) > ${threshold}
+            HAVING COUNT(DISTINCT CASE WHEN mi."type" = 'email' THEN mi."value"::text ELSE mi.id::text END) > $3
            ORDER BY msa."activityCount" DESC
-            LIMIT ${limit} OFFSET ${offset};
+            LIMIT $4 OFFSET $5;
        `,
-    {
-      threshold,
-      tenantId,
-      limit,
-      offset,
-      segmentId,
-    },
+    [segmentId, tenantId, threshold, limit, offset],
   )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
return qx.select(
`
SELECT
mi."memberId",
m."displayName",
m."attributes",
m.id,
COUNT(DISTINCT CASE WHEN mi."type" = 'email' THEN mi."value"::text ELSE mi.id::text END) AS "identityCount",
msa."activityCount"
FROM "memberIdentities" mi
JOIN "members" m ON mi."memberId" = m.id
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
GROUP BY mi."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
HAVING COUNT(DISTINCT CASE WHEN mi."type" = 'email' THEN mi."value"::text ELSE mi.id::text END) > ${threshold}
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
threshold,
tenantId,
limit,
offset,
segmentId,
},
)
return qx.select(
`
SELECT
mi."memberId",
m."displayName",
m."attributes",
m.id,
COUNT(DISTINCT CASE WHEN mi."type" = 'email' THEN mi."value"::text ELSE mi.id::text END) AS "identityCount",
msa."activityCount"
FROM "memberIdentities" mi
JOIN "members" m ON mi."memberId" = m.id
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
WHERE m."tenantId" = $2
AND m."deletedAt" IS NULL
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
GROUP BY mi."memberId", m."displayName", m."attributes", m.id, msa."activityCount"
HAVING COUNT(DISTINCT CASE WHEN mi."type" = 'email' THEN mi."value"::text ELSE mi.id::text END) > $3
ORDER BY msa."activityCount" DESC
LIMIT $4 OFFSET $5;
`,
[segmentId, tenantId, threshold, limit, offset],
)

Comment on lines +227 to +251
return qx.select(
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo ON m.id = mo."memberId" AND mo."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND (mo."title" IS NULL OR mo."title"='')
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
tenantId,
limit,
offset,
segmentId,
},
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix GROUP BY clause and prevent SQL injection

Two issues need to be addressed:

  1. Missing columns in GROUP BY clause
  2. SQL injection vulnerability

Apply this diff:

 return qx.select(
   `
          SELECT
              m.id,
              m."displayName",
              m."attributes",
              msa."activityCount",
              COUNT(mo.id) AS "organizationsCount"
          FROM "members" m
                   JOIN "memberOrganizations" mo ON m.id = mo."memberId" AND mo."deletedAt" IS NULL
-                  INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
+                  INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
-          WHERE m."tenantId" = '${tenantId}'
+          WHERE m."tenantId" = $2
            AND (mo."title" IS NULL OR mo."title"='')
            AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
-          GROUP BY m.id, msa."activityCount"
+          GROUP BY m.id, m."displayName", m."attributes", msa."activityCount"
          ORDER BY msa."activityCount" DESC
-          LIMIT ${limit} OFFSET ${offset};
+          LIMIT $3 OFFSET $4;
      `,
-    {
-      tenantId,
-      limit,
-      offset,
-      segmentId,
-    },
+    [segmentId, tenantId, limit, offset],
   )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
return qx.select(
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo ON m.id = mo."memberId" AND mo."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND (mo."title" IS NULL OR mo."title"='')
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
tenantId,
limit,
offset,
segmentId,
},
)
return qx.select(
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo ON m.id = mo."memberId" AND mo."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
WHERE m."tenantId" = $2
AND (mo."title" IS NULL OR mo."title"='')
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
GROUP BY m.id, m."displayName", m."attributes", msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT $3 OFFSET $4;
`,
[segmentId, tenantId, limit, offset],
)

Comment on lines 306 to 336
return qx.select(
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(DISTINCT mo1.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo1 ON m.id = mo1."memberId" AND mo1."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
LEFT JOIN "memberOrganizations" mo2 ON mo1."memberId" = mo2."memberId" AND mo2."deletedAt" IS NULL
AND mo1.id != mo2.id
AND (
(mo1."dateStart" < COALESCE(mo2."dateEnd", 'infinity'::timestamp)
AND COALESCE(mo1."dateEnd", 'infinity'::timestamp) > mo2."dateStart") OR
(mo2."dateStart" < COALESCE(mo1."dateEnd", 'infinity'::timestamp)
AND COALESCE(mo2."dateEnd", 'infinity'::timestamp) > mo1."dateStart")
)
WHERE m."tenantId" = '${tenantId}'
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
tenantId,
limit,
offset,
segmentId,
},
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

⚠️ Potential issue

Prevent SQL injection and improve date overlap check

The query needs protection against SQL injection. Additionally, consider extracting the date overlap logic into a CTE for better readability.

Apply this diff:

 return qx.select(
   `
+       WITH overlapping_experiences AS (
+           SELECT DISTINCT mo1."memberId"
+           FROM "memberOrganizations" mo1
+           JOIN "memberOrganizations" mo2 ON mo1."memberId" = mo2."memberId"
+               AND mo1.id != mo2.id
+               AND mo1."deletedAt" IS NULL
+               AND mo2."deletedAt" IS NULL
+               AND (
+                   (mo1."dateStart" < COALESCE(mo2."dateEnd", 'infinity'::timestamp)
+                       AND COALESCE(mo1."dateEnd", 'infinity'::timestamp) > mo2."dateStart")
+               )
+       )
        SELECT
            m.id,
            m."displayName",
            m."attributes",
            msa."activityCount",
            COUNT(DISTINCT mo1.id) AS "organizationsCount"
        FROM "members" m
-               JOIN "memberOrganizations" mo1 ON m.id = mo1."memberId" AND mo1."deletedAt" IS NULL
-               INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
-               LEFT JOIN "memberOrganizations" mo2 ON mo1."memberId" = mo2."memberId" AND mo2."deletedAt" IS NULL
-               AND mo1.id != mo2.id
-               AND (
-                   (mo1."dateStart" < COALESCE(mo2."dateEnd", 'infinity'::timestamp)
-                       AND COALESCE(mo1."dateEnd", 'infinity'::timestamp) > mo2."dateStart")
-               )
+               JOIN overlapping_experiences oe ON m.id = oe."memberId"
+               JOIN "memberOrganizations" mo1 ON m.id = mo1."memberId" AND mo1."deletedAt" IS NULL
+               INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
-        WHERE m."tenantId" = '${tenantId}'
+        WHERE m."tenantId" = $2
+          AND m."deletedAt" IS NULL
+          AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
        GROUP BY m.id, m."displayName", m."attributes", msa."activityCount"
        ORDER BY msa."activityCount" DESC
-        LIMIT ${limit} OFFSET ${offset};
+        LIMIT $3 OFFSET $4;
      `,
-    {
-      tenantId,
-      limit,
-      offset,
-      segmentId,
-    },
+    [segmentId, tenantId, limit, offset],
   )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
return qx.select(
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(DISTINCT mo1.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo1 ON m.id = mo1."memberId" AND mo1."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
LEFT JOIN "memberOrganizations" mo2 ON mo1."memberId" = mo2."memberId" AND mo2."deletedAt" IS NULL
AND mo1.id != mo2.id
AND (
(mo1."dateStart" < COALESCE(mo2."dateEnd", 'infinity'::timestamp)
AND COALESCE(mo1."dateEnd", 'infinity'::timestamp) > mo2."dateStart") OR
(mo2."dateStart" < COALESCE(mo1."dateEnd", 'infinity'::timestamp)
AND COALESCE(mo2."dateEnd", 'infinity'::timestamp) > mo1."dateStart")
)
WHERE m."tenantId" = '${tenantId}'
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
tenantId,
limit,
offset,
segmentId,
},
)
return qx.select(
`
WITH overlapping_experiences AS (
SELECT DISTINCT mo1."memberId"
FROM "memberOrganizations" mo1
JOIN "memberOrganizations" mo2 ON mo1."memberId" = mo2."memberId"
AND mo1.id != mo2.id
AND mo1."deletedAt" IS NULL
AND mo2."deletedAt" IS NULL
AND (
(mo1."dateStart" < COALESCE(mo2."dateEnd", 'infinity'::timestamp)
AND COALESCE(mo1."dateEnd", 'infinity'::timestamp) > mo2."dateStart")
)
)
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(DISTINCT mo1.id) AS "organizationsCount"
FROM "members" m
JOIN overlapping_experiences oe ON m.id = oe."memberId"
JOIN "memberOrganizations" mo1 ON m.id = mo1."memberId" AND mo1."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = $1
WHERE m."tenantId" = $2
AND m."deletedAt" IS NULL
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
GROUP BY m.id, m."displayName", m."attributes", msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT $3 OFFSET $4;
`,
[segmentId, tenantId, limit, offset],
)

Comment on lines +111 to +156
return qx.select(
`
WITH platform_identities AS (
SELECT
mi."memberId",
mi.platform,
COUNT(*) AS "identityCount"
FROM "memberIdentities" mi
WHERE mi."tenantId" = '${tenantId}'
AND mi.type = 'username'
AND mi.verified = true
AND mi.platform IN ('linkedin', 'github', 'twitter', 'lfx', 'slack', 'cvent', 'tnc', 'groupsio')
GROUP BY mi."memberId", mi.platform
HAVING COUNT(*) > ${threshold}
),
aggregated_platforms AS (
SELECT
p."memberId",
STRING_AGG(p.platform, ',') AS platforms
FROM platform_identities p
GROUP BY p."memberId"
)
SELECT
p."memberId",
m."displayName",
m."attributes",
m.id,
platforms,
msa."activityCount"
FROM aggregated_platforms p
JOIN "members" m ON p."memberId" = m.id
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId"
WHERE m."tenantId" = '${tenantId}'
AND msa."segmentId" = '${segmentId}'
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
threshold,
tenantId,
limit,
offset,
segmentId,
},
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

⚠️ Potential issue

Make platform list configurable and prevent SQL injection

Two improvements needed:

  1. The platform list is hard-coded and should be configurable
  2. The query is vulnerable to SQL injection

Apply this diff:

+const SUPPORTED_PLATFORMS = ['linkedin', 'github', 'twitter', 'lfx', 'slack', 'cvent', 'tnc', 'groupsio'];

 return qx.select(
   `
            WITH platform_identities AS (
                SELECT
                    mi."memberId",
                    mi.platform,
                    COUNT(*) AS "identityCount"
                FROM "memberIdentities" mi
-               WHERE mi."tenantId" = '${tenantId}'
+               WHERE mi."tenantId" = $1
                  AND mi.type = 'username'
                  AND mi.verified = true
-                 AND mi.platform IN ('linkedin', 'github', 'twitter', 'lfx', 'slack', 'cvent', 'tnc', 'groupsio')
+                 AND mi.platform = ANY($2::text[])
                GROUP BY mi."memberId", mi.platform
-               HAVING COUNT(*) > ${threshold}
+               HAVING COUNT(*) > $3
            ),
                 aggregated_platforms AS (
                     SELECT
                         p."memberId",
                         STRING_AGG(p.platform, ',') AS platforms
                     FROM platform_identities p
                     GROUP BY p."memberId"
                 )
            SELECT
                p."memberId",
                m."displayName",
                m."attributes",
                m.id,
                platforms,
                msa."activityCount"
            FROM aggregated_platforms p
                     JOIN "members" m ON p."memberId" = m.id
                     INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId"
-            WHERE m."tenantId" = '${tenantId}'
-              AND msa."segmentId" = '${segmentId}'
+            WHERE m."tenantId" = $1
+              AND msa."segmentId" = $4
              AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
            ORDER BY msa."activityCount" DESC
-            LIMIT ${limit} OFFSET ${offset};
+            LIMIT $5 OFFSET $6;
        `,
-    {
-      threshold,
-      tenantId,
-      limit,
-      offset,
-      segmentId,
-    },
+    [tenantId, SUPPORTED_PLATFORMS, threshold, segmentId, limit, offset],
   )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
return qx.select(
`
WITH platform_identities AS (
SELECT
mi."memberId",
mi.platform,
COUNT(*) AS "identityCount"
FROM "memberIdentities" mi
WHERE mi."tenantId" = '${tenantId}'
AND mi.type = 'username'
AND mi.verified = true
AND mi.platform IN ('linkedin', 'github', 'twitter', 'lfx', 'slack', 'cvent', 'tnc', 'groupsio')
GROUP BY mi."memberId", mi.platform
HAVING COUNT(*) > ${threshold}
),
aggregated_platforms AS (
SELECT
p."memberId",
STRING_AGG(p.platform, ',') AS platforms
FROM platform_identities p
GROUP BY p."memberId"
)
SELECT
p."memberId",
m."displayName",
m."attributes",
m.id,
platforms,
msa."activityCount"
FROM aggregated_platforms p
JOIN "members" m ON p."memberId" = m.id
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId"
WHERE m."tenantId" = '${tenantId}'
AND msa."segmentId" = '${segmentId}'
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
threshold,
tenantId,
limit,
offset,
segmentId,
},
)
const SUPPORTED_PLATFORMS = ['linkedin', 'github', 'twitter', 'lfx', 'slack', 'cvent', 'tnc', 'groupsio'];
return qx.select(
`
WITH platform_identities AS (
SELECT
mi."memberId",
mi.platform,
COUNT(*) AS "identityCount"
FROM "memberIdentities" mi
WHERE mi."tenantId" = $1
AND mi.type = 'username'
AND mi.verified = true
AND mi.platform = ANY($2::text[])
GROUP BY mi."memberId", mi.platform
HAVING COUNT(*) > $3
),
aggregated_platforms AS (
SELECT
p."memberId",
STRING_AGG(p.platform, ',') AS platforms
FROM platform_identities p
GROUP BY p."memberId"
)
SELECT
p."memberId",
m."displayName",
m."attributes",
m.id,
platforms,
msa."activityCount"
FROM aggregated_platforms p
JOIN "members" m ON p."memberId" = m.id
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId"
WHERE m."tenantId" = $1
AND msa."segmentId" = $4
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
ORDER BY msa."activityCount" DESC
LIMIT $5 OFFSET $6;
`,
[tenantId, SUPPORTED_PLATFORMS, threshold, segmentId, limit, offset],
)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 6ad0f9e and 3699606.

📒 Files selected for processing (1)
  • services/libs/data-access-layer/src/data-quality/members.ts (1 hunks)
🔇 Additional comments (1)
services/libs/data-access-layer/src/data-quality/members.ts (1)

255-264: ⚠️ Potential issue

Ensure the JSDoc comments accurately describe the function

The JSDoc comment for fetchMembersWithMissingPeriodOnWorkExperience is generic and similar to the previous function. To improve clarity, please update the description to specifically indicate that this function retrieves members with missing period information (e.g., missing start dates) in their work experience.

Apply this diff to clarify the function description:

 /**
- * Fetches members with missing information on work experience.
+ * Fetches members with missing period information in their work experience.
  *
  * @param {QueryExecutor} qx - The query executor instance used to perform database operations.
  * @param {string} tenantId - The ID of the tenant to filter members by.
  * @param {number} limit - The maximum number of members to retrieve.
  * @param {number} offset - The starting point in the list of members to retrieve.
  * @param {string} segmentId - The ID of the segment to filter members by.
- * @return {Promise<IMember[]>} A promise that resolves to an array of members
- * with missing period information on their work experience.
+ * @return {Promise<IMember[]>} A promise that resolves to an array of members with missing period information on their work experience.
  */

Likely invalid or redundant comment.

Comment on lines 1 to 337
WHERE m."tenantId" = '${tenantId}'
AND (mo."title" IS NULL OR mo."title"='')
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
tenantId,
limit,
offset,
segmentId,
},
)
}

/**
* Fetches members with missing information on work experience.
*
* @param {QueryExecutor} qx - The query executor instance used to perform database operations.
* @param {string} tenantId - The ID of the tenant to filter members by.
* @param {number} limit - The maximum number of members to retrieve.
* @param {number} offset - The starting point in the list of members to retrieve.
* @param {string} segmentId - The ID of the segment to filter members by.
* @return {Promise<IMember[]>} A promise that resolves to an array of members
* with missing period information on their work experience.
*/
export async function fetchMembersWithMissingPeriodOnWorkExperience(
qx: QueryExecutor,
tenantId: string,
limit: number,
offset: number,
segmentId: string,
): Promise<IMember[]> {
return qx.select(
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(mo.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo ON m.id = mo."memberId" AND mo."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
AND (mo."dateStart" IS NULL)
AND COALESCE((m.attributes -> 'isBot' ->> 'default')::BOOLEAN, FALSE) = FALSE
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
tenantId,
limit,
offset,
segmentId,
},
)
}

export async function fetchMembersWithConflictingWorkExperiences(
qx: QueryExecutor,
tenantId: string,
limit: number,
offset: number,
segmentId: string,
): Promise<IMember[]> {
return qx.select(
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(DISTINCT mo1.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo1 ON m.id = mo1."memberId" AND mo1."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
INNER JOIN "memberOrganizations" mo2 ON mo1."memberId" = mo2."memberId" AND mo2."deletedAt" IS NULL
AND mo1.id != mo2.id
AND (
(mo1."dateStart" < COALESCE(mo2."dateEnd", 'infinity'::timestamp)
AND COALESCE(mo1."dateEnd", 'infinity'::timestamp) > mo2."dateStart") OR
(mo2."dateStart" < COALESCE(mo1."dateEnd", 'infinity'::timestamp)
AND COALESCE(mo2."dateEnd", 'infinity'::timestamp) > mo1."dateStart")
)
WHERE m."tenantId" = '${tenantId}'
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
tenantId,
limit,
offset,
segmentId,
},
)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add unit tests for the new data quality functions

The new data quality functions are crucial for ensuring data integrity, but there are no associated unit tests. To maintain high code quality and prevent future regressions, consider adding unit tests that cover various scenarios and edge cases for these functions.

Would you like assistance in creating unit tests for these functions? I can help generate test cases to ensure they behave as expected.

Comment on lines +217 to +218
* @return {Promise<IMember[]>} A promise that resolves to an array of members
* with missing period information on their work experience.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Update the JSDoc @return description to match the function's purpose

The @return description mentions "members with missing period information on their work experience," but this function fetchMembersWithMissingInfoOnWorkExperience is intended to fetch members with missing information such as titles. Please update the description to accurately reflect the function's intent.

Apply this diff to correct the @return description:

 * @return {Promise<IMember[]>} A promise that resolves to an array of members
- * with missing period information on their work experience.
+ * with missing information (e.g., missing titles) on their work experience.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
* @return {Promise<IMember[]>} A promise that resolves to an array of members
* with missing period information on their work experience.
* @return {Promise<IMember[]>} A promise that resolves to an array of members
* with missing information (e.g., missing titles) on their work experience.

Comment on lines 299 to 337
export async function fetchMembersWithConflictingWorkExperiences(
qx: QueryExecutor,
tenantId: string,
limit: number,
offset: number,
segmentId: string,
): Promise<IMember[]> {
return qx.select(
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(DISTINCT mo1.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo1 ON m.id = mo1."memberId" AND mo1."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
INNER JOIN "memberOrganizations" mo2 ON mo1."memberId" = mo2."memberId" AND mo2."deletedAt" IS NULL
AND mo1.id != mo2.id
AND (
(mo1."dateStart" < COALESCE(mo2."dateEnd", 'infinity'::timestamp)
AND COALESCE(mo1."dateEnd", 'infinity'::timestamp) > mo2."dateStart") OR
(mo2."dateStart" < COALESCE(mo1."dateEnd", 'infinity'::timestamp)
AND COALESCE(mo2."dateEnd", 'infinity'::timestamp) > mo1."dateStart")
)
WHERE m."tenantId" = '${tenantId}'
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
tenantId,
limit,
offset,
segmentId,
},
)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Refactor the SQL query for readability and maintainability

The SQL query in fetchMembersWithConflictingWorkExperiences is complex due to nested conditions for date overlaps. To enhance readability and maintainability, consider refactoring the query using Common Table Expressions (CTEs) to separate logic into named subqueries.

Apply this refactored query:

 return qx.select(
   `
+      WITH overlapping_experiences AS (
+          SELECT DISTINCT mo1."memberId"
+          FROM "memberOrganizations" mo1
+          JOIN "memberOrganizations" mo2 ON mo1."memberId" = mo2."memberId"
+              AND mo1.id != mo2.id
+              AND mo1."deletedAt" IS NULL
+              AND mo2."deletedAt" IS NULL
+              AND (
+                  (mo1."dateStart" < COALESCE(mo2."dateEnd", 'infinity'::timestamp)
+                      AND COALESCE(mo1."dateEnd", 'infinity'::timestamp) > mo2."dateStart")
+              )
+      )
       SELECT
           m.id,
           m."displayName",
           m."attributes",
           msa."activityCount",
           COUNT(DISTINCT mo1.id) AS "organizationsCount"
       FROM "members" m
-            JOIN "memberOrganizations" mo1 ON m.id = mo1."memberId" AND mo1."deletedAt" IS NULL
-            INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
-            INNER JOIN "memberOrganizations" mo2 ON mo1."memberId" = mo2."memberId" AND mo2."deletedAt" IS NULL
-                AND mo1.id != mo2.id
-                AND (
-                    (mo1."dateStart" < COALESCE(mo2."dateEnd", 'infinity'::timestamp)
-                        AND COALESCE(mo1."dateEnd", 'infinity'::timestamp) > mo2."dateStart")
-                )
+            JOIN overlapping_experiences oe ON m.id = oe."memberId"
+            JOIN "memberOrganizations" mo1 ON m.id = mo1."memberId" AND mo1."deletedAt" IS NULL
+            INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
       WHERE m."tenantId" = '${tenantId}'
       GROUP BY m.id, msa."activityCount"
       ORDER BY msa."activityCount" DESC
       LIMIT ${limit} OFFSET ${offset};
     `,
     {
       tenantId,
       limit,
       offset,
       segmentId,
     },
   )

This refactoring isolates the overlapping logic into a CTE, making the main query cleaner and easier to understand.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export async function fetchMembersWithConflictingWorkExperiences(
qx: QueryExecutor,
tenantId: string,
limit: number,
offset: number,
segmentId: string,
): Promise<IMember[]> {
return qx.select(
`
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(DISTINCT mo1.id) AS "organizationsCount"
FROM "members" m
JOIN "memberOrganizations" mo1 ON m.id = mo1."memberId" AND mo1."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
INNER JOIN "memberOrganizations" mo2 ON mo1."memberId" = mo2."memberId" AND mo2."deletedAt" IS NULL
AND mo1.id != mo2.id
AND (
(mo1."dateStart" < COALESCE(mo2."dateEnd", 'infinity'::timestamp)
AND COALESCE(mo1."dateEnd", 'infinity'::timestamp) > mo2."dateStart") OR
(mo2."dateStart" < COALESCE(mo1."dateEnd", 'infinity'::timestamp)
AND COALESCE(mo2."dateEnd", 'infinity'::timestamp) > mo1."dateStart")
)
WHERE m."tenantId" = '${tenantId}'
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
tenantId,
limit,
offset,
segmentId,
},
)
}
export async function fetchMembersWithConflictingWorkExperiences(
qx: QueryExecutor,
tenantId: string,
limit: number,
offset: number,
segmentId: string,
): Promise<IMember[]> {
return qx.select(
`
WITH overlapping_experiences AS (
SELECT DISTINCT mo1."memberId"
FROM "memberOrganizations" mo1
JOIN "memberOrganizations" mo2 ON mo1."memberId" = mo2."memberId"
AND mo1.id != mo2.id
AND mo1."deletedAt" IS NULL
AND mo2."deletedAt" IS NULL
AND (
(mo1."dateStart" < COALESCE(mo2."dateEnd", 'infinity'::timestamp)
AND COALESCE(mo1."dateEnd", 'infinity'::timestamp) > mo2."dateStart")
)
)
SELECT
m.id,
m."displayName",
m."attributes",
msa."activityCount",
COUNT(DISTINCT mo1.id) AS "organizationsCount"
FROM "members" m
JOIN overlapping_experiences oe ON m.id = oe."memberId"
JOIN "memberOrganizations" mo1 ON m.id = mo1."memberId" AND mo1."deletedAt" IS NULL
INNER JOIN "memberSegmentsAgg" msa ON m.id = msa."memberId" AND msa."segmentId" = '${segmentId}'
WHERE m."tenantId" = '${tenantId}'
GROUP BY m.id, msa."activityCount"
ORDER BY msa."activityCount" DESC
LIMIT ${limit} OFFSET ${offset};
`,
{
tenantId,
limit,
offset,
segmentId,
},
)
}

@gaspergrom gaspergrom merged commit 38c9678 into main Oct 30, 2024
7 of 8 checks passed
@gaspergrom gaspergrom deleted the improvement/data-quality-assistant branch October 30, 2024 14:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant