Add option to use auto-generated IDs on indexing #803

Merged
srishti-saraswat merged 4 commits into 14.0.x from CC-29638
Oct 23, 2024

srishti-saraswat (Contributor) commented Oct 22, 2024

Problem

Original PR - #679
PR to expose this on cloud - https://github.com/confluentinc/kafka-connect-elasticsearch-private/pull/44

Solution

Does this solution apply anywhere else?
  • yes
  • no
If yes, where?

Test Strategy

Tested on Docker playground.
Case 1: when use.autogenerated.ids is false
connector config -

{
     "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
     "tasks.max": "1",
     "topics": "test-elasticsearch-sink-testing1",
     "key.ignore": "true",
     "connection.url": "**************",
     "connection.username": "*******",
     "connection.password": "******",
     "type.name": "kafka-connect",
     "data.stream.namespace": "ssaraswat${topic}",
     "use.autogenerated.ids": "false"
}

result -

17:34:38 ℹ️ Check that the data is available in Elasticsearch Cloud
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2625  100  2625    0     0   2843      0 --:--:-- --:--:-- --:--:--  2840
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test-elasticsearch-sink-testing1",
        "_type" : "_doc",
        "_id" : "test-elasticsearch-sink-testing1+0+0",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value1"
        }
      },
      {
        "_index" : "test-elasticsearch-sink-testing1",
        "_type" : "_doc",
        "_id" : "test-elasticsearch-sink-testing1+0+6",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value7"
        }
      },
      {
        "_index" : "test-elasticsearch-sink-testing1",
        "_type" : "_doc",
        "_id" : "test-elasticsearch-sink-testing1+0+7",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value8"
        }
      },
      {
        "_index" : "test-elasticsearch-sink-testing1",
        "_type" : "_doc",
        "_id" : "test-elasticsearch-sink-testing1+0+8",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value9"
        }
      },
      {
        "_index" : "test-elasticsearch-sink-testing1",
        "_type" : "_doc",
        "_id" : "test-elasticsearch-sink-testing1+0+9",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value10"
        }
      },
      {
        "_index" : "test-elasticsearch-sink-testing1",
        "_type" : "_doc",
        "_id" : "test-elasticsearch-sink-testing1+0+1",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value2"
        }
      },
      {
        "_index" : "test-elasticsearch-sink-testing1",
        "_type" : "_doc",
        "_id" : "test-elasticsearch-sink-testing1+0+2",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value3"
        }
      },
      {
        "_index" : "test-elasticsearch-sink-testing1",
        "_type" : "_doc",
        "_id" : "test-elasticsearch-sink-testing1+0+3",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value4"
        }
      },
      {
        "_index" : "test-elasticsearch-sink-testing1",
        "_type" : "_doc",
        "_id" : "test-elasticsearch-sink-testing1+0+4",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value5"
        }
      },
      {
        "_index" : "test-elasticsearch-sink-testing1",
        "_type" : "_doc",
        "_id" : "test-elasticsearch-sink-testing1+0+5",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value6"
        }
      }
    ]
  }
}
          "f1" : "value1"
          "f1" : "value10"
          "f1" : "value10"
17:34:39 ℹ️ ####################################################
17:34:39 ℹ️ ✅ RESULT: SUCCESS for elasticsearch-cloud-sink.sh (took: 2min 14sec - )
17:34:39 ℹ️ ####################################################

17:34:43 ℹ️ 🧩 Displaying status for 🌎onprem connector elasticsearch-cloud-sink
Name                           Status       Tasks                                                        Stack Trace                                       
-------------------------------------------------------------------------------------------------------------
elasticsearch-cloud-sink       ✅ RUNNING  0:🟢 RUNNING[connect]        -                                                 
-------------------------------------------------------------------------------------------------------------

Case 2: when use.autogenerated.ids is true
connector config -

{
     "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
     "tasks.max": "1",
     "topics": "test-elasticsearch-sink-testing2",
     "key.ignore": "true",
     "connection.url": "**************",
     "connection.username": "*******",
     "connection.password": "******",
     "type.name": "kafka-connect",
     "data.stream.namespace": "ssaraswat${topic}",
     "use.autogenerated.ids": "true"
}

result -

17:38:21 ℹ️ Check that the data is available in Elasticsearch Cloud
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2465  100  2465    0     0   2285      0  0:00:01  0:00:01 --:--:--  2286
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test-elasticsearch-sink-testing2",
        "_type" : "_doc",
        "_id" : "joEgtJIBFHOU-SZarWdj",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value1"
        }
      },
      {
        "_index" : "test-elasticsearch-sink-testing2",
        "_type" : "_doc",
        "_id" : "j4EgtJIBFHOU-SZar2fH",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value2"
        }
      },
      {
        "_index" : "test-elasticsearch-sink-testing2",
        "_type" : "_doc",
        "_id" : "kIEgtJIBFHOU-SZar2fH",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value3"
        }
      },
      {
        "_index" : "test-elasticsearch-sink-testing2",
        "_type" : "_doc",
        "_id" : "kYEgtJIBFHOU-SZar2fH",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value4"
        }
      },
      {
        "_index" : "test-elasticsearch-sink-testing2",
        "_type" : "_doc",
        "_id" : "koEgtJIBFHOU-SZar2fH",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value5"
        }
      },
      {
        "_index" : "test-elasticsearch-sink-testing2",
        "_type" : "_doc",
        "_id" : "k4EgtJIBFHOU-SZar2fH",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value6"
        }
      },
      {
        "_index" : "test-elasticsearch-sink-testing2",
        "_type" : "_doc",
        "_id" : "lIEgtJIBFHOU-SZar2fH",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value7"
        }
      },
      {
        "_index" : "test-elasticsearch-sink-testing2",
        "_type" : "_doc",
        "_id" : "lYEgtJIBFHOU-SZar2fH",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value8"
        }
      },
      {
        "_index" : "test-elasticsearch-sink-testing2",
        "_type" : "_doc",
        "_id" : "loEgtJIBFHOU-SZar2fH",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value9"
        }
      },
      {
        "_index" : "test-elasticsearch-sink-testing2",
        "_type" : "_doc",
        "_id" : "l4EgtJIBFHOU-SZar2fH",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value10"
        }
      }
    ]
  }
}
          "f1" : "value1"
          "f1" : "value10"
          "f1" : "value10"
17:38:22 ℹ️ ####################################################
17:38:22 ℹ️ ✅ RESULT: SUCCESS for elasticsearch-cloud-sink.sh (took: 2min 19sec - )
17:38:22 ℹ️ ####################################################

17:38:28 ℹ️ 🧩 Displaying status for 🌎onprem connector elasticsearch-cloud-sink
Name                           Status       Tasks                                                        Stack Trace                                       
-------------------------------------------------------------------------------------------------------------
elasticsearch-cloud-sink       ✅ RUNNING  0:🟢 RUNNING[connect]        -                                                 
-------------------------------------------------------------------------------------------------------------
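One way to check this result mechanically, sketched under assumptions: the helper names below are hypothetical, and the check relies only on the observation that the IDs in the Case 2 output (such as `joEgtJIBFHOU-SZarWdj`) do not have the `topic+partition+offset` shape. It takes an already-parsed search response and reports whether every hit has a non-offset-based `_id`.

```python
import re

# Matches IDs shaped like "<topic>+<partition>+<offset>".
OFFSET_ID = re.compile(r"^.+\+\d+\+\d+$")


def is_offset_based(doc_id: str) -> bool:
    """True if doc_id looks like topic+partition+offset."""
    return OFFSET_ID.match(doc_id) is not None


def all_autogenerated(response: dict) -> bool:
    """True if the response has hits and none of their IDs is offset-based."""
    hits = response["hits"]["hits"]
    return bool(hits) and all(not is_offset_based(h["_id"]) for h in hits)


# Trimmed-down responses using IDs copied from this PR's test output.
offset_ids = {"hits": {"hits": [{"_id": "test-elasticsearch-sink-testing1+0+0"}]}}
generated_ids = {"hits": {"hits": [{"_id": "joEgtJIBFHOU-SZarWdj"},
                                   {"_id": "j4EgtJIBFHOU-SZar2fH"}]}}

print(all_autogenerated(offset_ids))     # False
print(all_autogenerated(generated_ids))  # True
```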

Case 3: when use.autogenerated.ids is true but key.ignore is false
connector config -

{
     "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
     "tasks.max": "1",
     "topics": "test-elasticsearch-sink-testing6",
     "key.ignore": "false",
     "connection.url": "**************",
     "connection.username": "*******",
     "connection.password": "******",
     "type.name": "kafka-connect",
     "data.stream.namespace": "ssaraswat${topic}",
     "use.autogenerated.ids": "true"
}

result -

      {
        "_index" : "test-elasticsearch-sink-testing6",
        "_type" : "_doc",
        "_id" : "U4Htt5IBFHOU-SZax6Qn",
        "_score" : 1.0,
        "_source" : {
          "f1" : "value10"
        }
      }
    ]
  }
}
          "f1" : "value1"
          "f1" : "value10"
          "f1" : "value10"
11:21:14 ℹ️ ####################################################
11:21:14 ℹ️ ✅ RESULT: SUCCESS for elasticsearch-cloud-sink.sh (took: 1min 45sec - )
11:21:14 ℹ️ ####################################################

11:21:18 ℹ️ 🧩 Displaying status for 🌎onprem connector elasticsearch-cloud-sink
Name                           Status       Tasks                                                        Stack Trace                                       
-------------------------------------------------------------------------------------------------------------
elasticsearch-cloud-sink       ✅ RUNNING  0:🟢 RUNNING[connect]        -                                                 
-------------------------------------------------------------------------------------------------------------
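Taken together, the three cases suggest the following decision table for where the document `_id` comes from. This is a sketch inferred from the test output above, not the connector's implementation: `IdStrategy` and `id_strategy` are hypothetical names, and the `RECORD_KEY` branch (both options false) is not exercised by these tests. It is assumed from the connector's documented `key.ignore` behavior.

```python
from enum import Enum


class IdStrategy(Enum):
    RECORD_KEY = "record key"
    TOPIC_PARTITION_OFFSET = "topic+partition+offset"
    AUTO_GENERATED = "generated by Elasticsearch"


def id_strategy(key_ignore: bool, use_autogenerated_ids: bool) -> IdStrategy:
    """Pick the ID source; use.autogenerated.ids wins, as Case 3 suggests."""
    if use_autogenerated_ids:
        return IdStrategy.AUTO_GENERATED
    if key_ignore:
        return IdStrategy.TOPIC_PARTITION_OFFSET
    return IdStrategy.RECORD_KEY  # assumption: not covered by Cases 1-3


print(id_strategy(key_ignore=True, use_autogenerated_ids=False).value)   # Case 1
print(id_strategy(key_ignore=True, use_autogenerated_ids=True).value)    # Case 2
print(id_strategy(key_ignore=False, use_autogenerated_ids=True).value)   # Case 3
```

Other connector settings may interact with these two options; the sketch covers only the combinations tested here.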

Testing done:
  • Unit tests
  • Integration tests
  • System tests
  • Manual tests

Release Plan

@confluent-cla-assistant

🎉 All Contributor License Agreements have been signed. Ready to merge.
Please push an empty commit if you would like to re-run the checks to verify CLA status for all contributors.

sp-gupta marked this pull request as ready for review October 22, 2024 12:44
sp-gupta requested a review from a team as a code owner October 22, 2024 12:44
@sonarqube-confluent

Passed

Analysis Details

1 Issue

  • 0 Bugs
  • 0 Vulnerabilities
  • 1 Code Smell

Coverage and Duplications

  • Coverage: 81.00% (58.50% estimated after merge)
  • Duplications: no duplication information (1.20% estimated after merge)

Project ID: kafka-connect-elasticsearch

srishti-saraswat merged commit e4860b8 into 14.0.x Oct 23, 2024
2 checks passed
srishti-saraswat deleted the CC-29638 branch October 23, 2024 06:33

3 participants