ElasticSearch connector with date in index and custom document type #173

Merged
merged 4 commits into from
May 4, 2017

robvadai commented May 3, 2017

The ElasticSearch connector is updated with:

  • support for an auto-generated date/time in index names, so that one index is created per day instead of a single ever-growing index, as per the official guidelines on Retiring Data
  • support for a custom document type for the saved messages; document types allow custom indexing policies
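
As a rough illustration of the per-day index naming described above, here is a minimal sketch. The helper name `withDateSuffix`, the `-` separator and the date pattern are assumptions made for the example; they are not taken from the connector's code.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

object IndexNames {
  // Hypothetical helper: appends a formatted date to a base index name.
  // The "-" separator and the pattern passed in are illustrative assumptions.
  def withDateSuffix(base: String, pattern: String, date: LocalDate): String =
    s"$base-${date.format(DateTimeFormatter.ofPattern(pattern))}"
}
```

Calling `IndexNames.withDateSuffix("events", "yyyy.MM.dd", LocalDate.of(2017, 5, 3))` yields `events-2017.05.03`, so each day's records land in their own index and old indexes can be dropped whole.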

@@ -29,4 +29,16 @@ object ElasticSinkConfigConstants {
val URL_PREFIX_DEFAULT = "elasticsearch"
val EXPORT_ROUTE_QUERY = "connect.elastic.sink.kcql"
val EXPORT_ROUTE_QUERY_DOC = "KCQL expression describing field selection and routes."

val INDEX_NAME_SUFFIX = "connect.elastic.index.suffix"

@robvadai
Shouldn't we add support for this in KCQL? The reason I am asking is for scenarios where you route messages from topic1 to index1 and topic2 to index2, and maybe you want only one of the indexes to have a suffix. Furthermore, you might use a different document type for messages coming from two different topics.
Thoughts?

stheppi commented May 3, 2017

KCQL example: INSERT INTO $THE_INDEX SELECT * FROM $THE_TOPIC [WITHDOCTYPE($docType)] [WITHINDEXSUFFIX($suffix)] [AUTOCREATE]

Autocreate - already exists in the KCQL grammar
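
To make the per-topic idea concrete, here is a small sketch of what route-level settings could look like once parsed. `WITHDOCTYPE` and `WITHINDEXSUFFIX` are only proposed in this thread, so the `Route` case class and its field names are hypothetical and do not reflect the KCQL library's model.

```scala
// Hypothetical per-route settings; field names are illustrative only.
final case class Route(
  topic: String,
  index: String,
  docType: Option[String] = None,     // would come from WITHDOCTYPE(...)
  indexSuffix: Option[String] = None  // would come from WITHINDEXSUFFIX(...), already resolved
)

object Routes {
  // Returns (index name, document type) for a topic, if a route exists.
  // Routes without their own document type fall back to defaultDocType.
  def target(routes: Seq[Route], topic: String, defaultDocType: String): Option[(String, String)] =
    routes.find(_.topic == topic).map { r =>
      (r.index + r.indexSuffix.getOrElse(""), r.docType.getOrElse(defaultDocType))
    }
}
```

With this shape, topic1 can write to a suffixed index while topic2 uses a plain index with its own document type, which is the scenario raised in the question above.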

  1. Yes, this could be part of KCQL; good idea. Updating KCQL was out of scope for me: we needed a way to support a date string in ES index names.
  2. We had a requirement to make sure indexes are never auto-created, because we create indexes with custom policies and document types. The connector was automatically creating missing indexes, which is what we wanted to avoid. Hence the conditional index creation added here: https://github.com/ConnectedHomes/stream-reactor/blob/9bc01bd61034037cacb14f5154bbfa01ae8e4b47/kafka-connect-elastic/src/main/scala/com/datamountaineer/streamreactor/connect/elastic/ElasticJsonWriter.scala#L42
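
The conditional creation described in point 2 can be sketched roughly as follows. The `IndexAdmin` trait is a hypothetical stand-in, not the API of any real Elasticsearch client, and this is not the code at the linked line.

```scala
// Hypothetical minimal client interface, used only for this illustration.
trait IndexAdmin {
  def exists(index: String): Boolean
  def create(index: String): Unit
}

object IndexCreation {
  // Creates the index only when auto-creation is enabled and the index is missing.
  // Returns true if an index was created by this call.
  def ensureIndex(admin: IndexAdmin, index: String, autoCreate: Boolean): Boolean =
    if (autoCreate && !admin.exists(index)) {
      admin.create(index)
      true
    } else false
}
```

With `autoCreate = false` the writer never creates anything, so indexes prepared in advance with custom policies and document types are left untouched.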


val DOCUMENT_TYPE = "connect.elastic.document.type"
val DOCUMENT_TYPE_DOC = "Sets the ElasticSearch document type. See https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-type-field.html for more info."
val DOCUMENT_TYPE_DEFAULT = ""

This should probably default to null, to avoid calling toOption over an empty string.
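
A short sketch of why the default matters, using only the standard library. The object and method names are hypothetical; the point is that `Option(null)` is `None`, while `Option("")` is `Some("")` and needs an extra filter.

```scala
object DocumentTypeSetting {
  // With a null default, wrapping the raw value is enough:
  // Option(null) is None, any real value is Some(value).
  def fromNullDefault(value: String): Option[String] = Option(value)

  // With an empty-string default, the empty case has to be filtered out
  // explicitly, otherwise "" is treated as a configured document type.
  def fromEmptyDefault(value: String): Option[String] =
    Option(value).filter(_.nonEmpty)
}
```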

stheppi merged commit 484d385 into lensesio:master on May 4, 2017

stheppi commented May 4, 2017

@robvadai thank you for the improvement

joscas commented May 4, 2017

👍
