
Add getting started content #6834

Merged
merged 24 commits into from
Apr 8, 2024

Conversation

kolchfa-aws
Collaborator

Closes #6533

Checklist

  • By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developers Certificate of Origin.
    For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Fanit Kolchina <[email protected]>
Contributor

@vagimeli vagimeli left a comment


Well done! Clear, simple instructions that cover the key concepts and information users need. It clarified my understanding of OpenSearch terms and use cases too :)


You interact with OpenSearch clusters using the REST API, which offers a lot of flexibility. Through the REST API, you can change most OpenSearch settings, modify indexes, check the health of the cluster, get statistics---almost everything. You can use clients like [cURL](https://curl.se/) or any programming language that can send HTTP requests.

You can send HTTP requests in your terminal or in the Dev Tools console in OpenSearch Dashboards.
Contributor


Should "Dev Tools console" hyperlink to the documentation? {{site.url}}{{site.baseurl}}/dashboards/dev-tools/index-dev/
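The REST interaction described above can be sketched without a running cluster by building (but not sending) the equivalent HTTP request in Python. This assumes a hypothetical local cluster at `http://localhost:9200`, OpenSearch's default REST port; it is an illustration, not part of the tutorial.

```python
import urllib.request

# A sketch only: construct, but do not send, a cluster health-check request,
# assuming a hypothetical local cluster at http://localhost:9200.
req = urllib.request.Request(
    url="http://localhost:9200/_cluster/health?pretty",
    method="GET",
)
print(req.get_method())  # GET
print(req.full_url)      # http://localhost:9200/_cluster/health?pretty
# To actually send it against a running cluster: urllib.request.urlopen(req)
```

The same request in curl would simply be `curl "http://localhost:9200/_cluster/health?pretty"`.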


For more information about `pretty` and other useful query parameters, see [Common REST parameters]({{site.url}}{{site.baseurl}}/opensearch/common-parameters/).

For requests that contain a body, specify the `Content-Type` header and provide the request payload in the `-d` (data) oprion:
Contributor


Suggested change
For requests that contain a body, specify the `Content-Type` header and provide the request payload in the `-d` (data) oprion:
For requests that contain a body, specify the `Content-Type` header and provide the request payload in the `-d` (data) option:
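The corrected sentence describes supplying a `Content-Type` header and a body via `-d`. A minimal Python sketch of the same request follows; the `students` index, document ID, and fields are illustrative only, and the request is built but not sent.

```python
import json
import urllib.request

# Sketch mirroring: curl -X PUT -H 'Content-Type: application/json' -d '{...}'
# The index name `students` and the document contents are illustrative.
doc = {"name": "John Doe", "gpa": 3.89, "grad_year": 2022}
req = urllib.request.Request(
    url="http://localhost:9200/students/_doc/1",
    data=json.dumps(doc).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="PUT",
)
print(req.get_method())                # PUT
print(req.get_header("Content-type"))  # application/json
```

Note that `urllib` normalizes header capitalization internally; the wire format is the same as the curl form.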

kolchfa-aws and others added 2 commits April 3, 2024 15:32
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
}
```

Both `John Doe` and `Jane Doe` matched the word `doe`, but `John Doe` is scored higher because it also matched `john`.
Contributor


It may be worth mentioning that the match query type uses OR as an operator by default, so the query is functionally doe OR john.
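The default-OR behavior mentioned in this comment can be made explicit in the query DSL. A sketch of the two equivalent bodies, using the `name` field from the tutorial's examples:

```python
import json

# The match query analyzes "john doe" into the terms ["john", "doe"] and, by
# default, combines them with OR, so the two bodies below are equivalent.
implicit = {"query": {"match": {"name": "john doe"}}}
explicit = {
    "query": {"match": {"name": {"query": "john doe", "operator": "or"}}}
}
print(json.dumps(explicit, indent=2))
```

Setting `"operator": "and"` instead would require both terms to match.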


## Search methods

Along with the traditional BM25 search described in this tutorial, OpenSearch supports a range of machine learning (ML)-powered search methods, including k-NN, semantic, multimodal, sparse, hybrid, and conversational search. For information about all search methods, see [Search]({{site.url}}{{site.baseurl}}/search-plugins/).
Contributor


It looks like the latest commit removed the description of BM25. (I think this is the only mention of BM25 in the page now.)

Maybe "Along with the traditional full-text search described..." ?

Comment on lines 133 to 135
Any index changes, such as document indexing or deletion, are written to disk during a Lucene commit. However, Lucene commits are expensive operations, so they cannot be performed after every change to the index. Instead, each shard records every indexing operation in a transaction log called _translog_. When a document is indexed, it is added to the memory buffer and recorded in the translog. After a process or host restart, any data in the in-memory buffer is lost. Recording the document in the translog ensures durability because the translog is written to disk.

Frequent refresh operations write the documents in the memory buffer to a segment and then clear the memory buffer. Periodically, a [flush](#flush) performs a Lucene commit, which includes writing the segments to disk using `fsync`, purging the old translog, and starting a new translog. Thus, a translog contains all operations that have not yet been flushed.
Contributor


I feel like this might be getting too detailed while muddying the key thing that users might need to know ("When is my data durable? When is my data searchable?").

I think we could say something like:

An indexing or bulk call responds when the documents have been written to the translog and the translog is flushed to disk, so the updates are durable. The updates will be visible from search requests until after a refresh operation (see below).

I almost feel like it would help to document these as steps in the lifecycle of an update, like:

  1. An update is received by a primary shard and gets written to the shard's transaction log, which is flushed to disk (followed by an fsync) before the update is acknowledged. This guarantees durability.
  2. The update is also passed to the Lucene index writer, which adds it to an in-memory buffer.
  3. On refresh, the Lucene index writer flushes the in-memory buffers to disk (with each buffer becoming a new Lucene segment), and a new index reader is opened over the resulting segment files. The updates are now visible for search.
  4. On a flush operation, the shard fsyncs the Lucene segments. Since the segment files are a durable representation of the updates, the translog is no longer needed to provide durability, so the updates can be purged from the translog.

If the OpenSearch process is terminated between the end of step 1 (when the update has been acknowledged) and the end of step 4 (when the updated Lucene segments have been flushed to disk), the updates will be replayed from the translog when the process restarts.

@smacrakis -- we talked briefly about this content. Is the above clearer or still too in-the-weeds?
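The four-step lifecycle proposed above can be modeled as a toy sketch, to make the "durable vs. searchable" distinction concrete. This is not OpenSearch code, just a simulation of when an indexed document becomes visible:

```python
# Toy model of the update lifecycle (steps 1 through 4 above).
# Not OpenSearch code; a sketch of when data is durable vs. searchable.
class Shard:
    def __init__(self):
        self.translog = []   # durable after fsync on each write (step 1)
        self.buffer = []     # Lucene index writer's in-memory buffer (step 2)
        self.segments = []   # searchable segments

    def index(self, doc):
        self.translog.append(doc)  # written and fsynced: durable
        self.buffer.append(doc)    # buffered: not yet searchable

    def refresh(self):
        if self.buffer:            # step 3: buffer becomes a new segment
            self.segments.append(list(self.buffer))
            self.buffer.clear()    # now visible to search

    def flush(self):
        self.refresh()             # step 4: segments fsynced to disk,
        self.translog.clear()      # so the translog can be purged

    def search(self, doc):
        return any(doc in seg for seg in self.segments)

s = Shard()
s.index("doc1")
print(s.search("doc1"))  # False: durable (in translog) but not yet searchable
s.refresh()
print(s.search("doc1"))  # True: visible after refresh
```

If the process dies before `flush()`, replaying the translog recovers `doc1`, which is the durability guarantee of step 1.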

}
```

You cannot change the mappings once the index is created.
Contributor


There are some mapping changes that are allowed. For example, new fields can be added. I believe you can change the search analyzer associated with a field.

Maybe "You cannot change the type of a field once it is created" ?

Collaborator Author


Changed to your suggestion. Also added "Changing a field type requires deleting the index and recreating it with the new mappings."
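A sketch of the distinction settled in this thread: new fields can be added to an existing mapping, while changing a field's type requires recreating the index. The index and field names below are illustrative, not taken from the tutorial.

```python
# Sketch: an initial mapping plus a body for adding a new field later.
# Adding new fields is allowed; changing an existing field's type requires
# deleting and recreating the index. Names here are illustrative.
initial_mappings = {
    "mappings": {
        "properties": {
            "name": {"type": "text"},
            "gpa": {"type": "float"},
        }
    }
}
# A later PUT /students/_mapping request could add a field without
# recreating the index:
add_field = {"properties": {"grad_year": {"type": "integer"}}}
```

Attempting instead to change, say, `gpa` from `float` to `keyword` on a live index would be rejected.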

Collaborator

@natebower natebower left a comment


@kolchfa-aws Great job putting this all together 😄. Please see my comments and changes and let me know if you have any questions. Thanks!

```
{% include copy-curl.html %}

This request returns no hits because the `keyword` fields must be matched exactly.
Collaborator

@natebower natebower Apr 4, 2024


Suggested change
This request returns no hits because the `keyword` fields must be matched exactly.
Then the request returns no hits because the `keyword` fields must exactly match.


This request returns no hits because the `keyword` fields must be matched exactly.

However, you can search for the exact text `John Doe`:
Collaborator


Same comment. Would this be better structured as "However, if you search for the exact text John Doe:

[Example]

Then OpenSearch returns..."?


### Filters

You can add a filter clause to your query for fields with exact values using a Boolean query.
Collaborator


The syntax here is slightly confusing. Do we mean "Using a Boolean query, you can add a filter clause to your query for fields with exact values"?

```
{% include copy-curl.html %}

Range filters support specifying a range of values. For example, the following Boolean query searches for students whose GPA is greater than 3.6:
Collaborator


Either "Range filters specify a range of values", "Range filters allow you to specify a range of values", or "With range filters, you can support a range of values".
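A sketch of the range-filter query the sentence describes, a Boolean query matching students whose GPA is greater than 3.6. The `gpa` field follows the tutorial's example; the exact body shape is an illustration.

```python
# Boolean query with a range filter: gpa > 3.6. The `gt` key is the
# greater-than bound; `gte`, `lt`, and `lte` are the other range operators.
query = {
    "query": {
        "bool": {
            "filter": [
                {"range": {"gpa": {"gt": 3.6}}}
            ]
        }
    }
}
```

Because it is a filter clause, this part of the query does not contribute to relevance scoring.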


## Search methods

Along with the traditional full-text search described in this tutorial, OpenSearch supports a range of machine learning (ML)-powered search methods, including k-NN, semantic, multimodal, sparse, hybrid, and conversational search. For information about all search methods, see [Search]({{site.url}}{{site.baseurl}}/search-plugins/).
Collaborator


Suggested change
Along with the traditional full-text search described in this tutorial, OpenSearch supports a range of machine learning (ML)-powered search methods, including k-NN, semantic, multimodal, sparse, hybrid, and conversational search. For information about all search methods, see [Search]({{site.url}}{{site.baseurl}}/search-plugins/).
Along with the traditional full-text search described in this tutorial, OpenSearch supports a range of machine learning (ML)-powered search methods, including k-NN, semantic, multimodal, sparse, hybrid, and conversational search. For information about all OpenSearch-supported search methods, see [Search]({{site.url}}{{site.baseurl}}/search-plugins/).


- In a database of students, a document might represent one student.
- When you search for information, OpenSearch returns documents related to your search.
- If you're familiar with traditional databases, a document represents a row.
Collaborator Author


Suggested change
- If you're familiar with traditional databases, a document represents a row.
- A document represents a row in a traditional database.


You can think of an index in several ways:

- If you have a collection of encyclopedia articles, an index represents the whole collection.
Collaborator Author


Suggested change
- If you have a collection of encyclopedia articles, an index represents the whole collection.
- In a database of students, an index represents all students in the database.


- If you have a collection of encyclopedia articles, an index represents the whole collection.
- When you search for information, you query data contained in an index.
- If you're familiar with traditional databases, a document represents a database table.
Collaborator Author


Suggested change
- If you're familiar with traditional databases, a document represents a database table.
- An index represents a database table in a traditional database.

- When you search for information, you query data contained in an index.
- If you're familiar with traditional databases, a document represents a database table.

For example, in a school database, an index might contain all students in the school.
Collaborator Author


Suggested change
For example, in a school database, an index might contain all students in the school.
For example, in a school database, an index might contain information about all students in the school.


## Clusters and nodes

OpenSearch is designed to be a distributed search engine. OpenSearch can run on one or more _nodes_---servers that store your data and process search requests. An OpenSearch *cluster* is a collection of nodes.
Collaborator Author


Suggested change
OpenSearch is designed to be a distributed search engine. OpenSearch can run on one or more _nodes_---servers that store your data and process search requests. An OpenSearch *cluster* is a collection of nodes.
OpenSearch is designed to be a distributed search engine, meaning that it can run on one or more _nodes_---servers that store your data and process search requests. An OpenSearch *cluster* is a collection of nodes.


You can run OpenSearch locally on a laptop---its system requirements are minimal---but you can also scale a single cluster to hundreds of powerful machines in a data center.

In a single-node cluster, such as a laptop, one machine has to do everything: manage the state of the cluster, index and search data, and perform any preprocessing of data prior to indexing it. As a cluster grows, however, you can subdivide responsibilities. Nodes with fast disks and plenty of RAM might be great at indexing and searching data, whereas a node with plenty of CPU power and a tiny disk could manage cluster state.
Collaborator Author


Suggested change
In a single-node cluster, such as a laptop, one machine has to do everything: manage the state of the cluster, index and search data, and perform any preprocessing of data prior to indexing it. As a cluster grows, however, you can subdivide responsibilities. Nodes with fast disks and plenty of RAM might be great at indexing and searching data, whereas a node with plenty of CPU power and a tiny disk could manage cluster state.
In a single-node cluster, such as one deployed on a laptop, one machine has to perform every task: manage the state of the cluster, index and search data, and perform any preprocessing of data prior to indexing it. As a cluster grows, however, you can subdivide responsibilities. Nodes with fast disks and plenty of RAM might perform well when indexing and searching data, whereas a node with plenty of CPU power and a tiny disk could manage cluster state.


### Full-text search

You can run a full-text search on fields mapped as `text`. By default, text fields are analyzed by the `default` analyzer. The analyzer splits text into terms and makes it lowercase. For more information about OpenSearch analyzers, see [Analyzers]({{site.url}}{{site.baseurl}}/analyzers/).
Collaborator Author


Suggested change
You can run a full-text search on fields mapped as `text`. By default, text fields are analyzed by the `default` analyzer. The analyzer splits text into terms and makes it lowercase. For more information about OpenSearch analyzers, see [Analyzers]({{site.url}}{{site.baseurl}}/analyzers/).
You can run a full-text search on fields mapped as `text`. By default, text fields are analyzed by the `default` analyzer. The analyzer splits text into terms and changes it to lowercase. For more information about OpenSearch analyzers, see [Analyzers]({{site.url}}{{site.baseurl}}/analyzers/).
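The analyzer behavior described in this paragraph can be roughly simulated. This is only a sketch; real analysis also handles punctuation, Unicode word boundaries, and more.

```python
# Rough simulation of what the default analyzer does to a text field:
# split the text into terms and lowercase them.
def toy_analyze(text: str) -> list:
    return [term.lower() for term in text.split()]

print(toy_analyze("John Doe"))  # ['john', 'doe']
```

This is why the match query for `john doe` in the tutorial finds documents containing `John Doe`: both sides are reduced to the same lowercase terms.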


### Keyword search

The `name` field contains the `name.keyword` subfield, which was added by OpenSearch automatically. You can try to search the `name.keyword` field in a manner similar to the previous request:
Collaborator Author


Suggested change
The `name` field contains the `name.keyword` subfield, which was added by OpenSearch automatically. You can try to search the `name.keyword` field in a manner similar to the previous request:
The `name` field contains the `name.keyword` subfield, which is added by OpenSearch automatically. If you search the `name.keyword` field in a manner similar to the previous request:


This request returns no hits because the `keyword` fields must be matched exactly.

However, you can search for the exact text `John Doe`:
Collaborator Author


Suggested change
However, you can search for the exact text `John Doe`:
However, if you search for the exact text `John Doe`:


### Filters

You can add a filter clause to your query for fields with exact values using a Boolean query.
Collaborator Author


Suggested change
You can add a filter clause to your query for fields with exact values using a Boolean query.
Using a Boolean query, you can add a filter clause to your query for fields with exact values.

```
{% include copy-curl.html %}

Range filters support specifying a range of values. For example, the following Boolean query searches for students whose GPA is greater than 3.6:
Collaborator Author


Suggested change
Range filters support specifying a range of values. For example, the following Boolean query searches for students whose GPA is greater than 3.6:
With range filters, you can specify a range of values. For example, the following Boolean query searches for students whose GPA is greater than 3.6:

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Collaborator

@natebower natebower left a comment


@kolchfa-aws Just a few minor comments/changes.


1. An update is received by a primary shard and is written to the shard's transaction log ([translog](#translog)). The translog is flushed to disk (followed by an fsync) before the update is acknowledged. This guarantees durability.
1. The update is also passed to the Lucene index writer, which adds it to an in-memory buffer.
1. On a [refresh operation](#refresh), the Lucene index writer flushes the in-memory buffers to disk (with each buffer becoming a new Lucene segment), and a new index reader is opened over the resulting segment files. The updates are now visible for search.
Collaborator


Is "over" the right preposition here?

Contributor


It's the word that I've heard Lucene developers use, because an IndexReader is like a moving window providing a view "over" a set of segments.

Maybe "with" would make more sense to a casual reader? That doesn't sound quite right, though...

Collaborator Author


Thanks! I'll keep "over"


### Translog

An indexing or bulk call responds when the documents have been written to the translog and the translog is flushed to disk, so the updates are durable. The updates will be visible from search requests until after a [refresh operation](#refresh).
Collaborator


Is "from" the right preposition here?

Contributor


"to" is probably better

Also, the word "not" is missing -- The updates will not be visible to search requests until after a refresh operation.

kolchfa-aws and others added 4 commits April 4, 2024 10:46
Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
@kolchfa-aws kolchfa-aws merged commit 246bb44 into main Apr 8, 2024
6 checks passed
@github-actions github-actions bot deleted the getting-started branch April 8, 2024 13:10
@kolchfa-aws kolchfa-aws added the backport 2.13 PR: Backport label for 2.13 label Apr 8, 2024
opensearch-trigger-bot bot pushed a commit that referenced this pull request Apr 8, 2024
* First iteration
* Add shard and node info
* Communicate section additions
* Change examples
* Remove extraneous files
* Update _getting-started/communicate.md
* Update _getting-started/intro.md
* Update _getting-started/intro.md
* Update _getting-started/intro.md
* Update _getting-started/search-data.md
* Apply suggestions from code review
* Apply suggestions from code review
* Tech review comments
* Add link to compound query section
* Added install types section
* Remove further reading suggestions
* Reorder sections
* Apply suggestions from code review
* Update _getting-started/intro.md
* Fix links
* Reword
* Reword
* Update _getting-started/intro.md

---------

Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
(cherry picked from commit 246bb44)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
github-actions bot pushed a commit that referenced this pull request Apr 8, 2024
Labels
backport 2.13 PR: Backport label for 2.13
Development

Successfully merging this pull request may close these issues.

[DOC] Add documentation on Getting Started in OpenSearch
4 participants