Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Synonyms API - GET Request #5

Conversation

carlosdelest
Copy link
Owner

@carlosdelest carlosdelest commented May 25, 2023

Adds a GET request to synonyms:

GET /_synonyms/<synonym_set>?from=&size=

Where from and size can be used to paginate the synonym rules returned in the synonym set (default values 0 and 10 respectively).

Response:

{
    "count": 3,
    "synonyms_set": [
        {
            "id": "32rfs",
            "synonyms": "hello, hi"
        },
        {
            "id": "SpU9UogB5IirwVpS0H6T",
            "synonyms": "bye, goodbye"
        },
        {
            "id": "S5U9UogB5IirwVpS0H6U",
            "synonyms": "test => check"
        }
    ]
}

stu-elastic and others added 17 commits May 24, 2023 11:23
…#96305)

The AbstractHttpServerTransport can have the DefaultRestChannel append
a `Connection: close` and close the channel after the request.

This is an internal method for now, it will be used to implement graceful shutdown.
Parsing a hugely nested mapping can lead to stack overflow errors.
This PR introduces a protection against this by tracking the depth level
while parsing a object in the mapping, and throwing an error if the max depth
level is exceeded. To do this, we use a new mappingParserContext object
to parse every new mapping, and not reuse the same mappingParserContext.

Currently we do have a protection against super deep object introduced
in elastic#90285. But this protection is happening during dynamic mapping
update, when the mapping is already parsed. To avoid stack overflow
during parsing, we need this change as well.

Closes elastic#52098
Enforce the same license level as the advanced remote cluster security
feature for these APIs.

Relates: elastic#95714, elastic#96085
With this commit we implement chunking for the get stacktraces API in
the profiling plugin as typical response sizes are dozens of MB large.
This adds a dedicated method to retrieve shutdown metadata and check its type at
once as well as removes unnecessary optional handling of metadata.
…nes (elastic#96286)

A data stream can match multiple composable index templates which are
prioritised and the highest priority template is the currently in-use
template. A data stream should always have a matching template, for this
reason when we are removing a composable template we check that it is
not in use. If it is in use, we throw an error and the removal fails.

However, since a data stream can match multiple templates, removing the
currently in use template could mean that the data stream would start
using the next inline. In this case, the removal should be allowed. 

This is what this PR addresses. When we try to remove a composable
template, we do not check if this template is in use, but if this
template is the only template that a data stream could use. Only then we
fail the removal, otherwise the removal succeeds and the data stream
starts using the next in-line template.

Fixes: elastic#96022
For real-time get on Stateless, we'd need to first check the indexing shard whether it has
the document in its Translog, if not we might have to wait on the search shard and then
handle the GET locally.

Relates ES-5537
So that logger names go through as they are in MockLogAppender.
A completely idle `transport_worker` thread is reported as `0.0%` idle,
which is confusing. Moreover the docs on the network threading model do
not reflect the changes made in elastic#90482. This commit fixes both of those
things.
@@ -103,14 +104,14 @@ public int hashCode() {

public static class Response extends ActionResponse implements StatusToXContentObject {

private final Result result;
private final SynonymsManagementAPIService.UpdateSynonymsResult result;
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some refactoring for PUT. It didn't feel right that the SynonymsManagementAPIService depended on the request, so I defined the response enum in the SynonymsManagementAPIService and the response now uses it.

.setSize(size)
.execute(listener.delegateFailure((searchResponseListener, searchResponse) -> {
final long totalSynonymRules = searchResponse.getHits().getTotalHits().value;
if (totalSynonymRules == 0) {
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should forbid creating a Synonyms Set with 0 elements, as we have no way of understanding whether it does not exist or it has 0 elements


public class PutSynonymsActionRequestSerializingTests extends AbstractWireSerializingTestCase<PutSynonymsAction.Request> {
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored out to a SynonymsTestUtils class

@carlosdelest carlosdelest marked this pull request as draft May 25, 2023 13:54
This moves the knn check for adjusting search type to the top of the method instead of possibly 
adjusting the search type multiple times. Purely mechanical change for readability.
@carlosdelest carlosdelest changed the title Carlosdelest/synonyms get Synonyms API - GET Request May 25, 2023
mark-vieira and others added 9 commits May 25, 2023 07:31
This commit replaces the bare `Consumer` and/or `Runnable` used to
represent the global checkpoint sync action with a dedicated interface
so that we can document it, and to ease a future change that will need
to add more parameters to this interface.
I think this also relates to reversing the order of sending response and
unregistration in elastic#94865.

Closes elastic#96103
The changes in this PR extend `RoleDescriptor` to support 
specifying `workflows` `restriction` for API keys:

```
POST /_security/api_key
{
  "name": "my_restricted_api_key",
  "role_descriptors": {
    "my_restricted_role": {
      "indices": [
        {
          "names": ["books", "movies"],
          "privileges": ["read"]
        }
      ],
      "restriction": {
        "workflows": ["my_workflow"]
      }
    }
  }
}
```

The workflows will be defined statically and used to restrict 
the API key's access to only specific REST APIs.

To keep the scope of the PR down, I will address below items in follow up PRs:

- Workflow definition and resolution
- Validation logic to verify that a workflow name is valid
- Blocking workflows restriction to be specified via Roles API
- Blocking file-based definitions of roles with workflows restriction
- Enforcing of workflow restrictions for API keys
Currently we have a number of input streams that specifically override
the skip() method disabling the ability to skip bytes. In each case the
skip implementation works as we have properly implemented the
read(byte[]) methods used to discard bytes. However, we appear to have
disabled it as it would be possible to retry from the end of a skip if
there is a failure in the middle. At this time, that optimization is not
really necessary, however, we sporadically used skip so it would be nice
for the IS to support the method. This commit enables the super.skip()
and adds a comment about future optimizations.
DaveCTurner and others added 28 commits May 30, 2023 11:45
In a busy cluster the list-tasks API may retain information about a very
large number of tasks while waiting for all nodes to respond. This
commit makes the API cancellable so that unnecessary partial results can
be released earlier.

Relates elastic#96279, which implements the early-release functionality.
This adds a `data_lifecycle` section to the _xpack/usage API, giving basic information about data lifecycles in the cluster. The data looks something like:
```
    "data_lifecycle": {
        "available": true,
        "enabled": true,
        "lifecycle": {
            "count": 1,
            "default_rollover_used": true,
            "retention": {
                "minimum_millis": 360000,
                "maximum_millis": 360000,
                "average_millis": 360000.0
            }
        }
    }
```
Correct and add more tests for adding null_value parameter for the
rank_feature field.

Relates to elastic#95811, closes elastic#95149
We did not define if `cluster:monitor/xpack/usage/data_lifecycle` is
operator only or non operator action which resulted in the following
failure:

```
java.lang.AssertionError: Actions are neither operator-only nor non-operator: [[cluster:monitor/xpack/usage/data_lifecycle]]. An action should be declared to be either operator-only in [org.elasticsearch.xpack.security.operator.OperatorOnlyRegistry] or non-operator in [org.elasticsearch.xpack.security.operator.Constants] |  
```

https://gradle-enterprise.elastic.co/s/qs25bf66nam6s/tests/:x-pack:plugin:security:qa:operator-privileges-tests:javaRestTest/org.elasticsearch.xpack.security.operator.OperatorPrivilegesIT/testEveryActionIsEitherOperatorOnlyOrNonOperator?top-execution=1
Book-keeping PR to ensure that we don't prematurely un-ignore a doc
tests suite.
This commits fixes the incorrect pattern for TChar defined in RFC7230 section 3.2.6
`a-zA-z` was accidentally used and the pattern `a-zA-Z` should be used instead
NodeReplacement and NodeShutdown test perform reroute on empty cluster state.
This does nothing (on empty state) and could be safely removed.
This change adds support for unsigned long fields (and any fields that can provide numeric data) to 
use decay functions as described by (elastic#89603). This allows mapped fields developed as plugins to also 
have access to this set of functions provided they return IndexNumericFieldData from getForField.

Closes elastic#89603.
If the source node applies the cluster state before the target then it
responds with an `IllegalIndexShardStateException` which makes the
target retry. But we don't want any retries. This commit avoids the race
by delaying the response on the transport layer.

Closes elastic#96427
Moves ref-counting onto the `SnapshotShardContext` to more clearly
prevent an aborted snapshot from interacting with the `Store`.

Relates elastic#95316
* Fix recovery permit exhaustion message

* Use test framework utility to capture logged message
This class was quite hot in recent benchmarks of shared-cached based
searches and we can make instantiating the releasable locks a little cheaper.
Also, those same benchmarks showed a lot of visible time spent on
dealing with ref counts. I removed one layer of indirection in atomic
use from both the release-once and the abstract ref count which
should save a little in CPU caches as well.
This PR actively blocks creating cross-cluster API keys with another API
key to avoid the issue of derived API key ownership.

Relates: elastic#95714
This change uses `allocation.decision` instead of `Decision.single` so that we
are checking debug flag before formatting explanation message.
Most relevant changes:

- add api to allow concurrent query rewrite (GITHUB-11838 Add api to allow concurrent query rewrite apache/lucene#11840)
- knn query rewrite (Concurrent rewrite for KnnVectorQuery apache/lucene#12160)
- Integrate the incubating Panama Vector API (Integrate the Incubating Panama Vector API  apache/lucene#12311)

As part of this commit I moved the ES codebase off of overriding or relying on the deprecated rewrite(IndexReader) method in favour of using rewrite(IndexSearcher) instead. For score functions, I went for not breaking existing plugins and create a new IndexSearcher whenever we rewrite a filter, otherwise we'd need to change the ScoreFunction#rewrite signature to take a searcher instead of a reader.

Co-authored-by: ChrisHegarty <[email protected]>
Remove some allocations from the synchronized block but more
importantly:
* remove expensive stream operation on list that can be mutated in place
  and move it outside the synchronized block
* used proper navigable set methods instead of redundantly allocation
  head and tail sets
Avoid the use of KeyedLock, which has a high overhead for uncontended locks.
Reduce granularity of lock during #get to the actual region.

Relates elastic#96372
Porting the DLM permissions REST test to the new style of cluster tests.
This has the usual nice perks, and also allows us to remove the separate
`qa/with-security` package. 

No functional or test logic changes. 

I didn't suggest this as part of the PR review for
elastic#95512 so as not to block
that PR further, and also because I wasn't sure about the overhead of
making this change (it did end up taking some battling with gradle).
We track the commit generation as `indexVersion` in each
`BlobStoreIndexShardSnapshot`, apparently for "compatibility with v1.0".
This value has been unused for a long time, so with this commit we stop
tracking it and drop it from the metadata blob. This has no bwc
implications because the metadata is parsed in a manner that does not
require this field to be present.
…-get

# Conflicts:
#	server/src/main/java/org/elasticsearch/action/ActionModule.java
#	server/src/main/java/org/elasticsearch/action/synonyms/PutSynonymsAction.java
#	server/src/main/java/org/elasticsearch/action/synonyms/TransportPutSynonymsAction.java
#	server/src/main/java/org/elasticsearch/synonyms/SynonymsManagementAPIService.java
#	server/src/main/java/org/elasticsearch/synonyms/SynonymsSet.java
#	server/src/test/java/org/elasticsearch/action/synonyms/PutSynonymsActionRequestSerializingTests.java
#	server/src/test/java/org/elasticsearch/action/synonyms/PutSynonymsActionResponseSerializingTests.java
#	x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/security/user/InternalUsers.java
@carlosdelest
Copy link
Owner Author

Closing in favor of elastic#96467

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.