Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Media-type parser #61987

Merged
merged 32 commits into from
Sep 17, 2020
Merged
Changes from 1 commit
Commits
Show all changes
32 commits
Select commit Hold shift + click to select a range
5e42390
Split responsibility for format parsing
pgomulka Sep 2, 2020
01f001d
parse * and ndjson
pgomulka Sep 2, 2020
50a88a0
make format not accepting applicaiton/
pgomulka Sep 2, 2020
fb03ffd
post data request should parse applicaiton/json style
pgomulka Sep 2, 2020
09f281e
unused import
pgomulka Sep 2, 2020
8bc024f
fix sql parsing
pgomulka Sep 2, 2020
070508c
split format and accept header
pgomulka Sep 3, 2020
3c7ab16
fix and todos
pgomulka Sep 3, 2020
59a7f42
Merge branch 'master' into xcontent_format_parsing
pgomulka Sep 3, 2020
968b1c9
media type parser
pgomulka Sep 4, 2020
46f8f33
media type parser
pgomulka Sep 4, 2020
cbbe093
precommit
pgomulka Sep 4, 2020
222caee
rename and null check
pgomulka Sep 7, 2020
6bdec13
Merge branch 'master' into header_version_split
pgomulka Sep 7, 2020
ee97564
Merge branch 'master' into header_version_split
pgomulka Sep 7, 2020
7f52e11
fix text format parsing
pgomulka Sep 7, 2020
63db70c
Merge branch 'master' into header_version_split
pgomulka Sep 8, 2020
57ddb40
Apply suggestions from code review
pgomulka Sep 9, 2020
b5f1eff
code review follow up
pgomulka Sep 9, 2020
90e798d
Merge branch 'header_version_split' of github.com:pgomulka/elasticsea…
pgomulka Sep 9, 2020
fa49be4
javadoc and validation
pgomulka Sep 14, 2020
31d92ac
Update x-pack/plugin/sql/src/main/java/org/elasticsearch/xpack/sql/pl…
pgomulka Sep 14, 2020
a925fbe
Merge branch 'header_version_split' of github.com:pgomulka/elasticsea…
pgomulka Sep 14, 2020
23c4e41
javadoc fix
pgomulka Sep 14, 2020
b1e3fb1
remove shortName
pgomulka Sep 14, 2020
c17a895
javadoc fix
pgomulka Sep 14, 2020
7d6bd08
fix compile error
pgomulka Sep 14, 2020
3c93954
fix test compile
pgomulka Sep 14, 2020
77068a8
Merge branch 'master' into header_version_split
pgomulka Sep 15, 2020
8fb0cd4
Merge branch 'master' into header_version_split
pgomulka Sep 15, 2020
63fb6c7
remove todo and a testcase
pgomulka Sep 16, 2020
4598a0a
Apply suggestions from code review
pgomulka Sep 17, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
media type parser
pgomulka committed Sep 4, 2020
commit 968b1c98da20f2efecd989d6b5926e4241ec46e0
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
/*
* Licensed to Elasticsearch under one or more contributor
* license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright
* ownership. Elasticsearch licenses this file to you under
* the Apache License, Version 2.0 (the "License"); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

package org.elasticsearch.common.xcontent;

import java.util.Map;

public interface MediaType {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add javadocs to the class and methods?

String type();
String subtype();
String format();
Copy link
Contributor

@jakelandis jakelandis Sep 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Format, shortName, and subType seem like 3 ways to represent the same thing, and would be great if we could get to a common term (subtype) across the board. I think we can just drop format in favor of subtype ?

Copy link
Contributor Author

@pgomulka pgomulka Sep 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in some cases these 3 are actually different. For instance - text/plain has subtype=plain, but format and shortName txt
similarly for versioned media types I plan this to be: application/vnd.elasticsearch+json;compatible-with=7 subtype is vnd.elasticsearch+json, format is json and shortName json

I think shortName is redundant and I will give it a go to remove it


default String typeSubtype(){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typeWithSubtype or or just get() or asString

Copy link
Contributor Author

@pgomulka pgomulka Sep 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typeWithSubtype looks best to me. (although asString is tempting too..)

return type()+"/"+subtype();
pgomulka marked this conversation as resolved.
Show resolved Hide resolved
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,114 @@
/*
* Licensed to Elasticsearch under one or more contributor
* license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright
* ownership. Elasticsearch licenses this file to you under
* the Apache License, Version 2.0 (the "License"); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

package org.elasticsearch.common.xcontent;

import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Objects;

public class XContentTypeParser<T extends MediaType> {
private Map<String, T> formatToXContentType = new HashMap<>();
private Map<String, T> typeSubtypeToMediaType = new HashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typeWithSubTypeToMediaType or stringToMediaType


public XContentTypeParser(T[] acceptedXContentTypes) {
for (T xContentType : acceptedXContentTypes) {
typeSubtypeToMediaType.put(xContentType.typeSubtype(), xContentType);
formatToXContentType.put(xContentType.format(), xContentType);
}
}

public T fromMediaType(String contentTypeHeader) {
ParsedMediaType parsedMediaType = parseMediaType(contentTypeHeader);
return parsedMediaType != null ? parsedMediaType.getMediaType() : null;
}

public T fromFormat(String mediaType) {
return formatToXContentType.get(mediaType.toLowerCase(Locale.ROOT));
}

public XContentTypeParser withAdditionalMediaType(String typeSubtype, T xContentType) {
typeSubtypeToMediaType.put(typeSubtype.toLowerCase(Locale.ROOT), xContentType);
formatToXContentType.put(xContentType.format(), xContentType);
return this;
}

public ParsedMediaType parseMediaType(String mediaType) {
if (mediaType != null) {
String headerValue = mediaType.toLowerCase(Locale.ROOT);
// split string on semicolon
// validate media type is accepted (IIRC whitespace is ok so trim it) //TODO PG whitespace only ok in params
// rest of strings are params. validate per RFC 7230 and use ones that we care about
// or use a regex and we can change if necessary
String[] split = headerValue.split(";");

String[] typeSubtype = split[0].toLowerCase(Locale.ROOT)
.split("/");
if (typeSubtype.length == 2) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I completely follow the format variant of parsing... but I think this method does not support that... right ?

The previous version accepted either variants (format (json) or the mime type (application/json) ... so splitting these requires the caller to know know which one to call.

I see two cases TextTemplateEngine and RestSqlQueryAction where it is not immediately obvious that is correct. It seems strange that we would support this shorter name in some cases for parsing headers, but not others, and when the the format variant is used (e.g. json) is used we don't support parsing the parameters.

Is possible to just support the mime type for all content-type and accept header parsing ? (leaving the format soley to the realm of the query string parameter)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4x yes.
so this is exactly the intention here.
re1 The parseMediaType is not supporting the format parsing.
re2 The caller has to know which one to call.
re3 There should not be a place were we parse a format but a value is taken from Accept or Content-Type headers. Values from headers should follow media-type format (type/subtype;parameters)
re4 I hope after this PR this will be clear that mime types are for headers, format for query parameter

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks.. i double checked TextTemplateEngine and while it appears that the former supported either application/json or json, I think we are fine (as-is here) to only support json for these reasons: 1) only applicable to inline mustache scripts in watcher 2) it is not documented and only lightly tested 3) only found one example of its usage in the wild (and it used json). Given the lack of documentation i doubt this is used much at all and if so likely uses the short form json.

I have not validated RestSqlQueryAction behavior

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RestSqlQueryAction was reviewed by @bpintea

String type = typeSubtype[0];
String subtype = typeSubtype[1];
T xContentType = typeSubtypeToMediaType.get(type + "/" + subtype);
if (xContentType != null) {
Map<String, String> parameters = new HashMap<>();
for (int i = 1; i < split.length; i++) {
String[] keyValueParam = split[i].trim().split("=");
parameters.put(keyValueParam[0].toLowerCase(Locale.ROOT), keyValueParam[1].toLowerCase(Locale.ROOT));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to check for keyValueParam.size before accessing [1] in the case of a malformed key= (no value)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, will add test cases

}
return new ParsedMediaType(type, subtype, parameters, xContentType);
}
}

}
return null;
}

public class ParsedMediaType {
private final String type;
private final String subtype;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think type and subtype here are redundant since they can be found via the MediaType

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, will remove

private final Map<String, String> parameters;
private final T xContentType;

public ParsedMediaType(String type, String subtype, Map<String, String> parameters, T xContentType) {
this.type = type;
this.subtype = subtype;
this.parameters = parameters;
this.xContentType = xContentType;
}

public T getMediaType() {
return xContentType;
}

@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
ParsedMediaType parsedMediaType = (ParsedMediaType) o;
return Objects.equals(type, parsedMediaType.type) &&
Objects.equals(subtype, parsedMediaType.subtype) &&
Objects.equals(parameters, parsedMediaType.parameters) &&
xContentType == parsedMediaType.xContentType;
}

public Map<String, String> getParameters() {
return parameters;
}
}
}
Original file line number Diff line number Diff line change
@@ -24,14 +24,10 @@
import org.elasticsearch.common.xcontent.smile.SmileXContent;
import org.elasticsearch.common.xcontent.yaml.YamlXContent;

import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
* The content type of {@link org.elasticsearch.common.xcontent.XContent}.
*/
public enum XContentType {
public enum XContentType implements MediaType {

/**
* A JSON based content type.
@@ -114,18 +110,12 @@ public XContent xContent() {
return CborXContent.cborXContent;
}
};
/**
* A regexp to allow parsing media types. It covers two use cases.
* 1. Media type with a version - requires a custom vnd.elasticserach subtype and a compatible-with parameter
* i.e. application/vnd.elasticsearch+json;compatible-with
* 2. Media type without a version - for users not using compatible API i.e. application/json
*/
private static final Pattern MEDIA_TYPE_PATTERN = Pattern.compile(
//type
"^(application|text)/" +
"([^;\\s]+)" + //subtype: json,yaml,etc some of these are defined in x-pack so can't be enumerated
"(?:\\s*;.*)?$",
Pattern.CASE_INSENSITIVE);

public static XContentTypeParser xContentTypeParser = new XContentTypeParser(XContentType.values())
.withAdditionalMediaType("application/*", JSON)
.withAdditionalMediaType("application/x-ndjson", JSON);



/**
* Accepts either a format string, which is equivalent to {@link XContentType#shortName()} or a media type that optionally has
@@ -134,17 +124,7 @@ public XContent xContent() {
* format query string parameter. This method will return {@code null} if no match is found
*/
public static XContentType fromFormat(String mediaType) {

if (mediaType == null) {
return null;
}
for (XContentType type : values()) {
if (type.shortName().equalsIgnoreCase(mediaType)) {
return type;
}
}

return null;
return (XContentType) xContentTypeParser.fromFormat(mediaType);
}

/**
@@ -153,43 +133,9 @@ public static XContentType fromFormat(String mediaType) {
* HTTP header. This method will return {@code null} if no match is found
*/
public static XContentType fromMediaType(String mediaTypeHeaderValue) {
if (mediaTypeHeaderValue == null) {
return null;
}
// we also support newline delimited JSON: http://specs.okfnlabs.org/ndjson/
if (mediaTypeHeaderValue.toLowerCase(Locale.ROOT).equals("application/x-ndjson")) {
return XContentType.JSON;
}
if (mediaTypeHeaderValue.toLowerCase(Locale.ROOT).startsWith("application/*")) {
return JSON;
}

String mediaType = parseMediaType(mediaTypeHeaderValue);
for (XContentType type : values()) {
if (type.mediaTypeWithoutParameters().equals(mediaType)) {
return type;
}
}

return null;
return (XContentType) xContentTypeParser.fromMediaType(mediaTypeHeaderValue);
}

public static String parseMediaType(String mediaType) {
if (mediaType != null) {
Matcher matcher = MEDIA_TYPE_PATTERN.matcher(mediaType);
if (matcher.find()) {
return (matcher.group(1) + "/" + matcher.group(2)).toLowerCase(Locale.ROOT);
}
}

return null;
}

private static boolean isSameMediaTypeOrFormatAs(String stringType, XContentType type) {
return type.mediaTypeWithoutParameters().equalsIgnoreCase(stringType) ||
stringType.toLowerCase(Locale.ROOT).startsWith(type.mediaTypeWithoutParameters().toLowerCase(Locale.ROOT) + ";") ||
type.shortName().equalsIgnoreCase(stringType);
}

private int index;

@@ -211,4 +157,19 @@ public String mediaType() {

public abstract String mediaTypeWithoutParameters();


@Override
public String type() {
return "application";
}

@Override
public String subtype() {
return shortName();
}

@Override
public String format() {
return shortName();
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
/*
* Licensed to Elasticsearch under one or more contributor
* license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright
* ownership. Elasticsearch licenses this file to you under
* the Apache License, Version 2.0 (the "License"); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

package org.elasticsearch.common.xcontent;

import org.elasticsearch.test.ESTestCase;

import java.util.Collections;
import java.util.Map;

import static org.hamcrest.Matchers.equalTo;

public class XContentTypeParserTests extends ESTestCase {
XContentTypeParser xContentTypeParser = XContentType.xContentTypeParser;

public void testJsonWithParameters() throws Exception {
String mediaType = "application/json";
assertThat(xContentTypeParser.parseMediaType(mediaType).getParameters(),
equalTo(Collections.emptyMap()));
assertThat(xContentTypeParser.parseMediaType(mediaType + ";").getParameters(),
equalTo(Collections.emptyMap()));
assertThat(xContentTypeParser.parseMediaType(mediaType + "; charset=UTF-8").getParameters(),
equalTo(Map.of("charset", "utf-8")));
assertThat(xContentTypeParser.parseMediaType(mediaType + "; custom=123;charset=UTF-8").getParameters(),
equalTo(Map.of("charset", "utf-8", "custom", "123")));
}
}
Original file line number Diff line number Diff line change
@@ -26,6 +26,7 @@
import static org.hamcrest.Matchers.nullValue;

public class XContentTypeTests extends ESTestCase {

public void testFromJson() throws Exception {
String mediaType = "application/json";
XContentType expectedXContentType = XContentType.JSON;
@@ -34,6 +35,14 @@ public void testFromJson() throws Exception {
assertThat(XContentType.fromMediaType(mediaType + "; charset=UTF-8"), equalTo(expectedXContentType));
}

public void testFromNdJson() throws Exception {
String mediaType = "application/x-ndjson";
XContentType expectedXContentType = XContentType.JSON;
assertThat(XContentType.fromMediaType(mediaType), equalTo(expectedXContentType));
assertThat(XContentType.fromMediaType(mediaType + ";"), equalTo(expectedXContentType));
assertThat(XContentType.fromMediaType(mediaType + "; charset=UTF-8"), equalTo(expectedXContentType));
}

public void testFromJsonUppercase() throws Exception {
String mediaType = "application/json".toUpperCase(Locale.ROOT);
XContentType expectedXContentType = XContentType.JSON;
Loading