[BUG] Wrong mapping for nested documents produced by OpenSearch.Client #379

Closed
DumboJet opened this issue Sep 28, 2023 · 5 comments
Labels
bug Something isn't working

DumboJet commented Sep 28, 2023

What is the bug?

When mapping nested documents with OpenSearch.Client, the generated mappings appear wrong.

How can one reproduce the bug?

Run this code:

            string documentsIndexName = "test";

            var nodes = new Uri[]
            {
                new Uri(openSearchUrl)
            };

            var connectionPool = new StaticConnectionPool(nodes);
            var config = new ConnectionSettings(connectionPool)
                .BasicAuthentication("admin", "admin")
                .DefaultIndex(documentsIndexName);

            openSearchClient = new OpenSearchClient(config);

            Expression<Func<Document, string>> commentsPath = d => d.Comments;
            Expression<Func<Document, string>> pagesContentPath = d => d.Pages.First().Content;
            Expression<Func<Document, string>> errataContentPath = d => d.Errata.First().Content;

            var mapResponse = openSearchClient
                    .Indices
                    .Create(documentsIndexName, c => c.Map(m =>
                        m.AutoMap<Document>()
                            .Properties<Document>(p =>
                                p.Text(s => s.Name(commentsPath).Index(true).Store(true))
                                .Text(s => s.Name(pagesContentPath).Index(true).Store(true))
                                .Text(s => s.Name(errataContentPath).Index(true).Store(true))
                            )
                         )
                        .Settings(s => s.NumberOfShards(1).NumberOfReplicas(1))
                       );

And here is the definition of Document:


    public class Document
    {
        public int Id { get; set; }
        public string Title { get; set; }
        public string Comments { get; set; }
        public IEnumerable<Pages> Pages { get; set; }
        public IEnumerable<Errata> Errata { get; set; }
    }

    public class Pages
    {
        public int Number { get; set; }
        public string Content { get; set; }
    }

    public class Errata
    {
        public string Content { get; set; }
    }

What is the expected behavior?

I expect a mapping like this to be generated:

{
  "test2": {
    "mappings": {
      "properties": {
        "comments": {
          "type": "text",
          "store": true
        },
        "errata": {
          "properties": {
            "content": {                       // <----------------- Notice this field
              "type": "text",
              "store": true
            }
          }
        },
        "id": {
          "type": "integer"
        },
        "pages": {
          "properties": {
            "content": {                       // <----------------- Notice this field
              "type": "text",
              "store": true
            },
            "number": {
              "type": "integer"
            }
          }
        },
        "title": {
          "type": "text",
          "index": false
        }
      }
    }
  }
}

Using OpenSearch.Client and the C# code above, I get this mapping, instead:

{
  "test": {
    "mappings": {
      "properties": {
        "comments": {
          "type": "text",
          "store": true
        },
        "content": {                       // <----------------- Notice this field
          "type": "text",
          "store": true
        },
        "errata": {
          "properties": {
            "content": {                       // <----------------- Notice this field
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        },
        "id": {
          "type": "integer"
        },
        "pages": {
          "properties": {
            "content": {                       // <----------------- Notice this field
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "number": {
              "type": "integer"
            }
          }
        },
        "title": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}

If I manually go and create the mapping like this, it seems to work fine:

PUT test2
{
  "mappings": {
    "properties": {
      "id":    { "type" : "integer" },
      "title":     { "type" : "text", "index": false },
      "comments":{ "type" : "text", "store": true, "index": true },
      "pages.number":{ "type" : "integer" },
      "pages.content":{ "type" : "text", "store": true, "index": true },
      "errata.content":{ "type" : "text", "store": true, "index": true }
    }
  }
}

If I try to use strings (instead of lambdas) to specify the fields in the C# code, like this:

            string commentsPath = "comments";
            string pagesContentPath = "pages.content";
            string errataContentPath = "errata.content";

...then I get exceptions because fields are mapped twice with conflicting store configurations.
This is the exception message:

Request failed to execute. Call: Status code 400 from: PUT /test?pretty=true&error_trace=true. ServerError: Type: mapper_parsing_exception Reason: "Failed to parse mapping [_doc]: Mapper for [pages.content] conflicts with existing mapper:
	Cannot update parameter [store] from [false] to [true]" CausedBy: "Type: illegal_argument_exception Reason: "Mapper for [pages.content] conflicts with existing mapper:
	Cannot update parameter [store] from [false] to [true]""

And this is the mapping the library creates:

{
    "mappings": {
        "properties": {
            "comments": {
                "index": true,
                "store": true,
                "type": "text"
            },
            "errata": {
                "properties": {
                    "content": {                       // <----------------- Notice this field
                        "fields": {
                            "keyword": {
                                "ignore_above": 256,
                                "type": "keyword"
                            }
                        },
                        "type": "text"
                    }
                },
                "type": "object"
            },
            "errata.content": {                       // <----------------- Notice this field
                "index": true,
                "store": true,
                "type": "text"
            },
            "id": {
                "type": "integer"
            },
            "pages": {
                "properties": {
                    "content": {                       // <----------------- Notice this field
                        "fields": {
                            "keyword": {
                                "ignore_above": 256,
                                "type": "keyword"
                            }
                        },
                        "type": "text"
                    },
                    "number": {
                        "type": "integer"
                    }
                },
                "type": "object"
            },
            "pages.content": {                       // <----------------- Notice this field
                "index": true,
                "store": true,
                "type": "text"
            },
            "title": {
                "fields": {
                    "keyword": {
                        "ignore_above": 256,
                        "type": "keyword"
                    }
                },
                "type": "text"
            }
        }
    },
    "settings": {
        "index.number_of_replicas": 1,
        "index.number_of_shards": 1
    }
}

What is your host/environment?

Windows 11
OpenSearch.Client nuget version 1.5.0
OpenSearch Version: 2.9.0
OpenSearch Security Version: 2.9.0.0

Some extra notes:

  1. For the mappings, I have followed the example here: https://opensearch.org/docs/latest/clients/OSC-example/#mappings
  2. If I remove the .Index(true).Store(true) from the C# mapping code, the duplicate mappings go away when using field names instead of lambdas. But I need to set the store flag, so this is not an option for me.

Also, there is another bug, somewhat related to this issue:
When mapping a nullable property like public int? Value { get; set; }, the mapper correctly maps the field as an integer.
But when you map a collection of nullable values like public IEnumerable<int?> Values { get; set; }, it maps the members of the nullable type instead.
(The screenshot of the resulting mapping is not preserved here.)

DumboJet added the bug and untriaged labels on Sep 28, 2023
DumboJet changed the title from "[BUG] Wrong mapping for nested documents" to "[BUG] Wrong mapping for nested documents produced by OpenSearch.Client" on Sep 28, 2023
Xtansia removed the untriaged label on Sep 28, 2023

Xtansia (Collaborator) commented Sep 28, 2023

Hi @DumboJet,

The correct way to get your desired mapping with the fluent-mapping API is like so:

var res = await client.Indices.CreateAsync(index, i => i.Map<Document>(m => m
            .AutoMap()
            .Properties(p => p
                .Text(t => t
                    .Name(d => d.Title)
                    .Index(false))
                .Text(t => t
                    .Name(d => d.Comments)
                    .Store())
                .Object<Pages>(o => o
                    .Name(d => d.Pages)
                    .AutoMap()
                    .Properties(pp => pp
                        .Text(t => t
                            .Name(pg => pg.Content)
                            .Store())
                    ))
                .Object<Errata>(o => o
                    .Name(d => d.Errata)
                    .AutoMap()
                    .Properties(pp => pp
                        .Text(t => t
                            .Name(e => e.Content)
                            .Store())
                    )))));

This obviously can get a bit verbose, so you could also achieve this by modifying your document classes like so:

public class Document
{
    public int Id { get; set; }
    [Text(Index = false)]
    public string Title { get; set; }
    [Text(Store = true)]
    public string Comments { get; set; }
    public IEnumerable<Pages> Pages { get; set; }
    public IEnumerable<Errata> Errata { get; set; }
}

public class Pages
{
    public int Number { get; set; }
    [Text(Store = true)]
    public string Content { get; set; }
}

public class Errata
{
    [Text(Store = true)]
    public string Content { get; set; }
}

Then you can just use AutoMap like so:

var res = await client.Indices.CreateAsync(
            index, i => i
                .Map<Document>(m => m
                    .AutoMap()));

Additionally, could you please open a separate issue for your concerns about the nullable int mapping?
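
For anyone wanting to check what the client actually sends, here is a minimal sketch, not a definitive diagnosis. It assumes a local cluster at a placeholder URL, a hypothetical index name (mapping-check), and the attribute-annotated classes from the comment above. It prints how the client resolves the original lambda as a property name versus as a field path, which may explain the stray top-level "content" field in the report, and then dumps the request and the stored mapping via DebugInformation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using OpenSearch.Client;

// Assumption: placeholder URL and credentials; adjust for your cluster.
var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
    .BasicAuthentication("admin", "admin")
    // Buffer request/response bodies so DebugInformation can show them.
    .DisableDirectStreaming();
var client = new OpenSearchClient(settings);

// Resolve the same expression two ways: as a property name (what .Name(...)
// inside a Properties descriptor takes) and as a full field path.
var asProperty = client.Infer.PropertyName(
    Infer.Property<Document>(d => d.Pages.First().Content));
var asField = client.Infer.Field(
    Infer.Field<Document>(d => d.Pages.First().Content));
Console.WriteLine($"property name: {asProperty}, field path: {asField}");

// Create a throwaway index (hypothetical name) from the annotated classes.
var create = await client.Indices.CreateAsync("mapping-check", i => i
    .Map<Document>(m => m.AutoMap()));
Console.WriteLine(create.DebugInformation);

// Read the mapping back to compare it with the expected JSON.
var mapping = await client.Indices.GetMappingAsync<Document>(g => g.Index("mapping-check"));
Console.WriteLine(mapping.DebugInformation);

public class Document
{
    public int Id { get; set; }
    [Text(Index = false)]
    public string Title { get; set; }
    [Text(Store = true)]
    public string Comments { get; set; }
    public IEnumerable<Pages> Pages { get; set; }
    public IEnumerable<Errata> Errata { get; set; }
}

public class Pages
{
    public int Number { get; set; }
    [Text(Store = true)]
    public string Content { get; set; }
}

public class Errata
{
    [Text(Store = true)]
    public string Content { get; set; }
}
```

DisableDirectStreaming buffers every request and response in memory, so it is probably best left to debugging sessions rather than production settings.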

DumboJet (Author) commented

Oh, thanks for this!
It is hard figuring these things out without detailed documentation. :(
I will create a new issue and close this one. ;)

DumboJet (Author) commented

Here is the new bug: #380

assadnazar commented Apr 7, 2024

What if Pages is a single record and not an IEnumerable?

public class Document
{
    public int Id { get; set; }
    [Text(Index = false)]
    public string Title { get; set; }
    [Text(Store = true)]
    public string Comments { get; set; }
    public Pages Page { get; set; }
    public IEnumerable<Errata> Errata { get; set; }
}

public class Pages
{
    public string Content { get; set; }
}

public class Errata
{
    [Text(Store = true)]
    public string Content { get; set; }
}

How can we map a nested object in this case? Currently, I am getting this error:

# OriginalException: OpenSearch.Net.OpenSearchClientException: Request failed to execute. Call: Status code 400 from: POST /napsresponse/_doc. ServerError: Type: mapper_parsing_exception Reason: "failed to parse field [redirect_params] of type [keyword] in document with id 'B0xuuo4Baeqfga7bTYZu'. Preview of field's value: '{body=SXJlV1I0QlhIbU1xcnRXNGFTNitjc0VLcXlYZ0dJS2RsWm9ERXZyMzdVOVVhbk1vT3p4bXhrZHE0QWF5Qkpxcmc4azJrclRFY1lhRW1aNUdVc0pyQ3VqRVBTMXFtY09uWkk3TjBsQ1E4WnFEdm11c1lYRXhqM3NiZ01mZG9kTE9CQ25Dc054RDZjM0tsdzluN29VTWRLa2VLQmY5TFlWbk1nc3lYMVJTeERvVHpOMUF1ZEpWS0lHSkNBS1NFRUlTMUNqVDMzanp6Uk53d1VxN2hwaGJoNDhzRkkzWEVyMjJMWm9abG5NazExZTZsaVFJdlh2RkxsMG0wRHVUaG4vRDN2OUt3L2NzbGZDRy9lUnhGWWp0ZEloRTdpUHpVdlVzbStscVNab0V0bWVRRDFOTDZlbk9XeWdpZWU1U3NHakxXcFJ3V0ovQUlYNEw4d0FBUTk2ci9hY055ckFxeHJWMXlQV1JOc1FKZGMxeUZ4T1lnL0huMktSdDE3elhsTFZZNG1aaVlPKzJnOEllM29oTFQvd1RzWklBYzdPTDFMOHVhMFMzcHBXeG1rLzJZTGlvb2YwUHZ4QTlXZjNSMFgxcHhDSmpoemNjY2dhNUlvc0F2Y0pra3BzbXhmanB6dWZZK2Z2bnlXYVNwRjE1alk5WW9iN25hRDVhTENIVjBKdTZUVWkxN0dVT01jRjVMNytRVE0vK1ArbDF4eXdENEhIMXlmeElkamtxVWZ2MkpMa1ErZEc0dXVFaG51V0JqNmljSzJsOWVZZ3NsenN3a1ZOMmdkcExhekxCa29xc2xQL1dGMkloYVQ1UFRIUXkvQlVlSTAxUWxRRkRndGRsY2lUL2xTbE1UZ0E1NHFBK3ZyVDJsZ2sxOFJPTThTZ3lZUW5MZnV2aTl1dmFIaDBwejBCSDVzTzNiYUs4WllyS3p2UW1abERJWHZveGFMeEVwVXJEcEZGTlFBQnZ3YllEb0lGTEN6QlFqalcyLzYwRm1CZEpndXNYQjhSbnZUYW13OFY3YlU1bVNkREVhUEYza2J5VXJOTnBObnJCc1Z4cVIzTytDVllaL251aExjbWRHRE9ybXR0dnBsUTdSTmtFUDhCNk9tZGVvQ1dRRTBOTERyY3JhV2I0V0lROW8zdDJVNnoyREp3ZWpaOXAwVms5Rm1xQVZEOFZQZmRlVTRoMzRKb0V0cTBrMDdBbHp5LzhqY0dEVDBRY2RjcTlpZkx1ejFKL1c1bXpaRW5BR3grRUJzMC9QMnN2OWFOeUpnS0tjblBES1F2cUJ4eHlVV3lxb0Q4SGpMaTZnY0xkL0NnRms5aVFaQk9OUW1iVE84YWNWSUQrbVFEM2dpeGI3QnlYQUVpSENMVG5INzB5U0JFVVFpWS91R2w5SnZCcGVLZ3lRZUJ1Z1o5NjYydzhrK25TSUI0NnVqUlRjdDBLTmVvYUFPYzh3VmVHbVc0Q3VKamp6WUtDQUJXUzFZc0sxVjJTeUJuSmZsbStoZkN1dVFVaWl0a1h2blJpMUtlWE84ZmdxcFR5akVrc3R0dWNvS2FaNkN5VXp5N0JUcVY3WXUvVGxjY2QxZDlHbXBhYmNYdHlGdjNPckx4dGdqMU1ZTkNRYTNDYmJrSFd5ZzJYUW0wSS9yWVdidzhGU09qVWUxSmhGckh5NnlZM
jB0S1NRdEJtMzFWcDRYRFhGa1I0RHdPTk0wamorME5pTkxuWlA2TGN0RlZ3Zkg4WDFyQjY0cjdWRjUxYnlZM0dCYlhsRC80M2haeUxMNWpiWXMwZk1IanN2UUJ1dm9jNkl5VXNWS0VJallNd0lGK256ZCtzZldUTHhZNnp5WXN3NDczcEdRZVpEQU0yZ0RvUDE2Rno4cDdUNHdGZ3c1WFBXVWFtbU1nMzI2c3VKVDFVeHhISE92TEp1N01mVHdCbWRGY0kwcTZiZWxoL3YxRnNLSE55cVpqV05HTG9SVnNxVDhVbCtRaEpWNFdoeWViL0JoUzRtOG11NExJS25EVThFWGdSVW5aaHY0bWRqbGVNRG84T1FxUnFxK3ZuNmVQUmhTL010SndZVWlaaGtGY0tPRlJKekZCdmJ1MFh1TFNiQU5aQ0dBQVpqNzBLNXVlUFVaZE1TdnFWMDltV0IrVEpSWjVIMG00YnhQRHBtUXJSeEZGN1hqMXcyQy9UeFpuRWI5TVlua1VzNmloSXpkZ25TeDZtU0tnUm9HR0M5TEpWVmhUdFRjdmxpRGljVitHKzdKZXFvMVE4NFZyeWtFUGVNckZkYmEyekpHblJUQW8rRjlJNENKNkdoNTZEbXBnd3d0dDhlQWRIUnBrSVlGZkpXSmVIWUJnUjNnc0ovck1DQmV2MjJBcGxuMndBKy9CaGxOSCsvdVRnYWFIVk9FUVMxcWVMY0M2V3h2enBsZXlCTnNZbFA5RURwbnhsdGtneDhHZTI0Umk2dnRJVEJkRFEwcUF6bS9QUHAxdGxLVS9pdEgzVjNZT3A5eXgyMjVHVUdmRTVJemwvcm1mWkEyajFtcmhseWVTVUdFT05PNUtSWCtpKzAwNXlkcHg4Rkg5bWJ1YVgxTEIyOGRyS0hzQ25KcDVsSjZ1U0o0Y2VwenJiZlMyZjlZKzVCSitpajZRSlFDZkFEeUhySEhvaWZFTUl1dzF5WCtCSTJnVys0WHdKdWJZU3AzS1QySGx1SGQ0NGFBc2pITXdDaCtxMkcvWDZFSk1QZjE0d3RkWG52WjdvSmhIeXd3dE50QmxtNm8vSy9iK2MvNHRSWHJ5QUpJNkZoMkZjVy8xKzRDVkswUjNpaGhjVzZ5c21qTy8yZ3UzOGI0aUZhV21qbnM5UkpCcVlYSE9oMlZWdlJjbTF0bnBFTEV0QXA0blQ2MnoxQTNZVWZ5d0tGWGFBTzRRc1BxY3JnaDhjbzlnRkpsZ2RQTkFVOVh6ZTBNeWdMdjJTSnhoaGxvZXVMMGhKYnZuUmY2aTJ3N1RHMU1UbHRlOHJwais1RndvWUxDL21nU2ZSVm1iZUpNUVhLZzkwYVJRb3ZNRDhFN1BaKzRBYjdYc2VRc3lLVDNTcWROOWhOV3BIQ1RpL2RydWRHdkZvTHQ5cTBSKzZkNnFVZ3M5aUc3Wmc1akdoRlZNMkFlMmhueHJlYjJIUUpFeVdyTHJRZzQ5RkhDWXNEbWxTMzNNSk9veU1MYThCVVJKWlVEN0dxSmIrR0pqT1RuVm54L1AzTUZmTitZWGVWSW10eU1ITkhFWUpyMUpsZUg2S3FsWllKRzF6Q04rcE9SMEI4b0ptTVloeWRVdDFpTHJkejFNcVNtV2oxZ21iaUd5R3haMWhGZ0VNZW5WdUY5Y1NidHVXVWJ0cnhqMFk2MnpGMXVaUUNERk1yeDR1K091cXh0cVpuakpuUlova0FiRFp1UzllZk03NTo6ZEtkS0N0NWFhRm1SdFp4M280NHBSQT09}'" CausedBy: "Type: illegal_state_exception Reason: "Can't get text on a START_OBJECT at 1:360""
# Request:
<Request stream not captured or already read to completion by serializer. Set DisableDirectStreaming() on ConnectionSettings to force it to be set on the response.>
# Response:
<Response stream not captured or already read to completion by serializer. Set DisableDirectStreaming() on ConnectionSettings to force it to be set on the response.>

Xtansia (Collaborator) commented Apr 8, 2024

@assadnazar The required mapping would be identical. OpenSearch does not differentiate between single-value and multi-value properties in mappings.

Your particular exception occurs because the mapping on your index specifies that the redirect_params property is a keyword, but you are attempting to index an object in that property. If this mapping was not set explicitly beforehand, it's possible that you previously indexed differently shaped documents into the index, where redirect_params was a string, and OpenSearch automatically assigned the keyword mapping.
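
One way to avoid the automatically assigned keyword mapping is to declare redirect_params as an object before indexing anything. The sketch below follows the Object<T> pattern from the earlier answer. The class names, the body property, the index name napsresponse-v2 and the URL are all assumptions for illustration, since the real document classes are not shown in this thread. Because an existing field cannot be changed from keyword to object in place, the sketch creates a new index rather than updating the old one.

```csharp
using System;
using System.Runtime.Serialization;
using OpenSearch.Client;

// Assumption: placeholder URL; add authentication as your cluster requires.
var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
    .DisableDirectStreaming();
var client = new OpenSearchClient(settings);

// Map redirect_params explicitly as an object with a text "body" field,
// so dynamic mapping never gets the chance to pick keyword for it.
var create = await client.Indices.CreateAsync("napsresponse-v2", i => i
    .Map<NapsResponse>(m => m
        .AutoMap()
        .Properties(p => p
            .Object<RedirectParams>(o => o
                .Name("redirect_params")
                .Properties(pp => pp
                    .Text(t => t.Name("body")))))));

Console.WriteLine(create.IsValid ? "index created" : create.DebugInformation);

// Hypothetical document classes. The DataMember names are an assumption
// about how the properties should be named in the serialized JSON; check
// that your serializer settings produce the same field names as the mapping.
public class NapsResponse
{
    [DataMember(Name = "redirect_params")]
    public RedirectParams RedirectParams { get; set; }
}

public class RedirectParams
{
    [DataMember(Name = "body")]
    public string Body { get; set; }
}
```

Documents already stored in the old index would then need to be reindexed into the new one, in whatever shape matches the object mapping.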
