Hitting AlreadyExpiredException from updating document that has valid _ttl #9956

Closed
coxchen opened this issue Mar 3, 2015 · 10 comments
Labels: >bug, help wanted, adoptme, :Search Foundations/Mapping, Team:Search Foundations

Comments

@coxchen

coxchen commented Mar 3, 2015

Hi,

I hit an AlreadyExpiredException when updating a document with a valid _ttl, i.e. a _ttl greater than 0.
Here is my index mapping (simplified). Both _timestamp and _ttl are enabled, and _timestamp is given a custom path:

{
  "activity" : {
    "mappings" : {
      "log" : {
        "_timestamp" : {"enabled" : true, "path" : "log_time"},
        "_ttl" : {"enabled" : true, "default" : 300000},
        "properties" : {
          "log_count" : {"type" : "integer"},
          "log_time" : {"type" : "date"}
        }
      }
    }
  }
}

I ran some experiments, documented in this article, and found that with a mapping like the one above (both _timestamp and _ttl enabled, and _timestamp given a custom path), updating a document with a valid _ttl throws an AlreadyExpiredException. I have tried both 1.3.1 and 1.4.2 and see the same behavior.

Is this a known issue? Or is this behavior by design?

Thank you for your help!

@clintongormley
Contributor

Hi @coxchen

The _ttl is added to the _timestamp, so you need to update both; otherwise the old _timestamp is used as the base.
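To make the arithmetic concrete, here is a minimal Python sketch of that rule. It assumes the expiry instant is simply `_timestamp + _ttl` in epoch milliseconds; the function names are hypothetical and are not part of Elasticsearch.

```python
from datetime import datetime, timezone

def to_epoch_ms(dt):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(dt.timestamp() * 1000)

def expiry_ms(timestamp_ms, ttl_ms):
    # Assumption: the expiry instant is the document's _timestamp plus its _ttl.
    return timestamp_ms + ttl_ms

def remaining_ttl_ms(timestamp_ms, ttl_ms, now_ms):
    # What is left of the TTL at now_ms; zero or less means expired.
    return expiry_ms(timestamp_ms, ttl_ms) - now_ms

log_time = to_epoch_ms(datetime(2015, 3, 3, 12, 0, 0, tzinfo=timezone.utc))
default_ttl = 300000  # 5 minutes, the "default" value from the mapping above
two_minutes_later = log_time + 2 * 60 * 1000

print(remaining_ttl_ms(log_time, default_ttl, two_minutes_later))  # 180000
```

If the _timestamp stays at the original log_time, the base of this sum never moves, no matter when the update happens.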

@coxchen
Author

coxchen commented Mar 3, 2015

Hi @clintongormley

Thanks for your reply.

Sorry for the confusion. I didn't update the _timestamp field (log_time in my example). Instead, I did a partial update of the log_count field with:

curl -XPOST 'localhost:9200/activity/log/1/_update' -d '{ "script" : "ctx._source.log_count+=1" }'

I only want to update fields other than the _timestamp and _ttl fields, so I don't see why I hit the AlreadyExpiredException.

@coxchen changed the title from "Hitting AlreadyExpiredException from updating document with valid _ttl" to "Hitting AlreadyExpiredException from updating document that has valid _ttl" on Mar 3, 2015
@clintongormley
Contributor

@coxchen It means that the document has actually expired but has not yet been deleted. Expired docs are only removed every 60 seconds.

@coxchen
Author

coxchen commented Mar 4, 2015

Hi @clintongormley

No, that's not the case here. The document I'd like to update still has a valid _ttl (>> 0).

I first found this issue in a production system I'm working on. There, I had a document with its _ttl originally set to 7d, and I got the AlreadyExpiredException when updating that document on the 4th day after it had been indexed in ES. At that time, the document still had about 3 days of _ttl left.

I ran some experiments; if you have time, you can read the details at https://medium.com/@coxchen/document-obsolescence-in-elasticsearch-c5973dd9e68d.

@coxchen
Author

coxchen commented Mar 4, 2015

@clintongormley

Below is the result of my experiments, showing how the _ttl of each document changes over time.

[Figure: ttl_change_chart, a plot of each document's _ttl over time]

I have four documents indexed in ES:

  • /activity/log1/1 is the green curve
  • /activity/log1/2 is the yellow curve
  • /activity/log2/3 is the blue curve
  • /activity/log2/4 is the orange curve

with the following index mapping:

{
  "activity" : {
    "mappings" : {
      "log1" : {
        "_timestamp" : {"enabled" : true, "path" : "log_time"},
        "_ttl" : {"enabled" : true, "default" : 300000},
        "properties" : {
          "log_count" : {"type" : "integer"},
          "log_time" : {"type" : "date"}
        }
      },
      "log2" : {
        "_ttl" : {"enabled" : true, "default" : 300000},
        "properties" : {
          "log_count" : {"type" : "integer"},
          "log_time" : {"type" : "date"}
        }
      }
    }
  }
}

I also have a script that periodically updates the log_count field of documents 2 and 4:

curl -XPOST 'localhost:9200/activity/log1/2/_update' -d '{ "script" : "ctx._source.log_count+=1" }'

curl -XPOST 'localhost:9200/activity/log2/4/_update' -d '{ "script" : "ctx._source.log_count+=1" }'

The curves of the four documents should overlap, but you can see that the yellow curve (document 2) stands out. That doesn't look right to me. Can you help explain it?

@clintongormley self-assigned this on Mar 9, 2015
@coxchen
Author

coxchen commented Apr 12, 2015

Hi @clintongormley

I traced the source code and found that, in TTLFieldMapper.java, the _timestamp from the document source is used to check expiration when a document is updated. So the workaround for my issue is to always provide the original _ttl value when updating the document, even though I don't intend to change the _ttl.
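One way to read this finding, sketched in Python. The sketch assumes that an update carries over only the remaining _ttl, and that the expiry check then adds it to the unchanged timestamp taken from the source (log_time). Under that assumption the check fails once more than half of the original _ttl has elapsed, which would match the 7d document being rejected on day 4. This is a model of the reported behaviour, not the code in TTLFieldMapper.java, and every name in it is hypothetical.

```python
DAY_MS = 24 * 60 * 60 * 1000

class AlreadyExpired(Exception):
    """Stand-in for Elasticsearch's AlreadyExpiredException."""

def check_update(source_timestamp_ms, ttl_ms, now_ms):
    # Assumed check: expiry is measured from the timestamp found in the source.
    expire_ms = source_timestamp_ms + ttl_ms
    if expire_ms <= now_ms:
        raise AlreadyExpired(f"expire={expire_ms} now={now_ms}")
    return expire_ms

def update_without_ttl(source_timestamp_ms, original_ttl_ms, now_ms):
    # Assumption: the update only carries over the remaining TTL.
    remaining = source_timestamp_ms + original_ttl_ms - now_ms
    return check_update(source_timestamp_ms, remaining, now_ms)

def update_with_original_ttl(source_timestamp_ms, original_ttl_ms, now_ms):
    # Workaround described above: resend the original _ttl with the update.
    return check_update(source_timestamp_ms, original_ttl_ms, now_ms)

indexed_at = 0
ttl = 7 * DAY_MS

update_without_ttl(indexed_at, ttl, 3 * DAY_MS)  # day 3: still accepted
try:
    update_without_ttl(indexed_at, ttl, 4 * DAY_MS)
except AlreadyExpired:
    print("day 4: rejected although 3 days of _ttl remain")

# With the original _ttl resent, the expiry stays at indexed_at + 7d.
print(update_with_original_ttl(indexed_at, ttl, 4 * DAY_MS) == ttl)  # True
```

In this model a type without a _timestamp path would not be affected, because its timestamp would move forward with each update.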

You can check my article for details: https://medium.com/@coxchen/saving-document-half-life-in-es-89be764f21ca

@clintongormley
Contributor

Hi @coxchen

Thanks for digging! I've just tried my own (slightly different) test and see the _ttl increasing by leaps and bounds:

DELETE activity

PUT activity
{
  "mappings": {
    "log": {
      "_timestamp": {
        "enabled": true,
        "path": "log_time",
        "store": true
      },
      "_ttl": {
        "enabled": true,
        "default": 300000
      },
      "properties": {
        "log_count": {
          "type": "integer"
        },
        "log_time": {
          "type": "date"
        }
      }
    }
  }
}

Index a document using tomorrow's date:

PUT activity/log/1
{
  "log_time": "2015-04-14"
}

Repeat these two steps to see the _ttl just keep on growing:

GET _search?fields=_ttl

POST activity/log/1/_update 
{
  "doc": { "foo": "bar" }
}
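The growth can be reproduced on paper with a small Python simulation. It assumes, as a hypothesis consistent with this thread rather than a confirmed reading of the Elasticsearch source, that each update re-applies the remaining _ttl on top of the stored log_time. With log_time in the future, the remaining _ttl is larger than the original one, so every update pushes the expiry further out. The function name and the update times are illustrative only.

```python
DAY_MS = 24 * 60 * 60 * 1000

def simulate_updates(log_time_ms, ttl_ms, update_times_ms):
    """Return the remaining _ttl observed right after each update."""
    expire_ms = log_time_ms + ttl_ms
    observed = []
    for now_ms in update_times_ms:
        remaining = expire_ms - now_ms       # TTL carried over by the update
        expire_ms = log_time_ms + remaining  # assumed: re-based on log_time
        observed.append(expire_ms - now_ms)
    return observed

# log_time is one day ahead of "now" (time 0); the default _ttl is 5 minutes.
ttls = simulate_updates(DAY_MS, 300000, [0, 1000, 2000])
for ttl in ttls:
    print(ttl)
```

Each printed value is roughly one day larger than the previous one, because the gap between log_time and the current time is added again on every update.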

@clintongormley added the >bug, help wanted, adoptme, and :Search Foundations/Mapping labels on Apr 13, 2015
@coxchen
Author

coxchen commented Apr 13, 2015

Hi @clintongormley

Interesting. The log_time is in the future, which makes the _ttl grow with each update.

@darylrobbins

I hit this same issue when upgrading from 1.4.2 to 1.5.2. I'm using an MVEL script to update attributes in the document (neither the _timestamp nor the _ttl). The issue appears to happen for some documents but not others.

@clintongormley
Contributor

Closing in favour of #18280

@javanna added the Team:Search Foundations label on Jul 16, 2024