Skip to content
Fix Code Error

How to sort elasticsearch buckets by timestamp BEFORE results are filtered?

July 5, 2021 by Code Error
Posted By: Anonymous

my use case is as follows:

I have documents on elasticsearch, I save some telemetry data and i’m aggregating it before ingesting it to elastic search (each document can have a count field that tells how many times this event occurred).

Example of a json field:

{
   "timestamp": "2021-06-05T06:53:25.275172"
   "count": 5
   "name": "myname",
   "type: "mytype",
   "sub_type": "mysubtype"
},
{
   "timestamp": "2021-05-05T07:53:25.275172"
   "count": 2
   "name": "myname",
   "type: "mytype",
   "sub_type": "zzzzzmysubtype222"
}
{
   "timestamp": "2021-04-05T06:22:25.275172"
   "count": 1
   "name": "myname",
   "type: "mytype",
   "sub_type": "mysubtype"
},

I’m doing a query to aggregate all "unique" documents (by name, type and sub_type), while counting the total number of responses, and i want to sort the buckets by timestamp.

The thing is, i have a "limit" parameter that limits the number of results received, and when i limit it by 1, i don’t get the latest event bucket (as i understand it sorts them by the 3 fields, then limits the size, and only then sorts it.

So, when running the below query – i would want to get the "myname, mytype, mysubtype", but i actually get the "myname, mytype, zzzzzmysubtype222" as the first bucket.
If i increase the limit – then it’s sorted correctly.

This is my query:

{
        "query": {
            "bool": {
                "filter": [{"range": {"timestamp": {"lte": "now", "gte": "2020-06-05"}}}],
            }
        },
        "aggs": {
            "events": {
                "composite": {
                    "size": 1,    <----------- this doesn't give me latest bucket
                    "sources": [
                        {"name": {"terms": {"field": "name.keyword"}}},
                        {"type": {"terms": {"field": "type.keyword"}}},
                        {"sub_type": {"terms": {"field": "sub_type.keyword"}}},
                    ],
                },
                "aggs": {
                    "events_count": {"sum": {"field": "count"}},
                    "latest_event": {"top_hits": {"sort": [{"timestamp": {"order": "desc"}}], "size": 1}},
                    "top_hit": {"max": {"field": "timestamp"}},
                    "events_bucket_sort": {"bucket_sort": {"sort": [{"top_hit": {"order": "desc"}}]}},
                },
            }
        },
        "size": 0,
    }

So, what i’m trying to do is to sort the buckets by the latest event timestamp (top_hit) – any help would be appreciated.

I think it’s related to this: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-composite-aggregation.html#_order

But timestamp shouldn’t be aggregated so how do i control the natural sort of the buckets? I need only the newest ones to be received first, no matter the size,

If it’s not possible, any way to get those unique documents aggregation that share several common fields?

Solution

As far as I can understand your question, you need to have the unique buckets based on the name, type, and sub_type field, and then based on the buckets formed you need to get the latest bucket(based on the timestamp).

If you want to get unique buckets based on multiple fields, you can use multi terms aggregation along with the top_hits aggregation to get the latest bucket based on the timestamp

{
  "size": 0,
  "aggs": {
    "genres_and_products": {
      "multi_terms": {
        "terms": [
          {
            "field": "name.keyword"
          },
          {
            "field": "type.keyword"
          },
          {
            "field": "sub_type.keyword"
          }
        ]
      }
    },
    "latest_event": {
      "top_hits": {
        "sort": [
          {
            "timestamp": {
              "order": "desc"
            }
          }
        ],
        "size": 1
      }
    }
  }
}

Search Result:

"aggregations": {
    "genres_and_products": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": [
            "myname",
            "mytype",
            "mysubtype"
          ],
          "key_as_string": "myname|mytype|mysubtype",
          "doc_count": 2
        },
        {
          "key": [
            "myname",
            "mytype",
            "zzzzzmysubtype222"
          ],
          "key_as_string": "myname|mytype|zzzzzmysubtype222",
          "doc_count": 1
        }
      ]
    },
    "latest_event": {
      "hits": {
        "total": {
          "value": 3,
          "relation": "eq"
        },
        "max_score": null,
        "hits": [
          {
            "_index": "67775188",
            "_type": "_doc",
            "_id": "1",
            "_score": null,
            "_source": {
              "timestamp": "2021-06-05T06:53:25.275172",
              "count": 5,
              "name": "myname",
              "type": "mytype",
              "sub_type": "mysubtype"
            },
            "sort": [
              1622876005275
            ]
          }
        ]
      }
    }
Answered By: Anonymous

Related Articles

  • How to properly do JSON API GET requests and assign output…
  • How to parse JSON with XE2 dbxJSON
  • Azure Availability Zone ARM Config
  • The 'compilation' argument must be an instance of…
  • Search match multiple values in single field in…
  • Event Snippet for Google only shows one event while testing…
  • mongodb group values by multiple fields
  • Avoid creating new session on each axios request laravel
  • loop and eliminate unwanted lines with beautiful soup
  • NullpointerException error while working with choiceBox and…

Disclaimer: This content is shared under creative common license cc-by-sa 3.0. It is generated from StackExchange Website Network.

Post navigation

Previous Post:

How to make a simple loop in .vbs

Next Post:

labeling leaf nodes in tree (XSLT 1.0)

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

  • Get code errors & solutions at akashmittal.com
© 2022 Fix Code Error