elasticsearch
Alexander Karabanov, 2021-04-01 16:43:40

How to aggregate slightly different data?

Hello.

The following request calculates percentiles of the request duration for the endpoint /api/v1/blabla.

Request
POST /filebeat-nginx-*/_search
{
  "aggs": {
    "hosts": {
      "terms": {
        "field": "host.name",
        "size": 1000
      },
      "aggs": {
        "url": {
          "terms": {
            "field": "nginx.access.url",
            "size": 1000
          },
          "aggs": {
            "time_duration_percentiles": {
              "percentiles": {
                "field": "nginx.access.time_duration",
                "percents": [
                  50,
                  90
                ],
                "keyed": true
              }
            }
          }
        }
      }
    }
  },
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "should": [
              {
                "prefix": {
                  "nginx.access.url": "/api/v1/blabla" 
                }
              }
            ]
          }
        },
        {
          "range": {
            "@timestamp": {
              "gte": "now-10m",
              "lte": "now" 
            }
          }
        }
      ]
    }
  }
}

The problem is that query-string arguments are also passed to this endpoint, for example /api/v1/blabla?lang=en&type=active or /api/v1/blabla/?lang=en&type=history.

As a result, the response shows percentiles for each such "separate" endpoint:
Response
{
                "key" : "/api/v1/blabla?lang=ru",
                "doc_count" : 423,
                "time_duration_percentiles" : {
                  "values" : {
                    "50.0" : 0.21199999749660492,
                    "90.0" : 0.29839999079704277
                  }
                }
              },
              {
                "key" : "/api/v1/blabla?lang=en&type=active",
                "doc_count" : 31,
                "time_duration_percentiles" : {
                  "values" : {
                    "50.0" : 0.21699999272823334,
                    "90.0" : 0.2510000020265579
                  }
                }
              },
              {
                "key" : "/api/v1/blabla?lang=en",
                "doc_count" : 4,
                "time_duration_percentiles" : {
                  "values" : {
                    "50.0" : 0.22700000554323196,
                    "90.0" : 0.24899999797344208
                  }
                }
              }

How can I aggregate such similar endpoints into a single /api/v1/blabla* bucket and get one common set of percentiles?


1 answer(s)
Alexander Karabanov, 2021-09-05
@karabanov

In the end, I settled on this solution, a terms aggregation on a script that strips the query string from the URL:

"aggs": {
  "uri": {
    "terms": {
      "script": {
        "source": "def uri = /(\\/[^\\?]+)\\?.+/.matcher(doc['nginx.access.url'].value); if (uri.matches()) { return uri.group(1) } else { return 'no_match' }"
      }
    }
  }
}
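
For anyone who wants to plug this into the original request, here is a rough sketch of one possible variant. It is not the exact request used above. It keeps the "uri" bucket name and the percentiles sub-aggregation from the question, but the script is an assumption: it uses plain string methods instead of a regex, so URLs without any query string keep their own path instead of falling into 'no_match', and a trailing slash is dropped so that /api/v1/blabla/ and /api/v1/blabla end up in the same bucket. Avoiding the regex also means the script should not depend on the script.painless.regex.enabled setting, which on some versions has to be switched on before Painless regexes work.

```json
POST /filebeat-nginx-*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "prefix": { "nginx.access.url": "/api/v1/blabla" } },
        { "range": { "@timestamp": { "gte": "now-10m", "lte": "now" } } }
      ]
    }
  },
  "aggs": {
    "hosts": {
      "terms": { "field": "host.name", "size": 1000 },
      "aggs": {
        "uri": {
          "terms": {
            "size": 1000,
            "script": {
              "lang": "painless",
              "source": "/* sketch, not the original script: cut at the first '?' and drop a trailing slash */ String url = doc['nginx.access.url'].value; int q = url.indexOf('?'); if (q >= 0) { url = url.substring(0, q); } if (url.length() > 1 && url.endsWith('/')) { url = url.substring(0, url.length() - 1); } return url;"
            }
          },
          "aggs": {
            "time_duration_percentiles": {
              "percentiles": {
                "field": "nginx.access.time_duration",
                "percents": [50, 90],
                "keyed": true
              }
            }
          }
        }
      }
    }
  }
}
```

With this shape, every variant such as /api/v1/blabla?lang=ru or /api/v1/blabla?lang=en&type=active should collapse into a single /api/v1/blabla bucket per host, and the 50th and 90th percentiles are computed over all of them together. Scripted buckets run the script for every matching document, so on large indices it may be cheaper to strip the query string at ingest time into a separate field and aggregate on that instead.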
