Rewrite Your Elasticsearch Requests on the Fly

In some cases, you may find that the Query DSL generated by the service code is unreasonable. The usual remedy is to modify the service code and release a new version. However, a release can take a long time: the deployment window may not have arrived, a network change freeze may be in effect, additional code may need to be submitted, or a large number of tests may have to be run first. Meanwhile, faults in the production environment need to be fixed immediately and customers cannot wait. What should be done in that case?

Don’t worry. You can use INFINI Gateway to dynamically repair queries.

Example #

See the following query example:

GET _search
{
  "size": 1000000,
  "explain": true
}

The size parameter is set to a very large value, and the problem goes unnoticed at first. As more and more data accumulates, returning too many documents is bound to cause a sharp decline in performance. In addition, enabling the explain parameter adds unnecessary overhead; this option is generally used only during development and debugging.

By adding the request_body_json_set filter to the gateway flow, you can dynamically replace the value at a specified JSON path in the request body. The configuration for the above example is as follows:

flow:
  - name: rewrite_query
    filter:
      - request_body_json_set:
          path:
            - explain -> false
            - size -> 10
      - dump_request_body:
      - elasticsearch:
          elasticsearch: dev

The filter overwrites the explain and size parameters. The query is rewritten as follows before it is sent to Elasticsearch:

{
 "size": 10, "explain": false
}

The problem is fixed online, without changing or redeploying the service code.

Another Example #

Consider the following query. The programmer mistyped the name of the field to be queried: it should be name but was written as name1. In addition, the size parameter of the terms aggregation is set to a very large value.

GET medcl/_search
{
  "aggs": {
    "total_num": {
      "terms": {
        "field": "name1",
        "size": 1000000
      }
    }
  }
}

The system goes live, and the problem appears as soon as the query is run. To fix it, add the following filter configuration to the gateway request flow:

flow:
  - name: rewrite_query
    filter:
      - request_body_json_set:
          path:
            - aggs.total_num.terms.field -> "name"
            - aggs.total_num.terms.size -> 10
            - size -> 0
      - dump_request_body:
      - elasticsearch:
          elasticsearch: dev

The above configuration replaces values in the JSON request body by their paths. It also adds a top-level size parameter set to 0 so that no documents are returned, because only the aggregation results are required.
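To make the effect of these three rules concrete, the following Python sketch models them on the request body shown above. It is only an illustration of the expected result, not the gateway's implementation: the `set_path` and `rewrite` helpers and the `RULES` list are hypothetical names introduced here, and the model handles plain object keys only.

```python
import copy
import json


def set_path(body, path, value):
    """Assign value at a dot-separated path of object keys, creating
    intermediate objects when they are missing (array indexes are not
    handled in this simplified model)."""
    keys = path.split(".")
    node = body
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


# (path, value) pairs mirroring the three "path -> value" rules above.
RULES = [
    ("aggs.total_num.terms.field", "name"),
    ("aggs.total_num.terms.size", 10),
    ("size", 0),
]


def rewrite(body):
    """Return a rewritten copy of the request body; the input is untouched."""
    result = copy.deepcopy(body)
    for path, value in RULES:
        set_path(result, path, value)
    return result


original = {
    "aggs": {
        "total_num": {
            "terms": {"field": "name1", "size": 1000000}
        }
    }
}

print(json.dumps(rewrite(original), indent=2))
```

The printed body has the corrected field name, an aggregation size of 10, and a new top-level size of 0, which is the request Elasticsearch should receive once the flow is in place.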

Another Example #

The user query is as follows:

{
  "query": {
    "bool": {
      "should": [
        {"term": {"isDel": 0}},
        {"match": {"type": "order"}}
      ]
    }
  }
}

Now you want to replace the term query with the equivalent range query as follows:

{
  "query": {
    "bool": {
      "should": [
        {"range": {"isDel": {"gte": 0, "lte": 0}}},
        {"match": {"type": "order"}}
      ]
    }
  }
}

Use the following configuration:

flow:
  - name: rewrite_query
    filter:
      - request_body_json_del:
          path:
            - query.bool.should.[0]
      - request_body_json_set:
          path:
            - query.bool.should.[1].range.isDel.gte -> 0
            - query.bool.should.[1].range.isDel.lte -> 0
      - dump_request_body:
      - elasticsearch:
          elasticsearch: dev

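In the above configuration, a request_body_json_del filter first deletes the first element of the should array, that is, the term subquery to be replaced. Only the match query is left, now at index 0. The request_body_json_set filter then adds a new should subquery at index 1 and sets the attributes of its range query. The clauses end up in a different order than in the target query shown earlier, but the order of should clauses does not change what the bool query matches.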
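The index bookkeeping is easier to follow when traced step by step. The Python sketch below models the delete-then-set sequence on the user query from this example. It is a simplified model, not the gateway's code: `parse`, `delete_path`, and `set_path` are hypothetical helpers, and creating missing array elements on set is an assumption made so that the model reproduces the behavior described above.

```python
import copy


def parse(path):
    """Split 'a.b.[1].c' into ['a', 'b', 1, 'c']; ints are array indexes."""
    tokens = []
    for part in path.split("."):
        if part.startswith("[") and part.endswith("]"):
            tokens.append(int(part[1:-1]))
        else:
            tokens.append(part)
    return tokens


def delete_path(body, path):
    """Remove the object key or array element that the path points to."""
    *parents, last = parse(path)
    node = body
    for token in parents:
        node = node[token]
    del node[last]


def _child(node, token, default):
    """Return node[token], creating it as `default` when it is missing."""
    if isinstance(token, int):
        while len(node) <= token:
            node.append(None)  # assumption: pad the array up to the index
        if node[token] is None:
            node[token] = default
        return node[token]
    return node.setdefault(token, default)


def set_path(body, path, value):
    """Assign value at the path, creating intermediate containers."""
    tokens = parse(path)
    node = body
    for token, nxt in zip(tokens, tokens[1:]):
        node = _child(node, token, [] if isinstance(nxt, int) else {})
    last = tokens[-1]
    if isinstance(last, int):
        while len(node) <= last:
            node.append(None)
    node[last] = value


user_query = {
    "query": {
        "bool": {
            "should": [
                {"term": {"isDel": 0}},
                {"match": {"type": "order"}},
            ]
        }
    }
}

rewritten = copy.deepcopy(user_query)
delete_path(rewritten, "query.bool.should.[0]")  # match moves to index 0
set_path(rewritten, "query.bool.should.[1].range.isDel.gte", 0)
set_path(rewritten, "query.bool.should.[1].range.isDel.lte", 0)
print(rewritten["query"]["bool"]["should"])
```

In this model the should array ends with the match clause at index 0 and the new range clause at index 1, which is why the set paths use [1] rather than [0].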

Further Improvement #

In the above examples, queries are rewritten unconditionally. In practice, you may want to rewrite a query only when a condition is met, for example, only when the JSON field _ctx.request.body_json.query.bool.should.[0].term.isDel exists. The gateway supports flexible conditional logic, and the configuration is as follows:

flow:
  - name: cache_first
    filter:
      - if:
          and:
            - has_fields: ['_ctx.request.body_json.query.bool.should.[0].term.isDel']
        then:
          - request_body_json_del:
              path:
                - query.bool.should.[0]
          - request_body_json_set:
              path:
                - query.bool.should.[1].range.isDel.gte -> 0
                - query.bool.should.[1].range.isDel.lte -> 0
          - dump_request_body:
      - elasticsearch:
          elasticsearch: dev
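The logic of this conditional flow can be modeled in a few lines of Python. The sketch below is an illustration under stated assumptions, not the gateway's implementation: `has_field` and `rewrite_if_needed` are hypothetical helpers, the path is written relative to the request body (without the `_ctx.request.body_json.` prefix), and the rewrite assumes the two-clause should array used in this example.

```python
import copy


def has_field(body, path):
    """Return True when every step of the dotted path exists in body.
    Steps written as '[n]' are treated as array indexes."""
    node = body
    for part in path.split("."):
        if part.startswith("[") and part.endswith("]"):
            index = int(part[1:-1])
            if not isinstance(node, list) or not 0 <= index < len(node):
                return False
            node = node[index]
        else:
            if not isinstance(node, dict) or part not in node:
                return False
            node = node[part]
    return True


def rewrite_if_needed(body):
    """Replace the leading term clause on isDel with a range clause,
    but only when that clause is actually present."""
    result = copy.deepcopy(body)
    if has_field(result, "query.bool.should.[0].term.isDel"):
        should = result["query"]["bool"]["should"]
        del should[0]
        # Assumption: one clause is left, so index 1 is the end of the array.
        should.append({"range": {"isDel": {"gte": 0, "lte": 0}}})
    return result


matching = {
    "query": {"bool": {"should": [
        {"term": {"isDel": 0}},
        {"match": {"type": "order"}},
    ]}}
}
other = {"query": {"match": {"type": "order"}}}

print(rewrite_if_needed(matching))  # rewritten
print(rewrite_if_needed(other))     # passed through unchanged
```

Requests that do not contain the term clause on isDel pass through untouched, so the rule can stay in place without affecting unrelated queries.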

The feature is superb!