Fix Code Error

Vega-Lite data transformation to un-nest objects

July 15, 2021 by Code Error
Posted By: Anonymous

The data comes from an Elasticsearch URL and has the following form:

{
    "took": 44,
    "timed_out": false,
    "hits": {
        "total": 11,
        "max_score": 0,
        "hits": [
            {
                "_index": "dataindex",
                "_type": "span",
                "_id": "tKVUs3kBhoeKMUMeIwCv",
                "_score": 0,
                "_source": {
                    "fieldA": 272.2,
                    "fieldB": 73,
                    "fieldX": "event 1"
                }
            },
            {
                "_index": "dataindex",
                "_type": "span",
                "_id": "iuVetHkBhoeKMUMe4O92",
                "_score": 0,
                "_source": {
                    "fieldA": 305.2,
                    "fieldB": 80,
                    "fieldX": "event 2"
                }
            },
            {
                "_index": "dataindex",
                "_type": "span",
                "_id": "Yt-QwXkBhoeKMUMex3tp",
                "_score": 0,
                "_source": {
                    "fieldA": 281.8,
                    "fieldB": 73,
                    "fieldX": "event 3"
                }
            }
        ]
    }
}

I want to make a scatter plot matrix. The data array under hits.hits can be accessed through the format.property config.

{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "repeat": {
    "row": [
      "_source.fieldA",
      "_source.fieldB"
    ],
    "column": [
      "_source.fieldB",
      "_source.fieldA"
    ]
  },
  "spec": {
    "data": {
      "url": "url/to/elastic/query",
      "format": {"property": "hits.hits", "type": "json"}
    },
    "mark": "point",
    "encoding": {
      "x": {
        "field": {"repeat": "column"},
        "type": "quantitative"
      },
      "y": {
        "field": {"repeat": "row"},
        "type": "quantitative"
      },
      "color": {
        "field": "_source.fieldX",
        "type": "nominal"
      },
      "shape": {
        "field": "_source.fieldX",
        "type": "nominal"
      }
    }
  }
}

But there is still an extra _source level that has to be spelled out everywhere (repeat, color, shape), and it also shows up verbatim in the axis and legend titles.

Is there a transform that would get rid of this _source level, so that the data reaching the encoding stage behaves as if the source were simply the following?

[
  {
    "fieldA": 272.2,
    "fieldB": 73,
    "fieldX": "event 1"
  },
  {
    "fieldA": 305.2,
    "fieldB": 80,
    "fieldX": "event 2"
  }
]

Or, alternatively, is there a way to dynamically rename the axes in a repeat matrix?

Solution

Yes. You only have to name the nested fields once, and each one is then available at the top level of the datum instead of under _source. Add one calculate transform per field, as in the spec below:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "transform": [
    {"calculate": "datum._source.fieldA", "as": "fieldA"},
    {"calculate": "datum._source.fieldB", "as": "fieldB"},
    {"calculate": "datum._source.fieldX", "as": "fieldX"}
  ],
  "repeat": {
    "row": ["fieldA", "fieldB"],
    "column": ["fieldB", "fieldA"]
  },
  "spec": {
    "data": {
      "values": {
        "took": 44,
        "timed_out": false,
        "hits": {
          "total": 11,
          "max_score": 0,
          "hits": [
            {
              "_index": "dataindex",
              "_type": "span",
              "_id": "tKVUs3kBhoeKMUMeIwCv",
              "_score": 0,
              "_source": {"fieldA": 272.2, "fieldB": 73, "fieldX": "event 1"}
            },
            {
              "_index": "dataindex",
              "_type": "span",
              "_id": "iuVetHkBhoeKMUMe4O92",
              "_score": 0,
              "_source": {"fieldA": 305.2, "fieldB": 80, "fieldX": "event 2"}
            },
            {
              "_index": "dataindex",
              "_type": "span",
              "_id": "Yt-QwXkBhoeKMUMex3tp",
              "_score": 0,
              "_source": {"fieldA": 281.8, "fieldB": 73, "fieldX": "event 3"}
            }
          ]
        }
      },
      "format": {"property": "hits.hits", "type": "json"}
    },
    "mark": "point",
    "encoding": {
      "x": {"field": {"repeat": "column"}, "type": "quantitative"},
      "y": {"field": {"repeat": "row"}, "type": "quantitative"},
      "color": {"field": "fieldX", "type": "nominal"},
      "shape": {"field": "fieldX", "type": "nominal"}
    }
  }
}
Answered By: Anonymous
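
To tie the answer back to the question's URL-based source, the spec below is a minimal sketch of the same calculate transforms applied to remote data. It assumes the placeholder URL from the question (url/to/elastic/query) and, as a choice made for this sketch rather than something taken from the answer, declares data next to transform at the top level of the repeat spec. The description text is only there to flag the placeholder.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch only: the URL is the placeholder from the question; replace it with a real Elasticsearch query URL.",
  "data": {
    "url": "url/to/elastic/query",
    "format": {"property": "hits.hits", "type": "json"}
  },
  "transform": [
    {"calculate": "datum._source.fieldA", "as": "fieldA"},
    {"calculate": "datum._source.fieldB", "as": "fieldB"},
    {"calculate": "datum._source.fieldX", "as": "fieldX"}
  ],
  "repeat": {
    "row": ["fieldA", "fieldB"],
    "column": ["fieldB", "fieldA"]
  },
  "spec": {
    "mark": "point",
    "encoding": {
      "x": {"field": {"repeat": "column"}, "type": "quantitative"},
      "y": {"field": {"repeat": "row"}, "type": "quantitative"},
      "color": {"field": "fieldX", "type": "nominal"},
      "shape": {"field": "fieldX", "type": "nominal"}
    }
  }
}
```

Because the calculated fields are named fieldA, fieldB and fieldX, the axis and legend titles should pick up those flat names rather than the _source-prefixed ones.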

Disclaimer: This content is shared under the Creative Commons license CC BY-SA 3.0. It was generated from the StackExchange website network.