Wednesday 15 September 2010

google app engine - BigQuery: How to flatten a repeated structured property imported from Datastore


Dear All

I started using BigQuery this month to analyze data in the GAE datastore. First, I export the data to Google Cloud Storage via the "Datastore Admin" page of the GAE console. Then I import the data from Google Cloud Storage into BigQuery. This works very smoothly except for structured properties. I expected the imported record to be in the following format:

  parent: "James", children: [{name: "name1", age: 5, gender: "M"}, {name: "name2", age: 50, gender: "F"}, {name: "name3", age: 33, gender: "M"}]

I know how to flatten data in the above format. But in BigQuery, the actual data is in the following format:

  parent: "James", children.name: ["name1", "name2", "name3"], children.age: [5, 50, 33], children.gender: ["M", "F", "M"]

I am wondering whether it is possible to flatten the data in BigQuery for further analysis. The ideal form of the result table in my mind is:

  parentName, children.name, children.age, children.gender
  James, name1, 5, "M"
  James, name2, 50, "F"
  James, name3, 33, "M"

Cheers!

The recently introduced BigQuery Standard SQL makes things much nicer! Try the following (make sure the "Use Legacy SQL" option is unchecked):

  -- The WITH clause is sample data standing in for the imported table
  WITH parents AS (
    SELECT "James" AS parentName,
      STRUCT(["name1", "name2", "name3"] AS name, [5, 50, 33] AS age, ["M", "F", "M"] AS gender) AS children
  )
  SELECT parentName, childrenName, childrenAge, childrenGender
  FROM parents,
    UNNEST(children.name) AS childrenName WITH OFFSET AS pos_name,
    UNNEST(children.age) AS childrenAge WITH OFFSET AS pos_age,
    UNNEST(children.gender) AS childrenGender WITH OFFSET AS pos_gender
  WHERE pos_name = pos_age AND pos_name = pos_gender

Here the original table, parents, holds the following data:

  [{"parentName": "James", "children": {"name": ["name1", "name2", "name3"], "age": ["5", "50", "33"], "gender": ["M", "F", "M"]}}]

The output contains one row per child, in the flattened form requested in the question.

Note: this is based entirely on what I see in the original question and will most likely need to be adjusted to your specific needs. I hope it helps in terms of which direction to take and where to start!

Added:

The above query uses row-based CROSS JOINs, which means that all combinations for the same parent are assembled first, and then the WHERE clause filters out the "wrong" ones.

In contrast, the version below uses INNER JOIN to eliminate this "side effect":

  WITH parents AS (
    SELECT "James" AS parentName,
      STRUCT(["name1", "name2", "name3"] AS name, [5, 50, 33] AS age, ["M", "F", "M"] AS gender) AS children
  )
  SELECT parentName, childrenName, childrenAge, childrenGender
  FROM parents, UNNEST(children.name) AS childrenName WITH OFFSET AS pos_name
  JOIN UNNEST(children.age) AS childrenAge WITH OFFSET AS pos_age ON pos_name = pos_age
  JOIN UNNEST(children.gender) AS childrenGender WITH OFFSET AS pos_gender ON pos_age = pos_gender

Intuitively, I would expect the second version to be a bit more efficient for bigger tables.
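
To make the difference between the two approaches concrete, here is a small Python sketch that models it outside BigQuery. It is only an illustration of the row logic, not of how BigQuery actually executes either query. The helper names (with_offset, flatten_cross_join, flatten_by_position) are made up for this example, and the sample record copies the one used in the answer.

```python
from itertools import product

# Sample record in the column-wise shape shown in the question
# (one array per child property).
parent = {
    "parentName": "James",
    "children": {
        "name": ["name1", "name2", "name3"],
        "age": [5, 50, 33],
        "gender": ["M", "F", "M"],
    },
}


def with_offset(values):
    """Rough stand-in for UNNEST(...) WITH OFFSET: (value, position) pairs."""
    return [(value, pos) for pos, value in enumerate(values)]


def flatten_cross_join(rec):
    """Build every name/age/gender combination, then keep matching positions."""
    c = rec["children"]
    combos = list(product(with_offset(c["name"]),
                          with_offset(c["age"]),
                          with_offset(c["gender"])))
    rows = [
        (rec["parentName"], name, age, gender)
        for (name, pos_name), (age, pos_age), (gender, pos_gender) in combos
        if pos_name == pos_age and pos_name == pos_gender
    ]
    return rows, len(combos)


def flatten_by_position(rec):
    """Pair the arrays by position directly, with no intermediate combinations."""
    c = rec["children"]
    return [
        (rec["parentName"], name, age, gender)
        for name, age, gender in zip(c["name"], c["age"], c["gender"])
    ]


cross_rows, intermediate_count = flatten_cross_join(parent)
joined_rows = flatten_by_position(parent)

print(intermediate_count)  # 27 combinations for 3 children (3 * 3 * 3)
for row in joined_rows:
    print(row)
```

Both functions return the same three rows. The first one builds 27 intermediate combinations to get there, which is the "side effect" described above, and that number grows with the cube of the number of children per parent.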
