Wednesday 15 September 2010

hadoop - Changing bags into arrays in Pig Latin


I am doing some transformations on a data set and need to publish the result in a sane-looking format. Currently my final relation looks like this when I run DESCRIBE:

  {memberId: long, companyIds: {(accessory: long)}}   

I want this to look like this:

  {memberId: long, companyIds: [long]}   

where companyIds is the key for an array of IDs of type long.

I am really struggling with how to manipulate things this way. Any ideas? I have tried FLATTEN and other commands, to no avail. I am using AvroStorage to write the files.

The field schema I need to write this data to looks like this:

  "fields": [{"name": "memberId", "type": "long"}, {"name": "companyIds", "type": {"type": "array", "items": "int"}}]

I know this is a bit old, but I recently ran into this problem myself.

With the latest version of Pig and AvroStorage, it is possible to store bags directly as Avro arrays.

In your case, you want something like this:

  STORE blah INTO 'blah' USING AvroStorage('schema', '{your schema}');   

where the array field in the schema is:

  {"name": "companyIds", "type": ["null", {"type": "array", "items": "long"}], "doc": "company ids"}    
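To put the pieces together, here is a rough end-to-end sketch of how this might look in a script. It assumes the piggybank version of AvroStorage; the jar paths, the input file, the relation names, and the record name MemberCompanies are all made up for illustration, and the set of jars you need to REGISTER depends on your Pig version. I have not checked this against every AvroStorage release, so treat it as a starting point rather than a known-good recipe.

```pig
-- Jar paths are placeholders; adjust them to your installation.
REGISTER piggybank.jar;
REGISTER avro.jar;
REGISTER json-simple.jar;

-- Hypothetical input: one (memberId, companyId) pair per line, tab-separated.
pairs = LOAD 'member_companies.tsv' AS (memberId: long, companyId: long);

-- Grouping yields one row per member with a bag of that member's pairs.
grouped = GROUP pairs BY memberId;

-- Project the bag down to a single long column:
-- {memberId: long, companyIds: {(companyId: long)}}
result = FOREACH grouped GENERATE group AS memberId, pairs.companyId AS companyIds;

-- No FLATTEN is needed: the bag is handed to AvroStorage as-is and the
-- schema declares companyIds as a nullable array of long.
STORE result INTO 'member_companies_avro'
    USING org.apache.pig.piggybank.storage.avro.AvroStorage('schema',
    '{"type": "record", "name": "MemberCompanies", "fields": [{"name": "memberId", "type": "long"}, {"name": "companyIds", "type": ["null", {"type": "array", "items": "long"}], "doc": "company ids"}]}');
```

The schema string is kept on one line and wrapped in single quotes so that the double quotes inside the JSON do not need escaping. Note that the Avro key is "items" (plural); a schema with "item" will not parse.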
