Monday 15 June 2015

Why are the results of a reduce function being fed back into reduce using MongoDB mapReduce?


I'm seeing a troublesome behavior when using Mongo for incremental map-reduce work. My documents look like this:

{_id: ..., url: 'some URL'}

Here's my simple map function:

  map: function () {
    emit(this.url, {count: 1, id: this._id});
  }

and reduce (with a lot of debugging prints for the log shown below):

  reduce: function (key, values) {
    var count = 0;
    var lastId = null;
    var first = null;
    if (typeof values[0].id == "undefined") {
      print("bad id");
      printjson(key);
      printjson(values[0]);
      return null;
    } else {
      print("good id");
      printjson(key);
      printjson(values[0]);
    }
    first = ObjectId(values[0].id).getTimestamp();
    values.forEach(function (v) {
      count += v.count;
      last = ObjectId(v.id).getTimestamp();
      lastId = v.id;
    });
    return {count: count, first: first, last: lastId, lastCounted: lastId};
  }

Here's how I call mapReduce:

  mrparams.out = {reduce: this.output};
  mrparams.limit = 100;
  mrparams.query = {'_id': {'$gt': mongo.ObjectID(lastId.toHexString())}};
  mrparams.finalize = null;
  mrdb.mapReduce(this.map, this.reduce, mrparams, function (d) {
    console.log("finished mr", d);
    callback();
  });

This runs cron-style: at each time interval it processes up to limit records, starting with the record after lastId.

Very basic incremental map-reduce stuff ...

But when I run it, I see the return values of the reduce method being passed back into the reduce method.

Here's a snapshot of the log:

XXX
good id "" {"count": 1, "id": ObjectId("5175a065b25f029a1d0927e6")}
good id "" {"count": 1, "id": ObjectId("5175a065d7f115dd41097df6")}
good id "" {"count": 1, "id": ObjectId("5175a0657c9c963654094d25")}

YYY
Thu Jun 20 11:42:11 [conn19938] query vox.system.indexes query: { ns: "vox.tmp.mr.pi_analytics_spark_trending_inventories_6667_inc" } nreturned:1 reslen:131 0ms
Thu Jun 20 11:42:11 [conn19938] query vox.tmp.mr.pi_analytics_spark_trending_inventories_6667 nreturned:9 reslen:1716 0ms

ZZZ
bad id "" {"count": 2, "first": ISODate("2013-04-22T20:41:11Z"), "last": ObjectId("5175a067b25f029a1d092802"), "lastCounted": ObjectId("5175a067b25f029a1d092802")}
bad id "" {"count": 7, "first": ISODate("2013-04-22T20:41:09Z"), "last": ObjectId("5175a067d7f115dd41097e3c"), "lastCounted": ObjectId("5175a067d7f115dd41097e3c")}

XXX: a bunch of records emitted from my map function (each value has a count and an id). YYY: some kind of Mongo log event. ZZZ: after that event, values produced by earlier reduce calls are passed into reduce ...

TL;DR: when I run my incremental map-reduce, everything works fine until some Mongo process runs. After that, the values returned by the reduce function start being passed back into the reduce function.

Any ideas why this happens, or how it is even possible?

Running 2.0.6

Thanks in advance

I figured out what is going on. When the output collection of a map-reduce job already contains results and the job runs with the reduce output option, Mongo passes both the newly emitted values and the existing document with the same key in the output collection back into reduce.

This works fine as long as the value you emit from map has the same format as the value you return from reduce.
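To make the point about matching formats concrete, here is a minimal sketch in plain JavaScript that runs without a database. It only simulates how an incremental run with the reduce output option feeds a stored value back into reduce. The names mapDoc, reduceValues and incrementalRun, the document shape, and the numeric ts field (standing in for the ObjectId timestamp) are assumptions made for illustration, not part of the original job.

```javascript
// Hypothetical document shape: { id: <number>, url: <string>, ts: <number> }
// `ts` stands in for ObjectId(...).getTimestamp() in the real job.

// map: emit a value that already has the same shape reduce returns.
function mapDoc(doc) {
  return {
    key: doc.url,
    value: { count: 1, first: doc.ts, last: doc.ts, lastCounted: doc.id }
  };
}

// reduce: accepts freshly emitted values and its own earlier output alike.
function reduceValues(key, values) {
  var result = {
    count: 0,
    first: values[0].first,
    last: values[0].last,
    lastCounted: values[0].lastCounted
  };
  values.forEach(function (v) {
    result.count += v.count;
    if (v.first < result.first) { result.first = v.first; }
    if (v.last >= result.last) {
      result.last = v.last;
      result.lastCounted = v.lastCounted;
    }
  });
  return result;
}

// Simplified simulation of one incremental run: the value already stored
// for a key is reduced together with the newly emitted values.
function incrementalRun(output, docs) {
  var grouped = {};
  docs.forEach(function (doc) {
    var emitted = mapDoc(doc);
    (grouped[emitted.key] = grouped[emitted.key] || []).push(emitted.value);
  });
  Object.keys(grouped).forEach(function (key) {
    var values = grouped[key];
    if (Object.prototype.hasOwnProperty.call(output, key)) {
      values = [output[key]].concat(values);
    }
    output[key] = reduceValues(key, values);
  });
  return output;
}

var output = {};
incrementalRun(output, [{ id: 1, url: 'a', ts: 10 }, { id: 2, url: 'a', ts: 20 }]);
incrementalRun(output, [{ id: 3, url: 'a', ts: 30 }]);
console.log(output.a); // { count: 3, first: 10, last: 30, lastCounted: 3 }
```

Because the emitted value and the reduced value share one shape, the second run can mix the stored result with new values without ever hitting the "bad id" branch.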

This is not well documented, but now that I understand it, my frustration has turned into a feeling of being a bit smarter. A painful lesson learned; good times ahead.
