Wednesday 15 January 2014

Weka XRFFSaver includes missing values for sparse instances


I am currently using the XRFFSaver class in the Weka dev version. I am using XRFF instead of ARFF because I have very sparse data, and the documentation indicates that sparse instances are handled properly and efficiently (i.e., missing values are not included in the output).

However, using XRFFSaver, they are output like this:

    <value index="1" missing="yes"/><value index="2" missing="yes"/>...

Which defeats the purpose of the whole exercise. Does anyone know whether this behavior is a bug, or do I need to write my own saver?

I had a quick look at the source, but I could not find any way to change this behavior in XRFFSaver or XMLInstances, although it was only a quick glance.

tnx

I quickly found a solution:

Note: this is in C# (which is what I use). However, it should be very easy for anyone to convert to Java.

Note 2: The only important line is "if (sparse) continue;", which I have also highlighted with comments below. Everything else is a direct copy of the Weka source, which I got through grepcode and Google. I am also not certain that the copy I took is the latest version, so please use it with discretion.

I have also tested that the standard XRFFLoader reads the resulting file properly, and it appears to.

TNX

    // Assumes the Java packages are exposed as .NET namespaces of the same name.
    using org.w3c.dom;
    using weka.core;
    using weka.core.converters;
    using weka.core.xml;

    // Usage
    var saver = new EfficientXRFFSaver();
    saver.setCompressOutput(file.EndsWith(".gz"));
    saver.setInstances(instances);
    saver.setFile(new java.io.File(file));
    saver.writeBatch();

    // Implementation
    public class EfficientXRFFSaver : XRFFSaver
    {
        public override void resetOptions()
        {
            base.resetOptions();
            setFileExtension(getCompressOutput()
                ? XRFFLoader.FILE_EXTENSION_COMPRESSED
                : XRFFLoader.FILE_EXTENSION);
            try
            {
                // Swap in the XML writer that skips missing sparse values.
                m_XMLInstances = new EfficientXMLInstances();
            }
            catch
            {
                m_XMLInstances = null;
            }
        }
    }

    public class EfficientXMLInstances : XMLInstances
    {
        protected override void addInstance(Element parent, Instance inst)
        {
            var node = m_Document.createElement(TAG_INSTANCE);
            parent.appendChild(node);

            var sparse = inst is SparseInstance;
            if (sparse)
            {
                node.setAttribute(ATT_TYPE, VAL_SPARSE);
            }
            if (inst.weight() != 1.0)
            {
                node.setAttribute(ATT_WEIGHT, Utils.doubleToString(inst.weight(), m_Precision));
            }

            for (var i = 0; i < inst.numValues(); i++)
            {
                var index = inst.index(i);
                var value = m_Document.createElement(TAG_VALUE);

                if (inst.isMissing(index))
                {
                    // !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
                    // !!!!!!!!!!  IMPORTANT  !!!!!!!!!!!
                    // !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
                    // If the value is missing and the instance is sparse,
                    // do not add this element at all.
                    if (sparse) continue;
                    value.setAttribute(ATT_MISSING, VAL_YES);
                }
                else
                {
                    if (inst.attribute(index).isRelationValued())
                    {
                        var child = m_Document.createElement(TAG_INSTANCES);
                        value.appendChild(child);
                        for (var n = 0; n < inst.relationalValue(i).numInstances(); n++)
                        {
                            addInstance(child, inst.relationalValue(i).instance(n));
                        }
                    }
                    else
                    {
                        value.appendChild(
                            inst.attribute(index).type() == weka.core.Attribute.NUMERIC
                                ? m_Document.createTextNode(Utils.doubleToString(inst.value(index), m_Precision))
                                : m_Document.createTextNode(validContent(inst.stringValue(index))));
                    }
                }

                // Appended only after the check above, so skipped values never reach the output.
                node.appendChild(value);
                if (sparse)
                {
                    value.setAttribute(ATT_INDEX, "" + (index + 1));
                }
            }
        }
    }
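For anyone who wants to see the skip rule on its own before porting the class to Java, here is a minimal Java sketch that uses only the JDK's DOM API and has no Weka dependency. The class name SparseValueWriter and its methods are hypothetical and are not part of Weka; Double.NaN stands in for a missing value, and the element and attribute names simply mirror the output shown in the question. It illustrates the idea and is not a drop-in replacement for the saver above.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical stand-alone helper for illustration; not part of Weka.
class SparseValueWriter {

    /** Creates an empty DOM document to build elements in. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /**
     * Builds an <instance> element from parallel arrays of stored attribute
     * indices (0-based) and values. Double.NaN marks a missing value.
     */
    static Element buildInstance(Document doc, int[] indices, double[] values, boolean sparse) {
        Element node = doc.createElement("instance");
        if (sparse) {
            node.setAttribute("type", "sparse");
        }
        for (int i = 0; i < indices.length; i++) {
            boolean missing = Double.isNaN(values[i]);
            // The key rule: a missing value in a sparse instance is not written at all.
            if (missing && sparse) {
                continue;
            }
            Element value = doc.createElement("value");
            if (missing) {
                value.setAttribute("missing", "yes");
            } else {
                value.setTextContent(Double.toString(values[i]));
            }
            if (sparse) {
                // Indices are 1-based in the output, matching the "+ 1" in the code above.
                value.setAttribute("index", String.valueOf(indices[i] + 1));
            }
            // Append last, so a skipped value never reaches the output.
            node.appendChild(value);
        }
        return node;
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        int[] indices = {0, 1, 4};
        double[] values = {Double.NaN, 2.5, Double.NaN};
        // Sparse: only the stored non-missing value is written.
        System.out.println(buildInstance(doc, indices, values, true).getChildNodes().getLength());
        // Dense: all three values are written, two of them flagged as missing.
        System.out.println(buildInstance(doc, indices, values, false).getChildNodes().getLength());
    }
}
```

With the sample data in main, the sparse instance ends up with a single value element (index "2") and the dense one keeps all three, which is the same difference the "if (sparse) continue;" line makes in the saver.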
