Saturday 15 February 2014

What is the shortest way in Scala/Java to grab all information from a website and parse it for buttons (including ID and class if present)?


I need to capture the IDs, classes, and any other information contained in the tag. I am working in Scala, though Java is fine. What I have is an exact-match parser: it only catches a bare "button", so a tag such as <button id="..."> is excluded. Is there another parser I should use, or should I write my own? This is what I have so far. Any help would be appreciated.

  // Currently (using Selenium WebDriver and scala.xml):
  // Open the browser and go to the page
  driver.get(URL)
  // Get the page source (XML, HTML, etc.) and load it as XML
  val xmlData = XML.loadString(driver.getPageSource)
  // Parse for buttons
  xmlData \ "button"

This is a "write my code for me" question and it should be closed, but at least you have tried something.

1) Parsing

You cannot parse HTML as XML, because HTML is generally not valid XML. You need an HTML parser.
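
To see why, here is a minimal sketch in Java (chosen because the question allows it and the JDK ships a strict XML parser). The class name StrictXmlCheck, the method isWellFormedXml, and the sample markup are made up for illustration. It feeds the parser a fragment with an unclosed <br> tag, which is ordinary HTML but not well-formed XML:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the original question or answer.
class StrictXmlCheck {
    /** Returns true if the markup is well-formed XML, false if the parser rejects it. */
    static boolean isWellFormedXml(String markup) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors instead of reporting them on stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isWellFormedXml("<p>Hello<br></p>"));  // prints false
        System.out.println(isWellFormedXml("<p>Hello<br/></p>")); // prints true
    }
}
```

Real pages contain many constructs like this, which is why a lenient HTML parser is the safer first step.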

2) Search for button

Use \\ instead of \ to search in all descendants rather than only the direct children.

3) Get attributes

The \ or \\ method gives you a NodeSeq. You should iterate over it to process each Node object.

  for (node <- nodeSeq) yield ???

You can use the \ method with an attribute name prefixed by @ to select an attribute of an element, and the text method to get its value:

  val id = (node \ "@id").text
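
For anyone who prefers Java, steps 2 and 3 can be sketched with the JDK's DOM API, assuming the page source has already been turned into well-formed XML (for example, by an HTML parser). The class name ButtonExtractor, the method describeButtons, and the sample markup are hypothetical. Here getElementsByTagName plays the role of \\ (it searches all descendants) and getAttribute plays the role of \ "@id":

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; assumes the input is already well-formed XML.
class ButtonExtractor {
    /** Returns one "id=... class=..." string per button element, in document order. */
    static List<String> describeButtons(String wellFormedMarkup) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(wellFormedMarkup)));
        // Searches the whole tree, not just the direct children of the root.
        NodeList buttons = doc.getElementsByTagName("button");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < buttons.getLength(); i++) {
            Element button = (Element) buttons.item(i);
            // getAttribute returns an empty string when the attribute is absent.
            result.add("id=" + button.getAttribute("id")
                    + " class=" + button.getAttribute("class"));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body><div><button id=\"ok\" class=\"primary\">OK</button></div>"
                + "<button>Cancel</button></body></html>";
        describeButtons(page).forEach(System.out::println);
    }
}
```

Because a missing attribute comes back as an empty string, buttons without an id or class are still reported rather than skipped.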
