Giuseppe: python - Scrapy: Return only first XPATH match (one result per page) -

Tuesday, 15 June 2010


I am scraping several XPaths/variables from a webpage, and only one of these XPaths matches more than once per page. I need it to return a single match so that my results line up one row per page.

How can I extract only the first instance of my XPath selector on each page before moving on to the next start_url? Thanks!

    from scrapy.selector import HtmlXPathSelector

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        items = []
        item = FleechNaiarItem()
        # .re() returns a list of regex captures; fall back to [''] when nothing matches
        item["age"] = hxs.select('//li[contains(@class, "our age")]/span/text()').re(r'\n\s*(.*)\n') or ['']
        item["product"] = hxs.select('//div[@id="price-review-age"]/h1/text()').extract()
        item["value"] = hxs.select('//dd[contains(@class, "our")]/text()').re(r'\n\s*(.*)\n')
        item["availability"] = hxs.select('//div[contains(@class, "in-stock")]/text()').extract()
        items.append(item)
        return items

I'm not completely sure what you're asking. You can select the first match like this: "/path/element[1]"
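One detail worth checking with the "[1]" predicate: it is evaluated per parent, so "//li/span[1]" returns the first span inside every li, not the first span on the page. In full XPath the whole expression can be wrapped in parentheses, "(//li/span)[1]", to get a single result. The sketch below illustrates the difference using the standard library's ElementTree as a stand-in for Scrapy's selector. The markup is made up, and because ElementTree does not support the parenthesised form, find() is used to take the first match in document order.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment: two <li> blocks, each with two <span> children.
PAGE = """
<ul>
  <li class="age"><span>3+</span><span>ignored</span></li>
  <li class="age"><span>5+</span><span>ignored</span></li>
</ul>
"""

root = ET.fromstring(PAGE)

# [1] is applied per parent: the first <span> inside EACH <li>.
first_per_parent = [s.text for s in root.findall(".//li/span[1]")]

# find() stops at the first match in document order: one result per page.
first_overall = root.find(".//li/span").text

print(first_per_parent)  # ['3+', '5+']
print(first_overall)     # 3+
```

If the repeated element sits under several different parents, the per-parent form still yields several results, so the single-result form is likely the one that fits "one result per page".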

Or maybe you want this:

"//following::elementlooking[preceding::initialization[@id='start_url'][1]]"

Here "start_url" is the value of the id attribute on the element you start from.
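A different route from the XPath predicates above is to trim the result in Python. In the question's code, .extract() and .re() both return lists, so keeping only the first usable entry gives one value per page. This is a minimal sketch: first_match is a hypothetical helper, not part of Scrapy, and the sample lists are invented.

```python
def first_match(matches, default=""):
    """Return the first non-empty match with whitespace stripped, else `default`."""
    for value in matches:
        value = value.strip()
        if value:
            return value
    return default


# Lists shaped like what .extract() or .re() would return (made-up values).
availability = first_match(["\n  In stock\n", "In stock (2 left)"])
missing = first_match([], default="n/a")

print(availability)  # In stock
print(missing)       # n/a
```

Inside the spider this would be used as, for example, item["availability"] = first_match(hxs.select(...).extract()). Newer Scrapy releases also provide extract_first() and get() on selector lists, which return the first match directly if your version has them.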




Posted by Unknown at 03:22










