Giuseppe: python - Remove unnecessary repeated tags with BeautifulSoup

Saturday, 15 January 2011


I am using Python and Beautiful Soup to extract some text from HTML. Some of the HTML has text in this form:

  <h3><b>ABC</b> <b>DEF</b></h3>

I want to remove the repeated b tag. Is there a quick way to do this?
 
It works just fine with BS4:

  In [4]: soup.h3
  Out[4]: <h3><b>ABC</b> <b>DEF</b></h3>

  In [5]: soup.h3.text
  Out[5]: u'ABC DEF'

See the BS4 documentation and package for details.
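If the goal is to rewrite the markup itself rather than only read the text, something along these lines might work. This is a minimal sketch, not the answerer's method. It assumes Python 3 with the bs4 package and the built-in "html.parser" backend, and that the two b tags are separated by a space, as the 'ABC DEF' output suggests. The helper name merge_b_tags is made up for illustration. Note that it flattens any other markup inside the h3.

```python
from bs4 import BeautifulSoup

# Sample markup from the question (the space between the b tags is an assumption).
html = "<h3><b>ABC</b> <b>DEF</b></h3>"

# Reading the text alone already ignores the repeated tags.
soup = BeautifulSoup(html, "html.parser")
print(soup.h3.get_text())  # ABC DEF


def merge_b_tags(markup):
    """Hypothetical helper: collapse several b tags in each h3 into one."""
    soup = BeautifulSoup(markup, "html.parser")
    for h3 in soup.find_all("h3"):
        # Leave headings with zero or one b tag untouched.
        if len(h3.find_all("b")) < 2:
            continue
        text = h3.get_text()       # combined text of the whole heading
        h3.clear()                 # drop the old children
        new_b = soup.new_tag("b")  # one replacement b tag
        new_b.string = text
        h3.append(new_b)
    return str(soup)


print(merge_b_tags(html))  # <h3><b>ABC DEF</b></h3>
```

For plain text extraction, soup.h3.text (or get_text()) as in the answer above is enough. The helper only matters when the cleaned HTML has to be written back out.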

 




Posted by Unknown at 02:22