Giuseppe: regex - Match Chinese character in Perl

Friday, 15 May 2015

I know this question has been asked before. I checked all the previous answers but still could not solve my problem, so please forgive what looks like a duplicate question.

I am writing a Perl program to process a text file in Chinese. I want to recognize the Chinese text and exclude all other lines, such as English or other languages and URLs. If I "use utf8", then neither "$line =~ /(\p{Han}+)/" nor "$line =~ /信息/" matches anything. If I do not "use utf8", then "$line =~ /信息/" works but "$line =~ /(\p{Han}+)/" does not. I checked the encoding of the text file with "file -bi input.txt", and it shows "text/plain; charset=utf-8". The code is as follows:

    $| = 1;
    use strict;
    use utf8;

    my $in = $ARGV[0];

    sub main {
        open(IN, "<", $in) or die "Can not open $in\n";
        while (my $line = <IN>) {
            if ($line =~ /(\p{Han}+)/) {
                print "sugar: $line\n";
            }
            if ($line =~ /信息/) {
                print "$line\n";
            }
        }
        close(IN);
    }

    main();

Thanks in advance for any help and advice!
 
You have to open the file as UTF-8:

    open IN, "<:encoding(UTF-8)", $in or die "Can not open $in\n";

Otherwise each line is read as a byte string, which is not what you want: \p{Han} tests decoded characters, not raw UTF-8 bytes, so it never matches.
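As a rough sketch of how that fix might look in a complete script: the helper names `han_lines` and `has_han` below are made up for illustration and do not come from the question. It assumes the input file really is UTF-8 and uses a lexical filehandle instead of the bareword `IN`.

```perl
use strict;
use warnings;
use utf8;                      # string literals in this source file are UTF-8
use Encode qw(encode);

# Hypothetical helper: returns 1 if the string contains a Han character.
# Only meaningful for decoded character strings, not raw bytes.
sub has_han {
    my ($s) = @_;
    return $s =~ /\p{Han}/ ? 1 : 0;
}

# Hypothetical helper: returns the lines of a UTF-8 file that contain
# at least one Han character (English-only lines and URLs are skipped).
sub han_lines {
    my ($path) = @_;
    # The :encoding(UTF-8) layer decodes bytes into characters on read.
    open(my $fh, '<:encoding(UTF-8)', $path)
        or die "Can not open $path: $!\n";
    my @hits = grep { has_han($_) } <$fh>;
    close($fh);
    return @hits;
}

# Encode on output so the matched lines print as UTF-8 again.
binmode(STDOUT, ':encoding(UTF-8)');

if (@ARGV) {
    print for han_lines($ARGV[0]);
}
```

Run as `perl script.pl input.txt`, this should print only the lines containing Chinese text. Calling `has_han(encode('UTF-8', "信息"))` returns 0, which mirrors the original problem: the same text as undecoded bytes contains no character that \p{Han} recognizes.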

 




Posted by Unknown at 03:22










