Saturday 15 August 2015

bash - Extract occurrences of character in file line by line


I have a large bilingual dictionary file whose lines are formatted like this:

  Abatement: disminucion; Mitigacion; Moderacion; Rebaja; Deduccion; Supression; Anulacion   

I am trying to figure out which word has the most translations. To do so, I want to find the line with the most occurrences of ; and then echo the English word from that line.

I have been able to get something close, but it uses sed to trim the data, which means I cannot recover the English word from the line.

Any thoughts?

  awk -F '[:;]' 'NF > n {n = NF; w = $1} END {print w}' filename
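
The one-liner above splits each line on : and ; and remembers the first field of the line with the most fields. A minimal sketch of the same idea that also reports the number of translations might look like the following. The function name most_translated is hypothetical, and the sketch assumes that every entry has exactly one colon after the headword and that no line ends with a trailing semicolon (which would count as an extra, empty translation). On ties, the first such line wins.

```shell
# Hypothetical helper: print the headword with the most translations,
# followed by the translation count.
# Assumes lines look like "Word: t1; t2; t3" with no trailing ';'.
most_translated() {
    awk -F '[:;]' '
        # Keep the line with the highest field count seen so far.
        NF > max { max = NF; word = $1 }
        END {
            if (max > 0) {
                sub(/^[ \t]+/, "", word)    # drop leading indentation
                print word ": " (max - 1)   # fields minus the headword
            }
        }
    ' "$1"
}
```

Calling it as most_translated filename on a file whose longest entry is the sample line shown earlier would print Abatement: 7.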
