I am trying to read a large text file and count the occurrences of specific errors. For example, for the following sample text:

    some
    bla
    error123
    foo
    test
    error123
    line
    junk
    error 55
    more
    stuff

I want to end up with (I don't really care about the exact data structure, though I'm thinking of a map):

    error123 - 2
    error 55 - 1

Here is what I've tried so far:
    (read-big-file find-error "sample.txt")

returns:

    (nil nil "error123" nil nil "error123" nil nil "error 55" nil nil)
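(For context, the question never shows read-big-file or find-error. The following is only a hypothetical sketch of definitions that would produce a nil-padded sequence like the one above; the real implementations may differ.)

    ;; Hypothetical helpers, not from the original question.
    (require '[clojure.string :as str])

    (defn find-error [line]
      ;; re-find returns the matched error text, or nil when the line has none
      (re-find #"error ?\d+" line))

    (defn read-big-file [f filename]
      ;; naive: slurps the whole file into memory, then applies f to every line
      (map f (str/split-lines (slurp filename))))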
Next I tried to remove the nil values and group like items:

    (group-by identity (remove nil? (read-big-file find-error "sample.txt")))

which gives:
    {"error123" ["error123" "error123"], "error 55" ["error 55"]}
This is getting closer to the desired output, though it probably isn't efficient. How can I get from here to the counts? Also, since I'm new to Clojure and functional programming, I'd appreciate any suggestions about how I could improve this. Thanks!
I think you are looking for the frequencies function:
    user=> (doc frequencies)
    -------------------------
    clojure.core/frequencies
    ([coll])
      Returns a map from distinct items in coll to the number of times
      they appear.
    nil
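As a quick illustration (this REPL snippet is mine, not part of the original answer), frequencies on a small literal collection behaves like this:

    user=> (frequencies ["error123" "junk" "error123" "error 55"])
    {"error123" 2, "junk" 1, "error 55" 1}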
So it should do exactly what you want:

    (frequencies (remove nil? (read-big-file find-error "sample.txt")))
    ;; => {"error123" 2, "error 55" 1}

If your text file is really big, however, I would recommend doing the line-seq inline, to make sure you don't run out of memory. Instead of map and remove you can then use filter:

    (require '[clojure.java.io :as io])

    (defn count-lines [pred filename]
      (with-open [rdr (io/reader filename)]
        (frequencies (filter pred (line-seq rdr)))))

    (defn is-error-line? [line]
      (re-find #"error" line))

    (count-lines is-error-line? "sample.txt")
    ;; => {"error123" 2, "error 55" 1}
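As a side note that isn't part of the original answer: if you wanted to stay with your group-by approach, a sketch of one way to turn the grouped map into counts (reusing your read-big-file and find-error) is to map count over the values, although frequencies does the same thing more directly:

    ;; Count the items in each group of the group-by result.
    (into {}
          (map (fn [[k v]] [k (count v)]))
          (group-by identity
                    (remove nil? (read-big-file find-error "sample.txt"))))
    ;; => {"error123" 2, "error 55" 1}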