Tuesday 15 July 2014

shell - Chunk a large file based on regex (Linux)


I have a large text file and I want to split it into smaller files based on the distinct values in one column. The columns are separated by commas (it is a CSV file), and the values vary:

e.g.

  1012739937,2006-11-28,Di_02245211
  1012739937,2006-11-28,Di_02238545
  1012739937,2006-11-28,Di_02236564
  1012739937,2006-11-28,Di_01918338
  1012739937,2006-11-28,d_02148765
  1012739937,2006-11-28,Di_00868949
  1012739937,2006-11-28,Di_01908448
  1012740478,1981-06-26,Di_01913689
  1012740478,1989-2-26,ISBN
  1012740478,1998-06-26,Di_02174766

I want to split the file into smaller files, with the records for one year in each file (one file for the 2006 records, one for the 1998 records, etc.).

(The number of distinct values here may be limited, but I want to be able to do the same thing with the different values of any specific column.)

You can use awk:
  awk -F, '{split($2, d, "-"); print > d[1]}' file

Explanation:

  -F,                 sets the input field separator to ','
  split($2, d, "-")   splits the second column (the date) on '-' and stores the pieces in the array 'd'
  print > d[1]        prints the entire input line to a file named after the year
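
The question also asks about doing the same thing for any column. Here is a minimal sketch of one way the one-liner might be generalized, assuming the key is whatever comes before the first '-' in the chosen column. The `col` variable, the `sample.csv` file, and the `.csv` suffix on the output names are made up for illustration and are not part of the original answer. The sketch also trims stray spaces around the field and closes each output file after writing, which should help with awk implementations that limit the number of open files, at the cost of some speed.

```shell
# Work in a scratch directory so the per-key files do not clutter anything else.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Small sample in the same layout as the question (id,date,code).
cat > sample.csv <<'EOF'
1012739937,2006-11-28,Di_02238545
1012739937,2006-11-28,Di_02236564
1012740478,1998-06-26,Di_02174766
EOF

# col is the 1-based column to split on (hypothetical variable name).
awk -F, -v col=2 '{
    key = $col
    gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim stray spaces around the field
    split(key, d, "-")                 # d[1] is the part before the first "-"
    out = d[1] ".csv"
    print >> out                       # append, so reopening does not truncate
    close(out)                         # keep only one output file open at a time
}' sample.csv

# List the files that were produced.
ls
```

Because the output is appended, any files left over from an earlier run would need to be removed first. Setting `col=1` would group on the first column instead; a field with no '-' in it is used whole as the key.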
