Monday 15 July 2013

r - How to separate rows in a column based on a specification?


I have a matrix with 2 columns; the beginning of the matrix is shown below:

        SNP          PI1
   [1,] "SNP_Label"  "PI1"
   [2,] "rs482519"   "0.3722219"
   [3,] "rs12196956" "0.3212364"
   [4,] "CNV548726"  "0.3112315"
   [5,] "CNV356212"  "0.3078721"
   [6,] "rs4792617"  "0.3023402"
   [7,] "CNV2095401" "0.297626"
   [8,] "CNV4528251" "0.29391"
   [9,] "rs9369426"  "0.2860793"
  [10,] "rs31672"    "0.2790241"
  [11,] "rs1323446"  "0.2778401"

The specification is that I want to separate the SNPs that start with "rs" from the SNPs that start with "CNV", and get a new matrix for each of the 2 types of SNPs with their respective PI1 values. The SNP names are in no particular order, so whether a row holds an "rs" or a "CNV" entry varies randomly from row to row.

I think I need to loop over the first 2 letters of each entry in the SNP column, but I do not know whether that is the right approach.

Create a data.frame and then do this:

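  # Strip the digits so only the alphabetic prefix ("rs" or "CNV") remains
  mylabel <- gsub("[0-9]", "", my.df[[1]])
  # Split the data.frame into one piece per prefix
  list.of.dfs <- split(my.df, mylabel)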

How it works: after the gsub call, mylabel retains only the alphabetic identifier from your first column. split then divides the data.frame into pieces based on those identifiers.
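To make the label step concrete, here is a minimal sketch run on four SNP names taken from the question. The vector name `snp` and the anchored `sub()` variant are illustrative assumptions, not part of the original answer; the variant is shown only because it would also cope with names that have digits in the middle of the prefix.

```r
# Four SNP names copied from the question (the vector name `snp` is hypothetical)
snp <- c("rs482519", "CNV548726", "rs4792617", "CNV356212")

# Remove every digit; what is left is the alphabetic prefix
mylabel <- gsub("[0-9]", "", snp)
print(mylabel)

# Assumed alternative: keep only the leading run of letters
mylabel2 <- sub("^([A-Za-z]+).*$", "\\1", snp)
print(identical(mylabel, mylabel2))
```

For names of the form shown in the question, both versions give the same labels, so no explicit loop over the first letters is needed.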

You will get a list of data.frames, one for each unique label. After that you can select the ones named 'rs' and 'CNV'.

  $CNV
            V1        V2
  3  CNV548726 0.3112315
  4  CNV356212 0.3078721
  6 CNV2095401  0.297626
  7 CNV4528251   0.29391

  $rs
             V1        V2
  1    rs482519 0.3722219
  2  rs12196956 0.3212364
  5   rs4792617 0.3023402
  8   rs9369426 0.2860793
  9     rs31672 0.2790241
  10  rs1323446 0.2778401
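For completeness, the whole route from the question's matrix to two separate matrices might look like the sketch below. It assumes the matrix still carries its header row as the first data row, as in the question, and it uses only the first five data rows. The names `m`, `rs.mat` and `cnv.mat` are hypothetical.

```r
# Rebuild the first rows of the question's matrix, header row included
# (the object name `m` is an assumption)
m <- matrix(c("SNP_Label",  "PI1",
              "rs482519",   "0.3722219",
              "rs12196956", "0.3212364",
              "CNV548726",  "0.3112315",
              "CNV356212",  "0.3078721",
              "rs4792617",  "0.3023402"),
            ncol = 2, byrow = TRUE)

# Drop the header row and convert to a data.frame (columns become V1, V2)
my.df <- as.data.frame(m[-1, ], stringsAsFactors = FALSE)

# Label each row by its alphabetic prefix, then split on that label
mylabel <- gsub("[0-9]", "", my.df[[1]])
list.of.dfs <- split(my.df, mylabel)

# Pick out each group by name and turn it back into a matrix
rs.mat  <- as.matrix(list.of.dfs[["rs"]])
cnv.mat <- as.matrix(list.of.dfs[["CNV"]])

print(rs.mat)
print(cnv.mat)
```

The PI1 values stay as character strings here because the source matrix is a character matrix; if numbers are needed, `as.numeric()` can be applied to the V2 column of each data.frame before converting back.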
