Monday 15 March 2010

r - One-byte separator argument in read.table()


I am reading in tables with unpredictable character sets but an expected number of columns.

For example, a sample table may look like this:

FILENAME: foo.txt

SEPARATOR: "\u00AA"

ROW1, COL1: foo

ROW1, COL2: b, r

ROW1, COL3: fo; Ober

ROW1, COL4: BO \ tt

and so on.

In R, I run

read.table('foo.txt', sep = "\u00AA")

and get the error

invalid 'sep' value: must be one byte

What separator should I use to avoid conflicts with unexpected strings in the data? ASCII covers the Unicode code points up to \u007F, but R treats anything higher as multi-byte. Why?
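One possible workaround, sketched here under assumptions, is to bypass the sep argument entirely: read the file as lines and split each line on the multi-byte separator with strsplit(fixed = TRUE). The sample rows, the temporary file, and the variable names below are hypothetical and only loosely mirror the table above; the sketch also assumes the file is UTF-8 encoded, where U+00AA occupies two bytes, which is why read.table() rejects it as a separator.

```r
# Hypothetical sample data: two rows, four columns, separated by U+00AA.
sep_char <- "\u00AA"
rows <- c(
  paste("foo", "b,r", "fo;o", "bo\tt", sep = sep_char),
  paste("bar", "c#d", "x y", "z", sep = sep_char)
)

# Write the rows as raw UTF-8 bytes so the result does not depend on the locale.
path <- tempfile(fileext = ".txt")
writeLines(enc2utf8(rows), path, useBytes = TRUE)

# In UTF-8 the separator is encoded as two bytes (0xC2 0xAA).
nchar(enc2utf8(sep_char), type = "bytes")

# Read the lines back, declaring them as UTF-8, and split on the literal separator.
lines <- readLines(path, encoding = "UTF-8")
fields <- strsplit(lines, sep_char, fixed = TRUE)

# Check the expected number of columns before building the data frame.
stopifnot(all(lengths(fields) == 4))
df <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)
print(df)
```

Splitting on a literal string means commas, semicolons, tabs and "#" inside a field are left alone. One caveat: strsplit() drops a trailing empty field, so a line whose last column is empty would come back one field short and trip the column check.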

The way to debug input problems is to first run table(count.fields('file.nam')) to see how many fields each line has. Then find the odd lines with oddities <- which(count.fields('file.nam') %in% odd_counts) and look at the offending lines with readLines('file.nam')[oddities]. Often the problem is the comment character, which is "#" by default, and in those cases the solution is comment.char = "". Alternatively, call read.delim(), whose default is comment.char = "".
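The debugging steps above can be sketched roughly as follows. The temporary tab-separated file, its contents, and the expected column count of 3 are made up for illustration; odd_counts is derived here as every field count that differs from the expected one, which is an assumption about how it would be chosen.

```r
# Hypothetical tab-separated file in which one line contains a "#".
path <- tempfile(fileext = ".txt")
writeLines(c("a\tb\tc", "1\t2\t3", "4\t5 # note\t6", "7\t8\t9"), path)

# Step 1: tabulate the number of fields per line.
# count.fields() treats "#" as a comment character by default.
n_fields <- count.fields(path, sep = "\t")
print(table(n_fields))

# Step 2: find the lines whose field count is not the expected one.
expected <- 3
odd_counts <- setdiff(unique(n_fields), expected)
oddities <- which(n_fields %in% odd_counts)

# Step 3: look at the offending lines.
print(readLines(path)[oddities])

# Step 4: if "#" is the culprit, switch comment handling off.
fixed <- read.table(path, sep = "\t", header = TRUE,
                    comment.char = "", stringsAsFactors = FALSE)
print(fixed)
```

The indices from count.fields() line up with readLines() here because the sample file has no blank lines; count.fields() skips blank lines by default, so a file that contains them would need blank.lines.skip = FALSE or a manual adjustment.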
