This is useful if you need to do some manual munging: you can read the
columns in as character, clean them up with (e.g.) regular expressions, and
then let readr take another stab at parsing them. The name is a homage to
the base utils::type.convert().
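The workflow described above can be sketched roughly as follows. The data frame and its column names (`price`, `qty`) are made up for illustration, and the regular expression is only one possible clean-up step.

```r
library(readr)

# Hypothetical raw data: the price column carries a currency symbol,
# so it has to be kept as character until it is cleaned.
raw <- data.frame(
  price = c("$1.50", "$20.00", "$3.25"),
  qty = c("1", "2", "3"),
  stringsAsFactors = FALSE
)

# Strip the leading dollar sign with a regular expression.
raw$price <- sub("^\\$", "", raw$price)

# Let readr guess the column types again on the cleaned strings.
clean <- type_convert(raw)
str(clean)
```

With `col_types = NULL`, readr also prints the column specification it guessed, which is a convenient check that the clean-up worked.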
Usage
type_convert(
df,
col_types = NULL,
na = c("", "NA"),
trim_ws = TRUE,
locale = default_locale(),
guess_integer = FALSE,
guess_max = NA,
verbose = FALSE
)
Arguments
- df
A data frame.
- col_types
One of NULL, a cols() specification, or a string. If NULL, column types will be imputed using all rows.
- na
Character vector of strings to interpret as missing values. Set this option to character() to indicate no missing values.
- trim_ws
Should leading and trailing whitespace (ASCII spaces and tabs) be trimmed from each field before parsing it?
- locale
The locale controls defaults that vary from place to place. The default locale is US-centric (like R), but you can use locale() to create your own locale that controls things like the default time zone, encoding, decimal mark, big mark, and day/month names.
- guess_integer
If TRUE, guess integer types for whole numbers; if FALSE, guess numeric type for all numbers.
- guess_max
Maximum number of data rows to use for guessing column types. NA: uses all data.
- verbose
Whether to print messages.
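As a rough illustration of how col_types, na, and locale interact, the sketch below pins both column types explicitly, treats "-" as a missing value, and parses numbers written with a decimal comma. The data frame and its column names (`id`, `amount`) are hypothetical.

```r
library(readr)

# Hypothetical input: ids with leading zeros, amounts with a decimal comma,
# and "-" used as a placeholder for missing values.
df <- data.frame(
  id = c("001", "002", "003"),
  amount = c("1,5", "2,25", "-"),
  stringsAsFactors = FALSE
)

out <- type_convert(
  df,
  # Keep id as character so the leading zeros survive; parse amount as double.
  col_types = cols(id = col_character(), amount = col_double()),
  # Treat "-" as missing in addition to the defaults.
  na = c("", "NA", "-"),
  # Use a comma as the decimal mark when parsing numbers.
  locale = locale(decimal_mark = ",")
)
str(out)
```

Supplying an explicit specification avoids relying on type guessing for columns whose intended type is already known.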
Examples
df <- data.frame(
x = as.character(runif(10)),
y = as.character(sample(10)),
stringsAsFactors = FALSE
)
str(df)
#> 'data.frame': 10 obs. of 2 variables:
#> $ x: chr "0.735319598810747" "0.195956733077765" "0.980539674637839" "0.741521529154852" ...
#> $ y: chr "4" "2" "3" "1" ...
str(type_convert(df))
#> 'data.frame': 10 obs. of 2 variables:
#> $ x: num 0.7353 0.196 0.9805 0.7415 0.0514 ...
#> $ y: num 4 2 3 1 6 8 5 7 9 10
df <- data.frame(x = c("NA", "10"), stringsAsFactors = FALSE)
str(type_convert(df))
#> 'data.frame': 2 obs. of 1 variable:
#> $ x: num NA 10
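In the first example above, the whole-number column y is converted to numeric because guess_integer defaults to FALSE. A minimal sketch of the alternative, assuming an installed readr version whose type_convert() accepts guess_integer as listed under Usage, with made-up data:

```r
library(readr)

df <- data.frame(
  x = c("0.5", "1.5"),
  y = c("1", "2"),
  stringsAsFactors = FALSE
)

# With guess_integer = TRUE, columns that contain only whole numbers
# are guessed as integer; columns with fractional values stay double.
out <- type_convert(df, guess_integer = TRUE)
str(out)
```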