This is useful if you need to do some manual munging: you can read the
columns in as character, clean them up with (e.g.) regular expressions, and
then let readr take another stab at parsing them. The name is a homage to
the base utils::type.convert().
Usage
type_convert(
df,
col_types = NULL,
na = c("", "NA"),
trim_ws = TRUE,
locale = default_locale(),
guess_integer = FALSE,
guess_max = NA,
verbose = FALSE
)
Arguments
- df
A data frame.
- col_types
One of NULL, a cols() specification, or a string. If NULL, column types will be imputed using all rows.
- na
Character vector of strings to interpret as missing values. Set this option to character() to indicate no missing values.
- trim_ws
Should leading and trailing whitespace (ASCII spaces and tabs) be trimmed from each field before parsing it?
- locale
The locale controls defaults that vary from place to place. The default locale is US-centric (like R), but you can use locale() to create your own locale that controls things like the default time zone, encoding, decimal mark, big mark, and day/month names.
- guess_integer
If TRUE, guess integer types for whole numbers; if FALSE, guess numeric type for all numbers.
- guess_max
Maximum number of data rows to use for guessing column types. NA uses all data.
- verbose
Whether to print messages.
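As a rough illustration of how the arguments above combine, the sketch below supplies an explicit cols() specification, an extra missing-value string, and a locale with a comma decimal mark, then makes a second call that relies on guessing with guess_integer = TRUE. The data frame and its column names (id, score) are made up for this example; the second call assumes a readr version whose type_convert() has the guess_integer argument shown in Usage.

```r
library(readr)

# Hypothetical input: all columns are character, decimals use a comma,
# and "n/a" marks a missing value.
df <- data.frame(
  id    = c("1", "2", "3"),
  score = c("1,5", "n/a", "3,25"),
  stringsAsFactors = FALSE
)

# Explicit column types, an extra NA string, and a comma decimal mark.
out <- type_convert(
  df,
  col_types = cols(id = col_integer(), score = col_double()),
  na = c("", "NA", "n/a"),
  locale = locale(decimal_mark = ",")
)
str(out)

# No col_types given: types are guessed, and whole numbers become integer
# because guess_integer = TRUE.
ints <- type_convert(df["id"], guess_integer = TRUE)
str(ints)
```

Passing col_types explicitly skips guessing for those columns, which may be preferable when the cleaned strings are known to follow a fixed format.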
Examples
df <- data.frame(
x = as.character(runif(10)),
y = as.character(sample(10)),
stringsAsFactors = FALSE
)
str(df)
#> 'data.frame': 10 obs. of 2 variables:
#> $ x: chr "0.0807501375675201" "0.834333037259057" "0.600760886212811" "0.157208441523835" ...
#> $ y: chr "6" "9" "5" "8" ...
str(type_convert(df))
#> 'data.frame': 10 obs. of 2 variables:
#> $ x: num 0.0808 0.8343 0.6008 0.1572 0.0074 ...
#> $ y: num 6 9 5 8 7 2 10 3 1 4
df <- data.frame(x = c("NA", "10"), stringsAsFactors = FALSE)
str(type_convert(df))
#> 'data.frame': 2 obs. of 1 variable:
#> $ x: num NA 10
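The manual-munging workflow mentioned in the description could look something like the following sketch: character columns are cleaned with regular expressions and then handed back to type_convert(). The data frame, its column names (price, qty), and the patterns are invented for illustration only.

```r
library(readr)

# Hypothetical messy input: every column is still character.
raw <- data.frame(
  price = c("$1,200", "$850", "$2,400"),
  qty   = c("3 units", "10 units", "7 units"),
  stringsAsFactors = FALSE
)

# Strip the currency symbol and thousands separators, then the unit suffix.
raw$price <- gsub("[$,]", "", raw$price)
raw$qty   <- sub(" units$", "", raw$qty)

# Let readr guess the column types from the cleaned strings.
clean <- type_convert(raw)
str(clean)
```

With the default guess_integer = FALSE, both cleaned columns should come back as numeric (double) rather than integer.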