R - Extract alphanumeric strings from text
Background
Related question (not required reading).
Question
I have the string
str_temp <- "{type: [{a: a1, timestamp: 1}, {a:a2, timestamp: 2}]}"
from which I want to extract the 7 alphanumeric substrings: type, a, a1, timestamp, a, a2, timestamp. However, I can't get my regex to work.
I have tried both base R and library(stringr), using various combinations of [:word:], [:alnum:], [:alpha:], etc.
One example:
> pattern <- "[:word:]"
> str_locate_all(str_temp, pattern)
[[1]]
     start end
[1,]     6   6
[2,]    11  11
[3,]    26  26
[4,]    34  34
[5,]    48  48
But this only gives me the end points of the strings type, a, timestamp, a, timestamp, not their start points, and nothing at all for a1 or a2.
What is the correct regex for extracting the 7 alphanumeric strings?
Here is a regex that works. It matches alphanumeric words but not bare numbers.
((?![0-9]+)[a-zA-Z0-9]+)
http://www.rubular.com/r/euf9afdtxw
Thanks to Richard for showing how to use it in R:
regmatches(str_temp, gregexpr("((?![0-9]+)[a-zA-Z0-9]+)", str_temp, perl = TRUE))[[1L]]
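For anyone who wants to run this end to end, here is a minimal sketch in base R. The names pattern, tokens and all_runs are my own for illustration, and the pattern is a slightly shortened form of the one above (I am assuming the outer group and the + inside the lookahead can be dropped without changing the matches on this input). The second call is only for comparison: it uses the POSIX class inside a bracket expression, which also returns the bare numbers.

```r
# The string from the question.
str_temp <- "{type: [{a: a1, timestamp: 1}, {a:a2, timestamp: 2}]}"

# (?![0-9]) is a negative lookahead: a match may not START on a digit,
# so the bare numbers 1 and 2 are skipped while a1 and a2 still match.
# This shortened pattern is an illustrative variant, not the original.
pattern <- "(?![0-9])[a-zA-Z0-9]+"

# perl = TRUE is required because lookaheads are a PCRE feature.
tokens <- regmatches(str_temp, gregexpr(pattern, str_temp, perl = TRUE))[[1L]]
print(tokens)

# For comparison: every run of alphanumeric characters, numbers included.
all_runs <- regmatches(str_temp, gregexpr("[[:alnum:]]+", str_temp))[[1L]]
print(all_runs)
```

On this input tokens holds the 7 strings asked for, while all_runs has 9 elements because it also picks up 1 and 2. One caveat with the lookahead approach: a token that begins with a digit, such as 1a, would not be dropped entirely but matched from its first letter onward (giving a).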