R - Extract alphanumeric strings from text
Background
Related question (not required reading).
Question
I have a string:
str_temp <- "{type: [{a: a1, timestamp: 1}, {a:a2, timestamp: 2}]}"
from which I want to extract 7 alphanumeric substrings: type, a, a1, timestamp, a, a2, timestamp. However, I can't get the regex to work.
I have tried both base R and library(stringr), using various combinations of [:word:], [:alnum:], [:alpha:], etc.
One example:
> pattern <- "[:word:]"
> str_locate_all(str_temp, pattern)
[[1]]
     start end
[1,]     6   6
[2,]    11  11
[3,]    26  26
[4,]    34  34
[5,]    48  48
But this gives me only the end points of the strings type, a, timestamp, a, timestamp, not their start points, and nothing for either a1 or a2.
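The reported positions can be checked against the string in base R. The minimal sketch below (the names `pos`, `chars`, and `colon_pos` are illustrative, not from the original post) looks up the character at each reported position. It suggests that the matches are the colon characters that follow each key, which would mean the pattern was read as a set of literal characters rather than as a word class.

```r
str_temp <- "{type: [{a: a1, timestamp: 1}, {a:a2, timestamp: 2}]}"

# Positions reported by str_locate_all() in the example above
pos <- c(6, 11, 26, 34, 48)

# Look up the single character at each reported position
chars <- substring(str_temp, pos, pos)
print(chars)

# Compare with the positions of every literal ":" in the string
colon_pos <- gregexpr(":", str_temp, fixed = TRUE)[[1]]
print(as.integer(colon_pos))
```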
What's the correct regex for extracting the 7 alphanumeric strings?
Here is a regex that works. It matches alphanumeric words but not numbers.
((?![0-9]+)[A-Za-z0-9]+)
http://www.rubular.com/r/euf9afdtxw
Thanks to Richard for showing how to use it in R:
regmatches(str_temp, gregexpr("((?![0-9]+)[A-Za-z0-9]+)", str_temp, perl = TRUE))[[1L]]
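As a rough check, the same call can be run on the string from the question and compared with the seven expected substrings. The sketch below also shows one possible alternative, which is an assumption and not part of the answer: match every run with the bracketed POSIX class `[[:alnum:]]+`, then drop the purely numeric tokens. The names `pat`, `res`, `tokens`, and `alt` are illustrative.

```r
str_temp <- "{type: [{a: a1, timestamp: 1}, {a:a2, timestamp: 2}]}"

# The answer's pattern: the lookahead stops a match from starting on a digit
pat <- "((?![0-9]+)[A-Za-z0-9]+)"
res <- regmatches(str_temp, gregexpr(pat, str_temp, perl = TRUE))[[1L]]
print(res)

# Alternative (assumption): take every alphanumeric run, then drop pure numbers
tokens <- regmatches(str_temp, gregexpr("[[:alnum:]]+", str_temp))[[1L]]
alt <- tokens[!grepl("^[0-9]+$", tokens)]
print(alt)
```

Both approaches should return the same seven strings for this input. They can differ on other inputs: the lookahead version returns only the letter-led tail of a token such as `12abc`, while the filter version keeps the whole token.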