Posts: 1,058
Threads: 367
Joined: Oct 2007
I need to tokenize a string that contains some zero-length components, as in the following example:
Macro
temp
Trigger
SF12
str s="164```````````````123```"
ARRAY(str) arr
int nt = tok(s arr 16 "`")
out nt
In the above example nt is returned as 2. Is there a flag that makes tok return the actual number of tokens, empty ones included, which is 16 in the above example?
I should add that it works properly with a findrx-based routine, but that approach is slower.
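To make the two possible counts concrete, here is a minimal sketch in Python (not QM code). It assumes, judging only from the result of 2 reported above, that the tokenizer skips empty fields; the short sample string is made up for illustration.

```python
s = "a```b`"

# Every separator starts a new token, so empty tokens are kept.
kept = s.split("`")

# A tokenizer that skips empty fields would see only these.
collapsed = [t for t in kept if t]

print(len(kept), len(collapsed))
```

The first count is always the number of separators plus one; the second depends on how many fields happen to be non-empty.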
Posts: 12,074
Threads: 141
Joined: Dec 2002
Function
RegexSplitString
;/
function# $s $rx [ARRAY(str)&as] [ARRAY(POINT)&ap] [findrxFlags]
;Gets parts of string separated by substrings that match a regular expression.
;Returns the number of tokens (array length).
;s - string.
;rx - regular expression (separator).
;as - variable that receives string parts. Can be omitted or 0.
;ap - variable that receives positions of string parts in s. Can be omitted or 0.
;findrxFlags - <help>findrx</help> flags. The function uses findrx to find separators; adds flag 4 (find all).
;REMARKS
;The arrays will always have (number of found separators + 1) elements. If s begins or ends with a separator, the arrays will have an empty element at the beginning or end.
;EXAMPLE
;out
;str s="aa bb cc[9]dd"
;ARRAY(str) as; ARRAY(POINT) ap
;out RegexSplitString(s "\s+" as ap)
;int i
;for i 0 as.len
,;out F"[{as[i]}]"
,;out F"{ap[i].x} {ap[i].y}"
opt noerrorshere
if(&as) as=0
if(&ap) ap=0
ARRAY(POINT) _a
findrx(s rx 0 findrxFlags|4 _a)
int i nSep(_a.len) nTok(nSep+1) iFrom iTo(len(s))
if(&as) as.create(nTok)
if(&ap) ap.create(nTok)
for i 0 nSep
,POINT& r=_a[0 i]; int _to=r.x
,if(&ap) ap[i].x=iFrom; ap[i].y=_to
,if(&as and _to>iFrom) as[i].get(s iFrom _to-iFrom)
,iFrom=r.y
if(&ap) ap[i].x=iFrom; ap[i].y=iTo
if(&as and iTo>iFrom) as[i].get(s iFrom iTo-iFrom)
ret nTok
Macro
Macro2773
out
str s="164```````````````123```"
ARRAY(str) arr
RegexSplitString s "`" arr
int i
for i 0 arr.len
,out F"[{arr[i]}]"
Or, if the string is a valid single-line CSV, use the ICsv interface.
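For readers who do not use QM, the same splitting logic can be sketched in Python. This is only an illustration of the "separators + 1" rule from the remarks above, not a translation of the QM function; the name regex_split and the tuple-based positions are assumptions made for the example.

```python
import re

def regex_split(s, rx):
    """Split s on every match of rx, keeping empty parts.

    Returns (parts, spans), where spans[i] is the (start, end)
    offset of parts[i] in s.
    """
    parts, spans = [], []
    start = 0
    for m in re.finditer(rx, s):
        # Text between the previous separator and this one (may be empty).
        parts.append(s[start:m.start()])
        spans.append((start, m.start()))
        start = m.end()
    # Text after the last separator (may be empty).
    parts.append(s[start:])
    spans.append((start, len(s)))
    return parts, spans

parts, spans = regex_split("aa bb cc\tdd", r"\s+")
print(parts)
print(spans)
```

Because the tail after the last separator is always appended, a string that ends with a separator yields a final empty part, and the result length is always the number of matches plus one.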
Posts: 1,058
Threads: 367
Joined: Oct 2007
Dear Gintaras, many thanks indeed. Let me attach my own approach. It goes without saying that yours is more elegant and powerful. Best regards.
Function
tempf12
str subject="`164```````````````123```"
str pattern="`"
ARRAY(str) arr
int i i0 i1 nt
ARRAY(CHARRANGE) a
nt=findrx(subject pattern 0 4 a)
;s.get(subject 29 43-29)
;out s
out nt
if nt=0; ret
arr.create(nt+1)
;1st Element: everything before the first separator (cpMin also works for separators longer than 1 character)
arr[0].get(subject 0 a[0 0].cpMin)
;Subsequent Elements
for i 0 a.len-1
,i0=a[0 i].cpMax
,i1=a[0 i+1].cpMin
,arr[i+1].get(subject i0 i1-i0)
;,out F"{i} {i0} {i1} {s}"
;Last Element: text after the final separator (stays empty if subject ends with a separator)
i0=a[0 a.len-1].cpMax
if(len(subject)>i0) arr[a.len].get(subject i0 len(subject)-i0)
for i 0 arr.len
,out F"{i} {arr[i]}"