Posts: 795
Threads: 136
Joined: Feb 2009
Hi,
Is there a macro or function I missed that strips HTML tags from the Macro1642?
If not, how can I retrieve only the plain text so that I can filter URLs?
Thanks
Posts: 12,087
Threads: 142
Joined: Dec 2002
Posts: 795
Threads: 136
Joined: Feb 2009
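To get everything from the selection:
Macro Macro1642
;Get the selection in the "HTML Format" clipboard format
str s.getsel(0 "HTML Format")
;out s
;Skip the header: keep only the part starting at "<html" (flag 1 = case insensitive)
int i=find(s "<html" 0 1); if(i<0) ret
s.get(s i)
;out s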
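For readers who do not use QM, the same trimming step can be sketched in Python. This is only an illustration of the idea, not QM code: the function name `extract_html` is hypothetical, and the header values in the sample are placeholders rather than real offsets.

```python
import re

def extract_html(clip_text):
    """Return the text from the first "<html" onward, or None if absent.

    Mirrors the macro above: the "HTML Format" clipboard data starts with
    a short header, and only the part from "<html" is wanted.
    """
    m = re.search(r"<html", clip_text, re.IGNORECASE)
    return clip_text[m.start():] if m else None

# Sample data; the numeric header values are placeholders (assumption).
sample = (
    "Version:0.9\r\n"
    "StartHTML:00000000\r\n"
    "EndHTML:00000000\r\n"
    "StartFragment:00000000\r\n"
    "EndFragment:00000000\r\n"
    "<HTML><body>\r\n"
    "<!--StartFragment-->Plain text<!--EndFragment-->\r\n"
    "</body>\r\n</html>"
)

html = extract_html(sample)
print(html)
```

The search is case insensitive, like flag 1 in the `find` call, so `<HTML` is matched as well.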
Posts: 12,087
Threads: 142
Joined: Dec 2002
Macro
Macro165
HtmlDoc d
d.SetOptions(3)
d.InitFromText(s)
s=d.GetText
out s
Posts: 795
Threads: 136
Joined: Feb 2009
Hmmm, not working.
From the topic header example:
In web page:
Plain text
by ldarrambide on 09 Mar 2012 16:19
From the get-selection macro:
<html><body>
<!--StartFragment--><a href="http://www.quickmacros.com/forum/viewtopic.php?p=22851" class="topictitle">Plain text</a>
<br>
by <a href="http://www.quickmacros.com/forum/memberlist.php?mode=viewprofile&u=1745">ldarrambide</a> on 09 Mar 2012 16:19
<!--EndFragment-->
</body>
</html>
With your addition:
Plain text
by ldarrambide on 09 Mar 2012 16:19
I want this result:
Plain text
Plain text
memberlist.php?mode=viewprofile&u=1745
on 09 Mar 2012 16:19
i.e. all "normal" text without any HTML markup. Is that possible?
Posts: 12,087
Threads: 142
Joined: Dec 2002
You want text with some of the HTML kept, i.e. the href values. HtmlDoc removes all HTML, including the href values. You need some string parsing before passing the HTML to HtmlDoc, e.g. rewrite the link HTML so that the URL is no longer inside < >.
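As a rough illustration of that string-parsing idea, here is a Python sketch (not QM code). It assumes simple, well-formed links with double-quoted href values; the names `LINK_RX`, `expose_link_urls` and `strip_tags` are hypothetical, and the crude tag stripper only stands in for what HtmlDoc does.

```python
import re

# Matches <a ... href="URL" ...>text</a>; assumes double-quoted href values.
LINK_RX = re.compile(r'<a\s[^>]*?href="([^"]*)"[^>]*>(.*?)</a>',
                     re.IGNORECASE | re.DOTALL)

def expose_link_urls(html):
    """Rewrite each link as: text (URL), so the URL is outside < >."""
    return LINK_RX.sub(lambda m: f"{m.group(2)} ({m.group(1)})", html)

def strip_tags(html):
    """Crude tag removal; a stand-in for a real HTML-to-text step."""
    return re.sub(r"<[^>]+>", "", html)

sample = (
    '<a href="http://www.quickmacros.com/forum/viewtopic.php?p=22851" '
    'class="topictitle">Plain text</a>\n<br>\n'
    'by <a href="http://www.quickmacros.com/forum/memberlist.php'
    '?mode=viewprofile&u=1745">ldarrambide</a> on 09 Mar 2012 16:19'
)

result = strip_tags(expose_link_urls(sample))
print(result)
```

Because the URL is moved into ordinary text first, it survives the tag removal that follows.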
Posts: 12,087
Threads: 142
Joined: Dec 2002
If you need only the text and URL of the links:
Macro
Macro1668
out
str s=
;<html><body>
;<!--StartFragment--><a href="http://www.quickmacros.com/forum/viewtopic.php?p=22851" class="topictitle">Plain text</a>
;<br>
;by <a href="http://www.quickmacros.com/forum/memberlist.php?mode=viewprofile&u=1745">ldarrambide</a> on 09 Mar 2012 16:19
;<!--EndFragment-->
;</body>
;</html>
HtmlDoc d
d.SetOptions(3)
d.InitFromText(s)
ARRAY(MSHTML.IHTMLElement) a
d.GetLinks(a)
int i
for i 0 a.len
,out a[i].innerText
,out a[i].getAttribute("href" 0)
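For comparison, a similar link listing can be sketched in Python with the standard library `html.parser` module. This is not the HtmlDoc API; `LinkCollector` and `get_links` are hypothetical names used only for this illustration.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects (text, href) for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (text, href) pairs
        self._href = None    # href of the link currently open, if any
        self._text = []      # text pieces seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text), self._href))
            self._href = None

def get_links(html):
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links

sample = (
    '<html><body><a href="http://www.quickmacros.com/forum/viewtopic.php'
    '?p=22851" class="topictitle">Plain text</a><br>by someone</body></html>'
)

for text, href in get_links(sample):
    print(text)
    print(href)
```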
Posts: 12,087
Threads: 142
Joined: Dec 2002
All text, with each link followed by its (URL):
Macro
Macro1672
out
str s=
;<html><body>
;<!--StartFragment--><a href="http://www.quickmacros.com/forum/viewtopic.php?p=22851" class="topictitle">Plain text</a>
;<br>
;by <a href="http://www.quickmacros.com/forum/memberlist.php?mode=viewprofile&u=1745">ldarrambide</a> on 09 Mar 2012 16:19
;<!--EndFragment-->
;</body>
;</html>
HtmlDoc d
d.SetOptions(3)
d.InitFromText(s)
ARRAY(MSHTML.IHTMLElement) a
d.GetLinks(a)
int i
for i 0 a.len
,str href=a[i].getAttribute("href" 0)
,if href.len
,,str txt=a[i].innerText
,,txt+F" ({href})"
,,a[i].innerText=txt
s=d.GetText
out s
Posts: 795
Threads: 136
Joined: Feb 2009
Yes, I was going that way too.
Your code is of course brilliant, and I'll use it.
I need help, because an hour spent on this is making me bored and angry:
I want to find a pattern containing a quotation mark, e.g.
<a href", but did not find the right syntax.
How do I match the whole <a href" string?
I tried many possibilities, and I'm sure it is pretty simple.
So, for example, if I have
<h3 class="r"><a href="http://www
I want to use pattern=<h3 class="r"><a href=", with the quotation marks included.
Posts: 12,087
Threads: 142
Joined: Dec 2002
QM uses the escape sequence '' (two single quotes) for a double quote inside a string, so the pattern above is written as "<h3 class=''r''><a href=''".
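Note that the difficulty here is only in how the string literal is written: inside the regular expression itself a double quote is an ordinary character and needs no escaping. The Python sketch below (not QM code) illustrates this with the pattern from the question; `PREFIX`, `URL_RX`, `find_urls` and the sample URL are assumptions made for the example.

```python
import re

# In Python the literal can simply be single-quoted; the regex engine
# treats the double quotes as ordinary characters.
PREFIX = '<h3 class="r"><a href="'

# re.escape protects any characters that ARE special in a regex;
# the group then captures everything up to the closing quote.
URL_RX = re.compile(re.escape(PREFIX) + r'([^"]*)"')

def find_urls(html):
    """Return the href values that directly follow PREFIX."""
    return URL_RX.findall(html)

sample = '<h3 class="r"><a href="http://www.example.com/?q=1">Result</a></h3>'
print(find_urls(sample))
```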
Posts: 795
Threads: 136
Joined: Feb 2009
I've been fighting with the quote escaping in PCRE regular expressions,
but couldn't have guessed that one.
Maybe an addition to the documentation would be helpful for everyone.
Thanks G