
r0n · (This post was last modified: 03-01-2022, 10:02 PM by r0n.)

Code:

<!DOCTYPE html>
<html>
    <head>
        <title>Page Title</title>
    </head>
    <body>
        <div class="title"> <---------------------------------------------------- EXTRACT BEGIN
            <a href="https://www.download/book.pdf" target="_blank">
                Book title
            </a>
            <div class="js-subproduct-admin-edit" data-entity-kind="subproduct" data-machine-name="booktitle_1"></div>
        </div> <---------------------------------------------------- EXTRACT END
        <div class="some_other_block">
            test
        </div>
    </body>
</html>

Is it possible to use XPath in QM to extract the div block <div class="title"> (see the arrows above)?
Or can it be done with QM string/regex functions?
I have made some attempts, but they are ugly, inefficient solutions.

I can do it using the Acc functions of QM, but I would like to do it without Acc if possible.
I searched for "xPath on html" -> https://www.google.com/search?client=fir...th+on+html
But when I try it in QM, I get errors.

Kevin · 03-02-2022, 05:23 AM

You can use the HtmlDoc class.

Function Function35

Code:
str html=
;<!DOCTYPE html>
;<html>
;;;;<head>
;;;;;;;;<title>Page Title</title>
;;;;</head>
;;;;<body>
;;;;;;;;<div class="title">
;;;;;;;;;;;;<a href="https://www.download/book.pdf" target="_blank">
;;;;;;;;;;;;;;;;Book title
;;;;;;;;;;;;</a>
;;;;;;;;;;;;<div class="js-subproduct-admin-edit" data-entity-kind="subproduct" data-machine-name="booktitle_1"></div>
;;;;;;;;</div>
;;;;;;;;<div class="some_other_block">
;;;;;;;;;;;;test
;;;;;;;;</div>
;;;;</body>
;</html>
HtmlDoc d.InitFromText(html)
ARRAY(MSHTML.IHTMLElement) div
int i
d.GetHtmlElements(div "div")
for i 0 div.len
,str cn=div[i].className
,if cn="title"
,,out "------------------InnerHtml------------------"
,,out div[i].innerHTML
,,out "------------------OuterHtml------------------"
,,out div[i].outerHTML

r0n · 03-02-2022, 06:05 AM

Thanks!

Kevin · (This post was last modified: 03-02-2022, 06:10 AM by Kevin.)

Here is an example that does this without having a browser open.
It extracts the desired text from your first post in this thread.

Code:
HtmlDoc doc doc2
doc.InitFromWeb("https://www.quickmacros.com/forum/showthread.php?tid=7213")
str s=doc.d3.getElementById("pid_35428").innerText
int i
out
doc2.InitFromText(s)
ARRAY(MSHTML.IHTMLElement) div
doc2.GetHtmlElements(div "div")
for i 0 div.len
,str cn=div[i].className
,if cn="title"
,,out "------------------InnerHtml------------------"
,,out div[i].innerHTML
,,out "------------------OuterHtml------------------"
,,out div[i].outerHTML
,,break
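
For comparison outside QM, here is a rough Python sketch of the same idea: walk the tags, find the first div whose class is "title", and keep everything up to its matching closing tag. It is not QM code and not how HtmlDoc works internally; the names DivExtractor and extract_div are made up for this illustration. It also shows why a plain regex is awkward here: the block contains a nested div, so the closing tag has to be found by counting depth.

```python
from html.parser import HTMLParser


class DivExtractor(HTMLParser):
    """Collects the outer HTML of the first <div> whose class list has class_name."""

    def __init__(self, class_name):
        # Keep entity references unconverted so they can be emitted as written.
        super().__init__(convert_charrefs=False)
        self.class_name = class_name
        self.depth = 0      # <div> nesting depth inside the captured block
        self.parts = []     # raw markup pieces of the captured block
        self.done = False   # True once the matching </div> has been seen

    def _keep(self, text):
        # Store text only while we are inside the captured block.
        if self.depth and not self.done:
            self.parts.append(text)

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth == 0:
            # Not capturing yet: wait for a <div> with the wanted class.
            classes = (dict(attrs).get("class") or "").split()
            if tag != "div" or self.class_name not in classes:
                return
        if tag == "div":
            self.depth += 1
        self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied as written.
        self._keep(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.done or self.depth == 0:
            return
        self.parts.append(f"</{tag}>")
        if tag == "div":
            self.depth -= 1
            self.done = self.depth == 0

    def handle_data(self, data):
        self._keep(data)

    def handle_entityref(self, name):
        self._keep(f"&{name};")

    def handle_charref(self, name):
        self._keep(f"&#{name};")


def extract_div(html, class_name):
    """Return the outer HTML of the first matching div, or None if not found."""
    parser = DivExtractor(class_name)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts) if parser.done else None


if __name__ == "__main__":
    sample = (
        '<body><div class="title"><a href="https://www.download/book.pdf">'
        'Book title</a><div class="js-subproduct-admin-edit"></div></div>'
        '<div class="some_other_block">test</div></body>'
    )
    print(extract_div(sample, "title"))
```

The sketch stops at the first match, like the break in the loop above. HTML comments inside the block are dropped, and badly broken markup may give different results than a browser engine would.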

r0n · 03-02-2022, 01:56 PM

Thank you!
