Get text of webpage
#30
Here is another way, using HtmlParse. You will still need to parse the cell text with findrx or other string functions, but that should be easier than working on the raw page source.

Function HtmlTableToArray
Code:
;/
function $HTML VARIANT'tableNameOrIndex ARRAY(str)&a [flags] [str&tableText] ;;flags: 1 get HTML

;Gets the cells of an HTML table into an array.

;HTML - all HTML (page source).
;tableNameOrIndex - table name or 0-based index of the table in the HTML.
;a - array variable for results. The function creates a 1-dimensional array where each element is the text of one cell (or its HTML if flag 1 is used).
;tableText - optional str variable that receives the text of the whole table (or its HTML if flag 1 is used).


;EXAMPLE
;out
;str s
;IntGetFile "http://www.weather.com/weather/tenday/48183" s
;
;ARRAY(str) a
;HtmlTableToArray s 12 a
;
;;display text in first cell of each row
;int i ncolumns=2
;for i 0 a.len ncolumns
,;out a[i]
,;;out a[i+1] ;;second cell, and so on
,;out "---------"


;parse the page source into an HTML document object
MSHTML.IHTMLDocument2 d; MSHTML.IHTMLDocument3 d3
HtmlParse HTML d d3

;find the table by name or 0-based index
MSHTML.IHTMLTable2 table=d3.getElementsByTagName("TABLE").item(tableNameOrIndex); err end "the specified table does not exist"

;copy the text (or HTML if flag 1) of each cell into the array
MSHTML.IHTMLElementCollection cells=table.cells
a.create(cells.length)
int i
for i 0 a.len
,MSHTML.IHTMLElement el=cells.item(i)
,if(flags&1) a[i]=el.innerHTML
,else a[i]=el.innerText

;if tableText is specified, also get the text (or HTML) of the whole table
if(&tableText)
,el=+table
,if(flags&1) tableText=el.innerHTML
,else tableText=el.innerText
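
The optional arguments can be tried with a small variation of the example in the function's comments. This is only a sketch: it reuses the URL and table index 12 from that example, which may no longer match the live page, and tableHtml is a variable name made up here.

```
out
str s tableHtml
IntGetFile "http://www.weather.com/weather/tenday/48183" s

ARRAY(str) a
HtmlTableToArray s 12 a 1 tableHtml ;;flag 1: get cells and whole table as HTML instead of text

out a.len ;;number of cells found in the table
out tableHtml
```

With flag 1 each element of a holds the inner HTML of one cell, which can be useful when the data you need is in a link or attribute rather than in the visible text.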

