Dear Gintaras,
I would like to ask whether it is possible to use LA to extract just the Python code from an HTML file (converted from a Jupyter notebook). Thank you very much! Here is the HTML file: Car price
birdywen, might I ask if you first converted the Jupyter notebook with e.g. "jupyter nbconvert thenotebook.ipynb --to html"? If you did, you can extract the python code by instead running "jupyter nbconvert thenotebook.ipynb --to python", which will give you a runnable script. If you only have the already-converted file to work with, there may be several ways to do this, with or without LA.
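For the case where only the already-converted HTML file is available, here is a minimal sketch in plain Python (standard library only, no LA). It assumes the code cells sit in a <pre> inside a div whose class list contains "input_area" (older nbconvert templates) or "jp-InputArea-editor" (newer ones). Those class names are an assumption about the template, so check them against your own file. The names CodeCellExtractor and extract_code are made up for this example.

```python
from html.parser import HTMLParser

# Assumed class names marking code-cell inputs in nbconvert HTML output.
# They differ between templates and versions, so check your own file.
INPUT_CLASSES = {"input_area", "jp-InputArea-editor"}


class CodeCellExtractor(HTMLParser):
    """Collect the text of every <pre> found inside a code-cell input div."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.cells = []        # one string per code cell
        self._div_depth = 0    # > 0 while inside an input-area div
        self._in_pre = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            classes = set((dict(attrs).get("class") or "").split())
            # Start counting at an input-area div, and keep counting
            # nested divs so we know when the input area ends.
            if self._div_depth or classes & INPUT_CLASSES:
                self._div_depth += 1
        elif tag == "pre" and self._div_depth:
            self._in_pre = True
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "div" and self._div_depth:
            self._div_depth -= 1
        elif tag == "pre" and self._in_pre:
            self._in_pre = False
            self.cells.append("".join(self._buf).strip("\n"))

    def handle_data(self, data):
        # Syntax-highlighting <span> tags are ignored; only their text is kept.
        if self._in_pre:
            self._buf.append(data)


def extract_code(html_text):
    """Return a list with the source text of each code cell."""
    parser = CodeCellExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.cells
```

To use it, read the file into a string, pass it to extract_code, and join the returned cells with blank lines to get a script. Output areas are skipped because their <pre> tags are not inside an input-area div.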
Regards,
burque505
Hi Gintaras,
Is it possible for LA to read the inner text of a UI element, like reading InnerText in HtmlAgilityPack? I ask because sometimes the webpage content I want requires logging in first, and HtmlAgilityPack cannot extract that text without being logged in to the website.
Thank you!
07-30-2023, 04:07 AM (This post was last modified: 07-30-2023, 04:17 AM by Gintaras.)
I know 2 ways, but probably more exist. Google: "C# extract Chrome web page element text".
1. Get the HTML with elm.Html. Then somehow convert the HTML to text, for example using regular expressions or HtmlAgilityPack.
2. Use Selenium. But it has problems connecting to an existing web browser window. Look in the Cookbook.
07-31-2023, 01:26 PM (This post was last modified: 07-31-2023, 01:28 PM by birdywen.)
Hi Gintaras,
Your method, elm.Html with HtmlAgilityPack, is a perfect combination. Now I can easily extract any text, or anything else (I mean any elm), that I want from any webpage.
That's so nice!
Thank you so much!
var w = wnd.find(1, "The Best Python Cheat Sheet | Zero To Mastery - Google Chrome", "Chrome_WidgetWin_1");
foreach (var e in w.Elm["web:GROUPING", prop: "@id=cheatsheet-content"]["TEXT", prop: "level=2"].FindAll()) {
	var html = e.Html(false);
	var doc1 = new HtmlDocument();
	doc1.LoadHtml(html);
	print.it(doc1.DocumentNode.InnerText);
}