PuppeteerSharp an alternative to Selenium
#1
Hey guys,
First, I would like to thank Gintaras for this amazing software.
In addition to meeting my automation needs, LibreAutomate allowed me to discover and get started with C#.
I would like to contribute, so I'm sharing this useful alternative to Selenium, which allows another approach to browser automation.

With PuppeteerSharp, the Chromium browser is embedded and, by default, is updated every time you run the code (it can also be pinned to a fixed version).
You can choose to launch it with headless mode enabled or disabled.
The browser can easily be set to keep sessions and cookies, which I find really useful.
You can find more here: https://www.puppeteersharp.com/index.html

Just for information, below is sample code that generates a PDF file from a web page.
The "UserDataDir" option saves the session, as a regular browser would.




Code:
/*/ nuget -\PuppeteerSharp; /*/
using System.Windows.Forms;
using PuppeteerSharp;

// Download the default Chromium revision, then launch the browser
using var browserFetcher = new BrowserFetcher();
await browserFetcher.DownloadAsync(BrowserFetcher.DefaultChromiumRevision);
var browser = await Puppeteer.LaunchAsync(new LaunchOptions
{
    Headless = false,
    // Specify the directory where the user profile data should be stored
    UserDataDir = @"D:\LibreAutomate - WorkSpaces\LibreAutomate\files\UserData"
});

var page = await browser.NewPageAsync();
await page.GoToAsync("https://fr.wikipedia.org/wiki/Mario_Kart:_Super_Circuit");
await page.PdfAsync(@"D:\test\page.pdf");
// Close the page
await page.CloseAsync();


Hope that helps.
#2
Great, Victor-P, nice one! Thanks very much. Cool
#3
Victor-P, thanks for steering me back into web scraping after a couple of years of neglect!  Gintaras, thank you for LA! It seems to me that this can do quite a bit of RPA-ish stuff.

I wanted to post this PuppeteerSharp code that scrapes Hacker News for links, uses 'print.it' to show them, and then kills the browser. (I noticed that 'await page.CloseAsync()' still leaves that 'about:blank' page open, so I took the route below. I still need to figure out how to get rid of that 'about:blank' page when the browser starts, but I'm sure a little googling and fiddling will get me there in short order.)
 
Code:
/*/ nuget -\PuppeteerSharp; /*/ //.
using PuppeteerSharp;
script.setup(trayIcon: true, sleepExit: true);
//..
using var browserFetcher = new BrowserFetcher();
await browserFetcher.DownloadAsync(BrowserFetcher.DefaultChromiumRevision);
var browser = await Puppeteer.LaunchAsync(new LaunchOptions {
    Headless = false,
    UserDataDir = @"E:\LibreAutomate\UserData" // Pick your own data dir!
});

var page = await browser.NewPageAsync();
{
        await page.GoToAsync("https://news.ycombinator.com/");
        print.it("Get all urls from page");
        var jsCode = @"() => {
        var arr = [], l = document.links;
        for(var i=0; i<l.length; i++) {
          arr.push(l[i].href);
        }
        return arr;
        }";
        var results = await page.EvaluateFunctionAsync(jsCode);
        foreach (var result in results)
        {
            print.it(result.ToString());
        }
        print.it("Finished.");
}
browser.Disconnect();
await browser.CloseAsync();
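
One possible way around the extra 'about:blank' tab mentioned above is to reuse the tab that Chromium opens at startup instead of always calling NewPageAsync(). The following is only a minimal sketch of that idea, not a confirmed fix: it assumes the startup tab is the first entry returned by browser.PagesAsync(), and the data directory is a placeholder you should replace with your own.

```csharp
/*/ nuget -\PuppeteerSharp; /*/
using System.Linq;
using PuppeteerSharp;

using var browserFetcher = new BrowserFetcher();
await browserFetcher.DownloadAsync(BrowserFetcher.DefaultChromiumRevision);
var browser = await Puppeteer.LaunchAsync(new LaunchOptions {
    Headless = false,
    UserDataDir = @"E:\LibreAutomate\UserData" // Placeholder: pick your own data dir!
});

// Assumption: the tab opened at startup ('about:blank') is the first page in this list.
// Reuse it if present; otherwise fall back to creating a new page.
var pages = await browser.PagesAsync();
var page = pages.FirstOrDefault() ?? await browser.NewPageAsync();

await page.GoToAsync("https://news.ycombinator.com/");
print.it(await page.GetTitleAsync());

// Closing the browser also closes every remaining tab.
await browser.CloseAsync();
```

Because no second tab is created, there is nothing left over to clean up separately before the browser closes.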


Regards,
burque505
#4
Hey friend, why do I get the error shown in the picture below?

[Attached image not preserved]
#5
Hi, unfortunately I cannot help you with this one, my friend, as I have switched from PuppeteerSharp to https://playwright.dev, which is more powerful, and also because the developer of PuppeteerSharp has joined the Playwright team.

Another advantage of Playwright is that it's a Microsoft project.
#6
Hi birdywen, unfortunately I can't help either. I just ran that script, and for me it still runs without errors. I do see that I also have PuppeteerExtraSharp installed from NuGet. I'm not sure if that would make a difference; it shouldn't, since I don't reference it in that script.
#7
Hi friend, I fixed that issue by deleting that line of code. I don't know the logic behind it, but at least it worked. Thanks! By the way, Playwright is very powerful. I hope you can share some real-world code examples using Playwright. I would really love that.
#8
Code:
/*/ role exeProgram; outputPath %folders.Documents%\YourFolder; icon .\Robot.ico; nuget Base\microsoft.playwright; /*/
using Au;
using Au.Types;
using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.Playwright;
using System.Reflection;
using System.Windows.Forms;
using System.Drawing;

//Software and version information
[assembly: AssemblyVersion("1.2.0.0")]
//[assembly: AssemblyFileVersion("1.0.0.0")] //if missing, uses AssemblyVersion
[assembly: AssemblyTitle("Your_Project")]
//[assembly: AssemblyDescription("Comments")]
[assembly: AssemblyCompany("Your_Company")]
[assembly: AssemblyProduct("Your_Project")]
[assembly: AssemblyInformationalVersion("1.2.0.0")] //product version
[assembly: AssemblyCopyright("Copyright © 2023")]
[assembly: AssemblyTrademark("Your_Company")]

// Start Script VTC Download
namespace Project_Name_download
{
    public class VTCBot
    {
        private string myDocumentsPath;
        private string TaxiFolderPath;
        private string UserDataFolderPath;
        private string ChromiumFolderPath;
        private string DownloadFolderPath;
        private string logFilePath;
        private string newFileName;

        public VTCBot()
        {
            //Dialog window for fleet selection
            // Exit if the dialog is cancelled; otherwise the path fields below would stay null
            if (!dialog.showInput(out string flotte, "Sélectionnez votre flotte", title: "Your_Project", editType: DEdit.Combo, comboItems: "CARTWHEEL|FLOTTE-2|FLOTTE-3|FLOTTE-3|...")) Environment.Exit(0);
            Console.WriteLine(flotte);
            //define the path to the Documents folder
            myDocumentsPath = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
            //define path to SEND folder
            TaxiFolderPath = Path.Combine(myDocumentsPath, "VCT-Brand-VTC", flotte, "VCT-Brand", "1-CSV");
            //define the path to the Download folder
            DownloadFolderPath = Path.Combine(myDocumentsPath, "VCT-Brand-VTC", "z-system", "Downloads");
            //define the path to the UserData folder
            UserDataFolderPath = Path.Combine(myDocumentsPath, "VCT-Brand-VTC", flotte, "UserData");
            //setting path to Chromium folder
            ChromiumFolderPath = Path.Combine(myDocumentsPath, "VCT-Brand-VTC","z-system", "Chromium", "Win64-1069273", "chrome-win", "chrome.exe");
            newFileName = "VCT-Brand-CHIFFRES-Project_Name_" + DateTime.Now.ToString("yyyy-MM-dd") + ".csv";
            logFilePath = Path.Combine(myDocumentsPath, "VCT-Brand-VTC", flotte, "Logs", newFileName);
        }

        public async Task Main(string[] args)
        {
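            // Create the log, download, and destination folders if they do not exist yet
            // (File.AppendText and FileSystemWatcher both fail on a missing folder)
            Directory.CreateDirectory(Path.GetDirectoryName(logFilePath));
            Directory.CreateDirectory(DownloadFolderPath);
            Directory.CreateDirectory(TaxiFolderPath);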

            // Initialize the log file
            using (StreamWriter logFile = File.AppendText(logFilePath))
            {
                logFile.WriteLine($"--- Log Started: {DateTime.Now} ---");
            }

            try
            {
                using (var down = new FileSystemWatcher(DownloadFolderPath))
                {
                    down.Created += (o, e) =>
                    {
                        LogEvent($"File created: {e.FullPath}");
                        filesystem.copyTo(e.FullPath, TaxiFolderPath);
                    };
                    down.EnableRaisingEvents = true;

                    using (var renmov = new FileSystemWatcher(TaxiFolderPath))
                    {
                        renmov.Created += (o, e) =>
                        {
                            LogEvent($"File renamed: {e.FullPath} => {newFileName}");
                            filesystem.rename(e.FullPath, newFileName, FIfExists.Delete);
                        };
                        renmov.EnableRaisingEvents = true;

                        using var playwright = await Microsoft.Playwright.Playwright.CreateAsync();
                        var browserType = playwright.Chromium;
                        var browser = await browserType.LaunchPersistentContextAsync(UserDataFolderPath, new BrowserTypeLaunchPersistentContextOptions
                        {
                            Headless = false,
                            DownloadsPath = DownloadFolderPath,
                            ExecutablePath = ChromiumFolderPath
                        });

                        var page = await browser.NewPageAsync();
                        string url = "https://target-site.com/";
                        await page.GotoAsync(url);
                        LogEvent("Page opened");

                        await Task.Delay(3000);
                        await page.Locator("data-testid=header-nav-/reports").ClickAsync();
                        LogEvent("Reports page opened");

                        await Task.Delay(3000);
                        // Take the row group that contains the text "payments",
                        // then keep only its rows that do not contain the text "Activité".
                        var rowLocator = page
                           .GetByRole(AriaRole.Rowgroup)
                           .Filter(new() { HasText = "payments" })
                           .GetByRole(AriaRole.Row)
                           .Filter(new() { HasNotText = "Activité" });

                        // Exclude rows that contain the text "organisation",
                        // then locate the download button by XPath and click the first match.
                        await rowLocator
                           .Filter(new() { HasNotText = "organisation" })
                           .Locator("xpath=//button").First.ClickAsync();
                        
                        
                        Console.WriteLine("Download");
                        
                        LogEvent("Download button clicked");

                        await Task.Delay(8000);
                        LogEvent("Closing browser");
                        await browser.CloseAsync();
                        dialog.show("Task completed","You can click OK to close", title: "Your_Project", icon: DIcon.Info);
                    }
                }
            }
            catch (Exception ex)
            {
                LogException(ex);
                ShowErrorMessage(ex.Message);
            }

            // Add a closing line to the log file
            using (StreamWriter logFile = File.AppendText(logFilePath))
            {
                logFile.WriteLine($"--- Log Ended: {DateTime.Now} ---");
            }
        }

        private void LogEvent(string message)
        {
            string logEntry = $"{DateTime.Now} - {message}";

            // Write the event to the log file
            using (StreamWriter logFile = File.AppendText(logFilePath))
            {
                logFile.WriteLine(logEntry);
            }

            // Show the event in the console
            Console.WriteLine(logEntry);
        }

        private void LogException(Exception ex)
        {
            string logEntry = $"{DateTime.Now} - Exception: {ex.Message}";

            // Write the exception to the log file
            using (StreamWriter logFile = File.AppendText(logFilePath))
            {
                logFile.WriteLine(logEntry);
            }

            // Show the exception in the console
            Console.WriteLine(logEntry);
        }

        private void ShowErrorMessage(string message)
        {
            MessageBox.Show(message, "Erreur", MessageBoxButtons.OK, MessageBoxIcon.Error);
        }
    }

    public class Program
    {
        public static async Task Main(string[] args)
        {
            var bot = new VTCBot();
            await bot.Main(args);
        }
    }
}

Hello guys,
I'm going to share a little script that uses Playwright. I moved to Playwright mainly for this script.
The purpose was to log in to a specific website and collect the weekly results of a specific VCT driver.
The first challenge was to use a persistent browser, to avoid having to go through the verification process on every run.
The second was the need for various options to meet the navigation requirements up to the download of the file.
Fortunately, Playwright provides a very wide choice of locators and options to reach your goal.

I have provided comments for some locators.
I'm not a dev, so the script may look ugly.

I hope everything is clear, as French is my native language.
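
As a possible alternative to the fixed delay and the two file watchers, Playwright can wait for the download itself and save it under a chosen name. The sketch below is only an illustration under assumptions: the "PlaywrightDemo" folder, the file name, and the bare "xpath=//button" locator are hypothetical placeholders, and it assumes the Playwright browsers are already installed (or that you pass ExecutablePath as in the script above).

```csharp
/*/ nuget Base\microsoft.playwright; /*/
using System;
using System.IO;
using Microsoft.Playwright;

// Hypothetical folders and file name, for illustration only
string docs = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
string userDataDir = Path.Combine(docs, "PlaywrightDemo", "UserData");
string targetDir = Path.Combine(docs, "PlaywrightDemo", "CSV");
string targetFile = Path.Combine(targetDir, "report-" + DateTime.Now.ToString("yyyy-MM-dd") + ".csv");
Directory.CreateDirectory(targetDir);

using var playwright = await Microsoft.Playwright.Playwright.CreateAsync();

// A persistent context keeps the login session between runs
await using var context = await playwright.Chromium.LaunchPersistentContextAsync(
    userDataDir,
    new BrowserTypeLaunchPersistentContextOptions { Headless = false });

var page = await context.NewPageAsync();
await page.GotoAsync("https://target-site.com/");

// Start listening for a download, perform the click, and return once the download has started.
// The locator here is a placeholder; use the filtered row locator from the script above.
var download = await page.RunAndWaitForDownloadAsync(async () =>
{
    await page.Locator("xpath=//button").First.ClickAsync();
});

// SaveAsAsync waits until the download is complete, then copies it to the target path
await download.SaveAsAsync(targetFile);
Console.WriteLine($"Saved {download.SuggestedFilename} as {targetFile}");
```

With this approach the script continues as soon as the file is fully saved, instead of guessing how long the download takes.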

