PHP cURL and DOM: parse multiple HTML files in a folder into a CSV (one row of data per file)

I want to use PHP cURL to parse multiple HTML files from a source directory instead of a single one. I would like to extract every kind of text element and write one row per HTML file, with the data split into columns. I don't know how to write such code.

The HTML files I have are built from lots of td and span elements, which makes them very annoying to work with. I only know how to cURL a single URL and display the results in a web page.

So I wrote the code below to pull out all the text.

Thank you in advance for helping me with this matter 😉

<?php

$url = "Url of a single file";
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$html = curl_exec($ch);
curl_close($ch);

$dom = new DOMDocument();
@$dom->loadHTML($html);

$titles = $dom->getElementsByTagName('title');
foreach ($titles as $title) {
    echo $title->nodeValue, PHP_EOL;
    echo "<br />";
}

$descriptions = $dom->getElementsByTagName('span');
foreach ($descriptions as $description) {
    echo $description->nodeValue, PHP_EOL;
    echo "<br />";
}

$descriptionsP = $dom->getElementsByTagName('p');
foreach ($descriptionsP as $descriptionP) {
    echo $descriptionP->nodeValue, PHP_EOL;
    echo "<br />";
}

foreach ($dom->getElementsByTagName('img') as $image) {
    echo $image->getAttribute('src');
    echo "<br />";
}
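
One possible way to extend the script above to a whole folder is sketched below. It is only a sketch under a few assumptions: the files sit on the local filesystem and end in .html (so file_get_contents can stand in for cURL), the same tags as above (title, span, p, img) are the ones of interest, and multiple matches within one file are joined into a single cell with " | ", which is an arbitrary choice. The function names collectText, htmlToRow and folderToCsv, and the paths in the usage comment, are hypothetical.

```php
<?php
// Sketch: write one CSV row per HTML file found in a folder.
// collectText(), htmlToRow() and folderToCsv() are hypothetical helper names.

// Return the whitespace-collapsed text of every element with the given tag.
function collectText(DOMDocument $dom, string $tag): array
{
    $values = [];
    foreach ($dom->getElementsByTagName($tag) as $node) {
        $text = trim(preg_replace('/\s+/', ' ', $node->textContent));
        if ($text !== '') {
            $values[] = $text;
        }
    }
    return $values;
}

// Turn one HTML document into a flat row: file name, titles, spans, paragraphs, image sources.
function htmlToRow(string $name, string $html): array
{
    $dom = new DOMDocument();
    if (trim($html) !== '') {
        // Collect libxml warnings quietly instead of printing them (replaces the @ operator).
        $previous = libxml_use_internal_errors(true);
        $dom->loadHTML($html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
    }

    $images = [];
    foreach ($dom->getElementsByTagName('img') as $image) {
        $images[] = $image->getAttribute('src');
    }

    return [
        $name,
        implode(' | ', collectText($dom, 'title')),
        implode(' | ', collectText($dom, 'span')),
        implode(' | ', collectText($dom, 'p')),
        implode(' | ', $images),
    ];
}

// Parse every *.html file in $dir and write the rows to $csvPath.
// Returns the number of files processed.
function folderToCsv(string $dir, string $csvPath): int
{
    $out = fopen($csvPath, 'w');
    if ($out === false) {
        throw new RuntimeException("Cannot open $csvPath for writing");
    }
    // Delimiter, enclosure and escape are passed explicitly (these are the usual defaults).
    fputcsv($out, ['file', 'title', 'span', 'p', 'img'], ',', '"', '\\');

    $count = 0;
    $files = glob(rtrim($dir, '/') . '/*.html') ?: [];
    foreach ($files as $file) {
        $html = (string) file_get_contents($file);
        fputcsv($out, htmlToRow(basename($file), $html), ',', '"', '\\');
        $count++;
    }
    fclose($out);
    return $count;
}

// Example call (paths are placeholders):
// folderToCsv(__DIR__ . '/html', __DIR__ . '/result.csv');
```

If the pages really do live on a remote server, the file_get_contents call could be swapped for the curl_exec block from the script above, looping over a list of URLs instead of the glob result. Adding 'td' to the collected tags is possible too, but a td's text already includes the text of any span nested inside it, so the same words would then show up in two columns.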
