PHP
Artem Frolov, 2016-03-22 15:23:05

How do I parse a page with Simple Html Dom?

Good day. I'm trying to parse an archive of betting draws using the Simple Html Dom Parser library.
The site's markup looks like this:

<tr class="S2H"><td colspan="4" class="S2L">Football. Up to 17 years old. Europe championship. Elite round</td><td class="bl">1</td><td>X</td><td class="br">2
</td></tr>
<tr><td>1</td><td>21.03 17:30</td><td class="S1L">Wales U17 - Sweden U17</td><td>0:1</td><td class="bl">32.00 / 30.81</td><td>28.00 / 26.00</td><td class="br">40.00 / 43.19
</td></tr>

And so on to the end: there are 15 matches in each draw.
I need to store every attribute of a match in the database as a single record. My table structure looks like this:
  • id - match id (from 1 to 15)
  • date (date)
  • tourney (league name)
  • match (team name)
  • score (score)
  • kef (odds distribution, coefficients).

So each draw should add 15 records to the database.
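For illustration, storing one record per match could be sketched with PDO and a prepared statement. This is only a sketch under assumptions: the table name `matches` is hypothetical, the column names follow the field list above, the three odds cells are joined into one `kef` string, and an in-memory SQLite database stands in for the real one so that the example runs without a server.

```php
<?php
// Hypothetical table layout based on the field list in the question.
// SQLite in memory is used only to keep the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// "date" and "match" are quoted because MATCH is a keyword in several SQL dialects
// (MySQL would use backticks instead of double quotes).
$pdo->exec(
    'CREATE TABLE matches (
        id      INTEGER,
        "date"  TEXT,
        tourney TEXT,
        "match" TEXT,
        score   TEXT,
        kef     TEXT
    )'
);

// One prepared statement, executed once per parsed match row.
$insert = $pdo->prepare(
    'INSERT INTO matches (id, "date", tourney, "match", score, kef)
     VALUES (:id, :date, :tourney, :match, :score, :kef)'
);

// Sample record built from the row shown in the question.
$records = [
    [
        'id'      => 1,
        'date'    => '21.03 17:30',
        'tourney' => 'Football. Up to 17 years old. Europe championship. Elite round',
        'match'   => 'Wales U17 - Sweden U17',
        'score'   => '0:1',
        'kef'     => '32.00 / 30.81; 28.00 / 26.00; 40.00 / 43.19',
    ],
];

foreach ($records as $record) {
    $insert->execute($record);
}
```

With a real draw, `$records` would hold 15 such arrays, one per match row.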
I'm trying to pull out one record at a time like this:
<?php
include 'simple_html_dom.php';

$html = file_get_html('http://sportsbet.com/list/ru/322/');
// find() takes the selector and an optional zero-based index, separated by a comma
$res = $html->find('tr', 5);
echo $res;

?>

The result is:
121.03 17:30Wales U17 - Sweden U170:132.00 / 30.8128.00 / 26.0040.00 / 43.19

How can I split this data correctly? Or how do I iterate over each row that follows an S2H row?
Another question: if I search by the S2H class, I get no league data (the S2L class).
I am new to this, so please help me set up the parsing properly: either pull out the whole record and then split it, or work with each element separately. I have read the documentation, but I don't understand how to put it all together.
Thank you very much in advance!

2 answer(s)
OVK2015, 2016-03-22
@Select1d

<?php	
  function getRemoteData($url, $argsArray, $ifPostRequest)
  {		
    $userAgent = "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2414.0 Safari/537.36";
    $cURLsession = curl_init();
  
    curl_setopt($cURLsession, CURLOPT_URL, $url);		
    curl_setopt($cURLsession, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($cURLsession, CURLOPT_RETURNTRANSFER, true);			
    curl_setopt($cURLsession, CURLOPT_USERAGENT, $userAgent);							
    curl_setopt($cURLsession, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($cURLsession, CURLOPT_CONNECTTIMEOUT, 30);
    // curl_setopt($cURLsession, CURLOPT_REFERER, $url);
    if($ifPostRequest)
    {
      curl_setopt($cURLsession, CURLOPT_POST, true);		
      curl_setopt($cURLsession, CURLOPT_POSTFIELDS, $argsArray);
      curl_setopt($cURLsession, CURLOPT_HTTPHEADER, 
      array
      (			
        "X-Requested-With: XMLHttpRequest"		   
      ));			
    }
    if(($curlResult = curl_exec($cURLsession)) === false)		
    {		
      die("Error fetching data: ".curl_error($cURLsession)." from ".$url);
    }
    
    curl_close($cURLsession);
  
    return $curlResult;
  }		
  
  $url = "http://toto.fonsportsbet.com/list/ru/322/";
  $content = getRemoteData($url, "", false);

  // file_put_contents(__DIR__."\\footbal.html", $content);
  // echo "Saved\n";
 
  // $content = file_get_contents(__DIR__."\\footbal.html");

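  // Both patterns expect raw markup with unquoted attributes and no closing </td> tags.
  // League block: the text after the S2L cell, then everything up to the next S2L cell or </table>.
  $regExpLigaWrapper =
    "#(?<=<td colspan=4 class=S2L>)(.*?)(<td class=bl>)".
    "(.*?)((?:<td colspan=4 class=S2L>)|(?:</table>))#si";
  // Match row: id, time, teams (S1L cell), score, then the three odds cells.
  $regExpPlayWrapper =
    "#<td>(\d{1,})<td>(.*?)<td class=S1L>(.*?)<td>".
    "(.*?)<td(?:.*?)bl>(.*?)<td>(.*?)<(?:.*?)>(.*?)(?:<|$)#si";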
  preg_match_all($regExpLigaWrapper, $content, $ligaMatches, PREG_SET_ORDER);	
  
  foreach($ligaMatches as $ligaMatch) 
  {
    echo "Liga: ".$ligaMatch[1]."\n****************************\n";
    preg_match_all($regExpPlayWrapper, $ligaMatch[3], $playMatches, PREG_SET_ORDER);		
    foreach($playMatches as $playMatch) 
    {
      echo 
      "id: ".$playMatch[1]."\n".
      "Time: ".$playMatch[2]."\n".
      "Name: ".$ligaMatch[1]."\t".$playMatch[3]."\n".
      "Count: ".$playMatch[4]."\n".
      "Class1: ".$playMatch[5]."\n".
      "Class2: ".$playMatch[6]."\n".
      "Class3: ".$playMatch[7]."\n".
      "\n";			
    }
  }
?>
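As an alternative to regular expressions, the same rows can be walked with PHP's built-in DOMDocument and DOMXPath. This is only a sketch: it swaps in the DOM extension rather than Simple Html Dom, it assumes the markup shown in the question (an S2H header row with the league name in the S2L cell, followed by match rows with seven cells), and `parseDraw` and the array keys are hypothetical names. A real page with Cyrillic text may also need an explicit encoding hint, which is left out here.

```php
<?php
// Sample markup based on the fragment in the question.
$html = <<<'HTML'
<table>
<tr class="S2H"><td colspan="4" class="S2L">Football. Up to 17 years old. Europe championship. Elite round</td><td class="bl">1</td><td>X</td><td class="br">2
</td></tr>
<tr><td>1</td><td>21.03 17:30</td><td class="S1L">Wales U17 - Sweden U17</td><td>0:1</td><td class="bl">32.00 / 30.81</td><td>28.00 / 26.00</td><td class="br">40.00 / 43.19
</td></tr>
</table>
HTML;

// Hypothetical helper: returns one array per match row, tagged with the
// league name taken from the most recent S2H header row.
function parseDraw(string $html): array
{
    libxml_use_internal_errors(true);   // tolerate sloppy real-world markup
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath   = new DOMXPath($doc);
    $tourney = '';
    $records = [];

    foreach ($xpath->query('//tr') as $row) {
        if ($row->getAttribute('class') === 'S2H') {
            // Header row: remember the league name for the rows that follow.
            $league  = $xpath->query('td[@class="S2L"]', $row)->item(0);
            $tourney = $league ? trim($league->textContent) : '';
            continue;
        }

        // Match row: collect the trimmed text of every cell.
        $cells = [];
        foreach ($xpath->query('td', $row) as $td) {
            $cells[] = trim($td->textContent);
        }
        if (count($cells) < 7) {
            continue;                    // not a match row
        }

        $records[] = [
            'id'      => (int) $cells[0],
            'date'    => $cells[1],
            'tourney' => $tourney,
            'match'   => $cells[2],
            'score'   => $cells[3],
            'kef'     => array_slice($cells, 4, 3),
        ];
    }

    return $records;
}

$records = parseDraw($html);
```

Because the league name is carried forward from each S2H row, every match record ends up with its own `tourney` value even though that text never appears in the match row itself.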

Artem Frolov, 2016-03-22
@Select1d

I've managed to get the array to contain the following data:

Football. Up to 17 years old. Europe championship. Elite round1X2
121.03 17:30Wales U17 - Sweden U170:132.00 / 30.8128.00 / 26.0040.00 / 43.19

The source code looks like this:
Everything runs together, as I understand it, because no CSS styles are applied. But that's not the point.
I need to convert it to this:
That is, insert spaces and put each attribute into a separate variable (id, date, match, score, koeficienti). Please suggest a regular expression that replaces the tags with spaces so that I can split everything into variables.
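One way this could be sketched: replace every tag with a separator character instead of a space, then split on that separator, since the cell values themselves contain spaces. The helper name `splitRow`, the `|` separator (assumed never to occur in the cell text), and the variable names are all assumptions made for illustration.

```php
<?php
// One match row as raw markup, based on the fragment in the question.
$rowHtml = '<tr><td>1</td><td>21.03 17:30</td><td class="S1L">Wales U17 - Sweden U17</td>'
         . '<td>0:1</td><td class="bl">32.00 / 30.81</td><td>28.00 / 26.00</td>'
         . '<td class="br">40.00 / 43.19' . "\n" . '</td></tr>';

// Hypothetical helper: turns a row of markup into a list of cell texts.
function splitRow(string $rowHtml): array
{
    // Replace every tag with "|" so that cell boundaries survive.
    $flat = preg_replace('#<[^>]+>#', '|', $rowHtml);
    // Split on runs of separators (with surrounding whitespace), dropping empty pieces.
    $parts = preg_split('#\s*\|+\s*#', $flat, -1, PREG_SPLIT_NO_EMPTY);
    return array_map('trim', $parts);
}

// Spread the cells into separate variables.
list($id, $date, $match, $score, $kef1, $kefX, $kef2) = splitRow($rowHtml);

// Each odds cell holds two numbers separated by " / ".
$odds1 = array_map('floatval', explode(' / ', $kef1));
```

Splitting on a dedicated separator keeps values such as `21.03 17:30` and `Wales U17 - Sweden U17` intact, which plain space substitution would not.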