How to parse the array returned by the server as a regular string?
I have a task and it isn't yet clear where to start digging. I'd rather not use regular expressions, but so far I don't see another solution. It can be done in either PHP or Python; I have only just started with the latter.
The page loads its data on request through a JS script. I caught the request and now send it myself; the response comes back as a plain string, not JSON. Like this:
"var A=Array(376);
var B=Array(135);
var C=Array(66);
A[1]=[1765540,14,2799,4790,'Ts','MSC', '2019,8,7,21,00,00',-1,3,2,1,2,1,1,1,2,'20','13','','',82,' ','',8,0];A[2]=[1706041,83,4134,19230,'3DF','rSC','2019,8,7,21,00,00',-1,3 ,1,2,0,0,0,0,0,'14','8','','',66,'','',0,0];
..."
And so on. The problem is that this is a string rather than JSON, and it somehow has to be turned into an array: I need to extract array A and parse it into a PHP/Python array. Apart from regular expressions, I haven't been able to find or google a solution, apparently because the task is fairly rare. I'd rather not bother with regular expressions, because the array is always large (200 to 1000 values) and you never know where a semicolon will turn up that breaks the line incorrectly, or some other trick.
P.S. I edited the question so that nobody is confused by the code tag. I wanted to highlight the code to make it clearer, but it turned out as it always does...
import re
import pandas as pd
from io import StringIO
srv_data = "A[1]=[1765540,14,2799,4790,'Ts','MSC','2019,8,7,21,00,00',-1,3,2,1,2,1,1,1,2,'20','13','','',82,'','',8,0];"\
"A[2]=[1706041,83,4134,19230,'3DF','rSC','2019,8,7,21,00,00',-1,3,1,2,0,0,0,0,0,'14','8','','',66,'','',0,0];"
csv_data = '\n'.join(re.findall(r'=\[(.+?)\];', srv_data))
# Variant without regular expressions:
# csv_data = '\n'.join(line.split('=')[1].strip('[]') for line in srv_data.split(';') if line)
df = pd.read_csv(StringIO(csv_data), quotechar="'", header=None)
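A note on the semicolon worry from the question: the regex-free variant above splits on every `;`, so it would break if a quoted field ever contained one. Below is a minimal sketch of a quote-aware splitter. The function name `split_statements` and the sample string are made up for illustration, and it assumes the server never sends escaped quotes (`\'`) inside a string.

```python
def split_statements(js_text):
    """Split on ';' only when the ';' is outside a single-quoted string.

    Assumption: strings never contain escaped quotes (\\').
    """
    statements, current, in_quotes = [], [], False
    for ch in js_text:
        if ch == "'":
            in_quotes = not in_quotes  # entering or leaving a quoted string
        if ch == ";" and not in_quotes:
            stmt = "".join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


# Hypothetical sample with a semicolon hidden inside a quoted field
sample = "A[1]=[1,'a;b','2019,8,7,21,00,00'];A[2]=[2,'','x'];"
rows = split_statements(sample)
```

Each element of `rows` is then one whole `A[n]=[...]` assignment, which can go through the same `split('=')` and `strip('[]')` steps as before.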
import pandas as pd
data = "A[1]=[1765540,14,2799,4790,'Ts','MSC','2019,8,7,21,00,00',-1,3,2,1,2,1,1,1,2,'20','13','','',82,'','',8,0];"\
"A[2]=[1706041,83,4134,19230,'3DF','rSC','2019,8,7,21,00,00',-1,3,1,2,0,0,0,0,0,'14','8','','',66,'','',0,0];"
A = {}
exec(data) # Potentially dangerous operation, since the server response may contain malicious code
pd.DataFrame(A.values())
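If running `exec` on a server response feels too risky, the same idea could be sketched with `ast.literal_eval`, which only accepts literals and raises an error on anything else. The helper name `parse_array_a` is made up here, and the sketch assumes every row is also a valid Python literal (no `null`, `true` or numbers with leading zeros outside quotes) and that `];` never appears inside a quoted field.

```python
import ast
import re

# Matches "A[<index>]=[ ... ];" and captures the index and the bracketed row
ASSIGNMENT = re.compile(r"A\[(\d+)\]\s*=\s*(\[.*?\])\s*;", re.S)


def parse_array_a(data):
    """Return {index: row} without executing the server's text."""
    return {
        int(idx): ast.literal_eval(row)  # evaluates literals only
        for idx, row in ASSIGNMENT.findall(data)
    }


data = ("A[1]=[1765540,14,'Ts','2019,8,7,21,00,00',-1,''];"
        "A[2]=[1706041,83,'3DF','2019,8,7,21,00,00',-1,'14'];")
A = parse_array_a(data)
```

The resulting dict has the same shape as the one filled in by `exec`, so `pd.DataFrame(list(A.values()))` should work on it as well, with numbers kept as integers and quoted fields kept as strings.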
The author's fear of regular expressions is hard to understand.
As if any other parser wouldn't trip over a stray semicolon.
Especially since regular expressions make the code orders of magnitude shorter.
preg_match_all('!A\[\d+\]=\[(.*?)\]!', $s, $matches);
$data = [];
foreach ($matches[1] as $row) {
    $data[] = str_getcsv($row, ",", "'");
}
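Since the question also mentions Python, here is a rough equivalent of the PHP above using only the standard library. The function name `rows_from_response` is an illustrative choice; `skipinitialspace=True` is added because the sample in the question has a space after one of the commas. As with `str_getcsv`, every field comes back as a string.

```python
import csv
import re


def rows_from_response(s):
    """Extract each A[n]=[...] body and parse it as a CSV line quoted with '."""
    bodies = re.findall(r"A\[\d+\]=\[(.*?)\]", s)
    reader = csv.reader(bodies, delimiter=",", quotechar="'",
                        skipinitialspace=True)
    return list(reader)


s = ("var A=Array(376);"
     "A[1]=[1765540,14,'Ts','MSC', '2019,8,7,21,00,00',-1,''];"
     "A[2]=[1706041,83,'3DF','rSC','2019,8,7,21,00,00',-1,'14'];")
data = rows_from_response(s)
```

The quoted date field stays in one piece despite its commas, because the `csv` module respects the quote character just as `str_getcsv` does.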