Perl
Non_Joy, 2016-05-12 12:48:24

Regular expressions. How to extract the desired text?

There is text like this:

<img id = "3" class="lazy" src="/media/a.jpg" data-original="/media/a.jpg" alt="text">

I need to strip away everything except the /media/a.jpg value of data-original.
I can't figure it out myself. I need to pull about 2 thousand pictures from the site. Everything else is already written.

5 answer(s)
Ivan Bogachev, 2016-05-12
@Non_Joy

You could look into sed.
It can pull the data-original value out of your string.

Сергей Горностаев, 2016-05-12
@sergey-gornostaev Python tag curator

This will be faster and simpler than regular expressions:

tag = '<img id = "3" class="lazy" src="/media/a.jpg" data-original="/media/a.jpg" alt="text">'
marker = 'data-original="'
# Start right after the opening quote of the attribute value
pos1 = tag.index(marker) + len(marker)
# Stop at the closing quote
pos2 = tag.index('"', pos1)
link = tag[pos1:pos2]

targumon, 2016-05-12
@targumon

use Mojo::DOM;

my $text = '<img id = "3" class="lazy" src="/media/a.jpg" data-original="/media/a.jpg" alt="text">';
my $data_original = Mojo::DOM->new( $text )->find( 'img' )->map( attr => 'data-original' );

print "$_\n" foreach @$data_original;
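For anyone doing this in Python rather than Perl, roughly the same parser-based idea can be sketched with the standard library's html.parser. This is only a sketch: the class name DataOriginalCollector and the helper extract_data_original are made up for illustration and are not part of any library.

```python
from html.parser import HTMLParser


class DataOriginalCollector(HTMLParser):
    """Collects the data-original attribute of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag and attribute names arrive lowercased
        if tag == "img":
            value = dict(attrs).get("data-original")
            if value:
                self.links.append(value)


def extract_data_original(html):
    """Hypothetical helper: return all data-original values found in the markup."""
    parser = DataOriginalCollector()
    parser.feed(html)
    parser.close()
    return parser.links


sample = '<img id = "3" class="lazy" src="/media/a.jpg" data-original="/media/a.jpg" alt="text">'
print(extract_data_original(sample))
```

A parser is slower than plain string searching, but it does not depend on attribute order or on which quote style the page uses.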

zergon321, 2017-01-17
@zergon321

import re

# The attribute is data-original (with a hyphen); [^"]+ also allows digits and dashes in file names
reg = re.compile(r'data-original="([^"]+\.jpg)"')
print(reg.findall('<img id = "3" class="lazy" src="/media/a.jpg" data-original="/media/a.jpg" alt="text">'))

Веня Бабийман, 2018-12-18
@vanyabrovaru

The regex itself:

echo '<img id = "3" class="lazy" src="/media/a.jpg" data-original="/media/a.jpg" alt="text">' | perl -lne 'print $1 if /data-original="([^"]+)"/'
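Since the question is about roughly 2 thousand pictures, the same pattern can be run over a whole page at once. Below is a minimal Python sketch of that, assuming the page text is already in a string. The name find_image_links and the second img tag (b.jpg) are made up for illustration.

```python
import re

# Same pattern as the Perl one-liner: capture everything between the quotes
DATA_ORIGINAL_RE = re.compile(r'data-original="([^"]+)"')


def find_image_links(page):
    """Return every data-original value found in the page text, in order."""
    return DATA_ORIGINAL_RE.findall(page)


# Hypothetical page fragment; the second tag is invented for the example
page = (
    '<img id = "3" class="lazy" src="/media/a.jpg" data-original="/media/a.jpg" alt="text">\n'
    '<img id = "4" class="lazy" src="/media/b.jpg" data-original="/media/b.jpg" alt="more">'
)
for link in find_image_links(page):
    print(link)
```

This assumes the attribute values are always double-quoted, as in the sample tag. If the markup varies, a parser is the safer choice.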
