Python
r4khic, 2019-08-23 07:35:53

How to strip extra nested data from a parsed date in Python?

Hello! How can I cut the extra data out of a parsed string?
I parse dates from 10 news resources, and one of them is giving me trouble.
When I parse the date on that resource, the output looks like this:
5d5f6c666e5db191152393.png
What is the best way to get the date as: 23 August 2019 10:25
Code:

# Collect dates from the pages.
from bs4 import BeautifulSoup
import dateparser

def get_item_datetime(item_page, datetime_rule, datetime1_rule):
    soup = BeautifulSoup(item_page, 'lxml')
    # Primary rule: (tag name, attribute name, attribute value).
    item_datetime = soup.find(datetime_rule[0], {datetime_rule[1]: datetime_rule[2]})
    if item_datetime is not None:
        # Reuse the tag that was already found instead of searching again.
        item_datetime = item_datetime.text
        print(item_datetime)
        #item_datetime = dateparser.parse(item_datetime, date_formats=['%d %B %Y %H'])
    elif len(datetime1_rule) == 3:
        # Fallback rule, used only when the primary rule matches nothing.
        fallback = soup.find(datetime1_rule[0], {datetime1_rule[1]: datetime1_rule[2]})
        # dateparser.parse expects a string, so pass the tag's text, not the Tag.
        item_datetime = '' if fallback is None else dateparser.parse(fallback.text, date_formats=['%d %B %Y %H'])
    else:
        item_datetime = ''
    return item_datetime


1 answer(s)
FeNUMe, 2019-08-23
@FeNUMe

Use decompose() to remove the nested view count span from the date div.
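
A minimal sketch of that approach, under stated assumptions: the question does not show the page markup, so the sample HTML, the "date" and "views" class names, and the helper name extract_date_text are all hypothetical. It uses html.parser so the sample runs without lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real tag and class names are not shown in the question.
SAMPLE_HTML = """
<div class="date">23 August 2019 10:25 <span class="views">1 024</span></div>
"""

def extract_date_text(html, date_rule, nested_tag="span"):
    """Return the date element's text with nested `nested_tag` elements removed."""
    soup = BeautifulSoup(html, "html.parser")
    date_tag = soup.find(date_rule[0], {date_rule[1]: date_rule[2]})
    if date_tag is None:
        return ""
    # decompose() removes each nested tag, together with its contents, from the tree.
    for nested in date_tag.find_all(nested_tag):
        nested.decompose()
    return date_tag.get_text(strip=True)

date_text = extract_date_text(SAMPLE_HTML, ("div", "class", "date"))
print(date_text)  # 23 August 2019 10:25
```

The cleaned string can then be passed to dateparser.parse() as in the question's code. Note that decompose() modifies the parsed tree in place, so read anything you need from the nested span before removing it.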
