Python
Ruslan, 2021-01-12 13:53:00

How do I skip ahead to the right place with SkipTo in pyparsing?

Please help me understand pyparsing. I have test code like this:

from pprint import pprint
from pyparsing import ParseException, CaselessLiteral, White, Suppress, Combine, CaselessKeyword, Word, Optional, nums, alphas, SkipTo, ZeroOrMore, OneOrMore, printables, Group, LineStart, LineEnd, QuotedString, delimitedList, pyparsing_unicode

source_string = '''

... lots of text above ...

Номер нарушения: 82364311 (ACTIVE)
****************** Источник инцидента: ******************
Тип инцидента: Source IP - 10.12.2.138
Description: Одновременное подключения нескольких пользователей Рекомендации по устранению: 1.Проанализировать IP источников 
****************** Общая информация ******************
Description: OTR.WIN.CASE04: Windows Multiple Network Logins
Категория инцидента: N/A
Дата и время обнаружения нарушения ИБ: 2019-11-09 11:33:29
Дата и время окончания нарушения ИБ: 2019-11-09 11:33:29
Credibility:3 Relevance:6 Severity:1
Тип правила: EVENT
Всего событий и потоков: 9
****************** Информация об активе ******************
Основные свойства
Given Name: OTRBSB01
Unified Name: OTRBSB01
Description: Сервер БД 
"DataBase DB PROD" 
БД системы: 
[CL_OTRB_7307] – основная БД ICM, содержит все данные и параметры приложения ICM 
[CL_OTRB_SYS] – содержит данные и па
Network: N/A
Business Owner: N/A
Business Contact: N/A
Technical Owner: N/A
Technical Contact: N/A
Technical User: N/A

... lots of text below ...

'''

try:
    _printables = printables + pyparsing_unicode.Cyrillic.alphas + '–'

    _onsource_id = CaselessKeyword('Номер нарушения:') + Word(_printables) + Suppress(Optional(Word(_printables)))
    _name = CaselessKeyword('Description:') + Word(_printables + ' ')
    _onsource_ts = CaselessKeyword('Дата и время обнаружения нарушения ИБ:') + Word(_printables + ' ')

    _common_information = OneOrMore(_onsource_id) \
        + Suppress(ZeroOrMore(Word(_printables + ' ') + SkipTo(Word('*')+CaselessKeyword('Общая информация')+Word('*')))) \
        | OneOrMore(_name) \
        | OneOrMore(_onsource_ts)

    pprint(_common_information.searchString(source_string).asList())

except ParseException as err:
    print(err.line)
    print(" " * (err.column - 1) + "^")
    print(err)

When executed, I get the following result:


Description: occurs 3 times in the text. I only need the one in the "Общая информация" (General Information) block, and only after first capturing the "Номер нарушения" (Violation Number).

I excluded the first Description: by skipping ahead to the "General Information" block like this:
Suppress(ZeroOrMore(Word(_printables + ' ') + SkipTo(Word('*')+CaselessKeyword('Общая информация')+Word('*'))))


If I add another skip after OneOrMore(_onsource_ts) like this:
| OneOrMore(_onsource_ts) \
+ Suppress(OneOrMore(Word(_printables + ' ') + SkipTo(Word('*'))))

then the result does not change.
And if I add it as a separate alternative with | instead:
| OneOrMore(_onsource_ts) \
| Suppress(OneOrMore(Word(_printables + ' ') + SkipTo(Word('*'))))

then the result is:

[[], ['Description:', 'Сервер БД ']]


Please help me figure out what I'm doing wrong.


1 answer(s)
Radavan, 2021-01-12
@Radavan

The result is a list, so just select the element you need from that list by index - 1
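
To illustrate the indexing suggestion, and one possible way to avoid the unwanted matches in the first place, here is a minimal sketch. It rests on two observations about the grammar in the question: in Python `+` binds tighter than `|`, so the SkipTo part belongs only to the first alternative, and searchString reports every alternative that matches anywhere in the text as a separate list element. The sketch instead chains the fields into one sequence with `+`, so each SkipTo runs from where the previous field ended. The shortened `sample` text and the names `record`, `incident_id`, `general_header`, `description` and `detected` are made up for this illustration and are not from the original code.

```python
from pyparsing import Literal, Regex, SkipTo, Suppress, Word, nums

# Shortened, hypothetical version of the text from the question.
sample = """\
Номер нарушения: 82364311 (ACTIVE)
****************** Источник инцидента: ******************
Description: first description, not wanted
****************** Общая информация ******************
Description: OTR.WIN.CASE04: Windows Multiple Network Logins
Дата и время обнаружения нарушения ИБ: 2019-11-09 11:33:29
****************** Информация об активе ******************
Description: third description, not wanted
"""

desc_label = Literal("Description:")
date_label = Literal("Дата и время обнаружения нарушения ИБ:")
general_header = Literal("Общая информация")

# Each field suppresses its label and keeps only the value under a name.
incident_id = Suppress(Literal("Номер нарушения:")) + Word(nums)("id")
description = Suppress(desc_label) + Regex(r"[^\n]+")("description")
detected = Suppress(date_label) + Regex(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")("detected")

# One sequence joined with "+": every SkipTo starts where the previous
# element stopped, so the first "Description:" is jumped over.
record = (
    incident_id
    + Suppress(SkipTo(general_header, include=True))  # jump past the block header
    + Suppress(SkipTo(desc_label))                    # stop right before the label
    + description
    + Suppress(SkipTo(date_label))
    + detected
)

matches = record.searchString(sample)  # a list-like result, one entry per match
first = matches[0]                     # pick the match you need by index

print(first["id"])
print(first["description"])
print(first["detected"])
```

The third Description: is never reached because the sequence ends after the date field. If the real text contains several incidents, `matches` should hold one entry per incident, and the same indexing applies.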
