Python
Vladimir, 2018-12-21 07:00:32

How do I get all XSD schema validation errors for an XML file?

I am stuck on validating an XML file against my XSD schema. I use Python with the lxml module for this. The code is the following:

import os
import sys
import datetime
from lxml import etree

rootDir = os.path.dirname(sys.argv[0])
scheme = '{}\\{}'.format(rootDir,'ListForRating_v03\\ListForRating_v03.xsd')
pathToXML = '{}\\{}'.format(rootDir,'xml')

class Validator:

  def __init__(self, xsd_path: str):
    xmlschema_doc = etree.parse(xsd_path)
    self.xmlschema = etree.XMLSchema(xmlschema_doc)

  def validate(self, xml_path: str) -> str:
    xml_doc = etree.parse(xml_path)
    #result = self.xmlschema.validate(xml_doc)
    try:
      self.xmlschema.assert_(xml_doc)
      return 'Valid! :)'
    except AssertionError as e:
      print(str(e))
    return 'Not valid! :('

validator = Validator(scheme)


Files = os.listdir(pathToXML)
t = 'files' if len(Files) > 1 else 'file'
print('Scanning %s %s' % (len(Files), t))
timeBegin = datetime.datetime.now()
for file_name in Files:
  print('{}: '.format(file_name), end='')
  file_path = '{}\\{}'.format(pathToXML, file_name)
  print(validator.validate(file_path))
timeEnd = datetime.datetime.now()
delta = timeEnd - timeBegin
print('Scanning complete. Total time: {} seconds'.format(delta.seconds))

Everything works correctly except for one thing: as soon as the first mismatch with the schema is found, the output stops there. I need to display all the errors in the XML file, as is done, for example, in the XML Tools plugin for Notepad++. Can you please tell me what to change in the code to get all the errors?
Thanks in advance.

2 answer(s)
Vladimir, 2018-12-21
@darkelephant

Dmitry, thanks for the link. I figured it out and finished the functionality the way I wanted. Here is the code, in case it helps someone:

import os
import sys
import datetime
from lxml import etree

rootDir = os.path.dirname(sys.argv[0])
scheme = '{}\\{}'.format(rootDir, 'ListForRating_v03\\ListForRating_v03.xsd')
pathToXML = '{}\\{}'.format(rootDir, 'xml')


class Validator:

    def __init__(self, xsd_path: str):
        xmlschema_doc = etree.parse(xsd_path)
        self.xmlschema = etree.XMLSchema(xmlschema_doc)

    def validate(self, xml_path: str) -> str:
        xml_doc = etree.parse(xml_path)
        try:
            self.xmlschema.assertValid(xml_doc)
            return 'Valid! :)'
        except etree.DocumentInvalid:
            # error_log holds every error found, not only the one that was raised
            msg = 'Not valid! :( Errors saved to errors.log'
            with open('errors.log', 'a') as f:
                for error in self.xmlschema.error_log:
                    f.write('File name: {} Error: {} Line: {}.\n'.format(os.path.basename(xml_path), error.message, error.line))
            return msg

validator = Validator(scheme)

Files = os.listdir(pathToXML)
t = 'files' if len(Files) > 1 else 'file'
print('Scanning %s %s' % (len(Files), t))
timeBegin = datetime.datetime.now()
for file_name in Files:
    print('{}: '.format(file_name), end='')
    file_path = '{}\\{}'.format(pathToXML, file_name)
    print(validator.validate(file_path))
timeEnd = datetime.datetime.now()
delta = timeEnd - timeBegin
print('Scanning complete. Total time: {} seconds'.format(delta.seconds))
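
The key point is that the schema object's error_log keeps every error from the last validation, so no exception handling is strictly needed. Below is a minimal sketch of just that part, assuming lxml is installed. The helper name collect_errors and the small in-memory XSD and XML samples are hypothetical and exist only for illustration; they are not part of the script above.

```python
from lxml import etree

# Hypothetical in-memory schema, used only for illustration.
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="a" type="xs:int"/>
        <xs:element name="b" type="xs:int"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# Hypothetical document with two separate type errors.
XML = b"""<root>
  <a>x</a>
  <b>y</b>
</root>"""


def collect_errors(schema, doc):
    """Return (line, message) pairs for every error the schema reports."""
    # validate() returns a bool instead of raising, and fills schema.error_log.
    if schema.validate(doc):
        return []
    return [(entry.line, entry.message) for entry in schema.error_log]


schema = etree.XMLSchema(etree.fromstring(XSD))
doc = etree.fromstring(XML)
errors = collect_errors(schema, doc)
for line, message in errors:
    print('Line {}: {}'.format(line, message))
```

Returning a list instead of writing to a file straight away makes it easy to decide later whether the errors go to the console, to errors.log, or both.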

Dmitry Shitskov, 2018-12-21
@Zarom

https://stackoverflow.com/questions/11581351/how-t...
