S
S
Stepan Sidorov2020-12-09 12:15:52
Python
Stepan Sidorov, 2020-12-09 12:15:52

How to read a file/image from an XML document?

Hello.
I have an XML document, I get it from the Docx document.
And here I am trying to get a picture, i.e. it in bytes so that you can save the picture to a folder.
Here is a picture in this document

<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape" xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml">
  <w:pPr>
    <w:pStyle w:val="Normal"/>
    <w:bidi w:val="0"/>
    <w:jc w:val="start"/>
    <w:rPr/>
  </w:pPr>
  <w:r>
    <w:rPr/>
    <w:drawing>
      <wp:anchor behindDoc="0" distT="0" distB="0" distL="0" distR="0" simplePos="0" locked="0" layoutInCell="0" allowOverlap="1" relativeHeight="2">
        <wp:simplePos x="0" y="0"/>
        <wp:positionH relativeFrom="column">
          <wp:align>center</wp:align>
        </wp:positionH>
        <wp:positionV relativeFrom="paragraph">
          <wp:posOffset>635</wp:posOffset>
        </wp:positionV>
        <wp:extent cx="6120130" cy="6120130"/>
        <wp:effectExtent l="0" t="0" r="0" b="0"/>
        <wp:wrapSquare wrapText="largest"/>
        <wp:docPr id="1" name="Image1" descr="" title=""/>
        <wp:cNvGraphicFramePr>
          <a:graphicFrameLocks xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" noChangeAspect="1"/>
        </wp:cNvGraphicFramePr>
        <a:graphic xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
          <a:graphicData uri="http://schemas.openxmlformats.org/drawingml/2006/picture">
            <pic:pic xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture">
              <pic:nvPicPr>
                <pic:cNvPr id="1" name="Image1" descr="" title=""/>
                <pic:cNvPicPr>
                  <a:picLocks noChangeAspect="1" noChangeArrowheads="1"/>
                </pic:cNvPicPr>
              </pic:nvPicPr>
              <pic:blipFill>
                <a:blip r:embed="rId2"/>
                <a:stretch>
                  <a:fillRect/>
                </a:stretch>
              </pic:blipFill>
              <pic:spPr bwMode="auto">
                <a:xfrm>
                  <a:off x="0" y="0"/>
                  <a:ext cx="6120130" cy="6120130"/>
                </a:xfrm>
                <a:prstGeom prst="rect">
                  <a:avLst/>
                </a:prstGeom>
              </pic:spPr>
            </pic:pic>
          </a:graphicData>
        </a:graphic>
      </wp:anchor>
    </w:drawing>
  </w:r>
</w:p>


How can this be done with Python?

Answer the question

In order to leave comments, you need to log in

1 answer(s)
L
Lone Ice, 2020-12-09
@always-prog

We all know that docx is an archive that contains xml with markup and pictures in a separate folder. It turns out that the task is to find an image (Image1) in the xml code, and then find an image with the same name in the folder.

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question