Python
sudo rm -rf /, 2017-06-18 03:20:21

Why does the find/xargs pipeline only process one file?

Good afternoon!
After some time spent reading fb2 books from one translator, I put together a set of regular expressions to correct recurring issues. Applying them one at a time, and to every remaining file, takes a long time, so I wrote a simple Python script (right after learning the basics, in fact).
Here is what I came up with:
- csetfix.sh

enconv -x UTF-8 "$1"
python bfix.py "$1"

- bfix.py
# -*- coding: utf-8 -*-
import re
import sys

if len(sys.argv) < 2:
  exit(1)

reg = [
  {'find':r'windows-1251', 'replace':r'utf-8'},
  {'find':r'-нить', 'replace':r'-нибудь'},
  {'find':r'кол-в', 'replace':r'количеств'},
  {'find':r'-же', 'replace':r' же'},
  {'find':r'так же', 'replace':r'также'},
  {'find':r'Однако,', 'replace':r'Однако'},
  {'find':r'Высочество', 'replace':r'Величество'},
  {'find':r'Глава №?(\d+)\.? ', 'replace':r'Глава \1: '},
  {'find':r'(?<=<body>)\s+<title>[\s\S]+?</title>', 'replace':r''},
  {'find':r'– -{30,43},', 'replace':r' ================================ '},
  {'find':r'<p>\s+', 'replace':r'<p>'},
  {'find':r'\s+</p>', 'replace':r'</p>'},
  {'find':r'<emphasis>\s+', 'replace':r'<emphasis>'},
  {'find':r'\s+</emphasis>', 'replace':r'</emphasis>'},
  {'find':r'</emphasis>\s*<emphasis>', 'replace':r' '},
  {'find':r'<strong>\s+', 'replace':r'<strong>'},
  {'find':r'\s+</strong>', 'replace':r'</strong>'},
  {'find':r'</strong>\s*<strong>', 'replace':r' '},
  {'find':r'<strong></strong>', 'replace':r''},
  {'find':r'<emphasis></emphasis>', 'replace':r''},
  {'find':r'<p></p>', 'replace':r''},
  {'find':r'( –)(?= (?:(?:потому )?что(?:бы)?|если|то|а|да|и|или|однако|но) )', 'replace':r','},
  {'find':r'\.</strong>', 'replace':r'</strong>'},
  {'find':r':</strong>([^\s<])', 'replace':r':</strong> \1'},
  {'find':r'(= </p>\s+<p><strong>)– ', 'replace':r'\1'},
  {'find':r'\s+(\.|,|!|\?)', 'replace':r'\1'},
  {'find':r'\n+', 'replace':r'\n'}
]

file = open(sys.argv[1], "r")
result = ""
for line in file.readlines():
  result = result+line
file.close()
for i in range(0,len(reg)):
  result = re.sub(reg[i]['find'], reg[i]['replace'], result)
result = re.split("\n+", result)
file2 = open(sys.argv[1], "w")
for line in result:
  file2.write(line+'\n')
file2.close()

However, when I run
find -type f -name "*.fb2" | sort | xargs python bfix.py

only the file whose path find prints first is changed.
What causes this, and how can I fix it?

1 answer(s)
jcmvbkbc, 2017-06-18
@MaxLevs

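Your Python script processes only one file: the one passed as its first argument. Without additional options, xargs invokes the given command with as many arguments as fit on a single command line, so every path after the first is ignored by the script. To make this pipeline work, run the script once per file: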
find -type f -name "*.fb2" | sort | xargs -n1 python bfix.py
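
As an alternative to -n1, the script itself could loop over all of its arguments. Below is a minimal sketch, not the original script: it assumes Python 3 and UTF-8 input (which the enconv step in csetfix.sh is meant to produce). The names RULES, fix_text, fix_file and main are illustrative, and the two-rule list is only a stand-in for the full reg list from the question.

```python
import re
import sys

# Illustrative two-rule stand-in for the full `reg` list in bfix.py (assumption).
RULES = [
    (r'windows-1251', r'utf-8'),
    (r'\n+', r'\n'),
]


def fix_text(text, rules=RULES):
    """Apply every (pattern, replacement) pair to the text, in order."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


def fix_file(path, rules=RULES):
    """Rewrite one file in place with all substitutions applied."""
    with open(path, "r", encoding="utf-8") as src:
        text = src.read()
    with open(path, "w", encoding="utf-8") as dst:
        dst.write(fix_text(text, rules))


def main(paths):
    """Process every path given, not only the first one."""
    for path in paths:
        fix_file(path)


# sys.argv[1:] holds all paths that xargs appended to the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1:])
```

With a loop like this, the original command without -n1 should handle every file, because each invocation started by xargs processes all the paths it receives.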
