How do I remove lines with duplicate substrings from a text file?
I have a text file, user.js:
https://gist.githubusercontent.com/anonymous/8b1e7...
It has many lines that repeat the same substring.
For example, line 1189 is
"user_pref("geo.wifi.uri", ""); // comments;"
and line 12 is
"user_pref("geo.wifi.uri", "");"
To a text editor these are different lines, but logically they are the same line: both make the same user_pref call, and only the trailing comment differs.
There are many such lines in the file.
How can I remove these duplicates?
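#!/usr/bin/python3
import re

# Match a user_pref(...); call at the start of a line; anything after it
# (such as a trailing // comment) is not part of the match.
ptrn = re.compile(r'^\s*user_pref\(([^\)]+)\);').search
unic = set()  # user_pref calls that have already been written out
with open("user.js", "r") as fi, open("user_nodup.js", "w") as fo:
    for s in fi:
        m = ptrn(s)
        if m:
            data = m.group(0)
            if data in unic:
                print(s, end="")  # duplicate: report it and skip it
                continue
            unic.add(data)
        # First occurrences and lines that are not user_pref calls are kept.
        fo.write(s)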
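The same idea can be sketched as a function that works on a list of lines, which makes it easy to try out without touching the file. This is only an illustration under a few assumptions: the names `PREF_CALL` and `dedupe_prefs` are hypothetical, the sample lines are made up, and the pattern assumes the pref name is a double-quoted string and that the value itself does not contain `);`. Unlike the script above, it compares the name and value separately, so differences in indentation or spacing around the comma do not hide a duplicate.

```python
import re

# Hypothetical pattern: captures the quoted pref name and the raw value,
# ignoring surrounding whitespace and anything after the closing ");".
PREF_CALL = re.compile(r'^\s*user_pref\(\s*("[^"]*")\s*,\s*(.*?)\s*\);')


def dedupe_prefs(lines):
    """Return the lines with repeated user_pref calls removed (first one wins)."""
    seen = set()
    kept = []
    for line in lines:
        m = PREF_CALL.match(line)
        if m:
            key = (m.group(1), m.group(2))  # (name, value) pair
            if key in seen:
                continue  # same call seen earlier, drop this line
            seen.add(key)
        kept.append(line)  # first occurrences and non-pref lines are kept
    return kept


# Made-up sample input, modelled on the lines quoted in the question.
sample = [
    'user_pref("geo.wifi.uri", "");\n',
    '// an ordinary comment line\n',
    '  user_pref("geo.wifi.uri",  ""); // comments;\n',
]
for line in dedupe_prefs(sample):
    print(line, end="")
```

To apply it to the real file, read the lines with `open("user.js").readlines()` and write the returned list to a new file with `writelines`.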