A
A
AlexHDreal2020-06-01 17:11:10
Python
AlexHDreal, 2020-06-01 17:11:10

How to copy files with the same name under different names, and not overwrite the created file when a file of the same name is found?

PYTHON

Help. The disk has an unknown number of files with the same name "pic one". You need to find them all and make copies, but with different sequential names (example: pic1, pik2, pic3.......) to avoid overwriting the same file found with the same name.

#КОД ИЩЕТ ФАЙЛ И КОПИРУЕТ ЕГО В УКАЗАННОЕ МЕСТО

import os

    name = 'pic one.jpg'
    disk = ('C:\\')

    for root, dirs, files in os.walk(disk):

        if name in dirs or name in files:
        print(f'We find {name} in {root}')

        filen1 = (f'{root}' + str('\\') + f'{name}')
        filen2 = 'C:\\Users\\........\\Desktop\\test\\pic.jpg'

        file1 = open(filen1, 'rb')
        file2 = open(filen2, 'wb')

        file2.write(file1.read())

        file1.close()
        file2.close()
        print('COPY COMPLETED')

print ('THE END')
input()

Answer the question

In order to leave comments, you need to log in

1 answer(s)
S
Sergey Pankov, 2020-06-02
@trapwalker

There are a few simple rules:
0. Watch for indentation and format the code correctly.
For python, this is the most important rule. You incorrectly set the tags for wrapping the code, I removed the extra ones for you, but your indents in the code are broken. This must be treated carefully.
Remember that in addition to spaces, there are tabs. A tabulation is one wide character, but its width is not regulated anywhere, and python understands its nesting in the code by the number of empty characters at the beginning of the line. By itself, mixing tabs and spaces in Python code leads to terrible confusion and guaranteed errors. Holivars "spaces versus tabs" is not one hundred man-years old, but there is a general rule fixed in the PEP: use 4 spaces as one indent. Always. With no exceptions.
one. Do not use parentheses where they are not needed:

disk = ('C:\\')
filen1 = (f'{root}' + str('\\') + f'{name}')

In Python, brackets are commonly used to denote tuples. The tuple is formed by commas in the expression, but the necessary elements are limited by brackets. In any case, extra parentheses in python are bad style. In your case, they do not break the logic, but form a bad habit for you.
1. Don't manually concatenate paths like you do here: Casting string to string is pointless. We could have just written: , but this is also a bad idea, because you cannot manually concatenate paths. Different operating systems use different separator characters in paths. In python, there are modules for working with paths: os.path (ancient as mammoth dung) and pathlib (new, object, very convenient). 2. Use pathlib to work with paths. Explore this library. You will discover a lot
f'{root}' + str('\\') + f'{name}'

3. Use the with statement when dealing with closable resources.
Your code:
file1 = open(filen1, 'rb')
file2 = open(filen2, 'wb')
file2.write(file1.read())
file1.close()
file2.close()

Will become like this:
with open(filen1, 'rb') as file1, open(filen2, 'wb') as file2:
    file2.write(file1.read())

And using pathlib is even more concise:
directory = Path('path/to/my/dir')
fn1 = directory / 'file1.jpg'
fn2 = directory / 'file2.jpg'
file2.write_bytes(file1.read_bytes())

The slashes will be placed in the right direction themselves, as the system needs, you will never have to think about it again. You don't have to think about escaping slashes like you did in your paths. The division operator is blocked in the library, it is convenient to collect the path through it.
And it's also full of all sorts of conveniences like checking the existence of a file, creating directories, deleting ...
4. Do not mix working with directories and files:
if name in dirs or name in files:
In some cases, there may be similarities, because in the file system a directory is also a file . But you will not be able to read it, as such, or write it down. According to the logic of your program, it turns out that if there is a directory with the same name and extension, then you will also try to read it. this is where your code will fail.
By the way, pathlib has a great rglob method that recursively iterates over all paths that match the mask. You can do this with respect to any directory: it will find all jpeg images on your c: drive. In this case, you will get an object that you can iterate over with a loop or turn it into a list. The elements of such a sequence will also be of type pathlib.Path. 5. To solve any problem, break it down into elementary parts and solve them separately.
files = Path('c:').rglob('*.jpg')

In this case, it is logical to write a separate function for securely copying a single file. This function should check if there is a target file name (if not, then it copies), what size the file is there (if it is the same up to a byte, then you need to check the hash and refuse to copy if it matches, if it is different, then you need to change the target name or path ). With pathlib, you can substitute the extension and quickly get all files by mask in a directory. This will allow, for example, to quickly understand which is the latest file with a name of the form 'file_name_00023.jpg' using the mask 'file_name_*****.jpg', get its digital tail, turn it into an integer, add one and substitute it into the name again: Path('file_name_{last_index+1:05}.jpg').
6. I suspect you want to write secure image deduplication.
  • Read about hash functions and how they can help you tell for sure if two files are the same.
  • Read about symbolic links . These are like shortcuts, but at the file system level, you can work with files in the same way as with the originals. Such links will allow you not to copy or duplicate data, when you only need to somehow alternatively group them into directories.
  • Read about hard links . It looks like several names of the same file in arbitrary directories. If at least one name remains not deleted, then the file will be on the disk. The data is stored once, but the file lies in several places at once and there is no priority between them.
  • Consider logging. Sometimes it is more correct, instead of real operations with the file system, to write a detailed log of what you want to do in it, and then, at the end of the dangerous long process of compiling this log, you can approve and coordinate actions with the user, analyze the list of upcoming operations and perform them calmly or refuse them.

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question