Is this the best way to get a unique version of a filename w/ Python?

Still 'diving in' to Python, and want to make sure I'm not overlooking something. I wrote a script that extracts files from several zip files and saves the extracted files together in one directory. To prevent files with duplicate names from being overwritten, I wrote this little function, and I'm just wondering if there is a better way to do this? Thanks!

import os

def unique_filename(file_name):
    counter = 1
    file_name_parts = os.path.splitext(file_name)  # returns ('/path/file', '.ext')
    while os.path.isfile(file_name):
        file_name = file_name_parts[0] + '_' + str(counter) + file_name_parts[1]
        counter += 1
    return file_name

I really do require the files to be in a single directory, and numbering duplicates is definitely acceptable in my case, so I'm not looking for a more robust method (tho' I suppose any pointers are welcome), but just to make sure that what this accomplishes is getting done the right way.


Asked by: Emma617 | Posted: 27-01-2022






Answer 1

One issue is that your code above has a race condition: there is a gap between testing for existence and creating the file. This can have security implications (think of someone maliciously inserting a symlink to a sensitive file that they could not overwrite themselves, but that your program, running with higher privileges, could). Attacks like these are why functions such as os.tempnam() were deprecated (and later removed in Python 3).

To get around it, the best approach is to actually try to create the file in such a way that you get an exception if it already exists and, on success, return the opened file object. This can be done with the lower-level os.open function by passing both the os.O_CREAT and os.O_EXCL flags. Once opened, return the actual file (and optionally the filename) you created. E.g., here's your code modified to use this approach (returning a (file, filename) tuple):

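import errno
import os

def unique_file(file_name):
    counter = 1
    file_name_parts = os.path.splitext(file_name)  # returns ('/path/file', '.ext')
    while True:
        try:
            # O_EXCL makes os.open fail if the name already exists
            fd = os.open(file_name, os.O_CREAT | os.O_EXCL | os.O_RDWR)
            return os.fdopen(fd, 'w+'), file_name
        except OSError as e:
            if e.errno != errno.EEXIST:
                raise  # a real error (e.g. permissions), not a name clash
        file_name = file_name_parts[0] + '_' + str(counter) + file_name_parts[1]
        counter += 1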
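
On Python 3, the same create-or-fail idea can probably be written more briefly with the built-in open() in 'x' (exclusive creation) mode, which raises FileExistsError when the name is already taken. The function name unique_file_py3 is made up for this illustration, and this is only a minimal sketch assuming Python 3.3 or later:

```python
import os

def unique_file_py3(file_name):
    """Create and open a file that did not exist before; return (file, name).

    Hypothetical helper for illustration, assuming Python 3.3+.
    """
    base, ext = os.path.splitext(file_name)
    counter = 1
    candidate = file_name
    while True:
        try:
            # 'x' = exclusive creation: fails if the path already exists
            return open(candidate, 'x'), candidate
        except FileExistsError:
            candidate = '%s_%d%s' % (base, counter, ext)
            counter += 1
```

Because only FileExistsError is caught, other failures (a missing directory, a permission problem) propagate to the caller instead of looping forever.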

[Edit] Actually, a better way, which will handle the above issues for you, is probably to use the tempfile module, though you may lose some control over the naming. Here's an example of using it (keeping a similar interface):

import os
import tempfile

def unique_file(file_name):
    dirname, filename = os.path.split(file_name)
    prefix, suffix = os.path.splitext(filename)

    # mkstemp creates the file atomically and returns (fd, path)
    fd, filename = tempfile.mkstemp(suffix, prefix + "_", dirname)
    return os.fdopen(fd, 'w+'), filename

>>> f, filename = unique_file('/home/some_dir/foo.txt')
>>> print(filename)
/home/some_dir/foo_z8f_2Z.txt

The only downside with this approach is that you will always get a filename with some random characters in it, as there's no attempt to create an unmodified file (/home/some_dir/foo.txt) first. You may also want to look at tempfile.TemporaryFile and NamedTemporaryFile, which will do the above and also automatically delete from disk when closed.
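
For the auto-deleting variant, a short sketch with tempfile.NamedTemporaryFile might look like the following. The prefix, suffix, and file contents are placeholders chosen for the example, not anything from the original script:

```python
import os
import tempfile

# The file exists on disk only while the `with` block is open.
with tempfile.NamedTemporaryFile(mode='w+', prefix='foo_', suffix='.txt') as tmp:
    tmp.write('scratch data')
    tmp.flush()
    temp_path = tmp.name                      # full path, with random characters
    existed_while_open = os.path.exists(temp_path)

# With the default delete=True, closing the file removes it.
exists_after_close = os.path.exists(temp_path)
```

This suits scratch files; for extracted files that must stay on disk, mkstemp (as above) is likely the better fit.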

Answered by: Walter771 | Posted: 28-02-2022



Answer 2

Yes, this is a good strategy for readable but unique filenames.

One important change: You should replace os.path.isfile with os.path.lexists! As it is written right now, if there is a directory named /foo/bar.baz, your program will try to overwrite that with the new file (which won't work)... since isfile only checks for files and not directories. lexists checks for directories, symlinks, etc... basically if there's any reason that filename could not be created.
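
A small sketch of the difference, assuming a directory already occupies the wanted name. The temporary directory and the name bar.baz are made up for the demonstration:

```python
import os
import tempfile

def unique_filename(file_name):
    """The question's function, with isfile swapped for lexists."""
    counter = 1
    base_name, ext = os.path.splitext(file_name)
    while os.path.lexists(file_name):
        file_name = '%s_%d%s' % (base_name, counter, ext)
        counter += 1
    return file_name

# Demo: a directory occupies the name we want (paths are made up).
work_dir = tempfile.mkdtemp()
taken = os.path.join(work_dir, 'bar.baz')
os.mkdir(taken)

isfile_sees_it = os.path.isfile(taken)      # False: it is a directory
lexists_sees_it = os.path.lexists(taken)    # True: the name is in use
free_name = unique_filename(taken)          # <work_dir>/bar_1.baz
```

Note that this version still checks first and creates later, so the gap between the two steps remains.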

EDIT: @Brian gave a better answer, which is more secure and robust in terms of race conditions.

Answered by: Daryl454 | Posted: 28-02-2022



Answer 3

Two small changes...

base_name, ext = os.path.splitext(file_name) 

You get two results with distinct meanings, so give them distinct names.

file_name = "%s_%d%s" % (base_name, counter, ext)

It isn't faster or significantly shorter. But when you want to change your file name pattern, the pattern is in one place and slightly easier to work with.
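
One way to take the "pattern in one place" idea a step further is to hoist it into a constant. NAME_PATTERN and numbered_name are hypothetical names used only for this sketch:

```python
import os

# Hypothetical constant: the naming pattern lives in exactly one place.
NAME_PATTERN = '%s_%d%s'   # base name, counter, extension

def numbered_name(file_name, counter):
    """Return file_name with the counter inserted before the extension."""
    base_name, ext = os.path.splitext(file_name)
    return NAME_PATTERN % (base_name, counter, ext)
```

For example, numbered_name('/path/file.ext', 3) gives '/path/file_3.ext'; switching to, say, '%s (%d)%s' would then be a one-line change.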

Answered by: Elise622 | Posted: 28-02-2022



Answer 4

If you want readable names, this looks like a good solution.
There are routines that return unique file names (e.g. for temp files), but they produce long, random-looking names.

Answered by: Chelsea583 | Posted: 28-02-2022



Answer 5

If you don't care about readability, uuid.uuid4() is your friend.

import uuid

def unique_filename(prefix=None, suffix=None):
    fn = []
    if prefix: fn.extend([prefix, '-'])
    fn.append(str(uuid.uuid4()))
    if suffix: fn.extend(['.', suffix.lstrip('.')])
    return ''.join(fn)

Answered by: Dominik374 | Posted: 28-02-2022



Answer 6

How about

def ensure_unique_filename(orig_file_path):    
    from time import time
    import os

    if os.path.lexists(orig_file_path):
        name, ext = os.path.splitext(orig_file_path)
        orig_file_path = name + str(time()).replace('.', '') + ext

    return orig_file_path

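time() returns the current time in seconds since the epoch as a floating-point number (not milliseconds); removing the '.' leaves a long string of digits. Combined with the original filename, a collision is unlikely, but not impossible: two calls within the clock's resolution produce the same name, and the new name is never checked for existence.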
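
As a hedged variation on the same idea, time.time_ns() (available from Python 3.7) returns an integer, which avoids the string manipulation. The function name timestamped_name is invented for this sketch, and the result is still not guaranteed to be unique:

```python
import os
import time

def timestamped_name(orig_file_path):
    """Hypothetical variant: append a nanosecond timestamp if the path exists."""
    if os.path.lexists(orig_file_path):
        name, ext = os.path.splitext(orig_file_path)
        # time.time_ns() (Python 3.7+) is an integer, so there is no '.' to strip
        orig_file_path = '%s_%d%s' % (name, time.time_ns(), ext)
    return orig_file_path
```

As with the original, the returned name is only a candidate; actually creating the file exclusively is what closes the race.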

Answered by: Elian653 | Posted: 28-02-2022


