I have a question quite similar to this question, where I need the following conditions to be upheld:

  • If a file is opened for reading, that file may only be opened for reading by any other process/program
  • If a file is opened for writing, that file may only be opened for reading by any other process/program

The solution posted in the linked question uses a third-party library that adds an arbitrary .LOCK file in the same directory as the file in question. That solution only works within programs that use the library; it doesn't prevent any other process/program from using the file, since they may not be implemented to check for the .LOCK file.

In essence, I wish to replicate this result using only Python's standard library.

BLUF: Need a standard library implementation specific to Windows for exclusive file locking

To give an example of the problem set, assume there is:

  • 1 file on a shared network/drive
  • 2 users on separate processes/programs

Suppose that User 1 is running Program A on the file and at some point the following is executed:

with open(fp, 'rb') as f:
    while True:
        chunk = f.read(10)
        if chunk:
            pass  # do something with chunk
        else:
            break

Thus they are iterating through the file 10 bytes at a time.

Now User 2 runs Program B on the same file a moment later:

with open(fp, 'wb') as f:
    f.write(data)  # data is some bytes-like object, e.g. a bytearray

On Windows, the file in question is immediately truncated and Program A stops iterating (even if it wasn't done) and Program B begins to write to the file. Therefore I need a way to ensure that the file may not be opened in a different mode that would alter its content if previously opened.
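The truncation described above can be reproduced inside a single process. This is only a minimal sketch: `demo_truncation` is a hypothetical helper, the two "programs" are simulated by two file objects, and the reader is opened with `buffering=0` so that Python's own read buffer does not hide the effect on such a small file.

```python
import os
import tempfile


def demo_truncation():
    """Show that opening a file with 'wb' truncates it under an active reader."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(b"0123456789" * 3)  # 30 bytes of sample data

        # "Program A": unbuffered reader takes its first 10-byte chunk.
        reader = open(path, "rb", buffering=0)
        first = reader.read(10)

        # "Program B": opening with 'wb' truncates the file immediately.
        with open(path, "wb") as writer:
            writer.write(b"new")

        # The reader's offset (10) is now past the end of the 3-byte file.
        rest = reader.read(10)
        reader.close()
        return first, rest
    finally:
        os.remove(path)


print(demo_truncation())  # (b'0123456789', b'')
```

Nothing in the plain `open` calls stops the second open, which is why the reader silently hits end-of-file instead of getting an error.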

I was looking at the msvcrt library, namely the msvcrt.locking() interface. What I have been successful at doing is ensuring that a file opened for reading can be locked for reading, but nobody else can read the file (as I lock the entire file):

>>> f1 = open(fp, 'rb')
>>> f2 = open(fp, 'rb')
>>> msvcrt.locking(f1.fileno(), msvcrt.LK_LOCK, os.stat(fp).st_size)
>>> next(f1)
b"\x00\x05'\n"
>>> next(f2)
PermissionError: [Errno 13] Permission denied

This is an acceptable result, just not the most desired.

In the same scenario, User 1 runs Program A which includes:

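with open(fp, 'rb') as f:
    size = os.stat(fp).st_size
    msvcrt.locking(f.fileno(), msvcrt.LK_LOCK, size)
    # repeat while block
    # locking() acts from the current file position, so rewind before unlocking
    os.lseek(f.fileno(), 0, os.SEEK_SET)
    msvcrt.locking(f.fileno(), msvcrt.LK_UNLCK, size)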

When User 2 then runs Program B a moment later, the same result occurs: the file is truncated.

At this point, I would've liked a way to throw an error to User 2 stating the file is opened for reading somewhere else and cannot be written at this time. But if User 3 came along and opened the file for reading, then there would be no problem.

Update:

A potential solution is to change the permissions of a file (with exception catching if the file is already in use):

>>> os.chmod(fp, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
>>> with open(fp, 'wb') as f:
        # do something
PermissionError: [Errno 13] Permission denied <fp>

This doesn't feel like the best solution (particularly if the users didn't have the permission to even change permissions). Still looking for a proper locking solution but msvcrt doesn't prevent truncating and writing if the file is locked for reading. There still doesn't appear to be a way to generate an exclusive lock with Python's standard library.

  • If it's just Windows, you can call CreateFile (e.g. PyWin32's win32file.CreateFile) and set the sharing mode to the desired read/execute, write/append, and delete/rename sharing. Wrap the file handle it returns with a file descriptor via msvcrt.open_osfhandle. Then open the file descriptor via open. Commented Mar 17, 2020 at 4:09
  • @ErykSun But that is not a Python standard library implementation is it? It requires PyWin32. Commented Mar 17, 2020 at 13:18
  • The standard library has ctypes. It's a bit more work to implement it with ctypes, assuming you set the function prototypes and properly handle errors and exceptions to make it idiomatic. Commented Mar 17, 2020 at 16:33
  • @ErykSun Yep, this is the way I am currently going. Commented Mar 17, 2020 at 16:35
  • @ErykSun While it works as intended (I will post a solution), oddly enough CreateFileW doesn't throw a FileNotFoundError. If the path doesn't exist, it returns -1 instead and then msvcrt.open_osfhandle raises an OSError: Bad file descriptor. Per the MSDN docs I would've thought the former error would've been raised. Commented Mar 17, 2020 at 18:13

1 Answer

For those who are interested in a Windows-specific solution:

import os
import ctypes
import msvcrt
import pathlib

# Windows constants for file operations
NULL = 0x00000000
CREATE_ALWAYS = 0x00000002
OPEN_EXISTING = 0x00000003
FILE_SHARE_READ = 0x00000001
FILE_ATTRIBUTE_READONLY = 0x00000001  # used here when opening for reading
FILE_ATTRIBUTE_NORMAL = 0x00000080  # used here when opening for writing
FILE_FLAG_SEQUENTIAL_SCAN = 0x08000000
GENERIC_READ = 0x80000000
GENERIC_WRITE = 0x40000000

_ACCESS_MASK = os.O_RDONLY | os.O_WRONLY
_ACCESS_MAP = {os.O_RDONLY: GENERIC_READ,
               os.O_WRONLY: GENERIC_WRITE
               }

_CREATE_MASK = os.O_CREAT | os.O_TRUNC
_CREATE_MAP = {NULL: OPEN_EXISTING,
               os.O_CREAT | os.O_TRUNC: CREATE_ALWAYS
               }

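win32 = ctypes.WinDLL('kernel32.dll', use_last_error=True)
# Declare the prototype so ctypes passes the path, DWORDs and pointers with the right types
win32.CreateFileW.argtypes = [ctypes.c_wchar_p, ctypes.c_uint32, ctypes.c_uint32, ctypes.c_void_p,
                              ctypes.c_uint32, ctypes.c_uint32, ctypes.c_void_p]
win32.CreateFileW.restype = ctypes.c_void_p
INVALID_FILE_HANDLE = ctypes.c_void_p(-1).value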


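def _opener(path: pathlib.Path, flags: int) -> int:
    """Open path through CreateFileW and return a C runtime file descriptor."""

    access_flags = _ACCESS_MAP[flags & _ACCESS_MASK]
    create_flags = _CREATE_MAP[flags & _CREATE_MASK]

    if flags & os.O_WRONLY:
        share_flags = NULL  # a writer shares nothing with other opens
        attr_flags = FILE_ATTRIBUTE_NORMAL
    else:
        share_flags = FILE_SHARE_READ  # a reader allows other readers only
        attr_flags = FILE_ATTRIBUTE_READONLY

    attr_flags |= FILE_FLAG_SEQUENTIAL_SCAN

    # os.fspath() turns a Path into the str that ctypes can pass as a wide string
    h = win32.CreateFileW(os.fspath(path), access_flags, share_flags, NULL, create_flags, attr_flags, NULL)

    if h == INVALID_FILE_HANDLE:
        raise ctypes.WinError(ctypes.get_last_error())

    try:
        return msvcrt.open_osfhandle(h, flags)
    except OSError:
        win32.CloseHandle(ctypes.c_void_p(h))  # don't leak the handle if wrapping fails
        raise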


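# Note: _NormalAccessor, _init and _opener are private pathlib internals and may differ between Python versions.
class _FileControlAccessor(pathlib._NormalAccessor):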

    open = staticmethod(_opener)


_control_accessor = _FileControlAccessor()


class Path(pathlib.WindowsPath):

    def _init(self) -> None:

        self._closed = False
        self._accessor = _control_accessor

    def _opener(self, name, flags) -> int:

        return self._accessor.open(name, flags)
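The `_ACCESS_MAP` and `_CREATE_MAP` lookups above depend on which `os.O_*` flags the built-in `open` hands to an opener for a given mode. The following is a platform-neutral sketch for inspecting those flags. `flags_for` and `recording_opener` are hypothetical names, and the opener simply falls back to `os.open` instead of calling `CreateFileW`.

```python
import os
import tempfile


def flags_for(mode):
    """Return the os.O_* flags that open() passes to an opener for `mode`."""
    captured = []

    def recording_opener(path, flags):
        captured.append(flags)
        return os.open(path, flags, 0o666)  # defer to the normal OS open

    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "demo.bin")
        with open(target, "wb"):
            pass  # make sure the file exists before it is opened for reading
        with open(target, mode, opener=recording_opener):
            pass
    return captured[0]


read_flags = flags_for("rb")
write_flags = flags_for("wb")

# 'rb' carries no write bit, so the access lookup falls on os.O_RDONLY.
read_is_readonly = (read_flags & os.O_WRONLY) == 0
# 'wb' carries O_WRONLY plus O_CREAT | O_TRUNC, the key that maps to CREATE_ALWAYS.
write_has_wronly = bool(write_flags & os.O_WRONLY)
write_creates = (write_flags & (os.O_CREAT | os.O_TRUNC)) == (os.O_CREAT | os.O_TRUNC)

print(read_is_readonly, write_has_wronly, write_creates)
```

Modes such as 'ab' or 'r+b' produce flag combinations that the two dictionaries above do not list, so they would need extra entries. Since `open` accepts an `opener` argument with this `(path, flags)` signature, something like `open(fp, 'rb', opener=_opener)` should also work with the function above without subclassing pathlib.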

23 Comments

Use kernel32 = WinDLL('kernel32.dll', use_last_error=True). Set the result type to a pointer (handle): kernel32.CreateFileW.restype = ctypes.c_void_p. Define INVALID_HANDLE_VALUE = ctypes.c_void_p(-1).value. If it returns the latter, then raise ctypes.WinError(ctypes.get_last_error()).
This is not a lock file. The share mode is per-open, not per-process, so if you don't share write access, the file cannot be reopened with write access, not even by your own process. Your code should directly use this handle, wrapped in an fd via msvcrt.open_osfhandle. You can subsequently open the fd as a file object via builtin open.
Your function that opens a handle, wraps it in an fd, and returns a file object should do the last two steps in nested try-finally blocks. If open_osfhandle fails, the finally handler should call kernel32.CloseHandle(handle) to avoid leaking a handle. If open fails, the finally handler should call os.close(fd). After the handle is owned by the file object, do not call CloseHandle on the handle or os.close on the fd. Python's I/O stack owns it now.
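The cleanup advice in the comment above can be sketched roughly as follows. `open_guarded` is a hypothetical helper, not part of the answer. It assumes the same sharing policy as the answer (readers share read access, writers share nothing), and it uses try/except with re-raise so that the handle or descriptor is closed only when a later step fails.

```python
import ctypes
import os
import sys

GENERIC_READ = 0x80000000
GENERIC_WRITE = 0x40000000
FILE_SHARE_READ = 0x00000001
CREATE_ALWAYS = 2
OPEN_EXISTING = 3
FILE_ATTRIBUTE_NORMAL = 0x00000080
INVALID_HANDLE_VALUE = ctypes.c_void_p(-1).value


def open_guarded(path, writing=False):
    """Hypothetical helper: open `path` via CreateFileW with a restrictive share mode."""
    if sys.platform != "win32":
        raise OSError("open_guarded relies on the Windows CreateFileW API")
    import msvcrt  # only available on Windows

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.CreateFileW.restype = ctypes.c_void_p
    kernel32.CreateFileW.argtypes = [
        ctypes.c_wchar_p, ctypes.c_uint32, ctypes.c_uint32, ctypes.c_void_p,
        ctypes.c_uint32, ctypes.c_uint32, ctypes.c_void_p,
    ]
    kernel32.CloseHandle.argtypes = [ctypes.c_void_p]

    handle = kernel32.CreateFileW(
        os.fspath(path),
        GENERIC_WRITE if writing else GENERIC_READ,
        0 if writing else FILE_SHARE_READ,  # share mode applies per open
        None,
        CREATE_ALWAYS if writing else OPEN_EXISTING,
        FILE_ATTRIBUTE_NORMAL,
        None,
    )
    if handle == INVALID_HANDLE_VALUE:
        raise ctypes.WinError(ctypes.get_last_error())

    try:
        access = os.O_WRONLY if writing else os.O_RDONLY
        fd = msvcrt.open_osfhandle(handle, access | os.O_BINARY)
    except Exception:
        kernel32.CloseHandle(handle)  # the fd was never created, so close the raw handle
        raise

    try:
        return open(fd, "wb" if writing else "rb")
    except Exception:
        os.close(fd)  # closing the fd also releases the underlying handle
        raise
```

Once the file object is returned it owns the descriptor, so callers would just use `with open_guarded(fp) as f:` and never touch the handle or descriptor themselves.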
c_void_p is initialized from an unsigned integral pointer value, which is typically a memory address or an opaque handle. As a 64-bit signed integer, -1 is natively (i.e. at the bare metal level in the CPU) represented as 0xFFFF_FFFF_FFFF_FFFF, where each hexadecimal digit is 4 bits and 0xF is 0b1111. As an unsigned value, this is 18446744073709551615. To understand why -1 is stored this way as a signed 64-bit integer, read about two's complement.
Note that Python's int type itself doesn't store the value as two's complement, as it's a variable-sized "big int" implementation. However, its bitwise operations do preserve what one would expect with two's complement, e.g. -1 & (2**64 - 1) == 18446744073709551615.
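The two comments above can be checked quickly in Python. This small sketch derives the pointer width from `ctypes` instead of assuming a 64-bit build; the variable names are purely illustrative.

```python
import ctypes

# Width of a pointer on this interpreter: 64 on a 64-bit build, 32 on a 32-bit one.
pointer_bits = 8 * ctypes.sizeof(ctypes.c_void_p)
all_ones = 2 ** pointer_bits - 1

# c_void_p stores -1 as an unsigned pointer value with every bit set.
invalid_handle_value = ctypes.c_void_p(-1).value
print(invalid_handle_value == all_ones)

# Python ints are arbitrary precision, but masking behaves like two's complement.
print((-1 & (2 ** 64 - 1)) == 18446744073709551615)
```

This is why comparing the return value of `CreateFileW` against a plain `-1` fails once the result type is `ctypes.c_void_p`: the handle comes back as the unsigned all-ones value.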