Commit 718ab66

GH-125413: Add pathlib.Path.info attribute (#127730)
Add `pathlib.Path.info` attribute, which stores an object implementing the `pathlib.types.PathInfo` protocol (also new). The object supports querying the file type and internally caching `os.stat()` results. Path objects generated by `Path.iterdir()` are initialised with status information from `os.DirEntry` objects, which is gleaned from scanning the parent directory. The `PathInfo` protocol has four methods: `exists()`, `is_dir()`, `is_file()` and `is_symlink()`.
1 parent a1417b2 commit 718ab66
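
As a rough illustration of the behaviour described in the commit message, the sketch below tallies directory children by file type through the `info` object on each path returned by `iterdir()`. The helper name `count_types` is hypothetical, and the `getattr` fallback to the path object itself is an assumption made so the snippet also runs where `Path.info` is unavailable (before Python 3.14).

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_types(directory):
    """Tally the children of *directory* by file type (hypothetical helper)."""
    counts = Counter()
    for child in Path(directory).iterdir():
        # Assumption: where Path.info is missing (Python < 3.14), fall back
        # to the path object itself, which offers the same four query
        # methods but without the caching described above.
        info = getattr(child, "info", child)
        if info.is_symlink():
            counts["symlink"] += 1
        elif info.is_dir():
            counts["directory"] += 1
        elif info.is_file():
            counts["file"] += 1
        else:
            counts["other"] += 1
    return counts


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "pkg").mkdir()
    Path(tmp, "README.txt").write_text("hello")
    print(count_types(tmp))
```

The `if`/`elif` chain asks up to three questions about each child; with a caching info object those questions can share one underlying status lookup.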

10 files changed: +526 -101 lines changed

Doc/library/pathlib.rst

Lines changed: 85 additions & 0 deletions
@@ -1177,6 +1177,38 @@ Querying file type and status
    .. versionadded:: 3.5


+.. attribute:: Path.info
+
+   A :class:`~pathlib.types.PathInfo` object that supports querying file type
+   information. The object exposes methods that cache their results, which can
+   help reduce the number of system calls needed when switching on file type.
+   For example::
+
+      >>> p = Path('src')
+      >>> if p.info.is_symlink():
+      ...     print('symlink')
+      ... elif p.info.is_dir():
+      ...     print('directory')
+      ... elif p.info.exists():
+      ...     print('something else')
+      ... else:
+      ...     print('not found')
+      ...
+      directory
+
+   If the path was generated from :meth:`Path.iterdir` then this attribute is
+   initialized with some information about the file type gleaned from scanning
+   the parent directory. Merely accessing :attr:`Path.info` does not perform
+   any filesystem queries.
+
+   To fetch up-to-date information, it's best to call :meth:`Path.is_dir`,
+   :meth:`~Path.is_file` and :meth:`~Path.is_symlink` rather than methods of
+   this attribute. There is no way to reset the cache; instead you can create
+   a new path object with an empty info cache via ``p = Path(p)``.
+
+   .. versionadded:: 3.14
+
+
 Reading and writing files
 ^^^^^^^^^^^^^^^^^^^^^^^^^
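
The caching caveat in the documentation hunk above can be sketched as follows. This is a minimal illustration, not part of the commit: it assumes `Path.info` exists only on Python 3.14 and later (hence the `getattr` guard), and it deliberately does not claim what the cached call returns, only that the direct `Path` query reflects the current filesystem state.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    p = Path(tmp, "data.txt")
    p.write_text("x")

    # Assumption: Path.info is present on Python 3.14+ and absent before.
    info = getattr(p, "info", None)
    if info is not None:
        info.is_file()  # the first call may populate the cache

    p.unlink()

    # A cached answer, if any, may now be stale.
    cached = info.is_file() if info is not None else None
    # Path.is_file() is the documented way to get up-to-date information.
    fresh = p.is_file()
    # A new path object starts with an empty info cache.
    renewed_fresh = Path(p).is_file()

print(cached, fresh, renewed_fresh)
```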

@@ -1903,3 +1935,56 @@ Below is a table mapping various :mod:`os` functions to their corresponding
 .. [4] :func:`os.walk` always follows symlinks when categorizing paths into
    *dirnames* and *filenames*, whereas :meth:`Path.walk` categorizes all
    symlinks into *filenames* when *follow_symlinks* is false (the default.)
+
+
+Protocols
+---------
+
+.. module:: pathlib.types
+   :synopsis: pathlib types for static type checking
+
+
+The :mod:`pathlib.types` module provides types for static type checking.
+
+.. versionadded:: 3.14
+
+
+.. class:: PathInfo()
+
+   A :class:`typing.Protocol` describing the
+   :attr:`Path.info <pathlib.Path.info>` attribute. Implementations may
+   return cached results from their methods.
+
+   .. method:: exists(*, follow_symlinks=True)
+
+      Return ``True`` if the path is an existing file or directory, or any
+      other kind of file; return ``False`` if the path doesn't exist.
+
+      If *follow_symlinks* is ``False``, return ``True`` for symlinks without
+      checking if their targets exist.
+
+   .. method:: is_dir(*, follow_symlinks=True)
+
+      Return ``True`` if the path is a directory, or a symbolic link pointing
+      to a directory; return ``False`` if the path is (or points to) any other
+      kind of file, or if it doesn't exist.
+
+      If *follow_symlinks* is ``False``, return ``True`` only if the path
+      is a directory (without following symlinks); return ``False`` if the
+      path is any other kind of file, or if it doesn't exist.
+
+   .. method:: is_file(*, follow_symlinks=True)
+
+      Return ``True`` if the path is a file, or a symbolic link pointing to
+      a file; return ``False`` if the path is (or points to) a directory or
+      other non-file, or if it doesn't exist.
+
+      If *follow_symlinks* is ``False``, return ``True`` only if the path
+      is a file (without following symlinks); return ``False`` if the path
+      is a directory or other non-file, or if it doesn't exist.
+
+   .. method:: is_symlink()
+
+      Return ``True`` if the path is a symbolic link (even if broken); return
+      ``False`` if the path is a directory or any kind of file, or if it
+      doesn't exist.
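
To make the four-method protocol above concrete, here is a minimal sketch of an object that satisfies it structurally. `PathInfoLike` is a locally defined stand-in written for this example (so the snippet does not depend on `pathlib.types` being importable), and `StaticInfo` is a hypothetical implementation for an entry whose type is fixed up front; neither name comes from the commit.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class PathInfoLike(Protocol):
    """Local stand-in mirroring the four documented methods (assumption)."""

    def exists(self, *, follow_symlinks: bool = True) -> bool: ...
    def is_dir(self, *, follow_symlinks: bool = True) -> bool: ...
    def is_file(self, *, follow_symlinks: bool = True) -> bool: ...
    def is_symlink(self) -> bool: ...


class StaticInfo:
    """Hypothetical info object for a virtual entry of a known kind."""

    def __init__(self, kind=None):
        self._kind = kind  # 'dir', 'file', or None for a missing entry

    def exists(self, *, follow_symlinks=True):
        return self._kind is not None

    def is_dir(self, *, follow_symlinks=True):
        return self._kind == "dir"

    def is_file(self, *, follow_symlinks=True):
        return self._kind == "file"

    def is_symlink(self):
        return False  # this sketch never models symlinks


info = StaticInfo("dir")
print(isinstance(info, PathInfoLike), info.is_dir(), info.is_file())
```

Because the check is structural, `StaticInfo` does not inherit from the protocol class; it only needs to provide the four methods.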

Doc/whatsnew/3.14.rst

Lines changed: 9 additions & 0 deletions
@@ -617,6 +617,15 @@ pathlib

   (Contributed by Barney Gale in :gh:`73991`.)

+* Add :attr:`pathlib.Path.info` attribute, which stores an object
+  implementing the :class:`pathlib.types.PathInfo` protocol (also new). The
+  object supports querying the file type and internally caching
+  :func:`~os.stat` results. Path objects generated by
+  :meth:`~pathlib.Path.iterdir` are initialized with file type information
+  gleaned from scanning the parent directory.
+
+  (Contributed by Barney Gale in :gh:`125413`.)
+

 pdb
 ---

Lib/glob.py

Lines changed: 30 additions & 17 deletions
@@ -348,7 +348,7 @@ def lexists(path):

     @staticmethod
     def scandir(path):
-        """Implements os.scandir().
+        """Like os.scandir(), but generates (entry, name, path) tuples.
         """
         raise NotImplementedError

@@ -425,23 +425,18 @@ def wildcard_selector(self, part, parts):

         def select_wildcard(path, exists=False):
             try:
-                # We must close the scandir() object before proceeding to
-                # avoid exhausting file descriptors when globbing deep trees.
-                with self.scandir(path) as scandir_it:
-                    entries = list(scandir_it)
+                entries = self.scandir(path)
             except OSError:
                 pass
             else:
-                prefix = self.add_slash(path)
-                for entry in entries:
-                    if match is None or match(entry.name):
+                for entry, entry_name, entry_path in entries:
+                    if match is None or match(entry_name):
                         if dir_only:
                             try:
                                 if not entry.is_dir():
                                     continue
                             except OSError:
                                 continue
-                        entry_path = self.concat_path(prefix, entry.name)
                         if dir_only:
                             yield from select_next(entry_path, exists=True)
                         else:
@@ -483,15 +478,11 @@ def select_recursive(path, exists=False):
         def select_recursive_step(stack, match_pos):
             path = stack.pop()
             try:
-                # We must close the scandir() object before proceeding to
-                # avoid exhausting file descriptors when globbing deep trees.
-                with self.scandir(path) as scandir_it:
-                    entries = list(scandir_it)
+                entries = self.scandir(path)
             except OSError:
                 pass
             else:
-                prefix = self.add_slash(path)
-                for entry in entries:
+                for entry, _entry_name, entry_path in entries:
                     is_dir = False
                     try:
                         if entry.is_dir(follow_symlinks=follow_symlinks):
@@ -500,7 +491,6 @@ def select_recursive_step(stack, match_pos):
                         pass

                     if is_dir or not dir_only:
-                        entry_path = self.concat_path(prefix, entry.name)
                         if match is None or match(str(entry_path), match_pos):
                             if dir_only:
                                 yield from select_next(entry_path, exists=True)
@@ -528,9 +518,16 @@ class _StringGlobber(_GlobberBase):
     """Provides shell-style pattern matching and globbing for string paths.
     """
     lexists = staticmethod(os.path.lexists)
-    scandir = staticmethod(os.scandir)
     concat_path = operator.add

+    @staticmethod
+    def scandir(path):
+        # We must close the scandir() object before proceeding to
+        # avoid exhausting file descriptors when globbing deep trees.
+        with os.scandir(path) as scandir_it:
+            entries = list(scandir_it)
+        return ((entry, entry.name, entry.path) for entry in entries)
+
     if os.name == 'nt':
         @staticmethod
         def add_slash(pathname):
@@ -544,3 +541,19 @@ def add_slash(pathname):
             if not pathname or pathname[-1] == '/':
                 return pathname
             return f'{pathname}/'
+
+
+class _PathGlobber(_GlobberBase):
+    """Provides shell-style pattern matching and globbing for pathlib paths.
+    """
+
+    lexists = operator.methodcaller('exists', follow_symlinks=False)
+    add_slash = operator.methodcaller('joinpath', '')
+
+    @staticmethod
+    def scandir(path):
+        return ((child.info, child.name, child) for child in path.iterdir())
+
+    @staticmethod
+    def concat_path(path, text):
+        return path.with_segments(str(path) + text)
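
The revised `scandir()` contract in this file, a generator of `(entry, name, path)` tuples instead of a context manager, can be tried outside the private globber classes with a standalone sketch. `scandir_tuples` follows the body of `_StringGlobber.scandir` from the hunk above; `list_subdirs` is a hypothetical consumer added only to show how a selector might unpack the tuples.

```python
import os
import tempfile


def scandir_tuples(path):
    """Return a generator of (entry, name, path) tuples for a directory."""
    # Collect the entries and close the os.scandir() iterator first, so no
    # file descriptor stays open while a caller descends into subdirectories.
    with os.scandir(path) as scandir_it:
        entries = list(scandir_it)
    return ((entry, entry.name, entry.path) for entry in entries)


def list_subdirs(path):
    """Hypothetical consumer: sorted names of the immediate subdirectories."""
    names = []
    for entry, name, _entry_path in scandir_tuples(path):
        try:
            if entry.is_dir():
                names.append(name)
        except OSError:
            continue  # skip entries whose type cannot be determined
    return sorted(names)


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "a"))
    os.mkdir(os.path.join(tmp, "b"))
    open(os.path.join(tmp, "c.txt"), "w").close()
    print(list_subdirs(tmp))
```

Returning the name and the joined path alongside the entry lets the selectors drop their own `add_slash()` and `concat_path()` calls for each child, which is what the removed lines in the earlier hunks reflect.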

Lib/pathlib/_abc.py

Lines changed: 29 additions & 44 deletions
@@ -13,10 +13,9 @@

 import functools
 import io
-import operator
 import posixpath
 from errno import EINVAL
-from glob import _GlobberBase, _no_recurse_symlinks
+from glob import _PathGlobber, _no_recurse_symlinks
 from pathlib._os import copyfileobj


@@ -76,21 +75,6 @@ def magic_open(path, mode='r', buffering=-1, encoding=None, errors=None,
     raise TypeError(f"{cls.__name__} can't be opened with mode {mode!r}")


-class PathGlobber(_GlobberBase):
-    """
-    Class providing shell-style globbing for path objects.
-    """
-
-    lexists = operator.methodcaller('exists', follow_symlinks=False)
-    add_slash = operator.methodcaller('joinpath', '')
-    scandir = operator.methodcaller('_scandir')
-
-    @staticmethod
-    def concat_path(path, text):
-        """Appends text to the given path."""
-        return path.with_segments(str(path) + text)
-
-
 class CopyReader:
     """
     Class that implements the "read" part of copying between path objects.
@@ -367,7 +351,7 @@ def full_match(self, pattern, *, case_sensitive=None):
         pattern = self.with_segments(pattern)
         if case_sensitive is None:
             case_sensitive = _is_case_sensitive(self.parser)
-        globber = PathGlobber(pattern.parser.sep, case_sensitive, recursive=True)
+        globber = _PathGlobber(pattern.parser.sep, case_sensitive, recursive=True)
         match = globber.compile(str(pattern))
         return match(str(self)) is not None

@@ -388,33 +372,45 @@ class ReadablePath(JoinablePath):
     """
     __slots__ = ()

+    @property
+    def info(self):
+        """
+        A PathInfo object that exposes the file type and other file attributes
+        of this path.
+        """
+        raise NotImplementedError
+
     def exists(self, *, follow_symlinks=True):
         """
         Whether this path exists.

         This method normally follows symlinks; to check whether a symlink exists,
         add the argument follow_symlinks=False.
         """
-        raise NotImplementedError
+        info = self.joinpath().info
+        return info.exists(follow_symlinks=follow_symlinks)

     def is_dir(self, *, follow_symlinks=True):
         """
         Whether this path is a directory.
         """
-        raise NotImplementedError
+        info = self.joinpath().info
+        return info.is_dir(follow_symlinks=follow_symlinks)

     def is_file(self, *, follow_symlinks=True):
         """
         Whether this path is a regular file (also True for symlinks pointing
         to regular files).
         """
-        raise NotImplementedError
+        info = self.joinpath().info
+        return info.is_file(follow_symlinks=follow_symlinks)

     def is_symlink(self):
         """
         Whether this path is a symbolic link.
         """
-        raise NotImplementedError
+        info = self.joinpath().info
+        return info.is_symlink()

     def __open_rb__(self, buffering=-1):
         """
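
The delegation pattern in this hunk, where each query method reads `info` from a copy made with `self.joinpath()`, presumably so that the answer does not come from an info object already cached on `self`, can be imitated with a toy in-memory path type. Everything below (`MemoryInfo`, `MemoryPath`, the dict-based tree) is hypothetical and only mirrors the shape of the code in the diff.

```python
class MemoryInfo:
    """Hypothetical info object: snapshots an entry's kind when created."""

    def __init__(self, tree, key):
        self._kind = tree.get(key)  # 'dir', 'file', or None if missing

    def exists(self, *, follow_symlinks=True):
        return self._kind is not None

    def is_dir(self, *, follow_symlinks=True):
        return self._kind == "dir"

    def is_file(self, *, follow_symlinks=True):
        return self._kind == "file"

    def is_symlink(self):
        return False  # this toy tree has no symlinks


class MemoryPath:
    """Hypothetical path over a dict; not a pathlib class."""

    def __init__(self, tree, key):
        self._tree, self._key = tree, key
        self._info = None

    def joinpath(self):
        # With no segments, return a copy that has no info object yet.
        return MemoryPath(self._tree, self._key)

    @property
    def info(self):
        # Create the info object lazily and keep it for later accesses.
        if self._info is None:
            self._info = MemoryInfo(self._tree, self._key)
        return self._info

    def exists(self, *, follow_symlinks=True):
        return self.joinpath().info.exists(follow_symlinks=follow_symlinks)

    def is_dir(self, *, follow_symlinks=True):
        return self.joinpath().info.is_dir(follow_symlinks=follow_symlinks)


tree = {"docs": "dir"}
p = MemoryPath(tree, "docs")
first = p.info.is_dir()   # snapshot taken on first access
del tree["docs"]
stale = p.info.is_dir()   # same info object, same answer as before
fresh = p.is_dir()        # queried through a copy, so it sees the deletion
print(first, stale, fresh)
```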
@@ -437,15 +433,6 @@ def read_text(self, encoding=None, errors=None, newline=None):
         with magic_open(self, mode='r', encoding=encoding, errors=errors, newline=newline) as f:
             return f.read()

-    def _scandir(self):
-        """Yield os.DirEntry-like objects of the directory contents.
-
-        The children are yielded in arbitrary order, and the
-        special entries '.' and '..' are not included.
-        """
-        import contextlib
-        return contextlib.nullcontext(self.iterdir())
-
     def iterdir(self):
         """Yield path objects of the directory contents.

451438
@@ -471,7 +458,7 @@ def glob(self, pattern, *, case_sensitive=None, recurse_symlinks=True):
471458
else:
472459
case_pedantic = True
473460
recursive = True if recurse_symlinks else _no_recurse_symlinks
474-
globber = PathGlobber(self.parser.sep, case_sensitive, case_pedantic, recursive)
461+
globber = _PathGlobber(self.parser.sep, case_sensitive, case_pedantic, recursive)
475462
select = globber.selector(parts)
476463
return select(self)
477464

@@ -498,18 +485,16 @@ def walk(self, top_down=True, on_error=None, follow_symlinks=False):
             if not top_down:
                 paths.append((path, dirnames, filenames))
             try:
-                with path._scandir() as entries:
-                    for entry in entries:
-                        name = entry.name
-                        try:
-                            if entry.is_dir(follow_symlinks=follow_symlinks):
-                                if not top_down:
-                                    paths.append(path.joinpath(name))
-                                dirnames.append(name)
-                            else:
-                                filenames.append(name)
-                        except OSError:
-                            filenames.append(name)
+                for child in path.iterdir():
+                    try:
+                        if child.info.is_dir(follow_symlinks=follow_symlinks):
+                            if not top_down:
+                                paths.append(child)
+                            dirnames.append(child.name)
+                        else:
+                            filenames.append(child.name)
+                    except OSError:
+                        filenames.append(child.name)
             except OSError as error:
                 if on_error is not None:
                     on_error(error)
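
The inner loop of the rewritten `walk()` can be sketched for a single directory level as below. `split_children` and `child_is_dir` are hypothetical helpers, and the fallback branch is an assumption added so the snippet also runs where `Path.info` is unavailable (before Python 3.14); the commit itself calls `child.info.is_dir()` directly.

```python
import tempfile
from pathlib import Path


def child_is_dir(child, follow_symlinks):
    """Ask the child's info object whether it is a directory."""
    info = getattr(child, "info", None)
    if info is not None:
        return info.is_dir(follow_symlinks=follow_symlinks)
    # Assumed fallback for Python < 3.14, where Path.info does not exist.
    if follow_symlinks:
        return child.is_dir()
    return child.is_dir() and not child.is_symlink()


def split_children(path, follow_symlinks=False):
    """Split the children of *path* into sorted (dirnames, filenames)."""
    dirnames, filenames = [], []
    for child in Path(path).iterdir():
        try:
            if child_is_dir(child, follow_symlinks):
                dirnames.append(child.name)
            else:
                filenames.append(child.name)
        except OSError:
            # Entries that cannot be queried are listed as files,
            # matching the except branch in the hunk above.
            filenames.append(child.name)
    return sorted(dirnames), sorted(filenames)


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "src").mkdir()
    Path(tmp, "setup.py").write_text("")
    print(split_children(tmp))
```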
