Skip to content

[io_file] Add the ability to get file metadata on Windows. #202

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 42 commits into
base: main
Choose a base branch
from

Conversation

brianquinlan
Copy link
Contributor

@brianquinlan brianquinlan commented Mar 17, 2025

  • Implements metadata fetching for Windows (equivalent to dart:io FileStat.stat)
  • I will reorganize the code and tests so that common concepts are in class Metadata when I implement the POSIX version.
  • Documentation is incomplete because I may move attributes around.
  • I'm not confident in the API design - in dart:io FileStat is not constructible but I'm making the Metadata constructor public, which means that it will be harder to change. But I was imagining that making it constructible would be useful in mocks.

TL;DR: The implementation should be correct but the API will change.


  • I’ve reviewed the contributor guide and applied the relevant portions to this PR.
Contribution guidelines:

Note that many Dart repos have a weekly cadence for reviewing PRs - please allow for some latency before initial review feedback.

@brianquinlan brianquinlan marked this pull request as draft March 17, 2025 17:42
@brianquinlan brianquinlan marked this pull request as ready for review March 18, 2025 03:10
@brianquinlan
Copy link
Contributor Author

@halildurmus Ready for review!

Copy link

@halildurmus halildurmus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@brianquinlan
Copy link
Contributor Author

LGTM

Thank you!

Copy link
Member

@lrhn lrhn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. My comments are mainly nits, "improvement suggestions", optimization suggestions (which can definitely be delayed until the API is complete) and some expected Windows quirkiness.

if (fileSystem case final WindowsFileSystem fs) {
return fs.metadata(path).isHidden;
} else {
// On POSIX, convention is that files starting with a period are hidden
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe: "files" -> "files and directories"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

final count = (isDirectory ? 1 : 0) + (isFile ? 1 : 0) + (isLink ? 1 : 0);
if (count > 1) {
// TODO(brianquinlan): Decide whether a path must be a a file, directory
// or link and whether it can be more than one of these at once.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Windows links are either file links or directory links. It makes sense that they can represent the difference.

Do you have to use WindowsMetadata below on Windows?
(That is: What is the use-cases for this class. Is it a default least-common-denominator of the platform specific metadata subclasses? Can users use it as input to any function of the file system, where they can fx. choose between passing in a Metadata or a WindowsMetadata to a WindowsFilesystem? If so, they can use this class on Windows. If not, does it need a public constructor?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't have to use WindowsMetadata on Windows. The usual pattern of calling fileSystem.metadata(path) will return a Metadata, which should contain the metadata available on all platforms. WindowsMetadata is designed to capture the Windows platform-specific attributes.

The constructor is public because I want other people to be able to implement their own file system that return Metadata or a subclass.

Maybe I should make this an abstract interface? But then I won't be able to add to it in the future.

I'm not confident in my API design choices here :-(

final bool isFile;
final bool isDirectory;
final bool isLink;
final int size;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Totally over-optimizing, ... but you can store three bools in one int and save two storage slots. I don't trust our compilers to do that for you.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll consider it ;-)

import 'package:path/path.dart' as p;

bool isHidden(String path) {
if (fileSystem case final WindowsFileSystem fs) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(For future extension: The OSX file system has a hidden flag too, you can set it with chflags hidden /path/to/folder/. Not all tools recognize it - Finder does, ls doesn't. It's from BSD, so other Unixes might have it too. We should probably support it if possible/reasonable.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, linux has a hidden flag too - it's just not used in many file systems. I plan on moving some of these attributes around when I do the POSIX stat implementation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that the semantics probably are quite different it makes a lot of sense to not share the hidden flag between platforms though they are named the same...


final isDirectory = attributes & win32.FILE_ATTRIBUTE_DIRECTORY > 0;
final isLink = attributes & win32.FILE_ATTRIBUTE_REPARSE_POINT > 0;
final isFile = !(isDirectory || isLink);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a link is a link to a directory (or to a link to a directory) I expect both of isDirectory and isLink to be true. (Can't check right now, no Windows machine.)

In that case isFile should probably just be !isDirectory (And you don't need to store it, you can just have bool get isFile => !isDiretory;.)

Or you could make isDirectory be (attributes & ...) & !isLink, and then have WindowsMetadata specific getters for isFileLink and/or isDirectoryLink, separate from isFile and isDirectory, so isFile, isDirectory and isLink are mutually exclusive again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Windows, the type of link is not indicated in GetFileAttributes

And yeah, there is some tension here. Right now I was going to make one of the is* true and the values would be consistent on POSIX and Windows.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked, and I think a reparse point having the DIRECTORY bit set controls whether it's a file or directory reparse point.
(It doesn't say whether it's a junction, symbolic link, or something more esoteric, I think you need a second query to get the reparse-data.)

Every entity in the Windows FS is either a file or a directory, including links, controlled by the DIRECTORY bit.
If it's a reparse-point, then it's also a link, so it can be link-and-file or link-and-directory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What makes sense from an API POV? I think that no more than one of these should be set. And maybe it makes sense for zero of them to be set e.g. for a unix domain socket, a named pipe, etc.

@@ -23,4 +61,12 @@ base class FileSystem {
void rename(String oldPath, String newPath) {
throw UnsupportedError('rename');
}

/// Returns metadata for the given path.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method has a noun-phrased word, so it's a parameterized getter. Document it the same way.
Drop the "Returns". (Never start a DartDoc with the word "Returns". )

Maybe just /// Metadata for the file or directory at [path].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrote as:

  /// Metadata for the file system object at [path].

@brianquinlan
Copy link
Contributor Author

@lrhn Please take another look when you get a chance.

/// If `path` represents a symbolic link then metadata for the link is
/// returned.
Metadata metadata(String path) {
throw UnsupportedError('metadata');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a valid design for a growing API, but also a potentially troublesome one.

It can become a problem if people can't trust that methods are implemented.

If anyone ever feels they need to write try { fs.something(); } on UnsupportedError { ... }, I'd say the design has failed.

Maybe some annotations can help ...
Maybe start with @subclassMustOverride ... something() => throw UnsupportedError("something not implemented"); and then, in a later version, change it to ... something();. But that's still a breaking change, so it'd deserve a major version increment.

And then one might as well just add something(); directly, in a new major version. If most users will just use the file systems supprted by default, and only a few rarely-used packages will try to implement FileSystem, then .... well, versioning is still hard.

MAYBE, just for complete pedantic adherence to semver:

  • Make FileSystem a final class.
  • Export FileSystemBase which implements FileSystem and is base from a different package intended for file system implementors (with a strict dependence on the FileSystem package it comes from).
  • Then FileSystem can add a new member in a minor version increment, because to users it's a minor version increment, but the FileSystemBase class package gets a major version increment.
  • So FS 3.5.0 adds a new member. FSB (base package) releases 35.0.0 supporting (and requiring) that member. If you want to use FS 3.5.0 for that new member, and your custom file system implementation doesn't have a version that works with FSB 35.0.0, it won't solve. You can use FS 3.4.0/FSB 34.0.0, but not the new member, or you can make CustomeFS release a version that implements FSB 35.0.0's FileSystem.

Not sure the complexity is worth the result. It'll still end up with failed resolution. The only benefit is that an incremental addition to the API isn't breaking (and does not require a major version increment, which is hard to push through if multiple packages use the library) for code that only uses the default implementation.

this.isReadOnly = false,
this.isHidden = false,
this.isSystem = false,
this.isArchive = false,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding is works for most things, but not all. One size doesn't fit all, as usual :)

I can see why "Flag" isn't great in the name. needsArchive, isArchiveReady or isArchiveable might work (if the last one is even a word). So stepping back ...

The read only flag means that the file "is read-only". ✓
The hidden flag means that the file "is hidden". ✓
The system flag means that the file "is (a) system (file)". ✓

The "archive" bit is really a "dirty" bit, ... and isDirty would be fine, it's just not the traditional name.
Phrased in terms of "archiving", it means "needs to be archived", "do archive" (matching the flag name) and "is not archived", which is the opposite of "is archived", and isArchive just doesn't work in that context.
We could use isArchived if we flip the value. It's my best suggestion so far.

  if (!file.isArchived) copy(file, archivePathOf(file)); // ✓

this.isArchive = false,
this.isTemporary = false,
this.isOffline = false,
this.isContentNotIndexed = false,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd go with a good Dart API, with names that are relatable to the windows API, with documentation about the specific relation.

If we were just doing a Windows API, it would be an integer and bit flags.

So something like:

  /// Whether the file is considered archived, and not in need of being archived.
  ///
  /// The Windows file system ARCHIVE flag (traditionally displayed as an "A" attribute)
  /// is set when the file needs archiving. 
  /// This value expresses whether the ARCHIVE flag is unset.
  abstract final bool isArchived;

or

  /// Whether the file needs to be archived.
  ///
  /// The Windows file system ARCHIVE flag (traditionally displayed as an "A" attribute)
  /// is set when the file needs archiving. 
  /// This value expresses whether the ARCHIVE flag is set.
  abstract final bool isNotArchived;

(Heck, have both! It's not unheard of to have both isEmpty and isNotEmpty.)

/// Sets metadata for the file system entity.
///
/// TODO(brianquinlan): Document the arguments.
void setMetadata(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make any sense to take a Metadata argument that can be used as base, instead of fetching it from the file system.

It's not an atomic operation to setMetadata if it starts by reading metadata, does updating, and the writes it back. If it's not an atomic operation, it means the file system can change between fetching the existing attributes and writing them back.
Not being atomic is generally not a good thing for low-level functions.

It also means that if I first call metadata to look at the data, then decide to change the values using setMetadata, then I'll actually read the attributes twice.

If I could pass the WindowsMetadata I got from metadata as a base argument to setMetadata, so it uses that instead of fetching new, then the set operation becomes atomic. (The entire operation doesn't, but that's clear when I call two different methods.)

If the base argument is omitted, it can still fetch the attributes from the file system.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// `FILE_ATTRIBUTE_NORMAL` indicates that no other attributes are set and
// is valid only when used alone.
attributes = win32.FILE_ATTRIBUTE_NORMAL;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Could we check here whether the value didn't change, and then not write anything?
So keep a copy of the original, and if (original == attributes) return;.
It's not the same - if the underlying file system has changed in the meantime, it matters whether we write or not. Especially if using a base metadata argument, then you should probably always write.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm supporting base (called original) and always write.


final isDirectory = attributes & win32.FILE_ATTRIBUTE_DIRECTORY > 0;
final isLink = attributes & win32.FILE_ATTRIBUTE_REPARSE_POINT > 0;
final isFile = !(isDirectory || isLink);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked, and I think a reparse point having the DIRECTORY bit set controls whether it's a file or directory reparse point.
(It doesn't say whether it's a junction, symbolic link, or something more esoteric, I think you need a second query to get the reparse-data.)

Every entity in the Windows FS is either a file or a directory, including links, controlled by the DIRECTORY bit.
If it's a reparse-point, then it's also a link, so it can be link-and-file or link-and-directory.

isFile: isFile,
isOffline: attributes & win32.FILE_ATTRIBUTE_OFFLINE != 0,
isContentNotIndexed:
attributes & win32.FILE_ATTRIBUTE_NOT_CONTENT_INDEXED != 0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Could just store the attributes and compute the booleans on access.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@brianquinlan
Copy link
Contributor Author

Changed isArchive to needsArchive.
Changed isContentNotIndexed to isContentIndexed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants