Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filesystem: Revert deprecation of FileHandle #3867

Merged
merged 4 commits into from
Mar 23, 2017

Conversation

geky
Copy link
Contributor

@geky geky commented Mar 1, 2017

As identified by @hasnainvirk, @kjbracey-arm, the FileHandle and FileBase serve two separate functions and their integration is limiting for certain use cases.

FileLike is actually the redundant class here, but the multiple inheritance it provides is used as a hack by the retargeting code to get at the FileHandle implementation bound to the FileBase name.

It may make more sense for the FileBase to inherit from FileHandle, (with perhaps a different name), but rather than explore the possibility, this will just restore the previous hierarchy.

cc @simonqhughes, @hasnainvirk, @kjbracey-arm

Copy link
Contributor

@kjbracey kjbracey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just some minor nits/suggestions.

* and associated _sys_* functions.
*
* FileHandle is an abstract class, needing at least sys_write and
* FileHandle is an abstract class, needing at least sys_write and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read() and write()?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, technically, do you even need read() and write()? You could provide just one. Or maybe you only have len()?

The FileHandle is in essence so abstract that pretty much everything is device-dependent. In POSIX there are rules about classes of devices (real files, pipes, sockets...), but to a large extent devices can do pretty much what they like.

Then it's more a matter of common sense what sort of application you attach to what sort of device. (Don't run a database on a serial port, don't seek if you want to support tape drives, don't try to talk PPP to a text file).

Copy link
Contributor Author

@geky geky Mar 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this comment was pulled in from the previous version and is kinda inaccurate, will update.

A file of only length is an interesting thought exercise...

It could make sense to have other interfaces for devices with different limitations (ie ReadStream, Seekable), but after expirementing with the idea, it doesn't seem to fit in with c++'s oop inheritance well and brought little benefit outside of documentation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's my view too - it's so generic and open-ended, it seems futile to try to pin it down through C++ classes.

@@ -91,7 +114,8 @@ class FileHandle {
* new file position on success,
* -1 on failure or unsupported
*/
virtual off_t lseek(off_t offset, int whence) = 0;
MBED_DEPRECATED_SINCE("mbed-os-5.4", "Replaced by FileLike::seek")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FileHandle::seek

virtual void lock() {
// Stub
}
MBED_DEPRECATED_SINCE("mbed-os-5.4", "Replaced by FileLike::sync")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FileHandle::sync

// Stub
}
MBED_DEPRECATED_SINCE("mbed-os-5.4", "Replaced by FileLike::sync")
virtual int fsync() { return sync(); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you've removed lock - I'm going to need to think a bit about whether this is right or wrong.

Its semantics certainly weren't clear to me, so maybe it should go? I guess it's not part of the abstract API. All external calls should not need to call it, right? Okay, makes sense.

I think some lock might be necessary for the poll mechanism, but maybe that can go into a separate bit of machinery (PollableFileHandle?), like FileBase's lock. (Possibly protected?)

Copy link
Contributor Author

@geky geky Mar 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that was unintentional, are those lock/unlock functions used at all? It seems like files should probably not be shared accross threads, and if you did you would want to lock at a higher level anyways?

Could they be added back in if we needed them in a backwards compatible manner? I'm also not sure any classes provide the functionality they promise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, no, they weren't used. retarget didn't call them, nor did anyone else. They were kind of just like a placeholder for a lock that may or may not be used internally.

So I think they can go.

There is another conceptual thought on the subject though - some FileHandles may be usable from interrupt (and hence directly from attach callback). Their internal lock could be a critical section, rather than a mutex. I believe our BufferedSerial could maybe be like this. I'm wondering whether this could be an API feature that could ever be relied on.

Could it maybe be the case that having the lock visible might make that overrideable, like it is for some of the HAL devices? Override lock to prevent a mutex claim?

My feeling is that I don't want to do that - not abstract enough. Instead, I suggest internal lock behaviour is up to the class. Generic callers of FileHandle methods should assume that by default it has to be called from foreground context, but a given FileHandle might document that it is usable from interrupt.

@@ -52,8 +53,8 @@ FileBase::~FileBase() {
_mutex->unlock();

if (getPathType() == FilePathType) {
extern void remove_filehandle(FileLike *file);
remove_filehandle(static_cast<FileLike*>(this));
extern void remove_filehandle(FileHandle *file);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also found ourselves adding global functions to retarget.cpp, and wondering where to declare them.

Given the absence of a separate header file, we decided that FileHandle.h was the correct header file to declare them in - it seems logical that system functions acting on FileHandles like fdopen and remove_filehandle can live there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, we actually have retarget.h now, perhaps system functions should go there? retarget.h sounds like a reasonable location for fdopen at least.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fdopen is a little special, as its the only thing that works on both FILE and FileHandle, so must be implemented in retarget.cpp (or at least it needs knowledge of retarget.cpp internals). It goes in <stdio.h> in POSIX,

Then there are things like poll() which are non-member things working on 1 or more FileHandles - they go in <unistd.h> in POSIX.

retarget.h feels a bit wrong to me. It feels like it should be the fairly private C library->platform API, rather than something applications should be using. But maybe it's not that?

Wonder where other non-members like poll should go. They're not bound in any way to retarget.cpp, so I had them in FileHandle.cpp and FileHandle.h.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that's an interesting question. It sounds like poll should go in its own file? There are a few other non-member functions like sleep and wait that get their own files in a similar manner.

*/
virtual ssize_t read(void* buffer, size_t length) = 0;
virtual int sync() = 0;
Copy link
Contributor

@kjbracey kjbracey Mar 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've deprecated fsync() in favour of sync(). Those are two different calls in POSIX. Which one is this supposed to be? I'm assuming fsync() (flush and block until flushed), because it has a return code, but it's named like sync() (schedule flush and don't block, no success report).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Posix, sync is an operation on filesystems, and fsync is an operation on files. The "f" prefix effectively provides a namespace for C functions so they don't share a name. In c++ we have the namespace provided by the class, so the "f" prefix is redundant.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really buying this argument. Non-standard naming for standard things is confusing. This has driven me nuts in the past - eg Nanostack is still suffering from the fact that someone called its socket write function send. I can't fix that breakage to introduce the standard send that takes flags (because C, so no overloading).

If this API wasn't so obviously POSIX, the deviation from POSIX naming wouldn't really be an issue. The fact that it is otherwise exactly correlates to POSIX is what makes it look like the name difference might have some meaning. You should at least document that this is equivalent to POSIX fsync().

(Wasn't there a whole lot of discussion about introducing standard POSIX file system concepts recently too? That's why it's surprising me to see moves away from POSIX. Or is there an intent that we provide totally standard C fsync(), and people use it?)

How's about retaining fsync(), but not deprecating it? Appeasing both "neat names" and "POSIX names" advocates. Not sure which one calls which in that case.

@@ -91,7 +114,8 @@ class FileHandle {
* new file position on success,
* -1 on failure or unsupported
*/
virtual off_t lseek(off_t offset, int whence) = 0;
MBED_DEPRECATED_SINCE("mbed-os-5.4", "Replaced by FileLike::seek")
virtual off_t lseek(off_t offset, int whence) { return seek(offset, whence); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why seek rather than lseek? lseek is the standard POSIX name - it struck me as good to use standard names to indicate standard intent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "l" in lseek stands for "long", to differentiate early versions of the function from its 16-bit counterpart. I'm not sure this is useful anymore, especially in c++.


/** Rewind the file position to the beginning of the file
*
* @note This is equivalent to file_seek(file, 0, FS_SEEK_SET)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"seek(file, 0, SEEK_SET)"?

So why is it actually needed? Could it be a non-virtual helper that called that?

Copy link
Contributor Author

@geky geky Mar 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For optimization purposes. For example, seek may walk back through the linked list of a file in FAT, where as rewind can simply reset the linked list to the head.

Since the default is trivial, there is little downside for implementations that just provide seek.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized the defaults aren't in this file (they're actually over here for file objects https://github.com/ARMmbed/mbed-os/blob/master/features/filesystem/FileSystem.cpp#L38).

I can go ahead and move the defaults up to this interface unless you see a good reason not to. Then their implementation is entirely optional.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'd be fine with that, if we're not being religious about FileHandle being a totally pure interface class.

*
* @return The current offset in the file
*/
virtual off_t tell() = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for rewind, equivalent to "seek(file, 0, SEEK_CUR)". So is it needed as a separate helper? Maybe not virtual?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for optimization, which is probably the motivation behind these functions being a part of the posix spec.

Consider the implementation of these two functions in the FAT filesystem:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not dug into that one in detail, but it seems to me the tell implementation might be too optimised, to the point of being faulty.

Unlike seek, it doesn't check handle validity, it doesn't return sticky errors. Shouldn't it?

But yes, ftell[o] is standard POSIX, so why not. (I just failed to spot it)

@kjbracey
Copy link
Contributor

kjbracey commented Mar 2, 2017

Okay, just added some more comments about API that would have been better on #3773 (but I missed it). It seems that for some methods you're deprecating what was very clearly a POSIX-like API on <unistd.h> file descriptors with something that is a slight mush of POSIX and C and C++ APIs, which is a little confusing.

From my point of view, retarget.cpp is only one possible user of these file handles, so I don't feel we should align FileHandle's methods specifically with its naming, rather than sticking with the conventional POSIX concept.

@geky
Copy link
Contributor Author

geky commented Mar 2, 2017

For the filesystem we already provide an actual posix API, I hope to see this practice grow. Users should be able to use the posix API without surprises.

As for the c++ side of things, I did adopt a posix-like API, but updated to conform to c++ style and take advantage of c++ features. The most notable was the namespace introduced by the c++ FileHandle class. With this namespace, the 'f' prefix is redundant. With features such as function overloading, prefixing functions with the type of their arguments is also redundant.

I see the benefit of conforming to existing standards, and the api was designed to even if the names don't match exactly. (Note, work started with the File class at the top and were moved backwards during a separate integration step, so that's why things jiggled a little bit).


That being said, I don't care enough about the names of things to block anything. @0xc0170, @theotherjimmy, thoughts?

Function names different from posix, let me know if I missed any:

fsync -> FileHandle::sync
ftell -> FileHandle::tell
lseek -> FileHandle::seek

@kjbracey
Copy link
Contributor

kjbracey commented Mar 3, 2017

For the filesystem we already provide an actual posix API, I hope to see this practice grow. Users should be able to use the posix API without surprises.

So how do we make this work with non-filesystem FileHandles? It clearly should. Is it just a question of making sure a FileHandle * is registered in some array somewhere? Some sort of int mbed_bind_posix_file_descriptor(FileHandle *)?

@kjbracey
Copy link
Contributor

kjbracey commented Mar 3, 2017

Function names different from posix, let me know if I missed any:

ftell -> FileHandle::tell

It's worse than that, it's the even more ugly ftello(). Almost enough to make me relent.
size() is non-standard, but I guess it's a reasonable thing to have rather than a full stat(). (Do you have stat in the C API?). I was similarly planning to have non-standard set_blocking(), rather than fcntl(O_NONBLOCK). But maybe we should go generic?

And also, I note that ftello and size should presumably be const methods, from a C++ point of view

@kjbracey
Copy link
Contributor

kjbracey commented Mar 3, 2017

On naming, if we do have truly POSIX C functions, and they would work for our serial use, then the naming of the FileHandle methods is perhaps less significant - if people are going to use the real POSIX functions. I guess I was thinking of FileHandle as the primary application API, but maybe it could be more the proper POSIX->driver internal API.

@0xc0170
Copy link
Contributor

0xc0170 commented Mar 6, 2017

@geky - whats the status? How important is this patch, should this be added to known-issues (what's the scope of this patch) ?

Please look at the travis failures.

@geky
Copy link
Contributor Author

geky commented Mar 7, 2017

Sorry, I've been kept busy. I've updated most of Kevin's comments. @kjbracey-arm, are you ok with this being merged as is with the real posix functions being available?

If this patch doesn't get in, users may have deprecation notices that aren't valid, but other than that it should be able to come in without code changes to master. It'd be preferable to get in but not release stopping I think.

As a sidenote from a separate discussion, FileHandle/FileBase/FileLike should all be moved to platform, if no one is against this I'll update when I can fix the mbed 2 failures so it doesn't clobber Kevin's comments.

Copy link
Contributor

@kjbracey kjbracey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is/will also be possible to drive arbitrary FileHandles via the C POSIX interface, then fine.

@geky geky force-pushed the fs-revert-filehandle branch 4 times, most recently from 80566d1 to 7133236 Compare March 7, 2017 19:12
@0xc0170
Copy link
Contributor

0xc0170 commented Mar 9, 2017

/morph test

@mbed-bot
Copy link

mbed-bot commented Mar 9, 2017

Result: SUCCESS

Your command has finished executing! Here's what you wrote!

/morph test

Output

mbed Build Number: 1661

All builds and test passed!

@0xc0170
Copy link
Contributor

0xc0170 commented Mar 9, 2017

@geky As I see it, this would land into the next minor release, correct me if I am wrong

@0xc0170 0xc0170 removed the needs: CI label Mar 9, 2017
@geky
Copy link
Contributor Author

geky commented Mar 9, 2017

Preferably a patch release. Otherwise users that consume FileHandle already may have incorrect deprecation notices. This is more important with the filesystem being introduced this release.

@TuomoHautamaki
Copy link

@geky can you push this further, you may have to re-base as there has been some other conflicting changes lately. This one is blocking us to merge our filehandle and cellular feature branch to master

@adbridge
Copy link
Contributor

Also still waiting on a review from @simonqhughes

* @param buffer the buffer to write from
* @param length the number of characters to write
* @param buffer The buffer to read in to
* @param size The number of bytes to read
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@param size should now be @param len.

* @returns
* The number of characters written (possibly 0) on success, -1 on error.
* @param buffer The buffer to write from
* @param size The number of bytes to write
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@param size should now be @param len

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good catch 👍

As identified by @hasnainvirk, @kjbracey-arm, the FileHandle and
FileBase serve two separate functions and their integration is
limiting for certain use cases.

FileLike is actually the redundant class here, but the multiple
inheritance it provides is used as a hack by the retargeting code
to get at the FileHandle implementation bound to the FileBase name.

It may make more sense for the FileBase to inherit from FileHandle,
(with perhaps a different name), but rather than explore the
possibility, this will just restore the previous hierarchy.
Should follow same path as FileHandle, although this is less used
and there is currently no route to introduce a hook for a customized
DirHandle in retarget.
@simonqhughes
Copy link
Contributor

simonqhughes commented Mar 14, 2017

In mbed-os 5.4 the following changes were made:

  • class FileLike : public FileHandle, public FileBase became class FileLike : FileBase
  • class File : public FileLike was introduced.
  • the FileHandle methods were refactored into FileLike (abstract base class) and File (realisation) dispatching into FileSystem handlers.
  • class FileHandle deprecation.

This PR is:

  • reverting the point 1-3 above, and
  • reverting the FileHandle deprecation.

Please can someone articulate the reasons for this reversal?

@kjbracey
Copy link
Contributor

We need an abstract file handle API that makes it possible to implement things providing read/write etc that aren't files - eg serial devices.

That is, the direct analogue of the POSIX file descriptor.

FileHandle already provides that. FileLike is not such a good fit, as it assumes it is a thing with a name in a filing system, and has a bunch of implementation to do with that. It couples the two separate concepts of "I provide read/write/close" and "I am a named object in a filing system".

A POSIX file descriptor is not necessarily associated with a filing system, or openable via any sort of filing system call. It is just a thing that provides the read/write/seek/etc/close operations.

For example, a Socket could certainly be a FileHandle, but not so naturally a FileLike.

(Also, most C++ coding guidelines prohibit the dual inheritance, except to provide abstract APIs, so for objects to inherit from FileHandle to provide the API would be generally be seen as less problematic than inheriting from FileLike).

@TuomoHautamaki
Copy link

@simonqhughes can you take this further asap. Thanks

@sg-
Copy link
Contributor

sg- commented Mar 22, 2017

Given API changes this will be for 5.5 It also needs lots of work related to use and best practices and tests.

/morph test-nightly

@sg-
Copy link
Contributor

sg- commented Mar 22, 2017

@mbed-bot: TEST

HOST_OSES=ALL
BUILD_TOOLCHAINS=ALL
TARGETS=ALL

@mbed-bot
Copy link

[Build 1310]
SUCCESS: Building succeeded and tests were run! Be sure to check the test results

@mbed-bot
Copy link

Result: SUCCESS

Your command has finished executing! Here's what you wrote!

/morph test-nightly

Output

mbed Build Number: 1714

All builds and test passed!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants