Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move File and IO::FileDescriptor's platform-specific parts to Crystal::System #5333

Merged
merged 3 commits into from
Dec 13, 2017

Conversation

RX14
Copy link
Contributor

@RX14 RX14 commented Nov 28, 2017

Hopefully the third time lucky we'll manage to merge an IO portability refactor.

I've copied the commit messages out below so people actually read them. I also suggest you review each commit in order, instead of reviewing the diff as a whole. it makes the entire thing a lot more readable and understandable.

Make IO::FileDescriptor fcntl private

fcntl() is an extremely low-level platform-specific detail which should only be used internally to IO::FileDescriptor. In this commit, it is removed from the public API and made private.

The only usage of this API was added only recently, in Crystal::Main. It is replaced with a method which calls LibC.fcntl directly, since it is too early in init to create a new IO::FileDescriptor instance directly and ask it if it's blocking.

Separate platform-specific parts of IO::FileDescriptor

In order to add windows support to Crystal, IO::FileDescriptor cannot depend on posix APIs. Here, we refactor out the platform-specific parts of IO::FileDescriptor into a mixin module in Crystal::System. Some platform-specific methods are refactored out of methods which contain both platform-specific and cross-platform behaviour. These methods are prefixed with system_ and are private.

Make Tempfile inherit File

Making Tempfile inherit from File allows the instance to be passed to methods which expect File, and makes Tempfile much more powerful. This requires adding a new private constructor to File which sets @path and calls super.

Separate platform-specific parts of File

Similar to previous patch: Separate platform-specific parts of IO::FileDescriptor.

@RX14 RX14 force-pushed the feature/io-crystal-system-2 branch from b0e0631 to fd81b4d Compare November 29, 2017 00:07
@straight-shoota
Copy link
Member

In the second commit, the extracted platform-specific methods are prefixed platform_. However, in the commit message it says to be system_.

I'd prefer system_ because it matches the namespace and is more descriptive.

@RX14
Copy link
Contributor Author

RX14 commented Nov 29, 2017

I'll wait a few days to see if there is consensus on naming. Then I'll pick one and make the commit message and diff match up.

If the only problem you could find is naming, I'm happy!

raise "Invalid access mode #{mode}"
end
else
raise "Invalid access mode #{mode}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't support a mode like rb+. Also, #{mode.inspect} would make it clearer in the output what comes from the user and what's the error message itself

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is directly copied from the old file.cr if it has issues, it's out of the scope of this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough

path = "#{tmpdir}#{name}.XXXXXX#{extension}"

if extension
fd = LibC.mkstemps(path, extension.bytesize)
Copy link
Contributor

@Papierkorb Papierkorb Nov 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mkstemps(3) reads: Since it [the template] will be modified, template must not be a string constant, but should be declared as a character array. That's not obvious from the function itself (At least to a non-C user), there should be atleast a comment noting that the string will be modified (Which isn't "possible" from Crystal), or it could be copied into a temporary Bytes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto


def stat(path)
if LibC.stat(path.check_no_null_byte, out stat) != 0
raise Errno.new("Unable to get stat for '#{path}'")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

path.inspect would give proper visual "escaping"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto


def real_path(path)
real_path_ptr = LibC.realpath(path, nil)
raise Errno.new("Error resolving real path of #{path}") unless real_path_ptr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same path.inspect, this and following error messages omit the single-quote. (Consistency)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@Papierkorb
Copy link
Contributor

I think there needs to be a proper support for platform-specific functionality in the stdlib. IMHO, throwing stuff out like #fcntl (The easy access to it) can be harmful: The wrapper methods are usually much easier and nicer to use; The LibC pendants are harder to get right.

Truth is, programs tend to need platform-specific stuff that's not yet nicely wrapped. I doubt that you'll be able to offer everything that the control functions support in an agnostic manner. And many programs are fine if they don't work on some random platform, as long they work where they're intended to be used. Diminishing all API to the most common denominator will only hurt on the long run.

Outside the scope of this PR, I'd actually like to see a documentation flag to note platform information right in the in-source docs. The docs generator could then mark such a section visually, or add a "Tag badge" to it.

@RX14
Copy link
Contributor Author

RX14 commented Nov 29, 2017

re: removing fcntl, I actually think you're right. I'll probably revert that commit.

@RX14
Copy link
Contributor Author

RX14 commented Nov 29, 2017

I appreciate these comments, but comments (in support of or against) about the approach of using a mixin module from Crystal::System would be most helpful to tell whether this PR is going to get merged at all.

@RX14 RX14 mentioned this pull request Nov 29, 2017
24 tasks
Copy link
Contributor

@ysbaddaden ysbaddaden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure mixins are better than composition, but that should really help going forward, so I'm gonna say 👍 ! I guess we'll probably rework it later when there is windows support, but... later. Let's go forward —as long as the public API is good and stable.

I don't think we need the Mixin suffix when naming modules, I mean include Crystal::System::FileDescriptor is just fine, and the filenames don't have the _mixin suffix anyway (crystal/system/file_descriptor.cr).

About platform-specific methods, there are usages of the _impl suffix in different places in the stdlib. Maybe we could stick to that? My point is that "private impl" can englob the "target-specific" meaning.

src/file.cr Outdated

class File < IO::FileDescriptor
private alias Sys = Crystal::System::File
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd avoid the alias and use the long naming.

# TODO: use fcntl/lockf instead of flock (which doesn't lock over NFS)
# TODO: always use non-blocking locks, yield fiber until resource becomes available

def flock_shared(blocking = true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be opportune to break the file lock API now? The current flock naming and API is tied to the underlying flock POSIX implementation, which is bad. What about abstracting it now, since this patch is meant to support windows, eventually.

enum Lock
  Shared
  Exclusive
end

def lock(operation : Lock, blocking = false, &block)
end

def lock(operation : Lock, blocking = false)
end

def unlock
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd tentatively say that's easily done in a latter PR - and so it should be.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So be it. Let's stick to extracting posix-specific calls in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend making an issue to track that though, we seem to forget a lot of these refactors once the PR gets merged.

@RX14
Copy link
Contributor Author

RX14 commented Nov 30, 2017

@ysbaddaden if you look at File, there's a Crystal::System::File and a Crystal::System::FileMixin. The Mixin naming comes from there.

@ysbaddaden
Copy link
Contributor

Hum, I prefer to use def self. instead of extend self when there is no actual benefit to using the module methods as both includeable and class methods (for example Random::Secure does benefit). This would avoid having System::File and System::FileMixin and but just System::File, with some class methods, and some instance methods to be included in ::File.

Fairly minor, but it looks better to my eyes.

@RX14
Copy link
Contributor Author

RX14 commented Nov 30, 2017

@ysbaddaden sounds good to me. I'll change it.

@RX14
Copy link
Contributor Author

RX14 commented Dec 1, 2017

I'm still undecided about naming. I think a system_ prefix is better than _impl, because there's a difference between just a class-private implementation detail and a platform-specific method. system corresponds to the namespace, so it'd a good name.

@ysbaddaden
Copy link
Contributor

To me it boils down to "implementation detail", but that's pedantic nitpicking, no need to spend hours on such details.

@RX14 RX14 force-pushed the feature/io-crystal-system-2 branch from fd81b4d to 70bb1e0 Compare December 1, 2017 15:20
@RX14
Copy link
Contributor Author

RX14 commented Dec 1, 2017

Updated this PR. Changed the naming to system_, and removed the Mixin naming. Also re-added fcntl and moved the IO::System-specific parts to the mixin (windows doesn't use it yet because there's no evented IO).

@RX14 RX14 mentioned this pull request Dec 1, 2017
@RX14
Copy link
Contributor Author

RX14 commented Dec 1, 2017

Also, if this gets merged, please make sure to rebase, not just squash. I think it'll make for a better commit history.

private def unbuffered_rewind
seek(0, IO::Seek::Set)
self
self.pos = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the change?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pos=(value) calls seek(value) and Seek::Set is default. So seek(0) would be the most direct call here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason for the change, it's trivial to see it's the same but I can revert it if you want.

end
end

private def system_pos
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer system_tell over system_pos.

I wish we'd drop IO#pos and IO#pos= and kept IO#seek and IO#tell instead (less duplication, more explicit naming).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We just need 1 call: seek which always returns the current position. I suggested that IO implement pos pos= and tell in the base class to @asterite and simply provide a raising IO#tell by default.

For now, this is how IO works. It's not great, but it is how it is until I refactor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally i'd rather keep pos, pos= and seek. tell is obscure and weird naming to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it's "seek and tell". They're well known associated terms, whatever the language. E.g. C, Haskell, Python, Ruby or Rust, all have both.

I don't like the pos naming, it's an abbreviation and inexplicit (inherited from Ruby). At least position would be a more explicit name but I still prefer tell (broadly used). Using a setter (pos=) to perform an action, that is move the cursor in a file, sounds awful to me, in addition to be an exact duplicate for seek, too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to say "seek" is not an unknown term to me, although "tell" is. I agree with you on pos vs position, but not that setters shouldn't have performance side-effects.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, the terms tell and seek come from working with files. IO is a general class, and to me it doesm't seem logical to transfer this meaning from files to any other stream. But I may be mistaken.
Another thought: pos and pos= is a concept easy to understand. One reads the current stream position, the other sets it (position and position= are a bit longer but also fine). With seek and tell it is more complicated, if your not familiar with these terms. You have to remember two names instead of one and can't simply asseses their meaning.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also for dropping #seek/#tell and using #position/#position= (Not #pos). The latter is really clear in intention. Also, it lets you seek around in a file using normal code:

my_file.position += 5 # This is how you'd write it anywhere else too! Fantastic!
my_file.seek(my_file.tell + 5) # Compared to the one above: Meh.
my_file.seek(5, File::SEEK_CUR) # I can never remember which SEEK_* I need - this sucks.

Copy link
Contributor Author

@RX14 RX14 Dec 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think seek should stay, io.position = io.size - 5 involves 2 syscalls whereas io.seek(-5, IO::Seek::End) requires only one. Similar for io.position += 5. That being said, I think that using io.position is a much nicer interface which is more than adequate for most people. Perhaps some caching for size and position would mean that we can obviate seek entirely.

Regardless of that, we should make a new issue containing all the IO refactor suggestions from this thread. Unless anyone else volunteers to make that issue now, i'll get it done once this PR is merged.

r
end

def fcntl(cmd, arg = 0)
Copy link
Contributor

@ysbaddaden ysbaddaden Dec 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IO#fcntl method is supposed to be public, but it won't be documented?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't documented before, I can add documentation if you want.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I meant: will the #fcntl method still appear in the documentation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, not sure. Let me check.

Copy link
Contributor Author

@RX14 RX14 Dec 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They actually break the doc generator by saying they're included from the module, but the module is :nodoc: so that the links don't work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll go and do this method the same as the rest - with the system_ prefix and {% ifdef %} the method out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I've noticed this, too. Inclusion and extension of modules that are not documented should probably be hidden in the docs generator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a fixup commit which resolves this. If this is OK i'll rebase to allow this to be merged.

@RX14
Copy link
Contributor Author

RX14 commented Dec 4, 2017

There have been loads of great ideas about refactoring IO::FileDescriptor and IO here, but I think that i'll work on a later PR which basically contains all this. I want to really focus on windows this christmas, instead of the IO refactors that work uncovers.

If people feel that some of that work would be better in this PR however, i'm happy to include it.

@RX14
Copy link
Contributor Author

RX14 commented Dec 4, 2017

Plus, a larger scope pf the PR means I have the chance to perform these same refactors on Socket and such. I really feel that these should be seperate PRs for more reasons than wanting to get this merged.

@RX14 RX14 requested a review from ysbaddaden December 6, 2017 11:24
@RX14
Copy link
Contributor Author

RX14 commented Dec 12, 2017

Ping? I think I've solved all the reviews, and all this should need is a tiny review of the fixup commit from @ysbaddaden or a review from another core team member. I think this PR is fine to land in 0.24.1.

In order to add windows support to Crystal, IO::FileDescriptor cannot depend on
posix APIs. Here, we refactor out the platform-specific parts of
IO::FileDescriptor into a mixin module in Crystal::System. Some
platform-specific methods are refactored out of methods which contain both
platform-specific and cross-platform behaviour. These methods are prefixed with
`system_` and are private.
Making Tempfile inherit from File allows the instance to be passed to methods
which expect File, and makes Tempfile much more powerful. This requires adding a
new private constructor to File which sets @path and calls super.
Similar to previous patch: Separate platform-specific parts of
IO::FileDescriptor.
@RX14 RX14 force-pushed the feature/io-crystal-system-2 branch from ccdd6b7 to 586846d Compare December 13, 2017 15:02
@RX14
Copy link
Contributor Author

RX14 commented Dec 13, 2017

Build failure is known related to travis's new build images: travis-ci/travis-ci#8891.

@RX14 RX14 merged commit 5fd965e into crystal-lang:master Dec 13, 2017
@RX14 RX14 added this to the Next milestone Dec 24, 2017
@RX14 RX14 added kind:refactor topic:stdlib platform:windows Windows support based on the MSVC toolchain labels Dec 24, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:refactor platform:windows Windows support based on the MSVC toolchain topic:stdlib
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants