get: use StateBase #2323

Merged
merged 1 commit into from
Jul 29, 2019

Conversation

pared
Contributor

@pared pared commented Jul 25, 2019

  • Have you followed the guidelines in our
    Contributing document?

  • Does your PR affect documented changes or does it add new functionality
    that should be documented? If yes, have you created a PR for
    dvc.org documenting it or at
    least opened an issue for it? If so, please add a link to it.


Fixes #2135

@pared pared force-pushed the 2135 branch 2 times, most recently from 1d459c7 to 81dc568 Compare July 25, 2019 13:46
@pared pared changed the title [WIP] get: use StateBase get: use StateBase Jul 26, 2019
@pared pared requested a review from efiop July 26, 2019 10:36
@@ -72,7 +72,12 @@ def __init__(self, root_dir=None):
        self.lock = Lock(self.dvc_dir)
        # NOTE: storing state and link_state in the repository itself to avoid
        # any possible state corruption in 'shared cache dir' scenario.
        self.state = State(self, self.config.config)
        if no_state:
            from dvc.state import StateBase
Contributor

Let's move this up next to the from dvc.state import State above, just to keep it tidy.

            self.lock.lock_file,
            self.config.config_local_file,
            updater.updater_file,
            updater.lock.lock_file,
        ] + self.state.temp_files

        if not no_state:
            flist += [self.state.state_file]
Contributor

@efiop efiop Jul 26, 2019

It is a little weird that we have temp_files, but for state_file we have a special "if". How about we define state_files = [] and use it instead of state_file, the same way we do with temp_files? Or maybe we could even name it self.state.files :)
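The suggestion above can be sketched roughly as follows. This is a minimal illustration, not DVC's actual code: the `State` constructor, the `ignored_files` helper, and the file names are hypothetical, and only the idea of a `files` attribute comes from the comment.

```python
class StateBase(object):
    # No-op state: it owns no files on disk.
    files = []


class State(StateBase):
    def __init__(self, state_file, temp_files):
        # Hypothetical constructor; the real State is built from a repo and config.
        self.state_file = state_file
        self.temp_files = temp_files

    @property
    def files(self):
        # Every file this state owns, temporary ones included.
        return self.temp_files + [self.state_file]


def ignored_files(lock_file, state):
    # No special "if": a no-op state simply contributes an empty list.
    return [lock_file] + state.files
```

With this shape the caller never needs to know which kind of state it holds.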

Contributor

@Suor Suor left a comment

Why can't we just do:

repo.state = StateBase()
# or
repo.cache.local.state = StateBase()

Why do we need no_state and the additional changes?

- with external_repo(cache_dir=tmp_dir, url=url, rev=rev) as repo:
+ with external_repo(
+     cache_dir=tmp_dir, url=url, rev=rev, no_state=True
+ ) as repo:
      # Try any links possible to avoid data duplication.
Contributor

The only change should be here:

repo.state = ...

Contributor

Indeed, the state db is not connected to until __enter__, so we could do that.
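A minimal sketch of why swapping the state before entering it is safe, assuming the state object only opens its database in `__enter__`. `LazyState` and this `Repo` are hypothetical stand-ins, not DVC's classes.

```python
class StateBase(object):
    # No-op stand-in: entering and exiting touch nothing on disk.
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        pass


class LazyState(StateBase):
    # Hypothetical database-backed state that connects lazily.
    def __init__(self):
        self.connected = False  # constructing the object opens nothing

    def __enter__(self):
        self.connected = True  # the connection would be opened here
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.connected = False


class Repo(object):
    def __init__(self):
        self.state = LazyState()


repo = Repo()
original = repo.state
repo.state = StateBase()  # swap before anything enters the state
with repo.state:
    pass  # work happens against the no-op state; `original` never connects
```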

@pared pared force-pushed the 2135 branch 3 times, most recently from a583564 to 26cfb64 Compare July 27, 2019 13:50
        self._dir_info = {}

    @property
    def state(self):
        return self._state
Contributor

Why do we need this? Why can't we just set repo.cache.local.state = StateBase() in the appropriate place?

Contributor Author

@pared pared Jul 29, 2019

If we set repo.cache.local.state = StateBase(), we end up in a situation where RemoteLOCAL and Repo use different state instances, whereas originally they shared the same one.

Contributor

Is that an issue?

Contributor Author

I don't think it is a problem for the current issue, but I think it is a potential bug. If we pass Repo to RemoteLOCAL and then assign its state to RemoteLOCAL.state, we create a non-obvious dependency that we have to remember if we ever need to substitute the state. We would also need to remember which state is used in which part of the code.
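The concern in this comment can be illustrated with a small sketch. All class names below are hypothetical and the string stands in for a real state object; the point is only the difference between copying a reference at construction time and delegating to the repo on every access.

```python
class StateBase(object):
    """No-op state used as a replacement."""


class CopyingRemote(object):
    def __init__(self, repo):
        self.repo = repo
        self.state = repo.state  # reference captured once, at construction


class DelegatingRemote(object):
    def __init__(self, repo):
        self.repo = repo

    @property
    def state(self):
        return self.repo.state  # always follows whatever the repo holds now


class Repo(object):
    def __init__(self):
        self.state = "sqlite-state"  # placeholder for the real state object
        self.copying_remote = CopyingRemote(self)
        self.delegating_remote = DelegatingRemote(self)


repo = Repo()
repo.state = StateBase()  # substitute the state after construction
# repo.copying_remote.state still points at the old object,
# while repo.delegating_remote.state is the new StateBase instance.
```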

Contributor

Makes sense.

Contributor

@Suor Suor Jul 29, 2019

I believe we don't need this _state and state property though. We can simply use:

class RemoteBASE:
    # ...
    state = StateBase()

    def __init__(...):
        # ...

That would be properly shadowed by the state property you already have in RemoteLOCAL.
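A runnable sketch of the shadowing described above. The body of the `RemoteLOCAL.state` property and the `FakeRepo` helper are assumptions made for illustration; only the class attribute on `RemoteBASE` comes from the comment.

```python
class StateBase(object):
    """No-op state."""


class RemoteBASE(object):
    # Class-level default, shared by every remote that does not override it.
    state = StateBase()


class RemoteLOCAL(RemoteBASE):
    def __init__(self, repo):
        self.repo = repo

    @property
    def state(self):
        # Defined on the subclass, so attribute lookup finds it before
        # RemoteBASE.state (assumed to delegate to the repo).
        return self.repo.state


class FakeRepo(object):
    # Hypothetical stand-in for a repo object.
    def __init__(self, state):
        self.state = state
```

Lookup walks the method resolution order, so `RemoteLOCAL` instances get the property while other `RemoteBASE` subclasses fall back to the shared no-op instance.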

dvc/repo/get.py Outdated
# Note: we need to replace state, because in case of getting DVC
# dependency on CIFS or NFS filesystems, sqlite-based state
# will be unable to obtain lock
repo.state = StateBase()
Contributor

You separated the comment from its code by inserting this here.

Contributor Author

I am not sure what you mean. Should it be below repo.state = StateBase()?

Contributor

The comment "Also, we can't use theoretical "move" link type here ... " is about the code "repo.config.set(..." and you added your lines in between.

Contributor Author

Ahh, sorry.


    def __exit__(self, exc_type, exc_val, exc_tb):
        pass

Contributor

We won't need these either if we only change remote state.

Contributor

@Suor Suor left a comment

Another thing we might consider:

  1. rename StateBase -> NoopState.
  2. stop inheriting State from it.

No urgency though, so postpone it if you are not up to adding it here.
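One way the two points above could look, as a sketch only: `NoopState` is written as a standalone null object and `State` matches its interface without inheriting from it. The method names, the in-memory dictionary, and the `remember` helper are hypothetical and are not taken from DVC.

```python
class NoopState(object):
    # Null object: same interface as a real state, every call does nothing.
    files = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        pass

    def save(self, path, checksum):
        pass

    def get(self, path):
        return None


class State(object):
    # Deliberately not a subclass of NoopState; it only matches its interface.
    def __init__(self):
        self.files = ["state.db"]  # hypothetical file name
        self._checksums = {}  # in-memory stand-in for a database

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        pass

    def save(self, path, checksum):
        self._checksums[path] = checksum

    def get(self, path):
        return self._checksums.get(path)


def remember(state, path, checksum):
    # Works with either class through duck typing.
    with state:
        state.save(path, checksum)
        return state.get(path)
```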

dvc/state.py Outdated
@@ -32,6 +32,10 @@ def __init__(self, dvc_version, expected, actual):


class StateBase(object):
    @property
    def files(self):
        return []
Contributor

Simply files = [] will do.
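A small comparison of the two spellings, using hypothetical class names. Both read the same from the caller's side; the one difference worth knowing is that the class attribute is a single list shared by all instances, which is fine as long as nobody mutates it.

```python
class WithProperty(object):
    @property
    def files(self):
        # A fresh empty list on every access.
        return []


class WithAttribute(object):
    # One list object, created once and shared by every instance.
    files = []


a, b = WithAttribute(), WithAttribute()
same_object = a.files is b.files  # True: both names reach the class attribute
fresh_each_time = WithProperty().files is not WithProperty().files  # True
```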

Contributor

@efiop efiop left a comment

LGTM

@efiop efiop merged commit c5c248c into iterative:master Jul 29, 2019
@Suor
Contributor

Suor commented Jul 29, 2019

Are those failures unrelated? Maybe just a Travis or Homebrew failure?

@efiop
Contributor

efiop commented Jul 29, 2019

@Suor yep, take a look at our slack :)

@pared pared deleted the 2135 branch December 17, 2019 13:14

Successfully merging this pull request may close these issues.

get: disable state lock and .dvc/lock to be able to deploy on nfs
3 participants