-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
get: use StateBase #2323
get: use StateBase #2323
Conversation
1d459c7
to
81dc568
Compare
dvc/repo/__init__.py
Outdated
@@ -72,7 +72,12 @@ def __init__(self, root_dir=None): | |||
self.lock = Lock(self.dvc_dir) | |||
# NOTE: storing state and link_state in the repository itself to avoid | |||
# any possible state corruption in 'shared cache dir' scenario. | |||
self.state = State(self, self.config.config) | |||
if no_state: | |||
from dvc.state import StateBase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move this up to from dvc.state import State
above, just to keep it tidy
dvc/repo/__init__.py
Outdated
self.lock.lock_file, | ||
self.config.config_local_file, | ||
updater.updater_file, | ||
updater.lock.lock_file, | ||
] + self.state.temp_files | ||
|
||
if not no_state: | ||
flist += [self.state.state_file] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a little weird that we have temp_files, but for state_file we have a special "if". How about we define state_files = []
and use it instead of state_file
same as we do with temp_files
? Or maybe we could even name it self.state.files
:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't we just:
repo.state = BaseState()
# or
repo.cache.local.state = BaseState()
Why do we need those no_state
and additional changes?
with external_repo(cache_dir=tmp_dir, url=url, rev=rev) as repo: | ||
with external_repo( | ||
cache_dir=tmp_dir, url=url, rev=rev, no_state=True | ||
) as repo: | ||
# Try any links possible to avoid data duplication. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only change should be here:
repo.state = ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, state db is not connected to until __enter__
, so we could do that.
a583564
to
26cfb64
Compare
dvc/remote/base.py
Outdated
self._dir_info = {} | ||
|
||
@property | ||
def state(self): | ||
return self._state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this? Why can't we just repo.cache.local.state = StateBase()
in appropriate place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we repo.cache.local.state = StateBase()
, we end up with situation, where RemoteLOCAL
and Repo
use different instance of state, while originally, they have been using the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that an issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is problem in current issue, but I think that it is potential bug. If we pass Repo
to RemoteLOCAL
, and then assign its state
to RemoteLOCAL.state
, we create unobvious dependency, that we have to remember about, if we ever need to substitute state
. We also need to remember which state is used in which part of code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we don't need this _state
and state
property though. We can simply use:
class RemoteBASE:
# ...
state = StateBase()
def __init__(...):
# ...
That would be properly shadowed by state property in RemoteLOCAL
you already have.
dvc/repo/get.py
Outdated
# Note: we need to replace state, because in case of getting DVC | ||
# dependency on CIFS or NFS filesystems, sqlite-based state | ||
# will be unable to obtain lock | ||
repo.state = StateBase() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You separated comment from its code by inserting it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure what do you mean? It should be below repo.state = StateBase()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment "Also, we can't use theoretical "move" link type here ... " is about the code "repo.config.set(..." and you added your lines inbetween.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, sorry.
|
||
def __exit__(self, exc_type, exc_val, exc_tb): | ||
pass | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We won't need these either if we only change remote state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another thing we might consider:
- rename
StateBase
->NoopState
. - stop inheriting
State
from it.
No urgency though, so postpone it if you are not up to adding it here.
dvc/remote/base.py
Outdated
self._dir_info = {} | ||
|
||
@property | ||
def state(self): | ||
return self._state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we don't need this _state
and state
property though. We can simply use:
class RemoteBASE:
# ...
state = StateBase()
def __init__(...):
# ...
That would be properly shadowed by state property in RemoteLOCAL
you already have.
dvc/state.py
Outdated
@@ -32,6 +32,10 @@ def __init__(self, dvc_version, expected, actual): | |||
|
|||
|
|||
class StateBase(object): | |||
@property | |||
def files(self): | |||
return [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simply files = []
will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Are those fails unrelated? Maybe just travis or some homebrew fail? |
@Suor yep, take a look at our slack :) |
Have you followed the guidelines in our
Contributing document?
Does your PR affect documented changes or does it add new functionality
that should be documented? If yes, have you created a PR for
dvc.org documenting it or at
least opened an issue for it? If so, please add a link to it.
Fixes #2135