File/address validation #9

Closed
barneydobson opened this issue Jan 18, 2024 · 4 comments · Fixed by #83

Comments

@barneydobson
Collaborator

There are a lot of os.path.join calls - file addresses should be defined and validated in one place.

This was referenced Jan 25, 2024
@barneydobson
Collaborator Author

  • Use pathlib, not os.path.
  • Should validate that a file exists at some point. I guess at two main points, after the downloaders are called in prepare_data, and after the preprocessing functions are called in pre_processing.
  • Should graph functions define their file dependencies somewhere?
  • Are addresses handled in the same way as the old repo? I.e., after validation, everything gets stored in a dict?
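A minimal sketch of the validation step described in the list above, assuming addresses are kept in a plain dict of pathlib Path objects. The name `validate_addresses` and the dict layout are hypothetical, not taken from the repo; the same check could be called once after prepare_data and once after pre_processing.

```python
from pathlib import Path


def validate_addresses(addresses: dict[str, Path]) -> dict[str, Path]:
    """Return the addresses unchanged if every file exists, otherwise raise.

    Hypothetical helper: the name and the dict-of-paths layout are assumptions.
    """
    # Collect every missing entry so the error reports all of them at once.
    missing = {name: path for name, path in addresses.items() if not path.exists()}
    if missing:
        listing = ", ".join(f"{name} ({path})" for name, path in missing.items())
        raise FileNotFoundError(f"Missing files: {listing}")
    return addresses
```

Reporting all missing files in one error, rather than stopping at the first, is a design choice that may save a few download/preprocess round trips.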

@barneydobson
Collaborator Author

@dalonsoa I think #10 has pretty much cleared up what I need to know to start on graph functions. We don't have to solve all of this now, but I wanted to check whether I can still assume that the graph functions receive paths as args in the same way as before.
e.g.,

@register_graphfcn
def double_directed(G, river_buffer_distance, address_to_some_raster: str | Path, **kwargs):
    print(river_buffer_distance)
    return G

@dalonsoa
Collaborator

I would say so, given that it is impractical in some cases to pass fully loaded data objects like dataframes and the like.

Ideally, pass Path objects, no strings, so you can start rolling out the use of pathlib.
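As a rough illustration of the "Path objects, no strings" suggestion, here is a sketch of a graph function that rejects string addresses. The `register_graphfcn` below is a hypothetical stand-in written for this example (the project's real decorator may behave differently), and the `graphfcns` registry name is likewise an assumption.

```python
from pathlib import Path
from typing import Callable

# Hypothetical stand-in registry; the project's real register_graphfcn may differ.
graphfcns: dict[str, Callable] = {}


def register_graphfcn(func: Callable) -> Callable:
    """Store the function under its own name and return it unchanged."""
    graphfcns[func.__name__] = func
    return func


@register_graphfcn
def double_directed(G, river_buffer_distance: float, address_to_some_raster: Path, **kwargs):
    # Accept only Path objects, so string addresses are caught at the call site.
    if not isinstance(address_to_some_raster, Path):
        raise TypeError("address_to_some_raster must be a pathlib.Path")
    print(river_buffer_distance)
    return G
```

Whether to raise on strings or silently coerce them with `Path(...)` is a judgment call; raising makes the migration away from strings easier to track.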

@barneydobson
Collaborator Author

Sounds good - we don't need to decide this now, but for consistency with params we could also put addresses in pydantic classes. Ideally the addresses would be relative to the user's defined base directory in some way and also work with model instances... We don't need to work this out yet, though.

from pathlib import Path
from pydantic import BaseModel, Field

# Hypothetical constants: a Field default cannot be built from a sibling field
DOWNLOAD_DIR = Path('/base_dir/bounding_box_1/downloads')
NATIONAL_DIR = Path('/base_dir/national_downloads')

class DownloadAddresses(BaseModel):
    dir: Path = Field(default=DOWNLOAD_DIR, description='folder containing downloads for a given bbox')
    elevation: Path = Field(default=DOWNLOAD_DIR / 'elevation.tif', description='Raw elevation data.')

class NationalAddresses(BaseModel):
    dir: Path = Field(default=NATIONAL_DIR, description='folder containing large national scale downloads')
    buildings: Path = Field(default=NATIONAL_DIR / 'GBR.parquet', description='National building datasets.')
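One possible way to make addresses relative to a user-defined base directory, sketched here with a stdlib dataclass rather than pydantic to keep the example dependency-free. The `base_dir` and `bbox_name` fields are assumptions made for illustration; the same idea could presumably be carried over to a pydantic model.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class DownloadAddresses:
    """Addresses derived from a user-supplied base directory (hypothetical layout)."""

    base_dir: Path
    bbox_name: str = "bounding_box_1"

    @property
    def dir(self) -> Path:
        # Folder containing downloads for a given bbox.
        return self.base_dir / self.bbox_name / "downloads"

    @property
    def elevation(self) -> Path:
        # Raw elevation data, always resolved relative to dir.
        return self.dir / "elevation.tif"
```

For example, `DownloadAddresses(Path("/base_dir")).elevation` resolves to `/base_dir/bounding_box_1/downloads/elevation.tif`, and changing `base_dir` moves every derived address with it.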
