Proposed Style Guide WIP #292

rafmudaf · 2022-01-25T22:25:45Z

rafmudaf
Jan 25, 2022
Maintainer

Style Guide for FLORIS v3

Here are some ideas for establishing our own internal style guide.

Numpy usage

Numpy is quite flexible, and this leaves many ways to do the same thing with comparable performance.

How to operate on arrays

Do we use the array's own methods and attributes, or the equivalent functions from the numpy library?

a = np.array([...])
np.shape(a)

versus

a = np.array([...])
a.shape

| | `np.func(array)` | `array.func()` |
|---|---|---|
| Pros | expands use case to array-like data | more concise |
| | consistent with functions that are not also class methods | acts as check for correct data types |
| Cons | uses more space | possibly confusing to those less familiar with numpy |
| | less expected way to use numpy | |
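
To make the trade-off concrete, here is a minimal sketch (variable names are illustrative only) showing that the function form accepts array-like data such as a plain list, while the attribute form only works on an actual array and so doubles as a type check.

```python
import numpy as np

values = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # plain Python list (array-like)
a = np.array(values)

# Function form: works for both the list and the array
shape_from_list = np.shape(values)
shape_from_array = np.shape(a)

# Attribute form: works only on an ndarray
shape_attribute = a.shape

# A list has no .shape attribute, so the attribute form fails loudly
try:
    values.shape
    list_has_shape = True
except AttributeError:
    list_has_shape = False
```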

Support Python List?

I vote no. I'd rather only support np.array for input arguments instead of both np.array and List. This is for type hinting in the function signature, and should only apply to the simulation package.

A second vote for np.array support only, unless there is a specific use case or potentially externally facing functionality. It is typically faster for much of the work that will be done with FLORIS and ensures consistent vectorization across the software.

Standard method for adding axes to an array for broadcasting

We've seen many ways to handle resizing and reshaping arrays prior to an array arithmetic operation that will involve broadcasting. Here, we should establish our standard method for accomplishing this task since it is done in many places.

Possible methods:

np.resize(array, (new_dimension_size, *array.shape)): adds new dimensions of any size
np.stack([array] * new_dimension_size, axis=0): adds new dimensions of any size, but needs to be done more carefully
np.expand_dims(array, axis=0): adds a new dimension of size 1, with its position easily controlled via axis=
array[np.newaxis, :]: same as above, but more concise
array[None, :]: same as above since np.newaxis is same as None.
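
As a rough comparison of the options listed above, the sketch below applies each one to a small example array (the values and the new dimension size are made up for illustration). The first two repeat the data along a new leading axis, while the last three add a size-1 axis and rely on broadcasting.

```python
import numpy as np

array = np.array([8.0, 9.0, 10.0])  # shape (3,)
new_dimension_size = 4

# Repeat the data along a new leading axis of arbitrary size
resized = np.resize(array, (new_dimension_size, *array.shape))  # shape (4, 3)
stacked = np.stack([array] * new_dimension_size, axis=0)        # shape (4, 3)

# Add a leading axis of size 1 and let broadcasting do the rest
expanded = np.expand_dims(array, axis=0)  # shape (1, 3)
with_newaxis = array[np.newaxis, :]       # shape (1, 3)
with_none = array[None, :]                # shape (1, 3)

# The size-1 axis broadcasts against a (4, 3) operand without copying data first
broadcast_product = np.ones((new_dimension_size, 3)) * expanded
```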

Operation chaining

It can seem simpler to chain operations so that a single line performs several steps at once. For example, the fCt functions in FLORIS are handles to callable functions, so they can be accessed and used like this:

thrust_coefficient = np.zeros_like(average_velocities)
for i in range(n_wind_speeds):
    for j in range(n_turbines):
        thrust_coefficient[i, j] = fCt[i, j](average_velocities[i, j])

The last line above gets the callable from the fCt numpy array and also calls the function. Due to the chained operations, this is not very clear to read. For anyone not familiar with this area of the code, it might be unexpected to have an array of functions in the first place, so this line is even more opaque. A clearer approach is to first pull out the function and name it appropriately. Then, call the function directly.

thrust_coefficient = np.zeros_like(average_velocities)
for i in range(n_wind_speeds):
    for j in range(n_turbines):
        ct_interp_function = fCt[i, j]
        thrust_coefficient[i, j] = ct_interp_function(average_velocities[i, j])

Array masking

We often write the construction of a boolean mask like this:

mask = np.array(x > x0)

However, this causes the creation of an extra array due to the unnecessary type cast. The result of a comparison operation involving an np.ndarray is itself an np.ndarray, anyway. This should instead be written like this, including the parentheses to denote that it's an operation in itself rather than a chain of arithmetic:

mask = (x > x0)
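
A quick check of this claim, using made-up values for x and x0: the comparison already yields a boolean array that can be used directly for indexing, and wrapping it in np.array only produces an equal copy.

```python
import numpy as np

x = np.array([1.0, 5.0, 10.0])
x0 = 4.0

mask = (x > x0)               # already a boolean ndarray
mask_copy = np.array(x > x0)  # same values, but allocates a second array

is_ndarray = isinstance(mask, np.ndarray)
selected = x[mask]  # the mask can be used directly for indexing
```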

Explicit vs implicit array sizing

For the sake of being explicit about our data, the FLORIS code should resize arrays with the explicit dimension size where possible. For example, it is valid to do this:

WIND_SPEEDS = [8.0, 9.0, 10.0, 11.0]
WIND_SPEEDS_BROADCAST = np.reshape(np.array(WIND_SPEEDS), (1, -1, 1, 1))

and it would result in an array of shape (1, 4, 1, 1). However, in this case we know the size of the second dimension, so it would be equivalent in operation and better in readability to do this:

WIND_SPEEDS = [8.0, 9.0, 10.0, 11.0]
WIND_SPEEDS_BROADCAST = np.reshape(np.array(WIND_SPEEDS), (1, 4, 1, 1))

or even better

WIND_SPEEDS = [8.0, 9.0, 10.0, 11.0]
n_wind_speeds = len(WIND_SPEEDS)
WIND_SPEEDS_BROADCAST = np.reshape(np.array(WIND_SPEEDS), (1, n_wind_speeds, 1, 1))
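
One possible side benefit, offered here as an illustration rather than a stated goal of this guide: an explicit size also acts as a check on the data. In the sketch below (the extra wind speed is a made-up error case), the implicit -1 silently adapts to unexpected input, while the explicit size raises immediately.

```python
import numpy as np

WIND_SPEEDS = [8.0, 9.0, 10.0, 11.0]
n_wind_speeds = 4  # the size we expect

# Hypothetical bad input: one more value than expected
too_many = WIND_SPEEDS + [12.0]

# Implicit sizing silently accepts the longer input
implicit = np.reshape(np.array(too_many), (1, -1, 1, 1))  # shape (1, 5, 1, 1)

# Explicit sizing raises as soon as the data does not match
try:
    np.reshape(np.array(too_many), (1, n_wind_speeds, 1, 1))
    mismatch_caught = False
except ValueError:
    mismatch_caught = True
```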

Scalar values in right-most dimensions

We often have a scalar value in the right-most dimensions of an array. For example, the wake models are a function of the current turbine's rotor diameter, so when we are finding the wake profile on the TurbineGrid we have something like

rotor_diameter[:, :, :, None, None] * Ct[:, :, :, None, None] * u  # Here, u has 5x5 in the right-most dimensions but these are broadcast by Numpy

This could also be handled by expanding the scalar values to 5x5 grids in rotor_diameter and Ct. The first method is more space efficient (memory-wise), but the second method is less complex since we could make the blanket statement that all arrays have the same shape.
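
The two approaches can be compared with a small sketch. The dimension names and sizes below are assumptions chosen for illustration (three leading dimensions followed by a 5x5 grid), as are the numeric values; they are not taken from the FLORIS code.

```python
import numpy as np

# Hypothetical sizes for the three leading dimensions and the rotor grid
n_dim0, n_dim1, n_turbines, n_grid = 2, 3, 4, 5

rotor_diameter = np.full((n_dim0, n_dim1, n_turbines), 126.0)
Ct = np.full((n_dim0, n_dim1, n_turbines), 0.8)
u = np.ones((n_dim0, n_dim1, n_turbines, n_grid, n_grid))

# Method 1: add size-1 trailing axes and let Numpy broadcast
broadcast_result = rotor_diameter[:, :, :, None, None] * Ct[:, :, :, None, None] * u

# Method 2: expand the per-turbine values to full 5x5 grids first
rotor_diameter_full = np.broadcast_to(rotor_diameter[:, :, :, None, None], u.shape).copy()
Ct_full = np.broadcast_to(Ct[:, :, :, None, None], u.shape).copy()
expanded_result = rotor_diameter_full * Ct_full * u

# The expanded copy stores 25 values per turbine instead of 1
memory_ratio = rotor_diameter_full.nbytes // rotor_diameter.nbytes
```

Both methods give the same result; the difference is only in how much memory the inputs occupy and whether every array shares one shape.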

Import statements

Absolute imports are recommended over relative imports as described in PEP 8: https://www.python.org/dev/peps/pep-0008/#imports

Module directories should contain a file __init__.py that imports all modules within the directory. For example, the file at /Users/rmudafor/Development/floris/src/simulation/wake_velocity/__init__.py should contain the following line:

from src.simulation.wake_velocity.jensen import JensenVelocityDeficit

So that the imported object can be used in other modules as

from src.simulation.wake_velocity import JensenVelocityDeficit

Errors, warnings, and other feedback to users

We currently either raise an exception or use the logging_manager to display information to the user via an error file or dump to the console. Exceptions should be raised when the code enters a state that is invalid. For example, a NaN value in a calculation may be a valid reason to raise an exception. On the other hand, a message should be sent through the logging manager in the following situations:

The code changes the configuration (i.e. inputs) based on some predefined rules that are not obvious to the user
The configuration is atypical and likely unintended such as using the "none" models
The inputs are invalid such as a negative rotor diameter or outrageous power
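
A minimal sketch of the distinction described above. It uses Python's standard logging module as a stand-in, since the logging_manager interface is not shown here, and both function names are hypothetical.

```python
import logging
import math

# Stand-in logger; the real code would go through logging_manager
logger = logging.getLogger("style_guide_example")


def check_rotor_diameter(rotor_diameter: float) -> float:
    """Send a message through the logger for a suspicious input and return it unchanged."""
    if rotor_diameter <= 0.0:
        logger.warning("Rotor diameter %s is not positive; check the inputs.", rotor_diameter)
    return rotor_diameter


def check_velocity(velocity: float) -> float:
    """Raise an exception when a calculation has reached an invalid state."""
    if math.isnan(velocity):
        raise ValueError("Velocity is NaN; the calculation is in an invalid state.")
    return velocity
```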

Syntax

These should be configured in a Python formatter where possible.

Multiline statements with arguments

Function definitions and function calls (including creating container classes) may use a multiline syntax when appropriate. For statements with only one or two arguments, it may be more readable to use a one-line statement. However, multiline statements provide a few benefits:

Comment out or change one argument without affecting the others
Add contextual comment to one argument
Use a generator expression while maintaining readability; in a single line, these expressions add too much visual complexity
Move arguments around easily with key strokes (option+up/down in VS Code)
Multiline select with keyboard only to edit arguments
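
As an illustration of the first three points, here is a short sketch with a hypothetical function (make_grid and its arguments are invented for this example):

```python
def make_grid(
    n_turbines: int,
    grid_resolution: int = 5,  # points per side of each rotor grid
    # include_hub: bool = False,  # an argument can be disabled on its own
) -> tuple:
    return (n_turbines, grid_resolution, grid_resolution)


grid_shape = make_grid(
    n_turbines=4,
    grid_resolution=3,
)

# A generator expression stays readable when given its own lines
total_points = sum(
    n * n
    for n in (3, 5)
)
```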

Arrays and other containers

There should be no space at the beginning and end of the container:

a = [1, 2, 3]
b = {"a": 1, "b": 2}

Lines exceeding max line length

When an equation exceeds the line length configured in the linter, it is preferred to wrap the equation in parentheses and express it vertically term by term. For example, the equation below is too long on one line, but it can easily be split at the mathematical operators so that it can be read and comprehended vertically. Notice how cosd(yaw_angle) ** pW is kept on a single line, retaining the context that cos(yaw_angle) is raised to the power pW. This is consistent with how we would communicate this mathematically or in conversation.

    # Too long
    yaw_effective_velocity = (air_density/ref_density_cp_ct)**(1/3) * average_velocity(velocities) * cosd(yaw_angle) ** pW

    # Split vertically
    yaw_effective_velocity = (
        (air_density/ref_density_cp_ct)**(1/3)
        * average_velocity(velocities)
        * cosd(yaw_angle) ** pW
    )

There are two main benefits to this style:

Context is maintained since it is split by each term
No single term is emphasized; auto formatters tend to split on opening parentheses without regard to the meaning, but this overemphasizes one part over the other. The example below is not preferred as it draws eyes to average_velocity and away from the rest of the terms:

    yaw_effective_velocity = (air_density/ref_density_cp_ct)**(1/3) * average_velocity(
        velocities
    ) * cosd(yaw_angle) ** pW

Similarly, f-strings that exceed the max line length can be wrapped in parentheses and split by line. No comma is needed between lines since Python automatically concatenates adjacent string literals.

# Too long
raise KeyError(f"Wake: '{k}' was given as input but it is not a valid option. Required inputs are: {', '.join(required_strings)}")

# Multiline f-string
raise KeyError(
    f"Wake: '{k}' was given as input but it is not a valid option. "
    f"Required inputs are: {', '.join(required_strings)}"
)
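
One detail worth checking when splitting strings this way: adjacent literals are joined with nothing in between, so any space between sentences has to be written inside one of the pieces. A small sketch with made-up values for k and required_strings:

```python
k = "jensn"  # hypothetical misspelled input
required_strings = ["jensen", "gauss"]  # hypothetical valid options

# Note the trailing space inside the first piece
message = (
    f"Wake: '{k}' was given as input but it is not a valid option. "
    f"Required inputs are: {', '.join(required_strings)}"
)
```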
