Version 4.4 #259

Merged
merged 90 commits into from
Apr 16, 2024

bockthom
Collaborator

4.4

Announcement

  • Due to a bug in the igraph package (p-value missing for fit_power_law() igraph/rigraph#1158), which is present in igraph versions 2.0.0 to 2.0.3, the functions metrics.scale.freeness and metrics.is.scale.free cannot currently be used with these igraph versions. If you need to call either of these two functions, install igraph version 1.6.0 or wait until the bug is fixed in a future igraph release.

Added

Changed/Improved

Fixed

maxloeffler and others added 30 commits September 5, 2023 10:57
Introduce edge construction into the 'get.artifact.network.issue' function.
Connect 'add_link' and 'referenced_by' issue-events by an edge.

This works towards #239.

Signed-off-by: Maximilian Löffler <[email protected]>
Add new 'add_link' and 'referenced_by' issue events to the testing data to
allow for new tests. Add a test for the construction of an issue-based
artifact-network with 'issues.only.comments' turned on and off.

This works towards #239.

Signed-off-by: Maximilian Löffler <[email protected]>
This works towards #239.

Signed-off-by: Maximilian Löffler <[email protected]>
For some issue-events the 'event.info.1' column is to be interpreted as
a reference to an issue. In this case it is more consistent to reformat
the column entry into the <issue-[issue.source]-[event.info.1]> format.

This works towards #239.

Signed-off-by: Maximilian Löffler <[email protected]>
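
The reformatting described in the commit above can be illustrated with a short sketch. It is written in Python only for illustration (the project itself is written in R), and the function name, the dictionary keys, and the set of events treated as issue references are assumptions, not the project's actual code.

```python
# Hypothetical sketch: rewrite 'event.info.1' into the
# <issue-[issue.source]-[event.info.1]> format for reference-like events.

# Assumption: these are the events whose 'event.info.1' refers to an issue.
REFERENCE_EVENTS = {"add_link", "referenced_by"}


def format_issue_reference(issue_source, event_info_1):
    """Build the issue identifier string from source and raw reference."""
    return f"<issue-{issue_source}-{event_info_1}>"


def normalize_event(event):
    """Return a copy of the event with 'event.info.1' reformatted if needed."""
    normalized = dict(event)
    if event["event.name"] in REFERENCE_EVENTS:
        normalized["event.info.1"] = format_issue_reference(
            event["issue.source"], event["event.info.1"]
        )
    return normalized
```

Events that are not in the assumed reference set pass through unchanged, so the reformatting only touches entries that actually point at another issue.
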
To be consistent with bipartite networks it is necessary to rename the
vertex attribute 'IssueEvent' to 'Issue' in multi-networks.

This works towards #239.

Signed-off-by: Maximilian Löffler <[email protected]>
Modify 'split.data.time.based' to be able to split by activity-based bins.
Rename the function to 'split.data.by.time.or.bins'. Introduce wrapper
functions 'split.data.by.bins.vector' and 'split.data.time.based' to call
'split.data.by.time.or.bins'.

Add 'include.duplicate.ids' parameter in 'split.get.bins.activity.based'
to obtain bins covering all data elements from 'df' by which the split
is being performed, regardless of whether the elements' IDs are unique.

In 'split.data.activity.based', after calculating the bins to place data
elements into, replace the time-based splitting by
'split.data.by.bins.vector'. Time-based splitting is incorrect for the
case that the date of the last element in a bin is the same as the date
of the first element of the next bin.

Adjust calculation of 'offset.end' in 'split.data.activity.based' to fix
a bug in which, because of a short last window, the end offset would cross
the border of the last window and overlap into the second-last one. Because
of this overlap, the last sliding windows would not be calculated as expected.

This works towards #239.

Signed-off-by: Maximilian Löffler <[email protected]>
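
The problem with time-based splitting that this commit mentions (the last element of one bin sharing its date with the first element of the next bin) can be reproduced with a small sketch. It is written in Python purely for illustration; the function names and the data layout are assumptions and do not mirror the project's R functions.

```python
from datetime import datetime


def split_by_time(items, boundaries):
    """Assign each (id, date) item to the half-open range [b[i], b[i+1])."""
    ranges = [[] for _ in range(len(boundaries) - 1)]
    for item_id, date in items:
        for i in range(len(boundaries) - 1):
            if boundaries[i] <= date < boundaries[i + 1]:
                ranges[i].append(item_id)
                break
    return ranges


def split_by_bins_vector(items, bin_vector, n_bins):
    """Assign each item to the bin index precomputed for it."""
    ranges = [[] for _ in range(n_bins)]
    for (item_id, _), bin_index in zip(items, bin_vector):
        ranges[bin_index].append(item_id)
    return ranges


# Four activities; "b" and "c" share a timestamp but belong to different
# activity-based bins of two elements each.
items = [
    ("a", datetime(2024, 1, 1, 10, 0, 0)),
    ("b", datetime(2024, 1, 1, 11, 0, 0)),
    ("c", datetime(2024, 1, 1, 11, 0, 0)),
    ("d", datetime(2024, 1, 1, 12, 0, 0)),
]
bin_vector = [0, 0, 1, 1]
# Boundaries derived from the first element of each bin, plus an end date.
boundaries = [
    datetime(2024, 1, 1, 10, 0, 0),
    datetime(2024, 1, 1, 11, 0, 0),
    datetime(2024, 1, 1, 12, 0, 1),
]

time_based = split_by_time(items, boundaries)
bins_based = split_by_bins_vector(items, bin_vector, 2)
```

With this data, the time-based split moves "b" into the second range because its date equals the second boundary, while the bins-vector split keeps the intended two elements per bin.
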
This works towards #239.

Signed-off-by: Maximilian Löffler <[email protected]>
This works towards #239.

Signed-off-by: Maximilian Löffler <[email protected]>
Instead of creating only undirected issue-based artifact-networks, we now
take the directedness information out of the network config. Edges are
already created in a way that they can be interpreted as directed
edges from the issue referencing to the referenced issue.

Replace for loop in edge creation by more efficient mclapply and fix
some minor formatting inconsistencies.

This works towards #239.

Signed-off-by: Maximilian Löffler <[email protected]>
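
A rough sketch of how such edges could be derived is shown below. It is written in Python for illustration only. The event layout, the interpretation of the two event types, and the de-duplication of the two events that describe the same link are all assumptions, not a description of the project's implementation.

```python
def build_issue_edges(events, directed=True):
    """Build (referencing, referenced) edges from hypothetical issue events.

    Assumptions: an 'add_link' event on issue X with target Y means that X
    references Y; a 'referenced_by' event on issue X with target Y means
    that Y references X. Both events describe the same link, so duplicates
    are dropped.
    """
    edges = []
    seen = set()
    for event in events:
        if event["event"] == "add_link":
            edge = (event["issue"], event["target"])
        elif event["event"] == "referenced_by":
            edge = (event["target"], event["issue"])
        else:
            continue  # other events do not create edges in this sketch
        if not directed:
            # In the undirected case the order of the endpoints is irrelevant.
            edge = tuple(sorted(edge))
        if edge not in seen:
            seen.add(edge)
            edges.append(edge)
    return edges
```

Taking `directed` as a parameter mirrors the idea of reading the directedness from a configuration instead of hard-coding undirected networks.
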
Rework the algorithm to create sliding windows in activity-based splitting.
Instead of cutting off half a range's worth of elements at the end before
building sliding windows (which creates a lot of edge cases), build sliding
windows with every element up to the last one. Then remove the last
incomplete range. The contents of the last incomplete range are fully
included in the second-last range and are therefore redundant.
Sometimes the last incomplete range is a regular range. Previously, the last
range always had to be a regular range. This means that removing the last
incomplete range requires updating the tests.

Additionally fix and improve documentation of splitting methods and fix
minor spelling bugs.

This works towards #239.

Signed-off-by: Maximilian Löffler <[email protected]>
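
The reworked approach can be sketched as follows. This is a simplified Python illustration, not the project's R code: it assumes that windows overlap by exactly half the window size and that the data is a plain list of elements.

```python
def sliding_windows(items, window_size):
    """Build half-overlapping windows over all items, then drop the
    trailing window, which is contained in its predecessor.

    Assumption of this sketch: 'window_size' is even, and consecutive
    windows start 'window_size // 2' elements apart.
    """
    if window_size < 2 or window_size % 2 != 0:
        raise ValueError("window_size must be an even number >= 2")
    step = window_size // 2
    # Start a window at every step position, up to the last element.
    windows = [
        items[start:start + window_size]
        for start in range(0, len(items), step)
    ]
    # The final window starts less than one step before the end of the data,
    # so the previous window already covers all of its elements.
    if len(windows) > 1:
        windows.pop()
    return windows
```

Because every window is built the same way and only the redundant trailing one is removed, no special handling is needed for a short last window.
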
This renaming makes sense as the method only splits dataframes. Rename
'split.data.by.bins.vector' to 'split.data.by.bins' as it is more readable
and easier to understand.

This works towards #239.

Signed-off-by: Maximilian Löffler <[email protected]>
Check type of the 'bins' parameter and its components in the wrapper
functions 'split.data.by.time' and 'split.data.by.bins'. Move
'split.data.by.time.or.bins' into a new category for internal helper
functions to discourage direct invocation.

Signed-off-by: Maximilian Löffler <[email protected]>
Check if the 'bins' parameter of 'split.data.by.bins' actually
contains a 'bins' component. Use 'get.date.from.string' instead of
accessing 'lubridate::ymd_hms' directly, to encapsulate date conversion.
Allow 'vector' component of 'bins' to be of any subclass of
'numeric' instead of explicitly 'numeric'.

Disallow lists that contain elements that are not representing a date
in 'split.data.time.based' as they do not comply with the expected
format of bins for 'split.data.by.time.or.bins'.

Signed-off-by: Maximilian Löffler <[email protected]>
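
The kind of checks described in this commit and the previous one might look like the sketch below. It is a Python illustration under several assumptions: 'bins' is modeled as a dictionary with a 'bins' component (date strings) and a 'vector' component (numbers), and the date format string is chosen for the example only.

```python
from datetime import datetime
from numbers import Real

# Assumed date format for this sketch.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def validate_bins(bins):
    """Raise an error if 'bins' does not have the expected structure."""
    if not isinstance(bins, dict):
        raise TypeError("'bins' must be a mapping")
    for component in ("bins", "vector"):
        if component not in bins:
            raise ValueError(f"'bins' is missing the '{component}' component")
    for value in bins["bins"]:
        # strptime raises ValueError for strings that are not dates.
        datetime.strptime(value, DATE_FORMAT)
    for value in bins["vector"]:
        # Accept any real number type, not just one concrete class.
        if not isinstance(value, Real):
            raise TypeError("'vector' must contain only numeric values")
    return True
```

Checking against the abstract `Real` type rather than a concrete class corresponds to accepting any numeric subclass for the 'vector' component.
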
Add tests that call 'split.data.time.based' and 'split.data.by.bins'
with various malformed 'bins' parameters and expect failure.

Signed-off-by: Maximilian Löffler <[email protected]>
Reverse order of reference from Jira issue 332 and Jira issue 328, since
previously, Jira issue 332 was referenced before its creation.
Copy GitHub and Jira issue data from 'test_feature/feature' also to
'test_proximity/proximity' to keep the data consistent.

Adjust all affected tests to comply with the changed testing data.

Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
Also use backticks instead of single ticks for proper markdown highlighting.
Improve changelog messages by clarifying and properly focusing on the
important changes as well as adding more relevant commit hashes.

Signed-off-by: Maximilian Löffler <[email protected]>
Introduce issue-based artifact-network creation, new tests and fix current tests + minor bug fixes

Reviewed-by: Thomas Bock <[email protected]>
Reviewed-by: Christian Hechtl <[email protected]>
For testing purposes

Signed-off by: Leo Sendelbach <[email protected]>
maxloeffler and others added 27 commits March 7, 2024 18:00
As the 'relation' edge-attribute can only be singular values if
'simplify.multiple.relations' is not set, we can set the previously
introduced lambda as default.

This works towards #251.

Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
* Enable 'simplify.multiple.relations' config parameter to work properly.
Previously, networks would be simplified before merging datasources, i.e.,
before edges could even have multiple relations
* Circumvent undocumented behavior in 'plyr::rbind.fill' that occurs when
merging already simplified networks (see PR#255)

Signed-off-by: Maximilian Löffler <[email protected]>
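
Why the order of merging and simplifying matters can be seen in a small sketch. It is written in Python for illustration; the edge representation and the function name are assumptions, and the networks are treated as undirected for simplicity.

```python
def simplify_edges(edges, simplify_multiple_relations=False):
    """Collapse parallel edges given as (source, target, relation) tuples.

    If 'simplify_multiple_relations' is False, only edges of the same
    relation are collapsed; otherwise all edges between the same pair of
    vertices become one edge that carries every relation.
    """
    grouped = {}
    for source, target, relation in edges:
        pair = tuple(sorted((source, target)))  # undirected in this sketch
        key = pair if simplify_multiple_relations else pair + (relation,)
        grouped.setdefault(key, set()).add(relation)
    return [(key[0], key[1], sorted(rels)) for key, rels in grouped.items()]


mail = [("A", "B", "mail"), ("B", "A", "mail")]
cochange = [("A", "B", "cochange")]

# Merge the data sources first, then simplify: one edge with both relations.
merged_then_simplified = simplify_edges(mail + cochange, True)

# Simplify each source first, then merge: the relations stay on separate edges.
simplified_then_merged = simplify_edges(mail, True) + simplify_edges(cochange, True)
```

Only the first order produces a single edge with multiple relations, which is the situation the config parameter is meant to handle.
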
Signed-off-by: Maximilian Löffler <[email protected]>
Concatenate all relations of an edge into a single string that determines
the edge's color, since ggplot2 does not accept lists as an identifier.

Signed-off-by: Maximilian Löffler <[email protected]>
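
As a minimal illustration of the idea (in Python, not the project's R plotting code), the relations of an edge can be joined into one string that serves as the color key. The separator and the sorting are assumptions made for this sketch.

```python
def relation_color_key(relations):
    """Join the relations of one edge into a single, order-independent string."""
    return " ".join(sorted(relations))
```

Sorting ensures that edges with the same set of relations map to the same key regardless of the order in which the relations were collected.
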
Signed-off-by: Maximilian Löffler <[email protected]>
Work on the simplification of multi-relation edges in networks

Reviewed-by: Thomas Bock <[email protected]>
Occasionally, the 'bins' attribute did not include the end-date
of the second-last range, as described in Issue #256.

In addition, fix one faulty test.

Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
As these networks are already simplified, further simplifications
are not necessary and might even lead to unexpected effects.

Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
Fix a bug in the reconstruction of the 'bins' network-attribute from given ranges

Reviewed-by: Thomas Bock <[email protected]>
…ed.by.timestamps`

If no custom event timestamps are available in the ProjectData object, an
error is thrown in `split.data.time.based.by.timestamps`, to avoid passing
`NULL` to the called splitting functions.

Signed-off-by: Thomas Bock <[email protected]>
To avoid running into various errors when calculating scale-freeness
for an empty network, check if the network is empty before calculating
scale-freeness and return `NA` right away if the network is empty.

Signed-off-by: Thomas Bock <[email protected]>
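
The guard described in this commit follows a common pattern, sketched here in Python. The function signature, the use of a degree list, the definition of "empty" as "no vertices", and the fitting callback are all assumptions for illustration; NaN stands in for R's `NA`.

```python
import math


def scale_freeness(degrees, fit_power_law):
    """Return the fit result, or NaN right away for an empty network.

    'degrees' is the list of vertex degrees; 'fit_power_law' is a
    caller-supplied (hypothetical) fitting function.
    """
    if not degrees:
        # Nothing to fit; avoid calling the fitting routine at all.
        return math.nan
    return fit_power_law(degrees)
```

Returning early keeps the fitting routine from ever seeing empty input, so callers only need to handle one well-defined missing value.
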
In commit e72eff8, the function
`get.expanded.adjacency.matrices` has been refactored. Unfortunately,
the author names have only been extracted from each network individually
instead of globally over all networks. This way, the functionality of
the function has been changed inadvertently. With this commit, we
follow up on this and restore the previous behavior.

As `get.author.names.from.networks(networks)` now always returns a list,
accessing the first element is necessary (although this was not
necessary in previous implementations).

Moreover, this commit also fixes a wrong test with respect to
`get.expanded.adjacency.matrices`.

Signed-off-by: Thomas Bock <[email protected]>
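
The difference between extracting author names per network and globally can be shown with a short sketch. It is written in Python for illustration; the edge-list representation and the function name are assumptions and do not reflect the project's R implementation.

```python
def expanded_adjacency_matrices(networks):
    """Build one adjacency matrix per network over a shared author set.

    'networks' is a list of undirected edge lists with (author, author)
    tuples. The author names are collected globally over all networks, so
    every matrix has the same dimensions and the same row/column order.
    """
    authors = sorted({name for edges in networks for edge in edges for name in edge})
    index = {name: position for position, name in enumerate(authors)}
    size = len(authors)
    matrices = []
    for edges in networks:
        matrix = [[0] * size for _ in range(size)]
        for source, target in edges:
            matrix[index[source]][index[target]] += 1
            if source != target:
                matrix[index[target]][index[source]] += 1
        matrices.append(matrix)
    return authors, matrices
```

Authors who are absent from a given network still get an all-zero row and column, which makes the matrices directly comparable across networks.
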
To stay consistent with the naming scheme of our test files,
rename 'test-util-networks-misc.R' to 'test-networks-misc.R'.

Signed-off-by: Thomas Bock <[email protected]>
In several of our source-code files, our coding conventions
have been violated. With this commit, we fix the most obvious violations:

- semicolon at the end of a statement
- missing space between "if" and the subsequent "("
- superfluous space between "return" and "("
- inconsistent indentation

Signed-off-by: Thomas Bock <[email protected]>
The bug in igraph's `disjoint_union` regarding unintended type
conversions is still present in the current igraph version 2.0.3.
Therefore, we update the corresponding comment in our source code.

Signed-off-by: Thomas Bock <[email protected]>
Signed-off-by: Thomas Bock <[email protected]>
Various minor fixes

Reviewed-by: Christian Hechtl <[email protected]>
Signed-off-by: Thomas Bock <[email protected]>
Signed-off-by: Christian Hechtl <[email protected]>
@bockthom bockthom added this to the v4.4 milestone Apr 16, 2024
@bockthom bockthom merged commit 24005e4 into master Apr 16, 2024
12 checks passed
4 participants