Version 4.4 #259
Merged
Introduce edge construction into the 'get.artifact.network.issue' function. Connect 'add_link' and 'referenced_by' issue-events by an edge. This works towards #239. Signed-off-by: Maximilian Löffler <[email protected]>
Add new 'add_link' and 'referenced_by' issue events to the testing data to allow for new tests. Add a test for the construction of an issue-based artifact-network with 'issues.only.comments' turned on and off. This works towards #239. Signed-off-by: Maximilian Löffler <[email protected]>
This works towards #239. Signed-off-by: Maximilian Löffler <[email protected]>
For some issue-events the 'event.info.1' column is to be interpreted as a reference to an issue. In this case it is more consistent to reformat the column entry into the <issue-[issue.source]-[event.info.1]> format. This works towards #239. Signed-off-by: Maximilian Löffler <[email protected]>
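The reformatting described in this commit message can be illustrated with a minimal Python sketch. It is not the project's R implementation: the function names, the `event.name` column, and the set of event types that carry an issue reference are assumptions made for illustration only.

```python
# Hypothetical helper names; only the target format is taken from the commit message.
REFERENCING_EVENTS = {"add_link", "referenced_by"}  # assumption: events whose info field is an issue id


def format_issue_reference(issue_source: str, issue_id: str) -> str:
    """Build a reference in the <issue-[issue.source]-[event.info.1]> format."""
    return f"<issue-{issue_source}-{issue_id}>"


def reformat_event_info(event: dict) -> dict:
    """Return a copy of `event` whose 'event.info.1' entry is reformatted
    if the event references another issue; other events are returned unchanged."""
    if event["event.name"] not in REFERENCING_EVENTS:
        return event
    reformatted = dict(event)  # do not mutate the caller's data
    reformatted["event.info.1"] = format_issue_reference(
        event["issue.source"], event["event.info.1"]
    )
    return reformatted
```

Keeping the formatting in one small function means every place that compares issue ids against `event.info.1` sees the same representation.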
To be consistent with bipartite networks it is necessary to rename the vertex attribute 'IssueEvent' to 'Issue' in multi-networks. This works towards #239. Signed-off-by: Maximilian Löffler <[email protected]>
Modify 'split.data.time.based' to be able to split by activity-based bins. Rename the function to 'split.data.by.time.or.bins'. Introduce wrapper functions 'split.data.by.bins.vector' and 'split.data.time.based' to call 'split.data.by.time.or.bins'. Add an 'include.duplicate.ids' parameter to 'split.get.bins.activity.based' to obtain bins covering all data elements from 'df' by which the split is being performed, regardless of the uniqueness of the elements' ids. In 'split.data.activity.based', after calculating the bins to place data elements into, replace the time-based splitting by 'split.data.by.bins.vector'. Time-based splitting is incorrect when the date of the last element in a bin is the same as the date of the first element of the next bin. Adjust the calculation of 'offset.end' in 'split.data.activity.based' to fix a bug where, because of a short last window, the end offset would cross the border of the last window and overlap into the second-last window. Because of this overlap, the last sliding windows would not be calculated as expected. This works towards #239. Signed-off-by: Maximilian Löffler <[email protected]>
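The failure mode of time-based splitting at bin borders can be reproduced with a small Python sketch. This is a simplified model under stated assumptions, not the project's R code: the function names are hypothetical, bins hold a fixed number of elements, and time ranges are half-open.

```python
from datetime import datetime


def activity_bins(elements, size):
    """Assign each element (in order) a bin index so that every bin holds `size` elements."""
    return [i // size for i in range(len(elements))]


def split_by_bins_vector(elements, bins_vector):
    """Group elements by a precomputed per-element bin index."""
    groups = {}
    for element, bin_index in zip(elements, bins_vector):
        groups.setdefault(bin_index, []).append(element)
    return [groups[key] for key in sorted(groups)]


def split_time_based(elements, dates, borders):
    """Group elements into half-open time ranges [borders[i], borders[i + 1])."""
    groups = [[] for _ in range(len(borders) - 1)]
    for element, date in zip(elements, dates):
        for i in range(len(borders) - 1):
            if borders[i] <= date < borders[i + 1]:
                groups[i].append(element)
                break
    return groups


# "b" and "c" share a timestamp, but belong to different activity-based bins.
dates = [datetime(2020, 1, 1), datetime(2020, 1, 2), datetime(2020, 1, 2), datetime(2020, 1, 3)]
elements = ["a", "b", "c", "d"]

by_vector = split_by_bins_vector(elements, activity_bins(elements, 2))
# Borders derived from the first date of each bin, plus an end border.
by_time = split_time_based(elements, dates, [dates[0], dates[2], datetime(2020, 1, 4)])

print(by_vector)  # [['a', 'b'], ['c', 'd']]
print(by_time)    # [['a'], ['b', 'c', 'd']]
```

Splitting by the bin vector keeps the intended two elements per bin, whereas the time-based split moves "b" into the second bin because it shares its timestamp with "c".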
This works towards #239. Signed-off-by: Maximilian Löffler <[email protected]>
This works towards #239. Signed-off-by: Maximilian Löffler <[email protected]>
Instead of creating only undirected issue-based artifact-networks, we now take the directedness information out of the network config. Edges are already created in a way that they can be interpreted as directed edges from the issue referencing to the referenced issue. Replace for loop in edge creation by more efficient mclapply and fix some minor formatting inconsistencies. This works towards #239. Signed-off-by: Maximilian Löffler <[email protected]>
Rework the algorithm to create sliding windows in activity-based splitting. Instead of cutting off as many elements as fit into half a range at the end before building sliding windows (which creates a lot of edge cases), build sliding windows with every element up to the last one. Then remove the last incomplete range. The contents of the last incomplete range are fully included in the second-last range and are therefore redundant. Sometimes the last incomplete range is a regular range, whereas previously the last range always had to be a regular range. This means that removing the last incomplete range requires updating the tests. Additionally, fix and improve the documentation of the splitting methods and fix minor spelling mistakes. This works towards #239. Signed-off-by: Maximilian Löffler <[email protected]>
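The reworked procedure (build windows over all elements, then drop the redundant last one) can be sketched in Python as follows. This is a simplified illustration, assuming windows of a fixed element count that are shifted by half a window; the function name is hypothetical and the real R implementation handles more cases.

```python
def sliding_windows(elements, window_size):
    """Build overlapping windows of up to `window_size` elements, shifted by half a window,
    and drop the last window because it is contained in the one before it."""
    step = window_size // 2
    if step < 1:
        raise ValueError("window_size must be at least 2")

    # Build a window at every half-window offset, up to the last element.
    windows = [elements[start:start + window_size] for start in range(0, len(elements), step)]

    # The last window holds at most `step` elements, all of which are also part
    # of the second-last window, so it carries no additional information.
    if len(windows) > 1:
        windows.pop()
    return windows


print(sliding_windows(list(range(9)), 4))
# [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8]]
```

Note that the remaining last window may itself be shorter than `window_size`, but it is kept because it is not contained in its predecessor.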
This renaming makes sense as the method only splits dataframes. Rename 'split.data.by.bins.vector' to 'split.data.by.bins' as it is more readable and easier to understand. This works towards #239. Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
Check type of the 'bins' parameter and its components in the wrapper functions 'split.data.by.time' and 'split.data.by.bins'. Move 'split.data.by.time.or.bins' into a new category for internal helper functions to discourage direct invocation. Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
Check if the 'bins' parameter of the 'split.data.by.bins' actually contains 'bins' component. Use 'get.date.from.string' instead of accessing 'lubridate::ymd_hms' directly, to encapsulate date conversion. Allow 'vector' component of 'bins' to be of any subclass of 'numeric' instead of explicitly 'numeric'. Disallow lists that contain elements that are not representing a date in 'split.data.time.based' as they do not comply with the expected format of bins for 'split.data.by.time.or.bins'. Signed-off-by: Maximilian Löffler <[email protected]>
Add tests that call 'split.data.time.based' and 'split.data.by.bins' with various malformed 'bins' parameters and expect failure. Signed-off-by: Maximilian Löffler <[email protected]>
Reverse the order of reference between Jira issue 332 and Jira issue 328, since Jira issue 332 was previously referenced before its creation. Also copy GitHub and Jira issue data from 'test_feature/feature' to 'test_proximity/proximity' to keep the data consistent. Adjust all affected tests to comply with the changed testing data. Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
Also use backticks instead of single ticks for proper markdown highlighting. Improve changelog messages by clarifying and properly focusing on the important changes as well as adding more relevant commit hashes. Signed-off-by: Maximilian Löffler <[email protected]>
Introduce issue-based artifact-network creation, new tests and fix current tests + minor bug fixes Reviewed-by: Thomas Bock <[email protected]> Reviewed-by: Christian Hechtl <[email protected]>
For testing purposes Signed-off by: Leo Sendelbach <[email protected]>
Signed-off by: Leo Sendelbach <[email protected]>
Signed-off by: Leo Sendelbach <[email protected]>
Signed-off by: Leo Sendelbach <[email protected]>
Signed-off by: Leo Sendelbach <[email protected]>
Signed-off by: Leo Sendelbach <[email protected]>
Signed-off by: Leo Sendelbach <[email protected]>
Signed-off by: Leo Sendelbach <[email protected]>
Signed-off by: Leo Sendelbach <[email protected]>
As the 'relation' edge-attribute can only be singular values if 'simplify.multiple.relations' is not set, we can set the previously introduced lambda as default. This works towards #251. Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
* Enable 'simplify.multiple.relations' config parameter to work properly. Previously, networks would be simplified before merging datasources, i.e., before edges could even have multiple relations
* Circumvent undocumented behavior in 'plyr::rbind.fill' that occurs when merging already simplified networks (see PR#255)
Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
Concatenate all relations of an edge into a single string by which the edge's color is determined, since ggplot2 does not accept lists as an identifier. Signed-off-by: Maximilian Löffler <[email protected]>
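The idea of turning a list of relations into a single identifier can be sketched in Python. The function name, the separator, and the sorting of relations are assumptions for illustration; the commit message only states that the relations are concatenated into one string.

```python
def relation_label(relations, separator=" "):
    """Collapse all relations of an edge into one string usable as a color key.

    Sorting (an assumption of this sketch) makes the label independent of the
    order in which the relations were attached to the edge.
    """
    return separator.join(sorted(set(relations)))


edges = [
    {"from": "A", "to": "B", "relation": ["mail", "cochange"]},
    {"from": "B", "to": "C", "relation": ["cochange"]},
]
labels = [relation_label(edge["relation"]) for edge in edges]
print(labels)  # ['cochange mail', 'cochange']
```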
Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
Work on the simplification of multi-relation edges in networks Reviewed-by: Thomas Bock <[email protected]>
Occasionally, the 'bins' attribute did not include the end-date of the second-last range, as described in Issue #256. In addition, fix one faulty test. Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
As these networks are already simplified, further simplifications are not necessary and might even lead to unexpected effects. Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
Fix a bug in the reconstruction of the 'bins' network-attribute from given ranges Reviewed-by: Thomas Bock <[email protected]>
…ed.by.timestamps` If no custom event timestamps are available in the ProjectData object, an error is thrown in `split.data.time.based.by.timestamps`, to avoid passing `NULL` to the called splitting functions. Signed-off-by: Thomas Bock <[email protected]>
To avoid running into various errors when calculating scale-freeness for an empty network, check if the network is empty before calculating scale-freeness and return `NA` right away if the network is empty. Signed-off-by: Thomas Bock <[email protected]>
In commit e72eff8, the function `get.expanded.adjacency.matrices` has been refactored. Unfortunately, the author names have only been extracted from each network individually instead of globally over all networks. This way, the functionality of the function has been changed inadvertently. With this commit, we follow up on this and restore the previous behavior. As `get.author.names.from.networks(networks)` now always returns a list, accessing the first element is necessary (although this was not necessary in previous implementations). Moreover, this commit also fixes a wrong test with respect to `get.expanded.adjacency.matrices`. Signed-off-by: Thomas Bock <[email protected]>
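The difference between per-network and global author extraction can be illustrated with a small Python sketch. It is a simplified model, not the project's R code: networks are represented as nested dictionaries of edge weights, and all function names are hypothetical.

```python
def author_names_global(networks):
    """Collect the author names of all networks into one sorted list."""
    names = set()
    for adjacency in networks:
        names.update(adjacency)
        for neighbors in adjacency.values():
            names.update(neighbors)
    return sorted(names)


def expand_adjacency(adjacency, authors):
    """Build a dense matrix over `authors`; authors missing from the network get zero rows and columns."""
    index = {author: position for position, author in enumerate(authors)}
    matrix = [[0] * len(authors) for _ in authors]
    for source, neighbors in adjacency.items():
        for target, weight in neighbors.items():
            matrix[index[source]][index[target]] = weight
    return matrix


def expanded_adjacency_matrices(networks):
    """Expand every network to the same, globally determined author set."""
    authors = author_names_global(networks)
    return [expand_adjacency(adjacency, authors) for adjacency in networks]


networks = [
    {"A": {"B": 1}, "B": {"A": 1}},
    {"A": {"C": 2}, "C": {"A": 2}},
]
for matrix in expanded_adjacency_matrices(networks):
    print(matrix)
# [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
# [[0, 0, 2], [0, 0, 0], [2, 0, 0]]
```

Because both matrices share the author list `["A", "B", "C"]`, they have identical dimensions and can be compared or stacked, which would not be the case if the names were taken from each network individually.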
To stay consistent with the naming scheme of our test files, rename 'test-util-networks-misc.R' to 'test-networks-misc.R'. Signed-off-by: Thomas Bock <[email protected]>
In various of our source-code files, our coding conventions have been violated. With this commit, we fix the most obvious violations:
- semicolon at the end of a statement
- missing space between "if" and the subsequent "("
- superfluous space between "return" and "("
- inconsistent indentation
Signed-off-by: Thomas Bock <[email protected]>
The bug in igraph's `disjoint_union` regarding unintended type conversions is still present in the current igraph version 2.0.3. Therefore, we update the corresponding comment in our source code. Signed-off-by: Thomas Bock <[email protected]>
Signed-off-by: Thomas Bock <[email protected]>
Various minor fixes Reviewed-by: Christian Hechtl <[email protected]>
Signed-off-by: Thomas Bock <[email protected]> Signed-off-by: Christian Hechtl <[email protected]>
4.4
Announcement
- Due to a bug in `igraph` (p-value missing for `fit_power_law()`, igraph/rigraph#1158), which is present in their versions 2.0.0 to 2.0.3, the functions `metrics.scale.freeness` and `metrics.is.scale.free` can currently not be used with these `igraph` versions. If you need to call either of these two functions, you need to either install `igraph` version 1.6.0 or wait until the bug is fixed in a future version of `igraph`.
Added
- Add `split.data.by.bins` function (not to be confused with a previously existing function that had the same name and was renamed in this context), which splits data based on given activity-based bins (PR #244, ece569c, ed5feb2)
- Add `get.bin.dates.from.ranges` function to convert date ranges into bins format (PR #249, a1842e9, 858b181)
- Add `assert.sparse.matrices.equal` function to compare two sparse matrices for equality for testing purposes (PR #248, 9784cdf, d9f1a8d)
- Add tests for functionality in `util-networks-misc.R` (#242, PR #248, PR #258, f3202a6, 030574b, 380b022, 8b803c5, 7335c3d, 6b600df, a53fab8, faf19fc)
Changed/Improved
- Check the `bins` parameter in `split.data.time.based` and `split.data.by.bins` (PR #244, ed0a530, 5e5ecba)
- […] `bins` attribute on network- and data-splits (PR #249, c064aff, 93051ab)
- […] `bins` attribute on every output network-split (while minimizing recalculations) (PR #249, #256, PR #257, a1842e9, 8695fbe)
- Rename `split.data.by.bins` into `split.dataframe.by.bins` as this is what it does (PR #244, ed5feb2)
- Throw an error in `split.data.time.based.by.timestamps` if no custom event timestamps are available in the ProjectData object (6305adc)
- Extend the testing data with `add_link` and `referenced_by` issue events, which connect issues to form edges in issue-based artifact-networks. This includes duplicate edge information in JIRA data as produced by codeface-extraction (PR #244, 9f840c0, ea4fe8d, 6eb7311)
- Check for empty networks in `metrics.scale.freeness` and `metrics.is.scale.free` and return `NA` if the network is empty (29418f2)
- […] `get.author.names.from.network` and `get.author.names.from.data` to always have the same output format. Now it doesn't depend on the `global` flag anymore (PR #248, d87d325, ddbfe68)
- […] `util-tensor.R` to correctly use the new output format of `get.author.names.from.network` (PR #248, 72b663e)
- […] `convert.adjacency.matrix.list.to.array` if the function is called with wrong parameters (PR #248, ece2d38, 1a3e510)
- Rename `compare.networks` to `assert.networks.equal` to better match the purpose of the function (PR #248, d9f1a8d)
Fixed
- Reformat the `event.info.1` column of issue data according to the <issue-%source-%id> format, if the content of the `event.info.1` field references another issue (PR #244, 62ff9d0)
- Rename the vertex attribute `IssueEvent` to `Issue` in multi-networks, to be consistent with bipartite-networks (PR #244, 26d7b7e)
- Fix a bug caused by using `split.data.time.based` inside `split.data.activity.based` to split data into the previously derived bins, which occurred when elements close to bin borders share the same timestamps. It is fixed by replacing `split.data.time.based` by `split.data.by.bins` (PR #244, ece569c)
- […] `metrics.smallworldness` (03e0688)
- Fix `get.expanded.adjacency` to work if the provided author list does not contain all authors from the network, and add a warning when that happens, since it causes some authors from the network to be lost in the resulting matrix (PR #248, ff59017)
- Fix `get.expanded.adjacency.matrices` to have correct names for the columns and rows (PR #248, PR #258, e72eff8, a53fab8)
- Fix `get.expanded.adjacency.cumulated` so that it works if the `weighted` parameter is set to `FALSE` (PR #248, 2fb9a5d)
- Make multi-network construction work with `igraph` version 2.0.1.1, which does not allow adding an empty list of vertices (PR #250, 5547896)