Find/render related nodes #22
Comments
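Hmm, it looks like this hasn't been specified yet and exists only in the viz example. There are two ways to do this:
1. do a regex match on the remaining issues that haven't been matched by the 'depends on: ...' syntax (e.g. '#9' and '#15' in #22 (comment))
2. vectorize (be it semantic or not) the content of each issue, then construct a similarity matrix. This could be used for issue dedup as well. An existing example I have seen is the related question in SO when posting for a new question (looks like it only matches the question's title instead of the bodies).
(1) can be added directly, but (2) requires either lunrjs, so that the code can still run on gh-pages, or full-blown server-side indexing/vectorization. |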
On Mon, Jan 09, 2017 at 05:46:17AM -0800, rht wrote:
> 1. do a regex match on the remaining issues that haven't been matched
> by the 'depends on: ...' syntax (e.g. '#9' and '#15' in
> #22 (comment))
I think we want a regexp looking for ‘related to: …’ syntax, because
that will let you declare relations consistently regardless of whether
the relative is on GitHub or not.
> 2. vectorize (be it semantic or not) the content of each issue, then
> construct a similarity matrix. This could be used for issue dedup
> as well. An existing example I have seen is the related question
> in SO when posting for a new question (looks like it only matches
> the question's title instead of the bodies).
I'd rather have these be explicitly declared (with ‘related to: …’),
since that avoids the need to define matching heuristics. And I'm not
sure how often related issues share a lot of similar strings.
“Related” is different from “duplicated”.
The reason I've put off related edges so far is that they're
undirected, so you'd either have to document them on each side (in
issue A: ‘related to: #B’, and in issue B ‘related to: #A’) or have a
way to discover backreferences. See #25 about the difficulties of
backreference discovery.
|
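The explicit approach discussed above can be sketched as follows. This is only an illustration, assuming every issue body is already in memory and that relations are written as a line such as `related to: #9, #15`; the function names and the sample bodies are made up for the example and are not taken from depviz. Declaring the relation on one side is enough here, because each edge is stored in both directions.

```python
import re
from collections import defaultdict

# Assumed syntax: a line such as "related to: #9, #15" in an issue body.
RELATED_LINE = re.compile(r"^related to:\s*(.+)$", re.IGNORECASE | re.MULTILINE)
ISSUE_REF = re.compile(r"#(\d+)")


def declared_relations(body):
    """Return the issue numbers named on 'related to:' lines of one body."""
    numbers = set()
    for line in RELATED_LINE.findall(body):
        numbers.update(int(n) for n in ISSUE_REF.findall(line))
    return numbers


def undirected_edges(bodies):
    """Build a symmetric adjacency map from {issue_number: body_text}."""
    graph = defaultdict(set)
    for number, body in bodies.items():
        for other in declared_relations(body):
            if other != number:
                graph[number].add(other)
                graph[other].add(number)  # the backreference
    return dict(graph)


# Made-up sample data: only issue 22 declares the relation.
bodies = {
    22: "Find/render related nodes\nrelated to: #9, #15",
    9: "Designed display",
    15: "A distinct idea",
}
edges = undirected_edges(bodies)
```

The sketch sidesteps the hard part of backreference discovery: it only works because all bodies are available up front, which is not the case when issues are fetched one at a time.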
For that purpose, an explicit syntax shouldn't be required. The regexp can be extended to parse GitLab, mailing-list thread, and Atlassian URLs.
Matching heuristics are useful for discovery, since a human annotator can't constantly comb through the issues (or recall every possibly related past issue) to find such relations.
Related issues should both refer to specific objects, variables, error messages, etc. The descriptions should be sufficiently regular; libraries already exist for detecting duplicated code.
This could be done incrementally. |
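As a rough illustration of the heuristic approach, the sketch below builds bag-of-words vectors for each issue and compares them with cosine similarity. It is an assumption-laden toy, not what depviz or lunrjs does: the tokenizer, the `0.5` threshold, the function names, and the sample texts are all invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word-like tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two Counter term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similarity_matrix(texts):
    """Map each pair (i, j) with i < j to its similarity score."""
    vectors = {n: Counter(tokenize(t)) for n, t in texts.items()}
    numbers = sorted(vectors)
    return {
        (i, j): cosine(vectors[i], vectors[j])
        for k, i in enumerate(numbers)
        for j in numbers[k + 1:]
    }


def candidates(texts, threshold=0.5):
    """Return the pairs whose score reaches the (arbitrary) threshold."""
    matrix = similarity_matrix(texts)
    return sorted(pair for pair, score in matrix.items() if score >= threshold)


# Made-up issue texts sharing an identifier and an error name.
texts = {
    1: "TypeError in render_graph when node is missing",
    2: "render_graph raises TypeError for missing node",
    3: "Update the README badge",
}
related = candidates(texts)
```

Issues 1 and 2 share the tokens `typeerror`, `render_graph`, `node`, and `missing`, so they clear the threshold, while issue 3 shares nothing with either. Real issue text is noisier, so a weighting scheme such as TF-IDF would likely be needed in practice.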
On Tue, Jan 10, 2017 at 02:19:29AM -0800, rht wrote:
> > I think we want a regexp looking for ‘related to: …’ syntax,
> > because that will let you declare relations consistently
> > regardless of whether the relative is on GitHub or not.
> For such purpose, an explicit syntax shouldn't be required. The
> regexp can be augmented to parse gitlab / mailing list thread /
> atlassian urls.
Fair enough, that makes the regexp more complicated, but it would be
workable. However…
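To give a feel for how the regexp grows, here is a sketch that recognizes issue URLs from several trackers. The URL shapes for GitHub, GitLab, and Jira Cloud are the commonly seen ones; self-hosted instances and mailing-list archives have no single URL shape and are left out. The `PATTERNS` table, `find_references`, and the `example` URLs are illustrative assumptions.

```python
import re

# One pattern per tracker; each captures the parts that identify the issue.
PATTERNS = [
    ("github", re.compile(
        r"https://github\.com/([\w.-]+)/([\w.-]+)/(?:issues|pull)/(\d+)")),
    ("gitlab", re.compile(
        r"https://gitlab\.com/([\w./-]+?)/(?:-/)?issues/(\d+)")),
    ("jira", re.compile(
        r"https://([\w-]+)\.atlassian\.net/browse/([A-Z][A-Z0-9]*-\d+)")),
]


def find_references(text):
    """Return (tracker, captured_groups) for every recognized URL.

    Results are grouped by tracker in PATTERNS order, then by position.
    """
    refs = []
    for tracker, pattern in PATTERNS:
        for match in pattern.finditer(text):
            refs.append((tracker, match.groups()))
    return refs


# Made-up URLs on placeholder projects.
text = (
    "See https://github.com/example/project/issues/22, "
    "https://gitlab.com/group/project/-/issues/7 and "
    "https://example.atlassian.net/browse/PROJ-7"
)
refs = find_references(text)
```

Each new tracker adds another pattern and another identifier format to normalize, which is the extra complexity mentioned above.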
> > I'd rather have these be explicitly declared (with ‘related to:
> > …’), since that avoids the need to define matching heuristics.
> The matching heuristics is useful for discovery since a human
> annotator wouldn't be able to constantly comb through the issues (or
> recall all possibly related past issues) for such.
Maybe a separate tool to apply this heuristic and suggest ‘related to:
…’ annotations for the annotator to consider? For example, [1] links
to #45, but the connection between #45 and #59 is mostly for
historical interest and not something where I think an edge in the
issue graph would help organize future work. Using ‘related to: …’
lets you curate your edges, and folks who are comfortable with an
automated heuristic can run:
$ your-heuristic-related-to-injector jbenet/depviz
(or whatever) to add them to their project.
[1]: #59 (comment)
|
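The separate suggestion tool proposed above might look something like this sketch. It assumes some heuristic has already produced candidate pairs and that the declared edges are known; the function name, the output wording, and the sample issue numbers are hypothetical. It only prints suggestions and leaves the decision to the annotator.

```python
def suggest_annotations(candidates, declared):
    """Suggest 'related to:' lines for candidate pairs not yet declared.

    candidates: iterable of (a, b) issue-number pairs from any heuristic.
    declared: set of frozenset({a, b}) edges that are already annotated.
    """
    suggestions = []
    seen = set()
    for a, b in candidates:
        edge = frozenset((a, b))
        # Skip self-references, declared edges, and repeated pairs.
        if a == b or edge in declared or edge in seen:
            continue
        seen.add(edge)
        low, high = sorted((a, b))
        suggestions.append(
            f"issue #{low}: consider adding 'related to: #{high}'"
        )
    return suggestions


# Made-up input: (2, 3) is already declared, (1, 2) appears twice.
declared = {frozenset((2, 3))}
candidate_pairs = [(1, 2), (2, 1), (2, 3)]
for line in suggest_annotations(candidate_pairs, declared):
    print(line)
```

Keeping the heuristic outside the graph builder means the edges stay curated: nothing is added until someone accepts a suggestion.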
We don't do this at the moment (there's a FIXME in the GitHub module). A screenshot of the intended display is in #9. There is a related but (I think?) distinct idea in #15.