Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Leverage metadata to populate gems urls #724

Closed
wants to merge 4 commits into from

Conversation

kbrock
Copy link
Contributor

@kbrock kbrock commented Oct 10, 2014

This parses common urls from the gemspec metadata. It uses the same keys as the api (found in gemspec.rb).

Addresses questions I've seen in mailing list and gh issues.

Addresses issues in #718

Thanks

/cc @qrush @fwolfst

UPDATE: reduced heading comment since the original concern was determining the keys to use. (this was resolved by using the ones in gemspec.rb

@PragTob
Copy link

PragTob commented Oct 11, 2014

Seems like a good addition to me :)

@fwolfst
Copy link

fwolfst commented Oct 11, 2014

Sweet. No additions (you asked in #718) from my side.

@tenderlove
Copy link
Contributor

@drbrain is this the right direction for adding this stuff?

@mpapis
Copy link

mpapis commented Oct 23, 2014

just to be clear 👍 :)

assert_equal "http://www.docs.com/documentation_url", @linkset.docs
assert_equal "http://www.mail.com/mailing_list_url", @linkset.mail
assert_equal "http://www.code.com/source_url", @linkset.code
assert_equal "http://www.bugs.com/issue_tracker", @linkset.bugs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the desired behavior, that the gemspec metadata will replace links set from the web gui? (i.e. do we care about the source of the linkset data?)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pushing gem with updated information should have higher priority then some old edits on site ... at least I would prefer this for my gems

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mpapis IMHO, the website data should always take precedence over gem metadata. User shouldn’t have to push a new gem version just to change the bugtracker URL. Users who never use the web UI will get the behavior you prefer. I think it would be very confusing and frustrating to make a change on the web UI, hit the “Save” button, and not see that change reflected on the website.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, then we have a conflicting situation here, maybe when the gem is pushed we should check the metadata for information and push it to database, this way the latest change (UI/metadata) will take precedence.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait I think this is what happens in this PR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m confused. There should be specs for each of these scenarios so we can be confident everything works (and continues to work) the way we expect:

  • No URL data in the database, new gem is pushed with metadata (expect: gem metadata is displayed for this gem)
  • URL data exists in the database, new gem is pushed with metadata (expect: ???)
  • No gem metadata exists, user adds URLs via web UI (expect: URLs are displayed for this gem)
  • Gem metadata exists, user updates URLs via web UI (expect: updated URLs are displayed for this gem)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the first two points are covered in the file we are commenting on, the second two do not have to be covered in this file, from what I remember from my rails days you should not test basic rails functionalities like update_attributes ... and here I found functional test for code url https://github.com/kbrock/rubygems.org/blob/c3994fd11ecab28ea9b1629298a12eeb0be6a5bb/test/functional/rubygems_controller_test.rb#L123 ... at least it looks like related ... but this is not part of this PR, it should be already there, @sferik so does it look good?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mpapis @sferik we could split this difference and store linksets from the GUI in different fields from the ones populated from the gemspec, then add a flag that says which link source to preferably surface if a link is configured in both places.

Then whatever the source, new data, even if blank, always overwrites old data. And the 'flag' is set separately.

@bf4
Copy link
Contributor

bf4 commented Oct 24, 2014

If accepted, there should be followup PR, per #718, to document these in the gemspec (or is that too tightly coupling the service to client?). Also, possibly write a small blog post?

@mpapis
Copy link

mpapis commented Oct 24, 2014

for me this should be documented in https://github.com/rubygems/rubygems and http://guides.rubygems.org/specification-reference/ should be regenerated

@kbrock
Copy link
Contributor Author

kbrock commented Oct 24, 2014

Thanks so much for your insight

This PR currently implements the following workflows:
(I'm listing this for discussion, not trying to state that I think this is 100% correct)

Base case

  1. The user uses the web ui to add urls
  2. The user pushes a gem with no urls in the gemspec.

End result: The urls are still in the ui and have not been changed.

Improves metadata in the gemspec

  1. The user uses the web ui to add urls
  2. The user pushes a new improved gem with urls in the gemspec.

End result: The urls have been updated from the gemspec. Non-specified urls have not been changed.

Fixup metadata via the ui

  1. The user pushes a new improved gem with bad urls in the gemspec.
  2. The user uses the web ui to fix the urls.
  3. The user pushes the gem still with bad urls in the gemspec.

End result: The gemspec in the gem is the canonical source. It overwrites any values populated in the web UI.


Thoughts: If a user does not add the urls to the gemspec, then the web is still the source of urls. Once the user actively adds these urls to the gemspec, they will continue to change these urls in that place. The web ui is a backup in case the user does not want to push again, but it is a temporary fix, and it is assumed that the changes are also made to the gemspec.

setup do
@spec = gem_specification_from_gem_fixture('test-0.0.0')
@linkset = create(:linkset,
:wiki => "http://www.wiki.com/wiki_url",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be http://wiki.example/wiki_url, http://docs.example/documentation_url, etc. Using real domain names is not a best practice.

@kbrock
Copy link
Contributor Author

kbrock commented Oct 25, 2014

@drbrain Thanks for the different website urls

  • I will create a repo for a test gem with a branch per version.
  • Do we need more version of gems containing metadata?
  • Should I create a branches for the other gems in the gem fixtures directory?
  • Anyone want different metadata keys? Would be easier to change now than later

version 1.0.10

  • gemspec no url metadata

version 1.0.20

version 1.0.30

@bf4
Copy link
Contributor

bf4 commented Oct 27, 2014

@kbrock I don't think you need to actually create any gems to sit in as test fixtures. It sounds from your comment above that you're actually building gems somewhere else?

:mail => spec.metadata && spec.metadata["mailing_list_url"],
:code => spec.metadata && spec.metadata["source_url"],
:bugs => spec.metadata && spec.metadata["issue_tracker"]
}.delete_if { |_n, v| v.nil? }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not do it all at once? (assuming we always want to over-write)

   self.update_attributes!({
      :home => spec.homepage,
      :wiki => spec.metadata && spec.metadata["wiki_url"],
      :docs => spec.metadata && spec.metadata["documentation_url"],
      :mail => spec.metadata && spec.metadata["mailing_list_url"],
      :code => spec.metadata && spec.metadata["source_url"],
      :bugs => spec.metadata && spec.metadata["issue_tracker"]
    }.reject{ |_, v| v.nil? }
 )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thnx

@kbrock
Copy link
Contributor Author

kbrock commented Oct 28, 2014

all, thanks for all the great pointers:

  • using a factory rather than an actual gem for metadata. (makes tests easier to read/write)
  • condensed update_attribute_from_gemspec
  • change wording on tests to be more explicit and closer to language from @sferik

@sferik / @bf4 / @mpapis - I feel that all metadata should come from the gemspec. It comes from there for every other field, including the homepage url. The edit linkset ui was necessary in the past. Now that there is a place in the gemspec to populate the linkset data, it may not even be necessary at all.

Since the data previously populated is staying in the tables, nothing is going to break or require change. Only developers who want to make their gemspec more rich will need to add these fields. They can even be part of PRs.

Many people populated those fields years ago in the web and don't even remember that the web ui exists. If there are special rules around those 4 fields, will developers get confused? They push a new gemspec and all but those 4 fields are updated.

Is the consensus option 3?

  1. remove web ui for linkset. (but keep the link values from previous edits)
  2. leave linkset model and web ui as is, allowing gemspec links to overwrite linkset values from web.
  3. Add 5 fields to linksets, keeping gemspec and web ui links separate. web version takes precedence. Maybe clear out web ui links when they match the gemspec version. Slowly migrating to the gemspec version.

@mpapis
Copy link

mpapis commented Oct 28, 2014

@kbrock missed point 4: add warning to the UI that now you can do the same with gemspec

@kbrock
Copy link
Contributor Author

kbrock commented Oct 28, 2014

@mpapis lol. added documentation everywhere EXCEPT for rubygems.

that warning is for option 2 in the linkset view?
If we go with 3, the wording may be tricky, but probably belongs in there as well.

You leaning towards any of the above options?
I coded up option 2, but probably lean towards 1.
If the team likes 3 then I can just do that.
I just want the values into there.

After that is determined, would be nice to review the actual names for the hash variables.

@mpapis
Copy link

mpapis commented Oct 28, 2014

well, 2+4 is simplest, I would not go for 1 as like someone explained earlier you might use it to fix simple metadata changes without pushing new gem version ... but hey, going with 1 might be the way - removing code has to be good ;)

@kbrock
Copy link
Contributor Author

kbrock commented Oct 28, 2014

I liked the idea of using the web ui to fix bad metadata until I realized that the homepage was not in the web ui. And then realized that no other meta-data was in there. So it seems odd that we have a ui to fix only some of the fields.

I'll get 4 into 2. Feels a little chicken/egg with regards to 4:
Need the documentation to be done before linking there.
Need this PR to finalize the names of the fields before creating the documentation.

Maybe 4 needs to be done after updating docs after 2 is done

)
end

should "add new link that has no existing value" do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should set a new link

:mail => spec.metadata && spec.metadata["mailing_list_url"],
:code => spec.metadata && spec.metadata["source_url"],
:bugs => spec.metadata && spec.metadata["issue_tracker"]
}.delete_if { |_n, v| v.nil? })
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a better implementation would be to make a link mapping (coding in the browser, caveat reader)

LINKSET_METADATA_MAP = {
  wiki: 'wiki_url',
  docs: 'documentation_url',
  mail: 'mailing_list_url',
  code: 'source_url',
  bugs: 'issue_tracker',
}
def ....
  update_attributes!(linkset_from_spec(spec))
end
def linkset_from_spec(spec)
  {
  home: spec.homepage,
  }.merge(
    Hash[
      LINKSET_METADATA_MAPPING.map { |attribute, metadata_key|
        # metadata will always be a hash `{}` unless we erroneously allow it to be something else in the setter
        if link = spec.metadata[metadata_key]
          [attribute, link]
        end
      }.compact
    ]
  )
end

I also would like some discussion on standardizing metadata keys and format before merge. I think I'd prefer a metadata key 'urls' that has all the mappings inside it, e.g.

{ urls: { docs: docs_url, wiki: wiki_url }}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well LINKSET_METADATA_MAP eliminates code duplication, but it also makes more code then it's worth it, I would extract spec.metadata && before it all like this:

def update_attributes_from_gem_specification!(spec)
  attributes =  { :home => spec.homepage }
  attributes.merge({
    :wiki => spec.metadata["wiki_url"],
    :docs => spec.metadata["documentation_url"],
    :mail => spec.metadata["mailing_list_url"],
    :code => spec.metadata["source_url"],
    :bugs => spec.metadata["issue_tracker"]
  }.delete_if { |_n, v| v.nil? }) if spec.metadata
  self.update_attributes!(attributes)
end

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't really thinking in terms of duplication, but in terms of extensibility.

  update_attributes! {
    :home => spec.homepage,
  }.merge({
    :wiki => spec.metadata["wiki_url"],
    :docs => spec.metadata["documentation_url"],
    :mail => spec.metadata["mailing_list_url"],
    :code => spec.metadata["source_url"],
    :bugs => spec.metadata["issue_tracker"],
  }).delete_if {|_, url| 
    # validating it's a url is sensible, no?
    begin
      not URI(url).absolute
    rescue URI::InvalidURIError, ArgumentError
      # don't just swallow the error. return it in the api somehow
      true
    end
  }

can we also discuss metadata keys we'll be promoting?

mapping wiki to wiki_url make sense. code to source_url.. I'd prefer code -> code_url, docs -> docs_url, mail -> mail_url, bugs -> bugs_url. Now, if we had a mapping, we could accept source_url or code_url to map to code. YMMV

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

URI("[email protected]:wayneeseguin/rvm.git").absolute => URI::InvalidURIError: bad URI(is not URI?): [email protected]:wayneeseguin/rvm.git

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue that is the correct behavior. It's really irritating the pull down gem spec data over the API and then do something with the homepage url, only to discover it's something like that. I mean, your example input really isn't a url...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(BTW, I am a drinking out of my RVM 2.0 water glass now 😄 )

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well it is a way of describing where code is, not all git/svn/cvs/hg has to have http representation

@kbrock
Copy link
Contributor Author

kbrock commented Oct 28, 2015

@arthurnn / @mpapis I updated to take into account the stored metadata. Does this look like it still meets our needs?

<%= link_to_page t("rubygems.show.links.#{link}"), @rubygem.linkset.public_send(link) %>
<%- end %>
<% end %>
<%- Linkset::LINKS.each do |link, attribute| %>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Big fan of this, literally just suggested this in the original commit today. 👍

@kbrock
Copy link
Contributor Author

kbrock commented Jul 13, 2016

@arthurnn did you get a chance to look at #1234? do you find that code simpler / better?
thanks

@homu
Copy link
Contributor

homu commented Nov 6, 2016

☔ The latest upstream changes (presumably #1423) made this pull request unmergeable. Please resolve the merge conflicts.

@homu
Copy link
Contributor

homu commented Dec 28, 2016

☔ The latest upstream changes (presumably #945) made this pull request unmergeable. Please resolve the merge conflicts.

@kbrock
Copy link
Contributor Author

kbrock commented Dec 30, 2016

closing, preferring #1234
not worth the constant rebasing

@kbrock kbrock closed this Dec 30, 2016
@kbrock kbrock deleted the rubygem_fields branch December 30, 2016 14:44
homu added a commit that referenced this pull request Jan 6, 2017
Leverage metadata to populate gems urls v2

## High Level Goals
1. The app knows too much about the url generation process. Provide a decorator `versioned_links` to simplify this logic.
2. More importantly: All metadata for a gem comes from the `gemspec` file except for urls. Provide a mechanism to put urls into the `gemspec`.

goal 1 makes the process of fetching the urls out of gemspec trivial - 2 lines of change. If we are unsure about #2, then can we at least focus/agree on the first goal?
## Commentary

This is an alternative to #724. It is in response to comment from @qrush [[here]](#1127 (comment)) For implementing #1127 - take urls from `rubygems.metadata` and retire `linkset`.

@arthurnn ~~I thought it would be simpler to keep this separate from #724. Let me know if you want me to add metadata (and/or add change log) support directly to this PR.~~ I added in the metadata as the second commit
## before:

Links on the pages are generated from `Links` and `Version` objects. This is needed because some urls have `version.number`.
This logic finds it's way into rss, views, helpers, and models. Mostly in `rubygems.rb`, `rubygems_helper.rb`, and `links.rb`.
## after:

A decorator `LinkedVersion` is introduced to simplify the relationship between these objects. Since rss uses this as well, the decorator is kept in `app/model/`.
## future:

Once `rubygems.metadata` is used by most (all?) gems, drop linkset. (many years from now)

Thanks,
Keenan

**UPDATED:** hope this is more readable
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.