handicap common words?: not just one but two links to "interpretation" #44

Open
holtzermann17 opened this issue May 20, 2013 · 1 comment

@holtzermann17
Collaborator

http://planetmath.org/node/87431#comment-19465

... This makes me think that we should give words a negative relevance score in NNexus based on how common they are in non-mathematical texts. Here is a list of the 5000 most common words. Some of the words on the list:

  • group
  • interpretation
  • instance
  • structure
  • field
  • contain
  • concept
  • collection
  • ...

I'm not saying that we should never link these words, but maybe we could give them a "handicap" that could be overturned by other evidence (e.g. MSC data).
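
As a rough illustration of the handicap idea, here is a minimal Python sketch. It is not NNexus code: the frequency ranks, the log-scaled penalty, the `msc_overlap` evidence term, and the threshold are all assumptions made up for the example.

```python
import math

# Hypothetical ranks in a general-English frequency list (1 = most common).
# These numbers are invented for illustration, not taken from a real list.
COMMON_WORD_RANK = {"group": 200, "field": 600, "interpretation": 2500}
LIST_SIZE = 5000      # size of the assumed frequency list
MAX_HANDICAP = 1.0    # penalty applied to the single most common word


def handicap(word):
    """Return a penalty in [0, MAX_HANDICAP]; 0 for words not on the list."""
    rank = COMMON_WORD_RANK.get(word.lower())
    if rank is None:
        return 0.0
    # Log scaling: rank 1 gets MAX_HANDICAP, rank LIST_SIZE gets 0.
    return MAX_HANDICAP * (1.0 - math.log(rank) / math.log(LIST_SIZE))


def link_score(word, base_score, msc_overlap=0.0):
    """Base relevance minus the handicap, plus any supporting evidence."""
    return base_score - handicap(word) + msc_overlap


def should_link(word, base_score=0.5, msc_overlap=0.0, threshold=0.4):
    """Link only if the adjusted score still clears the threshold."""
    return link_score(word, base_score, msc_overlap) >= threshold


if __name__ == "__main__":
    print(should_link("group"))                   # common word, no evidence
    print(should_link("group", msc_overlap=1.0))  # MSC evidence overturns it
    print(should_link("homomorphism"))            # not on the list
```

The point of subtracting a penalty instead of blacklisting the word is that a common word like "group" can still be linked when other evidence, such as matching MSC categories, is strong enough.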

@dginev
Owner

dginev commented Jun 21, 2013

I will move this issue to the 3.0 milestone. For the June release I will only look into handicapping MSC categories that are too specific.

True, how common a word is in natural language is another good indication that it is unlikely to be a term. I will think about it a bit later in the summer.
