-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add chunk positions #92
Conversation
Thanks for taking this on @renatho . Assuming @syavorsky is ok with it, I'm wondering though if we could instead have the parsing fully preserve, by option, whitespace before and after each component. One could still calculate the counts, but the stringifier could also use this (e.g., after a targeted modification) to preserve whatever whitespace the user originally had, e.g,. whether tabs, spaces, etc. So, maybe something like this (using tab escapes for readability): whitespace: {
start: ' ',
tag: ' \t\t',
type: ' ',
name: ' ',
description: '\t',
end: '\t \t'
} e.g., for: /**
*\t\t@my-tag {my.type} name\tdescription\t \t
*/ It'd also be great if we could have a way (as per #79) for multi-line descriptions to distinguish whether asterisks were missing or not: /**
* @some-tag {Type} name.subname.subsubname Singleline or
* multiline description text
*/ /**
* @some-tag {Type} name.subname.subsubname Singleline or
multiline description text
*/ (So that for users whose say ...and in a way where we knew whether there was leading whitespace on any given line before the line's asterisk. Finally, one other issue related to whitespace (though maybe better for us to resolve separately as it is not tag-related)... It doesn't seem that there is any indication about how many lines (if any) occur before the main (leading) description or after it (while the latter "after" info can be constructed when tags are present by looking at their line number, the presence of additional lines after a description isn't parseable when lines are not present). The current top-level |
@renatho this is totally good idea! I would just spend more time discussing how to make offsets capturing more consistent, and making sure we address concerns @brettz9 has brought up about stringifier. Given following:
I would try to address following:
I get something like this in mind: it('should report offsets, trim = false`', function () {
expect(parse(function () {
/**
* Overall description
*
* another line
*
* @param {Object} options
* @example default
this line has no marker
* doSomething(options)
* doSomething(options)
*/
}, {
trim: false
})).to.eql([{
description: '\nOverall description\n\n another line\n',
line: 1,
source: '\nOverall description\n\n another line\n\n@param {Object} options\n@example default\nthis line has no marker\ndoSomething(options)\ndoSomething(options)\n',
offsets: {
block: {start: 4, end: 4},
description: [
{start: 7, end: 7, callout: 4}, // line break right next to /**
{start: 7, end: 26, callout: 5}, // * Overall description
{start: 6, end: 6, callout: 5}, // *
{start: 7, end: 19, callout: 4}, // * another line
{start: 6, end: 6, callout: 5}, // *
],
},
tags: [{
description: '',
line: 6,
name: 'options',
optional: false,
source: '@param {Object} options',
tag: 'param',
type: 'Object',
offsets: {
tag: {start: 7, end: 13, callout: 5},
type: {start: 14, end: 22},
name: {start: 23, end: 30}
}
}, {
description: 'this line has no marker\ndoSomething(options)\ndoSomething(options)\n',
line: 7,
name: 'default',
optional: false,
source: '@example default\nthis line has no marker\ndoSomething(options)\ndoSomething(options)\n',
tag: 'example',
type: '',
offsets: {
tag: {start: 7, end: 15, callout: 5},
name: {start: 16, end: 23},
description: [
{start: 4, end: 27, callout: -1}, // this line has no marker
{start: 6, end: 26, callout: 4}, // * doSomething(options)
{start: 8, end: 28, callout: 6}, // * doSomething(options)
{start: 4, end: 4, callout: -1}, // final line break
]
}
}]
}])
}) Now, how this would look for the ones below? /** one line */ /** one line
*/ /** 组件返回的对象 */ and what it it all gonna look like for |
While \t/**
\t *
\t */ Good point re: As far as multi-byte characters, I don't think any special handling should be required within |
@brettz9 I see your point about keeping exact whitespaces. Would you model sample output for clarity, including multilane cases and oneliners? |
See the As far as an API to communicate the whitespace (and asterisks) within multi-line tag descriptions, a minimal solution might be to have a /**
\t * @tag name Some long
\t ** multiline
\t * description.
\t */ to: {
multilineAsterisks: true,
descriptionRaw: 'Some long\n* multiline\n description.\n'
} As far as the linebreaks before and after descriptions or at the end: /**
*
* Here is a description.
*
*/ ...could perhaps be represented with: {
description: {
// These two would both be `0` for `/** description */`
start: 2,
end: 4,
// This `descriptionRaw` is different from the per-tag raw description mentioned above
descriptionRaw: '\n *\n *Here is a description.\n *\n *'
// If auto-determining indent level, asterisks, etc., it might instead be:
// descriptionRaw: '\n\nHere is a description.\n\n'
}
} |
@brettz9 sorry, while it may sound very straightforward to you, I am still having trouble to build complete picture of what you are proposing in my mind. Can we model the entire output for this?
|
Here is my tentative first shot at it, though with testing, and attempting to stringify, some changes might be called for: [{
description: "Overall description\n\nanother line",
descriptionMeta: {
start: 1,
end: 5,
descriptionRaw: '\n * Overall description\n\n * \n * another line'
// If the `commentParser` call were passed a new options object like:
// `{indent: 4, indentChar: ' ', postAsteriskSpace: true}`,
// ...so as to suggest a structure to impose on the raw data (though without losing idiosyncratic spacing
// info as `description` does, then the raw might instead be the following, stripping out the expected
// template, but keeping deviations (though maybe we should leave any optimized parsing as a utility
// since different consumers might want to say avoid imposing this on `@example`):
descriptionRaw: '\nOverall description\n\n\n * another line'
},
line: 1,
source: "Overall description\n\nanother line\n\n@param {Object} options\n * @example default\n this line has no *\ndoSomething(options)\ndoSomething(options)",
tags: [
{
description: "",
descriptionRaw: "",
line: 6,
name: "options",
optional: false,
source: "@param {Object} options",
tag: "param",
type: "Object",
// Indicate the spacing before the tag, type, name, etc. (as well as at the start and end of the line)
whitespace: {
start: ' ',
tag: ' ',
type: ' ',
name: ' ',
description: ' ',
end: ''
}
}
{
description: "this line has no *\ndoSomething(options)\ndoSomething(options)",
descriptionRaw: "\n this line has no *\n * doSomething(options)\n * doSomething(options)\n ",
line: 7,
name: "default",
optional: false,
source: "@example default\nthis line has no *\ndoSomething(options)\ndoSomething(options)",
tag: "example",
type: "",
whitespace: {
start: ' ',
tag: ' ',
type: ' ',
name: ' ',
description: ' ',
end: ''
}
}
]
}] |
I think we are trying to introduce to many concepts: start, end, whitespace, descriptionMeta, descriptionRaw. It would make hard to read the output and make sense of it. And over time, more concepts would keep getting added. I want to propose a simpler alternative, your {
description: '\nOverall description\n\n another line\n',
line: 1,
source: '\nOverall description\n\n another line\n\n@param {Object} options\n@example default\nthis line has no marker\ndoSomething(options)\ndoSomething(options)\n',
tokens: {
descr: [{ // line 1
start: ' ',
delim: '*',
postdelim: ' ',
descr: 'Overall description'
}, { // line 2, empty
start: ' ',
delim: '*',
}, { // line 3, shifted
start: ' ',
delim: '*',
postdelim: ' ',
descr: 'Overall description'
}, { // line 4, empty
start: ' ',
delim: '*',
}]
},
tags: [{
description: '',
line: 6,
name: 'options',
optional: false,
source: '@param {Object} options',
tag: 'param',
type: 'Object',
tokens: {
start: ' ',
delim: '*',
postdelim: ' ',
tag: '@param',
posttag: ' ',
type: '{Object}',
posttype: ' ',
name: 'options'
}
}, {
description: 'this line has no marker\ndoSomething(options)\ndoSomething(options)\n',
line: 7,
name: 'default',
optional: false,
source: '@example default\nthis line has no marker\ndoSomething(options)\ndoSomething(options)\n',
tag: 'example',
type: '',
tokens: {
start: ' ',
delim: '*',
postdelim: ' ',
tag: '@example',
posttag: ' ',
name: 'default',
descr: [{
// nothing, line break
}, {
start: ' ',
descr: 'this line has no marker'
}, {
start: ' ',
delim: '*',
postdelim: ' ',
descr: 'doSomething(options)'
}, {
start: ' ',
delim: '*',
postdelim: ' ',
descr: 'doSomething(options)'
}]
}
}]
} each tag, as well as general description, has order = ['start', 'delim', 'postdelim', 'tag', 'posttag', 'type', 'posttype', 'name', 'postname', 'descr']
function stringify(tokens) {
return order.reduce((block, token) => {
return block + (Array.isArray(tokens[token]) ? tokens[token].map(stringify).join('\n') : (tokens[token] || ''))
}, '')
} task of validating/aligning callouts or tags using '.tokens' should be pretty straightforward. And this approach is agnostic to |
Excellent--the only things I see we'd also want:
|
Sure, some tweaks will be added as implemented. I will take it from here |
Hey folks! I'm sorry for the delay to answer here, I'm having some busy days.
Yep! This solution works to compare the alignment. \o/
Sure! No problem! Thank you for that! |
The PR
We're working in an Eslint rule (gajus/eslint-plugin-jsdoc#636 / gajus/eslint-plugin-jsdoc#638) which validates JSDoc alignments. And we'd like to use this lib to help to get the positions of the JSDoc parts. So this PR introduces positions to the parsing results.
Happy to see opinions about that and if we can improve something!
Example
JSDoc:
Parsing result:
Caveats
description
result contains\n
, the description will bemultiline: true
, otherwise, it'll bemultiline: false
. It means thatopts.trim
andopts.join
can impact it.TODO
Future ideas