discuss: a new specifications #36
I have created a SYNTAX.md in docs/. Sorry for not using org-mode, but GitHub can't render org-mode files properly :(.
I like this idea. Using org-elements as a guide is good, but it has too many bugs to be a spec. I haven't looked into it, but maybe the org syntax draft at orgmode.org is open to contributions? If so, and they had a common understanding of what the source of truth is (likely org-mode itself), perhaps we could upstream some/all of these edge cases.
Neither have I. But the original version of org syntax actually contains some implementation details about elisp, which are irrelevant to orgize. So I would like a new specification without anything related to elisp.
Yes, this is the problem that the new specification is going to resolve. Basically, I want to give clear definitions of space, tab, line ending, line and blank line that can be used throughout the whole specification. Here is an initial version of the definitions (largely borrowed from CommonMark): A space is A tab is A line is a sequence of zero or more characters other than newline ( A line ending is a newline ( A blank line is a line containing no characters, or a line containing only spaces (
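To make the definitions above concrete, here is a minimal Rust sketch of how they might be expressed. The code points in the comment did not survive in this thread, so the values below are assumptions taken from CommonMark (space is U+0020, tab is U+0009, newline is U+000A). Allowing tabs in blank lines is also an assumption, and the function names are hypothetical, not part of orgize.

```rust
/// Hypothetical helper: a space is assumed to be U+0020, as in CommonMark.
fn is_space(c: char) -> bool {
    c == '\u{0020}'
}

/// Hypothetical helper: a tab is assumed to be U+0009, as in CommonMark.
fn is_tab(c: char) -> bool {
    c == '\u{0009}'
}

/// Splits text into lines, where a line is a sequence of zero or more
/// characters other than newline (assumed to be U+000A).
fn split_lines(text: &str) -> Vec<&str> {
    text.split('\n').collect()
}

/// A blank line contains no characters, or only spaces
/// (and, by assumption, tabs). Only ASCII whitespace is considered.
fn is_blank_line(line: &str) -> bool {
    line.chars().all(|c| is_space(c) || is_tab(c))
}
```

Because only ASCII space and tab count as whitespace here, a line holding a Unicode space such as U+00A0 would not be treated as blank.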
I can see that you're trying to behave the same way as org-mode or org-element. But I don't think it's a good idea to introduce too many special cases. Following the definitions I mentioned above, if a headline's stars must be followed by a space, I would expect a tab to be accepted as well. The same goes for the todo keyword, priority and tags. After all, I believe the specification should enforce consistent parsing behavior without bringing in too many special cases.
That seems like a perfectly reasonable way to define it, thank you for the explanation. I care more about clear, consistent, documented behavior than about what that behavior is, and this specification will deliver that.
A correct implementation requires a precise specification. However, neither the org syntax draft nor the org-element API really serves as a good specification: org syntax doesn't specify the syntax unambiguously, while org-elements-api is quite buggy and gives apparently wrong results in some cases (e.g. #22, #19 and #14).
To resolve that, I think the ultimate solution is to maintain a specification of our own. To be clear, it's not going to define a subset of the org markup language; it just describes the expected results you will get when using orgize.
Fortunately, we don't need to start from scratch. We can make a copy of the original org syntax and adapt it to our needs (for example, defining required and optional fields). Then I would like to borrow some concepts from the CommonMark Spec, especially the parts on handling whitespace and list indentation. In short, our new specification will be largely based on org syntax, with some modifications and additions.
Feel free to leave a comment below if you have any ideas about the upcoming specification =).
As a reminder, here are some issues/commits that need to be reopened/reverted after applying the new specification.
#17: should be reopened. Org syntax clarifies that the title should be matched after the other parts have been matched.
ba9c83c: should be reverted. We should only handle ASCII whitespace. I think it is totally unnecessary to take care of Unicode whitespace if we don't need to be compatible with org-elements-api.
#26 #34: should be closed and moved to a new issue. Org syntax specifies that a headline's stars must be followed by at least one space character. But I think a tab is also acceptable (just as CommonMark does, and I will include it in the new specification later). Hence, only `*** `, `*** \n` or `***\t\t` should be valid, but not `***`, `****\r` or `***\n`.
#33: should be closed.
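
The headline rule proposed for #26 and #34 can be sketched in Rust as follows. This is only an illustration of the rule as stated in this issue (one or more stars at the start of the line, followed by a space or a tab); `headline_level` is a hypothetical name, not orgize's actual parser.

```rust
/// Hypothetical check: returns the headline level when `line` starts with
/// one or more stars immediately followed by an ASCII space or tab.
fn headline_level(line: &str) -> Option<usize> {
    // Count the leading stars; '*' is a single byte in UTF-8.
    let level = line.bytes().take_while(|&b| b == b'*').count();
    if level == 0 {
        return None;
    }
    // The byte right after the stars must be a space or a tab.
    // A line ending, a carriage return or the end of input is rejected.
    match line.as_bytes().get(level).copied() {
        Some(b' ') | Some(b'\t') => Some(level),
        _ => None,
    }
}
```

Under this sketch, stars followed directly by `\n`, `\r` or nothing at all are not headlines, which matches the valid and invalid examples listed above.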