Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ColumnsSelection DSL overhaul #372

Merged
merged 169 commits into from
Jan 30, 2024
Merged

Conversation

Jolanrensen
Copy link
Collaborator

@Jolanrensen Jolanrensen commented May 10, 2023

Discussed in #396

Originally posted by Jolanrensen June 9, 2023
Hi everyone!
While working on the Columns Selection DSL, which is an integral part of the library, I found many inconsistencies and took it upon myself to fix that. I documented almost every function and reworked all overloads, but I still have doubts about some.
So, I collected these concerns in a helpful Kotlin Notebook for you to experience and judge! The notebook imports a proposal version of DataFrame based on the extensive PR, so you can experience the changes firsthand.
The data used for the example can be found here and to view and run the notebook, I recommend the Kotlin Notebook Plugin for IntelliJ.
You can also use Datalore or the Kotlin Jupyter kernel, but then you'd miss out on the new and extensive KDocs, so I wouldn't recommend that.

Please leave your feedback below in the corresponding thread below. I'm curious to hear your feedback! Also, feel free to leave any questions below as well if my notebook isn't clear enough in explaining the issues.

For reviews, I'll break it up into parts. I'll mark items that can be reviewed in bold and with corresponding files (or sections). When the reviewing of a certain section is done, I'll check the box in front of it. Hopefully, that way we can check off all the functions in an ordered manner :). To check out the docs I'd recommend checking out the branch, publish to maven local, and use it in a sample project (or notebook). Then you can see how they are rendered in IntelliJ and test out the links.

@Jolanrensen Jolanrensen added enhancement New feature or request KDocs Improvements or additions to KDocs labels May 10, 2023
@Jolanrensen Jolanrensen added this to the 0.11.0 milestone May 10, 2023
@Jolanrensen Jolanrensen self-assigned this May 10, 2023
…-type-safe apis to access ColumnGroups.

Refactoring
…olumn no longer implements ColumnSet. This means proper separation per function in the selection DSL. Requires sometimes some explicit conversion to ColumnGroups.
… applicable. Expanded KProperty examples, started with filter {}
… to avoid clashes with stdlib functions for string and lists
@zaleslaw
Copy link
Collaborator

zaleslaw commented Jun 2, 2023

It's interesting, could we split the model/plugin changes from the documentation changes?

@Jolanrensen
Copy link
Collaborator Author

Jolanrensen commented Jan 19, 2024

Some small improvements that need to be made before merging this PR:

  • Documentation could be a bit simpler in some places (like in all(), the user doesn't immediately need to know what a ColumnsResolver is.
  • Usage needs a bit more explanation and better titles (also maybe be named Grammar?)
  • ColumnFilter could have better docs, especially for colsAtAnyDepth {}
  • Selection dsl return type might be unclear, because just returning a kproperty doesn't work etc. Clarify in Selection DSL docs
  • each individual CS dsl interface doesn't have docs yet
  • colsOf: clarify that this includes subtypes
  • Clear up relativity with ColumnSet.except
  • colsInGroups could be confused with cols
  • Each function should mention that it's top-level only

# Conflicts:
#	docs/StardustDocs/snippets/org.jetbrains.kotlinx.dataframe.samples.api.Access.columnSelectors.html
#	docs/StardustDocs/snippets/org.jetbrains.kotlinx.dataframe.samples.api.Access.columnSelectorsMisc.html
#	docs/StardustDocs/snippets/org.jetbrains.kotlinx.dataframe.samples.api.Access.columnSelectorsModifySet.html
#	docs/StardustDocs/snippets/org.jetbrains.kotlinx.dataframe.samples.api.Access.columnSelectorsUsages.html
#	docs/StardustDocs/snippets/org.jetbrains.kotlinx.dataframe.samples.api.Modify.convert.html
#	docs/StardustDocs/snippets/org.jetbrains.kotlinx.dataframe.samples.api.Modify.update.html
@Jolanrensen Jolanrensen requested review from zaleslaw and koperagen and removed request for koperagen January 25, 2024 13:34
@@ -205,6 +205,7 @@ val processKDocsMain by creatingProcessDocTask(
ARG_DOC_PROCESSOR,
COMMENT_DOC_PROCESSOR,
SAMPLE_DOC_PROCESSOR,
// REMOVE_ESCAPE_CHARS_PROCESSOR,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will be in the next release of the doc preprocessor, I must not forget to enable it then :)

* If no column adheres to the given [condition\], [NoSuchElementException] is thrown.
*
* This function only looks at columns at the top-level.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks at columns - is unclear for me (search among?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replaced with "operates"

* {@setArg [UpdateOperationArg] [**update**][update]}{@comment The default name of the `update` operation function name.}
*/
public interface Usage
public interface Grammar
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GrammarDI or GrammarDocInterface. I'm starting to think, that if we could add prefix or postfix to filter easier all the documentation interfaces

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, I don't know, as you can also copy docs from other functions or classes, which are not solely "documentation interfaces" like these and those won't have a special name. Plus, this name is also what people will see when they read the KDocs:
image
Now it just says "public interface Grammar", which looks nice, plus I can also refer to it without alias, like See [Grammar] for....

What we could do, maybe in the future, is to have two main interfaces somewhere: PublicDocInterface, which all documentation interfaces that are user-facing implement, and InternalDocInterFace or TempDocInterface which would indicate which doc interfaces would be removed after the docs are processed. That might clean up all intermediate doc interfaces neatly. (The doc processor cannot do this (yet) btw, just an idea).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, that's where @Annotations would make more sense I think, like:

/**
 * Some Doc
 */
 @ExcludeFromSources
 internal interface Doc

@Jolanrensen Jolanrensen marked this pull request as ready for review January 30, 2024 15:22
@Jolanrensen Jolanrensen merged commit 5f5b5d3 into master Jan 30, 2024
1 check passed
@Jolanrensen Jolanrensen deleted the continued-columnsselectiondsl branch August 1, 2024 14:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request KDocs Improvements or additions to KDocs research This requires a deeper dive to gather a better understanding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants