Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add API for retrieving the list of files that will be accessed #1659

Merged
8 changes: 8 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,14 @@ This project adheres to [Semantic Versioning](https://semver.org/).

### API Changes & RuleSet providers

#### Retrieve input files

The list of files which will be accessed by KtLint when linting or formatting a given path can now be retrieved with the new API `KtLint.getInputPaths(path: Path): List<Path>`. Currently, the `.editorconfig` files are the only files which will be accessed during linting or formatting in addition to the source files itself.

This API can be called with either a file or a directory. It's intended usage is that it is called once with a directory path before actually linting or formatting files. When called with a directory path, all `.editorconfig` in the directory or any of its subdirectories (except hidden directories) are returned. In case the given directory does not contain an `.editorconfig` file or if it does not contain the `root=true` setting, the parent directories are scanned as well until a root `.editorconfig` file is found.

Calling this API with a file path results in the `.editorconfig` files that will be accessed when processing that specific file. In case the directory in which the file resides does not contain an `.editorconfig` file or if it does not contain the `root=true` setting, the parent directories are scanned until a root `.editorconfig` file is found.

### Added

### Fixed
Expand Down
11 changes: 11 additions & 0 deletions ktlint-core/src/main/kotlin/com/pinterest/ktlint/core/KtLint.kt
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ import com.pinterest.ktlint.core.api.EditorConfigOverride
import com.pinterest.ktlint.core.api.EditorConfigOverride.Companion.emptyEditorConfigOverride
import com.pinterest.ktlint.core.api.EditorConfigProperties
import com.pinterest.ktlint.core.api.UsesEditorConfigProperties
import com.pinterest.ktlint.core.internal.EditorConfigFinder
import com.pinterest.ktlint.core.internal.EditorConfigGenerator
import com.pinterest.ktlint.core.internal.EditorConfigLoader
import com.pinterest.ktlint.core.internal.RuleExecutionContext
Expand All @@ -21,6 +22,7 @@ import java.nio.file.FileSystems
import java.nio.file.Path
import java.nio.file.Paths
import java.util.Locale
import kotlin.io.path.isDirectory
import org.ec4j.core.Resource
import org.ec4j.core.model.PropertyType
import org.jetbrains.kotlin.com.intellij.lang.ASTNode
Expand Down Expand Up @@ -340,6 +342,15 @@ public object KtLint {
)
}

/**
* Get the list of files which will be accessed by KtLint when linting or formatting the given file or directory.
*/
public fun getInputPaths(path: Path): List<Path> =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From Gradle Plugin perspective, we want to start watching input files to be able to reload ktlint caches when their content changes. The preferred way is to call KtLint#reloadEditorConfigFile, which name suggests it's meant to be used with editorconfig files only.
So, I started wondering if these two shouldn't be symmetrical 🤔 With getInputPaths() and reloadEditorConfigFile each library consumer, who also experiences ktlint config being cached across re-runs, will have to filter out unknown files. If we had a separate method per each type (getEditorConfigPaths + reloadEditorConfigFile) or more generic methods (getInputPaths + reloadInput) it would be clear how they interact with each other.

I know currently, mostly .editorconfig files will be returned here, I'm just trying to future-proof the api

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me it works to rename getInputPaths to getEditorConfigFilePaths and leave out the source file itself which I added solely because the getInputPath sounded more generic ;-)
I see no need to the generic methods (getInputPaths + reloadInput) until we actually have other sources of inputs.

EditorConfigFinder().findEditorConfigs(path) +
listOfNotNull(
path.takeIf { !it.isDirectory() },
)

/**
* Generates Kotlin `.editorconfig` file section content based on [ExperimentalParams].
*
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,85 @@
package com.pinterest.ktlint.core.internal

import com.pinterest.ktlint.core.initKtLintKLogger
import com.pinterest.ktlint.core.internal.ThreadSafeEditorConfigCache.Companion.threadSafeEditorConfigCache
import java.nio.charset.StandardCharsets
import java.nio.file.FileVisitResult
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.SimpleFileVisitor
import java.nio.file.attribute.BasicFileAttributes
import kotlin.io.path.isDirectory
import mu.KotlinLogging
import org.ec4j.core.Resource
import org.ec4j.core.ResourcePropertiesService
import org.ec4j.core.model.Version
import org.jetbrains.kotlin.konan.file.File

private val logger = KotlinLogging.logger {}.initKtLintKLogger()

internal class EditorConfigFinder {
/**
* Finds all relevant ".editorconfig" files for the given path.
*/
fun findEditorConfigs(path: Path): List<Path> {
val result = mutableListOf<Path>()
val normalizedPath = path.normalize().toAbsolutePath()
if (path.isDirectory()) {
result += findEditorConfigsInSubDirectories(normalizedPath)
}
Comment on lines +30 to +32
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started wondering if there shouldn't be a way to exclude searching in subdirectories 🤔 I initially thought of passing project root directory as the path, but that would result in traversing through all files and folders, including build, docs or resources directory, which would be wasteful.

Gradle plugins would have to be smarter, and we'd have to figure out a way to identify the root folder for all directories containing Kotlin sources, which unfortunately I failed to do so :/ For Kotlin plugin we get SourceDirectorySet#getSrcDirTrees which would allow us to identify multiple root directories, but I'm unaware of a similar way to obtain such information for other plugins (for example, for Android Gradle Plugin).

Given all of the above, I started leaning towards ignoring subdirectories, for performance reasons (and I believe sharing common editorconfig settings for a compilation unit is a fairly common situation). Having the subdirectories' traversal optional, ktlint consumers could switch to a more precise input calculation if they are able to identify a proper top-most path.
Do you have any thoughts/ideas/advices here?

I can link my comment on the same topic: jeremymailen/kotlinter-gradle#265 (comment)

Copy link
Collaborator Author

@paul-dingemans paul-dingemans Sep 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you think the performance impact of scanning all subdirectories it too big? This call should be made only once. For testing KtLint I always run KtLint on a collection of open source projects which in total contain 2,800+ directories and more than 5,000+ files (both excluding files in .git and build dirs) and the scanning is really fast.

I have not encountered many use cases in which a project contains multiple .editorconfig files. But I think it would be really confusing for developers if the .editorconfig file in the root of the project and its parents is treated differently from .editorconfig files in subdirectories.

As with all performance issues, my advice would be first to measure before deciding to optimize.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you think the performance impact of scanning all subdirectories it too big?

I don't know. I haven't ever done any sort of file performance things. My assumption was that doing things takes more time that completely avoiding it 😛

This call should be made only once

From Gradle Plugin perspective, it's once per task invocation, so every time someone calls ./gradlew ktlintTask.

As with all performance issues, my advice would be first to measure before deciding to optimize.

Sure, sounds reasonable 👍 I'll make sure to do extra profiling when switching to the new API 👍 Thanks for answering my doubts (and for finding time to work on the new api) 🙏

result += findEditorConfigsInParentDirectories(normalizedPath)
return result.toList()
}

private fun findEditorConfigsInSubDirectories(path: Path): List<Path> {
val result = mutableListOf<Path>()

Files.walkFileTree(
path,
object : SimpleFileVisitor<Path>() {
override fun visitFile(
filePath: Path,
fileAttrs: BasicFileAttributes,
): FileVisitResult {
if (filePath.File().name == ".editorconfig") {
logger.trace { "- File: $filePath: add to list of accessed files" }
result.add(filePath)
}
return FileVisitResult.CONTINUE
}

override fun preVisitDirectory(
dirPath: Path,
dirAttr: BasicFileAttributes,
): FileVisitResult {
return if (Files.isHidden(dirPath)) {
logger.trace { "- Dir: $dirPath: Ignore" }
FileVisitResult.SKIP_SUBTREE
} else {
logger.trace { "- Dir: $dirPath: Traverse" }
FileVisitResult.CONTINUE
}
}
},
)

return result.toList()
}

private fun findEditorConfigsInParentDirectories(path: Path): List<Path> {
// The logic to load parental ".editorconfig" files resides in the ec4j library. This library however uses a
// cache provided by KtLint. As of this the list of parental ".editorconfig" files can be extracted from the
// cache.
createLoaderService().queryProperties(path.resource())
return threadSafeEditorConfigCache.getPaths()
}

private fun Path?.resource() =
Resource.Resources.ofPath(this, StandardCharsets.UTF_8)

private fun createLoaderService() =
ResourcePropertiesService.builder()
.cache(threadSafeEditorConfigCache)
.loader(org.ec4j.core.EditorConfigLoader.of(Version.CURRENT))
.build()
}
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
package com.pinterest.ktlint.core.internal

import com.pinterest.ktlint.core.initKtLintKLogger
import java.nio.file.Paths
import java.util.concurrent.locks.ReentrantReadWriteLock
import kotlin.concurrent.read
import kotlin.concurrent.write
Expand Down Expand Up @@ -72,6 +73,14 @@ internal class ThreadSafeEditorConfigCache : Cache {
}.clear()
}

/**
* Get the paths of files stored in the cache.
*/
fun getPaths() =
inMemoryMap
.keys
.map { Paths.get(it.path.toString()) }

private data class CacheValue(
val editorConfigLoader: EditorConfigLoader,
val editConfig: EditorConfig,
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,191 @@
package com.pinterest.ktlint.core.internal

import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.Paths
import kotlin.io.path.absolutePathString
import kotlin.io.path.writeText
import org.assertj.core.api.Assertions.assertThat
import org.jetbrains.kotlin.konan.file.File
import org.junit.jupiter.api.Nested
import org.junit.jupiter.api.Test
import org.junit.jupiter.api.io.TempDir

class EditorConfigFinderTest {
@Nested
inner class FindByFile {
@Test
fun `Given a kotlin file in a subdirectory and a root-editorconfig file in the same directory then get the path of that editorconfig file`(
@TempDir
tempDir: Path,
) {
val someSubDir = "some-project/src/main/kotlin"
tempDir.createFile("$someSubDir/.editorconfig", "root=true")
val kotlinFilePath = tempDir.createFile("$someSubDir/test.kt", "val foo = 42")

val actual =
EditorConfigFinder()
.findEditorConfigs(kotlinFilePath)
.mapNotNull { it.toRelativePathStringIn(tempDir) }

assertThat(actual).contains("$someSubDir/.editorconfig")
}

@Test
fun `Given a kotlin file in a subdirectory and a root-editorconfig file in a parent directory then get the path of that parent editorconfig file`(
@TempDir
tempDir: Path,
) {
val someProjectDirectory = "some-project"
tempDir.createFile("$someProjectDirectory/.editorconfig", "root=true")
val kotlinFilePath = tempDir.createFile("$someProjectDirectory/src/main/kotlin/test.kt", "val foo = 42")

val actual =
EditorConfigFinder()
.findEditorConfigs(kotlinFilePath)
.mapNotNull { it.toRelativePathStringIn(tempDir) }

assertThat(actual).contains("$someProjectDirectory/.editorconfig")
}

@Test
fun `Given a kotlin file in a subdirectory and a non-root-editorconfig file in that same directory and a root-editorconfig file in a parent directory then get the paths of both editorconfig files`(
@TempDir
tempDir: Path,
) {
val someProjectDirectory = "some-project"
tempDir.createFile("$someProjectDirectory/.editorconfig", "root=true")
tempDir.createFile("$someProjectDirectory/src/main/.editorconfig", "root=false")
val kotlinFilePath = tempDir.createFile("$someProjectDirectory/src/main/kotlin/test.kt", "val foo = 42")

val actual =
EditorConfigFinder()
.findEditorConfigs(kotlinFilePath)
.mapNotNull { it.toRelativePathStringIn(tempDir) }

assertThat(actual).contains(
"$someProjectDirectory/.editorconfig",
"$someProjectDirectory/src/main/.editorconfig",
)
}

@Test
fun `Given a kotlin file in a subdirectory and a root-editorconfig file in the parent directory and another root-editorconfig file in a great-parent directory then get the paths of editorconfig files excluding root-editorconfig once the first one is found`(
@TempDir
tempDir: Path,
) {
val someProjectDirectory = "some-project"
tempDir.createFile("$someProjectDirectory/src/main/.editorconfig", "root=false")
tempDir.createFile("$someProjectDirectory/src/.editorconfig", "root=true")
tempDir.createFile("$someProjectDirectory/.editorconfig", "root=true")
val kotlinFilePath = tempDir.createFile("$someProjectDirectory/src/main/kotlin/test.kt", "val foo = 42")

val actual =
EditorConfigFinder()
.findEditorConfigs(kotlinFilePath)
.mapNotNull { it.toRelativePathStringIn(tempDir) }

assertThat(actual)
.contains(
"$someProjectDirectory/src/main/.editorconfig",
"$someProjectDirectory/src/.editorconfig",
).doesNotContain(
"$someProjectDirectory/.editorconfig",
)
}
}

@Nested
inner class FindByDirectory {
@Test
fun `Given a directory containing a root-editorconfig file and a subdirectory containing a editorconfig file then get the paths of both editorconfig files`(
@TempDir
tempDir: Path,
) {
val someDirectory = "some-project"
tempDir.createFile("$someDirectory/.editorconfig", "root=true")
tempDir.createFile("$someDirectory/src/main/kotlin/.editorconfig", "some-property=some-value")

val actual =
EditorConfigFinder()
.findEditorConfigs(tempDir.plus(someDirectory))
.mapNotNull { it.toRelativePathStringIn(tempDir) }

assertThat(actual).contains(
"$someDirectory/.editorconfig",
"$someDirectory/src/main/kotlin/.editorconfig",
)
}

@Test
fun `Given a subdirectory containing an editorconfig file and a sibling subdirectory contain a editorconfig file in a parent directory then get the path of all editorconfig file except of the sibling subdirectory`(
@TempDir
tempDir: Path,
) {
val someProjectDirectory = "some-project"
tempDir.createFile("$someProjectDirectory/.editorconfig", "root=true")
tempDir.createFile("$someProjectDirectory/src/main/kotlin/.editorconfig", "some-property=some-value")
tempDir.createFile("$someProjectDirectory/src/test/kotlin/.editorconfig", "some-property=some-value")

val actual =
EditorConfigFinder()
.findEditorConfigs(tempDir.plus("$someProjectDirectory/src/main/kotlin"))
.mapNotNull { it.toRelativePathStringIn(tempDir) }

assertThat(actual)
.contains(
"$someProjectDirectory/.editorconfig",
"$someProjectDirectory/src/main/kotlin/.editorconfig",
).doesNotContain(
"$someProjectDirectory/src/test/kotlin/.editorconfig",
)
}

@Test
fun `Given a directory containing an editorconfig file and multiple subdirectores containing a editorconfig file then get the path of all editorconfig files`(
@TempDir
tempDir: Path,
) {
val someProjectDirectory = "some-project"
tempDir.createFile("$someProjectDirectory/.editorconfig", "root=true")
tempDir.createFile("$someProjectDirectory/src/main/kotlin/.editorconfig", "some-property=some-value")
tempDir.createFile("$someProjectDirectory/src/test/kotlin/.editorconfig", "some-property=some-value")

val actual =
EditorConfigFinder()
.findEditorConfigs(tempDir.plus(someProjectDirectory))
.mapNotNull { it.toRelativePathStringIn(tempDir) }

assertThat(actual).contains(
"$someProjectDirectory/.editorconfig",
"$someProjectDirectory/src/main/kotlin/.editorconfig",
"$someProjectDirectory/src/test/kotlin/.editorconfig",
)
}
}

private fun Path.createFile(fileName: String, content: String): Path {
val dirPath = fileName.substringBeforeLast("/", "")
Files.createDirectories(plus(dirPath))
return Files
.createFile(plus(fileName))
.also { it.writeText(content) }
}

private fun Path.plus(subPath: String): Path =
this
.absolutePathString()
.plus(File.separator)
.plus(subPath)
.let { Paths.get(it) }

private fun Path.toRelativePathStringIn(tempDir: Path): String? =
this
.takeIf {
// Ignore files created in temp dirs of other tests
it.startsWith(tempDir)
}?.absolutePathString()
?.removePrefix(tempDir.absolutePathString())
?.removePrefix(File.separator)
?.replace(tempDir.fileSystem.separator, "/")
}