docs (#3540)

streetsidesoftware · Aug 18, 2024 · 85512bb · 85512bb
1 parent 2b42261
commit 85512bb
Show file tree

Hide file tree

Showing 2 changed files with 41 additions and 0 deletions.
diff --git a/design-docs/spelling-metrics.md b/design-docs/spelling-metrics.md
@@ -0,0 +1,38 @@
+# Spelling Metrics and Usage data
+
+The goal of collecting spelling metrics is to improve the overall user experience.
+
+## Improving Spelling Suggestions
+
+The goal is to improve the spelling suggestions presented to the user.
+The idea is present the "best" choices higher on the list than other choices.
+The current algorithm uses only a modified edit distance algorithm to move some choices higher.
+By collecting the choices the user has used in the past and word usage frequency we can adjust the order.
+
+## Collect Spelling Corrections
+
+Fixing spelling errors is usually done with a Command. The idea is to record each time the command is used to replace a spelling error.
+
+Things to record:
+
+-   `word`<sup>\*</sup> - the word with the spelling error. This could be the word or a hash.
+-   `replacement`<sup>\*</sup> - the replacement chosen. This could be a word or a hash.
+-   `timestamp` - the Unix timestamp in milliseconds. Helps use keep track of the order and to diminish weight over time.
+-   `fileType` - the file type (LanguageID) of the file.
+-   `locale` - the current locale setting.
+-   `scheme` - the URI scheme
+-   `machineId` - a hash of the machine id (does not need to be unique, just consistent).
+-   `userId` - a hash of the user id (does not need to be unique, just consistent).
+
+<sup>\*</sup> - Necessary fields needed to make the suggestions.
+
+## Word Frequency Data
+
+Word Frequency can be used to improve suggestions by pushing words that already exist in the project towards the top.
+
+Record:
+
+-   `word` - the text of the word. (prefer whole words over camel case splits). We might want to limit the length less than 64 characters or use a hash.
+-   `count` - the number of times the word was seen in the file.
+-   `fileType` - the file type (LanguageId) of the file.
+-   `fileId` - the id of the file
diff --git a/design-docs/wip/normalized-json.md b/design-docs/wip/normalized-json.md
@@ -0,0 +1,3 @@
+# Normalized JSON
+
+Normalized JSON is a data structure that normalizes duplicate values. It handles circular references as well as multiple reverences to the same object.
Original file line number	Diff line number	Diff line change
		@@ -0,0 +1,3 @@
		# Normalized JSON

		Normalized JSON is a data structure that normalizes duplicate values. It handles circular references as well as multiple reverences to the same object.