Add documentation for the searcher categories (#1910)
1 parent
6af2338
commit bba5ab4
Showing 21 changed files with 591 additions and 2 deletions.
File renamed without changes.
71 changes: 71 additions & 0 deletions
distribution/lib/Standard/Searcher/0.1.0/src/Data_Science.enso
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,71 @@ | ||
## Enso is, first and foremost, a tool for putting you in direct contact with | ||
your data, and the Data Science tools are central to that workflow. | ||
|
||
This section contains the functionality that you need for working with data, | ||
from loading it into your workflow, to cleaning and transforming it, to | ||
visualising it and getting aggregate results from it, and much more besides. | ||
|
||
> Example | ||
Read the active sheet of an XLSX from disk and convert it into a table. | ||
|
||
import Standard.Table | ||
import Standard.Examples | ||
|
||
example_xlsx_to_table = Examples.xlsx.read_xlsx | ||
|
||
> Example | ||
Write a table to an XLSX file. | ||
|
||
import Standard.Examples | ||
|
||
example_to_xlsx = | ||
path = Enso_Project.data / "example_xlsx_output.xlsx" | ||
Examples.inventory_table.write_xlsx path | ||
|
||
> Example | ||
Join multiple tables together. It joins tables on their indices, so we need | ||
to make sure the indices are correct. | ||
|
||
import Standard.Examples | ||
import Standard.Table | ||
|
||
example_join = | ||
table_1 = Examples.inventory_table | ||
table_2 = Examples.popularity_table | ||
Table.join [table_1, table_2] | ||
|
||
> Example | ||
Select only the items where more than half the stock has been sold. | ||
|
||
import Standard.Examples | ||
|
||
example_where = | ||
table = Examples.inventory_table | ||
mask = (table.at "sold_stock" > (table.at "total_stock" / 2)) | ||
table.where mask | ||
|
||
> Example | ||
Sort the shop inventory based on the total stock, using the number sold to | ||
break ties in descending order. | ||
|
||
import Standard.Examples | ||
|
||
example_sort = | ||
table = Examples.inventory_table | ||
table.sort by=["total_stock", "sold_stock"] order=Sort_Order.Descending | ||
|
||
> Example | ||
Compute the number of transactions that each item has participated in, as | ||
well as the number of each item sold across those transactions. | ||
|
||
import Standard.Examples | ||
import Standard.Table | ||
|
||
example_group = | ||
transactions = Examples.transactions_table | ||
item_names = Examples.inventory_table.at "item_name" | ||
aggregated = transactions.group by="item_id" | ||
num_transactions = aggregated.at "transaction_id" . reduce .length . rename "transaction_count" | ||
num_sold = aggregated.at "quantity" . reduce .sum . rename "num_sold" | ||
Table.join [item_names, num_transactions, num_sold] | ||
|
42 changes: 42 additions & 0 deletions
distribution/lib/Standard/Searcher/0.1.0/src/Data_Science/Aggregate.enso
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
## With your data in the state that you want, the next step is to pull useful | ||
summaries from it to give you insight. This is the process of _aggregation_ | ||
or _summarisation_. | ||
|
||
Enso provides robust facilities for getting aggregate results from your data, | ||
all built on a flexible foundation of grouping. | ||
|
||
> Example | ||
Compute the number of transactions that each item has participated in, as | ||
well as the number of each item sold across those transactions. | ||
|
||
import Standard.Examples | ||
import Standard.Table | ||
|
||
example_group = | ||
transactions = Examples.transactions_table | ||
item_names = Examples.inventory_table.at "item_name" | ||
aggregated = transactions.group by="item_id" | ||
num_transactions = aggregated.at "transaction_id" . reduce .length . rename "transaction_count" | ||
num_sold = aggregated.at "quantity" . reduce .sum . rename "num_sold" | ||
Table.join [item_names, num_transactions, num_sold] | ||
|
||
> Example | ||
Compute the maximum value of a column. | ||
|
||
import Standard.Examples | ||
|
||
example_max = Examples.integer_column.max | ||
|
||
> Example | ||
Sum the values in a column. | ||
|
||
import Standard.Examples | ||
|
||
example_sum = Examples.integer_column.sum | ||
|
||
> Example | ||
Compute the mean value of a column. | ||
|
||
import Standard.Examples | ||
|
||
example_mean = Examples.integer_column.mean |
41 changes: 41 additions & 0 deletions
distribution/lib/Standard/Searcher/0.1.0/src/Data_Science/Compare.enso
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
## Data never exists in isolation. Seeing how your data compares to previous | ||
data sets, market trends or competitors is a key operation for data science. | ||
|
||
Enso provides a robust set of tools for comparing your data with itself or | ||
with other data sets. | ||
|
||
> Example | ||
Checking if the variable `a` is equal to `147`. | ||
|
||
from Standard.Base import all | ||
|
||
example_equality = | ||
a = 7 * 21 | ||
a == 147 | ||
|
||
> Example | ||
Checking if the variable `a` is not equal to `147`. | ||
|
||
from Standard.Base import all | ||
|
||
example_inequality = | ||
a = 7 * 21 | ||
a != 147 | ||
|
||
> Example | ||
Checking if the variable `a` is greater than `147`. | ||
|
||
from Standard.Base import all | ||
|
||
example_greater = | ||
a = 7 * 28 | ||
a > 147 | ||
|
||
> Example | ||
Checking if the variable `a` is less than `147`. | ||
|
||
from Standard.Base import all | ||
|
||
example_less = | ||
a = 7 * 21 | ||
a < 147 |
34 changes: 34 additions & 0 deletions
distribution/lib/Standard/Searcher/0.1.0/src/Data_Science/Date_And_Time.enso
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
## Along with numbers and text, dates and times are probably the most common | ||
type of data encountered whilst doing data analysis. | ||
|
||
Enso provides a robust and capable suite of date and time operations, | ||
allowing you to work with your data with ease. | ||
|
||
> Example | ||
Get the current time. | ||
|
||
import Standard.Base.Data.Time | ||
|
||
example_now = Time.now | ||
|
||
> Example | ||
Parse a UTC time. | ||
|
||
import Standard.Base.Data.Time | ||
|
||
example_parse = Time.parse "2020-10-01T04:11:12Z" | ||
|
||
> Example | ||
Convert a time instance to the -04:00 time zone. | ||
|
||
import Standard.Base.Data.Time | ||
import Standard.Base.Data.Time.Zone | ||
|
||
example_at_zone = Time.new 2020 . at_zone (Zone.new -4) | ||
|
||
> Example | ||
Convert the current time to a date. | ||
|
||
import Standard.Base.Data.Time | ||
|
||
example_date = Time.now.date |
42 changes: 42 additions & 0 deletions
distribution/lib/Standard/Searcher/0.1.0/src/Data_Science/Input_And_Output.enso
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
## All the tools to work with data are useless without a way to get your data | ||
into Enso. | ||
|
||
This section contains a suite of common tools for loading your data into your | ||
workflow, including the ability to read and write Excel files, CSVs, and | ||
text. Beyond that, it also provides tools for creating your own data on the | ||
fly, including text, numbers, and tables. | ||
|
||
> Example | ||
Read the active sheet of an XLSX from disk and convert it into a table. | ||
|
||
import Standard.Table | ||
import Standard.Examples | ||
|
||
example_xlsx_to_table = Examples.xlsx.read_xlsx | ||
|
||
> Example | ||
Read a CSV from disk and convert it into a table. | ||
|
||
import Standard.Table | ||
import Standard.Examples | ||
|
||
example_csv_to_table = Examples.csv.read_csv | ||
|
||
> Example | ||
Write a table to an XLSX file. | ||
|
||
import Standard.Examples | ||
|
||
example_to_xlsx = | ||
path = Enso_Project.data / "example_xlsx_output.xlsx" | ||
Examples.inventory_table.write_xlsx path | ||
|
||
> Example | ||
Write a table to a CSV file. | ||
|
||
import Standard.Examples | ||
|
||
example_to_csv = | ||
path = Enso_Project.data / "example_csv_output.csv" | ||
Examples.inventory_table.write_csv path | ||
|
26 changes: 26 additions & 0 deletions
distribution/lib/Standard/Searcher/0.1.0/src/Data_Science/Join.enso
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
## It is common to have multiple sources of data that you want to combine to | ||
draw insights from. Joining is the process of combining them. | ||
|
||
This section contains Enso's utilities for joining data, be it two sources or | ||
multiple. | ||
|
||
> Example | ||
Join multiple tables together. It joins tables on their indices, so we need | ||
to make sure the indices are correct. | ||
|
||
import Standard.Examples | ||
import Standard.Table | ||
|
||
example_join = | ||
table_1 = Examples.inventory_table | ||
table_2 = Examples.popularity_table | ||
Table.join [table_1, table_2] | ||
|
||
> Example | ||
Join the popularity table and the inventory table to see the relative | ||
popularities of the items in the shop inventory. | ||
|
||
import Standard.Examples | ||
|
||
example_join = | ||
Examples.inventory_table.join Examples.popularity_table |
25 changes: 25 additions & 0 deletions
distribution/lib/Standard/Searcher/0.1.0/src/Data_Science/Numbers.enso
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
## Numerical data is _everywhere_. This makes it of the greatest importance that | ||
any data analysis system provides a robust set of numerical operations. | ||
|
||
Enso provides a comprehensive suite of tools to work with numbers, spanning | ||
from the standard mathematical operations to some more advanced tools. | ||
|
||
> Example | ||
Create a range containing the numbers 0, 1, 2, 3, 4. | ||
|
||
0.up_to 5 | ||
|
||
> Example | ||
Calculate the smallest number out of 1 and 2. | ||
|
||
Math.min 1 2 | ||
|
||
> Example | ||
Calculate the largest number out of 1 and 2. | ||
|
||
Math.max 1 2 | ||
|
||
> Example | ||
Calculate the sine of 2. | ||
|
||
2.sin |
39 changes: 39 additions & 0 deletions
distribution/lib/Standard/Searcher/0.1.0/src/Data_Science/Preparation.enso
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
## As nice as it would be to have your data all ready for analysis, this is | ||
rarely the case. The process of getting it ready to provide insights is known | ||
as data preparation. | ||
|
||
This section of the library contains a curated selection of tools that help | ||
you to prepare your data for analysis. From dealing with missing values to | ||
getting rid of erroneous rows and columns, the tools here are specialised for | ||
getting the most out of your data. | ||
|
||
> Example | ||
Get the item name and price columns from the shop inventory. | ||
|
||
import Standard.Examples | ||
|
||
example_select = | ||
Examples.inventory_table.select ["item_name", "price"] | ||
|
||
> Example | ||
Remove any rows that contain missing values from the table. | ||
|
||
import Standard.Examples | ||
|
||
example_drop_missing_rows = | ||
Examples.inventory_table.drop_missing_rows | ||
|
||
> Example | ||
Remove any columns that contain missing values from the table. | ||
|
||
import Standard.Examples | ||
|
||
example_drop_missing_cols = | ||
Examples.inventory_table.drop_missing_columns | ||
|
||
> Example | ||
Fill missing values in a column with the value 20.5. | ||
|
||
import Standard.Examples | ||
|
||
example_fill_missing = Examples.decimal_column.fill_missing 20.5 |
26 changes: 26 additions & 0 deletions
distribution/lib/Standard/Searcher/0.1.0/src/Data_Science/Text.enso
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
## Textual data is one of the most common formats found during data analysis. | ||
This makes it all the more important that you have robust tools to manipulate | ||
the text as you need. | ||
|
||
Enso's text representation is highly efficient and comes with a robust suite | ||
of tools for processing the text in any way you can think of. | ||
|
||
> Example | ||
Split the text on whitespace into a vector of items. | ||
|
||
"ham eggs cheese tomatoes".split Split_Kind.Whitespace | ||
|
||
> Example | ||
Get the words in the sentence "I have not one, but two cats." | ||
|
||
"I have not one, but two cats.".words | ||
|
||
> Example | ||
See if the text "Hello" contains the text "ell". | ||
|
||
"Hello".contains "ell" | ||
|
||
> Example | ||
Replace "aa" with "b" in the text "aaa", giving "ba". | ||
|
||
'aaa'.replace 'aa' 'b' == 'ba' |