Add documentation for the searcher categories (#1910)
iamrecursion authored Jul 30, 2021
1 parent 6af2338 commit bba5ab4
Showing 21 changed files with 591 additions and 2 deletions.
2 changes: 2 additions & 0 deletions RELEASES.md
@@ -4,6 +4,8 @@

- Added support for writing tables to XLSX spreadsheets
([#1906](https://github.com/enso-org/enso/pull/1906)).
- Added documentation for the new searcher categories
([#1910](https://github.com/enso-org/enso/pull/1910)).

# Enso 0.2.17 (2021-07-28)

@@ -102,7 +102,7 @@ Text.characters =
"ham,eggs,cheese,tomatoes".split ","

> Example
- Split the string on whitespace into a vector of items.
+ Split the text on whitespace into a vector of items.

"ham eggs cheese tomatoes".split Split_Kind.Whitespace
Text.split : Split_Kind -> Vector.Vector Text
71 changes: 71 additions & 0 deletions distribution/lib/Standard/Searcher/0.1.0/src/Data_Science.enso
@@ -0,0 +1,71 @@
## Enso is, first and foremost, a tool for putting you in direct contact with
your data, and the Data Science tools are central to that workflow.

This section contains the functionality that you need for working with data,
from loading it into your workflow, to cleaning and transforming it, to
visualising it and getting aggregate results from it, and much more besides.

> Example
Read the active sheet of an XLSX from disk and convert it into a table.

import Standard.Table
import Standard.Examples

example_xlsx_to_table = Examples.xlsx.read_xlsx

> Example
Write a table to an XLSX file.

import Standard.Examples

example_to_xlsx =
    path = Enso_Project.data / "example_xlsx_output.xlsx"
    Examples.inventory_table.write_xlsx path

> Example
Join multiple tables together. It joins tables on their indices, so we need
to make sure the indices are correct.

import Standard.Examples
import Standard.Table

example_join =
    table_1 = Examples.inventory_table
    table_2 = Examples.popularity_table
    Table.join [table_1, table_2]

> Example
Select only the items where more than half the stock has been sold.

import Standard.Examples

example_where =
    table = Examples.inventory_table
    mask = (table.at "sold_stock" > (table.at "total_stock" / 2))
    table.where mask

> Example
Sort the shop inventory based on the total stock, using the number sold to
break ties in descending order.

import Standard.Examples

example_sort =
    table = Examples.inventory_table
    table.sort by=["total_stock", "sold_stock"] order=Sort_Order.Descending

> Example
Compute the number of transactions that each item has participated in, as
well as the number of each item sold across those transactions.

import Standard.Examples
import Standard.Table

example_group =
    transactions = Examples.transactions_table
    item_names = Examples.inventory_table.at "item_name"
    aggregated = transactions.group by="item_id"
    num_transactions = aggregated.at "transaction_id" . reduce .length . rename "transaction_count"
    num_sold = aggregated.at "quantity" . reduce .sum . rename "num_sold"
    Table.join [item_names, num_transactions, num_sold]
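
For readers less familiar with grouped aggregation, here is a rough Python sketch of what the `example_group` snippet above appears to compute: a per-item transaction count and a per-item total quantity. It is not Enso and does not use the Enso API. The sample rows and the helper name `summarise_transactions` are assumptions invented for illustration.

```python
from collections import defaultdict

# Hypothetical sample data standing in for Examples.transactions_table.
transactions = [
    {"transaction_id": 1, "item_id": 10, "quantity": 2},
    {"transaction_id": 2, "item_id": 11, "quantity": 1},
    {"transaction_id": 3, "item_id": 10, "quantity": 5},
]


def summarise_transactions(rows):
    """Group rows by item_id, then count transactions and sum quantities."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["item_id"]].append(row)
    return {
        item_id: {
            "transaction_count": len(group),
            "num_sold": sum(r["quantity"] for r in group),
        }
        for item_id, group in groups.items()
    }


summary = summarise_transactions(transactions)
print(summary)
```

The Enso example additionally joins the two aggregates with the item names. That step is omitted here because the sketch keys the result by `item_id` directly.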

@@ -0,0 +1,42 @@
## With your data in the state that you want, the next step is to pull useful
summaries from it to give you insight. This is the process of _aggregation_
or _summarisation_.

Enso provides robust facilities for getting aggregate results from your data,
all built on a flexible foundation of grouping.

> Example
Compute the number of transactions that each item has participated in, as
well as the number of each item sold across those transactions.

import Standard.Examples
import Standard.Table

example_group =
    transactions = Examples.transactions_table
    item_names = Examples.inventory_table.at "item_name"
    aggregated = transactions.group by="item_id"
    num_transactions = aggregated.at "transaction_id" . reduce .length . rename "transaction_count"
    num_sold = aggregated.at "quantity" . reduce .sum . rename "num_sold"
    Table.join [item_names, num_transactions, num_sold]

> Example
Compute the maximum value of a column.

import Standard.Examples

example_max = Examples.integer_column.max

> Example
Sum the values in a column.

import Standard.Examples

example_sum = Examples.integer_column.sum

> Example
Compute the mean value of a column.

import Standard.Examples

example_mean = Examples.integer_column.mean
@@ -0,0 +1,41 @@
## Data never exists in isolation. Seeing how your data compares to previous
data sets, market trends or competitors is a key operation for data science.

Enso provides a robust set of tools for comparing your data with itself or
with other data sets.

> Example
Checking if the variable `a` is equal to `147`.

from Standard.Base import all

example_equality =
    a = 7 * 21
    a == 147

> Example
Checking if the variable `a` is not equal to `147`.

from Standard.Base import all

example_inequality =
    a = 7 * 21
    a != 147

> Example
Checking if the variable `a` is greater than `147`.

from Standard.Base import all

example_greater =
    a = 7 * 28
    a > 147

> Example
Checking if the variable `a` is less than `147`.

from Standard.Base import all

example_less =
    a = 7 * 21
    a < 147
@@ -0,0 +1,34 @@
## Along with numbers and text, dates and times are probably the most common
type of data encountered whilst doing data analysis.

Enso provides a robust and capable suite of date and time operations,
allowing you to work with your data with ease.

> Example
Get the current time.

import Standard.Base.Data.Time

example_now = Time.now

> Example
Parse a UTC time from text.

import Standard.Base.Data.Time

example_parse = Time.parse "2020-10-01T04:11:12Z"

> Example
Convert a time instance to the -04:00 timezone.

import Standard.Base.Data.Time
import Standard.Base.Data.Time.Zone

example_at_zone = Time.new 2020 . at_zone (Zone.new -4)

> Example
Convert the current time to a date.

import Standard.Base.Data.Time

example_date = Time.now.date
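
As an illustration of what parsing a UTC timestamp and viewing it in a -04:00 zone amounts to, here is a minimal Python sketch using only the standard library. It is an analogue, not the Enso API, and the variable names are assumptions. It reuses the timestamp text from the parsing example above.

```python
from datetime import datetime, timedelta, timezone

# Parse the UTC timestamp. The trailing "Z" is rewritten as an explicit
# offset so that older Python versions accept it in fromisoformat.
parsed = datetime.fromisoformat("2020-10-01T04:11:12Z".replace("Z", "+00:00"))

# Express the same instant in a fixed -04:00 zone.
minus_four = timezone(timedelta(hours=-4))
shifted = parsed.astimezone(minus_four)

print(shifted.isoformat())
```

The conversion changes only the displayed offset. `shifted` and `parsed` still denote the same instant.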
@@ -0,0 +1,42 @@
## All the tools to work with data are useless without a way to get your data
into Enso.

This section contains a suite of common tools for loading your data into your
workflow, including the ability to read and write Excel files, CSVs, and
text. Beyond that, it also provides tools for creating your own data on the
fly, including text, numbers, and tables.

> Example
Read the active sheet of an XLSX from disk and convert it into a table.

import Standard.Table
import Standard.Examples

example_xlsx_to_table = Examples.xlsx.read_xlsx

> Example
Read a CSV from disk and convert it into a table.

import Standard.Table
import Standard.Examples

example_csv_to_table = Examples.csv.read_csv

> Example
Write a table to an XLSX file.

import Standard.Examples

example_to_xlsx =
    path = Enso_Project.data / "example_xlsx_output.xlsx"
    Examples.inventory_table.write_xlsx path

> Example
Write a table to a CSV file.

import Standard.Examples

example_to_csv =
    path = Enso_Project.data / "example_csv_output.csv"
    Examples.inventory_table.write_csv path

@@ -0,0 +1,26 @@
## It is common to have multiple sources of data that you want to combine to
gain insight. Joining is the process of combining them.

This section contains Enso's utilities for joining data, be it two sources or
multiple.

> Example
Join multiple tables together. It joins tables on their indices, so we need
to make sure the indices are correct.

import Standard.Examples
import Standard.Table

example_join =
    table_1 = Examples.inventory_table
    table_2 = Examples.popularity_table
    Table.join [table_1, table_2]

> Example
Join the popularity table and the inventory table to see the relative
popularities of the items in the shop inventory.

import Standard.Examples

example_join =
    Examples.inventory_table.join Examples.popularity_table
@@ -0,0 +1,25 @@
## Numerical data is _everywhere_. This makes it of the greatest importance that
any data analysis system provides a robust set of numerical operations.

Enso provides a comprehensive suite of tools to work with numbers, spanning
from the standard mathematical operations to some more advanced tools.

> Example
Create a range containing the numbers 0, 1, 2, 3, 4.

0.up_to 5

> Example
Calculate the smallest number out of 1 and 2.

Math.min 1 2

> Example
Calculate the largest number out of 1 and 2.

Math.max 1 2

> Example
Calculate the sine of 2.

2.sin
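
The numeric examples above map closely onto Python's standard library, which may help when checking expected results. This is a sketch of the analogous Python calls, not Enso code. It assumes that `0.up_to 5` excludes its upper bound (as the example's description states) and that the sine is taken in radians.

```python
import math

# Analogue of `0.up_to 5`: the upper bound is excluded.
numbers = list(range(0, 5))

# Analogues of `Math.min 1 2` and `Math.max 1 2`.
smallest = min(1, 2)
largest = max(1, 2)

# Analogue of `2.sin`, assuming the argument is in radians.
sine_of_two = math.sin(2)

print(numbers, smallest, largest, sine_of_two)
```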
@@ -0,0 +1,39 @@
## As nice as it would be to have your data all ready for analysis, this is
rarely the case. The process of getting it ready to provide insights is known
as data preparation.

This section of the library contains a curated selection of tools for
cleaning your data, from dealing with missing values to getting rid of
erroneous rows and columns.

> Example
Get the item name and price columns from the shop inventory.

import Standard.Examples

example_select =
    Examples.inventory_table.select ["item_name", "price"]

> Example
Remove any rows that contain missing values from the table.

import Standard.Examples

example_drop_missing_rows =
    Examples.inventory_table.drop_missing_rows

> Example
Remove any columns that contain missing values from the table.

import Standard.Examples

example_drop_missing_cols =
    Examples.inventory_table.drop_missing_columns

> Example
Fill missing values in a column with the value 20.5.

import Standard.Examples

example_fill_missing = Examples.decimal_column.fill_missing 20.5
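
To make the three missing-value operations above concrete, here is a small Python sketch of the behaviour their descriptions suggest. It is not the Enso implementation. The sample rows are invented, `None` stands in for a missing value, and the function names merely mirror the Enso method names for readability.

```python
# Hypothetical rows; None marks a missing value.
rows = [
    {"item_name": "ham", "price": 2.5},
    {"item_name": "eggs", "price": None},
    {"item_name": "cheese", "price": 1.0},
]


def drop_missing_rows(table):
    """Keep only the rows in which every value is present."""
    return [row for row in table if all(v is not None for v in row.values())]


def drop_missing_columns(table):
    """Keep only the columns in which every value is present."""
    complete = [k for k in table[0] if all(row[k] is not None for row in table)]
    return [{k: row[k] for k in complete} for row in table]


def fill_missing(column, default):
    """Replace each missing value in a column with the default."""
    return [default if v is None else v for v in column]


kept_rows = drop_missing_rows(rows)
kept_columns = drop_missing_columns(rows)
prices = fill_missing([row["price"] for row in rows], 20.5)
print(kept_rows, kept_columns, prices)
```

Dropping rows loses the "eggs" record, dropping columns loses the whole `price` column, and filling keeps every row at the cost of an invented value.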
@@ -0,0 +1,26 @@
## Textual data is one of the most common formats found during data analysis.
This makes it all the more important that you have robust tools to manipulate
the text as you need.

Enso's text representation is highly efficient and comes with a robust suite
of tools for processing the text in any way you can think of.

> Example
Split the text on whitespace into a vector of items.

"ham eggs cheese tomatoes".split Split_Kind.Whitespace

> Example
Getting the words in the sentence "I have not one, but two cats."

"I have not one, but two cats.".words

> Example
See if the text "Hello" contains the text "ell".

"Hello".contains "ell"

> Example
Replace each occurrence of "aa" in the text "aaa" with "b".

'aaa'.replace 'aa' 'b' == 'ba'
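
The last example relies on replacement being non-overlapping: only the first "aa" in "aaa" is matched, leaving one "a" behind. Python's string methods behave the same way, so the following sketch can serve as a quick sanity check. It is a Python analogue rather than Enso code, and the variable names are assumptions.

```python
# Analogue of splitting on whitespace.
items = "ham eggs cheese tomatoes".split()

# Analogue of `"Hello".contains "ell"`.
has_ell = "ell" in "Hello"

# Analogue of `'aaa'.replace 'aa' 'b'`: matches are found left to right
# and do not overlap, so a single "a" remains after the replacement.
replaced = "aaa".replace("aa", "b")

print(items, has_ell, replaced)
```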