Mario: Shell pipes in Python

Have you ever wanted to use Python functions directly in your Unix shell? Mario can read and write csv, json, and yaml; traverse trees; and even run xpath queries. It also supports async commands right out of the box. Build your own commands with a simple configuration file, and install plugins for even more!

Mario is the plumbing snake 🐍🔧 helping you build data pipelines in your shell 🐢.

What time is it in Sydney?

Features

  • Execute Python code in your shell.
  • Pass Python objects through multi-stage pipelines.
  • Read and write csv, json, yaml, toml, xml.
  • Run async functions natively.
  • Define your own commands in a simple configuration file or by writing Python code.
  • Install plugins to get more commands.
  • Enjoy high test coverage, continuous integration, and nightly releases.

Installation

Mario

Linux and macOS are supported now. Windows is not supported yet, but support is hopefully coming soon.

Get Mario with pip:

python3.7 -m pip install mario

If you're not inside a virtualenv, you might get a PermissionError. In that case, try using:

python3.7 -m pip install --user mario

or for more isolation, use pipx:

pipx install --python python3.7 mario

Mario addons

The mario-addons package provides a number of useful commands not found in the base collection.

Get Mario addons with pip:

python3.7 -m pip install mario-addons

If you're not inside a virtualenv, you might get a PermissionError. In that case, try using:

python3.7 -m pip install --user mario-addons

or for more isolation, use pipx:

pipx install --python python3.7 mario
pipx inject mario mario-addons

Quickstart

Basics

Invoke with mario at the command line.

$ mario eval 1+1
2

Given a csv like this:

$ cat <<EOF > hackers.csv
name,age
Alice,21
Bob,22
Carol,23
EOF

Use read-csv-dicts to read each row into a dict:

$ mario read-csv-dicts < hackers.csv
{'name': 'Alice', 'age': '21'}
{'name': 'Bob', 'age': '22'}
{'name': 'Carol', 'age': '23'}
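
For readers who want to see what this step amounts to in plain Python, here is a minimal sketch using the standard library's csv.DictReader. It is an illustration of the behaviour shown above, not Mario's own implementation; the names HACKERS_CSV and read_csv_dicts are hypothetical and exist only for this example.

```python
import csv
import io

# Same content as hackers.csv above, held in memory so the example is self-contained.
HACKERS_CSV = "name,age\nAlice,21\nBob,22\nCarol,23\n"


def read_csv_dicts(text):
    """Return one dict per data row, keyed by the header row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


rows = read_csv_dicts(HACKERS_CSV)
for row in rows:
    print(row)
```

Note that every value is a string, including the ages, which is why the later examples convert with int before doing arithmetic.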

Use map to act on each input item x:

$ mario read-csv-dicts map 'x["name"]' < hackers.csv
Alice
Bob
Carol

Chain Python functions together with !:

$ mario read-csv-dicts map 'x["name"] ! len' < hackers.csv
5
3
5

or by adding another command:

$ mario read-csv-dicts map 'x["name"]' map len < hackers.csv
5
3
5

Use x as a placeholder for the input at each stage:

$ mario read-csv-dicts map 'x["age"] ! int ! x*2'  < hackers.csv
42
44
46
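
One way to read the ! operator is as left-to-right function composition, where each stage receives the previous stage's result as x. The sketch below models that reading in plain Python. It is an assumption about the semantics based on the examples above, not Mario's actual parser, and the helper name pipeline is hypothetical.

```python
from functools import reduce


def pipeline(*stages):
    """Compose single-argument callables left to right."""
    def run(x):
        return reduce(lambda value, stage: stage(value), stages, x)
    return run


# Mirrors 'x["age"] ! int ! x*2': each stage sees the previous result as x.
double_age = pipeline(
    lambda x: x["age"],
    int,
    lambda x: x * 2,
)

rows = [
    {"name": "Alice", "age": "21"},
    {"name": "Bob", "age": "22"},
    {"name": "Carol", "age": "23"},
]
results = [double_age(row) for row in rows]
print(results)  # [42, 44, 46]
```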

Automatically import modules you need:

$ mario map 'collections.Counter ! dict' <<<mississippi
{'m': 1, 'i': 4, 's': 4, 'p': 2}

You don't need to explicitly call the function with some_function(x); just use the function's name, some_function. For example, instead of

$ mario map 'len(x)' <<EOF
a
bb
EOF
1
2

try

$ mario map len <<EOF
a
bb
EOF
1
2

More commands

Here are a few commands. See Command reference for the complete set, and get even more from mario-addons.

eval

Use eval to evaluate a Python expression.

$ mario eval 'datetime.datetime.utcnow()'
2019-01-01 01:23:45.562736

map

Use map to act on each input item.

$ mario map 'x * 2' <<EOF
a
bb
EOF
aa
bbbb

filter

Use filter to evaluate a condition on each line of input and exclude false values.

$ mario filter 'len(x) > 1' <<EOF
a
bb
ccc
EOF
bb
ccc

apply

Use apply to act on the whole sequence of items at once, rather than on each item separately.

$ mario apply 'len(x)' <<EOF
a
bb
EOF
2
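
The difference between map, filter, and apply may be easier to see side by side. The following plain-Python sketch reproduces the three examples above with list operations; it only illustrates the per-item versus whole-sequence distinction and makes no claim about how Mario implements these commands.

```python
# map: the expression runs once per item.
mapped = [x * 2 for x in ["a", "bb"]]

# filter: the condition runs once per item; items where it is false are dropped.
filtered = [x for x in ["a", "bb", "ccc"] if len(x) > 1]

# apply: the expression runs once, with x bound to the whole sequence.
applied = len(["a", "bb"])

print(mapped)    # ['aa', 'bbbb']
print(filtered)  # ['bb', 'ccc']
print(applied)   # 2
```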

chain

Use chain to flatten a list of lists into a single list, like itertools.chain.from_iterable.

For example, after generating several rows of items,

$ mario read-csv-tuples <<EOF
a,b,c
d,e,f
g,h,i
EOF
('a', 'b', 'c')
('d', 'e', 'f')
('g', 'h', 'i')

use chain to put each item on its own row:

$ mario read-csv-tuples chain <<EOF
a,b,c
d,e,f
g,h,i
EOF
a
b
c
d
e
f
g
h
i
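
Since chain is described as behaving like itertools.chain.from_iterable, the same flattening can be checked directly in Python. This is a sketch of the standard-library function on the rows shown above, not of Mario's internals.

```python
import itertools

# The three tuples produced by read-csv-tuples in the example above.
rows = [("a", "b", "c"), ("d", "e", "f"), ("g", "h", "i")]

# Flatten one level: each inner item becomes its own output item.
flat = list(itertools.chain.from_iterable(rows))
print(flat)  # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
```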

async-map

Making sequential requests is slow. These requests take 16 seconds to complete.

% time mario map 'await asks.get ! x.json()["url"]'  <<EOF
http://httpbin.org/delay/5
http://httpbin.org/delay/1
http://httpbin.org/delay/2
http://httpbin.org/delay/3
http://httpbin.org/delay/4
EOF
https://httpbin.org/delay/5
https://httpbin.org/delay/1
https://httpbin.org/delay/2
https://httpbin.org/delay/3
https://httpbin.org/delay/4
0.51s user
0.02s system
16.460 total

Concurrent requests can go much faster: the same requests now take only about 6 seconds, roughly the duration of the slowest one. Use async-map or async-filter, or use reduce with await some_async_function, to get concurrency out of the box.

% time mario async-map 'await asks.get ! x.json()["url"]'  <<EOF
http://httpbin.org/delay/5
http://httpbin.org/delay/1
http://httpbin.org/delay/2
http://httpbin.org/delay/3
http://httpbin.org/delay/4
EOF
https://httpbin.org/delay/5
https://httpbin.org/delay/1
https://httpbin.org/delay/2
https://httpbin.org/delay/3
https://httpbin.org/delay/4
0.49s user
0.03s system
5.720 total
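
The timing difference above follows from how the waits add up: sequential requests cost the sum of the delays, while concurrent requests cost roughly the longest single delay. The sketch below demonstrates this with the standard library's asyncio and simulated delays instead of real HTTP calls. It is not how Mario runs async commands; asyncio is swapped in here only to keep the example self-contained, the delays are scaled down by a factor of 50, and fake_request is a hypothetical stand-in for the network call.

```python
import asyncio
import time

# The delays from the example above (5, 1, 2, 3, 4 seconds), scaled down 50x.
DELAYS = [0.1, 0.02, 0.04, 0.06, 0.08]


async def fake_request(delay):
    """Hypothetical stand-in for an HTTP request that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return delay


async def sequential(delays):
    # Each request starts only after the previous one has finished.
    return [await fake_request(d) for d in delays]


async def concurrent(delays):
    # All requests start together; results come back in input order.
    return await asyncio.gather(*(fake_request(d) for d in delays))


def timed(coro_fn, delays):
    start = time.perf_counter()
    results = asyncio.run(coro_fn(delays))
    return list(results), time.perf_counter() - start


seq_results, seq_time = timed(sequential, DELAYS)
con_results, con_time = timed(concurrent, DELAYS)
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

The sequential run should take about the sum of the delays and the concurrent run about the largest one, which matches the 16-second versus 6-second measurements in proportion.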

Configuration

Define new commands and set default options. See Configuration reference for details.

Plugins

Add new commands like map and reduce by installing Mario plugins. You can try them out without installing by adding them to any .py file in your ~/.config/mario/modules/ directory.

Share popular commands by installing the mario-addons package.

Q & A

What's the status of this package?

  • This package is experimental and is subject to change without notice.
  • Check the issues page for open tickets.

Why another package?

A number of cool projects have pioneered the Python-in-shell space. I wrote Mario because I didn't know they existed at the time, but Mario now has a bunch of features the others don't (user configuration, multi-stage pipelines, async, plugins, etc.).