Add basic Arrow Flight docs
lidavidm committed Jun 26, 2019
1 parent a8ac27f commit f7631a2
Showing 9 changed files with 416 additions and 27 deletions.
9 changes: 9 additions & 0 deletions docs/source/conf.py
@@ -416,7 +416,16 @@
from unittest import mock
pyarrow.cuda = sys.modules['pyarrow.cuda'] = mock.Mock()

try:
    import pyarrow.flight
    flight_enabled = True
except ImportError:
    flight_enabled = False
    pyarrow.flight = sys.modules['pyarrow.flight'] = mock.Mock()


def setup(app):
    # Use a config value to indicate whether CUDA API docs can be generated.
    # This will also rebuild appropriately when the value changes.
    app.add_config_value('cuda_enabled', cuda_enabled, 'env')
    app.add_config_value('flight_enabled', flight_enabled, 'env')
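
The import-or-mock pattern in this hunk can be sketched in isolation. The following is a minimal illustration using only the standard library; the helper ``import_or_mock`` and the module name ``hypothetical_optional_ext`` are made up for this example and are not part of ``conf.py``. Unlike the hunk above, the sketch only registers the placeholder in ``sys.modules`` and does not also set it as an attribute on a parent package.

```python
import importlib
import sys
from unittest import mock


def import_or_mock(module_name):
    """Try to import module_name; on failure register a Mock in its place.

    Returns (module, enabled), where enabled is False if the real module
    could not be imported.
    """
    try:
        module = importlib.import_module(module_name)
        return module, True
    except ImportError:
        # Later "import module_name" statements will now find the Mock,
        # so tools that import the module (such as autodoc) do not fail.
        placeholder = mock.Mock()
        sys.modules[module_name] = placeholder
        return placeholder, False


# "hypothetical_optional_ext" is an invented name that should not exist.
module, enabled = import_or_mock("hypothetical_optional_ext")
```

The ``enabled`` flag plays the role of ``flight_enabled``: it can be handed to Sphinx as a config value so that pages can show a notice when the real module was unavailable.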
1 change: 1 addition & 0 deletions docs/source/cpp/api.rst
@@ -30,3 +30,4 @@ API Reference
api/table
api/utilities
api/cuda
api/flight
126 changes: 126 additions & 0 deletions docs/source/cpp/api/flight.rst
@@ -0,0 +1,126 @@
.. Licensed to the Apache Software Foundation (ASF) under one
.. or more contributor license agreements. See the NOTICE file
.. distributed with this work for additional information
.. regarding copyright ownership. The ASF licenses this file
.. to you under the Apache License, Version 2.0 (the
.. "License"); you may not use this file except in compliance
.. with the License. You may obtain a copy of the License at
.. http://www.apache.org/licenses/LICENSE-2.0
.. Unless required by applicable law or agreed to in writing,
.. software distributed under the License is distributed on an
.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
.. KIND, either express or implied. See the License for the
.. specific language governing permissions and limitations
.. under the License.

================
Arrow Flight RPC
================

.. warning:: Flight is currently unstable. APIs are subject to change,
             though we don't expect drastic changes.

.. warning:: Flight is currently only available when Arrow is built
             from source with Flight support enabled.

Common Types
============

.. doxygenstruct:: arrow::flight::Action
   :project: arrow_cpp
   :members:

.. doxygenstruct:: arrow::flight::ActionType
   :project: arrow_cpp
   :members:

.. doxygenstruct:: arrow::flight::Criteria
   :project: arrow_cpp
   :members:

.. doxygenstruct:: arrow::flight::FlightDescriptor
   :project: arrow_cpp
   :members:

.. doxygenstruct:: arrow::flight::FlightEndpoint
   :project: arrow_cpp
   :members:

.. doxygenclass:: arrow::flight::FlightInfo
   :project: arrow_cpp
   :members:

.. doxygenstruct:: arrow::flight::FlightPayload
   :project: arrow_cpp
   :members:

.. doxygenclass:: arrow::flight::FlightListing
   :project: arrow_cpp
   :members:

.. doxygenstruct:: arrow::flight::Location
   :project: arrow_cpp
   :members:

.. doxygenstruct:: arrow::flight::PutResult
   :project: arrow_cpp
   :members:

.. doxygenstruct:: arrow::flight::Result
   :project: arrow_cpp
   :members:

.. doxygenclass:: arrow::flight::ResultStream
   :project: arrow_cpp
   :members:

.. doxygenstruct:: arrow::flight::Ticket
   :project: arrow_cpp
   :members:

Clients
=======

.. doxygenclass:: arrow::flight::FlightClient
   :project: arrow_cpp
   :members:

.. doxygenclass:: arrow::flight::FlightCallOptions
   :project: arrow_cpp
   :members:

.. doxygenclass:: arrow::flight::ClientAuthHandler
   :project: arrow_cpp
   :members:

.. doxygentypedef:: arrow::flight::TimeoutDuration
   :project: arrow_cpp

Servers
=======

.. doxygenclass:: arrow::flight::FlightServerBase
   :project: arrow_cpp
   :members:

.. doxygenclass:: arrow::flight::FlightDataStream
   :project: arrow_cpp
   :members:

.. doxygenclass:: arrow::flight::FlightMessageReader
   :project: arrow_cpp
   :members:

.. doxygenclass:: arrow::flight::RecordBatchStream
   :project: arrow_cpp
   :members:

.. doxygenclass:: arrow::flight::ServerAuthHandler
   :project: arrow_cpp
   :members:

.. doxygenclass:: arrow::flight::ServerCallContext
   :project: arrow_cpp
   :members:
106 changes: 106 additions & 0 deletions docs/source/format/Flight.rst
@@ -0,0 +1,106 @@
.. Licensed to the Apache Software Foundation (ASF) under one
.. or more contributor license agreements. See the NOTICE file
.. distributed with this work for additional information
.. regarding copyright ownership. The ASF licenses this file
.. to you under the Apache License, Version 2.0 (the
.. "License"); you may not use this file except in compliance
.. with the License. You may obtain a copy of the License at
.. http://www.apache.org/licenses/LICENSE-2.0
.. Unless required by applicable law or agreed to in writing,
.. software distributed under the License is distributed on an
.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
.. KIND, either express or implied. See the License for the
.. specific language governing permissions and limitations
.. under the License.

Arrow Flight RPC
================

Arrow Flight is an RPC framework for high-performance data services
based on Arrow data, and is built on top of gRPC_ and the :doc:`IPC
format <IPC>`.

Flight is organized around streams of Arrow record batches, being
either downloaded from or uploaded to another service. A set of
metadata methods offers discovery and introspection of streams, as
well as the ability to implement application-specific methods.

Methods and message wire formats are defined by Protobuf, enabling
interoperability with clients that may support gRPC and Arrow
separately, but not Flight. However, Flight implementations include
further optimizations that reduce the overhead of using Protobuf,
mostly by avoiding excessive memory copies.

.. _gRPC: https://grpc.io/

RPC Methods
-----------

Flight defines a set of RPC methods for uploading/downloading data,
retrieving metadata about a data stream, listing available data
streams, and for implementing application-specific RPC methods. A
Flight service implements some subset of these methods, while a Flight
client can call any of these methods. Thus, one Flight client can
connect to any Flight service and perform basic operations.

Data streams are identified by descriptors, which are either a path or
an arbitrary binary command. A client that wishes to download the data
would:

#. Construct or acquire a ``FlightDescriptor`` for the data set they
   are interested in. A client may know what descriptor they want
   already, or they may use methods like ``ListFlights`` to discover
   them.
#. Call ``GetFlightInfo(FlightDescriptor)`` to get a ``FlightInfo``
   message containing details on where the data is located (as well as
   other metadata, like the schema and possibly an estimate of the
   dataset size).

   Flight does not require that data live on the same server as
   metadata: this call may list other servers to connect to. The
   ``FlightInfo`` message includes a ``Ticket``, an opaque binary
   token that the server uses to identify the exact data set being
   requested.
#. Connect to other servers (if needed).
#. Call ``DoGet(Ticket)`` to get back a stream of Arrow record
   batches.
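
The download steps above can be sketched with a toy in-memory model. This is not the Flight API: ``ToyServer``, ``Endpoint``, ``download`` and the sample data are hypothetical names invented for illustration, plain Python lists stand in for Arrow record batches, and a tuple of strings stands in for a path descriptor. The sketch assumes each piece of a data set is described by a ticket plus a list of server names, and that an empty list means the data is served by the server that returned the metadata.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Ticket:
    token: bytes  # opaque to the client; only the server interprets it


@dataclass(frozen=True)
class Endpoint:
    ticket: Ticket
    locations: tuple  # names of servers able to serve this ticket


@dataclass
class ToyServer:
    name: str
    infos: dict = field(default_factory=dict)    # descriptor -> list of Endpoint
    streams: dict = field(default_factory=dict)  # ticket token -> list of batches

    def get_flight_info(self, descriptor):
        """Stand-in for GetFlightInfo: say where the data lives."""
        return self.infos[descriptor]

    def do_get(self, ticket):
        """Stand-in for DoGet: stream the batches behind a ticket."""
        yield from self.streams[ticket.token]


def download(descriptor, metadata_server, servers_by_name):
    """Ask for metadata, then fetch every endpoint's stream."""
    batches = []
    for endpoint in metadata_server.get_flight_info(descriptor):
        # Connect to another server if one is listed (toy assumption:
        # no listed location means the metadata server has the data).
        if endpoint.locations:
            server = servers_by_name[endpoint.locations[0]]
        else:
            server = metadata_server
        batches.extend(server.do_get(endpoint.ticket))
    return batches


# Hypothetical setup: metadata on one server, data on another.
data_server = ToyServer("data-1", streams={b"t1": [[1, 2], [3]]})
meta_server = ToyServer(
    "meta",
    infos={("sales", "2019"): [Endpoint(Ticket(b"t1"), ("data-1",))]},
)
result = download(("sales", "2019"), meta_server, {"data-1": data_server})
```

The point of the sketch is the separation of concerns: the client never inspects the ticket, and the metadata call alone decides which server each stream is fetched from.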

To upload data, a client would:

#. Construct or acquire a ``FlightDescriptor``, as before.
#. Call ``DoPut(FlightData)`` and upload a stream of Arrow record
   batches. They would also include the ``FlightDescriptor`` with the
   first message.

See `Protocol Buffer Definitions`_ for full details on the methods and
messages involved.
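
The upload convention, where only the first message of the stream carries the descriptor, can be illustrated with a small toy model. The functions ``put_messages`` and ``receive_put`` are hypothetical names for this sketch and are not part of any Flight implementation; plain tuples stand in for ``FlightData`` messages and lists stand in for record batches.

```python
def put_messages(descriptor, batches):
    """Client side: yield (descriptor, batch) messages.

    Only the first message carries the descriptor; later ones carry None.
    """
    for index, batch in enumerate(batches):
        yield (descriptor if index == 0 else None, batch)


def receive_put(messages):
    """Server side: read the descriptor from the first message, then
    collect the batches from the whole stream."""
    descriptor = None
    batches = []
    for index, (message_descriptor, batch) in enumerate(messages):
        if index == 0:
            if message_descriptor is None:
                raise ValueError("first message must carry a descriptor")
            descriptor = message_descriptor
        batches.append(batch)
    return descriptor, batches


# Hypothetical path descriptor and two toy batches.
received = receive_put(put_messages(("uploads", "x"), [[1], [2, 3]]))
```

Sending the descriptor once keeps the remaining messages small, at the cost of the server having to treat the first message specially.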

Authentication
~~~~~~~~~~~~~~

Flight supports application-implemented authentication
methods. Authentication, if enabled, has two phases: at connection
time, the client and server can exchange any number of messages. Then,
the client can provide a token alongside each call, and the server can
validate that token.

Applications may use any part of this; for instance, they may ignore
the initial handshake and send an externally acquired token on each
call, or they may establish trust during the handshake and not
validate a token for each call. (Note that the latter is not secure if
you choose to deploy a layer 7 load balancer, as is common with gRPC,
since such a balancer may route individual calls over a connection
other than the one on which the handshake took place.)
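
The two phases can be sketched as follows. This is a toy model and not the Flight authentication API: ``ToyAuthServer`` and its methods are hypothetical, the handshake is reduced to a single username/password exchange (Flight allows any number of messages), and the token is simply a random string remembered by the server.

```python
import hmac
import secrets


class ToyAuthServer:
    def __init__(self, passwords):
        self._passwords = passwords  # user -> password
        self._tokens = {}            # issued token -> user

    def handshake(self, user, password):
        """Phase 1: exchange credentials once and hand back a token."""
        expected = self._passwords.get(user)
        if expected is None or not hmac.compare_digest(expected, password):
            raise PermissionError("handshake rejected")
        token = secrets.token_hex(16)
        self._tokens[token] = user
        return token

    def call(self, token, method):
        """Phase 2: validate the token that accompanies each call."""
        user = self._tokens.get(token)
        if user is None:
            raise PermissionError("invalid token")
        return f"{user} called {method}"


# Hypothetical user and password for illustration only.
server = ToyAuthServer({"alice": "s3cret"})
token = server.handshake("alice", "s3cret")
reply = server.call(token, "DoGet")
```

Because every call is validated on its own, this variant does not rely on a call arriving over the same connection as the handshake.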

External Resources
------------------

- https://arrow.apache.org/blog/2018/10/09/0.11.0-release/
- https://www.slideshare.net/JacquesNadeau5/apache-arrow-flight-overview

Protocol Buffer Definitions
---------------------------

.. literalinclude:: ../../../format/Flight.proto
   :language: protobuf
   :linenos:
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -43,6 +43,7 @@ such topics as:
format/Layout
format/Metadata
format/IPC
format/Flight

.. _toc.usage:

1 change: 1 addition & 0 deletions docs/source/python/api.rst
@@ -30,6 +30,7 @@ API Reference
api/files
api/tables
api/ipc
api/flight
api/formats
api/plasma
api/cuda
82 changes: 82 additions & 0 deletions docs/source/python/api/flight.rst
@@ -0,0 +1,82 @@
.. Licensed to the Apache Software Foundation (ASF) under one
.. or more contributor license agreements. See the NOTICE file
.. distributed with this work for additional information
.. regarding copyright ownership. The ASF licenses this file
.. to you under the Apache License, Version 2.0 (the
.. "License"); you may not use this file except in compliance
.. with the License. You may obtain a copy of the License at
.. http://www.apache.org/licenses/LICENSE-2.0
.. Unless required by applicable law or agreed to in writing,
.. software distributed under the License is distributed on an
.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
.. KIND, either express or implied. See the License for the
.. specific language governing permissions and limitations
.. under the License.

.. currentmodule:: pyarrow.flight

Arrow Flight
============

.. ifconfig:: not flight_enabled

   .. error::
      This documentation was built without Flight enabled. The Flight
      API docs are not available.

.. NOTE We still generate those API docs (with empty docstrings)
.. when Flight is disabled and `pyarrow.flight` mocked (see conf.py).
.. Otherwise we'd get autodoc warnings, see https://github.com/sphinx-doc/sphinx/issues/4770

.. warning:: Flight is currently unstable. APIs are subject to change,
             though we don't expect drastic changes.

.. warning:: Flight is not currently distributed as part of wheels or
             in Conda; it is only available when pyarrow is built
             from source with Flight support enabled.

Common Types
------------

.. autosummary::
   :toctree: ../generated/

   Action
   ActionType
   DescriptorType
   FlightDescriptor
   FlightEndpoint
   FlightInfo
   Location
   Ticket
   Result

Flight Client
-------------

.. autosummary::
   :toctree: ../generated/

   FlightCallOptions
   FlightClient

Flight Server
-------------

.. autosummary::
   :toctree: ../generated/

   FlightServerBase
   GeneratorStream
   RecordBatchStream

Authentication
--------------

.. autosummary::
   :toctree: ../generated/

   ClientAuthHandler
   ServerAuthHandler