[AD-649] Add missing documentation #73

Merged
merged 19 commits into from
May 20, 2022
Changes from 10 commits
246 changes: 15 additions & 231 deletions README.md

Large diffs are not rendered by default.

Binary file added src/markdown/images/odbc-data-source-admin.png
96 changes: 96 additions & 0 deletions src/markdown/index.md
@@ -0,0 +1,96 @@
# Amazon DocumentDB ODBC Driver Documentation

## Overview

The ODBC driver for the Amazon DocumentDB managed document database provides an
SQL-relational interface for developers and BI tool users.

## License

This project is licensed under the Apache-2.0 License.

## Architecture

The ODBC driver wraps the Amazon DocumentDB JDBC driver using JNI, which adds a translation layer between C++ objects and Java objects.
This is a two-tier approach: document scanning, metadata discovery, and SQL to MQL translation are performed using Java/JVM on the local machine, and the ODBC adapter communicates with the JVM through JNI. For performance reasons, a separate (C/C++) client driver connection is used to query and return results from the DocumentDB database.


```mermaid
graph LR
A(BI Tool) --> B(ODBC Driver Adapter)
subgraph Driver [ODBC Driver]
B --> C(JAVA Adapter)
B --> D(Native Adapter)
end
C --> E[(DocumentDB Server)]
D --> E
```

## Documentation

- Setup
    - [Amazon DocumentDB ODBC Driver Setup](setup/setup.md)
    - [DSN](setup/dsn-configuration.md)
- Managing Schema
    - [Schema Discovery and Generation](https://github.com/aws/amazon-documentdb-jdbc-driver/blob/develop/src/markdown/schema/schema-discovery.md)
    - [Managing Schema Using the Command Line Interface](https://github.com/aws/amazon-documentdb-jdbc-driver/blob/develop/src/markdown/schema/manage-schema-cli.md)
    - [Table Schemas JSON Format](https://github.com/aws/amazon-documentdb-jdbc-driver/blob/develop/src/markdown/schema/table-schemas-json-format.md)
- SQL Compatibility
    - [SQL Support and Limitations](https://github.com/aws/amazon-documentdb-jdbc-driver/blob/develop/src/markdown/sql/sql-limitations.md)
- Support
    - [Troubleshooting Guide](support/troubleshooting-guide.md)

## Getting Started

Follow the [requirements and setup directions](setup/setup.md) to get your environment ready to use the
Amazon DocumentDB ODBC driver. If your Amazon DocumentDB cluster is hosted in a private VPC,
you'll need to [create an SSH tunnel](setup/setup.md#using-an-ssh-tunnel-to-connect-to-amazon-documentdb) to reach
your cluster in the VPC. If you're a Tableau or other BI tool user, follow the directions on how to
[set up and use BI tools](setup/setup.md#driver-setup-in-bi-applications) with the driver.

## Setup and Usage

To set up and use the DocumentDB ODBC driver, see [Amazon DocumentDB ODBC Driver Setup](setup/setup.md).

## Connection String Syntax

```
DRIVER={Amazon DocumentDB};[HOSTNAME=<host>:<port>];[DATABASE=<database>];[USER=<user>];[PASSWORD=<password>][;<option>=<value>[;<option>=<value>[...]]];
```
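As a minimal sketch of the syntax above, the following Python helper assembles a connection string from its parts. The function name `build_connection_string` and its parameters are hypothetical and are not part of the driver; only the `DRIVER`, `HOSTNAME`, `DATABASE`, `USER`, and `PASSWORD` keys and the `<option>=<value>` form come from the syntax shown here.

```python
def build_connection_string(hostname, database, port=None, user=None,
                            password=None, options=None):
    """Assemble a connection string following the documented syntax.

    Hypothetical helper for illustration; it is not shipped with the driver.
    """
    # The port, when given, is appended to the host as <host>:<port>.
    host = f"{hostname}:{port}" if port is not None else hostname
    parts = [
        "DRIVER={Amazon DocumentDB}",
        f"HOSTNAME={host}",
        f"DATABASE={database}",
    ]
    if user is not None:
        parts.append(f"USER={user}")
    if password is not None:
        parts.append(f"PASSWORD={password}")
    # Any further settings are appended as <option>=<value> pairs.
    for name, value in (options or {}).items():
        parts.append(f"{name}={value}")
    return ";".join(parts) + ";"


if __name__ == "__main__":
    print(build_connection_string("localhost", "customer", port=27017,
                                  options={"TLS": "true"}))
```

The helper does no validation or escaping; it only shows how the pieces of the syntax fit together.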

For more information about connecting to an Amazon DocumentDB database using this ODBC driver, see
the [DSN configuration](setup/dsn-configuration.md) documentation.

## Schema Discovery

The Amazon DocumentDB ODBC driver can perform automatic schema discovery and generate an SQL to
DocumentDB schema mapping. See the [schema discovery documentation](https://github.com/aws/amazon-documentdb-jdbc-driver/blob/develop/src/markdown/schema/schema-discovery.md)
for more details of this process.

## Schema Management

The SQL to DocumentDB schema mapping can be managed using the JDBC driver's command line interface. A schema mapping can be:

- generated
- removed
- listed
- exported
- imported

See the [schema management documentation](https://github.com/aws/amazon-documentdb-jdbc-driver/blob/develop/src/markdown/schema/manage-schema-cli.md) and
[table schemas JSON format](https://github.com/aws/amazon-documentdb-jdbc-driver/blob/develop/src/markdown/schema/table-schemas-json-format.md) for further
information.

**Note**: A common schema management task is to regenerate or clear an existing schema that has
become out of date because your database has changed, for example, when there are new collections or new
fields in an existing collection. To regenerate or clear the existing schema, refer to the
[Schema Out of Date](support/troubleshooting-guide.md#schema-out-of-date) topic in the troubleshooting guide.

## SQL and ODBC Limitations

The Amazon DocumentDB ODBC driver has a number of important limitations. See the
[SQL limitations documentation](https://github.com/aws/amazon-documentdb-jdbc-driver/blob/develop/src/markdown/sql/sql-limitations.md).

## Troubleshooting Guide

If you're having an issue using the Amazon DocumentDB ODBC driver, consult the
[Troubleshooting Guide](support/troubleshooting-guide.md) to see if it has a solution for
your issue.
100 changes: 100 additions & 0 deletions src/markdown/setup/connection-string.md
@@ -0,0 +1,100 @@
# Connection String Syntax and Options
`DRIVER={Amazon DocumentDB};HOSTNAME=<host>:<port>;DATABASE=<database>;USER=<user>;PASSWORD=<password>;<option>=<value>;`

### Driver
`DRIVER` (required): the name of this ODBC driver, `{Amazon DocumentDB}`, as shown in the connection string above.

### Parameters
| Property | Description | Default |
|--------|-------------|---------------|
| `DATABASE` (required) | The name of the database the ODBC driver will connect to. | `NONE`
| `HOSTNAME` (required) | The hostname or IP address of the DocumentDB server or cluster. | `NONE`
| `PORT` (optional) | The port number the DocumentDB server or cluster is listening on. | `27017`
| `USER` (optional) | The username of the authorized user. While the username is optional on the connection string, it is still required either via the connection string, or the properties. _Note: the username must be properly (%) encoded to avoid any confusion with URI special characters._ | `NONE`
| `PASSWORD` (optional) | The password of the authorized user. While the password is optional on the connection string, it is still required either via the connection string, or the properties. _Note: the password must be properly (%) encoded to avoid any confusion with URI special characters._ | `NONE`
| `OPTION` (optional) | One of the connection string options listed below. | `NONE`
| `VALUE` (optional) | The associated value for the option. | `NONE`
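
The `USER` and `PASSWORD` rows above ask for percent (%) encoding of special characters. A minimal sketch of that encoding, assuming standard URI percent-encoding is what is meant, can be written with Python's standard library. The helper name `encode_credential` is hypothetical.

```python
from urllib.parse import quote


def encode_credential(value):
    """Percent-encode a user name or password.

    Hypothetical helper; assumes standard URI percent-encoding.
    safe="" makes quote() encode reserved characters such as "/" too.
    """
    return quote(value, safe="")


if __name__ == "__main__":
    print(encode_credential("p@ss;word"))
```

Letters, digits, and the characters `_`, `.`, `-`, and `~` are left unchanged; everything else is replaced by a `%XX` escape.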

### Options
| Option | Description | Default |
|--------|-------------|---------------|
| `APP_NAME` | (string) Sets the logical name of the application. | `Amazon DocumentDB ODBC Driver {version}`
| `LOGIN_TIMEOUTSEC` | (int) How long a connection can take to be opened before timing out (in seconds). Equivalent to `connectTimeoutMS`, but expressed in seconds. | `NONE`
| `READ_PREFERENCE` | (enum/string) The read preference for this connection. Allowed values: `primary`, `primaryPreferred`, `secondary`, `secondaryPreferred` or `nearest`. | `primary`
| `REPLICA_SET` | (string) Name of replica set to connect to. For now, passing a name other than `rs0` will log a warning. | `NONE`
| `RETRY_READS` | (true/false) If true, the driver will retry supported read operations if they fail due to a network error. | `true`
| `TLS` | (true/false) If true, use TLS encryption when communicating with the DocumentDB server. | `true`
| `TLS_ALLOWINVALID_HOSTNAMES` | (true/false) If true, invalid host names for the TLS certificate are allowed. This is useful when using an internal SSH tunnel to a DocumentDB server. | `false`
| `TLS_CA_FILE` | (string) The path to the trusted Certificate Authority (CA) `.pem` file. If the path starts with the tilde character (`~`), it will be replaced with the user's home directory. Use only forward slash characters (`/`) in the path, or URL-encode the path. Providing the trusted Certificate Authority (CA) `.pem` file is optional, as the current Amazon RDS root CA is used by default when the `TLS` option is set to `true`. This embedded certificate is set to expire on 2024-08-22. For example, to provide a new trusted Certificate Authority (CA) `.pem` file that is located in the `Downloads` subdirectory of the current user's home directory, use the following: `TLS_CA_FILE=~/Downloads/rds-ca-2019-root.pem`. | `NONE`
| `SSH_USER` | (string) The username for the internal SSH tunnel. If provided, options `SSH_HOST` and `SSH_PRIVATE_KEYFILE` must also be provided, otherwise this option is ignored. | `NONE`
| `SSH_HOST` | (string) The host name for the internal SSH tunnel. Optionally the SSH tunnel port number can be provided using the syntax `<ssh-host>:<port>`. The default port is `22`. If provided, options `SSH_USER` and `SSH_PRIVATE_KEYFILE` must also be provided, otherwise this option is ignored. | `NONE`
| `SSH_PRIVATE_KEYFILE` | (string) The path to the private key file for the internal SSH tunnel. If the path starts with the tilde character (`~`), it will be replaced with the user's home directory. If the path is relative, the driver tries to resolve it to an absolute path by searching in the user's home directory (`~`), the `.documentdb` folder under the user's home directory, or the same directory as the driver JAR file. If the file cannot be found, a connection error will occur. If provided, options `SSH_USER` and `SSH_HOST` must also be provided, otherwise this option is ignored. | `NONE`
| `SSH_PRIVATE_KEY_PASSPHRASE` | (string) If the SSH tunnel private key file, `SSH_PRIVATE_KEYFILE`, is passphrase protected, provide the passphrase using this option. If provided, options `SSH_USER`, `SSH_HOST` and `SSH_PRIVATE_KEYFILE` must also be provided, otherwise this option is ignored. | `NONE`
| `SSH_STRICT_HOST_KEY_CHECKING` | (true/false) If true, the `known_hosts` file is checked to ensure the target host is trusted when creating the internal SSH tunnel. If false, the target host is not checked. Disabling this option is less secure as it can lead to a ["man-in-the-middle" attack](https://en.wikipedia.org/wiki/Man-in-the-middle_attack). If provided, options `SSH_USER`, `SSH_HOST` and `SSH_PRIVATE_KEYFILE` must also be provided, otherwise this option is ignored. | `true`
| `SSHKNOWNHOSTFILE` | (string) The path to the `known_hosts` file used for checking the target host for the SSH tunnel when option `SSH_STRICT_HOST_KEY_CHECKING` is `true`. The `known_hosts` file can be populated using the `ssh-keyscan` [tool](maintain_known_hosts.md). If provided, options `SSH_USER`, `SSH_HOST` and `SSH_PRIVATE_KEYFILE` must also be provided, otherwise this option is ignored. | `~/.ssh/known_hosts`
| `SCAN_METHOD` | (enum/string) The scanning (sampling) method to use when discovering collection metadata for determining table schema. Possible values include the following: 1) `random` - the sample documents are returned in _random_ order, 2) `idForward` - the sample documents are returned in order of id, 3) `idReverse` - the sample documents are returned in reverse order of id or 4) `all` - sample all the documents in the collection. | `random`
| `SCAN_LIMIT` | (int) The number of documents to sample. The value must be a positive integer. If `SCAN_METHOD` is set to `all`, this option is ignored. | `1000`
| `SCHEMA_NAME` | (string) The name of the SQL mapping schema for the database. | `_default`
| `DEFAULT_FETCH_SIZE` | (int) The default fetch size (in records) when retrieving results from Amazon DocumentDB. It is the number of records to retrieve in a single batch. The maximum number of records retrieved in a single batch may also be limited by the overall memory size of the result. | `2000`
| `REFRESH_SCHEMA` | (true/false) If true, generates (refreshes) the SQL schema with each connection. It creates a new version, leaving any existing versions in place. _Caution: use only when necessary to update schema as it can adversely affect performance._ | `false`
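
Several SSH options in the table above are ignored unless the user, host, and private key file options are all provided. The following sketch shows one way an application could detect that situation before connecting. It assumes the option names exactly as listed in the table, and the function name `ignored_ssh_options` is hypothetical; it is not part of the driver.

```python
# Options that must all be present for the internal SSH tunnel to be used
# (names taken from the Options table; treat them as assumptions).
SSH_REQUIRED = ("SSH_USER", "SSH_HOST", "SSH_PRIVATE_KEYFILE")

# Options that only matter when the three required options are present.
SSH_DEPENDENT = (
    "SSH_PRIVATE_KEY_PASSPHRASE",
    "SSH_STRICT_HOST_KEY_CHECKING",
    "SSHKNOWNHOSTFILE",
)


def ignored_ssh_options(options):
    """Return the SSH options in `options` that would be ignored.

    `options` is a dict of option name to value. If any required SSH
    option is missing or empty, every SSH option supplied is reported.
    """
    complete = all(options.get(name) for name in SSH_REQUIRED)
    if complete:
        return []
    ssh_names = SSH_REQUIRED + SSH_DEPENDENT
    return sorted(name for name in options if name in ssh_names)


if __name__ == "__main__":
    print(ignored_ssh_options({"SSH_USER": "ec2-user", "TLS": "true"}))
```

An empty result means the SSH settings are either complete or absent; a non-empty result lists the options the table describes as ignored.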

## Examples

### Connecting to an Amazon DocumentDB Cluster

```
DRIVER={Amazon DocumentDB};HOSTNAME=localhost;DATABASE=customer;TLS_ALLOWINVALID_HOSTNAMES=true;
```

#### Notes:

1. An external [SSH tunnel](setup.md#using-an-ssh-tunnel-to-connect-to-amazon-documentdb) is being used where the local
port is `27017` (`27017` is default).
2. The Amazon DocumentDB database name is `customer`.
3. The Amazon DocumentDB is TLS-enabled (`tls=true` is default).
4. User and password values are passed to the ODBC driver using **Properties**.
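
To make the structure of such a connection string concrete, here is a minimal sketch of splitting it into key and value pairs. The function name `parse_connection_string` is hypothetical, and the sketch assumes that no value contains a semicolon; it does not reproduce the driver's own parsing.

```python
def parse_connection_string(text):
    """Split 'KEY=value;KEY=value;' into a dict with upper-cased keys.

    Hypothetical helper for illustration. Values are kept verbatim,
    including the braces around the driver name.
    """
    result = {}
    for part in text.split(";"):
        part = part.strip()
        if not part:
            continue  # skip the empty piece after a trailing semicolon
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in {part!r}")
        result[key.strip().upper()] = value.strip()
    return result


if __name__ == "__main__":
    parsed = parse_connection_string(
        "DRIVER={Amazon DocumentDB};HOSTNAME=localhost;DATABASE=customer;"
    )
    print(parsed["HOSTNAME"], parsed["DATABASE"])
```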

### Connecting to an Amazon DocumentDB Cluster on Non-Default Port

```
DRIVER={Amazon DocumentDB};HOSTNAME=localhost:27117;DATABASE=customer;TLS_ALLOWINVALID_HOSTNAMES=true;
```

#### Notes:

1. An external [SSH tunnel](setup.md#using-an-ssh-tunnel-to-connect-to-amazon-documentdb) is being used where the local
port is `27117`.
2. The Amazon DocumentDB database name is `customer`.
3. The Amazon DocumentDB is TLS-enabled (`tls=true` is default).
4. User and password values are passed to the ODBC driver using **Properties**.

### Connecting to an Amazon DocumentDB Cluster using an Internal SSH tunnel

```
DRIVER={Amazon DocumentDB};HOSTNAME=docdb-production.docdb.amazonaws.com;DATABASE=customer;TLS_ALLOWINVALID_HOSTNAMES=true;SSH_USER=ec2-user;SSH_HOST=ec2-254-254-254-254.compute.amazonaws.com;SSH_PRIVATE_KEYFILE=~/.ssh/ec2-privkey.pem
```

#### Notes:

1. DocumentDB cluster host is `docdb-production.docdb.amazonaws.com` (using default port `27017`).
2. The Amazon DocumentDB database name is `customer`.
3. The Amazon DocumentDB is TLS-enabled (`tls=true` is default).
4. An internal SSH tunnel will be created using the user `ec2-user`,
host `ec2-254-254-254-254.compute.amazonaws.com`, and private key file `~/.ssh/ec2-privkey.pem`.
5. User and password values are passed to the ODBC driver using **Properties**.

### Change the Scanning Method when Connecting to an Amazon DocumentDB Cluster

```
DRIVER={Amazon DocumentDB};HOSTNAME=localhost:27017;DATABASE=customer;TLS_ALLOWINVALID_HOSTNAMES=true;SCAN_METHOD=idForward;SCAN_LIMIT=5000
```

#### Notes:

1. An external [SSH tunnel](setup.md#using-an-ssh-tunnel-to-connect-to-amazon-documentdb) is being used where the
local port is `27017` (`27017` is default).
2. The Amazon DocumentDB database name is `customer`.
3. The Amazon DocumentDB is TLS-enabled (`tls=true` is default).
4. User and password values are passed to the ODBC driver using **Properties**.
5. The scan method `idForward` will order the result using the `_id` column in the collection.
6. The scan limit `5000` will limit the number of scanned documents to 5000.
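
The effect of the scan method and scan limit can be illustrated on an in-memory list of documents. This is only a sketch of the documented semantics (`random`, `idForward`, `idReverse`, `all`), not the way the driver samples a DocumentDB collection; the function name `sample_documents` and its parameters are hypothetical.

```python
import random


def sample_documents(documents, scan_method="random", scan_limit=1000, seed=None):
    """Pick the documents that would be used for schema discovery.

    Illustrative only: works on a list of dicts that each have an "_id" key.
    """
    if scan_limit <= 0:
        raise ValueError("scan_limit must be a positive integer")
    docs = list(documents)
    if scan_method == "all":
        return docs  # the limit is ignored when every document is sampled
    if scan_method == "idForward":
        return sorted(docs, key=lambda d: d["_id"])[:scan_limit]
    if scan_method == "idReverse":
        return sorted(docs, key=lambda d: d["_id"], reverse=True)[:scan_limit]
    if scan_method == "random":
        rng = random.Random(seed)
        return rng.sample(docs, min(scan_limit, len(docs)))
    raise ValueError(f"unknown scan method: {scan_method!r}")


if __name__ == "__main__":
    docs = [{"_id": 3}, {"_id": 1}, {"_id": 2}]
    print(sample_documents(docs, scan_method="idForward", scan_limit=2))
```

With `idForward` and a limit of 2, the two documents with the lowest `_id` values are selected, which mirrors notes 5 and 6 above on a small scale.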