🎉 Source Google Analytics v4: Declare oauth parameters in google sources #6414

Merged: 9 commits, Oct 8, 2021
2 changes: 2 additions & 0 deletions .github/workflows/publish-command.yml
@@ -106,6 +106,8 @@ jobs:
GH_NATIVE_INTEGRATION_TEST_CREDS: ${{ secrets.GH_NATIVE_INTEGRATION_TEST_CREDS }}
GOOGLE_ADS_TEST_CREDS: ${{ secrets.GOOGLE_ADS_TEST_CREDS }}
GOOGLE_ANALYTICS_V4_TEST_CREDS: ${{ secrets.GOOGLE_ANALYTICS_V4_TEST_CREDS }}
GOOGLE_ANALYTICS_V4_TEST_CREDS_SRV_ACC: ${{ secrets.GOOGLE_ANALYTICS_V4_TEST_CREDS_SRV_ACC }}
GOOGLE_ANALYTICS_V4_TEST_CREDS_OLD: ${{ secrets.GOOGLE_ANALYTICS_V4_TEST_CREDS_OLD }}
GOOGLE_CLOUD_STORAGE_TEST_CREDS: ${{ secrets.GOOGLE_CLOUD_STORAGE_TEST_CREDS }}
GOOGLE_DIRECTORY_TEST_CREDS: ${{ secrets.GOOGLE_DIRECTORY_TEST_CREDS }}
GOOGLE_SEARCH_CONSOLE_CDK_TEST_CREDS: ${{ secrets.GOOGLE_SEARCH_CONSOLE_CDK_TEST_CREDS }}
2 changes: 2 additions & 0 deletions .github/workflows/test-command.yml
@@ -101,6 +101,8 @@ jobs:
GH_NATIVE_INTEGRATION_TEST_CREDS: ${{ secrets.GH_NATIVE_INTEGRATION_TEST_CREDS }}
GOOGLE_ADS_TEST_CREDS: ${{ secrets.GOOGLE_ADS_TEST_CREDS }}
GOOGLE_ANALYTICS_V4_TEST_CREDS: ${{ secrets.GOOGLE_ANALYTICS_V4_TEST_CREDS }}
GOOGLE_ANALYTICS_V4_TEST_CREDS_SRV_ACC: ${{ secrets.GOOGLE_ANALYTICS_V4_TEST_CREDS_SRV_ACC }}
GOOGLE_ANALYTICS_V4_TEST_CREDS_OLD: ${{ secrets.GOOGLE_ANALYTICS_V4_TEST_CREDS_OLD }}
GOOGLE_CLOUD_STORAGE_TEST_CREDS: ${{ secrets.GOOGLE_CLOUD_STORAGE_TEST_CREDS }}
GOOGLE_DIRECTORY_TEST_CREDS: ${{ secrets.GOOGLE_DIRECTORY_TEST_CREDS }}
GOOGLE_SEARCH_CONSOLE_CDK_TEST_CREDS: ${{ secrets.GOOGLE_SEARCH_CONSOLE_CDK_TEST_CREDS }}
@@ -2,7 +2,7 @@
"sourceDefinitionId": "eff3616a-f9c3-11eb-9a03-0242ac130003",
"name": "Google Analytics v4",
"dockerRepository": "airbyte/source-google-analytics-v4",
"dockerImageTag": "0.1.3",
"dockerImageTag": "0.1.7",
"documentationUrl": "https://docs.airbyte.io/integrations/sources/source-google-analytics-v4",
"icon": "google-analytics.svg"
}
@@ -140,7 +140,7 @@
- sourceDefinitionId: eff3616a-f9c3-11eb-9a03-0242ac130003
name: Google Analytics v4
dockerRepository: airbyte/source-google-analytics-v4
dockerImageTag: 0.1.6
dockerImageTag: 0.1.7
documentationUrl: https://docs.airbyte.io/integrations/sources/source-google-analytics-v4
icon: google-analytics.svg
sourceType: api
@@ -12,5 +12,5 @@ RUN pip install .
ENV AIRBYTE_ENTRYPOINT "python /airbyte/integration_code/main.py"
ENTRYPOINT ["python", "/airbyte/integration_code/main.py"]

LABEL io.airbyte.version=0.1.6
LABEL io.airbyte.version=0.1.7
LABEL io.airbyte.name=airbyte/source-google-analytics-v4
@@ -7,18 +7,22 @@ tests:
connection:
- config_path: "secrets/config.json"
status: "succeed"
- config_path: "secrets/service_config.json"
status: "succeed"
- config_path: "secrets/old_config.json"
status: "succeed"
- config_path: "integration_tests/invalid_config.json"
status: "failed"
discovery:
- config_path: "secrets/config.json"
- config_path: "secrets/service_config.json"
basic_read:
- config_path: "secrets/config.json"
- config_path: "secrets/service_config.json"
configured_catalog_path: "integration_tests/configured_catalog.json"
empty_streams: []
incremental:
- config_path: "secrets/config.json"
- config_path: "secrets/service_config.json"
configured_catalog_path: "integration_tests/configured_catalog.json"
future_state_path: "integration_tests/abnormal_state.json"
full_refresh:
- config_path: "secrets/config.json"
- config_path: "secrets/service_config.json"
configured_catalog_path: "integration_tests/configured_catalog.json"
@@ -1,5 +1,8 @@
{
"credentials": { "credentials_json": "" },
"credentials": {
"auth_type": "Service",
"credentials_json": "None"
},
"view_id": "211669975",
"start_date": "2021-02-11",
"window_in_days": 1,
@@ -383,55 +383,29 @@ def get_updated_state(self, current_stream_state: MutableMapping[str, Any], late
return {self.cursor_field: max(latest_record.get(self.cursor_field, ""), current_stream_state.get(self.cursor_field, ""))}


class GoogleAnalyticsOauth2Authenticator(Oauth2Authenticator):
"""
This class supports either default authorization_code and JWT OAuth
authorizations in case of service account.

Request example for API token extraction:
class GoogleAnalyticsServiceOauth2Authenticator(Oauth2Authenticator):
"""Request example for API token extraction:
curl --location --request POST
https://oauth2.googleapis.com/token?grant_type=urn:ietf:params:oauth:grant-type:jwt-bearer&assertion=signed_JWT
"""

use_jwt_auth: bool = False

def __init__(self, config):
client_secret, client_id, refresh_token = None, None, None
if "credentials_json" in config:
# Backward compatability with previous config format. Use
# credentials_json from config root.
auth = config
else:
auth = config["credentials"]
if "credentials_json" in auth:
# Service account JWT authorization
self.use_jwt_auth = True
credentials_json = json.loads(auth["credentials_json"])
client_secret, client_id, refresh_token = credentials_json["private_key"], credentials_json["private_key_id"], None
self.client_email = credentials_json["client_email"]
else:
# OAuth 2.0 authorization_code authorization
client_secret, client_id, refresh_token = auth["client_secret"], auth["client_id"], auth["refresh_token"]
self.credentials_json = json.loads(config["credentials_json"])
self.client_email = self.credentials_json["client_email"]
self.scope = "https://www.googleapis.com/auth/analytics.readonly"

super().__init__(
token_refresh_endpoint="https://oauth2.googleapis.com/token",
client_secret=client_secret,
client_id=client_id,
refresh_token=refresh_token,
scopes=[self.scope],
client_secret=self.credentials_json["private_key"],
client_id=self.credentials_json["private_key_id"],
refresh_token=None,
)

def refresh_access_token(self) -> Tuple[str, int]:
"""
Calling the Google OAuth 2.0 token endpoint. Used for authorizing
with signed JWT if credentials_json provided by config. Otherwise use
default OAuth2.0 workflow.
:return tuple with access token and token's time-to-live.
Calling the Google OAuth 2.0 token endpoint. Used for authorizing signed JWT.
Returns tuple with access token and token's time-to-live
"""
if not self.use_jwt_auth:
return super().refresh_access_token()

response_json = None
try:
response = requests.request(method="POST", url=self.token_refresh_endpoint, params=self.get_refresh_request_params())
@@ -452,7 +426,6 @@ def refresh_access_token(self) -> Tuple[str, int]:
def get_refresh_request_params(self) -> Mapping[str, any]:
"""
Sign the JWT with RSA-256 using the private key found in service account JSON file.
Not used with default OAuth2.0 authorization_code grant_type.
"""
token_lifetime = 3600 # token lifetime is 1 hour

@@ -469,17 +442,35 @@ def get_refresh_request_params(self) -> Mapping[str, any]:
}
headers = {"kid": self.client_id}
signed_jwt = jwt.encode(payload, self.client_secret, headers=headers, algorithm="RS256")
return {"grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer", "assertion": signed_jwt}
return {"grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer", "assertion": str(signed_jwt)}
Contributor

Since line 432 was removed, this no longer seems to fall back to the method from super() in non-JWT cases (i.e. when not using the service account JSON).

In those cases, it seems the grant_type should simply be refresh_token.

Does this make the connector support only the service account / JWT flow?

Contributor Author

The get_authenticator method contains the logic for choosing the authorization type. Both client and service authorization are supported.
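
As a rough illustration of the exchange above (a sketch, not the connector's code): the service account flow posts a `jwt-bearer` grant, while the client flow posts a `refresh_token` grant. The helper names `auth_kind` and `token_request_params` and the `<signed JWT>` placeholder are hypothetical. The refresh-token parameter names follow the standard OAuth 2.0 refresh request and are assumed to match what the CDK's `Oauth2Authenticator` sends.

```python
from typing import Any, Dict, Mapping

JWT_BEARER = "urn:ietf:params:oauth:grant-type:jwt-bearer"


def auth_kind(config: Mapping[str, Any]) -> str:
    """Return "Service" or "Client" without mutating the config (hypothetical helper)."""
    # Legacy layout: credentials_json sits at the top level of the config.
    if config.get("credentials_json"):
        return "Service"
    credentials = config.get("credentials") or {}
    # New layout: the auth_type constant from the spec selects the flow.
    return "Service" if credentials.get("auth_type") == "Service" else "Client"


def token_request_params(config: Mapping[str, Any], signed_jwt: str = "<signed JWT>") -> Dict[str, str]:
    """Illustrative token-endpoint parameters for each flow."""
    if auth_kind(config) == "Service":
        # Service account: the signed JWT is the assertion; no refresh token is involved.
        return {"grant_type": JWT_BEARER, "assertion": signed_jwt}
    credentials = config["credentials"]
    # Client (OAuth) flow: exchange the stored refresh token for an access token.
    return {
        "grant_type": "refresh_token",
        "client_id": credentials["client_id"],
        "client_secret": credentials["client_secret"],
        "refresh_token": credentials["refresh_token"],
    }
```

Unlike the `pop("auth_type")` call in the diff, this sketch reads `auth_type` with `get`, so the config passed in is left unchanged.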



class SourceGoogleAnalyticsV4(AbstractSource):
"""Google Analytics lets you analyze data about customer engagement with your website or application."""

@staticmethod
def get_authenticator(config):
# backwards compatibility, credentials_json used to be in the top level of the connector
if config.get("credentials_json"):
Contributor

Suggested change
if config.get("credentials_json"):
# backwards compatibility, credentials_json used to be in the top level of the connector
if config.get("credentials_json"):

return GoogleAnalyticsServiceOauth2Authenticator(config)

auth_params = config.get("credentials")
if auth_params.pop("auth_type") == "Service":
return GoogleAnalyticsServiceOauth2Authenticator(auth_params)
else:
return Oauth2Authenticator(
token_refresh_endpoint="https://oauth2.googleapis.com/token",
client_secret=auth_params.get("client_secret"),
client_id=auth_params.get("client_id"),
refresh_token=auth_params.get("refresh_token"),
scopes=["https://www.googleapis.com/auth/analytics.readonly"],
)

def check_connection(self, logger, config) -> Tuple[bool, any]:
try:
url = f"{GoogleAnalyticsV4TypesList.url_base}"

authenticator = GoogleAnalyticsOauth2Authenticator(config)
authenticator = self.get_authenticator(config)

session = requests.get(url, headers=authenticator.get_auth_header())
session.raise_for_status()
@@ -496,7 +487,7 @@ def check_connection(self, logger, config) -> Tuple[bool, any]:
def streams(self, config: Mapping[str, Any]) -> List[Stream]:
streams: List[GoogleAnalyticsV4Stream] = []

authenticator = GoogleAnalyticsOauth2Authenticator(config)
authenticator = self.get_authenticator(config)

config["authenticator"] = authenticator

@@ -4,50 +4,9 @@
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "Google Analytics V4 Spec",
"type": "object",
"required": ["credentials", "view_id", "start_date"],
"required": ["view_id", "start_date"],
Contributor

Why remove credentials from the required list?

Contributor Author

The old configuration format does not contain a credentials field, so running the connector with it fails with the following error: Exception: Config validation error: 'credentials' is a required property

Contributor

Ah, I see, good point. Thanks for the clarification.
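
To make the point in this thread concrete, here is a minimal sketch of why a legacy config trips the old `required` list. It only checks top-level keys and is not real JSON Schema validation; `missing_required` is a hypothetical helper and the `credentials_json` value is a placeholder.

```python
from typing import Any, List, Mapping


def missing_required(config: Mapping[str, Any], required: List[str]) -> List[str]:
    """Return the required top-level keys that the config lacks (hypothetical helper)."""
    return [key for key in required if key not in config]


# Legacy layout: credentials_json at the top level, no "credentials" object.
legacy_config = {
    "credentials_json": "{...}",  # placeholder, not a real service account key
    "view_id": "211669975",
    "start_date": "2021-02-11",
}

old_required = ["credentials", "view_id", "start_date"]
new_required = ["view_id", "start_date"]

print(missing_required(legacy_config, old_required))  # ['credentials']
print(missing_required(legacy_config, new_required))  # []
```

With `credentials` dropped from `required`, both the legacy layout and the new nested layout can pass the top-level check, and the choice between them is left to `get_authenticator`.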

"additionalProperties": true,
"properties": {
"credentials": {
"title": "Authentication mechanism",
"type": "object",
"description": "Choose either OAuth2.0 flow or provide your own JWT credentials for service account",
"oneOf": [
{
"type": "object",
"title": "OAuth2.0 authorization",
"properties": {
"option_title": {
"type": "string",
"const": "Default OAuth2.0 authorization"
},
"client_id": { "type": "string" },
"client_secret": { "type": "string", "airbyte_secret": true },
"refresh_token": { "type": "string", "airbyte_secret": true },
"access_token": { "type": "string", "airbyte_secret": true }
},
"required": ["client_id", "client_secret", "refresh_token"],
"additionalProperties": false
},
{
"type": "object",
"title": "Service Account Key",
"properties": {
"option_title": {
"type": "string",
"const": "Service account credentials"
},
"credentials_json": {
"type": "string",
"title": "Credentials JSON",
"description": "The contents of the JSON service account key. Check out the <a href=\"https://docs.airbyte.io/integrations/sources/googleanalytics\">docs</a> if you need help generating this key.",
"airbyte_secret": true
}
},
"required": ["credentials_json"],
"additionalProperties": true
}
]
},
"view_id": {
"type": "string",
"title": "View ID",
@@ -70,6 +29,75 @@
"title": "Custom Reports",
"type": "string",
"description": "A JSON array describing the custom reports you want to sync from GA. Check out the <a href=\"https://docs.airbyte.io/integrations/sources/google-analytics-v4\">docs</a> to get more information about this field."
},
"credentials": {
"type": "object",
"oneOf": [
{
"title": "Authenticate via Google (Oauth)",
"type": "object",
"required": [
"auth_type",
"client_id",
"client_secret",
"refresh_token"
],
"properties": {
"auth_type": {
"type": "string",
"const": "Client",
"enum": ["Client"],
"default": "Client",
"order": 0
},
"client_id": {
"title": "Client ID",
"type": "string",
"description": "The Client ID of your developer application",
"airbyte_secret": true
},
"client_secret": {
"title": "Client Secret",
"type": "string",
"description": "The client secret of your developer application",
"airbyte_secret": true
},
"refresh_token": {
"title": "Refresh Token",
"type": "string",
"description": "A refresh token generated using the above client ID and secret",
"airbyte_secret": true
},
"access_token": {
"title": "Access Token",
"type": "string",
"description": "A access token generated using the above client ID, secret and refresh_token",
"airbyte_secret": true
}
}
},
{
"type": "object",
"title": "Service Account Key Authentication",
"required": ["auth_type", "credentials_json"],
"properties": {
"auth_type": {
"type": "string",
"const": "Service",
"enum": ["Service"],
"default": "Service",
"order": 0
},
"credentials_json": {
"type": "string",
"description": "The JSON key of the service account to use for authorization",
"examples": [
"{ \"type\": \"service_account\", \"project_id\": YOUR_PROJECT_ID, \"private_key_id\": YOUR_PRIVATE_KEY, ... }"
]
}
}
}
]
}
}
},
@@ -69,7 +69,10 @@ def test_lookup_metrics_dimensions_data_type(metrics_dimensions_mapping, mock_me
def test_check_connection_jwt(jwt_encode_mock, mocker, mock_metrics_dimensions_type_list_link, mock_auth_call):
test_config = json.loads(read_file("../integration_tests/sample_config.json"))
del test_config["custom_reports"]
test_config["credentials"] = {"credentials_json": '{"client_email": "", "private_key": "", "private_key_id": ""}'}
test_config["credentials"] = {
"auth_type": "Service",
"credentials_json": '{"client_email": "", "private_key": "", "private_key_id": ""}',
}
source = SourceGoogleAnalyticsV4()
assert source.check_connection(MagicMock(), test_config) == (True, None)
jwt_encode_mock.encode.assert_called()
@@ -81,6 +84,7 @@ def test_check_connection_oauth(jwt_encode_mock, mocker, mock_metrics_dimensions
test_config = json.loads(read_file("../integration_tests/sample_config.json"))
del test_config["custom_reports"]
test_config["credentials"] = {
"auth_type": "Client",
"client_id": "client_id_val",
"client_secret": "client_secret_val",
"refresh_token": "refresh_token_val",
3 changes: 2 additions & 1 deletion docs/integrations/sources/google-analytics-v4.md
@@ -131,7 +131,8 @@ The Google Analytics connector should not run into Google Analytics API limitati

| Version | Date | Pull Request | Subject |
| :------ | :-------- | :----- | :------ |
| 0.1.6 | 2021-09-27 | [6459](https://github.com/airbytehq/airbyte/pull/6459) | Update OAuth Spec File |
| 0.1.7 | 2021-10-07 | [6414](https://github.com/airbytehq/airbyte/pull/6414) | Declare oauth parameters in google sources |
| 0.1.6 | 2021-09-27 | [6459](https://github.com/airbytehq/airbyte/pull/6459) | Update OAuth Spec File |
| 0.1.3 | 2021-09-21 | [6357](https://github.com/airbytehq/airbyte/pull/6357) | Fix oauth workflow parameters |
| 0.1.2 | 2021-09-20 | [6306](https://github.com/airbytehq/airbyte/pull/6306) | Support of airbyte OAuth initialization flow |
| 0.1.1 | 2021-08-25 | [5655](https://github.com/airbytehq/airbyte/pull/5655) | Corrected validation of empty custom report|
2 changes: 2 additions & 0 deletions tools/bin/ci_credentials.sh
@@ -75,6 +75,8 @@ write_standard_creds source-gitlab "$GITLAB_INTEGRATION_TEST_CREDS"
write_standard_creds source-github "$GH_NATIVE_INTEGRATION_TEST_CREDS"
write_standard_creds source-google-ads "$GOOGLE_ADS_TEST_CREDS"
write_standard_creds source-google-analytics-v4 "$GOOGLE_ANALYTICS_V4_TEST_CREDS"
write_standard_creds source-google-analytics-v4 "$GOOGLE_ANALYTICS_V4_TEST_CREDS_SRV_ACC" "service_config.json"
write_standard_creds source-google-analytics-v4 "$GOOGLE_ANALYTICS_V4_TEST_CREDS_OLD" "old_config.json"
write_standard_creds source-google-directory "$GOOGLE_DIRECTORY_TEST_CREDS"
write_standard_creds source-google-search-console "$GOOGLE_SEARCH_CONSOLE_CDK_TEST_CREDS"
write_standard_creds source-google-search-console "$GOOGLE_SEARCH_CONSOLE_CDK_TEST_CREDS_SRV_ACC" "service_account_config.json"