Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix hooks extended from http hook #16109

Merged
merged 7 commits into from
May 27, 2021
Merged

Conversation

msumit
Copy link
Contributor

@msumit msumit commented May 27, 2021

Opened in place of #16048

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label May 27, 2021
@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@potiuk
Copy link
Member

potiuk commented May 27, 2021

Static checks failing @msumit :)

@potiuk
Copy link
Member

potiuk commented May 27, 2021

I HEARTILY recommend running pre-commit install locally . That helps a LOT :)

@uranusjr
Copy link
Member

Would this not break backwards compatibility to existing usages of these hooks?

@msumit
Copy link
Contributor Author

msumit commented May 27, 2021

Would this not break backwards compatibility to existing usages of these hooks?

@uranusjr no it won't. Cause when Airflow fetches the connections from DB, it won't check its conn_type.

@uranusjr
Copy link
Member

I see, thanks for the clarification.

@potiuk
Copy link
Member

potiuk commented May 27, 2021

Yeah. We discussed it in #16048. It's lesser known fact with Airflow that you can use ANY connection with ANY hook. As long as it has the right extras/values - it will work just fine :).

The type is only used to make it easier to edit it via UI (shows custom fields, and add custom validations). It's not used for anything else. The custom fields are anyhow mapped to extras which are handled the same way for all connections.

@msumit
Copy link
Contributor Author

msumit commented May 27, 2021

@potiuk not sure why this test is failing, on my local I can see opsgenie in connections list

  ________________________ TestProviderManager.test_hooks ________________________
  
  self = <test_providers_manager.TestProviderManager testMethod=test_hooks>
  
      def test_hooks(self):
          provider_manager = ProvidersManager()
          connections_list = list(provider_manager.hooks.keys())
  >       assert CONNECTIONS_LIST == connections_list
  E       AssertionError: assert ['airbyte', '...egistry', ...] == ['airbyte', '...egistry', ...]
  E         At index 43 diff: 'opsgenie' != 'neo4j'
  E         Use -v to get the full diff

@potiuk
Copy link
Member

potiuk commented May 27, 2021

looking at it.

BTW. When you look at the output it's possible to run breeze in the very CI image that was used during the tests locally:

This way you can inspect what's going on and reproduce it (this command is printed in the logs when test fail)

./breeze --github-image-id 25af14c01fff61a658dc031d1155fe0a25f3caec --backend postgres --postgres-version 9.6 --python 3.7 --db-reset --skip-mounting-local-sources --test-type Core shell

@potiuk
Copy link
Member

potiuk commented May 27, 2021

After entering the image with above shell command:

pytest tests/core/test_providers_manager.py::TestProviderManager::test_hooks
________________________________________________________________________________________________________________________________ TestProviderManager.test_hooks ________________________________________________________________________________________________________________________________

self = <test_providers_manager.TestProviderManager testMethod=test_hooks>

    def test_hooks(self):
        provider_manager = ProvidersManager()
        connections_list = list(provider_manager.hooks.keys())
>       assert CONNECTIONS_LIST == connections_list
E       AssertionError: assert ['airbyte',\n 'asana',\n 'aws',\n 'azure',\n 'azure_batch',\n 'azure_container_registry',\n 'azure_cosmos',\n 'azure_data_explorer',\n 'azure_data_factory',\n 'azure_data_lake',\n 'cassandra',\n 'cloudant',\n 'databricks',\n 'dataprep',\n 'dingding',\n 'discord',\n 'docker',\n 'druid',\n 'elasticsearch',\n 'emr',\n 'exasol',\n 'facebook_social',\n 'ftp',\n 'gcpcloudsql',\n 'gcpcloudsqldb',\n 'gcpssh',\n 'google_cloud_platform',\n 'grpc',\n 'hdfs',\n 'hive_cli',\n 'hive_metastore',\n 'hiveserver2',\n 'http',\n 'imap',\n 'jdbc',\n 'jenkins',\n 'jira',\n 'kubernetes',\n 'leveldb',\n 'livy',\n 'mongo',\n 'mssql',\n 'mysql',\n 'opsgenie',\n 'neo4j',\n 'odbc',\n 'oracle',\n 'pig_cli',\n 'postgres',\n 'presto',\n 'qubole',\n 'redis',\n 's3',\n 'slackwebhook',\n 'samba',\n 'segment',\n 'sftp',\n 'snowflake',\n 'spark',\n 'spark_jdbc',\n 'spark_sql',\n 'sqlite',\n 'sqoop',\n 'ssh',\n 'tableau',\n 'trino',\n 'vault',\n 'vertica',\n 'wasb',\n 'yandexcloud'] == ['airbyte',\n 'asana',\n 'aws',\n 'azure',\n 'azure_batch',\n 'azure_container_registry',\n 'azure_cosmos',\n 'azure_data_explorer',\n 'azure_data_factory',\n 'azure_data_lake',\n 'cassandra',\n 'cloudant',\n 'databricks',\n 'dataprep',\n 'dingding',\n 'discord',\n 'docker',\n 'druid',\n 'elasticsearch',\n 'emr',\n 'exasol',\n 'facebook_social',\n 'ftp',\n 'gcpcloudsql',\n 'gcpcloudsqldb',\n 'gcpssh',\n 'google_cloud_platform',\n 'grpc',\n 'hdfs',\n 'hive_cli',\n 'hive_metastore',\n 'hiveserver2',\n 'http',\n 'imap',\n 'jdbc',\n 'jenkins',\n 'jira',\n 'kubernetes',\n 'leveldb',\n 'livy',\n 'mongo',\n 'mssql',\n 'mysql',\n 'neo4j',\n 'odbc',\n 'opsgenie',\n 'oracle',\n 'pig_cli',\n 'postgres',\n 'presto',\n 'qubole',\n 'redis',\n 's3',\n 'samba',\n 'segment',\n 'sftp',\n 'slackwe________________________________________________________________________________________________________________________________ TestProviderManager.test_hooks ________________________________________________________________________________________________________________________________

self = <test_providers_manager.TestProviderManager testMethod=test_hooks>

    def test_hooks(self):
        provider_manager = ProvidersManager()
        connections_list = list(provider_manager.hooks.keys())
>       assert CONNECTIONS_LIST == connections_list
E       AssertionError: assert ['airbyte',\n 'asana',\n 'aws',\n 'azure',\n 'azure_batch',\n 'azure_container_registry',\n 'azure_cosmos',\n 'azure_data_explorer',\n 'azure_data_factory',\n 'azure_data_lake',\n 'cassandra',\n 'cloudant',\n 'databricks',\n 'dataprep',\n 'dingding',\n 'discord',\n 'docker',\n 'druid',\n 'elasticsearch',\n 'emr',\n 'exasol',\n 'facebook_social',\n 'ftp',\n 'gcpcloudsql',\n 'gcpcloudsqldb',\n 'gcpssh',\n 'google_cloud_platform',\n 'grpc',\n 'hdfs',\n 'hive_cli',\n 'hive_metastore',\n 'hiveserver2',\n 'http',\n 'imap',\n 'jdbc',\n 'jenkins',\n 'jira',\n 'kubernetes',\n 'leveldb',\n 'livy',\n 'mongo',\n 'mssql',\n 'mysql',\n 'opsgenie',\n 'neo4j',\n 'odbc',\n 'oracle',\n 'pig_cli',\n 'postgres',\n 'presto',\n 'qubole',\n 'redis',\n 's3',\n 'slackwebhook',\n 'samba',\n 'segment',\n 'sftp',\n 'snowflake',\n 'spark',\n 'spark_jdbc',\n 'spark_sql',\n 'sqlite',\n 'sqoop',\n 'ssh',\n 'tableau',\n 'trino',\n 'vault',\n 'vertica',\n 'wasb',\n 'yandexcloud'] == ['airbyte',\n 'asana',\n 'aws',\n 'azure',\n 'azure_batch',\n 'azure_container_registry',\n 'azure_cosmos',\n 'azure_data_explorer',\n 'azure_data_factory',\n 'azure_data_lake',\n 'cassandra',\n 'cloudant',\n 'databricks',\n 'dataprep',\n 'dingding',\n 'discord',\n 'docker',\n 'druid',\n 'elasticsearch',\n 'emr',\n 'exasol',\n 'facebook_social',\n 'ftp',\n 'gcpcloudsql',\n 'gcpcloudsqldb',\n 'gcpssh',\n 'google_cloud_platform',\n 'grpc',\n 'hdfs',\n 'hive_cli',\n 'hive_metastore',\n 'hiveserver2',\n 'http',\n 'imap',\n 'jdbc',\n 'jenkins',\n 'jira',\n 'kubernetes',\n 'leveldb',\n 'livy',\n 'mongo',\n 'mssql',\n 'mysql',\n 'neo4j',\n 'odbc',\n 'opsgenie',\n 'oracle',\n 'pig_cli',\n 'postgres',\n 'presto',\n 'qubole',\n 'redis',\n 's3',\n 'samba',\n 'segment',\n 'sftp',\n 'slackwebhook',\n 'snowflake',\n 'spark',\n 'spark_jdbc',\n 'spark_sql',\n 'sqlite',\n 'sqoop',\n 'ssh',\n 'tableau',\n 'trino',\n 'vault',\n 'vertica',\n 'wasb',\n 'yandexcloud']
E         At index 43 diff: 'opsgenie' != 'neo4j'
E         Full diff:
E           [
E            'airbyte',
E            'asana',
E            'aws',
E            'azure',
E            'azure_batch',
E            'azure_container_registry',
E            'azure_cosmos',
E            'azure_data_explorer',
E            'azure_data_factory',
E            'azure_data_lake',
E            'cassandra',
E            'cloudant',
E            'databricks',
E            'dataprep',
E            'dingding',
E            'discord',
E            'docker',
E            'druid',
E            'elasticsearch',
E            'emr',
E            'exasol',
E            'facebook_social',
E            'ftp',
E            'gcpcloudsql',
E            'gcpcloudsqldb',
E            'gcpssh',
E            'google_cloud_platform',
E            'grpc',
E            'hdfs',
E            'hive_cli',
E            'hive_metastore',
E            'hiveserver2',
E            'http',
E            'imap',
E            'jdbc',
E            'jenkins',
E            'jira',
E            'kubernetes',
E            'leveldb',
E            'livy',
E            'mongo',
E            'mssql',
E            'mysql',
E         +  'opsgenie',
E            'neo4j',
E            'odbc',
E         -  'opsgenie',
E            'oracle',
E            'pig_cli',
E            'postgres',
E            'presto',
E            'qubole',
E            'redis',
E            's3',
E         +  'slackwebhook',
E            'samba',
E            'segment',
E            'sftp',
E         -  'slackwebhook',
E            'snowflake',
E            'spark',
E            'spark_jdbc',
E            'spark_sql',
E            'sqlite',
E            'sqoop',
E            'ssh',
E            'tableau',
E            'trino',
E            'vault',
E            'vertica',
E            'wasb',
E            'yandexcloud',
E           ]

tests/core/test_providers_manager.py:262: AssertionError
bhook',\n 'snowflake',\n 'spark',\n 'spark_jdbc',\n 'spark_sql',\n 'sqlite',\n 'sqoop',\n 'ssh',\n 'tableau',\n 'trino',\n 'vault',\n 'vertica',\n 'wasb',\n 'yandexcloud']
E         At index 43 diff: 'opsgenie' != 'neo4j'
E         Full diff:
E           [
E            'airbyte',
E            'asana',
E            'aws',
E            'azure',
E            'azure_batch',
E            'azure_container_registry',
E            'azure_cosmos',
E            'azure_data_explorer',
E            'azure_data_factory',
E            'azure_data_lake',
E            'cassandra',
E            'cloudant',
E            'databricks',
E            'dataprep',
E            'dingding',
E            'discord',
E            'docker',
E            'druid',
E            'elasticsearch',
E            'emr',
E            'exasol',
E            'facebook_social',
E            'ftp',
E            'gcpcloudsql',
E            'gcpcloudsqldb',
E            'gcpssh',
E            'google_cloud_platform',
E            'grpc',
E            'hdfs',
E            'hive_cli',
E            'hive_metastore',
E            'hiveserver2',
E            'http',
E            'imap',
E            'jdbc',
E            'jenkins',
E            'jira',
E            'kubernetes',
E            'leveldb',
E            'livy',
E            'mongo',
E            'mssql',
E            'mysql',
E         +  'opsgenie',
E            'neo4j',
E            'odbc',
E         -  'opsgenie',
E            'oracle',
E            'pig_cli',
E            'postgres',
E            'presto',
E            'qubole',
E            'redis',
E            's3',
E         +  'slackwebhook',
E            'samba',
E            'segment',
E            'sftp',
E         -  'slackwebhook',
E            'snowflake',
E            'spark',
E            'spark_jdbc',
E            'spark_sql',
E            'sqlite',
E            'sqoop',
E            'ssh',
E            'tableau',
E            'trino',
E            'vault',
E            'vertica',
E            'wasb',
E            'yandexcloud',
E           ]

tests/core/test_providers_manager.py:262: AssertionError

@potiuk
Copy link
Member

potiuk commented May 27, 2021

Wrong sorting order :)

@msumit
Copy link
Contributor Author

msumit commented May 27, 2021

Thanks a lot @potiuk. I tried to use breeze but could not figure out the correct way to run it. About getting the sorting order wrong I pasted the values after running the tests locally (via py.test) and as I did not have all the providers installed, it gave me an incomplete order :(

@potiuk
Copy link
Member

potiuk commented May 27, 2021

Thanks a lot @potiuk. I tried to use breeze but could not figure out the correct way to run it. About getting the sorting order wrong I pasted the values after running the tests locally (via py.test) and as I did not have all the providers installed, it gave me an incomplete order :(

Yep. That's exactly the reason we have ./breeze. To bridge the gap between the local development environment and CI. If gives you the very exact mirror of what happens in the CI, including the capability of pulling the exact CI image that was used for your particular test run and reproduce it/fix it there.

@potiuk potiuk merged commit 10ed42a into apache:master May 27, 2021
@msumit msumit deleted the fix_http_conn_2 branch May 28, 2021 03:58
potiuk pushed a commit to potiuk/airflow that referenced this pull request Jun 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:dev-tools area:providers full tests needed We need to run full set of tests for this PR to merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants