Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new import tool #3182

Merged
merged 9 commits into from
Apr 21, 2023
Merged

new import tool #3182

merged 9 commits into from
Apr 21, 2023

Conversation

v-mserdar
Copy link
Contributor

@v-mserdar v-mserdar commented Mar 17, 2023

Description

This adds an import tool for use with either fhir oss or paas. It can also monitor the status of an ongoing import.
Read the readme.md file for a complete description of what this can do.

Related issues

Addresses [issue #].

Testing

It was tested while importing more than 1 TB of data to a fhir endpoint.

FHIR Team Checklist

  • Update the title of the PR to be succinct and less than 65 characters
  • Add a milestone to the PR for the sprint that it is merged (i.e. add S47)
  • Tag the PR with the type of update: Bug, Build, Dependencies, Enhancement, New-Feature or Documentation
  • Tag the PR with Open source, Azure API for FHIR (CosmosDB or common code) or Azure Healthcare APIs (SQL or common code) to specify where this change is intended to be released.
  • CI is green before merge Build Status
  • Review squash-merge requirements

Semver Change (docs)

Patch|Skip|Feature|Breaking (reason)

@v-mserdar v-mserdar added Enhancement Enhancement on existing functionality. Azure Healthcare APIs Label denotes that the issue or PR is relevant to the FHIR service in the Azure Healthcare APIs labels Mar 17, 2023
@v-mserdar v-mserdar added this to the S111 milestone Mar 17, 2023
@v-mserdar v-mserdar requested a review from a team as a code owner March 17, 2023 17:48
@v-mserdar v-mserdar enabled auto-merge (squash) March 17, 2023 22:02
@v-mserdar v-mserdar disabled auto-merge March 17, 2023 22:02

namespace Microsoft.Health.Fhir.ImporterV2
{
internal static class ImporterV2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ImporterV2

Please reflect usage of this tool in its name.
It uses $import, registers it, and monitors it progress.
I can see that it can fail in monitoring. Tool should provide option to monitor without registering.

<add key="grant_type" value="Client_Credentials" />
<add key="client_id" value="{add your client id}" />
<add key="client_secret" value="{add your client secret}" />
<add key="resource" value="{add your FHIR endpoint}" />
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why it is not jus a connection string?

Copy link
Contributor Author

@v-mserdar v-mserdar Mar 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It uses the connection string for getting the blob container and lines 25-28 in the app.config for getting the auth token needed for $import status. Please let me know if I'm misunderstanding your question.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it work the same way for OSS FHIR?

Copy link
Contributor

@SergeyGaluzo SergeyGaluzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🕐

1. It can monitor a long running $import job by polling the status. Simply provide the MonitorImportStatusEndpoint and the token fields in the config file.
Note that when this field is supplied then no other operation can be performed. Meaning you can't concurrently POST an $import.
Consequently when this setting is empty then the tool looks to POST an $import job.
2. Run an $import job by consuming ndjson blob files from the configured container and posting them to an OSS endpoint.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Run

Register

Note that when this field is supplied then no other operation can be performed. Meaning you can't concurrently POST an $import.
Consequently when this setting is empty then the tool looks to POST an $import job.
2. Run an $import job by consuming ndjson blob files from the configured container and posting them to an OSS endpoint.
3. Run an $import job by consuming ndjson blob files from the configured container and posting them to a Paas endpoint.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Run

Register

When running FHIR OSS be sure to have these configuration settings in the portal:
FhirServer__Operations__Import__Enabled = True
FhirServer__Operations__Import__InitialImportMode = True
FhirServer__Operations__Import__StorageAccountConnection = your_storageaccount_access_key_connection_string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FhirServer__Operations__Import__StorageAccountConnection = your_storageaccount_access_key_connection_string

Remove

FhirServer__Operations__IntegrationDataStore__StorageAccountConnection = your_storageaccount_access_key_connection_string

**You may need to remove this setting to get import to work**
FhirServer__Operations__IntegrationDataStore__StorageAccountUri
Copy link
Contributor

@SergeyGaluzo SergeyGaluzo Mar 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you test with leaving this setting and see if this still work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be removed, so I'll update the comment stating so.

the type of files it'll search for in the container
do not add a file extension as ndjson is assumed and added to the ResourceType automatically
-->
<add key="ResourceType" value="Observation" />
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please specify how to handle case when I don't want to have any filters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the comment stating that leaving this empty will search all.

of course this can be a much larger value than 20 so adjust as needed
-->
<add key="NumberOfBlobsForImport" value="20" />
<add key="FhirEndpoint" value="{add your FHIR endpoint}" />
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{add your FHIR endpoint}

Why is this in 2 places?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed this one

SergeyGaluzo
SergeyGaluzo previously approved these changes Mar 31, 2023
Copy link
Contributor

@SergeyGaluzo SergeyGaluzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

SergeyGaluzo
SergeyGaluzo previously approved these changes Mar 31, 2023
Copy link
Contributor

@SergeyGaluzo SergeyGaluzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

Comment on lines 29 to 39
private static readonly string TokenEndpoint = ConfigurationManager.AppSettings["TokenEndpoint"] ?? string.Empty;
private static readonly string TokenGrantType = ConfigurationManager.AppSettings["grant_type"] ?? string.Empty;
private static readonly string TokenClientId = ConfigurationManager.AppSettings["client_id"] ?? string.Empty;
private static readonly string TokenClientSecret = ConfigurationManager.AppSettings["client_secret"] ?? string.Empty;
private static readonly string TokenResource = ConfigurationManager.AppSettings["FhirEndpoint"] ?? string.Empty;
private static readonly string ResourceType = ConfigurationManager.AppSettings["ResourceType"] ?? string.Empty;
private static readonly string ContainerName = ConfigurationManager.AppSettings["ContainerName"] ?? string.Empty;
private static readonly string ConnectionString = ConfigurationManager.AppSettings["ConnectionString"] ?? string.Empty;
private static readonly string FhirEndpoint = TokenResource;
private static readonly int NumberOfBlobsForImport = int.Parse(ConfigurationManager.AppSettings["NumberOfBlobsForImport"] ?? "1");
private static readonly TimeSpan ImportStatusDelay = TimeSpan.Parse(ConfigurationManager.AppSettings["ImportStatusDelay"]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When there are this many properties, could we just bind to a configuration object? I think it would simplify all this a lot

private static readonly int NumberOfBlobsForImport = int.Parse(ConfigurationManager.AppSettings["NumberOfBlobsForImport"] ?? "1");
private static readonly TimeSpan ImportStatusDelay = TimeSpan.Parse(ConfigurationManager.AppSettings["ImportStatusDelay"]);
private static readonly HttpClient HttpClient = new();
private static BlobContainerClient s_blobContainerClientSource;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we getting s_ in here? Can we check the right code style is getting applied

Copy link
Contributor Author

@v-mserdar v-mserdar Apr 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code style is referencing the same file as all other projects.
The IDE will flag warnings or errors as you can see in the screenshot. This is not unique to this project as far as warnings go. There are actually countless examples already that violate this IDE1006 rule if a field is private static and not readonly.

image

<PropertyGroup>
<OutputType>Exe</OutputType>
<AssemblyName>Fhir.RegisterAndMonitorImport</AssemblyName>
<RootNamespace>Microsoft.Health.Fhir.RegisterAndMonitorImport</RootNamespace>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Microsoft.Health.Internal...

Copy link
Contributor

@SergeyGaluzo SergeyGaluzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@v-mserdar v-mserdar merged commit 309a990 into main Apr 21, 2023
@v-mserdar v-mserdar deleted the personal/v-mserdar/new_import_tool branch April 21, 2023 03:15
ShaunDonn pushed a commit that referenced this pull request Apr 27, 2023
* new import tool

* revised app.config

* Revised based on code review

* revisions to handle oss; code cleanup

* updated readme to assist oss usage; added config setting for delay when getting import status

* revisions from code review

* final code review revisions

* changes from code review

* Adding one additional comment for clarify
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Azure Healthcare APIs Label denotes that the issue or PR is relevant to the FHIR service in the Azure Healthcare APIs Enhancement Enhancement on existing functionality.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants