Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Managed identity SQL auth problems (schema creation, new instance events) #166

Open
andsj073 opened this issue Apr 28, 2023 · 5 comments
Open
Assignees
Labels
db-schema This issue is related to or will impact the database schema, likely requiring a DB schema update. documentation Improvements or additions to documentation

Comments

@andsj073
Copy link

andsj073 commented Apr 28, 2023

Hello

The issue

I continuously fail to get the dt schema created in DFX provider SQL Database when using a Managed Identity (either User Assigned or System Assigned) in the SQL_DB_Connection string of the Function App during first run.

The only error message given, when using Mgd Identity, in Logs is:
Could not find stored procedure 'dt.CreateInstance'.

Creation works fine if I use the more basic SQL Server auth with user id and password.

But then, once the schema is in place, and back to using the Mgd Identity for auth to SQL, the created instance/event (by StartNewAsync) does not trigger Durable Orchestration execution, even though a row is created in NewEvents.

The environment

Using a clean install of Azure SQL Database on SQL Server with Elastic pool Standard DTU
and running the VS Code generated "SayHello" durable function on ASP Elastic Premium.

The Mgd Identity of the Function App is added to SQL Server Users executing (as AAD authenticated admin)

CREATE USER "name" FROM EXTERNAL PROVIDER;
ALTER ROLE db_owner ADD MEMBER "name";

Both rows returns successfully and the User is been verified by inspected using SSMS.
No RBAC roles are assigned to the Managed Identity.

Connection string to SQL Database in app settings of Function:

    "name": "SQLDB_Connection",
    "value": "Server=tcp:{redacted name}.database.windows.net,1433;Initial Catalog=DurableDB;Encrypt=True;TrustServerCertificate=False;Connection Timeout=30;Authentication=Active Directory Managed Identity;User Id={redacted client id}",

The {redacted client id} is meticulously verified to be the client id of the Managed Identity in portal

The host.json contains

  "extensions": {
    "durableTask": {
      "storageProvider": {
        "type": "mssql",
        "connectionStringName": "SQLDB_Connection",
        "taskEventLockTimeout": "00:02:00",
        "createDatabaseIfNotExists": true
      }
    }
  }

There is no VNet integration, no private endpoints or service endpoints etc. The Function App can positively access the Database through firewall rules. The durable function execution triggered by HTTP returns 500.

Additional information and behavior

Again, everything works fine if I change to connection string in app settings to:

    "name": "SQLDB_Connection",
    "value": "Server=tcp:{redacted name}.database.windows.net,1433;Initial Catalog=DurableDB;Encrypt=True;TrustServerCertificate=False;Connection Timeout=30;User ID={redacted username};Password={redacted pwd}",

The the dt Schema is created and the Durable Function executes normally, and rows in TaskHub are created, ending up in RuntimeStatus Completed, with rows in History and Payloads.

However, to add to this for understanding, if I now switch back to the same User Assigned Managed Identity (switching connection string again), then the Durable Function starts execution and creates rows in the dt.NewEvents and dt.Instances table (with RuntimeStatus=Pending), but nothing else (no Payloads and no History) and the Function execution never completes (stuck in RuntimeStatus Pending).

Row in NewEvents

SequenceNumber	Timestamp	VisibleTime	DequeueCount	TaskHub	InstanceID	ExecutionID	EventType	RuntimeStatus	Name	TaskID	PayloadID
5	2023-04-28 07:38:42.2877671	NULL	0	{redacted mgd id name}	9e6aecbbc0994102acfedf869b61623f	8a41f23e32d747eaa90e23ffbec115d7	ExecutionStarted	Pending	DurableFunctionsOrchestrationCSharp1	NULL	NULL

Row in Instances

TaskHub	InstanceID	ExecutionID	Name	Version	CreatedTime	LastUpdatedTime	CompletedTime	RuntimeStatus	LockedBy	LockExpiration	InputPayloadID	OutputPayloadID	CustomStatusPayloadID	ParentInstanceID
{redacted mgd id name}	9e6aecbbc0994102acfedf869b61623f	8a41f23e32d747eaa90e23ffbec115d7	DurableFunctionsOrchestrationCSharp1		2023-04-28 07:38:42.2877671	NULL	NULL	Pending	NULL	NULL	NULL	NULL	NULL	NULL

For this run, no errors can be observed in Logs / Application Insight for the instance. Only successful execution information of the run.

Hence the following method is successfully executed (see logs below) and the DurableFunctionsOrchestrationCSharp1 (Orchestrator) is scheduled (Executed stored procedure dt.CreateInstance in 11ms), but the consecutive execution of DurableFunctionsOrchestrationCSharp1 never happens. No apparent errors.

        [FunctionName("DurableFunctionsOrchestrationCSharp1_HttpStart")]
        public static async Task<HttpResponseMessage> HttpStart(
            [HttpTrigger(AuthorizationLevel.Anonymous, "get", "post")] HttpRequestMessage req,
            [DurableClient] IDurableOrchestrationClient starter,
            ILogger log)
        {
            // Function input comes from the request content.
            string instanceId = await starter.StartNewAsync("DurableFunctionsOrchestrationCSharp1", null);

            log.LogInformation("Started orchestration with ID = '{instanceId}'.", instanceId);

            return starter.CreateCheckStatusResponse(req, instanceId);
        }

        [FunctionName("DurableFunctionsOrchestrationCSharp1")]
        public static async Task<List<string>> RunOrchestrator(
            [OrchestrationTrigger] IDurableOrchestrationContext context)
        {
            var outputs = new List<string>();

            // Replace "hello" with the name of your Durable Activity Function.
            outputs.Add(await context.CallActivityAsync<string>(nameof(SayHello), "Tokyo"));
            outputs.Add(await context.CallActivityAsync<string>(nameof(SayHello), "Seattle"));
            outputs.Add(await context.CallActivityAsync<string>(nameof(SayHello), "London"));

            // returns ["Hello Tokyo!", "Hello Seattle!", "Hello London!"]
            return outputs;
        }

        [FunctionName(nameof(SayHello))]
        public static string SayHello([ActivityTrigger] string name, ILogger log)
        {
            log.LogInformation("Saying hello to {name}.", name);
            return $"Hello {name}!";
        }

The Function NuGets

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>net6.0</TargetFramework>
    <AzureFunctionsVersion>v4</AzureFunctionsVersion>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.Azure.WebJobs.Extensions.DurableTask" Version="2.9.4" />
    <PackageReference Include="Microsoft.DurableTask.SqlServer.AzureFunctions" Version="1.1.1" />
    <PackageReference Include="Microsoft.NET.Sdk.Functions" Version="4.1.1" />
  </ItemGroup>
  <ItemGroup>
    <None Update="host.json">
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
    </None>
    <None Update="local.settings.json">
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
      <CopyToPublishDirectory>Never</CopyToPublishDirectory>
    </None>
  </ItemGroup>
</Project>

Oh, and adding to the above, the same erroneous behavior is observed when trying with a System Assigned Mgd Identity of the Function (that is added as db_owner User to the SQL): The execution starts, with a row to Instances, but never completes (no exeuction of the Durable Tasks I guess) and stays in Pending.

Logs

2023-04-28T08:25:10Z   [Information]   23d89f351d38433db3512aa7f884142c: Function 'DurableFunctionsOrchestrationCSharp1 (Orchestrator)' scheduled. Reason: NewInstance. IsReplay: False. State: Scheduled. HubName: {redacted function name}. AppName: {redacted function name}. SlotName: Production. ExtensionVersion: 2.9.4. SequenceNumber: 36.
2023-04-28T08:25:10Z   [Verbose]   23d89f351d38433db3512aa7f884142c: Executed stored procedure dt.CreateInstance in 11ms
2023-04-28T08:25:10Z   [Information]   Started orchestration with ID = '23d89f351d38433db3512aa7f884142c'.
2023-04-28T08:25:10Z   [Information]   Executed 'DurableFunctionsOrchestrationCSharp1_HttpStart' (Succeeded, Id=92bb6dd9-6a44-4151-9999-bb29e193c057, Duration=88ms)
@andsj073 andsj073 changed the title Schema creation not working when starting first time with managed identity Not working correctly with managed identity (schema creation, new instance events) Apr 28, 2023
@andsj073 andsj073 changed the title Not working correctly with managed identity (schema creation, new instance events) Managed identity SQL auth problems (schema creation, new instance events) Apr 28, 2023
@andsj073
Copy link
Author

Hold on,
I am currently investigating whether this strange behavior is due to the SQL Server/Database setup when it comes to the servers own managed identity and its assignment of role Directory Readers.
As per doc here:
Enable service principals to create Azure AD users
and here:
Managed identities in Azure AD for Azure SQL

Meanwhile, may I suggest that you add these references on your quickstarts page relating to Azure SQL Database Getting Started / Database setup / Azure SQL database

@andsj073
Copy link
Author

andsj073 commented May 4, 2023

@cgillum
Hello again,

RESOLUTION

I have found that (at least when using Azure SQL Database as SQL Server of the provider) if you want to use another SQL server user than the sql server admin (sa or of equivalent sql server priviliges) you need to add that user to the master database as well and add it to the role dbmanager. Otherwise the automatic schema checking and updating wont work.

This holds regardless whether the user in DurableDB is of role db_owner or dt_runtime, and regardless whether it is a Managed Identity from EXTERNAL PROVIDER or FOR a SQL server managed login.

Is this intentional?
If so, I propose such instructions are added to the quickstart/azure-sql-database page, and the multitenancy/managing-user-credentials page.

USE [DurableDB]; -- does not work, so run in this DB manually using SSMS or the like
CREATE USER [username] FOR [login]/FROM EXTERNAL PROVIDER;
ALTER ROLE db_owner/dt_runtime ADD MEMBER [username];

USE [master]; -- does not work, so run in this DB manually using SSMS or the like
CREATE USER [username] FOR [login]/FROM EXTERNAL PROVIDER;
ALTER ROLE dbmanager ADD MEMBER [username];

GO

The USE sql statements above actually doesn't work on SQL Database since it does not allow that statement.

I guess another way, that I haven't tried, would be to only run with "createDatabaseIfNotExists": true initially with a sql server admin role, and the with "createDatabaseIfNotExists": false when using dt_runtime role. It is however unlikely that you could assume sql server admin access in enterprise environments, and it also raises questions as to how schema updates would be managed.

@cgillum
Copy link
Member

cgillum commented May 4, 2023

Hi @andsj073. I think what you're describing is a behavior/requirement of Azure SQL Database and isn't specific to this project. However, I think fundamentally what you're looking for is detailed instructions (or even a tool) to help with the initial setup? Is that correct?

I don't think there's a way to remove the requirement for admin credentials when creating the database or setting up the users/roles. However, we may be able to streamline the process so that a DB admin can use a simple script to do the one-time setup and allow the developer to then just use the created database. Would that address your concerns?

@andsj073
Copy link
Author

andsj073 commented May 5, 2023

Hi @cgillum
I merely meant that your instructions are missing the piece that the user, by which the Function App (that runs Durable Functions w Azure SQL Database as taskhub provider) authenticates with SQL Database, needs to be dbmanager as well in "master", not only db_owner or dt_runtime of "DurableDB"

@cgillum cgillum self-assigned this Jun 6, 2023
@cgillum cgillum added documentation Improvements or additions to documentation db-schema This issue is related to or will impact the database schema, likely requiring a DB schema update. and removed Needs: Triage 🔍 labels Jun 6, 2023
@LockTar
Copy link

LockTar commented Oct 24, 2024

Ok. I'm glad that I found this issue because I'm already busy for two days trying to create the tables in my Azure SQL Server database.

I really think it's somewhere a bug. I searched in the repo code but I can't find the real solution.

Ok this is my scenario (like @andsj073). I have an Azure pipeline that automates everything, so I repeated this multiple times. Deleted Azure resources and rolled out everything.

  1. Create a clean Azure SQL Server with "Support only Microsoft Entra authentication for this server" option enabled.
  2. Create a new database
  3. Create Azure function (Isolated .NET 8) with managed identity (I haven't tested this with user managed identity)
  4. Execute script in the pipeline that executes the following:
    CREATE USER [msi name here] FROM external provider
    ALTER ROLE [db_owner] ADD MEMBER [msi name here]
  5. I would expect that the MSI now has full control over the database. This is not the case. I only got the following error from the Azure Durable Function activities:
    Login failed for user '<token-identified principal>'
  6. When I run the following command (THANK YOU @andsj073):
USE [master]; -- does not work, so run in this DB manually using SSMS or the like
CREATE USER [msi name here] FROM EXTERNAL PROVIDER;
ALTER ROLE dbmanager ADD MEMBER [msi name here];

The tables are created immediatly. This took me two whole days.

But the problem is that I still don't understand why. The reason is this:

I have another app service (not function) that is pointing to exactly the same database, uses the same scripts and also uses a managed identity that can connect instantly.
Ok, it's not creating tables but managed data.

Still, I have a third principal that is my Azure Pipeline service connection principal. That is also using the same script to set the roles on the database (not the master) and that principal DOES create tables because it's creating the database and it fires my Entity Framework Core Migrations sql script.

They all have db_owner permissions. I do nothing with the master database.

I tried searching for roles connected to members but I see no difference in the security.

When I look at the (Azure) SQL Server roles documentation (found here) it states the following for dbmanager:

Can create and delete databases. A member of the dbmanager role that creates a database, becomes the owner of that database, which allows that user to connect to that database as the dbo user. The dbo user has all database permissions in the database. Members of the dbmanager role don't necessarily have permission to access databases that they don't own.

Basically, it says:
Don't add the MSI user as member to the dbmanager role if it's not needed. I don't think it's needed because I don't want that this MSI can drop multiple databases. It should only be possible to drop it's own....

I really think it's a bug in this SQL Server provider.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
db-schema This issue is related to or will impact the database schema, likely requiring a DB schema update. documentation Improvements or additions to documentation
Projects
None yet
Development

No branches or pull requests

3 participants