Possible mergeable classes for merging code bases #1261
Comments
I think
Some classes, such as TdsParser, TdsParserStateObject, SqlDataReader and SqlCommand, are very large, and I don't think we can merge them in a single PR; it would be impossible to review. I believe the way to handle those types is many small PRs, each of which takes identical functionality from both into a new shared version of the file, slowly reducing each project-specific file to the parts which are different. Those can then be thoroughly reviewed. |
Thanks for the comments. |
Looking at some of the merges from @lcheunglci, I thought it might be useful to write down some of my knowledge and the methodology I've used in my merges. From what I've gathered:
The netfx codebase was forked at some point, and that fork had a managed implementation of the network interface written so that it could be used on Linux for netcore. This means that there are now managed and native versions of the SQL Network Interface (SNI) in the netcore codebase, and they are switched out depending on the platform that you restore the NuGet package for. You can also force the use of managed networking on Windows with an AppContext switch. This mostly makes a difference in the core of TdsParser and TdsParserStateObject.
After netcore was forked, the codebase was cleaned up a bit, so in general netcore is newer than netfx. However, changes and fixes may have been made to the netcore and netfx codebases independently, so there's no guarantee that either one is better than the other; both can contain the better version of any particular function.
The netfx codebase is very, very old. It was in the original 1.0 release of netfx and it hasn't always been updated to use the latest features. An example of this is the pool key derivation mechanism on netfx: it uses direct calls to Windows security APIs because the managed equivalents were not available, or not known to be usable, at the time of writing.
This means that any change can be old or new and come from any version of the source code. To merge them you have to try to work out whether they are functionally identical, or equivalent enough that the difference cannot be observed by users, up to and including timing and ordering of operations. This can get complicated and time-consuming.
Style between the two codebases is highly variable. In general, when merging a file it's a good idea to turn on the compiler Info messages and look through all the possible refactoring options it gives you. You don't have to take all the suggestions; just take a look at them and make the code feel like it has the same style as some of the already merged components. The objective is to work towards a coherently styled single codebase where naming and formatting are easy to understand between most files. |
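As a rough illustration of the "is this difference only style?" question above, here is a minimal Python sketch that sorts a pair of source snippets into three buckets. The function names and bucket labels are hypothetical and not part of any SqlClient tooling. It only detects layout and line-comment differences, so anything it reports as "divergent" still needs the manual equivalence check described above, and a "layout-only" result is a hint rather than proof.

```python
import re


def normalize(source: str) -> str:
    """Drop // line comments and collapse runs of whitespace.

    Naive on purpose: a "//" inside a string literal is also stripped,
    so treat the result as a triage hint only.
    """
    no_comments = re.sub(r"//[^\n]*", "", source)
    return " ".join(no_comments.split())


def classify(netfx_source: str, netcore_source: str) -> str:
    """Return 'identical', 'layout-only' or 'divergent' for two snippets."""
    if netfx_source == netcore_source:
        return "identical"
    if normalize(netfx_source) == normalize(netcore_source):
        return "layout-only"
    return "divergent"
```

Snippets that come back as "layout-only" are the cheapest ones to line up before a merge; everything else needs a human to compare behaviour, timing and ordering.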
Thanks @Wraith2 for your insight. I'll keep that in mind when I'm doing the code base merges. I'm currently starting small particularly focusing on the simpler code merges with the least conflicts before I look at the bigger ones. Your feedback and suggestions are much appreciated. |
While I was looking into |
Code merge for dotnet#1261 Issue for SmiXetterAccessMap class. Added the common code to .Common.cs class.
I think I see the PR you're talking about @Wraith2 (#1022), thanks for that. Nowadays, the main problem lies in the Oddly, .NET Framework 4.6.1 (which implements .NET Standard 2.0) has this class. Other frameworks naturally implement it, so we could potentially have one code path for .NET Standard 2.1 and .NET Framework, then a PlatformNotSupportedException for everything else. This means that it wouldn't support the platform versions below:
I personally think it'd be justifiable not to explicitly add support for a platform that the vendors don't support themselves, but not adding anything for UWP is a pity. I don't think there's a way to include it though. Naturally if SqlClient as a whole drops support for .NET Standard (2.0 or more broadly) then this problem becomes academic! |
it definitely will do and specifically NetStandard 2.0 |
Just as an update on this: PRs #2369, #2376, #2383 and #2390 will merge SqlClientFactory, AlwaysEncryptedHelperClasses, TdsParserHelperClasses, AAsyncCallContext and DbConnectionPoolIdentity. They also lay the groundwork for the merge of SqlDataReader. I think I can also safely merge DbConnectionFactory, DbReferenceCollection, DbConnectionClosed, SqlAuthenticationProviderManager and SqlFileStream without raising too many eyebrows - the implementations are identical (or so close to identical that it barely matters.) I can see the open issue and PR to terminate .NET Standard support, and I'll wait for this to be merged before dealing with the various encryption/enclave providers. In one situation, specific functionality will need to be locked behind conditional compilation: the .NET Framework's MDS supports the This leaves the more complicated cases, where functionality and support have diverged. So far I've come across a few of these:
Once the .NET Standard PR is implemented and there's nothing more I can merge, I'll take another look at these situations and see if they're as difficult as they look. I'm not planning to look at TdsParser, SqlCommand or SqlConnection any time soon. These probably need someone with a much better understanding of the library's history than me. |
@DavoudEshtehari In the wake of that removal, #2501 merges three of the four The remaining derived class is Once #2501 and the extra PR (edit: #2521) have been merged, there'll be room for Unix support for Always Encrypted. |
I've reached a stopping point with the merge, so it's now just a matter of triage. To summarise the table I've been using to keep track: Files covered by PRs
SQLCLR, code access security types
All of these depend upon #2862. There's not much code to merge here, just a lot of files. I think there'll probably be 3-4 PRs for these.
Metrics, logging
These use two different mechanisms between .NET Framework and .NET Core, and they also record metrics at slightly different times. It might be better to merge the API surface and verify the metric timings as part of any future OpenTelemetry work. The issues tracking that are #2210 and #2211.
Native SNI
This'll change the public-facing dependencies for the NuGet packages. Barring any objections, I'm planning to change the .NET Framework package to reference Microsoft.Data.SqlClient.SNI.runtime. This'll mean that there's one architecture-dependent DLL file in the resultant builds, not three DLLs which are toggled between at runtime.
TDS parser
I'm pretty confident that the majority of the logic is identical; it's just obscured by the fact that only one codebase was refactored. There are probably some methods which can be merged as-is, but for the vast majority of cases I think we'll just need to repeat that refactor piecemeal. I'm not planning to start until #2714 is merged, given how deeply any refactor work will touch the state machine. |
The work to merge the two codebases has been going on for a long time. The problem is always testing and review. |
I've definitely seen that - there's a lot of history here. A lot of the problem seems to be that we've got a mixture of different types of changes:
Reviewing and testing PRs which contain any combination of those is always going to be tricky; many of the remaining classes have so much of the first two types of change that it's obscuring the changes which need the most review. I've seen a few "style change" PRs which touch large parts of both codebases. Those are disruptive when merged and their size makes them harder to review, but they might be a safer way to fix that first type of change and free up some bandwidth to review the more important changes. |
Is there value in moving 100% matches first? e.g. in TdsParser there is PutSession, which is 100% identical. Something like this: 66f975c |
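One way to look for 100% matches like that is to diff the two versions of a file line by line and keep only the long identical runs. The sketch below uses Python's standard `difflib` module; the function name, the `min_lines` threshold and the choice to ignore indentation are my own assumptions, not an existing project script.

```python
import difflib


def identical_runs(netfx_lines, netcore_lines, min_lines=5):
    """Return (netfx_start, netcore_start, length) for each identical run.

    Lines are compared after stripping leading and trailing whitespace,
    so a pure indentation difference still counts as a match.
    """
    a = [line.strip() for line in netfx_lines]
    b = [line.strip() for line in netcore_lines]
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (match.a, match.b, match.size)
        for match in matcher.get_matching_blocks()
        if match.size >= min_lines
    ]
```

Each tuple gives the zero-based starting line in each file and the number of matching lines. A run that covers a whole method body is a candidate for moving as-is, though it still needs a check that the members it calls behave the same in both codebases.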
I also wonder if it makes sense to equalize any code style changes between netfx and netcore by lifting up netfx to be the same as netcore (and if this should be a separate effort from merging) Something like this: 4331089 |
I've tried both approaches. The throughput limitations are always review time and release cadence. We can only go as far as the MS team allow and they're busy. |
So code style changes (for the sake of equalizing) should take the least time to review I guess? |
In theory but in practice any review will take the same amount of time to happen so you have to choose what is more important to use your review quota for. Do you want to make material changes and improve the library or do you want to improve the quality of the codebase in a way that does not affect users? |
Yep, understood. That's why I'd like to get some guidance from MS on what they'd prefer we do before starting new PRs that they don't want. |
Speaking personally here... I've found my approach to PRs drifting into a cycle: start with a small PR which touches as little as possible and aligns some part of the public API (the API surface or behavioural) between netcore and netfx; follow up with a set of larger PRs which do more of the heavy lifting for that change. I've been trying to align the small PRs with the days/~fortnight before a GitHub milestone. Hopefully this should mean that the releases are met by easier-to-review community PRs, and the team gets a few, larger PRs when there's less pressure to review them. The Context Connection PR is a decent example of this - it's a small PR which functionally removes about 8k lines of code and half a dozen files. I'm not submitting those larger removals until after 6.0! Separately to this: the point around code style merges is good. I don't have any real opinions on whether we should have a series of "big bang" PRs which target a few files and apply every code style change at the same time, or some other approach. My only two opinions here are that we should avoid making very large code style adjustments as part of a merge, and that we shouldn't roll out a repo-wide change which needs manual merge conflict resolution in lots of PRs without some forewarning. I'm happy to contribute to whichever approach makes sense besides that. |
The main aim of this ticket is to clarify the project's progress and to invite contributions from anyone who is interested in it.
As a first step, we've prepared a list of possibly mergeable files with their types, which may need some pruning. Priorities can then be set before starting to merge.
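A first cut of such a list could be produced mechanically by pairing up files that exist at the same relative path in both source trees. This is a minimal sketch under assumptions: the two root directories are whatever local checkout paths hold the netfx and netcore sources, and the function name is hypothetical. Files that were renamed or moved in one codebase will not be paired, so the output would still need the pruning mentioned above.

```python
from pathlib import Path


def candidate_pairs(netfx_root, netcore_root, pattern="*.cs"):
    """Return the sorted relative paths that exist under both roots."""
    fx_root = Path(netfx_root)
    core_root = Path(netcore_root)
    # Collect paths relative to each root so the two trees can be compared.
    fx_files = {p.relative_to(fx_root).as_posix() for p in fx_root.rglob(pattern)}
    core_files = {p.relative_to(core_root).as_posix() for p in core_root.rglob(pattern)}
    return sorted(fx_files & core_files)
```

The result is only a list of names; deciding whether each pair is actually mergeable is a separate step.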
Mergeable files list