Feature/watchdog thread #80

Open
wants to merge 59 commits into
base: master


asparrowhawk
Collaborator

Watchdog

This pull request contains a significant number of changes. The most notable is the
addition of a native assembly containing COM interop code that retrieves the process
id of a COM server, given an IUnknown interface pointer. That interface is typically
the Office application being automated. A separate README.md in the COMServer project
directory provides additional information and insight into how it is implemented and used.

With the functionality delivered by the COMServer assembly, a new /timeout <seconds>
command line option was added that specifies how long the OfficeToPDF application
should wait for the conversion to complete. This involved the addition of the
following new classes and interfaces:

  • IWatchdog
  • NullWatchdog
  • Watchdog
  • WatchdogFactory
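
As a rough illustration of how these four types could fit together, here is a minimal sketch. It is not the code from this pull request: the member names (`Start`, `HasFired`, `Create`) and the decision to kill the process by id are assumptions made for the example, and only the type names come from the list above.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical shape; the pull request does not list the members of IWatchdog.
public interface IWatchdog : IDisposable
{
    // Begin watching the process with the given id.
    void Start(int processId);

    // True once the timeout has elapsed and the watchdog has acted.
    bool HasFired { get; }
}

// Assumed to be used when no timeout is requested: it never fires.
public sealed class NullWatchdog : IWatchdog
{
    public void Start(int processId) { }
    public bool HasFired => false;
    public void Dispose() { }
}

public sealed class Watchdog : IWatchdog
{
    private readonly TimeSpan timeout;
    private Timer timer;
    private volatile bool fired;

    public Watchdog(TimeSpan timeout) { this.timeout = timeout; }

    public bool HasFired => fired;

    public void Start(int processId)
    {
        // One-shot timer: runs the callback once after the timeout, never repeats.
        timer = new Timer(_ => Kill(processId), null, timeout, Timeout.InfiniteTimeSpan);
    }

    private void Kill(int processId)
    {
        fired = true;
        try
        {
            using (var process = Process.GetProcessById(processId))
            {
                process.Kill();
            }
        }
        catch (ArgumentException) { }          // no running process has this id
        catch (InvalidOperationException) { }  // the process exited during the call
    }

    // Disposing before the timeout elapses cancels the pending callback.
    public void Dispose() { timer?.Dispose(); }
}

public static class WatchdogFactory
{
    // Assumption: a timeout of zero or less means "no watchdog".
    public static IWatchdog Create(int timeoutSeconds) =>
        timeoutSeconds > 0
            ? (IWatchdog)new Watchdog(TimeSpan.FromSeconds(timeoutSeconds))
            : new NullWatchdog();
}
```

With a null-object implementation, the calling code can always wrap the conversion in `using (var watchdog = WatchdogFactory.Create(seconds))` without checking whether a timeout was requested.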

Another change is the addition of the ArgParser class. It derives from the
System.Collections.Hashtable that was employed by the original code. The new class
pulls together all the related command line parsing logic and state. It also adds
type safe access to the contents of the hash table, removing the need for the client
code to cast to the expected type at the point of access.

For example:

Boolean running = (Boolean)options["noquit"];

becomes

Boolean running = options.noquit;
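
A minimal sketch of how such a property could be layered over the hash table is shown below. The class name `ArgParserSketch`, the default values, and the `"timeout"` key are assumptions made for illustration; only the `noquit` key and the Hashtable base class come from the description above.

```csharp
using System;
using System.Collections;

// Hypothetical, reduced stand-in for the ArgParser class described above.
public class ArgParserSketch : Hashtable
{
    public ArgParserSketch()
    {
        // Defaults so that the typed getters never unbox a null entry.
        this["noquit"] = false;
        this["timeout"] = 0;   // assumed key name for the /timeout option
    }

    // The cast lives in one place instead of at every point of access.
    public bool noquit
    {
        get { return (bool)this["noquit"]; }
        set { this["noquit"] = value; }
    }

    public int timeout
    {
        get { return (int)this["timeout"]; }
        set { this["timeout"] = value; }
    }
}
```

Because the class still derives from Hashtable, existing code that indexes by string key keeps working while callers are migrated to the properties.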

The code changes also introduce the IConverter interface and ConverterFactory class.
These provide a uniform function for conversion and a simple way to create the required
converter based on the source filename's extension.
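
The idea can be sketched as follows. This is an illustration only: the `Convert` signature, the `StubConverter` class, the extension table, and the choice to return null for unknown extensions are all assumptions, not the code in this pull request.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical signature; the real IConverter members are not shown here.
public interface IConverter
{
    int Convert(string inputFile, string outputFile);
}

// Placeholder standing in for the real Word, Excel, etc. converters.
public sealed class StubConverter : IConverter
{
    public string Name { get; }
    public StubConverter(string name) { Name = name; }
    public int Convert(string inputFile, string outputFile) { return 0; }
}

public static class ConverterFactorySketch
{
    // Extension lookup is case-insensitive, so ".DOCX" and ".docx" match.
    private static readonly Dictionary<string, Func<IConverter>> creators =
        new Dictionary<string, Func<IConverter>>(StringComparer.OrdinalIgnoreCase)
        {
            { ".docx", () => new StubConverter("Word") },
            { ".xlsx", () => new StubConverter("Excel") },
            { ".pptx", () => new StubConverter("PowerPoint") },
        };

    // Returns null when no converter is registered for the extension.
    public static IConverter Create(string inputFile)
    {
        string extension = Path.GetExtension(inputFile) ?? string.Empty;
        Func<IConverter> creator;
        return creators.TryGetValue(extension, out creator) ? creator() : null;
    }
}
```

Keeping the extension-to-converter mapping in one table means that supporting a new file type is a single added entry rather than another branch in the calling code.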

All of the Converter class implementations remain as per the original master branch. Any
changes were limited to using the ArgParser class instead of the Hashtable.

Also added were a number of NUnit based tests that check the behaviour of many
of the new classes. The test project also contains a number of 'Explicit' tests that can be
used to verify the behaviour of the Watchdog and that the COMServer code retrieves the
process id.

There is also a GitHub Actions workflow that builds the source code and runs the unit tests
that are NOT marked as Explicit. See .github\workflows\build.yml, which is part of
the solution.

With the introduction of the COMServer assembly, the projects must be built for either x86
or x64 (the default). Therefore the "Mixed Platforms" and "Any CPU" build configurations
have been removed from the projects and the solution.

For the unit tests, some custom MSBuild configuration copies the required native assembly to
the output directory:

  <Target Name="CopyDependents" AfterTargets="Build">
    <ItemGroup>
      <DependentFile Include="COMServer.dll;COMServer.pdb" />
    </ItemGroup>
    <PropertyGroup Condition="'$(Platform)' == 'x86'">
      <DependentDir>$(Configuration)</DependentDir>
    </PropertyGroup>
    <PropertyGroup Condition="'$(Platform)' != 'x86'">
      <DependentDir>$(Platform)\$(Configuration)</DependentDir>
    </PropertyGroup>
    <Copy SourceFiles="@(DependentFile -> '$(SolutionDir)COMServer\$(DependentDir)\%(Identity)')" DestinationFolder="$(TargetDir)" SkipUnchangedFiles="true" Condition="'$(NCrunch)' != '1'" />
  </Target>

The projects have been updated to use .NET Framework version 4.8. All work was carried out using
Visual Studio 2022. The existing NuGet packages were NOT updated. The Unit test project references
the latest NUnit NuGet packages.

Deployment still requires only the OfficeToPdf.exe and OfficeToPdf.exe.config files. The
COMServer.dll has been added to the OfficeToPdf project as an embedded resource, as detailed
in the Costura documentation.

These code changes update the framework to .NET 4.8
Added NCrunch configuration in order to continuously build the code.
Removed using statements that are not required.
Removed two unused Convert functions.
These code changes add a managed C++ project that contains COM code that retrieves the process
id of the Office COM server.

It also includes a simple test project and test that ensures the information returned is correct.
These code changes make the code closer to that described in the blog:

https://www.apriorit.com/dev-blog/724-windows-three-ways-to-get-com-server-process-id

Adds support for standard, handle and extended OBJREF types.
These code changes add a new C++ assembly that allows the calling
of the ResolveOxid function based on the idl.

The same code could not be part of the managed C++ assembly.

The code currently delivers the strings that need to be parsed
to get the port. From the port, the process id can then hopefully
be found.
These changes capture the port associated with the connection to
the COM server.
Factored out function to get port.
This code demonstrates the use of the GetExtendedTcpTable call in order to get
the information about open TCP ports and their associated processes.

Source code from:

https://timvw.be/2007/09/09/build-your-own-netstat.exe-with-c/
These code changes contain a working fallback mechanism: if the process id is greater
than 65535, it will use the second way detailed in the apriorit blog post:

https://www.apriorit.com/dev-blog/724-windows-three-ways-to-get-com-server-process-id

The changes also include documents referenced in part 1 and 2 of "The OXID Resolver" and also
the Microsoft MS-DCOM documentation.

https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-dcom/f4643148-d34b-4f6f-bc9b-b14aed358544

https://airbus-cyber-security.com/the-oxid-resolver-part-1-remote-enumeration-of-network-interfaces-without-any-authentication/
https://airbus-cyber-security.com/the-oxid-resolver-part-2-accessing-a-remote-object-inside-dcom/
These code changes add a new unit test that uses P/Invoke to call
the code that gets the COM server process id.

This means that the managed C++ code is not required and can
be deleted.
Removed the unnecessary COMInterop project and its unit tests.

The required functionality can be pinvoked from the OXID project.
These changes contain code clean up and function parameter annotations.
These code changes output the OXID assemblies to a sub directory of
the project instead of a sub directory where the solution file exists.

Replaced the PostBuildEvent with MSBuild Copy Task.
Added README.md file with links to sources of information about the
implementation of the GetCOMProcessId() function.
These changes rename the OXID project to COMServer. The OXID name reflects
the OXID Resolver functionality that is part of Microsoft's DCOM.
These code changes move the argument parsing to its own class.

This is in preparation for adding unit tests and new options.
These changes add the start of some unit tests for the ArgParser class.

Found duplicate key value for "pdf_restrict_annotation" had been added.
These code changes add unit tests for the help parsing functionality.
These code changes add the timeout arg handling.
These changes add additional properties that allow type safe access to
the contents of the hash table.

Added additional unit tests to cover some of the new properties.
These code changes make use of the new type safe property accessors.
These code changes update the WordConverter to use the ArgParser properties for
both getting and setting.
Additional unit tests that check for correct return value when
given invalid timeout values.
These code changes extend the use of type safe access properties to the
parsed command line args.
These code changes add the ConverterFactory which returns instances of
the IConverter service based on the file extension.

Ensured all converter implementations implement the IConverter service.
These code changes highlight the way that the Outlook converter uses the Word
Converter to convert content to PDF.
Moved the AppOption class to its own source file.
These changes ensure the correct name is used for the Option.
Extended the use of type safety to the excel specific command line
arguments.
Extended the use of type safety to the power point specific command line
arguments.
Extended the use of type safety to the publisher specific command line
arguments.
Extended the use of type safety to the visio specific command line
arguments.
These code changes improve the unit tests by the re-use of the
DisposableApplication class.
These changes copy the unmanaged COMServer assembly to the output
directory alongside the OfficeToPDF executable.

The COMServer assembly is used to locate the process id of a Microsoft
Office application that is running.
Renamed Compose to Pipe to match original intent.
Corrected the platform setting for the release configuration.

Removed the Netstat reference application to avoid confusion
as to its purpose.
These code changes add additional tests that ensure that the code that
finds the process id works with all of the Microsoft Office applications.
These code changes use the ExitCode throughout, with only
the final cast to an integer when exiting the application.
These code changes add type safe access to the pdf command line arguments.
These code changes add a GitHub action workflow to build and test
the code.
These code changes are required following a rebase on top of master.
These code changes embed the native assembly as a resource so that it doesn't
need to be deployed alongside the executable.
Handle the case where the watchdog has gone off and it is no longer possible to
close the application.
Removed the unnecessary binding redirects as the assemblies are embedded as
resources. The existing one was incorrect following a NuGet package update.

Remove reference to PdfSharp.SharpZipLib readme.txt.
@asparrowhawk asparrowhawk requested a review from vittala February 28, 2022 20:35
Microsoft.MSHTML does not appear to be registered on the build server. As the
pre-built assemblies are already present, there is no need to build the COMServer
project.
@asparrowhawk
Collaborator Author

I am having problems with GitHub Actions. The build agents do not seem to have the Microsoft.mshtml.dll assembly registered on them. This is required in order to build the COMServer and OfficeToPdf projects due to their reliance on COM and the Office Primary Interop Assemblies.

C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Current\Bin\Microsoft.Common.CurrentVersion.targets(2926,5): warning MSB3284: Cannot get the file path for type library "0002e157-0000-0000-c000-000000000046" version 5.3. Library not registered. (Exception from HRESULT: 0x8002801D (TYPE_E_LIBNOTREGISTERED)) [D:\a\OfficeToPDF\OfficeToPDF\OfficeToPDF\OfficeToPDF.csproj]
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Current\Bin\Microsoft.Common.CurrentVersion.targets(2926,5): warning MSB3283: Cannot find wrapper assembly for type library "MSHTML". Verify that (1) the COM component is registered correctly and (2) your target platform is the same as the bitness of the COM component. For example, if the COM component is 32-bit, your target platform must not be 64-bit. [D:\a\OfficeToPDF\OfficeToPDF\OfficeToPDF\OfficeToPDF.csproj]

I have tried registering the Microsoft.mshtml.dll assembly as part of the workflow, but that needs admin permissions on the build agent.

I will have to look at other workarounds and the possibility of using a Docker container to build the code. But this may take a little bit of time.

Stop msbuild from using more than one process, and therefore from compiling multiple
projects in parallel.
Removed COM references to MSHTML and VBIDE in order
to build the projects.

The OfficeToPDF compiles and runs without these references.
@asparrowhawk
Collaborator Author

I have solved the build issue by removing the COM references to MSHTML and VBIDE from the OfficeToPdf project.

asparrowhawk and others added 3 commits March 1, 2022 11:43
This change speeds up the build by caching the NuGet packages.
This change updates the NuGet packages to the latest versions. This was
necessary as the tests were not being discovered in the latest version of
Visual Studio 2022.
@asparrowhawk asparrowhawk force-pushed the feature/watchdog-thread branch from f727a7e to ee15433 Compare March 18, 2024 13:57

2 participants