Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix encapsulated item with corresponding ReturnType #506

Merged
merged 2 commits into from
Jul 19, 2023

Conversation

rwecho
Copy link
Contributor

@rwecho rwecho commented Jul 12, 2023

@elgonzo
I created a pr, and hope to fix issue #505

@elgonzo
Copy link
Contributor

elgonzo commented Jul 12, 2023

Just to make clear, i am not the author nor any of the maintainers of this project, nor associated with any of them. Like you, i am just a user. Therefore, don't take my word for gospel ;-)

And since i speak with no authority regarding HAP, take from my review suggestions what you like. Or don't take away anything. I won't take no offense if you don't and it's all fine by me. Also, i am only able to visually check the code, so my apologies if i overlook something. Anyways, getting on with the business...

Your pull request is more than solid, imo. The only worthwhile improvement i can see is a code duplication situation that could be avoided by slightly restructuring the code.

There is a third place where ReturnType values are being handled in pretty much the same manner as your GetEncapsulatedHtml method. It's in the nested Tools class:

switch (xPathAttribute.NodeReturnType)
{
case ReturnType.InnerHtml:
{
result = Convert.ChangeType(htmlNode.InnerHtml, TType);
}
break;
case ReturnType.InnerText:
{
result = Convert.ChangeType(htmlNode.InnerText, TType);
}
break;
case ReturnType.OuterHtml:
{
result = Convert.ChangeType(htmlNode.OuterHtml, TType);
}
break;
default: throw new Exception();
}
return (T)result;

To avoid this kind of code duplication, i would put GetEncapsulatedHtml as an internal static method inside the nested Tools class. And also rename it, because what exactly is "encapsulated HTML" supposed to mean? ;-)
Something like this:

    internal static class Tools
    {
        ...

        internal static string GetHtmlForEncapsulation(HtmlNode node, ReturnType returnType)
        {
            switch (returnType)
            {
                case ReturnType.InnerText:
                    return node.InnerText;
                case ReturnType.InnerHtml:
                    return node.InnerHtml;
                case ReturnType.OuterHtml:
                    return node.OuterHtml;
                default:
                    throw new Exception("Unhandled ReturnType: " + returnType.ToString());
            };
        }
        
        ...

I also switched the method parameters around to maintain the same style/pattern as the other methods in the nested Tools class. (There is no particular reason i didn't make it an extension method. If you feel like it better being an extension method, make it so. It's fine with me either way.)

That would then lead to the two call sites of your GetEncapsulatedHtml methods to be changed from

innerHtmlDocument.LoadHtml(GetEncapsulatedHtml(xPathAttribute.NodeReturnType, htmlNode));

into

innerHtmlDocument.LoadHtml(Tools.GetHtmlForEncapsulation(htmlNode, xPathAttribute.NodeReturnType));

(or node instead of htmlNode at the second call site, respectively)

and this code in the GetNodesValuesBasedOnXPathReturnType method in the nested Tools class

switch (xPathAttribute.NodeReturnType)
{
case ReturnType.InnerHtml:
{
foreach (HtmlNode node in htmlNodeCollection)
{
result.Add(Convert.ChangeType(node.InnerHtml, listGenericType));
}
}
break;
case ReturnType.InnerText:
{
foreach (HtmlNode node in htmlNodeCollection)
{
result.Add(Convert.ChangeType(node.InnerText, listGenericType));
}
}
break;
case ReturnType.OuterHtml:
{
foreach (HtmlNode node in htmlNodeCollection)
{
result.Add(Convert.ChangeType(node.OuterHtml, listGenericType));
}
}
break;
}

would be simplified into

        foreach (HtmlNode node in htmlNodeCollection)
        {
            result.Add(
                Convert.ChangeType(
                    GetHtmlForEncapsulation(node, xPathAttribute.NodeReturnType),
                    listGenericType
                )
            );
        }

A closing side note that's a bit tangential and unrelated to your pull request, but i got a bit of a stomach pain regarding the EncapsulatorTests.cs test file. No, your test is fine as far as i can tell. It's more regarding the rest. (My advance apologies to the HAP team, as i will express a strong opinion.) if you look at EncapsulatorTests.cs, the test cases it contains (except your addition) are extremely poorly conceived, and in my honest opinion should just be eliminated (and ideally replaced with actual meaningful, robust and reliable tests). They don't test anything other than the GetEncapsulatedData method returning something that isn't null. There are no tests whether GetEncapsulatedData does actually has populated anything in the object instance it returned. But worse -- and more importantly -- is these tests being utterly unreliable, because they rely on external data from 3rd-party web sites the maintainers of HtmlAgilityPack have exactly zero (let me repeat: zero) control over. Yeah, those 3rd-party web sites one day might just want to "modernize" their web site design/layout, and the tests in EncapsulatorTests.cs might just like that completely fall apart. These are not sane tests.

@JonathanMagnan JonathanMagnan self-assigned this Jul 12, 2023
@JonathanMagnan
Copy link
Member

Hello @rwecho ,

What is your thought about @elgonzo recommendation?

Let me know when this branch will be ready to merge.

Best Regards,

Jon

@rwecho
Copy link
Contributor Author

rwecho commented Jul 13, 2023

@elgonzo, thank you for your advice. You have been patient and a powerful mentor. I have taken your suggestion and refactored the other two duplications. I believe it is functioning correctly now, so there is no need to make further changes, thanks to your timely correction and support. I will commit the changes again. If there are any areas that can be further improved, please let me know. Thank you once again, and I appreciate HAP team as well.

@rwecho
Copy link
Contributor Author

rwecho commented Jul 13, 2023

@JonathanMagnan I improved the code structure by elgonzo advice.

@rwecho
Copy link
Contributor Author

rwecho commented Jul 15, 2023

@JonathanMagnan please approve this PR if there's nothing that needs to be changed about this topic.

@JonathanMagnan JonathanMagnan merged commit 2cc71c3 into zzzprojects:master Jul 19, 2023
@JonathanMagnan
Copy link
Member

JonathanMagnan commented Jul 19, 2023

Hello @rwecho

Thank you again for your contribution ;)

Your code has been released in the v1.11.50

Best Regards,

Jon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants