IndexOutOfRangeException in ByteEncodingCMapTable.CharacterCodeToGlyphIndex Method #881

Closed
darbid opened this issue Aug 25, 2024 · 3 comments · Fixed by #882
Labels
bug document-reading Related to reading documents

Comments

darbid commented Aug 25, 2024

Issue Description:

There is an issue in the CharacterCodeToGlyphIndex method within the ByteEncodingCMapTable class, which implements the ICMapSubTable interface in the namespace UglyToad.PdfPig.Fonts.TrueType.Tables.CMapSubTables.

The method is encountering a System.IndexOutOfRangeException due to an attempt to access an index outside the bounds of the glyphMapping array. Below is the implementation of the method:

public int CharacterCodeToGlyphIndex(int characterCode)  
{  
    if (characterCode < 0 || characterCode >= GlyphMappingLength)  
    {  
        return 0;  
    }  
    return glyphMapping[characterCode];  
}  

Exception Details:

System.IndexOutOfRangeException: Index was outside the bounds of the array.  
   at UglyToad.PdfPig.Fonts.TrueType.Tables.CMapSubTables.ByteEncodingCMapTable.CharacterCodeToGlyphIndex(Int32 characterCode) in D:\..........UglyToad.PdfPig.Fonts\TrueType\Tables\CMapSubTables\ByteEncodingCMapTable.cs:line 57  
Exception thrown: 'System.IndexOutOfRangeException' in UglyToad.PdfPig.Fonts.dll  

Investigation Insights:

Upon examining the values, it appears that the exception is caused by a discrepancy between GlyphMappingLength and the actual length of the glyphMapping array:

  • GlyphMappingLength = 256
  • glyphMapping = {byte[252]}

Attached Document:
I have also attached a document, example.pdf, which might help in reproducing and investigating the issue further.

Steps to Reproduce:

  1. Use the CharacterCodeToGlyphIndex method with an input characterCode greater than or equal to the length of the glyphMapping array but less than GlyphMappingLength.
  2. The method throws System.IndexOutOfRangeException.
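
The steps above can be sketched in isolation as follows. This is a minimal stand-in, not the PdfPig source: `ByteTableSketch` and `Repro` are hypothetical names, and the constant 256 and the 252-byte array are taken from the debugger values listed under Investigation Insights.

```csharp
using System;

// Hypothetical stand-in for ByteEncodingCMapTable; not the PdfPig source.
public sealed class ByteTableSketch
{
    // Assumed fixed length, mirroring the value 256 seen in the debugger.
    private const int GlyphMappingLength = 256;

    private readonly byte[] glyphMapping;

    public ByteTableSketch(byte[] glyphMapping)
    {
        this.glyphMapping = glyphMapping;
    }

    public int CharacterCodeToGlyphIndex(int characterCode)
    {
        // The bounds check uses the constant, not the array's real length.
        if (characterCode < 0 || characterCode >= GlyphMappingLength)
        {
            return 0;
        }

        return glyphMapping[characterCode];
    }
}

public static class Repro
{
    // Returns true when the lookup throws for the given character code.
    public static bool Throws(int characterCode)
    {
        // 252 entries, as observed for the attached document.
        var table = new ByteTableSketch(new byte[252]);
        try
        {
            table.CharacterCodeToGlyphIndex(characterCode);
            return false;
        }
        catch (IndexOutOfRangeException)
        {
            return true;
        }
    }
}
```

With these assumptions, only the codes 252 to 255 pass the check and then read past the end of the array; codes below 252 and codes of 256 or more do not throw.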

Sample Code to Reproduce the Issue:

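using System.Text;
using UglyToad.PdfPig;

// Wrapper method added so the snippet compiles; the name is illustrative.
static string ExtractText(string path)
{
    using var document = PdfDocument.Open(path);
    var text = new StringBuilder();

    foreach (var page in document.GetPages())
    {
        text.AppendLine(string.Join(" ", page.GetWords()));
    }

    return text.ToString();
}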

Proposed Solution:
Ensure that both the GlyphMappingLength and the glyphMapping array are synchronized in terms of their lengths to avoid out-of-bounds access, or update the method to validate against the actual length of glyphMapping rather than GlyphMappingLength.
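
The first option (keeping the two lengths synchronized) might look roughly like the sketch below. `GlyphMappingPadding.PadTo` is a hypothetical helper, not an existing PdfPig API, and where it would be called when the table is loaded is an assumption. Missing entries are left as 0, which matches the value the method already returns for out-of-range codes.

```csharp
using System;

public static class GlyphMappingPadding
{
    // Hypothetical helper: copies the bytes read from the font into an
    // array of the expected length. Entries that were not present in the
    // file stay 0, the index of the missing glyph.
    public static byte[] PadTo(byte[] read, int expectedLength)
    {
        var padded = new byte[expectedLength];
        Array.Copy(read, padded, Math.Min(read.Length, expectedLength));
        return padded;
    }
}
```

For example, `GlyphMappingPadding.PadTo(glyphMapping, 256)` would turn the 252-byte array into a 256-byte one, so the existing check against GlyphMappingLength would be safe again. Validating against the array's actual length, the second option, avoids the extra copy.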

Additional Context:
The issue is observed in the following file:

D:......\UglyToad.PdfPig.Fonts\TrueType\Tables\CMapSubTables\ByteEncodingCMapTable.cs  

Please let me know if you need any further information or clarification.

BobLd (Collaborator) commented Aug 25, 2024

Hi @darbid, thanks a lot for the detailed issue. I think the easiest would be to update the CharacterCodeToGlyphIndex method.

Are you willing to create a PR to fix that? Otherwise I can take care of it, just let me know.

EDIT: I think the following should be enough

public int CharacterCodeToGlyphIndex(int characterCode)
{
    if (characterCode < 0 || characterCode >= glyphMapping.Length)
    {
        return 0;
    }

    return glyphMapping[characterCode];
}

darbid (Author) commented Aug 25, 2024

I'm sorry, I'm not experienced enough to create a PR. Could you please take care of it?

BobLd added a commit to BobLd/PdfPig that referenced this issue Aug 25, 2024

BobLd (Collaborator) commented Aug 25, 2024

no prob, done
