Biber makes substitutions for \textgamma that cause errors #480

Open
hammondkd opened this issue Jun 16, 2024 · 9 comments

@hammondkd

[also asked at TeX StackExchange]

The following example produces an error:

\documentclass{article}
\usepackage{biblatex}
\usepackage{textgreek}
\begin{filecontents}{test.bib}
@article{Author2015,
    author  = "Imogene Mirabell Anne Author",
    title   = "Making {\textgamma}-Iron from Lead",
    journal = "Alchemist",
    volume  = 10,
    number  = 3,
    pages   = "121--134",
    year    = 2038
}
\end{filecontents}
\addbibresource{test.bib}

\begin{document}
\nocite{*}
\printbibliography
\end{document}

Now run LaTeX, then Biber, then LaTeX again and you'll get an error:

! LaTeX Error: Unicode character ɣ (U+0263)
               not set up for use with LaTeX.

In the .bbl file, the sequence \textgamma has been replaced with the literal character ɣ (U+0263, the Latin gamma), which LaTeX has no definition for, hence the error.

Note that replacing \textgamma with \textalpha works fine.

Why does this occur? That is, why is Biber making this substitution? It seems like a bug in Biber to me.

Additional info: biber version 2.19
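
A possible stopgap for the error above, sketched under the assumption that the document is compiled with pdfLaTeX and its default UTF-8 input handling: declare U+0263 in the preamble so that the character Biber writes to the .bbl is typeset with \textgamma again. This only hides the symptom for this one character and does not change what Biber emits; test.bib is the file from the example above.

```latex
\documentclass{article}
\usepackage{biblatex}
\usepackage{textgreek}
% Assumption: pdfLaTeX with the default UTF-8 input encoding.
% Biber 2.19 writes U+0263 (Latin small letter gamma) into the .bbl;
% map that character back to the Greek text macro from textgreek.
\DeclareUnicodeCharacter{0263}{\textgamma}
\addbibresource{test.bib} % the .bib file from the example above

\begin{document}
\nocite{*}
\printbibliography
\end{document}
```

The same pattern would apply to any other undeclared character reported in a similar error, with the code point taken from the error message.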

plk added a commit that referenced this issue Jun 16, 2024
@plk
Owner

plk commented Jun 16, 2024

It's not a bug: biber always encodes to UTF-8 unless you tell it not to. The mapping it uses to do this can be modified as per the docs. However, the default for \textgamma wasn't ideal, as it was the Latin gamma. This has been changed in the linked commit, since there is a different mapping for the Latin gamma anyway. You can fix your install by altering your recode_data.xml as per the fixing commit.

@hammondkd
Author

I did not see this documented anywhere. Encoding in UTF-8 is not the same thing as replacing TeX macros with Unicode characters on-the-fly, so it seems like this should be mentioned and the mechanism to turn it off advertised in the documentation. Perhaps I just missed it? I did not see \text anywhere in the Biber documentation.

@plk
Owner

plk commented Jun 16, 2024

See section 3.6 of the biber PDF documentation (texdoc biber in TeX Live).

@hammondkd
Author

Does it make sense to translate \textalpha to U+03B1 and so forth?

@plk
Owner

plk commented Jun 16, 2024

Always open to suggestions for the defaults for these - many were just best guesses over a decade ago.

@davidcarlisle

@hammondkd biber always replaces LaTeX character commands with the character itself where possible. \textgamma is intended for Greek, so the change in the commit above, which maps it to the Greek gamma, certainly looks right to me. The Latin gamma has some specialised uses but is pretty esoteric, and LaTeX does not define a mapping for it at all in the default setup for classic TeX systems. That is why the standard \textgamma command ended up producing the undefined-character error you reported.

@plk thanks for the quick fix for gamma.

I think the whole Greek alphabet should be in the 03xx range, not 02xx. For example,

    <map><from>textupsilon</from>                      <to hex="28A">ʊ</to></map>

maps to the Latin upsilon, which is possibly used in phonetics or somewhere but isn't the intended interpretation of \textupsilon. U+028A has no mapping in LaTeX and will by default just produce

! LaTeX Error: Unicode character ʊ (U+028A)
               not set up for use with LaTeX.

plk added a commit that referenced this issue Jun 16, 2024
@plk
Owner

plk commented Jun 16, 2024

Done - all of the Greek alphabet is available with --decodecharsset=full, but any text* defaults are also now Greek.

@perstar

perstar commented Jun 17, 2024

I still think it is unnecessary for situations like this to yield errors. The right solution, in my view, is to remember the actual text (LaTeX code) used in the bib entry. All processing (conversions, standardization and whatnot) would still produce a form used for comparison, sorting, etc., but when something is actually output, it would be the remembered value that the user wrote in the bib entry, not the standardized form used in the machinery.

Then it is the responsibility of the user to have something suitable there, and if there is a problem, the user can easily change it and will not get a really hard-to-understand error message.

I suggested this in an issue in 2014, and since then I've seen several related issues about various problematic characters. I don't think we have seen the last such issue yet; this will continue to be a problem that appears for users now and then.

@plk
Owner

plk commented Jun 18, 2024

It's just too messy to remember the original characters, and we wanted to support Unicode as a standard for such things. You can indeed select ascii as the output encoding with biber; it will then sort internally with UTF-8 and output ascii equivalents for characters not in the output encoding. This involves two conversions, however, not a "remember the original commands" approach.
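
One way the ascii output route might be requested from the document side is biblatex's texencoding package option, which the biblatex manual describes as corresponding to Biber's output encoding. The following is a sketch only: it assumes test.bib from the original example, and it has not been checked which macro Biber writes for the gamma after the round trip (that depends on the recode mapping of the installed Biber version).

```latex
\documentclass{article}
% Assumption: texencoding=ascii makes Biber write the .bbl in ASCII,
% emitting LaTeX macros instead of literal non-ASCII characters.
\usepackage[texencoding=ascii]{biblatex}
\usepackage{textgreek} % provides \textgamma and the other Greek text macros
\addbibresource{test.bib} % the .bib file from the original example

\begin{document}
\nocite{*}
\printbibliography
\end{document}
```

Sorting still happens in UTF-8 inside Biber, as described above; only the encoding of the .bbl handed back to LaTeX changes.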

@plk plk self-assigned this Oct 2, 2024
@plk plk added this to the 2.21 milestone Oct 2, 2024