Commit

Fix for umlaut
parterburn committed Jul 13, 2024
1 parent 67cfd99 commit 418c498
Showing 5 changed files with 15 additions and 6 deletions.
2 changes: 1 addition & 1 deletion Gemfile
@@ -25,7 +25,7 @@ gem "rails-html-sanitizer", "~> 1.6"
 gem 'email_reply_trimmer'
 gem 'griddler-mailgun', '~> 1.1', '>= 1.1.1'
 gem 'griddler', '~> 1.5.2'
-gem "charlock_holmes", "~> 0.7.7" # text encoding detection for email parsing
+gem "charlock_holmes", "~> 0.7.9" # text encoding detection for email parsing
 
 gem 'mailgun_rails'
 gem "ruby-openai"
4 changes: 2 additions & 2 deletions Gemfile.lock
@@ -102,7 +102,7 @@ GEM
       image_processing (~> 1.1)
       marcel (~> 1.0.0)
       ssrf_filter (~> 1.0)
-    charlock_holmes (0.7.7)
+    charlock_holmes (0.7.9)
     chartkick (5.0.2)
     chronic (0.10.2)
     cloudflare-rails (3.0.0)
@@ -494,7 +494,7 @@ DEPENDENCIES
   byebug
   capybara
   carrierwave (~> 3)
-  charlock_holmes (~> 0.7.7)
+  charlock_holmes (~> 0.7.9)
   chartkick (~> 5)
   cloudflare-rails
   combined_time_select
6 changes: 5 additions & 1 deletion app/lib/email_processor.rb
@@ -326,7 +326,11 @@ def to_utf8(content)
 
     begin
       detection = CharlockHolmes::EncodingDetector.detect(content)
-      CharlockHolmes::Converter.convert content, detection[:encoding].gsub("IBM424_ltr", "UTF-8"), "UTF-8"
+      if detection[:confidence] > 95
+        CharlockHolmes::Converter.convert content, detection[:encoding].gsub("IBM424_ltr", "UTF-8"), "UTF-8"
+      else
+        content
+      end
     rescue
       content
     end
8 changes: 6 additions & 2 deletions app/models/entry.rb
@@ -79,7 +79,9 @@ def formatted_body
     formatted_body = body
     begin
       detection = CharlockHolmes::EncodingDetector.detect(body)
-      formatted_body = CharlockHolmes::Converter.convert formatted_body, detection[:encoding].gsub("IBM424_ltr", "UTF-8"), "UTF-8"
+      if detection[:confidence] > 95
+        formatted_body = CharlockHolmes::Converter.convert formatted_body, detection[:encoding].gsub("IBM424_ltr", "UTF-8"), "UTF-8"
+      end
     rescue => e
     end
     fix_encoding(formatted_body)
@@ -107,7 +109,9 @@ def sanitized_body
 
     begin
       detection = CharlockHolmes::EncodingDetector.detect(body_sanitized)
-      body_sanitized = CharlockHolmes::Converter.convert body_sanitized, detection[:encoding].gsub("IBM424_ltr", "UTF-8"), "UTF-8"
+      if detection[:confidence] > 95
+        body_sanitized = CharlockHolmes::Converter.convert body_sanitized, detection[:encoding].gsub("IBM424_ltr", "UTF-8"), "UTF-8"
+      end
     rescue => e
     end
     fix_encoding(body_sanitized)
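The three Ruby hunks above (one in app/lib/email_processor.rb, two in app/models/entry.rb) apply the same gate: convert only when the detector reports a confidence above 95, otherwise keep the input untouched. A minimal sketch of that gate as a single helper follows. It is an illustration, not code from this commit: the names `ConfidenceGate`, `to_utf8_if_confident`, and `MIN_CONFIDENCE` are hypothetical, the detection hash is passed in rather than obtained from `CharlockHolmes::EncodingDetector.detect` so the block runs without the gem, and Ruby's built-in `String#encode` stands in for `CharlockHolmes::Converter.convert`.

```ruby
# Hypothetical helper illustrating the confidence gate used in the hunks above.
module ConfidenceGate
  MIN_CONFIDENCE = 95 # same threshold as the commit

  # content   - String whose bytes are in an unknown encoding
  # detection - Hash shaped like a charlock_holmes detection result,
  #             e.g. { encoding: "ISO-8859-1", confidence: 97 }
  def self.to_utf8_if_confident(content, detection)
    # No usable detection (for example, binary input): leave the input alone.
    return content if detection.nil? || detection[:encoding].nil?
    # Strictly greater than the threshold, matching `> 95` in the diff.
    return content unless detection[:confidence].to_i > MIN_CONFIDENCE

    # String#encode(dst, src) is a stdlib stand-in for the gem's converter.
    content.encode("UTF-8", detection[:encoding])
  rescue EncodingError, ArgumentError
    # Mirror the commit's rescue branches: on a failed conversion, keep the input.
    content
  end
end

latin1 = "M\xFCller".b # "Müller" as ISO-8859-1 bytes
puts ConfidenceGate.to_utf8_if_confident(latin1, { encoding: "ISO-8859-1", confidence: 97 })
```

With the gem loaded, a call site could presumably pass the detector's result straight through, e.g. `ConfidenceGate.to_utf8_if_confident(body, CharlockHolmes::EncodingDetector.detect(body))`. The sketch does not reproduce the `IBM424_ltr` substitution from the diff.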
1 change: 1 addition & 0 deletions app/views/layouts/application.html.haml
@@ -3,6 +3,7 @@
   %head
     %title= yield_or_default :title, action_name.titlecase
     %meta{ charset: 'utf-8' }
+    %meta{ httpEquiv: "Content-Type", content: "text/html; charset=ISO-8859-1"}
     %meta{ name: 'viewport', content: 'width=device-width, initial-scale=1.0, minimum-scale=1.0, maximum-scale=1.0' }
     %meta{ content: 'IE=edge', 'http-equiv' => 'X-UA-Compatible' }
     = csrf_meta_tags
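After this hunk the layout names two charsets: `utf-8` on the existing `%meta{ charset: ... }` line and `ISO-8859-1` in the added Content-Type tag. As a rough illustration of why the label matters for umlauts, the sketch below decodes the same UTF-8 bytes under both labels using only Ruby's standard library. The sample string and variable names are assumptions for illustration; nothing here is taken from the application, and the sketch does not determine which declaration a browser would actually honour.

```ruby
# The bytes a UTF-8 encoded page would send for "Müller".
utf8_bytes = "M\u00FCller".encode("UTF-8").b

# Decoded with the matching label, the umlaut survives.
as_utf8 = utf8_bytes.dup.force_encoding("UTF-8")

# Decoded as ISO-8859-1, each byte of the two-byte "ü" becomes its own character.
as_latin1 = utf8_bytes.dup.force_encoding("ISO-8859-1").encode("UTF-8")

puts as_utf8   # Müller
puts as_latin1 # MÃ¼ller
```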
