Recent Twitter char counting changes 😭 #8

Closed
snarfed opened this issue Nov 18, 2017 · 2 comments · Fixed by #12

Comments


snarfed commented Nov 18, 2017

evidently, along with the recent switch to 280 chars, twitter also changed the way they count chars. (IRC discussion.) it's no longer simply unicode code points, it's some weighted thing i don't understand yet. one big effect is that emoji seem to now count for two chars, not one.

this made e.g. @tantek's recent bridgy publish attempt fail. we tried to publish this content:

This is meditation. #FBF to last Saturday, in the zone, racing the Mt. Tam Trail Run half marathon with singular focus. 📷 chasquirunner.com

Awake the night before @TheNorthFace #ECSCA, thinking back to last Saturday’s challenges and triumphs, this… 

http://tantek.com/2017/321/t2/last-saturday-racing-tam-half-marathon

which twitter 403ed with "Tweet needs to be a bit shorter."

if you count chars normally, ie one per code point with each link counted as its 23 char t.co form, it's 319 total, - 17 + 23 for chasquirunner.com, - 68 + 23 for the permalink, == 280.
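here's that arithmetic as a minimal python sketch. `naive_length` and `TCO_LENGTH` are hypothetical names for illustration, not brevity's actual API, and the 319 total is taken from the count above rather than recomputed.

```python
TCO_LENGTH = 23  # every link is charged as its t.co wrapped form

def naive_length(total_code_points, link_lengths):
    """Old-style count: one per code point, except links cost TCO_LENGTH each."""
    return total_code_points + sum(TCO_LENGTH - n for n in link_lengths)

links = [
    'chasquirunner.com',
    'http://tantek.com/2017/321/t2/last-saturday-racing-tam-half-marathon',
]
link_lengths = [len(link) for link in links]  # [17, 68]

print(naive_length(319, link_lengths))  # 280
```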

however, if you paste it into the twitter UI, that says it's 3 chars over. deleting the camera emoji or the ellipsis drops it by 2 each. i haven't found the last extra char yet.

goddammit.


snarfed commented Nov 18, 2017

ok, after reading the docs, i understand the new way. some chars (code points) now count for two instead of one, and they have a data-driven config that determines which are which. it'll take a little work to implement, but not a ton. seems doable.

https://developer.twitter.com/en/docs/developer-utilities/twitter-text
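a rough python 3 sketch of that weighted count, to make the idea concrete. the ranges and weights below are my reading of the twitter-text v2 config and should be treated as assumptions to check against the published config file. the function names are hypothetical, and this skips link handling and unicode normalization entirely.

```python
# assumed values from the twitter-text v2 config; verify before relying on them
SCALE = 100
DEFAULT_WEIGHT = 200  # anything outside the ranges below counts double
RANGE_WEIGHT = 100
SINGLE_WEIGHT_RANGES = [(0, 4351), (8192, 8205), (8208, 8223), (8242, 8247)]

def char_weight(ch):
    """Scaled weight of one code point."""
    code_point = ord(ch)
    for start, end in SINGLE_WEIGHT_RANGES:
        if start <= code_point <= end:
            return RANGE_WEIGHT
    return DEFAULT_WEIGHT

def weighted_length(text):
    """Weighted length of text, ignoring links and normalization."""
    return sum(char_weight(ch) for ch in text) // SCALE

print(weighted_length('abc'))         # 3
print(weighted_length('\U0001F4F7'))  # 2, the camera emoji
print(weighted_length('\u2026'))      # 2, the ellipsis
```

under these assumed ranges the camera and the ellipsis both count double, which matches what the twitter UI showed for the tweet above.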

snarfed closed this as completed Nov 21, 2017
snarfed added a commit to snarfed/brevity-testcases that referenced this issue Nov 24, 2017

snarfed commented Nov 24, 2017

reopening, this is still buggy. :/

added a failing test case in kylewm/brevity-testcases#3. the problem is that the new str_length() returns the weighted count, but we also use that count to index into the python string, which we shouldn't. weighted length and string index are separate measurements that can't be mixed.
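a small sketch of why the two can't be mixed, and one way to keep them apart. this is not brevity's code: `weight` is a simplified stand-in for the real config lookup, `truncate_to_weight` is a hypothetical helper, and it assumes python 3 strings, where indexes are code points.

```python
def weight(ch):
    # simplified stand-in: code points up to U+10FF count 1, everything else 2
    return 1 if ord(ch) <= 0x10FF else 2

def weighted_length(text):
    return sum(weight(ch) for ch in text)

def truncate_to_weight(text, budget):
    """Longest prefix of text whose weighted length fits in budget.

    Tracks the weighted total and the string index separately, so the
    slice always uses a real index into the string.
    """
    used = 0
    for index, ch in enumerate(text):
        used += weight(ch)
        if used > budget:
            return text[:index]
    return text

text = '\U0001F4F7\U0001F4F7abc'  # 5 code points, weighted length 7

# buggy: treating the weighted budget as a string index keeps too much
print(weighted_length(text[:5]))                     # 7, over a budget of 5

# separate measurements: the result fits the budget
print(weighted_length(truncate_to_weight(text, 5)))  # 5
```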
