This repository has been archived by the owner on Sep 18, 2021. It is now read-only.

Guidelines for new conformance tests #26

Open
garu opened this issue Dec 18, 2011 · 3 comments

Comments

@garu
Contributor

garu commented Dec 18, 2011

I'm writing a new twitter-text library and it already passes all tests in 'twitter-text-conformance', but I'm not really confident it's correct. I say that because I feel a lot of unit tests are missing.

Take the username validation in validate.yml, for example. It has one test described as "valid username: a-z < 20 characters" and another described as "All numeric username are allowed". What about mixing letters and digits? Is 20 the overall username length limit? What about Unicode? Is "@" a valid username? Should the Unicode "@" also be considered a valid username marker?

I had a lot of questions like that while I coded, and I still do. I'd love to volunteer and write all those tests, but I'm not the authority here, so I can't decide what's valid and what's not off the top of my head. Nor am I willing to post test tweets from my own Twitter account just for testing purposes (my followers would go crazy :)

tl;dr - Is there an implementation I can use as "correct"? That way I can treat it as authoritative and see whether the new tests pass or fail.

Thanks!
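
To make the request concrete, the open questions above could be turned into candidate inputs and run against whichever implementation ends up being treated as the reference, recording what it does rather than guessing. The following is only a sketch: `UsernameProbe`, `candidates`, and `record` are hypothetical names, the fullwidth "＠" (U+FF20) is my assumption about what "the Unicode @" means, and the stand-in predicate in `main` is a placeholder, not Twitter's actual rule.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical harness; none of these names come from twitter-text.
final class UsernameProbe {

    // Candidate inputs taken from the open questions in this issue.
    // Whether each one is valid is exactly what is unknown.
    static List<String> candidates() {
        List<String> cases = new ArrayList<>();
        cases.add("@abc123");              // letters and digits mixed
        cases.add("@" + "a".repeat(19));   // 19-character name
        cases.add("@" + "a".repeat(20));   // 20-character name
        cases.add("@" + "a".repeat(21));   // 21-character name
        cases.add("@");                    // marker with no name
        cases.add("\uFF20abc");            // assumed fullwidth at sign as marker
        cases.add("@j\u00FCrgen");         // non-ASCII letter in the name
        return cases;
    }

    // Runs every candidate through the supplied reference implementation
    // and records the answer, preserving insertion order.
    static Map<String, Boolean> record(Predicate<String> reference) {
        Map<String, Boolean> observed = new LinkedHashMap<>();
        for (String input : candidates()) {
            observed.put(input, reference.test(input));
        }
        return observed;
    }

    public static void main(String[] args) {
        // Placeholder predicate for illustration only; swap in a call to
        // the implementation that is considered authoritative.
        Predicate<String> standIn = s -> s.matches("@[A-Za-z0-9_]+");
        record(standIn).forEach((input, valid) ->
                System.out.println(input + " -> " + valid));
    }
}
```

The recorded answers could then be reviewed and, once confirmed, written back into validate.yml as new conformance cases.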

@keitaf
Contributor

keitaf commented Jan 3, 2012

Hi @garu

Here are the "official / reference" implementations of twitter-text.
https://github.com/twitter/twitter-text-rb
https://github.com/twitter/twitter-text-js
https://github.com/twitter/twitter-text-java

They're updated frequently, so we cannot say the current implementations define the "correct" behavior, but they are the ones currently used in the productions.

@ablick

ablick commented Jan 4, 2012

Can you please clarify what you mean by "they are the ones currently used in the productions"? If they are used in production on Twitter itself, then isn't that the de facto "correct" behavior?

I ask because I've found a scenario where the twitter-text-java implementation behaves differently than the text box on Twitter.com. If you enter " http://google.com ", where the spaces before and after the URL are UTF-8 non-breaking-space characters (\u00A0 in Java), then the text box on Twitter.com will find the link, count it as 20 characters, and display "Link will appear shortened." But when you pass that same string through the twitter-text-java library, it won't find the URL when you call Extractor.extractURLs().
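
For anyone checking their own implementation against this case, here is a small self-contained sketch of how Java classifies \u00A0. It does not use twitter-text-java and I am not claiming this is the cause of the difference there; it only shows one plausible way a Java extractor could miss a URL delimited by non-breaking spaces if it relies on the default `\s` or on `Character.isWhitespace`. The class and method names (`NbspCheck`, `defaultMatches`, `unicodeMatches`) are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of twitter-text-java.
final class NbspCheck {

    // By default, \s in java.util.regex covers only ASCII whitespace.
    private static final Pattern DEFAULT_WS = Pattern.compile("\\s");

    // With UNICODE_CHARACTER_CLASS, \s follows the Unicode White_Space
    // property, which includes U+00A0.
    private static final Pattern UNICODE_WS =
            Pattern.compile("\\s", Pattern.UNICODE_CHARACTER_CLASS);

    static boolean defaultMatches(String text) {
        return DEFAULT_WS.matcher(text).find();
    }

    static boolean unicodeMatches(String text) {
        return UNICODE_WS.matcher(text).find();
    }

    public static void main(String[] args) {
        // The URL from the report, wrapped in non-breaking spaces.
        String input = "\u00A0http://google.com\u00A0";

        System.out.println("default \\s finds whitespace: " + defaultMatches(input));
        System.out.println("unicode \\s finds whitespace: " + unicodeMatches(input));
        System.out.println("Character.isWhitespace: " + Character.isWhitespace('\u00A0'));
        System.out.println("Character.isSpaceChar:  " + Character.isSpaceChar('\u00A0'));
    }
}
```

On a standard JDK, the default pattern and `Character.isWhitespace` do not treat \u00A0 as whitespace, while the Unicode-aware pattern and `Character.isSpaceChar` do, so code that expects "whitespace" before a URL can go either way depending on which check it uses.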

@keitaf
Contributor

keitaf commented Jan 5, 2012

Yes, they are used in production on Twitter itself.

And thank you for reporting a bug. Ideally they should all behave the same way (and twitter-text-conformance exists to help verify their consistency), but as you pointed out, there are still some inconsistencies. I'll fix the bug in twitter-text-java.
