New JSInterpreter Features #11292

sulyi · 2016-11-24T21:56:26Z

Make sure you are using the latest version: run `youtube-dl --version` and ensure your version is 2016.11.22. If it's not read this FAQ entry and update. Issues with outdated version will be rejected.

I've verified and I assure that I'm running youtube-dl 2016.11.22

Before submitting an issue make sure you have:

At least skimmed through README and most notably FAQ and BUGS sections
Searched the bugtracker for similar issues including closed ones

What is the purpose of your issue?

Bug report (encountered problems with youtube-dl)
Site support request (request for adding support for a new site)
Feature request (request for a new functionality)
Question
Other

Description of your issue, suggested solution and other information

I think JSInterpreter has rather limited features. I've started to implement some new ones at #11272.
To do so I'm using a syntax grammar and actual parsing.
I wouldn't mind some feedback, help and test cases, or just a discussion about it.

yan12125 · 2016-11-25T13:36:01Z

Actual parsing is complicated, as JavaScript is not a good language (neither is Python, lol). Is there a need?

Personally I'm against such changes, as that sounds like reinventing the wheel.

sulyi · 2016-11-25T21:54:46Z

Yes there is a need. I'd like to do it.
When you say wheel, are you referring to jaspyon or pynarcissus?
I think building an interpreter from scratch has its own benefits and is really not that complicated.
--- edit ---
I'm rather worried about how efficient it can be made. With some clever solutions it'll hopefully be OK.

yan12125 · 2016-11-26T07:47:45Z

> When you say wheel, are you referring to jaspyon or pynarcissus?

No. I was referring to SpiderMonkey of Gecko, V8 of Blink, JavaScriptCore of WebKit and ChakraCore of IE/Edge, and maybe other JS engines used in popular browsers.

> Yes there is a need

Could you give some concrete examples (websites, etc.)? I'm not sure what #11272 should cover. Without a limited set of targets, I'll review it as a real, complete JS engine.

Well, at least obfuscated code like #8489 (comment) (from openload) and iqiyi's login SDK should be supported:

$ curl "http://kylin.iqiyi.com/get_token" | jq .sdk
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  6309    0  6309    0     0  30814      0 --:--:-- --:--:-- --:--:-- 30775
"eval(function(p,a,c,k,e,r){e=function(c){return(c<a?'':e(parseInt(c/a)))+((c=c%a)>35?String.fromCharCode(c+29):c.toString(36))};if(!''.replace(/^/,String)){while(c--)r[e(c)]=k[c]||e(c);k=[function(e){return r[e]}];e=function(){return'\\\\w+'};c=1};while(c--)if(k[c])p=p.replace(new RegExp('\\\\b'+e(c)+'\\\\b','g'),k[c]);return p}('j J=q(){j r=q(r,n){B r<<n|r>>>32-n};j n=q(r,n){j t,a,o,e,u;o=r&1j;e=n&1j;t=r&1c;a=n&1c;u=(r&1u)+(n&1u);O(t&a){B u^1j^o^e}O(t|a){O(u&1c){B u^2R^o^e}Q{B u^1c^o^e}}Q{B u^o^e}};j t=q(r,n,t){B r&n|~r&t};j a=q(r,n,t){B r&t|n&~t};j o=q(r,n,t){B r^n^t};j e=q(r,n,t){B n^(r|~t)};j u=q(a,o,e,u,i,v,c){a=n(a,n(n(t(o,e,u),i),c));B n(r(a,v),o)};j i=q(t,o,e,u,i,v,c){t=n(t,n(n(a(o,e,u),i),c));B n(r(t,v),o)};j v=q(t,a,e,u,i,v,c){t=n(t,n(n(o(a,e,u),i),c));B n(r(t,v),a)};j c=q(t,a,o,u,i,v,c){t=n(t,n(n(e(a,o,u),i),c));B n(r(t,v),a)};j f=q(r){j n;j t=r.E;j a=t+8;j o=(a-a%1r)/1r;j e=(o+1)*16;j u=1d(e-1);j i=0;j v=0;2E(v<t){n=(v-v%4)/4;i=v%4*8;u[n]=u[n]|r.1n(v)<<i;v++}n=(v-v%4)/4;i=v%4*8;u[n]=u[n]|19<<i;u[e-2]=t<<3;u[e-1]=t>>>29;B u};j d=q(r){j n=\"\",t=\"\",a,o;K(o=0;o<=3;o++){a=r>>>o*8&2M;t=\"0\"+a.1s(16);n=n+t.2W(t.E-2,2)}B n};j g=q(r){r=r.1D(/\\\\1H\\\\2C/g,\"\\\\n\");j n=\"\";K(j t=0;t<r.E;t++){j a=r.1n(t);O(a<19){n+=N.M(a)}Q O(a>2I&&a<2K){n+=N.M(a>>6|2N);n+=N.M(a&1h|19)}Q{n+=N.M(a>>12|2X);n+=N.M(a>>6&1h|19);n+=N.M(a&1h|19)}}B n};B q(r){r+=\"\";j t=1d();j a,o,e,l,p,s,h,m,C;j k=7,z=12,Z=17,b=22;j I=5,S=9,w=14,A=20;j R=4,U=11,y=16,x=23;j 
D=6,L=10,T=15,1a=21;r=g(r);t=f(r);s=2Z;h=30;m=31;C=35;K(a=0;a<t.E;a+=16){o=s;e=h;l=m;p=C;s=u(s,h,m,C,t[a+0],k,39);C=u(C,s,h,m,t[a+1],z,1v);m=u(m,C,s,h,t[a+2],Z,1w);h=u(h,m,C,s,t[a+3],b,1x);s=u(s,h,m,C,t[a+4],k,1y);C=u(C,s,h,m,t[a+5],z,1z);m=u(m,C,s,h,t[a+6],Z,1A);h=u(h,m,C,s,t[a+7],b,1B);s=u(s,h,m,C,t[a+8],k,1C);C=u(C,s,h,m,t[a+9],z,3f);m=u(m,C,s,h,t[a+10],Z,1E);h=u(h,m,C,s,t[a+11],b,1F);s=u(s,h,m,C,t[a+12],k,1G);C=u(C,s,h,m,t[a+13],z,1I);m=u(m,C,s,h,t[a+14],Z,1J);h=u(h,m,C,s,t[a+15],b,1K);s=i(s,h,m,C,t[a+1],I,1L);C=i(C,s,h,m,t[a+6],S,1M);m=i(m,C,s,h,t[a+11],w,1N);h=i(h,m,C,s,t[a+0],A,1O);s=i(s,h,m,C,t[a+5],I,1P);C=i(C,s,h,m,t[a+10],S,1Q);m=i(m,C,s,h,t[a+15],w,1R);h=i(h,m,C,s,t[a+4],A,1S);s=i(s,h,m,C,t[a+9],I,1T);C=i(C,s,h,m,t[a+14],S,1U);m=i(m,C,s,h,t[a+3],w,1V);h=i(h,m,C,s,t[a+8],A,1W);s=i(s,h,m,C,t[a+13],I,1X);C=i(C,s,h,m,t[a+2],S,1Y);m=i(m,C,s,h,t[a+7],w,1Z);h=i(h,m,C,s,t[a+12],A,24);s=v(s,h,m,C,t[a+5],R,25);C=v(C,s,h,m,t[a+8],U,26);m=v(m,C,s,h,t[a+11],y,27);h=v(h,m,C,s,t[a+14],x,28);s=v(s,h,m,C,t[a+1],R,2a);C=v(C,s,h,m,t[a+4],U,2b);m=v(m,C,s,h,t[a+7],y,2c);h=v(h,m,C,s,t[a+10],x,2d);s=v(s,h,m,C,t[a+13],R,2e);C=v(C,s,h,m,t[a+0],U,2f);m=v(m,C,s,h,t[a+3],y,2g);h=v(h,m,C,s,t[a+6],x,2h);s=v(s,h,m,C,t[a+9],R,2i);C=v(C,s,h,m,t[a+12],U,2j);m=v(m,C,s,h,t[a+15],y,2k);h=v(h,m,C,s,t[a+2],x,2l);s=c(s,h,m,C,t[a+0],D,2m);C=c(C,s,h,m,t[a+7],L,2n);m=c(m,C,s,h,t[a+14],T,2o);h=c(h,m,C,s,t[a+5],1a,2p);s=c(s,h,m,C,t[a+12],D,2q);C=c(C,s,h,m,t[a+3],L,2r);m=c(m,C,s,h,t[a+10],T,2s);h=c(h,m,C,s,t[a+1],1a,2t);s=c(s,h,m,C,t[a+8],D,2u);C=c(C,s,h,m,t[a+15],L,2v);m=c(m,C,s,h,t[a+6],T,2w);h=c(h,m,C,s,t[a+13],1a,2x);s=c(s,h,m,C,t[a+4],D,2y);C=c(C,s,h,m,t[a+11],L,2z);m=c(m,C,s,h,t[a+2],T,2A);h=c(h,m,C,s,t[a+9],1a,2B);s=n(s,o);h=n(h,e);m=n(m,l);C=n(C,p)}j X=d(s)+d(h)+d(m)+d(C);B X.2D()}}();j F={G:q(r){B q(r,n){B q(r){B{H:r}}(q(t){j a,o=0;K(j e=r;o<t[\"E\"];o++){j u=n(t,o);a=o===0?u:a^u}B a?e:!e})}(q(n,t,a,o){j e=2G;j u=o(t,a)-n(r,e);B 
2H}(18,2J,q(r){B(\"\"+r)[\"P\"](1,(r+\"\")[\"E\"]-1)}(\"2L\"),q(r,n){B(1p r)[n]()}),q(r,n){j t=18(r[\"1q\"](n),16)[\"1s\"](2);B t[\"1q\"](t[\"E\"]-1)})}(\"2O\")};j 2P=q(r){j n=1p 1d;j t;O(r&&r.E>0){j a=r.W(\"*\");K(t=0;t<a.E-1;t++){2Q(t%3){1e 0:n+=N.M(18(a[t],8));Y;1e 1:n+=N.M(18(a[t],10));Y;1e 2:n+=N.M(18(a[t],16));Y}}B n}Q{B\"\"}};q 1m(r,n){j t=q(){r=J(r)};j a=q(){o(r.E,32)?2S():\"\";j t=F.G.H(\"2T\")?n.W(\".\"):4;j a=q(){r=J(r)};j e=F.G.H(\"2U\")?[]:0;j u=F.G.H(\"2V\")?0:8};j o=q(r,n){B r!=n};r+=\"\";1f{j e=q(){r=J(r)};1l&&1b?e():\"\"}1i(u){r=r+\"2Y\"}o(r.E,32)?t():\"\";j i=F.G.H(\"4\")?n.W(\".\"):\"\";K(j v=0;v<i.E;v++){r+=i[v]%7}B r}V=1m(V,1k);q 1t(r,n){j t=q(){r=J(r)};j a=q(r,n){B r!=n};r+=\"\";j o=q(){c+=r.P(g,r.E)};1f{j e=q(){r=J(r)};1l&&1b?e():\"\"}1i(u){r=r+\"1b\"}a(r.E,32)?t():\"\";j i=F.G.H(\"33\")?4:n.W(\".\");j v=F.G.H(\"34\")?\"1g\":[];j c=F.G.H(\"36\")?\"\":\".\";j f=F.G.H(\"37\")?32:0;K(j d=0;d<i.E;d++){v.38(i[d]%10)}K(j g=0;g<r.E;g+=5){j l=q(r,n){B r<n};O(l(f,4)){c+=r.P(g,g+5)+v[f];f++}Q{c+=r.P(g,r.E);Y}}B c;j p=q(){c+=r.P(g,r.E);j n=q(){r=J(r)};r=J(r)}}V=1t(V,1k);q 1o(r,n){j t=q(){r=J(r)};j a=q(r,n){B r!=n};r+=\"\";1f{j o=q(){r=J(r)};3a&&3b.1g?o():\"\"}1i(e){r=r+\"1g\"}j u=q(){r=r+\"1b\";1l&&1b?3c():\"\";a(r.E,32)?t():\"\"};a(r.E,32)?t():\"\";j i=F.G.H(\"3d\")?\"\":n.W(\".\");j v=F.G.H(\"3e\")?\"\":2F;j c=F.G.H(\"c\")?0:\".\";K(j f=0;f<r.E;f+=4){j d=q(r,n){B r<n};O(d(c,4)){v+=r.P(f,f+4)+i[c];c++}Q{v+=r.P(f,r.E);Y}}B 
v}V=1o(V,1k);',62,202,'|||||||||||||||||||var|||||||function|||||||||||return|||length|k0touZ|z0|p0||md5|for||fromCharCode|String|if|substring|else|||||input|split||break||||||||||parseInt|128|_|navigator|1073741824|Array|case|try|decodeURI|63|catch|2147483648|ip|location|mod7|charCodeAt|split4|new|charAt|64|toString|split5|1073741823|3905402710|606105819|3250441966|4118548399|1200080426|2821735955|4249261313|1770035416|replace|4294925233|2304563134|1804603682|x0d|4254626195|2792965006|1236535329|4129170786|3225465664|643717713|3921069994|3593408605|38016083|3634488961|3889429448|568446438|3275163606|4107603335|1163531501|2850285829|4243563512|1735328473|||||2368359562|4294588738|2272392833|1839030562|4259657740||2763975236|1272893353|4139469664|3200236656|681279174|3936430074|3572445317|76029189|3654602809|3873151461|530742520|3299628645|4096336452|1126891415|2878612391|4237533241|1700485571|2399980690|4293915773|2240044497|1873313359|4264355552|2734768916|1309151649|4149444226|3174756917|718787259|3951481745|x0a|toLowerCase|while|100|785|true|127|Date|2048|_getTime2|255|192|ecg6mf6ar|Decode|switch|3221225472|m1hIQ|b93|947d|e36|substr|224|locationnavigator|1732584193|4023233417|2562383102||c8|f4c3|271733878|7167|7f|push|3614090360|document|window|ZifVJ|253e|8af6|2336552879'.split('|'),0,{}));"

sulyi · 2016-11-26T08:25:28Z

I'll need to look into your suggestions, @yan12125, more carefully, but I don't think any of the engines you mentioned is written in py.

Jaspyon can be found at https://bitbucket.org/santagada/jaspyon/ and pynarcissus at https://github.com/jtolds/pynarcissus, these are the JavaScript interpreters in py that I found.

As for the scope, I'd say that at first I'd aim to support all the code that is supported right now. That should be plenty to see how it goes, and a good base to build on and widen the support if it succeeds. My ultimate goal is indeed a complete JS engine, but saying that out loud does sound a bit overly ambitious at the moment.

yan12125 · 2016-11-26T09:26:59Z

> I don't think any of the engines you mentioned is written in py

Yes, they are implemented in C/C++/Objective-C. If your goal is a complete JS engine, they are excellent choices. Python wrappers for those existing engines are much easier to write and maintain than a new Python implementation from scratch. Could you explain why a pure Python implementation is necessary?

sulyi · 2016-11-26T11:01:16Z

At the end of the day, it's up to the maintainers to decide if this is able to serve their needs or not.

As far as platform independence goes, it's better not to have native code or third-party modules.

Implementing a JavaScript interpreter seems like a whole lot of fun/challenge/practice/good reference...

Also why not?

yan12125 · 2016-11-26T16:00:59Z

@siddht1 What @sulyi wants is a JavaScript interpreter in youtube-dl that handles scripts on web pages, not YouTube downloaders written in JavaScript. They are different.

> As far as platform independence goes, it's better not to have native code or third-party modules.

That depends. If there are already excellent solutions for a complex task, few people will re-implement them outside of toy projects. If you need a library that is not in your target language, a binding/wrapper is the most common way. For example, the HTTP and TLS protocols are easy, so there are python-requests and python-tls. On the other hand, GUI systems are complex, so there are PyQt, PyGTK, and CPython's built-in tkinter module, but I have never seen a GUI toolkit in pure Python. JSInterpreter was originally implemented to handle the signature decryption functions on YouTube, which are simple JavaScript code. As that task is simple, a pure Python implementation is the best choice. On the other hand, bridging existing JS engines is better if you need all JavaScript features, in terms of development and maintenance effort as well as performance.

> Implementing a JavaScript interpreter seems like a whole lot of fun/challenge/practice/good reference...

Changes in youtube-dl should solve real problems. Of course it's fun, but this is not the place.

> Also why not?

I have no plan to reject your pull request. I'm asking about your goal so that I can determine whether a pull request is ready or not. As you've said, your goal is a complete JS engine, so you can continue your work. When it's almost done (for example, when it passes most of SpiderMonkey's test suite), we can come back and start reviewing it.

siddht4 · 2016-11-27T05:06:02Z

@yan that's great, by the way. I just suggested a different approach to the fix. I would be ready to help if required. The engine for Mozilla is not Gecko anymore; the engine could be written in Node.js, as it supports native apps too.

sulyi · 2016-11-28T02:35:38Z

Thanks, @siddht1, it was actually helpful. After studying narcissus and other engines, and reading the specs, I've realized I can't continue without first designing this. So, having done that, I now need to tear almost everything down and redo it. Hopefully it'll go faster the second time.
--- edit ---
Basic concept:
The tokens of the lexical grammar will be a dictionary of regular expression strings; the productions of the syntactic grammar will be described by another dictionary containing lists of token ids. The lexer would use both dictionaries to compile the actual regex used to match tokens, and return the tokens grouped by statement as a list. Then the parser would use the second dictionary to decide how these tokens should be interpreted according to the syntactic grammar.
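
A minimal sketch of the lexer half of this idea, under loud assumptions: the token ids, the patterns and the function names `tokenize` and `statements` are hypothetical and not taken from #11272, and only the lexical dictionary is used here (the syntactic one is left out). A dictionary of regex strings is compiled into a single alternation of named groups, and the tokens are then grouped by statement:

```python
import re

# Hypothetical token table: token id -> regular expression string.
# The ids and patterns are illustrative only, not those used in #11272.
TOKENS = {
    'float': r'\d+\.\d+',
    'int': r'\d+',
    'id': r'[A-Za-z_$][A-Za-z0-9_$]*',
    'op': r'[-+*/=]',
    'popen': r'\(',
    'pclose': r'\)',
    'end': r';',
    'ws': r'\s+',
}

# Alternation order matters: 'float' must be tried before 'int'.
ORDER = ['float', 'int', 'id', 'op', 'popen', 'pclose', 'end', 'ws']

LEXER_RE = re.compile(
    '|'.join('(?P<%s>%s)' % (name, TOKENS[name]) for name in ORDER))


def tokenize(code):
    """Yield (token_id, value) pairs, skipping whitespace."""
    pos = 0
    while pos < len(code):
        m = LEXER_RE.match(code, pos)
        if m is None:
            raise ValueError('Unexpected character %r at %d' % (code[pos], pos))
        pos = m.end()
        if m.lastgroup != 'ws':
            yield m.lastgroup, m.group()


def statements(code):
    """Group the token stream into one list per statement (split on 'end')."""
    stmt = []
    for token in tokenize(code):
        stmt.append(token)
        if token[0] == 'end':
            yield stmt
            stmt = []
    if stmt:
        yield stmt
```

For example, `list(statements('a = 1; b = 2'))` gives two token lists, one per statement. Using one compiled alternation keeps matching in a single `re` call per token, at the cost of having to order the alternatives by hand.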

siddht4 · 2016-11-28T06:21:21Z

@yan @sulyi I am unable to view my first comment. If you have it, kindly repost it or at least mail it to me.

sulyi · 2016-11-28T11:02:17Z

@siddht1 Sry, neither can I.

mozbugbox · 2016-11-28T12:33:59Z

Js2Py is another option which claims to support ECMA 5

Pure Python JavaScript Translator/Interpreter

Everything is done in 100% pure Python so it's extremely easy to install and use. Supports Python 2 & 3. Full support for ECMAScript 5.1, ECMA 6 support is still experimental.

yan12125 · 2016-11-28T13:55:34Z

Js2Py looks good so far. The only two issues I found:

No support for defining functions like I can in SpiderMonkey. pyimport seems to be a way, but I'd like cleaner approaches. It would be good to have this for PAC support (how to configure youtebe-dl when using PAC on windows 7 #8278)
It generates Python code and runs it. That sounds like a security weakness

siddht4 · 2016-11-28T14:20:09Z

In my opinion this is just one part of the puzzle; maybe an engine should be developed first, similar to creating cobbler in Java.

Js2Py lacks many components, but other similar projects can help to narrow the gap.

What everybody fails to see is that the flow will break:

youtube-dl -----> in js engine -------> out js engine -----> site

site ----> js engine (fails to parse if it gets inbound where it's supposed to get outbound)

fix

youtube-dl <-----> js engine #1 (inbound) <----------------------------- site (Send)
| /
| /
-------> js engine #2- (outbound)----> site (GET) /

siddht4 · 2016-11-28T14:25:33Z

Sorry, the diagram didn't come out as expected. Wait, this is what I wanted to suggest

sulyi · 2016-11-29T04:50:04Z

Funny thing: starting this, I was hoping to discuss how to do this, not why I shouldn't. Yes, existing solutions can be very helpful, but only up to a certain point. Right now I need to implement the parser, using the tokens. Yet I might make some further minor changes to the lexer first, e.g. reserved words can get a collective token id, since value and id would be an injective relation otherwise. Hopefully, once I start working on the parser, it'll start to show some promising signs.

sulyi · 2016-11-30T05:04:10Z

Can someone enlighten me as to what this means in the spec:

> The ExpressionNoIn production is evaluated in the same manner as the Expression production except that the contained ExpressionNoIn and AssignmentExpressionNoIn are evaluated instead of the contained Expression and AssignmentExpression, respectively.

Is this just trying to say that parentheses are resolved? Because I couldn't find any clue to that.

yan12125 · 2016-11-30T06:15:57Z

Did you mean section 11.14? That describes the order of evaluation of expressions involving a comma operator.

sulyi · 2016-11-30T06:55:12Z

Yes, among others, like VariableDeclarationNoIn and RelationalExpressionNoIn; the latter has an even more confusing note:

> The “NoIn” variants are needed to avoid confusing the in operator in a relational expression with the in operator in a for statement.

yan12125 · 2016-11-30T07:07:41Z

I guess NoIn variants are for simpler grammars - you don't need to peek so many tokens when building a lookahead LL parser.

sulyi · 2016-12-03T09:21:36Z

I'm thinking of using the shunting-yard algorithm for assignment/conditional expressions, simply to reduce the number of methods. Looking at other interpreters' sources, it seems an AST is preferred. I'd love to hear pros and cons.

yan12125 · 2016-12-03T13:13:54Z

That sounds fine. The only concern is error reporting for invalid expressions. In youtube-dl it's OK to assume all inputs are valid.

sulyi · 2016-12-03T17:33:33Z

Grouping will still be handled by expression, and conditional expression will also have its own token. Therefore error handling can be done properly, in my opinion.
Having implemented it, I think it's kind of the same; the only difference is that shunting-yard uses a local operator stack instead of the call stack. Interpretation might have to be a little bit different, though. In its current state that's broken and I'm a bit nervous about it, but it's getting there.
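
To illustrate the "local operator stack instead of the call stack" point, here is a small shunting-yard sketch. It is not the code from #11272: the operator table and the names `to_rpn` and `eval_rpn` are made up for this example, and it only covers left-associative binary operators and parentheses. Assignment and the conditional operator are right-associative in JavaScript and would need extra handling.

```python
import operator

# Hypothetical operator table: symbol -> (precedence, function).
# All four operators are left-associative.
OPERATORS = {
    '+': (1, operator.add),
    '-': (1, operator.sub),
    '*': (2, operator.mul),
    '/': (2, operator.truediv),
}


def to_rpn(tokens):
    """Convert an infix token list to reverse Polish notation."""
    output, stack = [], []
    for tok in tokens:
        if tok in OPERATORS:
            # Pop operators that bind at least as tightly (left-associativity).
            while (stack and stack[-1] in OPERATORS
                   and OPERATORS[stack[-1]][0] >= OPERATORS[tok][0]):
                output.append(stack.pop())
            stack.append(tok)
        elif tok == '(':
            stack.append(tok)
        elif tok == ')':
            while stack and stack[-1] != '(':
                output.append(stack.pop())
            if not stack:
                raise ValueError('Unbalanced parentheses')
            stack.pop()  # discard the matching '('
        else:
            output.append(tok)  # operand
    while stack:
        op = stack.pop()
        if op == '(':
            raise ValueError('Unbalanced parentheses')
        output.append(op)
    return output


def eval_rpn(rpn):
    """Evaluate a reverse Polish token list with a value stack."""
    stack = []
    for tok in rpn:
        if tok in OPERATORS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATORS[tok][1](left, right))
        else:
            stack.append(tok)
    return stack[0]
```

Here `eval_rpn(to_rpn([1, '+', 2, '*', 3]))` evaluates the multiplication first. Unbalanced parentheses are reported as a `ValueError`, which suggests that basic error reporting is still possible with this approach.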

sulyi · 2016-12-03T19:19:24Z

Another thing: I'm thinking about refactoring. A separate grammar/tstream module would be nice in its own package, like:

youtube_dl
 |
 +-- jsinterp
      |
      +-- __init__.py
      |
      +-- jsinterp.py
      |
      +-- <grammar.py>
      |
      +-- tstream.py

and in __init__.py:

from .jsinterp import JSInterpreter
from .tstream import TokenStream

__all__ = ['JSInterpreter', 'TokenStream']

I believe that would be backward compatible too.

yan12125 · 2016-12-04T09:10:32Z

Refactoring is the way to go. But is there a need to expose TokenStream? I thought it's used internally in JSInterpreter only.

sulyi · 2016-12-04T12:00:49Z

Sure, that's a valid question that I don't think I have an answer to. My thought was that one might want to have a tokenizer for the grammar.

The other thing I was thinking about is error handling. I don't see it as a single function, therefore I don't know how to not do it, if that makes any sense.
Anyway, it might be worth discussing more how error reporting should happen.

sulyi · 2016-12-05T13:02:04Z

With a partial parser in place, there's only the interpreter left to implement in order to get to the milestone I mentioned before. I might still be underestimating the complexity of the remaining work, but I've started thinking about testing and about dynamically adding SpiderMonkey's tests to the test cases.

yan12125 · 2016-12-05T14:42:39Z

Thanks for that! Could you add some more tests first? Now #11272 is big enough and I guess it's fragile to refactoring.

sulyi · 2016-12-06T18:47:59Z

Probably it could be made much more robust by introducing a Token and/or an ASTree class.

There were some test cases (e.g. instantiation, off the top of my head) that I was wondering, while implementing things, how it would perform against. I would love some suggestions, though.

There are quite a few TODOs left before it passes the current test cases; I just wanted to put the idea of dynamic test cases out there. I'm not sure how to do it, but I hope reading the SpiderMonkey documentation will help.
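
One possible way to generate test cases dynamically with unittest is to attach test methods to a TestCase class from a table. This is only a sketch: the `CASES` table, the `evaluate` stand-in (which merely parses integers here) and the class name are assumptions for illustration, not part of youtube-dl's test suite.

```python
import unittest

# Hypothetical test table: (name, script, expected value).
CASES = [
    ('decimal', '42', 42),
    ('negative', '-7', -7),
]


def evaluate(script):
    """Stand-in for the interpreter under test; here it only parses integers."""
    return int(script)


class TestDynamic(unittest.TestCase):
    """Test methods are attached below, one per entry in CASES."""


def _make_test(script, expected):
    # A factory function binds script/expected per test, avoiding the
    # late-binding pitfall of defining closures directly inside the loop.
    def test(self):
        self.assertEqual(evaluate(script), expected)
    return test


for name, script, expected in CASES:
    setattr(TestDynamic, 'test_' + name, _make_test(script, expected))
```

The table could just as well be filled from files on disk, so that externally sourced scripts only need to be copied into a directory once.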

sulyi · 2016-12-08T14:36:21Z

I've just added parser tests. Subsequently I got stuck with the interpreter and as a result made a mess.
I'm thinking a Reference helper class and a context stack for JSInterpreter would be helpful.

# ExtractorError lives in youtube_dl.utils; `undefined` is assumed to be the
# interpreter's sentinel object for the JavaScript undefined value.
class Reference(object):
    def __init__(self, name, value, parent_key=None):
        self._type = name
        self._value = value
        # parent_key is an optional (container, key) pair locating the value
        if parent_key is not None:
            self._parent, self._key = parent_key
        else:
            self._parent, self._key = None, None

    def getvalue(self):
        return self._value

    def putvalue(self, value):
        # Only references that live inside a container can be written to
        if self._parent is not None and self._key is not None:
            self._value = value
            self._parent[self._key] = value
        else:
            raise ExtractorError('Reference type %s is read-only' % self._type)

    def delete(self):
        self._value = undefined
        self._type = None
        if self._parent is not None and self._key is not None:
            del self._parent[self._key]
        # No need for error report here!

Reference._parent would be either local_vars (top-level), an array or an object literal, and Reference._key would be an identifier, an index, or a property or method name, respectively.

Instead of storing values they would store Reference instances.

sulyi · 2016-12-10T09:47:02Z

Any idea how to get comparing zip objects to work in Python 3 (in a nested list, with unittest)?
[...]/test_jsinterp_parser.py#L106
[...]/test_jsinterp_parser.py#L170
[...]/test_jsinterp_parser.py#L310
[...]/test_jsinterp_parser.py#L371
...

yan12125 · 2016-12-10T10:52:11Z

Is there anything wrong in python3's zip?

sulyi · 2016-12-10T11:18:34Z

Solved it, kinda! With traverse. Yet, it's still a generator.
I'm not very keen on the type(o) == zip check, though.

yan12125 · 2016-12-10T12:10:23Z

I see. As performance is not critical in tests, you may want to just transform everything into lists.
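
A sketch of what "transform everything into lists" could look like. The helper name `materialize` is hypothetical. In Python 3, zip returns a lazy iterator and two zip objects only compare equal if they are the same object, so nested structures containing them need to be expanded before an equality assertion. Note that expanding consumes the iterators, so it should be done exactly once per value.

```python
def materialize(obj):
    """Recursively expand lazy iterators (zip objects, generators) into lists.

    Strings and bytes are returned unchanged, tuples stay tuples, and
    non-iterable values are returned as they are.
    """
    if isinstance(obj, (str, bytes)):
        return obj
    if isinstance(obj, dict):
        return dict((key, materialize(value)) for key, value in obj.items())
    if isinstance(obj, tuple):
        return tuple(materialize(value) for value in obj)
    try:
        iterator = iter(obj)
    except TypeError:
        return obj  # not iterable: a plain value
    return [materialize(value) for value in iterator]
```

In a test this would be used as `self.assertEqual(materialize(actual), materialize(expected))`, which avoids an explicit `type(o) == zip` check.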

sulyi · 2016-12-10T12:39:45Z

I think that'd have issues with operators, and generators still end up empty, even with itertools.tee.

yan12125 · 2016-12-10T12:42:13Z

Oops. Just ignore my previous comment

sulyi · 2016-12-10T12:54:01Z

copy.deepcopy is the only way to compare a generator twice without converting it to a list beforehand.
I think I like that a lot.

sulyi · 2016-12-10T13:59:42Z

~~For getting SpiderMonkey tests can I use hglib package?~~ It seems that it's for local repos. Moreover, hg does not support narrow clones. I think a small spider needs to be implemented to add these tests dynamically.
-- edit --
Looking at the tests, they don't seem very applicable.

yan12125 · 2016-12-10T14:10:07Z

Do you want to clone the whole mozilla-central source tree? Please don't do that. There's no need to keep up to date with Firefox (for example, it's going to support ES2017, while there's no need to support that in youtube-dl now), so just copying the files is fine.

sulyi · 2016-12-10T15:12:10Z

No, I definitely don't want to clone the whole mozilla-central. So, you say I shouldn't add anything dynamically? Just cherry-pick some of them and dump them in test_jsinterp_parser?

yan12125 · 2016-12-10T15:24:21Z

IIRC unlike SVN, mercurial does not allow cloning partial files. What youtube-dl needs are those tests. If you know a way to sync only those test files, go ahead. Note that extra dependencies should affect only tests/, not youtube_dl/

sulyi · 2016-12-10T15:28:20Z

How about a Linux-native spider like ~~curl~~ or wget?
-- edit --
This seems to work fine:
wget -np -r -e robots=off --accept='*.js' https://hg.mozilla.org/mozilla-central/file/tip/js/src/tests/ecma_5/
Adding -nd even dumps everything into a single directory, but there are a lot of shell.js and browser.js files (due to their testing framework), and those get an extra numeric extension in their names.

yan12125 · 2016-12-10T16:16:02Z

I guess you want to download Mozilla's test suite each time test_jsinterp is invoked? I don't think it's a good idea, as there are thousands of files.

sulyi · 2016-12-10T16:28:29Z

I don't know. I'd much rather extract some useful test from the testcases and run those through unittest, but I don't think I can do that. The only other option I see is to add tests to existing ones manually based on mozilla's tests.

yan12125 · 2016-12-10T16:32:47Z

Oops I should leave this comment here: #11272 (comment)

sulyi · 2016-12-10T16:49:33Z

Dumping a list of links to mozilla testcases in a file on building tests might also be a good practice.

yan12125 · 2016-12-10T17:04:48Z

As long as it doesn't take too long, it's fine.

sulyi · 2016-12-10T17:36:43Z

I've timed the wget crawl at home:

real	16m52.314s
user	3m38.500s
sys	0m1.940s

yan12125 · 2016-12-10T17:58:58Z

Does it take 16 minutes every time python test/test_jsinterp.py is run, or is there a local copy, so that it's only necessary when syncing tests from Mozilla?

sulyi · 2016-12-10T22:29:02Z

If anybody is interested, I'd like to share a side product I created while studying the specs.
This is the grammar I've extracted, in EBNF notation, along with some syntax diagrams generated from it by http://www.bottlecaps.de/rr/ui.

Can't attach them for some reason, so:
https://gist.github.com/sulyi/15674f4802503d81711b015a05faae46

sulyi · 2016-12-11T11:21:57Z

I'm wondering what @phihag thinks about this.

sulyi · 2016-12-15T14:04:48Z

I've reworked the test suite. It does not support adding tests from any other suites, but hopefully it's pretty straightforward to add new ones and use them to test parsing, interpretation, or both.
If anybody is planning on adding new tests to help out, which would be nice, please check it out or make a better one; just integrate the current ones as well.

sulyi · 2016-12-17T19:30:40Z

I'm a little bit stuck with designing built-ins.

yan12125 · 2016-12-18T06:18:47Z

In the parsing stage I guess built-ins are no different from other things?

sulyi · 2016-12-19T23:14:33Z

Sorry, I hadn't seen your post till now, and I also needed a break from it to rehash a bit.

Parsing is done. It might need some love later: minor features and possibly refactoring into its own class (and module).

I've redone the test suite in order to be able to run parsing and interpreting tests flexibly on the same scripts, because I had to move on to implementing the interpreter in the first place. It would have got ugly fast if I hadn't sorted that out before starting to work on that.

As for the built-ins, I've been thinking, and I'll probably keep the Reference class and use it as a "wrapper", recreate the inheritance tree of JavaScript objects in a separate module, and use those as values. There might be a need for a dict to look up JS properties in each class, but first I'll try to do it using hasattr or __dict__.

I've started working on function calls, but I think the context stack is not working properly. Updating globals with local_vars on context_push and removing the difference on context_pop might solve it, but that likely has an issue with shadowed names.
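
One way to sidestep the shadowing problem is to keep each scope in its own dict rather than merging local_vars into globals. The class below is a hypothetical sketch of such a scope chain, not the context stack from #11272; all method names are invented for illustration.

```python
class ContextStack(object):
    """Hypothetical scope chain: a list of dicts, innermost scope last."""

    def __init__(self, global_vars=None):
        self._scopes = [dict(global_vars or {})]

    def push(self, local_vars=None):
        """Enter a function call with a fresh local scope."""
        self._scopes.append(dict(local_vars or {}))

    def pop(self):
        """Leave the innermost scope; shadowed outer names become visible again."""
        if len(self._scopes) == 1:
            raise RuntimeError('Cannot pop the global scope')
        return self._scopes.pop()

    def declare(self, name, value=None):
        """Like `var`: bind the name in the innermost scope."""
        self._scopes[-1][name] = value

    def lookup(self, name):
        """Resolve a name from the innermost scope outwards."""
        for scope in reversed(self._scopes):
            if name in scope:
                return scope[name]
        raise KeyError(name)

    def assign(self, name, value):
        """Assign to the nearest scope holding the name, else to the global scope."""
        for scope in reversed(self._scopes):
            if name in scope:
                scope[name] = value
                return
        self._scopes[0][name] = value
```

Because the global dict is never overwritten by locals, popping a scope cannot lose the outer value of a shadowed name, and no "difference" has to be computed on exit.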

sulyi · 2017-01-25T00:39:28Z

It's probably quite moot, yet I have to point out that the fix for #11663 and #11664 committed by @dstftw is a bit baffling to me.
Let's take the following three independent JS expressions:
a = 42
this.a = 42
var a = 42
The first two are equivalent. The third differs only in that it sets the return value to undefined, while the others set it to 42, and in that the [[Configurable]] internal property of a is false instead of true.
The practical consequence of this is that running delete a returns true in the first two cases and false in the third, but not much else.
In all three cases this.a === a is true.
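
The difference can be modelled in a few lines of Python. This is a toy model that assumes ES5 non-strict global code; the class and method names are invented for illustration and do not correspond to anything in youtube-dl.

```python
class GlobalObject(object):
    """Toy model of a JS global object tracking the [[Configurable]] attribute."""

    def __init__(self):
        self._props = {}  # name -> [value, configurable]

    def declare_var(self, name, value):
        """Model `var name = value`: a new binding is not configurable."""
        if name in self._props:
            self._props[name][0] = value
        else:
            self._props[name] = [value, False]

    def assign(self, name, value):
        """Model `name = value` or `this.name = value`: a new property is configurable."""
        if name in self._props:
            self._props[name][0] = value
        else:
            self._props[name] = [value, True]
        return value

    def delete(self, name):
        """Model `delete name`: False only if the property is not configurable."""
        if name not in self._props:
            return True
        if not self._props[name][1]:
            return False
        del self._props[name]
        return True

    def get(self, name):
        return self._props[name][0]
```

In this model, both kinds of binding are read back through the same `get`, which mirrors `this.a === a` holding in all three cases; only `delete` tells them apart.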

Tithen-Firion mentioned this issue Mar 26, 2017

openload.co extractor not working #10408

Closed

mengmo mentioned this issue Mar 5, 2019

New API spacemeowx2/DouyuHTML5Player#28

Closed

brouxco mentioned this issue Apr 4, 2020

Add dependency for JS parser streamlink/streamlink#2534

Closed

sulyi closed this as completed Sep 8, 2021
