Module detection #41
Comments
It's usually easier to identify some dependencies by checking them yourself or reading something like
I don't know if this will help, but maybe the developer can identify some dependencies in advance?
|
This issue exists because we need to know which library we are processing in order to produce appropriate output. A license header can be a good hint for both humans and wakaru. However, modern bundlers often destroy most information, including method names, so module/function detection is still required. This list won't grow indiscriminately; we will pick high-value targets. 🙏 Developers can still identify a module themselves and rename it. |
This could also tie in well with some of the ideas for 'unmangling identifiers' that I laid out here:
Theoretically, if we can identify a common open source module, we could also pre-process that module to extract its variable/function names, which we could then apply back to the identified module. I think of this as being like the 'debug symbols' used with compiled binaries. Technically, if you know the module, can get its original source, and have the webpacked version of that code, you could also generate a sourcemap that lets the user map between the two versions of the code.
When I was manually attempting to reverse and identify the modules in #40, a couple of techniques I found useful:
Edit: This might not be useful right now, but I just added a new section to one of my gists with some higher level notes/thoughts on fingerprinting modules, which I might expand either directly or based on how this issue pans out:
While it might be more effort than it's worth, it may also be possible to extract the patterns that wappalyzer uses to identify various libraries, which I made some basic notes on in this revision to the above gist: |
Within some webpacked code I was looking at (Ref):
We can easily identify a number of the React modules based on their license header, which also includes the original filename:
~/dev/0xdevalias/REDACTED/unpacked/_next/static/chunks/653.js:
13730 "use strict";
13731 /**
13732: * @license React
13733 * react-is.production.min.js
13734 *
~/dev/0xdevalias/REDACTED/unpacked/_next/static/chunks/framework.js:
5 2920: function (e, n, t) {
6 /**
7: * @license React
8 * react-dom.production.min.js
9 *
..
8452 82875: function (e, n, t) {
8453 /**
8454: * @license React
8455 * react-jsx-runtime.production.min.js
8456 *
....
8492 99504: function (e, n) {
8493 /**
8494: * @license React
8495 * react.production.min.js
8496 *
....
8891 95507: function (e, n) {
8892 /**
8893: * @license React
8894 * scheduler.production.min.js
8895 *
~/dev/0xdevalias/REDACTED/unpacked/_next/static/chunks/pages/_app.js:
47741 93802: function (U, B) {
47742 "use strict";
47743: /** @license React v16.13.1
47744 * react-is.production.min.js
47745 *
.....
54586 "use strict";
54587 /**
54588: * @license React
54589 * use-sync-external-store-shim.production.min.js
54590 *
.....
54654 "use strict";
54655 /**
54656: * @license React
54657 * use-sync-external-store-shim/with-selector.production.min.js
54658 *
And at least in this bundled code,
export default JSON.parse(
'{"name":"statsig-js","version":"4.32.0","description":"Statsig JavaScript client SDK for single user environments.","main":"dist/index.js","types":"dist/index.d.ts","scripts":{"prepare":"rm -rf build/ && rm -rf dist/ && tsc && webpack","postbuild":"rm -rf build/**/*.map","test":"jest --config=jest-debug.config.js","testForGithubOrRedisEnthusiasts":"jest","test:watch":"jest --watch","build:dryrun":"npx tsc --noEmit","types":"npx tsc"},"files":["build/statsig-prod-web-sdk.js","dist/*.js","dist/*.d.ts","dist/utils/*.js","dist/utils/*.d.ts"],"jsdelivr":"build/statsig-prod-web-sdk.js","repository":{"type":"git","url":"git+https://github.com/statsig-io/js-client-sdk.git"},"author":"Statsig, Inc.","license":"ISC","bugs":{"url":"https://github.com/statsig-io/js-client-sdk/issues"},"keywords":["feature gate","feature flag","continuous deployment","ci","ab test"],"homepage":"https://www.statsig.com","devDependencies":{"@babel/preset-env":"^7.14.9","@babel/preset-typescript":"^7.14.5","@types/jest":"^27.1.0","@types/uuid":"^8.3.1","circular-dependency-plugin":"^5.2.2","core-js":"^3.16.4","jest":"^27.1.0","terser-webpack-plugin":"^5.1.4","ts-jest":"^27.1.0","ts-loader":"^9.2.3","typescript":"^4.2.2","webpack":"^5.75.0","webpack-cli":"^4.10.0"},"dependencies":{"js-sha256":"^0.9.0","uuid":"^8.3.2"},"importSort":{".js, .jsx, .ts, .tsx":{"style":"module","parser":"typescript"}}}'
);
See also: |
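The two hints shown in this comment (the `@license` header that names the original file, and the package.json inlined via `JSON.parse`) could in principle be checked mechanically. The following is a minimal sketch only: `detectFromLicenseHeader` and `detectEmbeddedPackageJson` are hypothetical helper names, not part of wakaru, and the regular expressions assume headers shaped like the ones quoted above.

```javascript
// Hypothetical helpers; none of these names come from wakaru itself.

// Matches a block comment containing "@license <text>" followed by a
// comment line that names the original file (e.g. react-dom.production.min.js).
const LICENSE_HEADER =
  /\/\*\*?[\s*]*@license\s+([^\n*]+?)\s*\n\s*\*\s*(\S+\.js)\b/;

function detectFromLicenseHeader(source) {
  const match = LICENSE_HEADER.exec(source);
  if (!match) return null;
  return { license: match[1], filename: match[2] };
}

// Matches JSON.parse('...') with a single-quoted string argument.
const EMBEDDED_JSON = /JSON\.parse\(\s*'((?:[^'\\]|\\.)*)'\s*\)/g;

function detectEmbeddedPackageJson(source) {
  for (const match of source.matchAll(EMBEDDED_JSON)) {
    try {
      // Limitation: JS string escapes (such as \') are not decoded first,
      // so payloads that rely on them are skipped by the catch below.
      const data = JSON.parse(match[1]);
      if (typeof data.name === 'string' && typeof data.version === 'string') {
        return { name: data.name, version: data.version };
      }
    } catch {
      // Not valid JSON, or not an object; keep looking.
    }
  }
  return null;
}
```

Both functions return `null` when nothing is found, so a caller could fall back to other detection strategies.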
With regards to module detection/similar for React, these might be interesting/useful:
|
I won't copy the content here in full as it was pretty long, but I detailed some of my higher level thoughts around some more 'esoteric' methods that might be applicable to module detection (AST fingerprinting, code similarity, etc) in this comment: |
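As a rough illustration of the fingerprinting idea, here is a minimal sketch that works on tokens rather than a real AST (an assumption made to keep the example dependency-free). It replaces identifiers that minifiers typically rename with a placeholder, keeps string literals and property names after `.` (which usually survive minification), and hashes the result. `normalizeTokens`, `fingerprint`, and the keyword list are all hypothetical.

```javascript
// Hypothetical token-level approximation of structural fingerprinting.
// A real implementation would more likely walk an AST.
const KEYWORDS = new Set([
  'function', 'return', 'var', 'let', 'const', 'if', 'else', 'for', 'while',
  'new', 'typeof', 'this', 'null', 'void', 'throw', 'try', 'catch', 'switch',
  'case', 'break', 'continue', 'default', 'in', 'of', 'instanceof', 'delete',
  'do', 'class', 'extends', 'true', 'false',
]);

const TOKEN =
  /"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'|[A-Za-z_$][\w$]*|\d+(?:\.\d+)?|[^\s\w]/g;

function normalizeTokens(source) {
  const tokens = source.match(TOKEN) || [];
  return tokens.map((token, i) => {
    if (!/^[A-Za-z_$]/.test(token)) return token; // strings, numbers, punctuation
    if (KEYWORDS.has(token)) return token;
    if (tokens[i - 1] === '.') return token; // property access: keep the name
    return '$ID'; // local/parameter names are likely mangled
  });
}

function fingerprint(source) {
  // 32-bit FNV-1a hash over the normalized token stream.
  const text = normalizeTokens(source).join(' ');
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}
```

Two copies of the same function that differ only in mangled names produce the same fingerprint, while a change to a property name or literal changes the normalized token stream.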
This specific implementation is more related to detecting and injecting into webpack modules at runtime, but it might have some useful ideas/concepts that are applicable at the AST level too:
// ..snip..
export const common = { // Common modules
React: findByProps('createElement'),
ReactDOM: findByProps('render', 'hydrate'),
Flux: findByProps('Store', 'connectStores'),
FluxDispatcher: findByProps('register', 'wait'),
i18n: findByProps('Messages', '_requestedLocale'),
channels: findByProps('getChannelId', 'getVoiceChannelId'),
constants: findByProps('API_HOST')
}; |
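The snippet above does not show how `findByProps` is implemented. One plausible shape, sketched here under assumptions, is a scan over the exports of already-evaluated modules that returns the first one exposing every requested property. Unlike the quoted code, this hypothetical version takes the module map as an explicit first argument instead of closing over a webpack module cache.

```javascript
// Hypothetical sketch: `modules` stands in for the exports of already
// evaluated webpack modules, keyed by module id.
function findByProps(modules, ...props) {
  for (const [id, exports] of Object.entries(modules)) {
    // Check the exports object itself and its default export, if any.
    const candidates = [exports, exports && exports.default];
    for (const candidate of candidates) {
      if (candidate == null) continue;
      const type = typeof candidate;
      if (type !== 'object' && type !== 'function') continue;
      if (props.every((prop) => prop in candidate)) {
        return { id, exports: candidate };
      }
    }
  }
  return null;
}

// Usage with a made-up module map:
const modules = {
  2920: { render() {}, hydrate() {} },
  99504: { default: { createElement() {} } },
};
const reactDom = findByProps(modules, 'render', 'hydrate');
const react = findByProps(modules, 'createElement');
```

The same property-set idea might carry over to static analysis by collecting the names a module assigns to its exports object, rather than inspecting live objects.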
|
styled-components/Tailwind-Styled-Component libs #40
Similar to what we already have for Babel runtime detection, consider introducing module code detection that can help us transform the code and give the extracted module a better name than module-xxxx.js.
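To make the naming goal concrete, here is a minimal sketch of turning a detected package or file name into an output filename, falling back to the `module-xxxx.js` pattern when nothing was detected. `suggestModuleFilename` is a hypothetical name and the sanitizing rules are assumptions, not wakaru's behaviour.

```javascript
// Hypothetical helper: maps a detection result to an output filename.
function suggestModuleFilename(moduleId, detectedName) {
  if (!detectedName) return `module-${moduleId}.js`;
  const safe = detectedName
    .replace(/\.js$/, '') // avoid doubling the extension
    .replace(/^@/, '') // drop the scope marker of scoped packages
    .replace(/[^A-Za-z0-9._-]+/g, '-'); // keep the name filesystem-safe
  return `${safe}.js`;
}
```

A real implementation would also need to disambiguate collisions, since the same library can appear in more than one module of a bundle.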