just opening one for my research on bot detection and stuff #190
Comments
True, bots can still bypass it. I have some good resources. Have not heard of the 2 step. |
Everything is bypassable in the world of JavaScript. Well, thanks for the resources, I am looking into them just now. |
I was wondering about looking over CVEs for specific browsers and their versions. For demo purposes, we could go ahead and identify a lot of info on the device/browser. I know it's actually creepy, but come on, it's in the name too lol. It's not a bad idea, you know. We can identify many things if we play it well. I'm not sure it's a good idea to implement, but it's definitely a good section to look at. What do you feel?
|
Not a bad idea. Maybe start with a test page. What I sometimes do is begin with a test page and experiment/research there. If we get stable results, we can release on the main page. If it has good performance and good fingerprinting, we can implement it in the main fingerprint.
I like this idea. I will look into it. |
I am really interested in chrome://chrome-urls/ There are many things there that can make this go really, really deep. I am also looking over CVEs that could verify the browser version for us, but I was thinking more about the bot detection section. And yeah, I saw that many features are not supported in Chrome on Android: in Chrome flags there is a section for what is not supported on my device. Maybe that can be something of notice? I guess we can look into it. |
This one is interesting… till it gets patched. In Chrome, it can be used to validate if a device is really on macOS. https://developer.mozilla.org/en-US/docs/Web/API/Web_Share_API#api.navigator.canshare |
See, I told you CVEs and bugs are a great place for us to look. Even if it gets patched in later versions, it will still be there for people who don't usually update (I was one of them), and I know many who don't update. |
By the way, do you have anything in mind for bot detection ahead? In the end, CreepJS is sort of a bot detection repo itself, from the lies section to browsers losing their expected features, so I was curious if you had something in research lately. Note: real Android and iOS devices never come with ANGLE as their GPU. The Google Mobile-Friendly Test emulator had the same thing, and so far I have seen it only in bots when it comes to these 2 OSes. It can be a small point. I mean, imagine seeing Intel as the GPU of an Android device user 😂 Never mind, I just want to convey that hardware filters on the GPU are an essential part, and combined with the confidence methodology they can be a good charm. |
Hmm, don't you think we should bring geckodriver into the headless section too? So far it is focused on chromedriver. |
Good idea. We should absolutely include geckodriver and more. |
Nothing on my mind, atm. But, ideas are welcome.
This is on my mind. I've been slow to get to it. We should definitely look out for GPU lies in reported mobile devices. Samsung Xclipse 920 has Angle, but I think we can determine Angle is not iOS. |
Hmm, but except for that device, almost every device comes with a real GPU, like a MediaTek Helio or Qualcomm one. |
Hi, I was busy with something. Well, let's get back to research. I found something interesting to look at: |
https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=927531 Found something to look at: it's regarding 2-step TLS fingerprinting. |
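The GPU rule discussed above can be sketched as a small pure function. This is a minimal sketch under stated assumptions, not CreepJS code: `checkMobileGpu`, `ClaimedOS`, the desktop-vendor pattern, and the allowlist are hypothetical names and rules, and the only Android exception encoded is the Xclipse case mentioned in the thread. The renderer string is assumed to come from the unmasked WebGL renderer.

```typescript
// Hypothetical helper: names and rules are illustrative, not CreepJS code.
type ClaimedOS = "Android" | "iOS" | "other";

interface GpuVerdict {
  suspicious: boolean;
  reason: string;
}

// Assumption: desktop or software renderers that should not sit behind a mobile claim.
const DESKTOP_GPU = /\b(intel|nvidia|geforce|radeon|swiftshader|llvmpipe)\b/i;

// Assumption: the only ANGLE-on-Android exception raised in the thread.
const ANGLE_ANDROID_ALLOWLIST = /xclipse/i;

function checkMobileGpu(os: ClaimedOS, renderer: string): GpuVerdict {
  if (os === "other") {
    return { suspicious: false, reason: "not a mobile claim" };
  }
  if (DESKTOP_GPU.test(renderer)) {
    return { suspicious: true, reason: "desktop or software GPU behind a mobile claim" };
  }
  const isAngle = /\bangle\b/i.test(renderer);
  if (isAngle && os === "iOS") {
    return { suspicious: true, reason: "ANGLE renderer behind an iOS claim" };
  }
  if (isAngle && !ANGLE_ANDROID_ALLOWLIST.test(renderer)) {
    return { suspicious: true, reason: "ANGLE renderer behind an Android claim" };
  }
  return { suspicious: false, reason: "renderer is plausible for the claimed OS" };
}
```

A verdict like this is best treated as one weak signal feeding a confidence score, since the allowlist will age as new mobile GPUs ship.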
Nice. I wonder if TLS fingerprint is distinct on mobile devices vs desktop. I presume no. |
Do you have a report of the top 5 browser versions CreepJS usually gets to see? I am curious whether people use older versions, since old ones have bugs and vulnerabilities. That might be an interesting approach if we go about it in an ethical way. |
It depends on the date, but the top 5 versions usually consist of versions at or near the latest stable releases of Blink, Gecko, and WebKit. Here's yesterday, for example: We do get a lot of older browsers, though. The window test page contains a pool of browser versions seen in the last 40 days. I'm sure we would see even older browsers if the code was geared for ES5. Right now, the target is ES2019. |
Found something: navigator.connection.type is only there on Android and iOS. It can be a part of this, as it is something people don't usually hide. On Windows and Linux it's not there; they say privacy issues... yet they gave it to Android and iOS. |
Nice. I plan to add this. Looks like https://wicg.github.io/netinfo/#privacy-considerations
|
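The idea above can be sketched as a check that takes a navigator-like object, so it runs outside a browser too. This is a minimal sketch, assuming the thread's claim that mobile browsers expose `connection.type`; it only covers the Android claim, and `NavigatorLike`, `ConnectionLike`, and `androidClaimWithoutConnectionType` are hypothetical names. Browser support for the Network Information API varies, so a missing value is a hint, not proof.

```typescript
// Hypothetical shapes: only the fields this check reads.
interface ConnectionLike {
  type?: string;
}

interface NavigatorLike {
  userAgent: string;
  connection?: ConnectionLike;
}

// Returns true when the user agent claims Android but no connection type is exposed.
// Assumption: genuine Android browsers of interest report a non-empty type.
function androidClaimWithoutConnectionType(nav: NavigatorLike): boolean {
  const claimsAndroid = /\bAndroid\b/i.test(nav.userAgent);
  if (!claimsAndroid) {
    return false;
  }
  const type = nav.connection ? nav.connection.type : undefined;
  return typeof type !== "string" || type.length === 0;
}
```

In a page, the real `navigator` object could be passed in after a cast to `NavigatorLike`.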
I want to test the NetworkInformation type against the Google Mobile-Friendly Test. I think the majority of big-brand bots use simulation instead of emulation, so bots that claim to be Android but are not could be considered suspicious by us. I am currently learning TypeScript, as we are switching to that. I will explore Navigator more deeply, into every inner part of it. |
Hmm, I wonder if Brave mobile is different from normal Brave in some way. I wasn't aware of jsconsole.com, so I was using this for other browsers
|
Does it mean headless rtt is 0 as a special case? I tested on Chrome, Brave, Kiwi, and Chromium on Android, Windows, and Linux. All results are normally more than 0 in rtt. |
I imagine |
I did some research on this. Here's the draft; it outlines the computation in greater detail (section "5.1.1.3. Computing Foundations") |
Hmm, what can we do? I think we can take it as a suspicious point. If it's unusually rare, it can be a thing, but I'm not sure if we should. It's sort of similar to the likeHeadless one in CreepJS; we could do a likeUnusual or something. |
True, I use Dark Reader all the time. I was working on a website, so I saw it while debugging haha. I will look for more interesting plugins that may leak things into documents, etc. |
Hi, I was looking around Gmail and I saw they are able to detect a secure or a suspicious browser, somewhat like we do at CreepJS, but I am curious about their mechanism. I saw that after we enter the Gmail address, there is a detection script. Want to explore together? |
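The likeUnusual idea can be sketched as a ratio of rare-but-not-conclusive signals, with the zero rtt case as one input. This is a minimal sketch: `likeUnusual`, `UnusualReport`, and the signal names are hypothetical and not part of CreepJS. Reported rtt values are rounded, so an rtt of 0 may be legitimate on a very fast connection, which is why it is scored rather than blocked.

```typescript
// Hypothetical report shape for a "likeUnusual" style rating.
interface UnusualReport {
  hits: string[]; // names of the checks that fired
  rating: number; // share of checks that fired, from 0 to 1
}

// Each check is a named boolean: true means "unusual, but not proof of a bot".
function likeUnusual(checks: Record<string, boolean>): UnusualReport {
  const names = Object.keys(checks);
  const hits = names.filter((name) => checks[name]);
  const rating = names.length === 0 ? 0 : hits.length / names.length;
  return { hits, rating };
}

// Builds the rtt signal discussed in the thread from a reported rtt value.
function rttSignal(rtt: number | undefined): boolean {
  return rtt === 0;
}
```

A caller might combine it as `likeUnusual({ zeroRtt: rttSignal(rtt), otherCheck: false })` and only surface the result when the rating is high.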
Sure, I imagine they use UA client hints to detect unseen devices and then warn backup email of unknown device log in to x account. The difficulty is de-obfuscating their code. This repo has a lot we can also look at. |
Anyway, I think Gmail uses something more complex. Even puppeteer stealth can't get through login, even in normal mode with the same user agent and without "headless" written there. That's why I want us to see what's interesting there. While you were inactive, I was learning about dev tools detection from this repo: but for now I'm really more interested in Gmail detection, for the above reason. Maybe there is something more we could learn? Who knows. |
Damn, that repo. I can sense some awesome things right there. |
"Is the obfuscator absolutely foolproof? Since the JavaScript runs on the browser, the browser's JavaScript engine must be able to read and interpret it, so there's no way to prevent that. And any tool that promises that is not being honest." This is mentioned in https://obfuscator.io/#FAQ and it is one of the best obfuscators I have seen so far. |
lol this is exactly what we needed |
Nice. I ran into that recently. That's a good detection. Good points. The Googlebot code looks like a challenge. I can see it collects the error stack here. |
Agreed. I am working on a small project right now which has me using EJS and Express and a CDN build of maybe Vue.js, React Native, or any front-end framework. I literally learned all 3 (Vue, Angular, and React) within 5 days. You can imagine it's been a mind-blowing week for me. I will start looking at the Googlebot one probably the day after tomorrow. Haaah ~ sigh in tiredness ~ |
Hi, I'm done with my project. Let's research 💝 I'm going to look at the Google BotGuard. |
I found something. I even opened an issue as research, and I noticed the owner is kind of active too. The latest code of a Google BotGuard reverse-engineering attempt is at https://github.com/icetroll/botguard-RE and we can learn from it. |
Nice. That is a lot of code. I think it has to do with behavioral fingerprints. I see a few event listeners connected to DOM elements. |
I've been researching ways to detect Selenium and found some interesting leaks. Fascinating article here. Those values seem to be manipulated by different bots, but the object prototype contains unique keys that are important to the internal code. I haven't tested it, but I think it's possible to override those functions with eval code and use them to get internal values. |
The prototype functions might only reveal Selenium code and possibly different versions of the code. |
That too will be really interesting for CreepJS. Maybe even a sure bot detection haha. Right now I am giving names to the code of Google BotGuard to understand how it works. |
I have understood quite a lot about Google BotGuard. I will give you a proper summary here. It is interesting, not gonna lie. |
Any update on your research? |
Nothing yet. But, a lot on my mind. I think the storage bytes are an incredible high entropy fingerprint in Chrome. It depends on the machine and what it's used for, but if there are no changes in storage, the fingerprints can categorize a machine in 1 trillion possible fingerprints (to put it lightly). In private tabs, chromium reduces entropy (unstable per session and low bytes available). Unrelated, I have this idea I might experiment with at some point. It's essentially a soft/superfast fingerprinting (less than 10ms and mostly low entropy), then it progressively slows down and expands into high entropy if anomalous hashes are detected. The idea is to make bad fingerprints move more slowly and good fingerprints move more quickly. |
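The progressive idea above can be sketched as a staged pipeline: run the cheap probes first and only continue into slower stages while the hash so far looks anomalous. This is a minimal sketch under stated assumptions: `Stage`, `Probe`, `progressiveFingerprint`, and the `isAnomalous` callback are hypothetical, the probes are stand-ins for real collectors, and the hash is a simple FNV-1a style hash over UTF-16 code units chosen only to keep the example self-contained.

```typescript
// Hypothetical types: a probe returns one fingerprint component as a string.
type Probe = () => string;

interface Stage {
  name: string;
  probes: Probe[];
}

interface ProgressiveResult {
  hash: string;
  stagesRun: string[];
}

// FNV-1a style 32-bit hash over UTF-16 code units (illustrative, not cryptographic).
function fnv1a(input: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return ("0000000" + h.toString(16)).slice(-8);
}

// Runs stages in order and stops as soon as the accumulated hash is not anomalous.
// Good fingerprints exit after the fast stage; anomalous ones pay for the slow stages.
function progressiveFingerprint(
  stages: Stage[],
  isAnomalous: (hash: string) => boolean
): ProgressiveResult {
  const parts: string[] = [];
  const stagesRun: string[] = [];
  let hash = "";
  for (const stage of stages) {
    for (const probe of stage.probes) {
      parts.push(probe());
    }
    stagesRun.push(stage.name);
    hash = fnv1a(parts.join("|"));
    if (!isAnomalous(hash)) {
      break;
    }
  }
  return { hash, stagesRun };
}
```

One possible `isAnomalous` is a lookup against a set of hashes already seen from trusted traffic, so unknown hashes keep escalating until they match or the stages run out.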
I looked over how Gmail currently works. I found that they are monitoring and using the Performance API very well, which I hadn't thought of. I am exploring more, but I saw the new v3 is ... I am also exploring others' anti-bot and behavior monitoring to expand CreepJS. Right now I am looking at this: |
Looking at their website, it's interesting how they use APIs and clever JavaScript. (What is more interesting is that a comment in their code says it was written in 2016. If they are not lying, that's quite fascinating.) So far I am reading and noting which APIs they use, and I will summarize things here as I go. |
I am resuming my research summary from today. Let's see if I can put up some interesting points. |
About chromedriver detection, check https://github.com/HMaker/HMaker.github.io/tree/master/selenium-detector Most of the tests can be easily bypassed by patching the chromedriver source, though. |
Very nice detection there. Can these functions be patched or removed? The function names can be modified, but wouldn't the prototype still leak the names? |
You can also change the prototype completely. You could also make chromedriver store that on window instead of document. chromedriver is just a CDP wrapper, but it sits at a higher level of the Chromium architecture, so it uses the page's global JS state to store automation-related vars. |
Gmail stuff summary: they use proxy detection (mostly based on the Performance API), and workers are their focus, just like we have here. They also have a few basic feature detections, with error detection, buckets, etc. The rest is just code they made lengthy. |
Get a hug, dude lol. It's a good repo, great effort, really loved it. |
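The key-leak discussion above can be sketched as a scan over an object's own property names and its prototype chain. This is a minimal sketch, not the linked detector's code: `findAutomationKeys` is a hypothetical name, and the pattern lists key names commonly cited in public write-ups (`$cdc_...`, `$chrome_asyncScriptInfo`, `__webdriver...`), which is an assumption that may be outdated or patched away, as noted in the thread.

```typescript
// Assumption: key names commonly cited for driver-injected state; a patched driver can rename them.
const AUTOMATION_KEY = /^\$?cdc_[a-z0-9]{10,}|^\$chrome_asyncScriptInfo$|^__(webdriver|selenium|driver)/i;

// Collects matching property names from the target and every prototype below Object.prototype.
function findAutomationKeys(target: object): string[] {
  const found: string[] = [];
  let current: object | null = target;
  while (current !== null && current !== Object.prototype) {
    for (const key of Object.getOwnPropertyNames(current)) {
      if (AUTOMATION_KEY.test(key) && found.indexOf(key) === -1) {
        found.push(key);
      }
    }
    current = Object.getPrototypeOf(current);
  }
  return found;
}
```

In a page, the scan would be run against both `document` and `window`, since the storage location can be moved from one to the other.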
I was thinking to challenge myself against creepjs techniques hehe |
I looked over the TLS fingerprinting you talked about, but I read something in Akamai's research where they state that bots are able to bypass it to get on the good side:
https://www.akamai.com/blog/security/bots-tampering-with-tls-to-avoid-detection
I came across a 2-step TLS fingerprinting approach, but I lost that PDF 🥲🥲 dammit
I will try to find it, but do you know about it?