HaveIBeenPawned? #289

Closed
amaury1093 opened this issue Apr 17, 2020 · 20 comments

Comments

@amaury1093
Member

amaury1093 commented Apr 17, 2020

Add a field misc.have_i_been_pawned: true/false which makes an API call to https://haveibeenpwned.com/

@NChechulin

There is a small problem: haveibeenpwned's API costs $3.50/month.
Maybe consider scraping or a similar free API?

@amaury1093
Member Author

Ah, I wasn't aware it was paid. So maybe not, I don't think it's super high priority (and people can always make a separate API call for that).

I recall the author was open-sourcing it. Will it still be paid after?

@NChechulin

NChechulin commented Nov 25, 2020

On the API key page, they link to a blog post that says:

Clearly not everyone will be happy with this so let me spend a bit of time here explaining the rationale. This fee is first and foremost to stop abuse of the API.

So I don't think we should expect the API to become free anytime soon.

@DigitalGreyHat

DigitalGreyHat commented Nov 17, 2021

Hello, I made my own API. It's free forever, and it works the same as haveibeenpwned.com. I'll try to make a PR soon.
Edit: I am not a Rust dev 😅

@LeMoussel

@DigitalGreyHat Can you give some/more information about your API?

@olivermontes

Hi @DigitalGreyHat, any news?

@sylvain-reynaud
Contributor

Hello, I am currently working on this.

Here are my thoughts:

The problem with the Cloudflare bypass is that we have to rely on a stealth browser; otherwise Cloudflare's protection is triggered. https://github.com/ultrafunkamsterdam/undetected-chromedriver seems to be the one with the biggest community. I did a PoC and the results are not reliable: it works ~70% of the time (the other ~30% are crashes or no response). Another problem with a stealth browser is that it brings in a lot of new dependencies, each with its own maintenance needs.

In my opinion, implementing the paid API is the way to go. Otherwise we can find another reliable and free API.

@amaury1093
Member Author

Let's go with the paid API. @sylvain-reynaud would you like to create a PR?

I think the way to go is:

  • add an env variable RCH_HIBP_API_KEY; if it's set to a non-empty value, make an API call
  • put the result in misc.have_i_been_pawned: Option<bool>
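
The two bullets above could be sketched roughly as follows. This is a minimal, std-only sketch under stated assumptions, not the code that was eventually merged: the function names are illustrative, and `fetch_status` is a hypothetical stand-in for whatever HTTP client makes the request. The status mapping assumes the behaviour of the HIBP v3 `breachedaccount` endpoint, where 200 means the account appears in at least one breach and 404 means it does not.

```rust
use std::env;

/// Environment variable name proposed in this thread.
const HIBP_API_KEY_VAR: &str = "RCH_HIBP_API_KEY";

/// Returns the API key only if the variable is set to a non-empty value.
fn hibp_api_key() -> Option<String> {
    env::var(HIBP_API_KEY_VAR)
        .ok()
        .map(|key| key.trim().to_string())
        .filter(|key| !key.is_empty())
}

/// Maps an HTTP status code to the value of `misc.have_i_been_pawned`.
/// Assumption: 200 = found in a breach, 404 = not found (HIBP v3 semantics).
/// Anything else (401, 429, 5xx, ...) is treated as "unknown".
fn status_to_pawned(status: u16) -> Option<bool> {
    match status {
        200 => Some(true),
        404 => Some(false),
        _ => None,
    }
}

/// Computes the field value. `fetch_status` is a hypothetical callback that
/// performs the HTTP request with the given key and email, and returns the
/// response status, or `None` on a network error.
fn have_i_been_pawned<F>(api_key: Option<&str>, email: &str, fetch_status: F) -> Option<bool>
where
    F: FnOnce(&str, &str) -> Option<u16>,
{
    // No key configured: skip the API call entirely and leave the field empty.
    let key = api_key.filter(|k| !k.is_empty())?;
    status_to_pawned(fetch_status(key, email)?)
}
```

A caller would pass `hibp_api_key().as_deref()` as the first argument, so the field stays `None` whenever the variable is unset or empty, and no request is made in that case.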

@amaury1093
Member Author

Otherwise we can find another reliable and free API.

Do people know of other free APIs? Ideally open-source. We can always add misc.<other_api> = true/false, and make those extra API calls configurable.
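
One way the "configurable extra API calls" idea could look, as a rough sketch only: `ExtraCheck` and `run_extra_checks` are hypothetical names that do not come from the repo, and each `run` function stands in for a real lookup against some third-party API.

```rust
use std::collections::BTreeMap;

/// One optional third-party lookup (hypothetical type).
/// `name` would become the key under `misc`, e.g. `misc.<other_api>`.
struct ExtraCheck {
    name: &'static str,
    /// Whether the user enabled this check in their configuration.
    enabled: bool,
    /// Performs the lookup; `None` means the result is unknown.
    run: fn(&str) -> Option<bool>,
}

/// Runs only the enabled checks and collects their results by name.
/// Disabled checks make no call and produce no entry.
fn run_extra_checks(
    email: &str,
    checks: &[ExtraCheck],
) -> BTreeMap<&'static str, Option<bool>> {
    checks
        .iter()
        .filter(|check| check.enabled)
        .map(|check| (check.name, (check.run)(email)))
        .collect()
}
```

Keeping each lookup behind an `enabled` flag means a disabled API costs nothing at runtime, and adding a new provider only means adding one more entry to the list.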

@sylvain-reynaud
Contributor

According to https://github.com/khast3x/h8mail#apis there are 3 free(mium) APIs:

@LeMoussel

LeMoussel commented Dec 12, 2022

For information, there is Fingerprint Suite with Playwright.
It's OK with Antibot. I didn't test with Cloudflare.

@sylvain-reynaud
Contributor

It's OK with Antibot. I didn't test with Cloudflare.

const { chromium } = require('playwright');
const { FingerprintGenerator } = require('fingerprint-generator');
const { FingerprintInjector }  = require('fingerprint-injector');

(async () => {
	const fingerprintGenerator = new FingerprintGenerator();

	const browserFingerprintWithHeaders = fingerprintGenerator.getFingerprint({
		devices: ['desktop'],
		browsers: ['chrome'],
	});

	const fingerprintInjector = new FingerprintInjector();
	const { fingerprint } = browserFingerprintWithHeaders;

	const browser = await chromium.launch({ headless: false });

	// With certain properties, we need to inject the props into the context initialization
	const context = await browser.newContext({
		userAgent: fingerprint.userAgent,
		locale: fingerprint.navigator.language,
		viewport: fingerprint.screen,
	});

	// Attach the rest of the fingerprint
	await fingerprintInjector.attachFingerprintToPlaywright(context, browserFingerprintWithHeaders);

	const page = await context.newPage();

	await page.goto('https://haveibeenpwned.com/unifiedsearch/[email protected]');

	// wait for the page to load
	await page.waitForTimeout(20000);
	// log the page content
	console.log(await page.content());
	// screenshot the page
	await page.screenshot({ path: 'proof.png' });

	// close the browser so the script can exit
	await browser.close();
})();

In headless mode the request is blocked; with a visible browser window it is not. You can check this with the code above.

I'll implement the paid API first.

@LeMoussel

It seems OK in Firefox headless mode with this:

import path from 'path';
import { fileURLToPath } from 'url';

import { firefox } from 'playwright';
import { FingerprintGenerator } from 'fingerprint-generator';
import { FingerprintInjector } from 'fingerprint-injector';

(async () => {
    const fingerprintGenerator = new FingerprintGenerator();

    const browserFingerprintWithHeaders = fingerprintGenerator.getFingerprint({
        devices: ['desktop'],
        browsers: ['firefox'],
    });

    const fingerprintInjector = new FingerprintInjector();
    const { fingerprint } = browserFingerprintWithHeaders;

    const browser = await firefox.launch({
        headless: true
    });

    // With certain properties, we need to inject the props into the context initialization
    const context = await browser.newContext({
        userAgent: fingerprint.userAgent,
        locale: fingerprint.navigator.language,
        viewport: fingerprint.screen,
    });

    // Attach the rest of the fingerprint
    await fingerprintInjector.attachFingerprintToPlaywright(context, browserFingerprintWithHeaders);

    const page = await context.newPage();

    await page.goto('https://haveibeenpwned.com/unifiedsearch/[email protected]');

    await page.screenshot({ path: path.join(path.dirname(fileURLToPath(import.meta.url)), 'playwright_test_headless.png') });

    await browser.close()
})();

@LeMoussel

LeMoussel commented Dec 12, 2022

Yep! It also works with got-scraping.
The got-scraping library usually has a better success rate than other libraries thanks to its header generation, HTTP/2 support, and browser-like ciphers.

import { gotScraping } from 'got-scraping';

(async () => {
    const response = await gotScraping({
        url: 'https://haveibeenpwned.com/unifiedsearch/[email protected]',
        headerGeneratorOptions:{
            browsers: ['firefox'],
            devices: ['desktop'],
        }
    });
    console.log(response.body);
    // Parse the body; JSON.parse throws if the response is not valid JSON
    const result = JSON.parse(response.body);
    console.log(result);
    console.log(`Response headers: ${JSON.stringify(response.headers)}`);
})();

@sylvain-reynaud
Contributor

@LeMoussel wow, I didn't know about this package, thanks 💯

So I'm working on adding the feature by calling this URL https://haveibeenpwned.com/unifiedsearch/[email protected]

sylvain-reynaud added a commit to sylvain-reynaud/check-if-email-exists that referenced this issue Jan 11, 2023
sylvain-reynaud added a commit to sylvain-reynaud/check-if-email-exists that referenced this issue Jan 11, 2023
@sylvain-reynaud
Contributor

Hello, my PR is ready to be reviewed :)

@beshoo

beshoo commented Jan 11, 2023 via email

sylvain-reynaud added a commit to sylvain-reynaud/check-if-email-exists that referenced this issue Jan 13, 2023
@sylvain-reynaud
Contributor

I fixed the formatting and removed code that might break if a field is added to the API response.

@beshoo it uses the haveibeenpwned API. The endpoint is the one the haveibeenpwned.com front end calls.

@amaury1093
Member Author

The node.js libraries are probably more battle-tested, but I would like to keep this repo as pure Rust.

Also, I'm reluctant to use a headless browser for HIBP. It seems there's a risk that it'll become flaky/blocked one day, and the maintenance burden will likely fall on me. I propose to start with the paid API, as described in #289 (comment). I'll gladly purchase the paid API and make it available on https://reacher.email 's SaaS plan.

sylvain-reynaud added a commit to sylvain-reynaud/check-if-email-exists that referenced this issue Jan 17, 2023
sylvain-reynaud added a commit to sylvain-reynaud/check-if-email-exists that referenced this issue Jan 17, 2023
sylvain-reynaud added a commit to sylvain-reynaud/check-if-email-exists that referenced this issue Jan 18, 2023
sylvain-reynaud added a commit to sylvain-reynaud/check-if-email-exists that referenced this issue Jan 31, 2023
sylvain-reynaud added a commit to sylvain-reynaud/check-if-email-exists that referenced this issue Jan 31, 2023
sylvain-reynaud added a commit to sylvain-reynaud/check-if-email-exists that referenced this issue Feb 14, 2023
sylvain-reynaud added a commit to sylvain-reynaud/check-if-email-exists that referenced this issue Feb 15, 2023
@amaury1093
Member Author

Implemented in #1253, closing, thanks @sylvain-reynaud

juhniorsantos pushed a commit to juhniorsantos/check-if-email-exists that referenced this issue Apr 11, 2024