Using Playwright or Puppeteer does not work for me with extractFromHtml()
I tried everything, for example:
Getting the HTML to send to article-extractor: const contentHTML = await page.locator('html').innerHTML();
OR: const htmlPageContent = await page.content();
For testing, in the file where I use Playwright, I tried: const contentExtractHtmlArticle = await articleExtractor(htmlPageContent);
I created an extract.ts file with an exported function.
Inside that exported function I tried: const { content } = await extractFromHtml(html);
OR: const { content } = await extractFromHtml(String(html));
I import my function into the file where I use Playwright, just to test whether I was getting an async/await error.
If I extract from a URL in my exported function instead, it works: I tried extract(url) in the function below and it did work.
But it does not work when sending HTML:
export async function articleExtractor(html) {
  try {
    const { content } = await extractFromHtml(html);
    // isHTML comes from utils.js. NOTE: it has nothing to do with the error;
    // I removed this isHTML part to test further and the problem remained.
    if (isHTML(content)) {
      console.log('HTML found');
      return content;
    } else {
      console.log('HTML NOT found');
      return 'text/html not found';
    }
  } catch (error) {
    console.error('Error extracting text from HTML:', error);
    return 'Error extracting text from HTML';
  }
}
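One thing that may be worth ruling out, though nothing in this report confirms it: if extractFromHtml resolves to null when it cannot find an article in the given HTML, then destructuring { content } from the result throws, and the catch block above hides the real reason. The sketch below is only a debugging aid under that assumption. The names Article, HtmlExtractor and safeExtractContent are hypothetical, and the extractor is passed in as a parameter so the sketch does not depend on the library's exact signature (the optional url argument is also an assumption).

```typescript
// Hypothetical shapes for this sketch; the real library types may differ.
type Article = { content?: string };
type HtmlExtractor = (html: string, url?: string) => Promise<Article | null>;

// Calls the given extractor and returns the article content, or null when
// the input is empty or the extractor finds nothing, instead of throwing
// while destructuring a null result.
async function safeExtractContent(
  extract: HtmlExtractor,
  html: string,
  url?: string,
): Promise<string | null> {
  // Guard against an empty or non-string page dump before calling the extractor.
  if (typeof html !== 'string' || html.trim() === '') {
    return null;
  }
  const article = await extract(html, url);
  // article may be null; optional chaining avoids a TypeError here.
  return article?.content ?? null;
}
```

In the Playwright case this could be called as safeExtractContent(extractFromHtml, await page.content(), page.url()), assuming extractFromHtml accepts the page URL as a second argument. A null return would then point at the extractor not recognising an article in the HTML rather than at an async/await mistake.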