-
Notifications
You must be signed in to change notification settings - Fork 9.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: HTTP request tool #9228
feat: HTTP request tool #9228
Conversation
…ol-to-visit-a-website
…ol-to-visit-a-website
…ol-to-visit-a-website
…ol-to-visit-a-website
…ol-to-visit-a-website
…ol-to-visit-a-website
…ol-to-visit-a-website
…ol-to-visit-a-website
…ol-to-visit-a-website
…ol-to-visit-a-website
…ol-to-visit-a-website
); | ||
} | ||
const returnData: string[] = []; | ||
const html = cheerio.load(response); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could use @mozilla/readability
and jsdom
here to cleanly extract the content that's likely relevant to an end-user.
Something like this perhaps:
import { JSDOM } from 'jsdom'
import { Readability } from '@mozilla/readability'
const dom = await JSDOM.fromURL(url)
const article = new Readability(dom.window.document, {
keepClasses: true,
}).parse()
and then use article.content
.
we could also consider using turndown to convert the html into markdown, which LLM tend to handle better than html IMO.
import Turndown from 'turndown'
const markdown = turndown.turndown(article.content)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@netroy
what would be advantages over html-to-text
+ Cheerio
? since we already using such setup for Html
node
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cheerio
is great to either use css selectors to extract text, but leaves the burden of determining the semantics in the markup to the end user.
First download an article via curl https://www.bbc.com/news/articles/cldd6x6gglxo > news.html
.
Then try this with cheerio
:
const fs = require('fs');
const cheerio = require('cheerio');
const html = fs.readFileSync('news.html', 'utf8');
const $ = cheerio.load(html);
console.log($('body').text());
With just readability
:
(async () => {
const { JSDOM } = require('jsdom');
const { Readability } = require('@mozilla/readability');
const Turndown = require('turndown');
const dom = await JSDOM.fromFile('news.html');
const article = new Readability(dom.window.document, {
keepClasses: true,
}).parse();
console.log(article.textContent);
})();
With readability
+ turndown
:
(async () => {
const { JSDOM } = require('jsdom');
const { Readability } = require('@mozilla/readability');
const Turndown = require('turndown');
const dom = await JSDOM.fromFile('news.html');
const article = new Readability(dom.window.document, {
keepClasses: true,
}).parse();
const turndown = new Turndown({
headingStyle: 'atx',
hr: '---',
bulletListMarker: '-',
codeBlockStyle: 'fenced',
});
const markdown = turndown.turndown(article.content);
console.log(markdown);
})();
Perhaps we should add a "Extract as Markdown" option in the node to determine if we want to use markup to reduce semantic noise in the extracted text?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for using @mozilla/readability
…ol-to-visit-a-website
…ol-to-visit-a-website
…ol-to-visit-a-website
…ol-to-visit-a-website
…ol-to-visit-a-website
packages/@n8n/nodes-langchain/nodes/tools/ToolHttpRequest/utils.ts
Outdated
Show resolved
Hide resolved
packages/@n8n/nodes-langchain/nodes/tools/ToolHttpRequest/utils.ts
Outdated
Show resolved
Hide resolved
packages/@n8n/nodes-langchain/nodes/tools/ToolHttpRequest/utils.ts
Outdated
Show resolved
Hide resolved
packages/@n8n/nodes-langchain/nodes/tools/ToolHttpRequest/utils.ts
Outdated
Show resolved
Hide resolved
); | ||
} | ||
const returnData: string[] = []; | ||
const html = cheerio.load(response); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for using @mozilla/readability
packages/@n8n/nodes-langchain/nodes/tools/ToolHttpRequest/utils.ts
Outdated
Show resolved
Hide resolved
packages/@n8n/nodes-langchain/nodes/tools/ToolHttpRequest/utils.ts
Outdated
Show resolved
Hide resolved
if (!Object.keys(options.headers as IDataObject).length) { | ||
delete options.headers; | ||
} | ||
|
||
if (!Object.keys(options.qs as IDataObject).length) { | ||
delete options.qs; | ||
} | ||
|
||
if (!Object.keys(options.body as IDataObject).length) { | ||
delete options.body; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (!Object.keys(options.headers as IDataObject).length) { | |
delete options.headers; | |
} | |
if (!Object.keys(options.qs as IDataObject).length) { | |
delete options.qs; | |
} | |
if (!Object.keys(options.body as IDataObject).length) { | |
delete options.body; | |
} | |
if (options) { | |
options.url = encodeURI(options.url); | |
if (options.headers && !Object.keys(options.headers).length) { | |
delete options.headers; | |
} | |
if (options.qs && !Object.keys(options.qs).length) { | |
delete options.qs; | |
} | |
if (options.body && !Object.keys(options.body).length) { | |
delete options.body; | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…ol-to-visit-a-website
…ol-to-visit-a-website
1 flaky test on run #5573 ↗︎
Details:
e2e/5-ndv.cy.ts • 1 flaky test
Review all test suite changes for PR #9228 ↗︎ |
✅ All Cypress E2E specs passed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR also updated packages/nodes-base/nodes/Splunk/splunk.svg
. Is that intentional?
Got released with |
1 similar comment
Got released with |
Summary
This tool enables the model to make HTTP requests with various parameters, send in path, body, headers or query . It also allows the model to utilize credentials from other nodes for authentication purposes. Optimize Response option allows some optimization to reduce context send to model.
Related tickets and issues
https://linear.app/n8n/issue/AI-162/tool-to-visit-a-website