This is the main class of the extension and the entry point from which configurations, connection providers, operations, and sources are declared.

Configurations

Config

Default configuration

Parameters

Name | Type | Description | Default Value | Required |
---|---|---|---|---|
Name | String | The name for this configuration. Connectors reference the configuration with this name. | | x |
Tag List | Array of String | | | |
Expiration Policy | Expiration Policy | Configures the minimum amount of time that a dynamic configuration instance can remain idle before the runtime considers it eligible for expiration. This does not mean that the platform expires the instance at the exact moment it becomes eligible; the runtime purges instances when it sees fit. | | |

Associated Operations

- Crawl Website
- Get Page Content
- Get Page Insights

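As a rough illustration of the parameters above, a configuration with an expiration policy might be declared as in the fragment below. This is a sketch, not taken from the connector's own documentation: the configuration name `Web_Crawler_Config` is hypothetical, the `expiration-policy` child element follows the usual Mule 4 convention for dynamic configurations, the `timeUnit` value `MINUTES` is an assumed enumeration constant, and namespace declarations are omitted.

```xml
<!-- Hypothetical configuration name; referenced from operations via config-ref -->
<mac-web-crawler:config name="Web_Crawler_Config">
    <!-- Assumption: an idle dynamic instance becomes eligible for expiration after 5 minutes -->
    <expiration-policy maxIdleTime="5" timeUnit="MINUTES" />
</mac-web-crawler:config>
```

The Tag List parameter is left out because its XML form is not shown in this reference.
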
Operations
Crawl Website

<mac-web-crawler:crawl-website>

Crawl a website to a specified depth and fetch its contents. To fetch contents from specific elements only, specify tags and classes in the configuration.

Parameters

Name | Type | Description | Default Value | Required |
---|---|---|---|---|
Configuration | String | The name of the configuration to use. | | x |
Website URL | String | | | x |
Maximum Depth | Number | | | x |
Retrieve Meta Tags | Boolean | | false | |
Download Images | Boolean | | false | |
Download Location | String | | | x |
Output Mime Type | String | The MIME type of the payload that this operation outputs. | | |
Target Variable | String | The name of the variable in which the operation's output is stored. | | |
Target Value | String | An expression that is evaluated against the operation's output; the result of that expression is stored in the target variable. | #[payload] | |

Output

Type | String |

For Configurations

- Config

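The parameter table above could translate into a flow element along the following lines. Treat this as a sketch only: `config-ref` is the standard Mule attribute for the Configuration parameter, but the other attribute names (`websiteUrl`, `maximumDepth`, `retrieveMetaTags`, `downloadImages`, `downloadLocation`) are camelCase guesses derived from the display names and may differ from the connector's actual schema. The configuration name, URL, and path are placeholders.

```xml
<!-- Attribute names other than config-ref are assumptions; check the connector's XSD -->
<mac-web-crawler:crawl-website
    config-ref="Web_Crawler_Config"
    websiteUrl="https://example.com"
    maximumDepth="2"
    retrieveMetaTags="true"
    downloadImages="false"
    downloadLocation="/tmp/crawl-output" />
```

Generating the element from the Anypoint Studio palette and inspecting the resulting XML is the most reliable way to confirm the real attribute names.
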
Get Page Content

<mac-web-crawler:get-page-content>

Get the contents of a web page. The content is returned in the resulting payload.

Parameters

Name | Type | Description | Default Value | Required |
---|---|---|---|---|
Configuration | String | The name of the configuration to use. | | x |
Page Url | String | | | x |
Output Mime Type | String | The MIME type of the payload that this operation outputs. | | |
Target Variable | String | The name of the variable in which the operation's output is stored. | | |
Target Value | String | An expression that is evaluated against the operation's output; the result of that expression is stored in the target variable. | #[payload] | |

Output

Type | String |

For Configurations

- Config

Get Page Insights

<mac-web-crawler:get-page-insights>

Get insights from a web page, including links, word count, and the number of occurrences of elements. To restrict insights to specific elements, specify them in the configuration.

Parameters

Name | Type | Description | Default Value | Required |
---|---|---|---|---|
Configuration | String | The name of the configuration to use. | | x |
Page Url | String | | | x |
Output Mime Type | String | The MIME type of the payload that this operation outputs. | | |
Target Variable | String | The name of the variable in which the operation's output is stored. | | |
Target Value | String | An expression that is evaluated against the operation's output; the result of that expression is stored in the target variable. | #[payload] | |

Output

Type | String |

For Configurations

- Config

Download Image

<mac-web-crawler:download-image>

Download all images from a web page, or download a single image at the specified link.

Parameters

Name | Type | Description | Default Value | Required |
---|---|---|---|---|
Page Or Image URL | String | | | x |
Download Location | String | | | x |
Output Mime Type | String | The MIME type of the payload that this operation outputs. | | |
Target Variable | String | The name of the variable in which the operation's output is stored. | | |
Target Value | String | An expression that is evaluated against the operation's output; the result of that expression is stored in the target variable. | #[payload] | |

Output

Type | String |

Generate Sitemap

<mac-web-crawler:generate-sitemap>

Retrieve the internal links of a website as a sitemap, starting from the specified URL and following links to the specified depth.

Parameters

Name | Type | Description | Default Value | Required |
---|---|---|---|---|
Website URL | String | | | x |
Maximum Depth | Number | | | x |
Output Mime Type | String | The MIME type of the payload that this operation outputs. | | |
Target Variable | String | The name of the variable in which the operation's output is stored. | | |
Target Value | String | An expression that is evaluated against the operation's output; the result of that expression is stored in the target variable. | #[payload] | |

Output

Type | String |

Get Page Meta Tags

<mac-web-crawler:get-page-meta-tags>

Fetch the meta tags from a web page.

Parameters

Name | Type | Description | Default Value | Required |
---|---|---|---|---|
Page URL | String | | | x |
Output Mime Type | String | The MIME type of the payload that this operation outputs. | | |
Target Variable | String | The name of the variable in which the operation's output is stored. | | |
Target Value | String | An expression that is evaluated against the operation's output; the result of that expression is stored in the target variable. | #[payload] | |

Output

Type | String |

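The Target Variable and Target Value parameters shared by every operation can be illustrated with the fragment below. It is a sketch under assumptions: `target` and `targetValue` are the attribute names Mule 4 operations normally use for these two parameters, while `pageUrl` is a guessed attribute name for Page URL and `metaTags` is a hypothetical variable name.

```xml
<!-- Stores the operation's output in vars.metaTags instead of replacing the payload -->
<mac-web-crawler:get-page-meta-tags
    pageUrl="https://example.com"
    target="metaTags"
    targetValue="#[payload]" />
```

With a target variable set, the message payload that entered the operation is left unchanged, and later processors read the result from the variable.
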
Types

Expiration Policy

Field | Type | Description | Default Value | Required |
---|---|---|---|---|
Max Idle Time | Number | A scalar time value for the maximum amount of time a dynamic configuration instance should be allowed to be idle before it's considered eligible for expiration | | |
Time Unit | Enumeration | A time unit that qualifies the maxIdleTime attribute | | |