
Job scheduler

jakubzembik edited this page Apr 1, 2016 · 31 revisions

Job scheduling service

Import data from a SQL database

On the Job Scheduler page, select the Import data page to schedule an import. The form has the following fields:

  • Name - the name of your job

  • JDBC Uri - the JDBC URI of the database, in the following form:
jdbc:driver://host:port/database_name

  • Username and Password - the credentials to use; required only if the database is secured

  • Table - the name of the database table to import

  • Destination dir - the directory into which the data will be imported
  • Choose import mode - there are three possible modes: Append, Overwrite and Incremental
    • Append - each import fetches the whole table into a separate file. Results of previous imports are not overwritten.
    • Overwrite - each import fetches the whole table and overwrites the results of the previous import.
    • Incremental - each import fetches only the difference since the last import and stores it in a separate file. Sqoop recognizes the delta by the value of a specific column, so the name of a column containing an id must be specified. We recommend using an auto-incremented column. The initial value of the column can be specified; use 0 to import the whole table during the first run.
      • Column name - the database column whose value is checked against Value
      • Value - the column value from which the import should start

  • Start time - the time at which your job starts
  • End time - the time at which your job ends
  • Frequency - how often your job will be submitted
  • Timezone - the id of the time zone in which the start and end times were entered
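
The Incremental mode described above can be illustrated with a small, self-contained Python sketch. It is not the service's implementation and does not call Sqoop; it only models the idea of selecting rows whose check column exceeds the last imported value. The function name `incremental_import`, the in-memory `table`, and the `id` column are hypothetical and exist only for illustration.

```python
def incremental_import(rows, column, last_value):
    """Return the rows added since the last import and the new last value.

    rows       -- list of dicts standing in for a database table (assumption)
    column     -- name of the check column, ideally auto-incremented
    last_value -- value stored after the previous import (0 on the first run)
    """
    # Only rows whose check column is greater than the stored value are new.
    delta = [row for row in rows if row[column] > last_value]
    # Remember the highest value seen so the next run starts after it.
    new_last_value = max((row[column] for row in delta), default=last_value)
    return delta, new_last_value


# Hypothetical table with an auto-incremented "id" column.
table = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

# First run with Value = 0 imports the whole table.
first_batch, last = incremental_import(table, "id", 0)

# A new row arrives; the second run fetches only that row.
table.append({"id": 3, "name": "c"})
second_batch, last = incremental_import(table, "id", last)
```

With Value set to 0 the first run returns every row, while later runs return only rows with a larger id, which is why an auto-incremented column is the recommended choice.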

Job browser

On the Job browser page you can see submitted jobs. It contains two pages: Workflow jobs and Coordinator jobs.

  • Coordinator jobs - This page lists coordinator jobs. Coordinator jobs contain the configuration and manage the spawning of workflow jobs. Click See details to get additional information.
    • Details - additional information about the coordinator job

  • Started workflow jobs - a list of workflow jobs spawned by the coordinator job. Each workflow job on the list has a See details field, which redirects you to the details of the selected workflow job.

  • Workflow jobs - This page lists workflow jobs. A workflow job represents an import from a database to HDFS. Click See details to get additional information.
    • Details - additional information about the workflow job

  • See logs - the logs related to the workflow job
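
To make the relationship between a coordinator job and its workflow jobs more concrete, the following Python sketch computes when workflow jobs would be spawned from the Start time, End time and Frequency fields. It rests on assumptions not stated on this page: that Frequency is expressed in minutes, that the End time is exclusive, and that the time zone has already been applied. The function name `scheduled_runs` is hypothetical.

```python
from datetime import datetime, timedelta


def scheduled_runs(start, end, frequency_minutes):
    """List the times at which a coordinator would spawn workflow jobs.

    Assumes a fixed interval in minutes and an exclusive end time.
    """
    if frequency_minutes <= 0:
        raise ValueError("frequency must be a positive number of minutes")
    runs = []
    current = start
    # One workflow job per interval, from the start time up to the end time.
    while current < end:
        runs.append(current)
        current += timedelta(minutes=frequency_minutes)
    return runs


# Hypothetical job: one hour long, submitted every 20 minutes.
runs = scheduled_runs(
    datetime(2016, 4, 1, 0, 0),
    datetime(2016, 4, 1, 1, 0),
    20,
)
```

Under these assumptions, each entry in `runs` would correspond to one row in the Started workflow jobs list of the coordinator job.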