Adding ScanPDF Changes #200

Merged
merged 1 commit into from
Apr 26, 2022

Conversation

phutelmyer
Contributor

Describe the change
Updates ScanPdf with a new output format and object key generation. This should help resolve issue #198, as well as other internally identified issues with output and execution errors.

Object keys to collect are now configured in the backend.yml file. Add any additional objects you would like to collect under options.objects. When the scanner runs, it searches the PDF for each configured object; for every object found, a counter records the number of references.

  'ScanPdf':
    - positive:
        flavors:
          - 'application/pdf'
          - 'pdf_file'
      priority: 5
      options:
        objects:
          - 'AA'
          - 'EmbeddedFiles'
          - 'JavaScript'
          - 'JS'
          - 'Launch'
          - 'Macrosheet'
          - 'MediaBox'
          - 'OpenAction'
          - 'URI'
          - 'XObject'
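
As a rough illustration of the counting behavior described above, here is a minimal sketch in Python. It is not the ScanPdf implementation: the function name `count_pdf_objects` is hypothetical, and it assumes the configured objects appear as plain `/Name` tokens in the raw bytes (it does not look inside compressed streams or handle `#`-escaped names). Only objects that are actually found are reported, which matches the shape of the `objects` field in the sample output.

```python
import re


def count_pdf_objects(data: bytes, objects: list[str]) -> dict[str, int]:
    """Count references to each configured name (e.g. /JavaScript) in raw PDF bytes.

    Hypothetical helper for illustration; not part of Strelka.
    """
    counts: dict[str, int] = {}
    for name in objects:
        # Match "/Name" only when it is not followed by another name character,
        # so that "JS" is not counted inside a longer name such as "/JSFoo".
        pattern = re.compile(
            rb"/" + re.escape(name.encode("ascii")) + rb"(?![A-Za-z0-9_.\-])"
        )
        hits = len(pattern.findall(data))
        if hits:
            # Objects with zero references are omitted from the result.
            counts[name] = hits
    return counts


sample = (
    b"1 0 obj\r\n<<\r\n/Type /XObject\r\n/Subtype /Image\r\n>>\r\n"
    b"/MediaBox [0 0 612 792] /XObject"
)
print(count_pdf_objects(sample, ["JS", "JavaScript", "MediaBox", "XObject"]))
# prints {'MediaBox': 1, 'XObject': 2}
```

The list passed as `objects` plays the role of `options.objects` from the configuration above.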

Because there are too many field changes to list here, please review the scanner source for the full set of changes.

Describe testing procedures
Built PR and tested against 10+ PDF samples.

Sample output

{
  "file": {
    "depth": 0,
    "flavors": {
      "mime": [
        "application/pdf"
      ],
      "yara": [
        "pdf_file"
      ]
    },
    "scanners": [
      "ScanEntropy",
      "ScanExiftool",
      "ScanFooter",
      "ScanHash",
      "ScanHeader",
      "ScanPdf",
      "ScanYara"
    ],
    "size": 315764,
    "tree": {
      "node": "5b989e05-f4b8-4bc1-ab06-91921d14977f",
      "root": "5b989e05-f4b8-4bc1-ab06-91921d14977f"
    }
  },
  "request": {
    "attributes": {
      "filename": "samples/scan_pdf/4c4a650e020be616bc742a3c380d3e4487cd454faaa13d5a0b2fb8f21ca37d1c"
    },
    "client": "go-fileshot-testing",
    "id": "5b989e05-f4b8-4bc1-ab06-91921d14977f",
    "source": "TestHost",
    "time": 1650974414
  },
  "scan": {
    "entropy": {
      "elapsed": 0.00023,
      "entropy": 7.959114013971717
    },
    "exiftool": {
      "elapsed": 0.112316,
      "keys": [
        {
          "key": "PDFVersion",
          "value": 1.2
        },
        {
          "key": "Linearized",
          "value": "No"
        },
        {
          "key": "PageCount",
          "value": 4
        },
        {
          "key": "Producer",
          "value": "Acrobat Distiller 3.0 for Windows"
        },
        {
          "key": "Author",
          "value": "user"
        },
        {
          "key": "Creator",
          "value": "PageMaker 6.5"
        },
        {
          "key": "Title",
          "value": "sura-59-75.p65"
        }
      ]
    },
    "footer": {
      "elapsed": 0.000082,
      "footer": "cf82ff73921de657>]\r\n>>\r\nstartxref\r\n314295\r\n%%EOF\r\n"
    },
    "hash": {
      "elapsed": 0.003436,
      "md5": "a069e558ef88e6181f01077f1aa3c7f5",
      "sha1": "5814cbf7f6bfa57d8ac2c54e9e3a73271442ec30",
      "sha256": "4c4a650e020be616bc742a3c380d3e4487cd454faaa13d5a0b2fb8f21ca37d1c",
      "ssdeep": "6144:X32lC7x6+Ganc1VW9O6EheGi2yQRdIFPDk3BB0qDiT1adZJaFn4iVv9HO7f4mqMV:X8CkK8WDkxi2yQzIFLMBbkabJaF4iVuh"
    },
    "header": {
      "elapsed": 0.000036,
      "header": "%PDF-1.2\r\n%����\r\n1 0 obj\r\n<<\r\n/Type /XObject\r\n/Sub"
    },
    "pdf": {
      "author": "user",
      "creator": "PageMaker 6.5",
      "dirty": false,
      "elapsed": 0.189464,
      "encrypted": false,
      "format": "PDF 1.2",
      "images": 24,
      "lines": 154,
      "objects": {
        "MediaBox": 1,
        "XObject": 4
      },
      "old_xrefs": true,
      "pages": 4,
      "producer": "Acrobat Distiller 3.0 for Windows",
      "repaired": false,
      "title": "sura-59-75.p65",
      "words": 720,
      "xrefs": 64
    },
    "yara": {
      "elapsed": 0.000929,
      "matches": [
        "test"
      ]
    }
  }
}
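
For anyone consuming the new output downstream, the following sketch shows one way to read the `scan.pdf.objects` counters from an event like the one above. The field path comes from the sample output; the helper names and the watchlist are assumptions made up for illustration and are not part of Strelka.

```python
import json


def pdf_object_counts(event_json: str) -> dict[str, int]:
    """Return the scan.pdf.objects mapping from one event, or {} if it is absent."""
    event = json.loads(event_json)
    return event.get("scan", {}).get("pdf", {}).get("objects", {})


def flag_objects(counts: dict[str, int], watchlist: set[str]) -> dict[str, int]:
    """Keep only the counters whose key is on the (hypothetical) watchlist."""
    return {key: value for key, value in counts.items() if key in watchlist}


# Trimmed-down event containing only the fields this sketch reads.
event_json = '{"scan": {"pdf": {"objects": {"MediaBox": 1, "XObject": 4}}}}'
counts = pdf_object_counts(event_json)

# Assumed watchlist, drawn from the configured object keys; adjust to taste.
watchlist = {"AA", "JavaScript", "JS", "Launch", "OpenAction"}
print(flag_objects(counts, watchlist))
# prints {}
```

For the sample above nothing is flagged, since only `MediaBox` and `XObject` were found.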

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of and tested my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings

phutelmyer self-assigned this Apr 26, 2022
phutelmyer added the bug and enhancement labels Apr 26, 2022
phutelmyer merged commit e6e4f8a into master Apr 26, 2022
phutelmyer deleted the 2022-04-26-ScanPDF_Update branch May 2, 2022 22:32