Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Notes on/for v3 / v4 #830

Open
tunnckoCore opened this issue Apr 1, 2022 · 20 comments
Open

Notes on/for v3 / v4 #830

tunnckoCore opened this issue Apr 1, 2022 · 20 comments
Labels
Priority: Medium This issue may be useful, and needs some attention. Status: In Progress This issue is being worked on, and has someone assigned. Status: Proposal A feature request or another suggestion. Type: Maintenance Updating phrasing or wording to make things clearer or removing ambiguity, without a functionality.

Comments

@tunnckoCore
Copy link
Member

tunnckoCore commented Apr 1, 2022

workspaces/

  • formidable
  • formidable-plugins?
  • formidable-helpers
  • formidable-mini - an experimental minimal & modern idea
  • formidable-utils
  • formidable-storage
    • memory (VolatileFile)
    • disk (PersistentFile)
    • serverless (formidable-serverless)

misc

  • minimize dep weight
  • monorepo (turborepo /nx.dev + lerna ^5.1 + yarn workspaces) - Hela v6
  • written in typescript, drop / deprecated external packages
  • kodiak - merge automation & better PRs and commits (https://kodiakhq.com/)
  • Dependabot, make it auto-update and auto-merge all versions and branches (latest-2, latest, latest-3, next-2, next-3)
  • GitHub CodeQL action
  • go async/await and promises; possibly drop callback APIs entirely
  • document Release Process, Lines, and dist-tags & branches
    • latest-1, latest-2, next-2, latest-3, latest-4, next-4);
    • latest as master
  • better / real benchmarks
  • analyze old tests & rethink and probably drop them
  • some benchmarks here at busboy v1 - Changes in v1.0.0 mscdex/busboy#266
  • standard Web API File
@tunnckoCore tunnckoCore added Priority: Medium This issue may be useful, and needs some attention. Status: In Progress This issue is being worked on, and has someone assigned. Status: Proposal A feature request or another suggestion. Type: Maintenance Updating phrasing or wording to make things clearer or removing ambiguity, without a functionality. labels Apr 2, 2022
@GrosSacASac
Copy link
Contributor

go async/await and promises; possibly drop callback APIs entirely

Maybe do a mix in v3 and in v4 drop callback

@tunnckoCore
Copy link
Member Author

tunnckoCore commented Jun 16, 2022

New FormidableFile, based on fetch-blob and Web API File. Plus, we'll always make temporary files on tmp dir, for the time the process is running. When the process ends, they are cleared (all that is handled by fetch-blob). The user will still be able to to save on disk or to do that file whatever it wants. Something like opts.save on Formidable Options, will mean that we will save the data of that temporary created file to the disk on the user-chosen (or default) upload dir.

It handles:

  • random filename generation (default) with cuid
  • file rename function opts.rename
  • filename sanitization, ALWAYS, with filenamifyPath (with opts.replacement: '-' instead of the default !)
    • whether or not you use custom rename function, the result is always passed to sanitization
    • not yet sure if we should do that when renamer function is passed
/**
 * @typedef {import('filenamify').Options} FilenamifyOptions
 * @typedef {import('node:path').ParsedPath} ParsedPath
 *
 * @typedef FormidableFileOptions
 * @property {string} [type] - mime type from header
 * @property {number} [lastModified] - file mtimeMs, default `Date.now()`
 * @property {number} [size] - file size in bytes
 * @property {boolean} [randomName] - generate random filename, default `true` using `cuid`
 * @property {FilenamifyOptions} [filenamify] - options passed to `filenamify`
 * @property {RenamerFn} [rename] - filename renaming function
 * @property {{[key: string]: unknown}} [data] - any additional data or way of storing things
 *
 * @typedef {(filepath: string, contents: unknown, options: FormidableFileOptions) => string} RenamerFn
 * must be synchronous function
 */

import path from 'node:path';
import cuid from 'cuid';
import { filenamifyPath } from 'filenamify';

/**
 * @typedef {{type?: string, name: string, lastModified: number}} File
 * standard Web API File
 */
// @ts-ignore
import { File } from 'fetch-blob/file.js';

export class FormidableFile extends File {
  /**
   *
   * @param {string} filepath - filename from header
   * @param {unknown} contents - any file content value
   * @param {FormidableFileOptions} [options]
   */
  constructor(filepath, contents, options) {
    const { file, opts, fp } = normalize(filepath, contents, options);

    super(contents, file.base, {
      // @ts-ignore
      type: opts.mimetype || opts.mime || opts.type,
      lastModified: opts.lastModified || Date.now(),
    });

    /**
     * @type {string} - full filepath
     */
    // we assign this here, because to not mess with ParsedPath `file`
    this.path = fp;

    // eslint-disable-next-line no-underscore-dangle
    this._updateProps(file, contents, opts);
  }

  /**
   *
   * @param {ParsedPath} file - custom file
   * @param {unknown} contents - any file content value
   * @param {FormidableFileOptions} [options]
   */
  // eslint-disable-next-line no-underscore-dangle
  _updateProps(file, contents, options) {
    // it's already normalized
    // const opts = normalizeOptions(options);

    this.basename = file.base;
    this.dirname = file.dir;
    this.extname = file.ext;

    this.stem = file.name; // this.stem == this.name == file.name == basename without extname
    this.mime = options.type;

    if (options.size) {
      // @ts-ignore
      this.size = options.size;
    }

    Reflect.defineProperty(this, 'contents', { value: contents });
    Reflect.defineProperty(this, 'options', { value: options });
    Reflect.defineProperty(this, 'base', { value: file.base });
    Reflect.defineProperty(this, 'dir', { value: file.dir });
    Reflect.defineProperty(this, 'ext', { value: file.ext });
    Reflect.defineProperty(this, 'data', { value: options.data });
  }
}

/**
 *
 * @param {string} filepath - filename from header
 * @param {unknown} contents - any file content value
 * @param {FormidableFileOptions} [options]
 *
 */
function normalize(filepath, contents, options) {
  const opts = normalizeOptions(options);

  /**
   * @type {string} fp
   */
  const fp = filenamifyPath(
    // @ts-ignore
    opts.rename(filepath, contents, opts),
    opts.filenamify,
  );

  /**
   *
   * @type {ParsedPath} file
   */
  const file = path.parse(fp);

  return { file, opts, fp };
}

/**
 * @param {{[key: string]: unknown} & FormidableFileOptions} [options]
 * @returns {FormidableFileOptions}
 */
function normalizeOptions(options = {}) {
  const opts = {
    filenamify: { replacement: '-', ...options.filenamify },
    randomName: true,
    data: { ...options.data },
    ...options,
  };

  /**
   * @property {RenamerFn}
   */
  opts.rename = (f, c, o) => {
    if (typeof opts.rename === 'function') {
      return opts.rename(f, c, o);
    }
    return opts.randomName ? cuid() : f;
  };

  return opts;
}

/**
 * examples
 */

const virtualFile = new FormidableFile(
  './foo/bar/qu_sas<[email protected]>zab.gz<script onchange="alert(1)"></script>sax.png',
  await fs.readFile('./examples/10k.json'),
  {
    type: 'application/json',
    randomName: true,
    data: { sasa: 1 },
  },
);

console.log(virtualFile);

Ideally it will be called from part.on('data') in the future, but for now we can add it at _newFile, then we can use createTemporaryBlob to make that temp file and pass the return to the user.

Something like that:

part.on('data', async () => {
  let file = new FormidableFile(nameFromHeader, dataFromParser, {
    type: contentTypeFromParser,
    rename: formidableOptions.rename,
  });
  const blob = await createTemporaryBlob(
    file.contents,
    file.type,
    file.options.signal,
  );

  file.blob = blob;
  this.emit('file', file, blob);

  if (formidableOptions.save) {
    await fs.writeFile(file.path, file.contents);
  }
});

We are using createTemporaryBlob instead of createTemporaryFile because we already made a file instance of FormidableFile which extends File.

The reason of having our own File implementation on top, is that it's for useful properties that we need to pass to the users. Plus I always seeing it that way since I got in here in v1. The properties of it is based on enough battle-tested vfile (remark, rehype, unified) and vinyl (gulp), plus of course the minimalism of File.

Currently text(), stream() and those methods, are missing, but we probably should implement them too.

The whole thing can be introduced in v3 (and require node 14) and backport to v2 (without the temp file creation thing and without fetch-blob).

The rest of the interesting stuff like iterators and new APIs goes to v4.

@tunnckoCore
Copy link
Member Author

tunnckoCore commented Jun 17, 2022

Okay, so the @jimmywarting's cleaned Formidable parser, which is on the node-fetch is almost 2x faster than the current one.

Based on the busboys benchmark bench-multipart-files-100mb-big the old one is avg 3.7s, now it's 1.7s

@jimmywarting
Copy link
Collaborator

jimmywarting commented Jun 17, 2022

wow, didn't know that mine was 2x faster. did you come to any conclusion of why that is?

@tunnckoCore
Copy link
Member Author

Not yet. I'm not in parsers and that parser looks like magic to me LOL.

Btw, to make a note: it's faster when use just the class, but not the toFormData. And when you start putting things on top of all that it gets slower. Busboy v1 really does a great job at the very low level.

I'll commit updated benchmarks soon.

@jimmywarting
Copy link
Collaborator

[email protected] is publised, you can now use createTemporaryBlob or createTemporaryFile if you like

@jimmywarting
Copy link
Collaborator

jimmywarting commented Jul 6, 2022

btw, formdata is built into NodeJS core now... maybe you can do a conditional import like use some formdata package like formdata-polyfill (used by node-fetch) or formdata-node (both are spec compatible) if it isn't avalible on global namespace

so something like const FormData = globalThis.FormData || await import('formdata-polyfill/esm.min.js')
just so happen to know that NodeJS (undici's FormData imp accepts 3th party blobs like fetch-blob as well - not just only instances of globalThis.Blob)

fetch-blob is still going to stick around until nodejs/node#37340 has been resolved by nodejs itself

@jimmywarting
Copy link
Collaborator

Was a super long time ago since i even used gulp and i have never heard of vinyl or filev before... honestly just wished they followed Web File IDE structure instead...

vinyl should also impl text, arrayBuffer and stream() 👍

@tunnckoCore
Copy link
Member Author

tunnckoCore commented Jul 23, 2022

they followed Web File IDE structure instead

Well, yea. But they predated everything i think, that's why.

I built some formidable-mini on top of your fetch-blob and formdata-polyfill, you can check it here, it's pretty clean, small and a beauty, haha. There FormidableFile is just a Web File, or anyone can just pass the return stadnard FormData to their desired place :P

ah.. what a times..

I'm open for ideas and feedback. Not sure how to use these createTemporary* yet..

@GrosSacASac
Copy link
Contributor

So my plan is to add a few tests to #855 then merge and publish. This will probably be the last version 3.

And then probably mark 3 as latest. So we can avoid touching v2 like #908 and void having issues that were already fixed in version 3

Then finish #899 which transitions to typescript, so this will be version 4.

@tunnckoCore
Copy link
Member Author

@GrosSacASac Agree.

Seems like people are always just using latest and don't even read or care about anything else. I'm watching the downloads on all versions, and the majority still are on the v2, after the v2.1 publish they started moving to it, but at a very very slow pace.

Note: v3 also need to have CJS. We can use bundt which is a very small and simple regex based converter of import-exports to CJS (does only that), so that we can have both ESM and CJS.

@jimmywarting
Copy link
Collaborator

jimmywarting commented Dec 9, 2022

Note: v3 also need to have CJS. We can use bundt which is a very small and simple regex based converter of import-exports to CJS (does only that), so that we can have both ESM and CJS.

it could be possible to also just create a simple but small cjs wrapper around ESM since the nature of the parser is async anyways...

// index.cjs
module.exports = function (opts) {
  return {
    // import the ESM module lazily when needed, and pass the arguments along 
    parse: (...args) => import('./index.js').then(mod => mod.dafault(opts).parse(...args))
  }
}

Then the code could still be written as ESM and you wouldn't have to duplicate all of your code as a dual cjs/esm package hazard that increases the pkg size?

@tunnckoCore
Copy link
Member Author

@jimmywarting uh, yea. I actually do it similarly here and there. Good point.

@tommyhtran
Copy link

Why not merge the TypeScript PR into version 3?

@jimmywarting
Copy link
Collaborator

Don't think i have much saying in this... but i don't like TS syntax, it dose not run in browsers and never will.
the ESM support and extension-less paths in TS is a hot garbage atm.
I rather much prefer jsdoc + checkjs / allow js combo.
I much more prefer build-free solutions that just works without wasting lots of time compiling stuff that takes time.

you can have equally good type support with plain vanilla javascript. most of the times you don't even need to annotate either javascript or typescript.
let me give you an example. how would you go about using Audio to calculate the duration of a source? you would need a promise to listen on onloadedmetadata
you may write something like this:

function getDruation() {
  return new Promise<number>((rs) => {
    const audio = new Audio()
    audio.preload = 'metadata'
    audio.src = url
    audio.load()
    audio.onloadedmetadata = () => {
      rs(audio.duration)
    }
  })
}

getDruation returns a number
but if we rewrite it a little bit with async/await

async function getDruation() {
  const audio = new Audio()
  audio.preload = 'metadata'
  audio.src = url
  audio.load()
  await new Promise(rs => audio.onloadedmetadata = rs)
  return audio.duration
}

then you suddenly don't need to annotate anything.
there are other more clever ways to have typing without annotating your code.
if you use default parameters in functions then it can also figure out what the types are.

@tunnckoCore
Copy link
Member Author

@jimmywarting agree. Not fan of TypeScript too. Love the explosion and the attention, and how the ecosystem looks, but gosh I can't stand it to write a whole codebase in TS. I think if it's so needed we can expose just a separate d.ts, and when the codebase is drastically simplified it won't be that needed too.

As I said at #899: "I'll review it at a later point, but I don't think a review is much needed. Maybe the only thing that I could ask is to try with separate file approach - e.g. just a typings file that we will distribute with the package. Just to see how it looks like."

@tommyhtran
Copy link

I can understand a few of the frustrations with TypeScript, but maybe it would help to view it as a necessary evil.

As it is, I already found a few coding mistakes in the process of converting the codebase to TS. It sounds like the drawback here is having a compilation step and being forced to write code a little differently. IMO, compilation isn’t a big deal. Taking away the ability to write code more freely is what we need consensus on.

The alternative to the PR is probably going to be the status quo, type-less development in JS and a .d.ts file that may fall behind on any given commit.

@tunnckoCore
Copy link
Member Author

tunnckoCore commented Dec 30, 2022

@tommyhtran agree.

But I'm coming from that that the parser could be hard to be typed, and really there's not much need for that - it's an internal.
The rest is just a big bloat around the parser which can be simplified dramatically just like I did with formidable-mini (i'm playing around), which is around the same parser, but the rest on top of it is just a single simple class which won't be that hard to be written in TypeScript - not like the current one.

The thing is that we obviously (based on downloads stats of older versions) should focus on legacy and backporting things.. which drives out the motivation to work at all. In 2023 I'll figure out how to make users to migrate or upgrade, and why not we can get paid for helping and consulting. I'm very excited about v3, v4 and formidable-mini, but..

@GrosSacASac
Copy link
Contributor

Ok lets have a vote (react with one)

Continue with current situation (types in a separate project) ❤️

Add type defintion side by side to original source files (.js files + .d.ts files) 🚀

Convert project to typescript and have type definition in the source files, use a compiler to have js files on publish 😄

@tunnckoCore tunnckoCore pinned this issue Mar 10, 2024
@tunnckoCore
Copy link
Member Author

tunnckoCore commented Mar 10, 2024

@GrosSacASac i'll move formidable-mini as a separate repo in the org.

Currently it is on https://github.com/tunnckoCore/opensource/tree/master/modules/formidable-mini.

We can start moving tests there, and maybe tag it as v4? It's already API compatible kind of, without the file size options and stuff, but could be used for migrating users. And we can type it too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority: Medium This issue may be useful, and needs some attention. Status: In Progress This issue is being worked on, and has someone assigned. Status: Proposal A feature request or another suggestion. Type: Maintenance Updating phrasing or wording to make things clearer or removing ambiguity, without a functionality.
Projects
None yet
Development

No branches or pull requests

4 participants