Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Generate map of short, unique dataobject identifiers as a system service #9451

Closed
sminnee opened this issue Apr 2, 2020 · 4 comments
Closed

Comments

@sminnee
Copy link
Member

sminnee commented Apr 2, 2020

Problem

Whenever we need a string to uniquely identify a DataObject class, we have a problem — fully qualified classname is often too long and unwieldy, but a class basename might not be unique. We've tripped over this repeatedly (eg #9400), generally saying "well I guess we have to use FQCN" and ending up with verbose subsystems.

Possible solution

Most of the time dataobject names are unique, but sometimes they're not. At some sort of "compile time" (maybe during dev/build or ?flush), build a map ge

A possible algorithm:

  • Start by mapping each FQCN to their their base name, building a cluster for each basename
  • Whenever a cluster has more than one item, prefix the short-name with the next-to-last segment of the FQCN, and split into 2 clusters based on that.
  • Repeat for clusters that still have more than 1 item
  • Stop when all clusters have a single item

The name of the cluster is now the short-name of the single item in it. Such an algorithm would produce the following mappings for the given set of dataobjects:

  • SilverStripe\Ecommerce\Models\Order => Ecommerce\Models\Order
  • SilverStripe\PolicyManager\Models\Order => PolicyManager\Models\Order
  • SilverStripe\Security\Member => Member

"Models" is a bit redundant. You could potentially have a tweak to the algorithm, that when you look at the next-to-last segment of the FQCN, you ignore it in the case where that segment has the same value in all cases. So, for "Order", the next-to-last item is "Models" in all cases, and therefore ignore it. Applying that:

  • SilverStripe\Ecommerce\Models\Order => Ecommerce\Order
  • SilverStripe\PolicyManager\Models\Order => PolicyManager\Order
  • SilverStripe\Security\Member => Member

Details

You might allow the caller of such a service the ability to specify the separator character. So while it's "" above, you could ask for "_" (for a table name) or "-" (for an HTML ID). You could also ask for lowercase, uppsercase, or classcase. For simplicity I would probably have the clustering be generated as case-insensitive in all cases at compile time.

Applications

Implications

These unique names might change when a module is installed, as a previously unique name might be no longer unique and require a prefix. If you have written HTML in a module that relies on a specific value, this might make the module unreliable.

One solution there would be to allow hard-coding of a shortname. The clustering algorithm would need to be configured to ignore any hard-coded classes and disallow those clusters from being used (probably via the same incremental-prefixing that is uses normally). This is essentially what happens for table names.

A much simpler approach

If this all seems like a bit much unnecessary complexity, a simpler approach would be to adopt a pattern of using the dataobject's table_name where we need a short-but-unique class identifier.

@sminnee sminnee changed the title Generate map of short, unique dataobject identifiers as a system service RFC: Generate map of short, unique dataobject identifiers as a system service Apr 2, 2020
@chillu
Copy link
Member

chillu commented Apr 2, 2020

I'm not sure I understand what problem this is trying to solve. We're already fine with the verbosity of FQCN in a code authoring context (supported by IDE autocompletion of course). What makes it different when it's a cache key or unique identifier? Cache key length restrictions? In this case, I'd just use table_name which already needs to be unique for a given project at least.

Note that we have a similar problem in GraphQL query and mutation names (usually by returning a name key in the attributes() method).

I don't want to end up in a situation where we have multiple ways of identifying DataObjects:

  • FQCN
  • Injector name
  • table_name
  • GraphQL name
  • Template shortcut name? Haven't looked into this
  • Unique identifier proposed here

Also, while the use cases are somewhat separate, this overlaps with the concept of Facades in Laravel, which have their own FQCN in the Illuminate\Support\Facades namespaces, but within that are "short and unique".

@sminnee
Copy link
Member Author

sminnee commented Apr 2, 2020

Fair enough maybe this is overcomplicated. I would note, however, that one advantage of this is that you could use it in the some of the other places such as table name and graphql name, where FQCNs are problematic.

@chillu
Copy link
Member

chillu commented Apr 3, 2020

that you could use it in the some of the other places such as table name and graphql name, where FQCNs are problematic

Yeah if we deprecate table_name and elevate that concept to a "short unique identifier" across the ORM, that might fly. But I'm cautious of adding more complexity to a system that's already hard to understand, particularly around the injector naming.

@sminnee
Copy link
Member Author

sminnee commented Apr 3, 2020

Based on Ingo's comments I'm going to withdraw this suggestion for now. Nice quick RFC there :P

@sminnee sminnee closed this as completed Apr 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants