LutaML Path provides a parser for path expressions that reference elements within information models.
LutaML Path implements a path notation similar to the Object Constraint Language (OCL) to locate UML model elements across package hierarchies.
This gem is designed to work with OMG UML models and supports referencing any UML element, including packages, classes, interfaces, properties, and operations.
The path syntax for UML models is described below.
The UML element path specification extends the OCL 2.4 specification to provide a mechanism for uniquely identifying model elements (classes, interfaces, enumerations, etc.) within the UML package hierarchy. It provides both relative and absolute path references.
An element path can be specified in these forms:
- Single element: ElementName
- Relative path: Package1::Package2::ElementName
- Absolute path: ::Package1::Package2::ElementName
The absolute path variant starts with :: to indicate that the path begins at the model root.
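The three forms can be told apart from the string alone. The following is a minimal sketch of that distinction; `classify_path` is a hypothetical helper written for illustration, not part of the lutaml-path API, and it ignores escaped separators for simplicity.

```ruby
# Hypothetical helper (not provided by lutaml-path): classify a path
# expression by its form. Escaped separators (\::) are not considered here.
def classify_path(expression)
  # A leading "::" anchors the path at the model root.
  return :absolute if expression.start_with?("::")

  # Any remaining "::" means the expression has package segments.
  expression.include?("::") ? :relative : :single
end

puts classify_path("ElementName")                        # single
puts classify_path("Package1::Package2::ElementName")    # relative
puts classify_path("::Package1::Package2::ElementName")  # absolute
```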
Package and element names support full Unicode:
- All Unicode letters and numbers are allowed
- Names can contain characters from any language, including Japanese, Chinese, and Korean
- Case sensitivity follows the source UML model
- Pattern matching works with Unicode characters
Examples:
// Japanese package and class names
建物::窓::ガラス
::建築モデル::建物::窓
// Mixed language names
building::窓::Window
geometry::図形::円
// Patterns with Unicode
建物::部品*
*部::Base*
Segment wildcards match levels of the package hierarchy:
- Single segment: Package1::*::Element matches Element in any immediate subpackage of Package1
- Multiple segments: Package1::**::Element matches Element in Package1 or at any nesting depth below it
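To make the difference between the two segment wildcards concrete, here is a small sketch that matches a list of pattern segments against a list of element names. It assumes that `*` consumes exactly one segment and `**` consumes zero or more segments. `deep_match?` is a hypothetical helper, and Ruby's `File.fnmatch?` stands in for the gem's own name matching.

```ruby
# Hypothetical helper (not the gem's implementation): match pattern segments
# against element-path segments, treating "**" as "zero or more segments".
def deep_match?(patterns, names)
  return names.empty? if patterns.empty?

  head, *rest = patterns
  if head == "**"
    # Try skipping 0, 1, 2, ... leading names before matching the rest.
    (0..names.length).any? { |skip| deep_match?(rest, names.drop(skip)) }
  else
    # Any other segment (including "*") must match exactly one name.
    !names.empty? &&
      File.fnmatch?(head, names.first, File::FNM_EXTGLOB) &&
      deep_match?(rest, names.drop(1))
  end
end

p deep_match?(%w[Package1 * Element], %w[Package1 Sub Element])   # true
p deep_match?(%w[Package1 * Element], %w[Package1 Element])       # false
p deep_match?(%w[Package1 ** Element], %w[Package1 Element])      # true
p deep_match?(%w[Package1 ** Element], %w[Package1 A B Element])  # true
```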
Name patterns use glob syntax:
- * - matches zero or more characters (including Unicode)
- ? - matches exactly one character (including Unicode)
- [abc] - matches one character in the set
- [!abc] - matches one character not in the set
- [a-z] - matches one character in the range (ASCII only)
- {pat1,pat2} - matches either pattern
Examples:
- Base* - matches names starting with "Base"
- *Model - matches names ending with "Model"
- 建物* - matches names starting with "建物"
- {ドア,窓} - matches exactly "ドア" or "窓"
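The glob rules above closely resemble those of Ruby's built-in `File.fnmatch?`, so the examples can be tried without the gem. This is only an approximation of the behaviour described here, not the parser lutaml-path uses; `name_matches?` is a hypothetical helper. The `File::FNM_EXTGLOB` flag enables `{a,b}` alternatives, and `File::FNM_DOTMATCH` lets `*` match names that begin with a period.

```ruby
# Hypothetical helper built on Ruby's File.fnmatch?; an approximation of the
# glob rules above, not the gem's own matcher.
GLOB_FLAGS = File::FNM_EXTGLOB | File::FNM_DOTMATCH

def name_matches?(pattern, name)
  File.fnmatch?(pattern, name, GLOB_FLAGS)
end

p name_matches?("Base*", "BaseClass")    # true
p name_matches?("*Model", "DataModel")   # true
p name_matches?("建物*", "建物モデル")     # true
p name_matches?("{ドア,窓}", "窓")        # true
p name_matches?("{ドア,窓}", "窓枠")      # false
p name_matches?("[!abc]x", "dx")         # true
```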
path ::= ['::'] package-path element-pattern
package-path ::= [package-pattern '::']*
package-pattern ::= package-name | '*' | '**' | glob-pattern
package-name ::= unicode-identifier
element-pattern ::= glob-pattern
glob-pattern ::= /* glob syntax as described above */
unicode-identifier ::= /* any valid Unicode identifier */
If a package name contains single or double colons:
- Single colons (:) in package names are preserved as-is
- Double colons (::) in package names must be escaped as \::
- The leading :: of an absolute path cannot be escaped
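One way to picture the escaping rules is as a split on every `::` that is not preceded by a backslash, followed by unescaping. The sketch below assumes exactly that reading, and also rejects empty segments. `split_segments` is a hypothetical helper, not the gem's tokenizer.

```ruby
# Hypothetical helper (not the gem's tokenizer): split a path expression into
# segments, honouring the \:: escape and rejecting empty segments.
def split_segments(expression)
  # Drop the leading "::" of an absolute path; it is never part of a name.
  body = expression.delete_prefix("::")

  # Split only on "::" that is not preceded by a backslash, then unescape.
  # The -1 limit keeps trailing empty strings so they can be rejected below.
  segments = body.split(/(?<!\\)::/, -1).map { |seg| seg.gsub("\\::", "::") }

  if segments.empty? || segments.any?(&:empty?)
    raise ArgumentError, "empty segment in #{expression.inspect}"
  end
  segments
end

p split_segments("model::std\\::string")  # ["model", "std::string"]
p split_segments("::a:b::c")              # ["a:b", "c"]
```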
- A single element name or pattern matches in any package
- Relative paths are resolved from the current context
- Absolute paths are resolved from the model root
- Path segments must match patterns exactly
- Empty segments are invalid
- Multiple matches are allowed with wildcards or patterns
- When a path without wildcards or patterns has multiple matches, the first match is used
- Unicode normalization follows UML model rules
gem install lutaml-path
Or add this line to your application’s Gemfile:
gem 'lutaml-path'
The path syntax follows UML namespace conventions, using :: as a separator:
require 'lutaml/path'
# Simple element reference
path = Lutaml::Path.parse("Package::Class")
# Absolute path (starts from root namespace)
path = Lutaml::Path.parse("::Root::Package::Class")
# Path with wildcards
path = Lutaml::Path.parse("Package::*::BaseClass*")
The parser supports several kinds of patterns:
- * - matches any sequence of characters
- ? - matches any single character
- [abc] - matches any character in the set
- {pattern1,pattern2} - matches any of the comma-separated patterns
Examples:
# Match any class starting with "Base"
path = Lutaml::Path.parse("Base*")
# Match specific character patterns
path = Lutaml::Path.parse("Package::[A-Z]*::Interface")
# Match multiple alternatives
path = Lutaml::Path.parse("model::{Abstract,Base}Class")
The parsed path can be used to match against actual element paths:
path = Lutaml::Path.parse("model::*::BaseClass")
path.match?(["model", "core", "BaseClass"]) # => true
path.match?(["model", "BaseClass"]) # => false
path.match?(["other", "core", "BaseClass"]) # => false
- Absolute paths (starting with ::) must match the entire element path
- Relative paths can match elements at any depth
absolute = Lutaml::Path.parse("::model::Class")
relative = Lutaml::Path.parse("model::Class")
absolute.match?(["model", "Class"]) # => true
absolute.match?(["root", "model", "Class"]) # => false
relative.match?(["model", "Class"]) # => true
relative.match?(["root", "model", "Class"]) # => true
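The results above are consistent with a simple rule: an absolute path is compared against the whole element path, while a relative path is compared against its trailing segments. The following sketch reproduces the examples under that assumption without using the gem. `path_match?` and `segments_match?` are hypothetical helpers, and the sketch handles neither `**` nor escaped separators.

```ruby
# Hypothetical helpers (not the gem's implementation). Assumes a relative
# path matches when its segments match the tail of the element path.
def segments_match?(patterns, names)
  patterns.length == names.length &&
    patterns.zip(names).all? do |pattern, name|
      File.fnmatch?(pattern, name, File::FNM_EXTGLOB)
    end
end

def path_match?(expression, element_path)
  absolute = expression.start_with?("::")
  patterns = expression.delete_prefix("::").split("::")

  # Absolute: every segment of the element path must be accounted for.
  return segments_match?(patterns, element_path) if absolute

  # Relative: compare against the last N segments only.
  return false if patterns.length > element_path.length
  segments_match?(patterns, element_path.last(patterns.length))
end

p path_match?("::model::Class", %w[root model Class])  # false
p path_match?("model::Class", %w[root model Class])    # true
```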
The path expression grammar follows these rules:
- Path segments are separated by ::
- The separator can be escaped with a backslash: \::
- An absolute path starts with ::
- Each segment can contain:
  - Regular characters (including Unicode)
  - Wildcards (*, ?)
  - Character classes ([abc])
  - Alternatives ({pattern1,pattern2})
When matching paths with escaped colons, the escaped sequences are treated as part of the segment name:
path = Lutaml::Path.parse("model::std\\::string")
path.match?(["model", "std::string"]) # => true
path.match?(["model", "std", "string"]) # => false
# Reference a class in a package
"model::shapes::Rectangle"
# Reference an operation on a class
"model::shapes::Rectangle::area"
# Reference a property in a nested class
"model::university::Student::Address::street"
# Find IShape in any immediate subpackage of model
"model::*::IShape"
# Match any element in UMLProfile whose name ends with "Stereotype"
"model::profiles::UMLProfile::*Stereotype"
These paths can be used to locate elements across UML model hierarchies, making it easier to reference and work with model elements programmatically.