Regular Expression

Meghna J edited this page Nov 9, 2023 · 1 revision

Design of the regular expression:

Field: name

^(?i)(?:(?:[a-z’-]+(?:\\s+[a-z’-]+)*),?\\s+[a-z’-]+(?:\\s+[a-z’-]+)*(?:\\s+[a-z’-]+)?)|(?:(?:[a-z’-]+(?:\\s+[a-z’-]+)*),?\\s*[a-z’-]+(?:\\s*[a-z’-]+)*[’’]?[a-z’-]*(?:\\s+[a-z’-]+)?)$

Explanation:

  • ^ : Match the start of the string
  • (?i) : Set case-insensitive mode
  • (?:(?:[a-z’-]+(?:\s+[a-z’-]+)*),?\s+[a-z’-]+(?:\s+[a-z’-]+)*(?:\s+[a-z’-]+)?): First name format, where:
    • (?:[a-z’-]+(?:\s+[a-z’-]+)*) : Match one or more letters (either case, because of (?i)), apostrophes, or hyphens, optionally followed by further such words, each preceded by one or more whitespace characters
    • ,? : Optional comma
    • \s+ : One or more whitespace characters
    • [a-z’-]+ : Match one or more letters, apostrophes, or hyphens
    • (?:\s+[a-z’-]+)* : Optional additional names, each preceded by one or more whitespace characters
    • (?:\s+[a-z’-]+)? : Optional additional name at the end, preceded by one or more whitespace characters.
    • | : Or operator
  • (?:(?:[a-z’-]+(?:\s+[a-z’-]+)*),?\s*[a-z’-]+(?:\s*[a-z’-]+)*[’’]?[a-z’-]*(?:\s+[a-z’-]+)?) : Second name format, where:
    • (?:[a-z’-]+(?:\s+[a-z’-]+)*) : Match one or more letters, apostrophes, or hyphens, optionally followed by further such words, each preceded by one or more whitespace characters
    • ,? : Optional comma.
    • \s* : Zero or more whitespace characters
    • [a-z’-]+(?:\s*[a-z’-]+)* : Match one or more letters, apostrophes, or hyphens, followed by zero or more further runs of the same characters, each optionally preceded by whitespace.
    • [’’]?[a-z’-]* : Optionally match an apostrophe, followed by zero or more letters, apostrophes, or hyphens.
    • (?:\s+[a-z’-]+)? : Optional additional name at the end, preceded by one or more whitespace characters.
  • $ : Match the end of the string

The regular expression above is intended to accept names in the following formats:

  • Bruce Schneier
  • Schneier, Bruce Wayne
  • O’Malley, John F.
  • John O’Malley-Smith
  • Schneier, Bruce
  • Cher
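
The name pattern can be tried out with a minimal Java sketch like the one below. It assumes full-string matching via Matcher.matches(), since the page does not say how the pattern is applied. The class name NameCheck and the method isValidName are hypothetical. The escape \u2019 stands for the typographic apostrophe (’) used in the character classes, and the class [a-z’-] is used throughout, as in the explanation. Because the alternation is not wrapped in a group, ^ binds only to the first alternative and $ only to the second; matches() sidesteps this by requiring the whole input to match. Under that assumption, every listed example is accepted except O’Malley, John F., whose period is outside the character class.

```java
import java.util.regex.Pattern;

final class NameCheck {
    // \u2019 is the typographic apostrophe that appears in the page's character classes.
    static final Pattern NAME = Pattern.compile(
        "^(?i)(?:(?:[a-z\u2019-]+(?:\\s+[a-z\u2019-]+)*),?\\s+[a-z\u2019-]+(?:\\s+[a-z\u2019-]+)*(?:\\s+[a-z\u2019-]+)?)"
      + "|(?:(?:[a-z\u2019-]+(?:\\s+[a-z\u2019-]+)*),?\\s*[a-z\u2019-]+(?:\\s*[a-z\u2019-]+)*[\u2019\u2019]?[a-z\u2019-]*(?:\\s+[a-z\u2019-]+)?)$");

    // Assumption: the whole input must match, not just a prefix or suffix.
    static boolean isValidName(String input) {
        return input != null && NAME.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "Bruce Schneier",
            "Schneier, Bruce Wayne",
            "O\u2019Malley, John F.",
            "John O\u2019Malley-Smith",
            "Schneier, Bruce",
            "Cher"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + isValidName(sample));
        }
    }
}
```

If the pattern were applied with a partial match such as Matcher.find() instead, the first alternative could match a prefix of the input, so inputs with trailing invalid characters would also pass.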

Field: phoneNumber

^(?:\\+?\\d{1,3}\\s*)?(?:\\(\\d{3}\\)|\\d{3})[-.\\s]?\\d{3}[-.\\s]?\\d{4}$

Explanation:

  • ^ : Match the start of the string.
  • (?:\+?\d{1,3}\s*)? : Optional country code and space(s), where \+? matches an optional "+" character, \d{1,3} matches 1 to 3 digits, and \s* matches zero or more whitespace characters.
  • (?:\(\d{3}\)|\d{3}) : Three digits enclosed in parentheses or three digits without parentheses. The | is an "or" operator that matches the expression on either side of it.
  • [-.\s]? : Optional separator character, where [-.\s] matches any one of "-", ".", or a whitespace character, and ? makes the separator optional.
  • \d{3} : Match three digits.
  • [-.\s]? : Optional separator character, as explained above.
  • \d{4} : Match four digits.
  • $ : Match the end of the string.

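The regular expression is intended to validate numbers in the following formats. As written, it requires a three-digit area code followed by seven digits, so from this list it accepts only (703)111-2121, +1(703)111-2121, 1(703)123-1234, and 011 701 111 1234: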

  • 12345
  • (703)111-2121
  • 123-1234
  • +1(703)111-2121
  • +32 (21) 212-2324
  • 1(703)123-1234
  • 011 701 111 1234
  • 12345.12345
  • 011 1 703 111 1234
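
The phone pattern can be checked against the listed numbers with a short Java sketch such as the following. The class name PhoneNumberCheck and the method isValidPhone are hypothetical, and full-string matching with Matcher.matches() is an assumption, since the page does not show the calling code. Running it prints, for each listed input, whether the pattern as written accepts it.

```java
import java.util.regex.Pattern;

final class PhoneNumberCheck {
    // The pattern exactly as given for the phoneNumber field.
    static final Pattern PHONE = Pattern.compile(
        "^(?:\\+?\\d{1,3}\\s*)?(?:\\(\\d{3}\\)|\\d{3})[-.\\s]?\\d{3}[-.\\s]?\\d{4}$");

    // Assumption: the whole input must match the pattern.
    static boolean isValidPhone(String input) {
        return input != null && PHONE.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "12345",
            "(703)111-2121",
            "123-1234",
            "+1(703)111-2121",
            "+32 (21) 212-2324",
            "1(703)123-1234",
            "011 701 111 1234",
            "12345.12345",
            "011 1 703 111 1234"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + isValidPhone(sample));
        }
    }
}
```

Supporting the shorter and extension-style entries would need additional alternatives in the pattern; this sketch only reports what the current pattern does.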

Assumptions

The regular expressions are guided by the following assumptions:

  • Names only contain unaccented Latin letters (a-z, in either case), hyphens, and apostrophes.
  • Any other character causes the input to be rejected.

Cons of the Approach

Names that contain any character outside this set, such as accented letters, periods, or digits, are flagged as invalid.
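
To illustrate this limitation, the sketch below runs the name pattern against a few inputs that fall outside the assumed character set. It is only an illustration: the class name NameLimitations and the method accepts are hypothetical, full-string matching with Matcher.matches() is assumed, and \u2019 stands for the typographic apostrophe (’) in the character classes. Note that the plain ASCII apostrophe (') is not in the class, so names typed with it are rejected as well.

```java
import java.util.regex.Pattern;

final class NameLimitations {
    // Name pattern from this page; \u2019 is the typographic apostrophe.
    static final Pattern NAME = Pattern.compile(
        "^(?i)(?:(?:[a-z\u2019-]+(?:\\s+[a-z\u2019-]+)*),?\\s+[a-z\u2019-]+(?:\\s+[a-z\u2019-]+)*(?:\\s+[a-z\u2019-]+)?)"
      + "|(?:(?:[a-z\u2019-]+(?:\\s+[a-z\u2019-]+)*),?\\s*[a-z\u2019-]+(?:\\s*[a-z\u2019-]+)*[\u2019\u2019]?[a-z\u2019-]*(?:\\s+[a-z\u2019-]+)?)$");

    // Assumption: the whole input must match the pattern.
    static boolean accepts(String input) {
        return input != null && NAME.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "Jean-Luc Picard",          // hyphen is allowed
            "John F. Kennedy",          // period is not in the character class
            "Jos\u00e9 Garc\u00eda",    // accented letters are not in a-z
            "Patrick O'Brien"           // ASCII apostrophe differs from \u2019
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + accepts(sample));
        }
    }
}
```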