Skip to content

A simple tool to format paragraphs with indentation, indentation character, alignment, padding (left, right, both), padding characters (left, right, both), and in-line whitespace characters.

License

Notifications You must be signed in to change notification settings

vdmeer/asciiparagraph

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

54 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ASCII Paragraph

Resources

Home page

site

Source repository

Github

Maven Central current release

current release 0.1.1

Maven Central all releases

all releases

Java 7 legacy version source repository

GitHub

Java 7 legacy version Maven releases

Maven Central

UTF-8 Howto, older blog

older blog

UTF-8 Howto, updated Wiki

updated SKB Wiki

Installation

Maven

For Maven declare a dependency in the <dependencies> section of your POM file.

Dependency declaration in pom.xml
<dependency>
    <groupId>de.vandermeer</groupId>
    <artifactId>asciiparagraph</artifactId>
    <version>0.1.1</version>
</dependency>

Gradle / Grails

compile 'de.vandermeer:asciiparagraph:0.1.1'

For other build systems see Maven Central

Features

Text paragraph with some flexibility for alignment, format, padding, margins, frames, and additional strings per line:

  • add text, as often as required in many different formats (string, text provider, render provider, ST, clusters),

  • removes all excessive white spaces (tabulators, extra blanks, combinations of carriage return and line feed),

  • 6 different alignments: left, right, centered, justified, justified last line left, justified last line right,

  • 3 different formats: first line, hanging, dropped capital letter

  • flexible width, set for text and calculated in many different ways for rendering

  • padding characters for left and right padding (configurable separately)

  • character for in-text white spaces

  • additional string at the start of each line

  • additional string at the end of each line

  • a frame around the text

  • flexible left/right margins configurable individually by length and character between frame, strings and text

  • flexible top/bottom margins configurable individually by length between frame and text

  • top/bottom/left/right margins outside a frame

  • character conversion to generated text suitable for further process, e.g. for LaTeX and HTML

Concepts

The main concepts are: paragraph, paragraph context, and paragraph renderer.

Anatomy of a paragraph

The figure below shows all spacing characteristics of a paragraph. The outer rectangle (using +, -, and | characters) marks the most outer part of a paragraph. This is followed by top, bottom, left, and right frame margins.

The next rectangle (using the UTF-8 double line characters , , , , , and ) shows an example frame. This is followed by top / bottom text margins and left / right string margins.

Left and right we find the start and end strings that can be set. This is followed by the left / right text margins.

The last rectangle (using the single line UTF-8 characters , , , , , and ) marks the actual paragraph. Here we have the text, plus optional padding and indentations.

+----------------------------------------------------------------------------------------------+
|                                                                                              |
|                                      Top Frame Margin                                        |
|                                                                                              |
|--------╔════════════════════════════════════════════════════════════════════════════╗--------|
|        ║                                                                            ║        |
|        ║                             Top Text Margin                                ║        |
|        ║                                                                            ║        |
|        ║--------+--------+--------┌─────────────────────┐--------+--------+---------║        |
|        ║        |        |        │                     │        |        |         ║        |
|  F  M  ║  S  M  |  S  S  |  T  M  │      Paragraph      │  T  M  |  E  S  |  S   M  ║  F  M  |
|  r  a  ║  t  a  |  t  t  |  e  a  │      text with      │  e  a  |  n  t  |  t   a  ║  r  a  |
|  a  r  ║  r  r  |  a  r  |  x  r  │      padding &      │  x  r  |  d  r  |  r   r  ║  a  r  |
|  m  g  ║  i  g  |  r  i  |  t  g  │     indentation     │  t  g  |     i  |  i   g  ║  m  g  |
|  e  i  ║  n  i  |  t  g  |     i  │                     │     i  |     g  |  n   i  ║  e  i  |
|     n  ║  g  n  |        |     n  │                     │     n  |        |  g   n  ║     n  |
|        ║        |        |        │                     │        |        |         ║        |
|        ║--------+--------+--------└─────────────────────┘--------+--------+---------║        |
|        ║                                                                            ║        |
|        ║                            Bottom Text Margin                              ║        |
|        ║                                                                            ║        |
|--------╚════════════════════════════════════════════════════════════════════════════╝--------|
|                                                                                              |
|                                     Bottom Frame Margin                                      |
|                                                                                              |
+----------------------------------------------------------------------------------------------+

Paragraph

A paragraph is a collection of text strings. The strings are processed as follows:

  • add text to the paragraph, multiple times if required

  • the paragraph will separate each added text using a space

  • for rendering a paragraph, all excessive white spaces will be removed

    • tabulators (converted to spaces),

    • more than one consecutive space,

    • line feed,

    • carriage return, and

    • line feed and carriage return.

Format of a paragraph

Paragraphs can be formatted using a number of special formats. Currently implemented are

  • First line - an indentation for the first line of the paragraph

  • Hanging paragraph - an indentation for everything but the first line

  • Dropped capital letter - a large capital letter for the first character of the first sentence spanning multiple lines

Text alignment

Text in the paragraph can be aligned in multiple different ways:

  • align left (open ended right site)

  • align right (open ended left site)

  • centered (all lines centered)

  • justified (all line justified)

  • justified with last line left aligned

  • justified with last line right aligned

Text padding

All lines will use padding to create a paragraph with equal length of each line. The padding on the left and the right depends on the text alignment:

  • align left: no padding left (all lines bound), padding on the right

  • align right: no padding on the right (all lines bound), padding on the right

  • centered: padding on both sides of each line

  • justified: no padding at all, each line starts and finishes with a word (or single character)

  • justified last line left align: padding only for the last line, on the right site

  • justified last line right align: padding only for the last line, on the left site

The characters being used for padding can be set separately, so that each site of a line gets a different padding character.

In-text white spaces

With all excessive white spaces removed, each line only contains single blanks. The exception to this rule are all justified paragraphs (here extra white spaces are added to give the impression of a justified paragraph).

The implementation allows to change the character used for in-text white spaces from the default (a blank) to any other character.

Start and end strings

Each line of a paragraph can be started and terminated by a specific (different or identical) string. These strings are outside the text area, i.e. no special formatting is done on those strings.

Margins

A paragraph has several margins for the left and right sides as well as for top and bottom. Each margin can be set - the width for let/right side margins and the height for top and bottom margins. Additionally, a character can be set for left/right margins (the same or different characters for each side).

Frame

A paragraph can also be framed. A frame is

  • a line above the paragraph,

  • borders for each line of the paragraph (on the left and right side),

  • and a line at the bottom of of the paragraph.

The frame is set as a frame theme. A number of those themes are provided in the skb-interfaces package. New themes can be created very easily, using ASCII and/or UTF-8 characters.

Summary of configurable characteristics

  • text width (length of each text line)

  • text alignment (for the whole paragraph): left, right, centered, justified (with additional options for last line)

  • text format: first line, hanging, dropped capital letter

  • frame: set a frame around the paragraph

  • start / end string: define a start and/or end string for each line

  • top and bottom margins above a frame (empty lines)

  • margins on the left and right of a frame (number with character)

  • margins between the frame and the start string (left) and end string and frame (right), using different length and character

  • margins between start string and text (left), and text and end string (right), using different length and characters

  • top and bottom margins for the text (including and string margin and string)

  • character converters to convert characters before line generation, i.e. to generate text suitable for LaTeX or HTML

Context

While the paragraph only maintains the text, the paragraph context maintains all configurable characteristics of the paragraph (see above). The current implementation directly has

  • paragraph alignment (default being justified, last line left)

  • paragraph format (default being none)

  • paragraph width (default being 80)

  • an optional library for dropped capital letters (default being not set)

  • an optional theme for a frame (default being not set)

The following characteristics are handled by special objects (one for each), which the context provides access to:

  • indentations (for first line and hanging paragraph)

  • all margins

  • all characters

  • all strings

  • all character (and target) translators

Additionally, the context provides a number of helper methods for rendering

  • different calculations for width, starting with simple text width and finishing with an all inclusive width

  • convenience methods to jointly set margins and characters, for the same left/right or top/bottom pairs

The paragraph can be initialized with a given context or plain, in which case it will create its own context object. Any future characteristics will be added to the paragraph context

Renderer

The actual rendering of a paragraph is realized by special render objects (i.e. it’s not done in the paragraph or its context). A paragraph can be rendered in two different ways:

  1. call the provided render methods on the paragraph object itself

  2. use a specialized render object

No changes are made to the paragraph text or any context settings by any render operation. All required text being processed and calculations being made will happen inside the renderer.

The render methods on the paragraph allow to render it (a) to the width set in the context or (b) to an overall required width. The first option is the most simple one: fill paragraph with text, set width on context, render. The second option can be used by other applications, for instance a table, to get a paragraph of required width.

For any other render operations use the provided standard renderer or create your own render object. The default renderer does currently provide render methods to different width with calculations provided by the context.

Note: coming soon: It also provides render methods that use their own context (i.e. ignore the context set in the paragraph). This allows for extremely flexibility in using the paragraph in many different scenarios.

Getting Started

The standard usage is:

  • create a paragraph

  • add text to the paragraph

  • change the paragraph context (to change its properties)

  • render the paragraph

  • use the created string, e.g. print it to a console or write it to a file

First, create a paragraph.

AsciiParagraph ap = new AsciiParagraph();

Next, add text. Any text can be added, the renderer will process the text (for instance remove excessive white spaces).

ap.addText("line	1");
ap.addText("2  2");
ap.addText("more text with	tab and \n newline");
ap.addText("some more text to get it over the 80 character default width");

Next, render the paragraph. This will provide the text output using the default settings from the paragraph’s context.

String rend = ap.render();

Finally, print the paragraph to standard out.

System.out.println(rend);

The output will be:

line 1 2 2 more text with tab and newline some more text to get it over  the  80
character default width

Examples

The following examples are using the classic "Lorem Ipsum" text as content.

Width

Width of 50, 40, and 30 on the same text.

Lorem ipsum dolor sit amet, consetetur sadipscing
elitr, sed diam nonumy eirmod tempor invidunt ut
labore et dolore magna aliquyam

Lorem ipsum dolor sit amet, consetetur
sadipscing elitr, sed diam nonumy eirmod
tempor invidunt ut labore et dolore
magna aliquyam

Lorem ipsum dolor sit amet,
consetetur sadipscing elitr,
sed diam nonumy eirmod tempor
invidunt ut labore et dolore
magna aliquyam

Whitespace handling

The paragraph will remove all additional white spaces so that the resulting text has words separated by 1 space. All tabulators, line feeds, and carriage returns will be removed.

"c2  c2"        // string with 1 extra blank
"c3   c3"       // string with 2 extra blanks
"c4    c4"      // string with 3 extra blanks

"t1	t1"                       // string with a tabulator
"t2		t2"               // string with 2 tabulators
"t3			t3"       // string with 3 tabulators
"t4\t\t\t\tt4"                    // string with 4 escaped tabulators

// a more complex construct using StringUtils to add CR and LF
"word followed by " + StringUtils.CR + " followed by" + StringUtils.LF + " followed by \n"

Using left alignment and a width of 60 the rendered output will be:

c2 c2 c3 c3 c4 c4 t1 t1 t2 t2 t3 t3 t4 t4 word followed by
followed by followed by

Conditional line breaks

Use HTML entities <br> and <br/> to render text

"line 1<br>"
"line 2<br/>"
"line three \n still line three"

to

line 1
line 2
line three still line three

In-text white space replacement

Remaining blanks in text can be automatically replaced by other characters

Lorem˽ipsum˽dolor˽sit˽amet,                Lorem—ipsum—dolor—sit—amet,
consetetur˽sadipscing˽elitr,˽sed           consetetur—sadipscing—elitr,—sed
diam˽nonumy˽eirmod˽tempor˽invidunt         diam—nonumy—eirmod—tempor—invidunt
ut˽labore˽et˽dolore˽magna˽aliquyam         ut—labore—et—dolore—magna—aliquyam

Text alignment

Left and right.

Lorem ipsum dolor sit amet, consetetur        Lorem ipsum dolor sit amet, consetetur
sadipscing elitr, sed diam nonumy                  sadipscing elitr, sed diam nonumy
eirmod tempor invidunt ut labore et              eirmod tempor invidunt ut labore et
dolore magna aliquyam erat, sed diam            dolore magna aliquyam erat, sed diam
voluptua. At vero eos et accusam                    voluptua. At vero eos et accusam

Centered and justified.

Lorem ipsum dolor sit amet, consetetur        Lorem ipsum dolor sit amet,  consetetur
   sadipscing elitr, sed diam nonumy          sadipscing  elitr,  sed   diam   nonumy
  eirmod tempor invidunt ut labore et         eirmod tempor  invidunt  ut  labore  et
 dolore magna aliquyam erat, sed diam         dolore magna aliquyam  erat,  sed  diam
   voluptua. At vero eos et accusam           voluptua.  At  vero  eos   et   accusam

Justified last line left and right.

Lorem ipsum dolor sit amet,  consetetur        Lorem ipsum dolor sit amet,  consetetur
sadipscing  elitr,  sed   diam   nonumy        sadipscing  elitr,  sed   diam   nonumy
eirmod tempor  invidunt  ut  labore  et        eirmod tempor  invidunt  ut  labore  et
dolore magna aliquyam  erat,  sed  diam        dolore magna aliquyam  erat,  sed  diam
voluptua. At vero eos et accusam                      voluptua. At vero eos et accusam

Text format

First line indentation and hanging paragraph.

Lorem   ipsum   dolor   sit   amet,            Lorem  ipsum  dolor  sit  amet,
consetetur  sadipscing  elitr,  sed        consetetur  sadipscing  elitr,  sed
diam nonumy eirmod tempor  invidunt        diam nonumy eirmod tempor  invidunt
ut labore et dolore magna  aliquyam        ut labore et dolore magna  aliquyam
erat, sed diam  voluptua.  At  vero        erat, sed diam  voluptua.  At  vero
eos et accusam                             eos et accusam

Dropped capital letters.

ooooo          orem ipsum dolor sit        #        orem ipsum dolor sit amet,
`888'          amet,     consetetur        #        consetetur      sadipscing
 888           sadipscing    elitr,        #        elitr,  sed  diam   nonumy
 888           sed   diam    nonumy        #        eirmod tempor invidunt  ut
 888           eirmod        tempor        #        labore  et  dolore   magna
 888       o   invidunt  ut  labore        ######   aliquyam  erat,  sed  diam
o888ooooood8   et   dolore    magna                 voluptua. At vero  eos  et
               aliquyam  erat,  sed        accusam
diam  voluptua.  At  vero  eos   et
accusam

Left/Right text margin (width and character)

Lorem ipsum dolor sit amet,                                Lorem ipsum dolor sit amet,
consetetur sadipscing elitr,                              consetetur sadipscing elitr,
sed diam nonumy eirmod tempor                            sed diam nonumy eirmod tempor
invidunt ut labore et dolore                              invidunt ut labore et dolore
magna aliquyam erat, sed diam                            magna aliquyam erat, sed diam
voluptua. At vero eos et                                      voluptua. At vero eos et
accusam                                                                        accusam

     Lorem ipsum dolor sit amet,                      Lorem ipsum dolor sit amet,
     consetetur sadipscing elitr,                    consetetur sadipscing elitr,
     sed diam nonumy eirmod tempor                  sed diam nonumy eirmod tempor
     invidunt ut labore et dolore                    invidunt ut labore et dolore
     magna aliquyam erat, sed diam                  magna aliquyam erat, sed diam
     voluptua. At vero eos et                            voluptua. At vero eos et
     accusam                                                              accusam

>>>>>>>>>>Lorem ipsum dolor sit amet,            Lorem ipsum dolor sit amet,<<<<<<<<<<
>>>>>>>>>>consetetur sadipscing elitr,          consetetur sadipscing elitr,<<<<<<<<<<
>>>>>>>>>>sed diam nonumy eirmod tempor        sed diam nonumy eirmod tempor<<<<<<<<<<
>>>>>>>>>>invidunt ut labore et dolore          invidunt ut labore et dolore<<<<<<<<<<
>>>>>>>>>>magna aliquyam erat, sed diam        magna aliquyam erat, sed diam<<<<<<<<<<
>>>>>>>>>>voluptua. At vero eos et                  voluptua. At vero eos et<<<<<<<<<<
>>>>>>>>>>accusam                                                    accusam<<<<<<<<<<

Extra text at each line start and line end

Normal paragraph, added start string "// ", added end string " -→"

Lorem ipsum dolor sit amet, consetetur  sadipscing
elitr, sed diam nonumy eirmod tempor  invidunt  ut
labore et dolore magna  aliquyam  erat,  sed  diam
voluptua.    At    vero     eos     et     accusam

// Lorem ipsum dolor sit amet, consetetur  sadipscing
// elitr, sed diam nonumy eirmod tempor  invidunt  ut
// labore et dolore magna  aliquyam  erat,  sed  diam
// voluptua.    At    vero     eos     et     accusam

// Lorem ipsum dolor sit amet, consetetur  sadipscing -->
// elitr, sed diam nonumy eirmod tempor  invidunt  ut -->
// labore et dolore magna  aliquyam  erat,  sed  diam -->
// voluptua.    At    vero     eos     et     accusam -->

Render to maximum width, calculating additional padding and strings

Render to text width, text - strings, text - strings - margin

Lorem ipsum dolor sit amet, consetetur  sadipscing
elitr, sed diam nonumy eirmod tempor  invidunt  ut
labore et dolore magna  aliquyam  erat,  sed  diam
voluptua. At vero eos et accusam

// Lorem  ipsum   dolor   sit   amet,   consetetur
// sadipscing elitr, sed diam nonumy eirmod tempor
// invidunt ut labore  et  dolore  magna  aliquyam
// erat, sed diam voluptua. At vero eos et accusam

// Lorem  ipsum  dolor  sit  amet,  consetetur -->
// sadipscing elitr, sed  diam  nonumy  eirmod -->
// tempor invidunt ut labore et  dolore  magna -->
// aliquyam erat, sed diam voluptua.  At  vero -->
// eos et accusam                              -->

//           Lorem  ipsum  dolor   sit   amet, -->
//           consetetur sadipscing elitr,  sed -->
//           diam   nonumy    eirmod    tempor -->
//           invidunt  ut  labore  et   dolore -->
//           magna  aliquyam  erat,  sed  diam -->
//           voluptua. At vero eos et accusam  -->

Frames

Examples of a few frames with different frame modes:

┌─────────────────────────┐        ┌                         ┐        ─────────────────────────
│                         │           Lorem ipsum dolor sit             Lorem ipsum dolor sit
│  Lorem ipsum dolor sit  │             amet, consetetur                  amet, consetetur
│    amet, consetetur     │           sadipscing elitr, sed             sadipscing elitr, sed
│  sadipscing elitr, sed  │        └                         ┘        ─────────────────────────
│                         │
└─────────────────────────┘

Code documentation (using frames)

Standard single line, multi-line, and doc comments using frames:

//                             /*                             /**
// Lorem ipsum dolor            * Lorem ipsum dolor            * Lorem ipsum dolor
// sit amet, consetetur         * sit amet, consetetur         * sit amet, consetetur
// sadipscing elitr,            * sadipscing elitr,            * sadipscing elitr,
// sed                          * sed                          * sed
//                              */                             */

Comments for bash scripts, normal and with a double-hashmark variation

#                         ##
# Lorem ipsum dolor       ## Lorem ipsum dolor
# sit amet, consetetur    ## sit amet, consetetur
# sadipscing elitr,       ## sadipscing elitr,
# sed                     ## sed
#                         ##

Comments for LaTeX and HTML

%                         <!--                      -->
% Lorem ipsum dolor       <!-- Lorem ipsum dolor    -->
% sit amet, consetetur    <!-- sit amet, consetetur -->
% sadipscing elitr,       <!-- sadipscing elitr,    -->
% sed                     <!-- sed                  -->
%                         <!--                      -->

Target translator for LaTeX character conversion

Left side w/o and right side with LaTeX target converter:

A sentence with some normal text,          A sentence with some normal text,
not specific to LaTeX. Now for some        not specific to LaTeX. Now for some
characters that require conversion:        characters that require conversion:
# % &. And some more: © § ¤. And           \# \% \&. And some more:
even more: È É Ê Ë. And some arrows        {\copyright} {\S} \currency. And
as well: ← ↑ → ↓ ↔                         even more: \`{E} \'{E} \^{E} \"{E}.
                                           And some arrows as well:
                                           \(\leftarrow{}\) \(\uparrow\)
                                           \(\rightarrow{}\) \(\downarrow{}\)
                                           \(\leftrightarrow{}\)

Target translator for HTML character conversion

Left side w/o and right side with HTML target converter

A sentence with some normal text,           A sentence with some normal text,
not specific to HTML. Now for some          not specific to HTML. Now for some
characters that require conversion:         characters that require conversion:
# % & < >. And some more: © § ¤. And        &#803; &#37; &amp; &lt; &gt;. And
even more: Ē ē Ĕ ĕ Ė ė Ę ę Ě ě. And         some more: &copy; &sect; &curren;.
some arrows as well: ← ↑ → ↓ ↔ ↕            And even more: &#274; &#275; &#276;
                                            &#277; &#278; &#279; &#280; &#281;
                                            &#282; &#283;. And some arrows as
                                            well: &larr; &uarr; &rarr; &darr;
                                            &harr; &#8597;

About

A simple tool to format paragraphs with indentation, indentation character, alignment, padding (left, right, both), padding characters (left, right, both), and in-line whitespace characters.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages