Prolegomenon to Merits and Squalor of Lightweight Markup Languages
Any writing system, whatever medium carries it, needs a system for
formatting. Even in antiquity and the Middle Ages, copyists and writers
used methods for formatting written text: in codices we can see red
ink, underlining, larger or smaller pens, and a few symbols. In modern
times, after the IT explosion, we have tried a number of markup systems
for formatting text. To this end, people have invented many markups,
each with its own rules, tools, and file extensions. Among these
artifacts, two umbrella concepts are important: markup languages and
word processors.
Needless to say, these concepts can be seen as two sides of a coin,
since word processors rely on markup systems. In that sense word
processors are secondary: theoretically speaking, one chooses a markup
language first and a word processor only afterwards, although in
reality the word processor sometimes imposes its own markup.
I can guess that some readers might find a detailed discussion of
markup languages unnecessary and pedantic, because the topic may seem
shallow and skin-deep. It is not. Sometimes the word processor or the
markup language interferes with the writer's thinking and interrupts
the process. Thinking, at its core, is fragile, liquid, and fugitive;
one small event can demolish a thought in such a way that the thinker
cannot retrieve it. Is that not important? Additionally, there is no
justification for having a lot of formatting options whose role is
usually dispensable. However, I should be more precise on these points,
so let me define a handful of terms. Along the way, giving
easy-to-remember, verbose definitions would not be a bad idea, as
nonsensical explanations sometimes fly around the web. A few months ago
I saw a post titled "why HTML is not a programming language?" with six
reasons to prove it! When has HTML ever been a programming language?
Never. It has always been a markup language: the acronym stands for
HyperText Markup Language, with "markup" stated right in the name,
though people may forget.
Markup Languages, Lightweight Markups and WYSIWYG
Defining markup, lightweight markup, and WYSIWYG (What You See Is What You Get) makes the question at hand clear enough to proceed. As the audience of these words is the general public, I will avoid academic jargon and commit to a more understandable style. My preferred definitions of these terms are as follows:
- Markup Language: a markup language is a system of rules (perhaps
  codes and tags) that shapes the layout of a document, formats its
  text and words, and makes it more readable for human readers. These
  goals are achieved mostly by wrapping text within the width of the
  screen, differentiating parts of the text with different typefaces,
  making titles and their levels stand out with larger and bolder
  letters, and putting emphasis on words and phrases with bold,
  italics, underlining, and color. Markup languages are mostly
  understood in opposition to raw text formats, structured-data
  formats, and so on. Raw text, like a document in a .txt file, is not
  markup: everything is the same size, with no bold, no italics, and no
  wrapping; therefore, if you open the file in a text editor, a line of
  300 words remains one single line, so that you have to use the
  horizontal scroll bar to see the rest. Likewise, in structured-data
  documents, reading the data is not easy, even when the data are just
  plain expressions.
- Lightweight Markup Languages: a lightweight markup language is a
  markup language with the simplest, least obtrusive, and fewest
  possible rules and tags, i.e. the smallest set of rules and tags that
  is enough for ordinary text to be read with ease by humans. As human
  expression carries meaning in different ways, a markup language has
  to make room for all of them; it is worth noting that emphasis is
  meaningful in statements and plays a role. Hence, a markup has to
  include at least one option for emphasis. In addition, in case of
  distraction, a markup language must offer rules and features that
  help readers focus on the text and find the line or word they are
  looking for. A markup language that provides these features, along
  with a few others, is a lightweight markup language. "Lightweight" is
  used in contrast to common markups; that is, there is no counterpart
  term such as "heavyweight". Common markups that are not lightweight
  serve more technical purposes, such as scientific papers that need
  charts, text in different writing directions, complicated tables,
  etc. When a document needs more complicated formats, objects, and
  styles, a lightweight markup would not be helpful. Thus we will
  always need formats like .tex (LaTeX), .odt (OpenDocument), .xml
  (eXtensible Markup Language), .docx (MS Word), .pages (Apple Pages),
  etc. to fulfill these tasks.
- WYSIWYG: this term, an acronym for "What You See Is What You Get", is
  not a markup per se; it describes software that takes the user's
  commands through a graphical interface instead of through written
  tags (or codes). Putting WYSIWYG next to the markup definitions makes
  the list heterogeneous, and I am aware of that. The reason is that a
  number of formats, like Microsoft DOCX (or the earlier DOC version)
  or Apple PAGES, are known through the word processor that produces
  them in a WYSIWYG way; insofar as they are closed, proprietary
  formats, we can take the processor in place of the format. This issue
  is discussed further below.
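To make the definitions above a little more concrete, here is a minimal sketch in Python of how few rules a lightweight markup needs. It assumes a tiny Markdown-like subset (a leading "#" for titles, "*word*" for emphasis, "**word**" for strong emphasis) and turns it into HTML tags. The function name render_lightweight and the sample sentence are hypothetical and serve only as illustration; this is not how any real Markdown processor is implemented.

```python
import re


def render_lightweight(text: str) -> str:
    """Convert a tiny Markdown-like subset to HTML (illustration only)."""
    html_lines = []
    for line in text.splitlines():
        # "# Title" becomes a level-1 heading, "## Title" level 2, and so on.
        heading = re.match(r"^(#{1,6})\s+(.*)$", line)
        if heading:
            tag = f"h{len(heading.group(1))}"
            body = heading.group(2)
        else:
            tag = "p"
            body = line
        # Strong emphasis first, so "**word**" is not consumed by the "*word*" rule.
        body = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", body)
        body = re.sub(r"\*(.+?)\*", r"<em>\1</em>", body)
        # Skip blank lines instead of emitting empty paragraphs.
        if body.strip():
            html_lines.append(f"<{tag}>{body}</{tag}>")
    return "\n".join(html_lines)


source = "# Notes\nThinking is *fragile* and **fugitive**."
print(render_lightweight(source))
```

The writer types only the unobtrusive marks on the source side; the heavier tags appear only in the generated output, which is the division of labor a lightweight markup aims for.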
Here I do not deal with formats such as epub, pdf, djvu, etc., because
this discussion concerns the formats that writers use in the first
place. EPUB, PDF, and DJVU are not used for writing; rather, after the
document has been prepared, it is converted to these formats for
distribution among readers.
Lightweight Markups Versus WYSIWYGs
As mentioned above, markup languages and WYSIWYG word processors are
not of the same kind, so listing them together is not technically
correct. One reason was stated above: these word processors are known
by their formats, which are not open, so only the same processor can
create them. But there is another important reason for comparing
lightweight markups with WYSIWYGs:
Lightweight markup languages have been displaced and overcome by
WYSIWYGs; that is, people prefer formats such as docx or odt because
they can be created by a WYSIWYG word processor.
As long as we want to deal with lightweight markups, rather than carry
out a kind of Aristotelian classification, such an approach serves the
purpose well.
The Advent of WYSIWYGs
As WYSIWYG (What You See Is What You Get) word processors have conquered the 21st century, one might wonder why we should use a lightweight markup language, or any other markup language whose formatting needs to be written down, instead of using the mouse and toolbars. It is a smart question that may lead us somewhere good. Ordinary people, who are not familiar with the many alternatives to those WYSIWYGs, answer this question simply and naively with phrases like:
There is no good in using something as boring as lightweight markups when we have MS Word, Google Docs, and, even in the open source community, LibreOffice, AbiWord, and Calligra.
I stress the word "see" in "What You See Is What You Get", because that
word might be the key to understanding the situation. Most of the time,
people who are not in favor of lightweight markups find them difficult
to deal with or to learn, or the markups simply seem weird to them.
Markdown, CommonMark, other Markdown flavors (GitHub Flavored, Extra,
R, etc.), Textile, reStructuredText, Gemtext, RDoc, AsciiDoc, a couple
of other formats, and most recently Jdot all stand in the same line
when it comes to this specific question.
When we, the people who prefer lightweight markups over, let us say,
the MS Word format .docx, state that preference, it might be taken as
the sub-culture of a community, a kind of behavior we engage in just to
be different and to distinguish ourselves from our fellow humans. Do
we? Absolutely not, although justifying this point needs a lot of
groundwork first. On the other hand, suppose we are right about
lightweight markups: why, then, have we not succeeded, and why are
lightweight markups still not popular? That is another question, which
must be addressed after the first one.
The Road-map
I would divide the discussion into two main sections, in which the phrase "Love and Squalor" is borrowed from the title of a short story by J. D. Salinger, "with Love and Squalor":
- the merit and love of lightweight markup languages;
- the demerit and squalor of lightweight markups.
In the first section we look into markups, in particular lightweight
ones, and in doing so take their merits into account. The reason for
repeating "lightweight" again and again is that comparing other
markups, such as HTML, LaTeX, or odt (OpenDocument), with other formats
needs a different approach, which does not necessarily apply to
lightweight ones. A markup language like HTML or LaTeX raises different
issues, so it is wise to stick to lightweights. In this section we
scrutinize lightweight markups and compare them with WYSIWYGs, such as
.docx (the MS Word format). It is worth noting one point in this
comparison: lightweight markups like Markdown are essentially document
formats, and so is .docx; comparing MS Word with Markdown is not a good
comparison, as the former is an application and the latter is a
document format. But this is just a figure of speech, since a
well-formed .docx can be generated only by MS Word. Other applications,
like LibreOffice, can produce .docx files, but their output behaves in
strange ways.
Regarding the second section, we will take a look at the dark side of
lightweight markups, i.e. what keeps them from being popular. Perhaps
this second part is the more important one, as many people do not care
about reasons but seek performance, simplicity, and user-friendliness.
You can lecture a group about avoiding a certain thing and win their
approval, but they will not abandon it unless there is an alternative.