Recording and Tracking Work In Progress: Issue Trackers, Mailing Lists, Wikis, etc

AndyGlew edited this page Jan 12, 2021 · 6 revisions

Many tools for tracking and organizing work

There are many different tools where you can discuss, record, and/or track work in progress. These include

  • issue trackers
  • mailing lists
and probably less common
  • wikis
  • fora such as website forums, newsgroups, notes files
  • shared file systems
  • pull request documentation
Each has its own pros and cons. Many projects use more than one. Here (Recording and Tracking Work In Progress: Issue Trackers, Mailing Lists, Wikis, etc.) I discuss them, and espouse my personal BKMs (best known methods).

Need Terminology

I wish I had a good, commonly recognized term for this sort of thing, one that covers mailing lists, mailing list archives, wikis, other forums, etc.

I often use the term "Personal Information Management (PIM)", which most people seem to recognize, but that is personal rather than project-wide.

A term with the same acronym is "Project Information Management (PIM)", but my experience is that fewer people are comfortable with that term.

"Information Management" is more generic, but often has a more specific meaning, particularly in the "Information Retrieval (IR)" community.

Most of these systems need to have an element of "hypertext", but that's a particular technology used to implement such a system. Some systems are not hypertext, i.e. some systems do not have links, or at least do not have links that are automatically understood by whatever software you are using. IMHO that is suboptimal, but it exists, so "hypertext" is perhaps restrictive, particularly in a historical sense. Email especially frequently does not have links, in particular not links to other email messages.

I sometimes say email/issue/wiki here, but those are particular kinds of software. More generically I say "discussion/tracking/reference", but again that is not a term that is obvious to most people. Plus, there are some categories of information that we would like to track, such as calendars, that are not obviously covered by these terms.

So: I don't have a good name, but I'm just gonna write this up anyway.

Table of Contents

discussion versus tracking (issues and resolution) versus reference

  • *Discussion*: participants in a project discuss aspects of the project, e.g. in meetings, email, newsgroups, website forums - and also in tools that are not necessarily the best for such discussions. It is great to archive discussions, e.g. to automatically save email in a web-accessible place, but such archives aren't good for everything. Discussions are great. However, discussions meander and fork: witness how often subject lines do not correspond to the contents of the message. In a mailing list it is often hard to figure out what the final resolution of an issue is. Even if you have a convention like sending out email with the title "RESOLVED: ...", there will frequently be responses to such emails, often several of them, and often having nothing to do with the resolution.
  • *Issue Tracking*: projects often need to track particular issues, and track whether they have been resolved or not. While you can start discussions about issues in tools optimized for discussion, such as a mailing list, it is good to have a special purpose tool like an issue tracker where you can see a list of all issues, whether they have been resolved or not, who is assigned to work on something, whether they have been deferred to the future, etc.
BTW, I much prefer "issue tracking" to "bug tracking". Sometimes issues are not bugs, but for the most part all bugs are issues that need to be resolved, although frequently the resolution is trivial.

Discussions (email, newsgroups) and Issues (JIRA, etc.) need only weak names. Most of us don't even think about email items as having names, although of course they need unique identifiers in the email software. Subject lines are great, but subject lines are far from unique. Issue trackers typically assign issue numbers sequentially, albeit possibly within certain groups or topics. Issues can have subject lines, but again they might not be unique.

But there is more to projects than just issues and resolutions. There is also *reference* material. There are FAQs (Frequently Asked Questions). There is rationale. Long discussions in email or issue strings may be summarized or boiled down. Just as with issues you want to find the final resolution, in such reference material you may want to find just a summary of the discussion, not the long history of emails.

  • Formal documentation and specifications are the ultimate references. Or at least they should be. By this I mean both drafts and documentation that has actually been published, ratified, or accepted by some standardization group. However, formal documentation using techpubs tools such as LaTeX or AsciiDoc can be a real pain to maintain. Often a fairly expensive build process is necessary to create the reference from its component files. Often the build process and/or techpubs tools require debugging. I.e. overall, formal documentation in things like LaTeX and AsciiDoc is often fairly heavyweight. Even a single maintainer may not want to do it more than once a week or so, although it's great when you've got tools that automatically regenerate things like the PDF publication format from the LaTeX or AsciiDoc every time an edit is made. Formal docs are usually best managed with one or a few editors per document, or at least per section or chapter, although multiple contributors may submit changes.
Shared file systems are also reference tools. E.g. you can upload presentations or benchmark reports. However, shared file systems quickly become a collection of randomly named objects. Their organization and their relationships can be hard to observe. Shared file systems typically want strong names: each object needs a unique name. People adopt naming conventions like embedding dates or usernames in the filenames, but these conventions change over time. You can use directory structure to provide some organization, but that can be hard to maintain. Symlinks, etc., can provide more flexibility, but they are also hard to maintain.

Wikis are IMHO the quintessential lightweight reference tools, both for community oriented discussions and for projects. They are lightweight: you can edit them very quickly, much more quickly than you can edit a formal specification or document. They can quickly be reorganized. There are many examples of wikis that have been edited by many members of the community, ranging from the [original](https://wiki.c2.com) to Wikipedia. But you can also arrange them to have the editor/publisher model.

mailing lists

Mailing lists are probably the least common denominator. Pretty much everybody has access to email nowadays. (Although believe it or not there are some people who have web access but who do not have email access. Almost equivalently, there are some people who find that there is so much spam and other wastage on email that they no longer use it.)

Let's assume that all email is archived, and can be searched. Let us further assume that the email archives are append-only, for the most part not edited or subject to deletion.

The problem with mailing lists is that they are a real pain to search. It is often hard to figure out what the final resolution of something is. Sure, there are search commands. Idiosyncratic search tools are suboptimal, but many sites have generic search, like site-specific Google.

Searching is made more painful by the habit of quoting an entire discussion. Often this results in multiple hits for the same item, requiring mental effort to disambiguate. At the very least your search engine should be able to ignore such quotes if the quoted content appears earlier in the same email thread as the search result, but not ignore them in other cases.
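A minimal sketch of that quote-ignoring rule, assuming quoted lines follow the common convention of starting with `>` and that the thread is available oldest first. The function name `searchable_text` is hypothetical, not part of any real archive tool.

```python
import re

# One or more ">" markers at the start of a line, as used by most mail clients.
QUOTE_PREFIX = re.compile(r"^\s*(>\s?)+")

def searchable_text(thread):
    """Return, for each message body, the lines worth indexing.

    `thread` is a list of message bodies, oldest first. A quoted line is
    dropped only if its text already appeared earlier in the same thread;
    quotes of material from elsewhere are kept.
    """
    seen = set()   # lines already indexed from earlier messages
    result = []
    for body in thread:
        kept = []
        for line in body.splitlines():
            is_quote = QUOTE_PREFIX.match(line) is not None
            text = QUOTE_PREFIX.sub("", line).strip()
            if not text:
                continue
            if is_quote and text in seen:
                continue  # duplicate of an earlier message in this thread
            kept.append(text)
        seen.update(kept)
        result.append(kept)
    return result
```

Matching whole lines exactly is a simplification; rewrapped quotes would need fuzzier matching.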

Email discussions meander (as do nearly all discussions). Witness how often subject lines do not correspond to what is being discussed.

Email archiving is great, but... At the very least it should be possible to link to an email in the archive from (a) other emails, (b) issue trackers, and (c) references such as wikis. One of the problems with separating email from email archives is that often you do not know what such a link is at the time you read your email. You have to go from your email reader to the archive software to discover what the link is. That extra step is friction, and makes it harder to use links. This is one of the reasons why people have a habit of quoting the entire email they are referring to.

Email (archives) typically reflects a history. We do not/should not go back and edit the email archives, or else the historical view is lost. But the historical viewpoint is not the only viewpoint we want for project documentation.

issue trackers

Issue trackers like JIRA, Bugzilla, and the GitHub issue tracker do a better job than email of (a) ensuring that things don't fall through the cracks, (b) summarizing final decisions, etc.

Issue trackers are nearly always backed by a database, or are at least searchable, so you can find out which bugs are high priority or low priority, what has been fixed and what has not been fixed, etc.

Issue trackers nearly always have discussions or threads: items posted or otherwise attached to an issue. These are really discussions, almost like email. And, unfortunately, like any email system they often meander, become voluminous, and so on.

Some issue trackers are append-only. E.g. you cannot modify the issue description after you've posted it, and you can only post new things, appending them to the end of the issue's sequential discussion list. This is suboptimal.

One of the most important things that issue trackers need to do is make it possible to find out not just that an issue has been resolved, but what that resolution is. Adding it to the end of the discussion list might seem to be the answer, but only if the issue is immediately closed. Oftentimes the discussion continues, especially if there are bugs in what you hitherto thought was the final resolution. But also if there is a partial resolution - an issue has three subcomponents, and only one is resolved.

It is therefore important for issue trackers to be able to point to the current resolution or plan of record. This can be done in many ways:

  • e.g. if you have append only items, you might still have a pointer or link to the current resolution item(s)
  • there might be a special item in which the current resolution(s) are described
  • if you don't have any special support for resolutions, it might be possible to edit the original issue discussion and add the final resolution there (and to edit it further if that final resolution turns out not to be the final resolution)
Similarly, although perhaps less important, it can be useful to be able to edit the original issue description. Oftentimes we figure out more aspects of an issue. Oftentimes an issue begins as a simple bug report, and then analysis figures out what the actual code problem is.

Note: while it is important to be able to have the latest and greatest description of and resolution for an issue, it's also great to have the history. At the very least in a version control system. If not that, keep the sequential discussion.
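A minimal sketch of the first option above: an append-only discussion plus a pointer to the current resolution item, so the history is kept and the latest resolution is still easy to find. The `Issue` class and its method names are hypothetical and do not describe any particular tracker's data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Issue:
    number: int
    title: str
    comments: List[str] = field(default_factory=list)  # append-only history
    resolution_index: Optional[int] = None              # current resolution item

    def add_comment(self, text: str) -> int:
        """Append a discussion item and return its position."""
        self.comments.append(text)
        return len(self.comments) - 1

    def resolve(self, text: str) -> None:
        """Append a resolution item and move the pointer to it."""
        self.resolution_index = self.add_comment(text)

    def current_resolution(self) -> Optional[str]:
        """Return the text the pointer refers to, or None if unresolved."""
        if self.resolution_index is None:
            return None
        return self.comments[self.resolution_index]
```

If discussion continues after `resolve()` and a later `resolve()` supersedes it, the earlier resolution stays in `comments` as history while `current_resolution()` returns only the newest one.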

Observe: whereas email is nothing except a sequence of historical items, issue trackers may have such a sequence in the discussion, but also have summaries of the latest and greatest: the hopefully final, or at least best so far, resolution. Nearly all issue trackers have a status, such as complete. IMHO it is best if issue trackers also have latest and greatest text descriptions, although not all do.

Most issue tracking systems assign issue numbers sequentially, although perhaps in several domains, possibly even hierarchically nested. Issues can be given names, but those names may not be unique, and keeping them unique may lead to very clumsy names, so issue numbers are frequently the most convenient human accessible tracking identifier.

wikis

Wikis are lightweight reference tools. They are, or at least should be, easier to edit than formal documentation. (Should be - the term wiki comes from the Hawaiian word for "fast". If it is not fast, it is not a wiki.)

Wikis can be edited by a community, but they need not be.

Wiki pages are usually referenced by human readable names. As usual, these should ideally live in a nested hierarchical namespace, although that is unfortunately uncommon. (E.g. Atlassian Confluence, probably the most widely used commercial wiki at the moment, only has non-nested namespaces. Ditto Wikipedia.)

IMHO one of the most important things that make a wiki a wiki is that you can easily create references, including references to pages that do not exist yet. The references typically use a human readable page name, although it should also be easy to decouple the text that is displayed in the wiki page from the name of the page that it links to.

Because wiki pages use a human readable page name, renaming pages is often an issue. Some wiki systems give you the option of doing a global search/replace when you change a human readable page name, but that doesn't help links that are outside the scope of that search/replace - e.g. in email, or your own notebook. Unique identifiers for pages, numeric or hash, augmenting the human readable name, are more resilient. WYSIWYG wiki systems may hide that from the user, but the unique identifier may need to appear in a URL if the wiki is web-accessible, although perhaps hidden in the HTML.
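A minimal sketch of pairing each page with a permanent identifier so that renames do not break links. The `WikiIndex` class is hypothetical; a random UUID is just one assumed choice of identifier, and a real wiki would persist these tables.

```python
import uuid

class WikiIndex:
    """Maps human readable page names to permanent identifiers."""

    def __init__(self):
        self.name_by_id = {}  # permanent id -> current page name
        self.id_by_name = {}  # every name ever used -> permanent id

    def create(self, name):
        page_id = uuid.uuid4().hex
        self.name_by_id[page_id] = name
        self.id_by_name[name] = page_id
        return page_id

    def rename(self, old_name, new_name):
        page_id = self.id_by_name[old_name]
        self.name_by_id[page_id] = new_name
        self.id_by_name[new_name] = page_id  # old name remains as an alias

    def resolve(self, name_or_id):
        """Return the current page name for a name, old name, or id."""
        page_id = self.id_by_name.get(name_or_id, name_or_id)
        return self.name_by_id.get(page_id)
```

Links stored as identifiers survive any rename; links stored as old names still resolve because the name history is kept, which covers links in email or notebooks that a search/replace cannot reach.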

Because wiki page names are human readable, wiki pages are more discoverable than email archives and even issue trackers. Wiki users can often successfully guess what a page name should be, especially if they have name completion. Wikis that have no hierarchy, however, are less discoverable than filesystems or wikis with hierarchical names. Hierarchy improves discoverability.

Unfortunately, wikis can often degrade and meander in much the way that email archives and issue trackers meander. *wikis need maintenance*, organization and reorganization. You don't talk about reorganizing an email archive. Issue trackers have a relatively flat, sequential namespace. But wikis can be reorganized, just like any document. A wiki can have a logical organization that links to both email archives and numbered issue tracking items.

IMHO everything needs history tracking, i.e. version control, and wikis are no exception. But whereas email archives are primarily organized from a historical viewpoint, wikis are not. Wikis should be primarily organized in a logical manner. Issue trackers are a bit of both.

Other

Newsgroups, Web Forums versus Email

There are other tools optimized for discussions, including

  • USENET newsgroups,
  • the original UIUC PLATO notesfiles (arguably the first online communities)
  • many, many, many website forums
These are characterized by weak naming, and a primarily historical point of view, although frequently there is threading, whether by subject or by "in response to" links.

Email is currently the least common denominator, although in the early days of the net, before the World Wide Web, newsgroups probably were.

Email is not always archived in a broadly accessible manner. If email is not archived, then it is hard for a new user to go back through history.

Email is fundamentally point-to-point, with mailing lists as multicast. Newsgroups, etc., are fundamentally multicast. Nearly all newsgroups are archived in some manner, or at least have some recent history.

If email is not archived, there is no way to link to it. Even if email is archived, the link may not be apparent in the email that people receive. It may be necessary to go to the separate email archive tool to discover a link. That's a pain. Ideally the mailing list manager will also embed at least a link to the archive system, and ideally a link to the individually archived email item.
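A minimal sketch of a list manager stamping each outgoing message with its archive link, so that readers do not have to visit the archive to discover it. The `Archived-At` header field is defined by RFC 5064; the archive URL and the hash-based link scheme here are assumptions for illustration only, and `add_archive_link` is a hypothetical helper.

```python
import hashlib
from email.message import EmailMessage

ARCHIVE_BASE = "https://lists.example.org/archive/"  # hypothetical archive URL

def add_archive_link(msg: EmailMessage) -> str:
    """Add an Archived-At header derived from the Message-ID; return the link."""
    message_id = str(msg["Message-ID"])
    # Assumed scheme: a short hash of the Message-ID names the archived item.
    digest = hashlib.sha256(message_id.encode("utf-8")).hexdigest()[:16]
    link = ARCHIVE_BASE + digest
    msg["Archived-At"] = "<" + link + ">"
    return link
```

Because the link is computed from the Message-ID alone, the list manager and the archive can agree on it without talking to each other, provided both use the same scheme.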

Tools such as newsgroups usually make linking more first-class.

Website forums make linking intrinsic: you can't be viewing a forum page on a website unless you have the URL to it. However, some website managers do not make such URLs persistent, i.e. they do not have permalinks.

All discussions fork and meander. Witness how often email subjects do not correspond to the message contents. Newsgroups have the same problem. UIUC PLATO notesfiles tried to limit the damage of such divergence by preventing arbitrary branching. A particular primary post could only have one sequential list of replies - and in some configurations a branch off that main branch. But higher degrees of branching required creating a new post. Also, PLATO permitted subjects to be changed, and subtrees (for the limited branching) to be extracted, i.e. PLATO allowed some post facto reorganization, although it did not permit editing of content once posted[*].

Blogs and Comments

Blogs are another mixture. Blogs are typically maintained by a single user or a small community. However, most blog software has a sequential discussion thread of responses attached to a particular blog post.

Using More Than One

Any given project may use more than one such technology at the same time. E.g. projects will typically have mailing lists, and also issue trackers. Similarly, projects may also have wikis and shared file systems and... Oftentimes there is more than one wiki, or more than one email distribution system.

All of these technologies have overlapping capabilities:

  • e.g. discussions may be done both on email lists and in an issue tracker
  • plans of record may be recorded in an issue tracker (best) but also in a wiki
    • and worst-case in email and nowhere else
It can be very easy to get lost. Somebody may look for a historical discussion of an issue in an issue tracker when it is in fact in a mailing list.

Each such tool has its own search facilities (we may hope), but often these search facilities are not integrated.

But it is not just a question of searching. Users also need to browse, when they do not know what they should be searching for.

Perhaps the best thing to do is to create an integrated search facility, e.g. make them all accessible to Google, ideally so that a single search qualification like site:riscv.org will find all of the email archives and issue trackers and so on. Failing a single site (which riscv.org has definitely not done), at least document which sites need to be searched:

   SEARCH-TERM (site:riscv.org or site:github.com/riscv or site:lists.riscv.org or ... organization Google  or other cloud file system... ).  

It should be possible to create a web front end doing that.
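A minimal sketch of such a front end's core: building the multi-site query shown above and turning it into a search URL. The site list is taken from the example in the text; the function names are hypothetical. As far as I know Google expects the `OR` operator in uppercase, so the sketch uses that.

```python
from urllib.parse import urlencode

# Sites from the example above; a real project would maintain its own list.
SITES = ["riscv.org", "github.com/riscv", "lists.riscv.org"]

def multi_site_query(term, sites=SITES):
    """Combine a search term with an OR-ed list of site: restrictions."""
    site_clause = " OR ".join("site:" + s for s in sites)
    return f"{term} ({site_clause})"

def search_url(term, sites=SITES):
    """Return a Google search URL for the combined query."""
    return "https://www.google.com/search?" + urlencode(
        {"q": multi_site_query(term, sites)}
    )
```

Since the output is an ordinary query string, users can paste it into the search engine they already know and extend it with whatever advanced syntax that engine supports.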

But of course, people prefer to be able to use tools they already know, not just the site-specific tool you've created, especially if your organization/site/project specific tool is not as capable as the tools people already know, e.g. if it does not accept all of the Google or Bing or DuckDuckGo advanced search syntax.

Failing integrated search, a useful step is to forward things like updates to the issue tracker or wiki to the mailing list, which is itself archived. At least then everything ends up archived in the same place.

However, users often do not want to see every change to an issue tracker. They may wish to only see particular issues. Issue trackers typically have subscription systems, but people have to go and manually set them up.

One approach to combining the benefits of automatically forwarding changes to issues and other material to a mailing list with the wish not to see every change is to create an archive-only version of the mailing list: a version that does not forward messages to everybody, but which saves them.

Stepping back, we can see that the problem is that searching and archiving are actually orthogonal to the email/issue/wiki/whatever technology. We certainly want unified searching and browsing, although we frequently do not have that. Unified archiving, archiving in the same place or at least in a related or adjacent place, is the next best thing. Forwarding to an email list accomplishes this because email is in many ways the least common denominator.
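A minimal sketch of that routing, assuming per-issue subscriptions: every update is saved to the archive-only list, but it is delivered only to people who subscribed to that issue. The `Notifier` class is hypothetical and keeps everything in memory; it does not reflect any real mailing list manager.

```python
from collections import defaultdict

class Notifier:
    def __init__(self):
        self.archive = []                    # archive-only list: saves everything
        self.subscribers = defaultdict(set)  # issue number -> subscriber addresses
        self.outbox = []                     # (address, message) pairs to deliver

    def subscribe(self, address, issue_number):
        self.subscribers[issue_number].add(address)

    def publish(self, issue_number, message):
        """Archive every update; forward it only to that issue's subscribers."""
        self.archive.append((issue_number, message))
        for address in sorted(self.subscribers.get(issue_number, ())):
            self.outbox.append((address, message))
```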

Unified linking

Above we have discussed unified searching, browsing, and archiving.

An aspect that is often neglected is linking.

It should be possible to easily and permanently link between all of the different technologies: email/issue/wiki, discussion/tracking/reference.

Easily: e.g. it should be obvious what a link is, ideally in an email header.

Permanently: permalinks, need I say more. Note that permalinks nearly always require a unique numeric or hash identifier, possibly in addition to a human readable name. If a link uses only a human readable name, ideally history can map it to a permalink identifier.

Discoverability is a third aspect, often neglected. A linking system that uses only numeric identifiers (like Google Drive) makes it hard, practically impossible, to find similar documents. Hierarchy helps: if either the URL or pathname indicates a parent directory or folder or category that you can link to, or if the object itself contains a link to similar categories.

Linking, discoverability, and browsability are very much related.

Linking means that you can establish a link, although the link may not be human readable.

Discoverability and browsability mean that you can find adjacent information in a human friendly manner.

Content (> text)

Many innovations in such systems have been restricted to text. E.g. the very first wikis were purely text-based. Next comes uploading files and linking to them. Far fewer systems allow copy/paste of things like bitmap images or vector drawings into their content.

Uploading files and linking is a minimal capability, but it has its shortcomings. At the very least it requires a few extra steps. But it also requires the user to manage uploaded filenames, which themselves frequently decay, in the sense that links to them may be lost, naming conventions may be lost, etc.

Object embedding can be very useful - with copy/paste or at least drag-and-drop - since the user does not need to create a name or link. But obviously text-only systems have trouble with this.

E.g. GitHub wiki and issues cannot do embedded objects. They can link to uploaded files. One of Atlassian Confluence's best advantages as a wiki is that it handles embedded objects. Last time I checked, Atlassian JIRA did not handle embedded objects. Because of that limitation, I often created a wiki page in Confluence corresponding to issues that were best described as lots of small bitmap images with the errors in the documentation highlighted. Much faster than describing in ASCII text what the problem is; faster than uploading and linking, especially when there are lots of small subproblems. And you can always have a JIRA or Bugzilla issue link to such a wiki page. However, they don't live in the same system, so you can clone a repository but such information may actually disappear. :-(

One of the advantages of using email archives as the backing storage system is that most modern email systems support HTML, and many support embedded objects, at least MIME types.

Unfortunately, some users are still using non-HTML email. And even HTML email is inconsistently supported: many users report that they cannot read the HTML that Microsoft Outlook produces, even though they read most other HTML email.

Furthermore ... one must always be aware that embedded objects have security implications. One might hope that bitmap objects are relatively secure, but there have been bugs. Certainly embedded PDFs and PowerPoints and ... have had plenty of security bugs. IMHO we must support advanced datatypes and embedded objects, but it's completely reasonable to ask "are you sure" before you display or otherwise interrogate such an email with embedded objects. God help us when just the MIME header browser has security bugs, which has happened :-(
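A minimal sketch of the "are you sure" check, using Python's standard `email` package: it reports whether a message carries any non-text part, which a viewer could use to decide when to prompt. The function name `needs_confirmation` is hypothetical, and treating every non-text part as risky (and every text part as safe) is a simplifying assumption.

```python
from email import message_from_string
from email.policy import default

def needs_confirmation(raw_message: str) -> bool:
    """Return True if the message has any part that is not plain or HTML text."""
    msg = message_from_string(raw_message, policy=default)
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no content of their own
        if part.get_content_maintype() != "text":
            return True  # e.g. image/png, application/pdf
    return False
```

Note that this check itself has to parse the MIME headers, so it does not help against bugs in the header parsing.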
