Over the past three decades, I have spent most of my writing time typing LaTeX for academic papers, research presentations, assignments and even examination papers. As is natural for a computer scientist, I have been improving my workflow for many years to maximize productivity while minimizing manual labour. Here is the LaTeX workflow with which I feel comfortable.

LaTeX Foundations

There is no need to dwell on the convenience of using LaTeX for academic writing. If LaTeX is unfamiliar to you, read some foundational material first, such as the LaTeX WikiBook.

For the workflow related to LaTeX writing, I recommend the following cross-platform packages:

  • LaTeX distribution: TeX Live, or an online service such as Overleaf
  • TULIP-Lab LaTeX Packages: GitHub Templatex, including powerdot-tuliplab which can be installed from CTAN
  • *.tex editor: TeXstudio, or Visual Studio Code with extensions such as LaTeX Workshop, Markdown All in One, etc.
  • *.bib editor: Zotero with Better BibTeX
  • git repository: a private GitHub repository, Bitbucket or Overleaf (which I use mainly as another git server, for collaboration when no local LaTeX distribution is installed). All of these can coexist as remote repositories for a single paper. For example, Overleaf can link to, push to and pull from a private git repository.
  • git client: SmartGit (free for non-commercial use)
  • Changes Tracking: LaTeXDiff and Git-LaTeXDiff, for version difference extraction and visualization

Zotero & Better BibTeX

More detailed coverage of Zotero can be found at: Tools Zotero

Zotero is free and open-source reference management software for bibliographic data and related research materials. In particular, you can use it to manage all the PDF files you have collected. Its notable features include web browser integration, online syncing, and generation of in-text citations, footnotes and bibliographies.

Better BibTeX (BBT) is an extension for Zotero that makes it easier to manage bibliographic data, especially for people authoring documents using text-based toolchains including LaTeX. For LaTeX users, it generates citation keys automatically and consistently on export, and the key format can be customized. You can use either the default configuration or one agreed configuration whenever you export items into a bib file.

Grammar/Spelling Checking for LaTeX

ASpell

Many excellent spell checkers for LaTeX are available, such as ASpell, which can be incorporated into the compilation process. Before you compile, you can run:

aspell -t -c report.tex

This lets you spell-check the whole file interactively. The -t option tells the spell checker that the file is in TeX or LaTeX format, so that it ignores macros.

If you prefer to simply get a list of misspelled words non-interactively, you can run:

cat file.tex | aspell list -t | sort | uniq
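
To run the non-interactive check over several files at once, the command above can be wrapped in a small shell function. This is only a minimal sketch: the function name list_misspelled is an assumption made for illustration, and it presumes that aspell and a suitable dictionary are installed.

```shell
# Hypothetical helper (the name is an assumption, not part of aspell):
# print the unique misspelled words found in each .tex file given as argument.
list_misspelled() {
    for f in "$@"; do
        echo "== $f"
        # -t puts aspell in TeX/LaTeX mode, so macros are ignored.
        aspell list -t < "$f" | sort -u
    done
}

# Usage: list_misspelled report.tex mainbody.tex
```

Each file gets its own header line, so a word reported under mainbody.tex can be fixed there directly.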

Grammar Checking Tools

Queequeg

Queequeg is a simple command-line tool aimed exactly at finding concordance errors in English. It works with LaTeX sources out of the box.

qq -t mainbody.tex

TeXtidote

TeXtidote is a command-line tool built on LanguageTool that can check the grammar of LaTeX files.

java -jar textidote.jar --html mainbody.tex > checkreport.html

Polymorphic Editions

As a researcher, you might need to prepare different editions of the same technical report for different submission channels: a conference, a journal, etc.

You should be aware of the difference between two terms:

  • Revision: One particular state of the files in a project, positioned along one or more lines of the development history
  • Edition: One particular variation which has a purpose in being different from other variations, such as the conference edition, the journal edition, the technical report edition etc.

Notice that I have avoided using the word version, which is too ambiguous.

The method to maintain multiple editions is to create a configurable master file, and to include it from a series of top-level files, one for each edition that is required. To illustrate the general principles of making polymorphic editions, consider the following example, which uses Victor Eijkhout's comment package and Henrik Skov Midtiby's todonotes package. Suppose there are three editions, built from four TeX files:

  • The conference edition: with some block for conference only
  • The journal edition: with some block for journal only
  • The report edition: with all blocks

Conference.tex:

%=================================================================
\usepackage{comment}
%=================================================================
%
\excludecomment{JournalOnly}
\includecomment{ConferenceOnly}
%
%=================================================================
\input{mainbody}
...

Journal.tex:

...
%=================================================================
\usepackage{comment}
%=================================================================
%
\includecomment{JournalOnly}
\excludecomment{ConferenceOnly}
%
%=================================================================
\input{mainbody}
...

Report.tex:

...
%=================================================================
\usepackage{comment}
%=================================================================
%
\includecomment{JournalOnly}
\includecomment{ConferenceOnly}
%
%=================================================================
\input{mainbody}
...

mainbody.tex:

...
% The comment package is loaded by the top-level file, not here.
% Each \begin and \end of a comment environment must sit on its own line.
\begin{ConferenceOnly}
We have \SI{10}{\hertz}, \si{\kilogram\metre\per\second}, the range: \SIrange{10}{100}{\hertz}. $\nicefrac[]{1}{2}$.
\missingfigure{Make a sketch of the structure of a trebuchet.}
\end{ConferenceOnly}
\begin{JournalOnly}
This is a paragraph which is only available in journal edition, and the conference one will not include it.
\end{JournalOnly}
...
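
With the four files above in place, each edition is compiled from its own top-level file. The loop below is a minimal sketch of building all three in one go. It assumes that latexmk is installed and that the top-level files sit in the current directory; the function name build_editions is made up for illustration.

```shell
# Hypothetical helper: compile every edition from its top-level file.
build_editions() {
    for edition in Conference Journal Report; do
        # -pdf asks latexmk to produce a PDF; stop at the first failing edition.
        latexmk -pdf -interaction=nonstopmode "$edition.tex" || return 1
    done
}

# Usage: build_editions
```

Because all editions share mainbody.tex, a single edit to the body is picked up by every edition on the next build.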

Collaborative Writing in Git + LaTeX

Git is mainly used as a repository for code. Since a LaTeX source is plain text, much like a script, it is a natural choice to put your writing under a version control system such as git. You may wonder what the advantages are over GUI tools such as Microsoft Word. I can quickly list some of the things that I find hard to do in Microsoft Word:

  • Maintain a full editing history, and quantify the line by line contributions from co-authors;
  • Tag, release and branch the project;
  • Easily create a PDF with revision details;
  • Add comments that stay hidden in the LaTeX source;
  • Let multiple authors write at the same time, on the same file!

Setup Git Repositories

Optional: One Repository in local texmf tree For All Papers

A local texmf tree is the place for package-like artifacts that are not proper packages managed through your package manager (e.g. MiKTeX or TeX Live). How to set up a local texmf tree is beyond the scope of this post, but it isn't hard: on Windows, it is most likely in your home directory under ~/texmf/tex/; on macOS, please put it under /usr/local/texlive/texmf-local. After adding files, refresh the file name database: for MiKTeX on Windows, you can run mktexlsr; for TeX Live, you can type the following command:

cd
sudo texhash

The local texmf tree means that you never have to use absolute, or even relative paths for your bibliography bib files. It is also very useful as many conferences and journals distribute their own templates and styles not through CTAN, but just as .sty and .bst files.

Put the .bib in local texmf tree

In our team, we have several common bibliographic files in the BibTeX format: tuliplab.bib, deeplearning.bib, tourism.bib, hospitality.bib, etc. They reside in our common texmf tree in the subfolder /bibtex/bib/. Hence, all members can specify the bibliography using only the file name (such as \bibliography{tuliplab}, without the full path), no matter where their working copy of the common texmf tree is located.

Add local texmf tree to git

Once you have created that tree, add it to git. Then, whenever you work on a new computer, check out the repo from git and tell your TeX distribution about the local texmf tree.
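
The first step can be sketched as follows. This is an illustration under stated assumptions, not a prescribed procedure: the function name init_texmf_repo is hypothetical, and git must already know your name and email for the commit to succeed.

```shell
# Hypothetical helper: put an existing local texmf tree under version control.
init_texmf_repo() (
    cd "$1" || exit 1
    git init -q &&
    git add -A &&
    git commit -q -m "Add local texmf tree"
)

# Usage: init_texmf_repo ~/texmf
# On a new computer with TeX Live, the per-user tree location is reported by
#   kpsewhich -var-value=TEXMFHOME
# so cloning your repository into that directory makes TeX find the files.
```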

One Repository Per Paper

There are many free or publicly available platforms for Git, such as GitHub or Bitbucket. Create a new repository on your chosen git platform, and initialize it with all the necessary .tex and .bib files. We recommend the following folder structure:

├── .git                    # Git folder automatically created upon git checkout.
│   ├── ...
│   ├── gitHeadInfo.gin     # auto generated git information
│   └── hooks               # store those triggers so that gitinfo can be auto generated
├── Data                    # contains the training data, test data, validation data sets
├── Code                    # contains the source code, script, or Python notebook
└── Report                  # tex, bib files
    ├── figures             # subfolder
    ├── report.tex          # working draft master file, you should use this for CS papers
    ├── report-htm.tex      # working draft master file, you should use this for HTM papers
    ├── report-lncs.tex     # template master file for Springer conferences LNCS format
    ├── report-ieee.tex     # template master file for IEEE conferences format
    ├── report-acm.tex      # template master file for ACM conferences format
    ├── mainbody.tex        # mainbody TeX file for your paper
    ├── tuliplab.bib        # BIB file for tuliplab publications
    ├── yourbib.bib         # BIB file for your own research paper's references
    ├── slides.tex
    ├── poster.tex
    └── ...
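
The skeleton of this layout can be created with a few commands. The sketch below is only one possible way to do it: the function name scaffold_paper is hypothetical, and it creates just the folders and three empty placeholder files, leaving the templates to be copied in by hand.

```shell
# Hypothetical helper: create the folder skeleton of a new paper repository.
scaffold_paper() (
    root=$1
    mkdir -p "$root/Data" "$root/Code" "$root/Report/figures" || exit 1
    cd "$root/Report" || exit 1
    # Create empty placeholders without overwriting existing files.
    for f in report.tex mainbody.tex yourbib.bib; do
        [ -e "$f" ] || : > "$f"
    done
)

# Usage: scaffold_paper ~/MyFancyPaper && git init ~/MyFancyPaper
```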

For version information and change tracking, you need the following packages:

  • GitInfo2: This package requires some configuration under your project's .git/hooks folder, and after that it will work like a charm.

    You can test it by checking out, committing to or pulling your repository, and it should generate/update the file .git/gitHeadInfo.gin in the local project repository, such as ~/MyFancyPaper/.git/gitHeadInfo.gin.

    Here is one example of using GitInfo2 in the LaTeX file.

  • LaTeXDiff and Git-LaTeXDiff: LaTeXDiff takes two LaTeX files and produces a marked-up LaTeX file that shows their differences once compiled to PDF, while Git-LaTeXDiff combines git and LaTeXDiff so that you can easily view the difference between two committed versions.

    git latexdiff HEAD~N                           # diff between your work tree and the N commits back
    git latexdiff HEAD~1 --main MyFile.tex         # can specify the main tex file
    git latexdiff HEAD~5 HEAD~3 --main MyFile.tex  # changes from the 5th to the 3rd commit back (older revision first)

    Here is one sample output of the difference PDF:
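
Going back to GitInfo2: the hook configuration mentioned in the list above amounts to copying the sample hook that ships with the package into .git/hooks under three names. The sketch below assumes you have located post-xxx-sample.txt in the gitinfo2 documentation folder of your TeX distribution; the function name and both arguments are hypothetical.

```shell
# Hypothetical helper: install the gitinfo2 sample hook as the three hooks
# that regenerate .git/gitHeadInfo.gin.
install_gitinfo_hooks() (
    sample=$1   # path to post-xxx-sample.txt from the gitinfo2 package
    repo=$2     # root folder of the paper repository
    for hook in post-checkout post-commit post-merge; do
        cp "$sample" "$repo/.git/hooks/$hook" || exit 1
        chmod +x "$repo/.git/hooks/$hook" || exit 1
    done
)

# Usage: install_gitinfo_hooks /path/to/post-xxx-sample.txt ~/MyFancyPaper
```

Hooks are not transferred by git clone, so this step has to be repeated in every fresh working copy.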

LaTeX git workflow

Git-Flow

If you are not yet comfortable with branches, you can use git in the old fashion of svn. However, branches allow much more flexible development of the project, and you should learn about them from here.

Once you have some general knowledge of git branches, I recommend using Git-Flow, a branch-based workflow that supports teams and projects where deployments are made regularly and development is continuous. A good tutorial on it can be found at Bitbucket.

Typically, you will have two major kinds of branches:

  • The master branch should be treated as the main flow of your work; it should always be production-ready, that is, in a ready-to-submit state.
  • The develop branch runs parallel to the master branch; it is the main branch where the source code at HEAD always reflects the latest delivered development changes for the next release.

Next to the main branches master and develop, the development model uses a variety of supporting branches to aid parallel development between collaborators, ease tracking of features, prepare for submission and to assist in quickly fixing problems.

  • Feature branches: branch off from and merge back into develop branch. They are used to develop new subsections (features) for the paper.
  • Release branches: branch off from develop and merge back into develop and master. When the state of develop is ready for a submission, we branch off and give the release branch a name reflecting the submission, such as IJCAI2018.
  • Hotfix branches: branch off from master and merge back into develop and master. They are typically used to fix identified issues that affect all existing working branches, so we use them when revising based on the review comments of a submission.
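
The feature and release moves described above can be written with plain git commands. The functions below are a minimal sketch rather than the git-flow tool itself: the function names and the feature/ and release/ prefixes are assumptions, and the repository is expected to already have master and develop branches.

```shell
# Hypothetical helpers: git-flow style moves for a paper, using plain git.

# Start a feature branch (for example a new subsection) from develop.
start_feature() {
    git checkout -q -b "feature/$1" develop
}

# Merge a finished feature back into develop and delete the branch.
finish_feature() {
    git checkout -q develop &&
    git merge -q --no-ff -m "Merge feature/$1 into develop" "feature/$1" &&
    git branch -q -d "feature/$1"
}

# Branch off a release named after the submission, such as IJCAI2018.
start_release() {
    git checkout -q -b "release/$1" develop
}

# Merge the release into master and develop, then delete the branch.
finish_release() {
    git checkout -q master &&
    git merge -q --no-ff -m "Merge release/$1 into master" "release/$1" &&
    git checkout -q develop &&
    git merge -q --no-ff -m "Merge release/$1 into develop" "release/$1" &&
    git branch -q -d "release/$1"
}

# Usage:
#   start_feature experiments    # write, commit, repeat
#   finish_feature experiments
#   start_release IJCAI2018      # final polishing
#   finish_release IJCAI2018
```

The --no-ff option keeps a merge commit even when a fast-forward is possible, so the history shows where each feature or submission started and ended.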

LaTeX + Git-Flow

The main idea is to create one repository, with different subfolders, for each project (paper), and to use git-flow and tagging to develop the paper. The following figure illustrates the commit history of one paper repository.

Rules for Collaborative LaTeXing

You MUST follow these rules when writing collaboratively. Otherwise, your co-authors will find it impossible to work with you.

Rules for LaTeX Source Code

  1. Avoid ineffective modifications.
  2. Do not change line breaks without good reason.
  3. Turn off automatic line wrapping of your LaTeX editor.
  4. Start each new sentence on a new line.
  5. Split long sentences into several lines so that each line has at most 80 characters.
  6. You MUST respond to every marginal comment from a co-author with a marginal comment of your own, such as \gangli{What is wrong here?} \qwu{This is my response ...}
  7. If your comment has been properly addressed by co-authors, you MUST remove or comment out both the original and the response marginal comments.
  8. You can customize the related blocks and use the following code to mark how far your updates have reached: \gliMarker %TODO: YourName up to here!
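
The 80-character rule in the list above is easy to check mechanically before committing. The function below is a small sketch of such a check using awk; the name check_line_length is hypothetical, and the limit is hard-coded to match the rule.

```shell
# Hypothetical helper: report lines longer than 80 characters in the given
# files; the exit status is non-zero if any such line is found.
check_line_length() {
    awk 'BEGIN { bad = 0 }
         length($0) > 80 {
             printf "%s:%d: %d characters\n", FILENAME, FNR, length($0)
             bad = 1
         }
         END { exit bad }' "$@"
}

# Usage: check_line_length mainbody.tex report.tex
```

Short lines keep the diffs produced by git small and readable, which is the reason behind rules 2 to 5.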

Rules for LaTeX+Git Workflow

  1. Put only those files that are directly modified by the user under version control.
  2. Commit minor changes frequently, and add a meaningful and descriptive message when committing your modifications to the repository.
  3. Don't commit a broken version: verify that your code compiles flawlessly before committing.
  4. Use git-latexdiff or latexdiff (or git compare tools) to critically review your modifications before committing them to the repository.
  5. Use the git client for copying, moving, or renaming files and folders that are under revision control.
  6. Use the git-flow scheme to create the features, branches, releases and versions of your paper. Be really careful when you touch the develop or master branch.
     • The master branch represents the life of the public document, which should always be ready for release to others;
     • The develop branch accumulates all finished features and keeps the lifecycle of all development before a release;
     • A feature branch is for a major item of work, for example, adding the experiment section. If you are quite sure that the work can be done without affecting your co-authors' workflow, you can work directly on the develop branch.
  7. Use tagging extensively to keep notes on what a given point in the history means. You may need to become familiar with SemVer.
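
Tagging, the last rule above, can be sketched as follows. The function name tag_milestone and the version numbers are assumptions chosen for illustration; how you map SemVer numbers to submissions and revisions is a team decision.

```shell
# Hypothetical helper: attach an annotated tag with a note to the current commit.
tag_milestone() {
    git tag -a "$1" -m "$2"
}

# Usage:
#   tag_milestone v1.0.0 "Submitted to IJCAI2018"
#   tag_milestone v1.1.0 "Revised after review comments"
#   git tag -n              # list tags together with their notes
#   git push origin --tags  # share the tags with your co-authors
```

An annotated tag records who tagged the commit and when, which a lightweight tag does not.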