Over the past three decades, I have spent most of my writing time typing LaTeX for academic papers, research presentations, assignments and even examination papers. As is natural for a computer scientist, I have been refining my workflow for many years so that I can maximize productivity while minimizing manual labour. Here is the LaTeX workflow with which I feel comfortable.
LaTeX Foundations
The convenience of using LaTeX for academic writing needs no further emphasis. If this sounds too geeky to you, you should first read some foundational material such as the WikiBook.
For the workflow related to LaTeX writing, I recommend the following cross-platform packages:
- LaTeX distribution: TeX Live, or an online one such as Overleaf
- TULIP-Lab LaTeX packages: GitHub Templatex, including powerdot-tuliplab, which can be installed from CTAN
- *.tex editor: TeXstudio, or Visual Studio Code with extensions such as LaTeX Workshop, Markdown All in One, etc.
- *.bib editor: Zotero with Better BibTeX
- git repository: a GitHub private repository, Bitbucket or Overleaf (which I use mainly as another git server, for collaboration when no local LaTeX distribution is installed). All of these can co-exist for a single paper as remote repositories. For example, Overleaf can link/push/pull with a private git repository.
- git client: SmartGit, free for non-commercial use
- Changes tracking: LaTeXDiff and Git-LaTeXDiff, for version difference extraction and visualization
Zotero & Better BibTeX
A more detailed coverage of Zotero can be found at: Tools Zotero
Zotero is free and open-source reference management software for bibliographic data and related research materials. In particular, you can use it to manage all the PDF files you have collected. Its notable features include web browser integration, online syncing, and generation of in-text citations, footnotes and bibliographies.
Better BibTeX (BBT) is an extension for Zotero that makes it easier to manage bibliographic data, especially for people authoring documents with text-based toolchains such as LaTeX. For LaTeX users, it can generate citation keys automatically and consistently on export, and the key format can be customized. You can use either a customized configuration or the default one when exporting items into a .bib file.
Grammar/Spelling Checking for LaTeX
ASpell
Many excellent spell checkers for LaTeX are available, such as ASpell, which can be incorporated into the compilation process. Before you compile, you can run:
aspell -t -c report.tex
It lets you interactively spell check the whole file. The -t option tells the spell checker that the file is in TeX or LaTeX format so that it will ignore macros.
If you prefer to simply get a list of misspelled words non-interactively, you can run:
cat file.tex | aspell list -t | sort | uniq
Grammar Checking Tools
Queequeg
Queequeg is a simple command-line tool aimed specifically at finding concordance errors in English. It works with LaTeX sources out of the box.
qq -t mainbody.tex
TeXtidote
TeXtidote is a command-line tool that builds on LanguageTool and can work directly on LaTeX files.
java -jar textidote.jar --html mainbody.tex > checkreport.html
Polymorphic Editions
As a researcher, you might need to prepare different editions of the same technical report for different submission channels: a conference, a journal, etc.
You should be aware of the difference between two terms:
- Revision: One particular state of the files in a project, positioned along one or more lines of the development history
- Edition: One particular variation which has a purpose in being different from other variations, such as the conference edition, the journal edition, the technical report edition etc.
Notice that I have avoided using the word version, which is too ambiguous.
The way to maintain multiple editions is to create a configurable master file, and to include it from a series of top-level files, one for each Edition that is required. To illustrate the general principles of making polymorphic editions, we can use the following example, which relies on Victor Eijkhout's comment package and Henrik Skov Midtiby's todonotes package. Suppose the three editions, with 4 TeX files in total, are:
- The conference edition: with some block for conference only
- The journal edition: with some block for journal only
- The report edition: with all blocks
Conference.tex:
```latex
...
%=================================================================
\usepackage{comment}
%=================================================================
%
\excludecomment{JournalOnly}
\includecomment{ConferenceOnly}
%
%=================================================================
\input{mainbody}
...
```
Journal.tex:
```latex
...
%=================================================================
\usepackage{comment}
%=================================================================
%
% The journal edition keeps the journal-only blocks and drops the conference-only ones
\includecomment{JournalOnly}
\excludecomment{ConferenceOnly}
%
%=================================================================
\input{mainbody}
...
```
Report.tex:
```latex
...
%=================================================================
\usepackage{comment}
%=================================================================
%
\includecomment{JournalOnly}
\includecomment{ConferenceOnly}
%
%=================================================================
\input{mainbody}
...
```
mainbody.tex:
```latex
...
\usepackage{comment}
\begin{ConferenceOnly}
We have \SI{10}{\hertz},
\si{\kilogram\metre\per\second},
the range: \SIrange{10}{100}{\hertz}.
$\nicefrac[]{1}{2}$.
\missingfigure{Make a sketch of the structure of a trebuchet.}
\end{ConferenceOnly}
\begin{JournalOnly}
This is a paragraph which is only available in journal edition,
and the conference one will not include it.
\end{JournalOnly}
...
```
Collaborative Writing in Git + LaTeX
Git is mainly used as a repository for code. Since LaTeX source is itself a kind of code, it is a natural choice to put your writing under a version control system such as git. You may wonder what advantages this has over GUI tools such as Microsoft Word. I can quickly list some of the things that I find hard to do in Microsoft Word:
- Maintain a full editing history, and quantify the line-by-line contributions of co-authors;
- Tag, release and branch the project;
- Easily create a PDF with revision details;
- Add comments that stay hidden in the LaTeX source;
- Multiple authors writing at the same time, on the same file!
Setup Git Repositories
Optional: One Repository in local texmf tree For All Papers
A local texmf tree is the place for various package-like artifacts that are not proper packages managed through your package manager (e.g. MiKTeX or TeX Live). How to set up a local texmf tree is beyond the scope of this post, but it isn't hard: on Windows, it is most likely under your home directory at ~/texmf/tex/; on macOS, please put it under /usr/local/texlive/texmf-local. After adding files, refresh the file name database: for MiKTeX on Windows, you can run mktexlsr; for TeX Live, you can type the following commands:
cd
sudo texhash
The local texmf tree means that you never have to use absolute, or even relative, paths for your bibliography .bib files. It is also very useful because many conferences and journals distribute their own templates and styles not through CTAN, but just as .sty and .bst files.
Put the .bib in local texmf tree
In our team, we have several common bibliographic files in the BibTeX format: tuliplab.bib, deeplearning.bib, tourism.bib, hospitality.bib, etc. They reside in our common texmf tree in the subfolder /bibtex/bib/. Hence, all members can specify the bibliography by using only the file name (such as \bibliography{tuliplab}, without the full path), no matter where their working copy of the common texmf tree is located.
Add local texmf tree to git
Once you have created that tree, add it to git. Then, whenever you work on a new computer, check out the repository from git and tell your TeX distribution about the local texmf tree.
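The steps above can be sketched in a few shell commands. This is only a sketch under stated assumptions: it builds the tree in a scratch directory so that it is safe to try (set the hypothetical TEXMF_TREE variable to use a real location), the placeholder BibTeX entry and the demo git identity are invented for illustration, and the remote URL in the final comment is a placeholder.

```shell
#!/bin/sh
# Sketch: create a small local texmf tree and put it under git.
# TEXMF_TREE is a hypothetical override; by default a scratch directory is used.
set -e

tree=${TEXMF_TREE:-$(mktemp -d)/texmf}

# TDS-style subfolders for .bib files, .bst styles and .sty packages
mkdir -p "$tree/bibtex/bib" "$tree/bibtex/bst" "$tree/tex/latex"

# Placeholder entry standing in for the shared team bibliography
cat > "$tree/bibtex/bib/tuliplab.bib" <<'EOF'
@misc{placeholder2018,
  author = {A. Author},
  title  = {Placeholder entry},
  year   = {2018}
}
EOF

cd "$tree"
git init -q
git config user.name  "Demo Author"       # identity for this demo repository only
git config user.email "demo@example.com"
git add bibtex
git commit -q -m "Add shared bibliography"

# On a new computer, clone the tree into the location your TeX distribution
# searches, for example (the URL is a placeholder):
#   git clone <your-remote-url> "$(kpsewhich -var-value TEXMFHOME)"
# and refresh the file name database if your distribution requires it.
git log --oneline
```

With such a tree in place, \bibliography{tuliplab} should resolve without any path, as described above.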
One Repository Per Paper
There are many free or publicly available platforms for Git, such as GitHub or Bitbucket. You need to create one new repository on your chosen git platform, and initialize it with all the necessary .tex and .bib files. We recommend the following folder structure:
├── .git                  # Git folder automatically created upon git checkout.
│   ├── ...
│   ├── gitHeadInfo.gin   # auto generated git information
│   └── hooks             # store those triggers so that gitinfo can be auto generated
├── Data                  # contains the training data, test data, validation data sets
├── Code                  # contains the source code, script, or Python notebook
└── Report                # tex, bib files
    ├── figures           # subfolder
    ├── report.tex        # working draft master file, you should use this for CS papers
    ├── report-htm.tex    # working draft master file, you should use this for HTM papers
    ├── report-lncs.tex   # template master file for Springer conferences LNCS format
    ├── report-ieee.tex   # template master file for IEEE conferences format
    ├── report-acm.tex    # template master file for ACM conferences format
    ├── mainbody.tex      # mainbody TeX file for your paper
    ├── tuliplab.bib      # BIB file for tuliplab publications
    ├── yourbib.bib       # BIB file for your own research paper's references
    ├── slides.tex
    ├── poster.tex
    └── ...
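The layout above can be created in one go. The following is a minimal sketch, assuming a scratch location (override it with the hypothetical PAPER_DIR variable) and empty placeholder files; the directory name MyFancyPaper is only an example.

```shell
#!/bin/sh
# Sketch: create the recommended folder skeleton for a new paper repository.
# PAPER_DIR is a hypothetical override; by default a scratch directory is used.
set -e

paper=${PAPER_DIR:-$(mktemp -d)/MyFancyPaper}

mkdir -p "$paper/Data" "$paper/Code" "$paper/Report/figures"

# Empty placeholders for the main sources; touch leaves existing files intact
for f in report.tex mainbody.tex yourbib.bib; do
    touch "$paper/Report/$f"
done

cd "$paper"
git init -q        # creates the .git folder

# Show the resulting layout without the contents of .git
find . -path ./.git -prune -o -print | sort
```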
For version information, you need to use the following packages:
- GitInfo2: This package requires some configuration under your project's .git/hooks folder, and after that it will work like a charm. You can test it by checking out or pulling/pushing your repository, which should generate/update the file .git/gitHeadInfo.gin in the local project repository, such as ~/MyFancyPaper/.git/gitHeadInfo.gin. Here is one example of using GitInfo2 in the LaTeX file.
- LaTeXDiff and Git-LaTeXDiff: LaTeXDiff takes two LaTeX files and produces a marked-up difference document that can be compiled to PDF, while Git-LaTeXDiff combines git and LaTeXDiff so that you can easily view the difference between two committed versions.
  git latexdiff HEAD~N # diff between your work tree and the N commits back
  git latexdiff HEAD~1 --main MyFile.tex # can specify the main tex file
  git latexdiff HEAD~3 HEAD~5 --main MyFile.tex # diff between the 3rd and the 5th commits back
Here is one sample output of the difference PDF:
LaTeX git workflow
Git-Flow
If you are not yet comfortable with branches, you can use git in the old style of svn. However, branches allow much more flexible development of the project, and they are worth learning.
Once you have some general knowledge of git branches, I recommend using Git-Flow, a branch-based workflow that supports teams and projects where deployments are made regularly and development is continuous. A good tutorial on it can be found at Bitbucket.
Typically, you will have two major kinds of branches:
- master branch should be treated as the main flow of your work, and it should always be production ready, in a ready-to-submit state.
- develop branch is parallel to the master branch; it is the main branch where the source code of HEAD always reflects a state with the latest delivered development changes for the next release.
Next to the main branches master and develop, the development model uses a variety of supporting branches to aid parallel development between collaborators, ease tracking of features, prepare for submission and assist in quickly fixing problems.
- Feature branches: branch off from and merge back into the develop branch. They are used to develop new subsections (features) for the paper.
- Release branches: branch off from develop and merge back into develop and master. When the state of develop is ready for a submission, we branch off and give the release branch a name reflecting the submission, such as IJCAI2018.
- Hotfix branches: branch off from master and merge back into develop and master. They are typically used to fix issues that affect all existing working branches, so we use them when revising based on the review comments of the submission.
LaTeX + Git-Flow
The main idea is that we create one repository with different subfolders for one project (paper), and use git-flow and tagging to develop the paper. The following figure illustrates the commit history of one paper repository.
Rules for Collaborative LaTeXing
You MUST follow these rules when writing collaboratively. Otherwise, your co-authors will find it impossible to work with you.
Rules for LaTeX Source Code
- Avoid ineffective modifications.
- Do not change line breaks without good reason.
- Turn off automatic line wrapping of your LaTeX editor.
- Start each new sentence in a new line.
- Split long sentences into several lines so that each line has at most 80 characters.
- For all the marginal comments by other co-authors, you MUST respond to them using a responsive marginal comment, such as
  \gangli{What is wrong here?} \qwu{This is my response ...}
- If your comment has been properly addressed by co-authors, you MUST remove or comment out both the original and the response marginal comments.
- You can customize related blocks and use the following code to indicate where you are updating up to:
  \gliMarker %TODO: YourName up to here!
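The 80-character rule above is easy to check mechanically. The following is a small sketch, assuming a POSIX shell with awk; long_lines is a hypothetical helper name, and the scratch file only stands in for a real .tex source.

```shell
#!/bin/sh
# Sketch: report source lines longer than 80 characters.
# long_lines is a hypothetical helper; awk counts characters per line
# (bytes in some awk implementations when the text is not plain ASCII).
long_lines() {
    awk -v max=80 'length($0) > max {
        printf "%s:%d: %d characters\n", FILENAME, FNR, length($0)
    }' "$@"
}

# Demo on a scratch file: one short line and one line of 90 characters
demo=$(mktemp)
echo 'A short sentence on its own line.' > "$demo"
awk 'BEGIN { for (i = 0; i < 90; i++) printf "x"; print "" }' >> "$demo"

long_lines "$demo"
```

In a paper repository the helper could be called as long_lines Report/*.tex before committing, so that over-long lines are split while the change is still small.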
Rules for LaTeX + Git Workflow
- Put only those files that are directly modified by the user under version control.
- Commit minor changes frequently, and add a meaningful and descriptive comment when committing your modifications to the repository.
- Don't commit a broken version: verify that your code compiles flawlessly before committing.
- Use git-latexdiff or latexdiff (or git compare tools) to critically review your modifications before committing them to the repository.
- Use the git client for copying, moving, or renaming files and folders that are under revision control.
- Use the git-flow scheme to create feature, branch, release and version of your paper. Be really careful when you are touching the develop or master branch.
  - master branch represents the life of the public document, which should always be ready for releasing to others;
  - develop branch accumulates all finished features and keeps the lifecycle of all development before release;
  - feature branch is for major items of work, for example, "I am adding the section of experiment". If you are quite sure that it can be done without affecting other collaborating authors' workflow, you can work directly on the develop branch.
- Use tagging extensively to keep notes on what a given point in the history means. You may need to familiarize yourself with SemVer.
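Tagging as described in the last rule can be tried out in a throwaway repository. This is a minimal sketch with SemVer-style names; the file name, tag names, messages and demo identity are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: annotated tags as milestones of a paper, named in SemVer style.
# File name, tag names and messages are illustrative assumptions.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name  "Demo Author"
git config user.email "demo@example.com"

echo "Draft for co-authors." > report.tex
git add report.tex
git commit -q -m "First complete draft"
git tag -a v0.1.0 -m "First complete draft circulated to co-authors"

echo "Polished text." >> report.tex
git commit -q -am "Polish for submission"
git tag -a v1.0.0 -m "Version submitted to the conference"

git tag -n                       # list the tags with their messages
git diff --stat v0.1.0 v1.0.0    # what changed between the two milestones
```

Annotated tags (-a) carry their own message, author and date, so the note about what a milestone means travels with the repository. Tag names are ordinary git revisions, so they should also be usable wherever a commit reference such as HEAD~1 is expected.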