MT Summit Peer Review Policy – European Association for Machine Translation

Before starting to review a paper for an MT Summit, please read the following. While the policies below mainly address papers containing research (Research: technical and Research track), many of them may be relevant for other kinds of papers (Translator, Projects, and User tracks).

Early action: Please have a look at the paper as soon as you get the assignment. If you notice any serious issues with the paper (format violations, missing compulsory sections, deanonymization through self-citations, acknowledgements or non-anonymous repositories), or you feel that the paper is too far out of your area of expertise, reach out to your track chair so that they can adjust the assignments or consider a desk rejection. If you only look at the paper close to the deadline, it will be too late (and you’ll have to review it anyway, so checking early can save you work!).

What constitutes a paper: The submission has to contain the actual paper, which may contain appendices or refer to code and/or data. As a general rule, the paper should be readable as a standalone document, and any details important for understanding the key aspects of the work should be in the paper rather than in appendices, accompanying source code, etc. Non-key aspects (e.g. implementation details, parameters, examples, or experiments that provide additional support for the main claims) can be linked as supplementary materials.

Reviewer’s summary: The summary of a paper should be just that: a neutral, dispassionate summary of the research question and findings/contributions. You help the program chairs most by not tainting this summary with, e.g., suggestions that the topic is exciting or a waste of time.

Identify and list contributions: Make sure you acknowledge all the contributions that you believe the paper is making: experimental evidence, replication, framing of a new question, artefacts that can be used in future work (models, resources, code), literature review, establishing new cross-disciplinary connections, conceptual developments, theoretical arguments. A paper may make several contributions, and not all of them need to be equally strong. You should state in your own words what you see as the contributions of the paper, rather than copying and pasting them from the abstract. This helps the authors and the chairs to see that you understood the gist of the paper, and hence that the rest of your review should be taken seriously.

Reasons to accept: Even if you fundamentally disagree with something in the paper, it is important for the program chairs (and the mental health of the authors) that you accurately state all the best aspects of it. It also helps them to see that you actually understand the work, and hence any criticisms should be taken seriously.

Valuable contributions: an engineering solution, a literature review, a useful artefact (a model, software, a resource), reproduction of prior results, analysis (of models or data), theory, argumentation/position.
Does it advance the field? What did we learn from it that we did not know before? What can we do that we could not do before? Are the research claims clearly articulated and sufficiently backed up by the appropriate evidence, literature, or theory/arguments?
Does it report an improvement in performance? Is the baseline sufficiently strong and well tuned, and is the result statistically robust? Has the computational budget been reported?
Is it clear and well written?
Does it formulate a timely or important question?
Does it reframe a problem or show previously unknown links between ideas and concepts? Does it develop new concepts?
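
For the question above about whether a result is statistically robust, one check a reviewer may look for in MT papers is paired bootstrap resampling over the test set. The following is a minimal sketch, not a prescribed procedure: the function name `paired_bootstrap`, the segment-level scores, and the number of resamples are all hypothetical, and it assumes a metric that is an average of per-segment scores (a corpus-level metric such as BLEU would need to be recomputed on each resample instead).

```python
import random


def paired_bootstrap(scores_a, scores_b, n_samples=1000, seed=0):
    """Fraction of resampled test sets on which system A does NOT beat system B.

    scores_a, scores_b: hypothetical segment-level scores (higher is better)
    for the same segments, in the same order.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")
    rng = random.Random(seed)
    n = len(scores_a)
    not_better = 0
    for _ in range(n_samples):
        # Draw segment indices with replacement; reuse the same draw for both systems.
        idx = [rng.randrange(n) for _ in range(n)]
        mean_a = sum(scores_a[i] for i in idx) / n
        mean_b = sum(scores_b[i] for i in idx) / n
        if mean_a <= mean_b:
            not_better += 1
    return not_better / n_samples


# Illustrative (made-up) scores: A is ahead of B on every segment.
system_a = [0.71, 0.64, 0.80, 0.55, 0.69]
system_b = [0.61, 0.54, 0.70, 0.45, 0.59]
print(paired_bootstrap(system_a, system_b))
```

A small returned fraction suggests that the improvement holds up under resampling of the test set; a large one suggests that the reported gain may depend on the particular test segments.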

Reasons to reject: Many things could go wrong in a study:

There may be claims that are not actually supported by the evidence or by the arguments, but that are presented as conclusions rather than as hypotheses.
The framing may be misleading.
There may be obvious methodological flaws (for example, only the best run results are reported), errors in the proofs, in the implementation, or in the analysis.
There may be insufficient detail to understand what was done or how to reproduce the method and the results.
There may be a lack of clarity about what the research question is (even if it is “Does system A work better than system B?”), what was done, why, and what the conclusion was.
The paper may not make it clear in what way the findings and/or the released artefacts advance the field.

Be sure that you do not cite as weaknesses things that are not in fact weaknesses, but rather belong in the “suggestions” box. Once you have the first draft of your review, you will need to revisit it to check for these frequent issues.

Limitations: Every work has limitations, and submissions to MT Summit conferences should include a mandatory section for discussing them. An honest and thorough statement of limitations should not count against the paper; it is an antidote to the ever-present risk of hype.

Missing References: Please list any references that should be included in the bibliography or need to be discussed in more depth. If you mentioned missing previous work or lack of baselines, give the full citation. In particular, if you say that the submission lacks novelty, please be sure to include the specific work that you believe to be similar.

Typos, Grammar, Style, and Presentation Improvements: Make sure your review contains presentation suggestions and clarifications, and points out (a reasonable number of) typos or language errors. These are small things that you do not consider “weaknesses” and that should not affect the overall assessment, unless, of course, the paper is completely illegible or unclear, in which case this would clearly be a “weakness”.

Reviewers are beta-readers who can help the authors to significantly improve their papers. If you see something that they could do to increase the potential impact of their work, it is professional courtesy to let them know. Some examples include figures that are hard to read or interpret, points that require extra background reading, ambiguity, and non-obvious connections between sections.

Do not give yourself away by mainly suggesting citations of your own papers, as this may de-anonymize you (especially when the paper in question is not very well known).

Reproducibility: Think about how easy it would be to reproduce the paper: imagine you had to explain in an e-mail how to reproduce the paper to someone who had a strong translation technologies background but was not immersed in the field. Could you just hand them the submission? If not, how long would the e-mail need to be to fill in the gaps?

Confidence: If your confidence is low, please use the confidential comments section at the end of the form to let your chairs know which aspects of your review might not be reliable.

Best Paper Award: When marking this, consider the question “How could you argue that this will be the best paper?”

Responsible translation technologies research: If the paper is a research paper, make sure that it addresses the following questions (some of which may also be applicable for non-research papers):

Does it describe the limitations of the work?
Does it discuss possible risks?
Do the abstract and the introduction summarize the main claims in the paper?
If the paper uses artefacts such as code, data, models, etc.:
- Does it cite who created these artefacts?
- Does it discuss the licence or terms of use for the artefacts?
- Have third parties’ artefacts been used according to their licences?
- In particular, if used to derive new artefacts, does this derivation agree with the original licences?
- If data is collected, have steps been taken to ensure that it does not contain information that uniquely identifies people, or offensive content of any kind?
- Are artefacts properly documented as regards coverage of domains, languages, and linguistic phenomena, demographic groups represented, etc.?
- Does the paper give relevant statistics such as the number of examples, details of training / testing / development splits, etc. for the data that it used or created?
If the paper reports computational experiments:
- Does it report the number of parameters in the models used, the total computational budget (e.g., GPU hours), and computing infrastructure used?
- Does it discuss the experimental setup, including hyperparameter search and best-found hyperparameter values?
- Does it report descriptive statistics about the results (e.g., error bars around results, summary statistics from sets of experiments), and is it transparent whether it is reporting the maximum, the mean, etc. or just a single run?
- If existing packages were used (for preprocessing, for normalization, for evaluation, etc.), does the paper report the implementation, model, and parameter settings used?
If human annotators were used:
- Does the paper report the full text of instructions given to participants, including screenshots, disclaimers of any risks to participants or annotators, etc.?
- Does it report information about how participants were recruited and paid (crowdsourcing platform, students, etc.), and discuss if the payment is adequate given the participants’ demographics (e.g., country of residence)?
- Does it discuss whether and how consent was obtained from people whose data is being used or curated?
- Was the data collection protocol approved (or determined exempt) by an ethics review board?
- Does it report the basic demographic and geographic characteristics of the annotator population that is the source of the data?
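
As an illustration of the descriptive statistics asked about above for computational experiments, the following minimal sketch summarizes a metric over several runs, so that it is transparent whether a paper is reporting the mean, the maximum, or a single run. The function name `summarize_runs` and the scores are hypothetical and only serve as an example of what a reviewer might expect to see reported.

```python
import statistics


def summarize_runs(scores):
    """Summarize one metric over several runs (e.g. different random seeds)."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    return {
        "n_runs": len(scores),
        "mean": statistics.mean(scores),
        "stdev": statistics.stdev(scores),  # sample standard deviation
        "min": min(scores),
        "max": max(scores),
    }


# Hypothetical (made-up) scores from five runs of the same configuration.
runs = [27.0, 28.0, 26.0, 29.0, 30.0]
summary = summarize_runs(runs)
print(f"{summary['mean']:.1f} +/- {summary['stdev']:.1f} "
      f"(n={summary['n_runs']}, max={summary['max']:.1f})")
```

In this made-up case, a paper reporting only the best run would state 30.0, whereas the mean over the five runs is 28.0; reporting the mean, the spread, and the number of runs together makes that difference visible.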

A good review: here’s how to make sure your review is useful:

Be specific when formulating the issues you found in the paper.
Make sure your arguments for recommending rejection are well founded: for example, a claim of lack of novelty should be backed up with references to similar work, and state-of-the-art performance may not be relevant when the contributions are about efficiency.
Make sure your score(s) are consistent with your review.
Check that the tone is professional and neutral, if not kind. When writing, it is easy to come across as more sarcastic or dismissive than intended. The author may be a starting researcher, or someone working in isolation. Write the review you would like to get yourself.

Reviewing is in fact mutually beneficial: in exchange for feedback, the reviewers get to have the first look at cutting-edge research. You may even find while reviewing that you were wrong about something, which may help you later in your own research.