
MathMLben


MathMLben is a benchmark for evaluating tools that convert between mathematical formats (LaTeX ↔ MathML ↔ CAS).
The Gold Standard comprises 375 mathematical formulae:

  • 1-100 extracted from the NTCIR-11 Math Wikipedia
  • 101-200 from the NIST Digital Library of Mathematical Functions (DLMF)
  • 201-305 from the NTCIR arXiv and NTCIR-12 Wikipedia datasets
  • 306-309 physics differential equation Formula Concepts (Scharpf et al., 2019) from Wikipedia
  • 310-375 from 25 physics Wikipedia articles annotated using the AnnoMathTeX formula and identifier name recommender system (https://annomathtex.wmflabs.org).
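
When analysing results per source, it can help to map an entry number back to the group it came from. The following is a minimal sketch: the ranges are copied from the list above, while the short labels and the function name `source_of` are made up for illustration and are not part of MathMLben.

```python
# Entry-number ranges as listed above; the labels are ad hoc shorthand.
SOURCES = [
    (1, 100, "NTCIR-11 Math Wikipedia"),
    (101, 200, "DLMF"),
    (201, 305, "NTCIR arXiv / NTCIR-12 Wikipedia"),
    (306, 309, "Formula Concepts (Wikipedia)"),
    (310, 375, "AnnoMathTeX physics articles"),
]


def source_of(entry):
    """Return the source label for a gold standard entry number."""
    for first, last, label in SOURCES:
        if first <= entry <= last:
            return label
    raise ValueError(f"entry {entry} is outside the listed ranges")


print(source_of(150))
```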

An overview of the sources can be found in the dataset table.

A GUI to make changes to the data is available at wmflabs with the following input fields:

  • formula name
  • formula type (definition, equation, relation or general formula)
  • original TeX
  • corrected TeX
  • hyperlink to the original formula (source)
  • semantic TeX field for annotations (DLMF macros, Wikidata QIDs)

The expression tree preview visualization is provided by VMEXT.

Annotations of Wikidata items (QIDs) can be made via

  • the TeX-macro \w{Q...} for a general mathematical expression,
  • \wf{Q...} for a function or
  • \wdef{Q...} at the beginning of the formula.

The same token does not need to be annotated more than once.
DLMF LaTeX macros are also interpreted, e.g. \EulerGamma@{z} for the gamma function or \JacobiP{\alpha}{\beta}{n}@{x} for the Jacobi polynomial (see Digital Repository of Mathematical Formulae by H. S. Cohl et al., DOI 10.1007/978-3-319-08434-3_30).
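
To check which Wikidata items a semantic TeX string refers to, the three annotation macros can be collected with a regular expression. This is only a sketch under assumptions: the sample string, its QIDs (Q1 to Q4) and the position of each macro relative to the annotated token are placeholders, and `extract_qids` is a hypothetical helper, not part of the MathMLben code base.

```python
import re

# Matches \w{Q...}, \wf{Q...} and \wdef{Q...}; the longer macro names are
# listed first so that the alternation never stops early at the plain "w".
ANNOTATION = re.compile(r"\\(wdef|wf|w)\{(Q\d+)\}")


def extract_qids(semantic_tex):
    """Return (macro, QID) pairs in order of appearance."""
    return [(m.group(1), m.group(2)) for m in ANNOTATION.finditer(semantic_tex)]


# Placeholder input: the QIDs and the macro placement are illustrative only.
sample = r"\wdef{Q1} \w{Q2} E = \w{Q3} m c^2 + \wf{Q4} f(x)"
print(extract_qids(sample))
```

DLMF macros such as \EulerGamma@{z} are not matched by this pattern, so they pass through untouched.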

Furthermore, you can create new macros in the LaTeXML style file. Make sure that the quotation marks '' are balanced, and check the Travis build status on travis-ci.org.

We gladly invite experts who can judge the correctness of a formula, its name and type, the semantic annotations and the expression tree.
Checking and correcting the Gold Standard, by setting the Tree State and QID State flags or by adapting the input fields where necessary, is highly welcome. It helps to enable and improve the conversion between the diverse formats of mathematical notation available today (LaTeX, MathML and various CAS).

If you think that an annotation (because of uncertainty or ambiguity), a formula name or type, an expression tree or the MathML needs discussion, please create an issue.

Accessing RAW Data & QIDs Directly

Each path mathmlben.wmflabs.org/:QID links to the gold entry with the given QID.

You can also access the raw JSON data of the gold standard entries without using the GUI. A simple GET request to mathmlben.wmflabs.org/rawdata/:QID returns the JSON file for the given QID. The path rawdata/all returns a JSON array that contains the entire gold standard (approx. 2.5 MB).

Example URL                          Description
mathmlben.wmflabs.org/3              Directs to GUI entry 3
mathmlben.wmflabs.org/rawdata/3      Returns the raw JSON file of gold standard entry 3
mathmlben.wmflabs.org/rawdata/all    Returns a raw JSON array of the entire gold standard (approx. 2.5 MB)
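
The raw-data paths described above could be queried from a script roughly as follows. This is a sketch under assumptions: the source lists the host without a scheme, so HTTPS is assumed here, as is a UTF-8 encoded response; `rawdata_url` and `fetch_rawdata` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: the service is reachable over HTTPS; the source gives no scheme.
BASE_URL = "https://mathmlben.wmflabs.org"


def rawdata_url(entry="all"):
    """Build the raw-data URL for one entry number, or for the whole gold standard."""
    return f"{BASE_URL}/rawdata/{entry}"


def fetch_rawdata(entry="all", timeout=30):
    """Send a GET request and decode the JSON body (assumed to be UTF-8)."""
    with urllib.request.urlopen(rawdata_url(entry), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_rawdata(3)` would then request the JSON for entry 3, and `fetch_rawdata()` the full array of roughly 2.5 MB; no request is sent until one of these calls is made.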
Impressum

Note that the API token will be sent to our server.
We recommend deleting the token after use.

{ "qID": 1, "title": "", "type": "", "correct_tex": "", "math_inputtex": "", "math_inputtex_semantic": "", "uri": "", "correct_mml": "", "comment": "DATA NOT LOADED YET - CHANGE QID!" }