Interactive GUI Guide¶
The BibTeX Validator features a modern, web-based Graphical User Interface (GUI) that allows you to intuitively review, compare, and accept changes to your BibTeX entries. This guide provides a comprehensive overview of all GUI features, controls, and workflows.
Launching the GUI¶
To start the GUI, run:
uv run bibtex-validator references.bib --gui
This launches a local web server (default port: 8010) and automatically opens your default browser. The GUI provides an interactive interface for reviewing validation results and selectively applying changes.
GUI Structure Overview¶
The interface is organized into three main sections:
flowchart TD
A[GUI Interface] --> B[Validation Summary]
A --> C[Navigation Toolbar]
A --> D[Comparison Table]
B --> B1[Attention Pie Chart]
B --> B2[Statistics Icons]
B --> B3[Accept All Entries Button]
C --> C1[Previous/Next Buttons]
C --> C2[Entry Selector Dropdown]
C --> C3[Entry Stats Display]
D --> D1[Field Column]
D --> D2[BibTeX Value Column]
D --> D3[API Value Column]
D --> D4[Source Badge Column]
D --> D5[Status Badge Column]
D --> D6[Actions Column]
D --> D7[Footer Actions]
1. Validation Summary¶
The Validation Summary appears at the top of the dashboard and provides a global overview of your bibliography’s validation status.
Attention Pie Chart¶
A circular progress indicator (pie chart) shows the percentage of entries requiring attention. The chart uses a conic gradient where:
Red portion: Percentage of entries with issues (updates, conflicts, or differences)
Gray portion: Percentage of entries that are fully validated
The chart is accompanied by a counter displaying: {entries_with_issues}/{total_entries} ({percentage}%)
Visual Representation:
The pie chart uses CSS conic-gradient to visualize the percentage of entries requiring attention. The red portion represents entries with issues, while the gray portion represents fully validated entries.
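As a rough illustration of how the counter text and the gradient relate, the sketch below derives both from the two counts. The function name, the rounding rule, and the plain `red`/`gray` color keywords are assumptions made for this example, not the GUI's actual code.

```python
def attention_summary(entries_with_issues: int, total_entries: int) -> tuple[str, str]:
    """Return the counter text and a CSS conic-gradient for the pie chart.

    Hypothetical helper: the name, rounding, and colors are illustrative.
    """
    # Guard against an empty bibliography to avoid dividing by zero.
    percentage = round(100 * entries_with_issues / total_entries) if total_entries else 0
    counter = f"{entries_with_issues}/{total_entries} ({percentage}%)"
    # Red covers the share of entries with issues; gray fills the remainder.
    gradient = f"conic-gradient(red 0% {percentage}%, gray {percentage}% 100%)"
    return counter, gradient


counter, gradient = attention_summary(12, 48)
print(counter)   # counter text shown next to the chart
print(gradient)  # value usable as a CSS background
```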
Statistics Icons¶
Four icon-based statistics provide quick insights:
Reviews (Blue) - :octicon:pencil Review Icon¶
:gui-status-review:Review
Color: Blue background with dark blue text
Meaning: Fields where the API found new data to update incomplete or missing fields
Action Required: Review and decide whether to accept the suggested values
Example: A missing `journal` field that can be filled from Crossref
Conflicts (Orange) - :octicon:alert Conflict Icon¶
:gui-status-conflict:Conflict
Color: Orange background with dark orange text
Meaning: Critical mismatches between your local BibTeX data and API data
Action Required: Manual review required - these are significant differences (e.g., Year 2023 vs 2024, different author names)
Example: Your BibTeX has `year = {2023}` but Crossref reports `2024`
Differences (Yellow) - :octicon:diff Difference Icon¶
:gui-status-different:Different
Color: Yellow background with dark yellow text
Meaning: Minor styling or formatting differences that don’t affect content
Action Required: Optional review - usually safe to accept
Example: `"Journal Name"` vs `Journal Name` (quotes difference), or `Smith, John` vs `Smith, J.` (abbreviation)
Identical (Green) - :octicon:check-circle Identical Icon¶
:gui-status-identical:Identical
Color: Green background with dark green text
Meaning: Fields that perfectly match the authoritative source
Action Required: None - these fields are already correct
Example: Title, author, and year all match exactly
3. Keyboard Shortcuts¶
The GUI supports comprehensive keyboard navigation for efficient workflow. All shortcuts are disabled when input fields (INPUT, TEXTAREA, SELECT) have focus to prevent conflicts.
Usage Tips¶
Use arrow keys for quick sequential review
Use `PageUp`/`PageDown` for large bibliographies
Use `Home`/`End` to jump to extremes
Press `Esc` to clear selection and see the overview
4. Source Badge System¶
Source badges indicate which online database provided the metadata for each field. Each source has a distinct color scheme for easy identification.
Source Types and Colors¶
Crossref (Blue)¶
:gui-badge-crossref:CROSSREF
Primary Use: DOI metadata - the main authoritative source for published papers
Coverage: Journals, conferences, books with DOIs
Priority: Highest (1st in priority order)
Reliability: Very high - official DOI registry
arXiv (Red)¶
:gui-badge-arxiv:ARXIV
Primary Use: Preprint papers and arXiv-hosted publications
Coverage: Preprints, e-prints, arXiv papers
Priority: High (3rd in priority order)
Reliability: High - official arXiv metadata
Special Features: Automatically adds `eprint` and `eprinttype` fields
Semantic Scholar (Indigo)¶
:gui-badge-scholar:SEMANTIC SCHOLAR
Primary Use: AI-powered academic search and metadata
Coverage: Broad academic literature
Priority: Lower (7th in priority order)
Reliability: Good - AI-enhanced but may have errors
Best For: Finding missing DOIs or metadata for obscure papers
DBLP (Purple)¶
:gui-badge-dblp:DBLP
Primary Use: Computer science bibliography
Coverage: CS conferences, journals, and proceedings
Priority: Medium (4th in priority order)
Reliability: Very high for CS publications
Best For: Computer science papers, especially conference proceedings
PubMed (Sky Blue)¶
:gui-badge-pubmed:PUBMED
Primary Use: Biomedical and life sciences literature
Coverage: Medical journals, biomedical research
Priority: Medium (6th in priority order)
Reliability: Very high for medical publications
Best For: Papers with PMID identifiers
Zenodo (Gray)¶
:gui-badge-zenodo:ZENODO
Primary Use: General repository for research outputs
Coverage: Datasets, software, reports, presentations
Priority: High (2nd in priority order, after Crossref)
Reliability: High - official repository
Special Features: Often includes GitHub repository links
DataCite (Gray)¶
:gui-badge-datacite:DATACITE
Primary Use: Data DOI registry
Coverage: Research datasets, data publications
Priority: Medium (5th in priority order)
Reliability: High - official data DOI registry
Best For: Datasets and data publications
OpenAlex (Gray)¶
:gui-badge-openalex:OPENALEX
Primary Use: Comprehensive academic metadata
Coverage: Very broad - aggregates multiple sources
Priority: Lowest (8th in priority order)
Reliability: Good - comprehensive but may be less precise
Best For: Fallback when other sources fail
Source Priority Order¶
When multiple sources provide data for the same field, the validator uses this priority order:
Crossref (highest priority)
Zenodo
arXiv
DBLP
DataCite
PubMed
Semantic Scholar
OpenAlex (lowest priority)
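The priority rule can be sketched as a first-match lookup over an ordered list. The source keys and the function name below are hypothetical; only the ordering comes from the list above.

```python
from typing import Optional

# Ordering taken from the priority list above; the key spellings are assumed.
SOURCE_PRIORITY = [
    "crossref", "zenodo", "arxiv", "dblp",
    "datacite", "pubmed", "semantic_scholar", "openalex",
]


def pick_by_priority(candidates: dict[str, str]) -> Optional[tuple[str, str]]:
    """Return (source, value) from the highest-priority source with a non-empty value."""
    for source in SOURCE_PRIORITY:
        value = candidates.get(source)
        if value:
            return source, value
    return None


# DBLP outranks OpenAlex, so its value is chosen.
print(pick_by_priority({"openalex": "J. ACM", "dblp": "Journal of the ACM"}))
```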
Source Selection Dropdown¶
For fields with multiple available sources, a dropdown menu allows you to choose which source’s value to use.
Features:
Visual Indicator: Badge shows a chevron-down icon when multiple sources are available
Click to Expand: Clicking the badge opens a dropdown menu
Source List: Shows all sources that have data for this field
Current Selection: Checkmark indicates the currently selected source
Auto-Update: Changing the source immediately updates the displayed API value
Visual Example:
When expanded, shows available sources:
:gui-badge-crossref:CROSSREF ✓
Currently selected
:gui-badge-arxiv:ARXIV
:gui-badge-scholar:SEMANTIC SCHOLAR
:gui-badge-dblp:DBLP
When Available:
Multiple sources found data for the same field
Field is not in “identical” or “bibtex-only” status
At least 2 sources have different values
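Read together, the three conditions amount to a small predicate. The sketch below is one possible reading; the function name and the shape of `values_by_source` are assumptions, while the status strings mirror the ones quoted above.

```python
def show_source_dropdown(status: str, values_by_source: dict[str, str]) -> bool:
    """Decide whether a field's source badge should offer a dropdown (illustrative)."""
    # Fields that already match, or that only exist locally, have nothing to choose.
    if status in {"identical", "bibtex-only"}:
        return False
    # At least two sources must disagree; empty values are ignored.
    distinct_values = {value for value in values_by_source.values() if value}
    return len(distinct_values) >= 2


print(show_source_dropdown("conflict", {"crossref": "2024", "arxiv": "2023"}))
print(show_source_dropdown("conflict", {"crossref": "2024", "dblp": "2024"}))
```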
5. Status Badge System¶
Status badges indicate the relationship between your BibTeX value and the API value for each field.
Status Types¶
Review (Blue)¶
:gui-status-review:Review
Label: “Review”
Meaning: New data is available from the API for a field that is missing or empty in your BibTeX
Action: Review the suggested value and accept if appropriate
Visual: BibTeX value shows as strikethrough red `(empty)`, API value in green
Example: Your entry has no `journal` field, but Crossref provides it
Conflict (Orange)¶
:gui-status-conflict:Conflict
Label: “Conflict”
Meaning: Significant mismatch between your BibTeX value and API value
Action: Manual review required - these are important differences
Visual: Both values displayed side-by-side for comparison
Examples:
Year mismatch: `2023` vs `2024`
Different author names: `Smith, John` vs `Smith, Jane`
Journal name differences: `J. ACM` vs `Journal of the ACM`
Different (Yellow)¶
:gui-status-different:Different
Label: “Different”
Meaning: Minor formatting or stylistic differences (similarity > 70%)
Action: Usually safe to accept - these are cosmetic differences
Visual: Both values displayed for comparison
Examples:
Quote differences: `"Title"` vs `Title`
Abbreviation: `Smith, John` vs `Smith, J.`
Case differences: `Journal Name` vs `journal name`
Identical (Green)¶
:gui-status-identical:Identical
Label: “Identical”
Meaning: Your BibTeX value perfectly matches the API value (after normalization)
Action: No action needed - field is already correct
Visual: Single value displayed (same in both columns)
Note: These fields are verified and correct
Accepted (Emerald)¶
:gui-btn-accept:Accepted
Label: “Accepted”
Meaning: You have accepted the API value for this field
Action: None - change has been applied
Visual: Shows briefly after accepting, then field reloads
Transient: Status appears for ~2 seconds after acceptance
Rejected (Red)¶
:gui-btn-reject:Rejected
Label: “Rejected”
Meaning: You have explicitly rejected the API suggestion
Action: None - your original value is preserved
Visual: Shows briefly after rejecting, then field reloads
Transient: Status appears for ~2 seconds after rejection
Local Only (Gray)¶
Local Only
Label: “Local Only”
Meaning: Field exists in your BibTeX but no API source provides data for it
Action: None - keep your local value
Visual: BibTeX value shown, API value shows as `-` (dash)
Examples: Custom fields, notes, or fields not in standard metadata
Status Badge Visual Reference¶
:gui-status-review:Review
:gui-status-conflict:Conflict
:gui-status-different:Different
:gui-status-identical:Identical
:gui-btn-accept:Accepted
:gui-btn-reject:Rejected
Local Only
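The decision order behind the non-transient statuses in this section can be sketched as follows. This is an assumption-laden illustration: the normalization (whitespace collapsing only) and the similarity measure (`difflib.SequenceMatcher`) are stand-ins chosen for the example, and the validator's real comparison may be field-specific. With this stand-in, short strings such as years can score above the 70% threshold, so it will not reproduce every example listed above.

```python
from difflib import SequenceMatcher
from typing import Callable, Optional


def _normalize(value: str) -> str:
    # Assumed normalization: collapse runs of whitespace only.
    return " ".join(value.split())


def _ratio(a: str, b: str) -> float:
    # Stand-in similarity in [0, 1]; not necessarily the validator's measure.
    return SequenceMatcher(None, a, b).ratio()


def classify(
    bibtex_value: Optional[str],
    api_value: Optional[str],
    similarity: Callable[[str, str], float] = _ratio,
    threshold: float = 0.70,
) -> str:
    """Map a (BibTeX value, API value) pair to a status name (illustrative)."""
    local = _normalize(bibtex_value or "")
    remote = _normalize(api_value or "")
    if not remote:
        return "local_only"   # no source provides this field
    if not local:
        return "review"       # API can fill a missing or empty field
    if local == remote:
        return "identical"
    if similarity(local, remote) > threshold:
        return "different"    # minor formatting or stylistic difference
    return "conflict"         # significant mismatch


print(classify("", "Nature"))
print(classify("Journal Name", "journal name"))
print(classify("Nature", "Physical Review Letters"))
```

Passing `similarity` as a parameter keeps the decision order separate from the metric, so a stricter per-field comparison could be swapped in without touching the rest.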
6. Comparison Table¶
The comparison table is the core of the GUI, providing a detailed field-by-field comparison for the selected entry.
Table Structure¶
The table has six columns:
Field: BibTeX field name (e.g., `title`, `author`, `year`, `journal`)
BibTeX Value: Current value in your local `.bib` file
API Value: Suggested value from online sources
Source: Data source badge (with dropdown if multiple sources available)
Status: Status badge indicating the comparison result
Actions: Accept/Reject buttons or status message
Row Display Logic¶
Rows are displayed in priority order:
Updates (Review status) - New data available
Conflicts - Significant mismatches
Differences - Minor differences
Identical - Verified matches
Local Only - Fields not in API
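The ordering above maps naturally onto a sort key. In this sketch the status names and the row dictionaries are hypothetical; `sorted` is stable, so rows with the same status keep their original relative order.

```python
# Rank per status, following the display order listed above (names assumed).
ROW_ORDER = {"review": 0, "conflict": 1, "different": 2, "identical": 3, "local_only": 4}


def order_rows(rows: list[dict[str, str]]) -> list[dict[str, str]]:
    """Sort comparison rows by status priority; unknown statuses go last."""
    return sorted(rows, key=lambda row: ROW_ORDER.get(row["status"], len(ROW_ORDER)))


rows = [
    {"field": "note", "status": "local_only"},
    {"field": "title", "status": "identical"},
    {"field": "year", "status": "conflict"},
    {"field": "journal", "status": "review"},
]
print([row["field"] for row in order_rows(rows)])
```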
Visual Formatting by Status¶
Update Rows¶
| Field | BibTeX Value | API Value | Status |
|---|---|---|---|
| title | ~~(empty)~~ | New Title from API | Review |
Conflict/Different Rows¶
| Field | BibTeX Value | API Value | Status |
|---|---|---|---|
| year | 2023 | 2024 | Conflict |
Identical Rows¶
| Field | BibTeX Value | API Value | Status |
|---|---|---|---|
| title | Matching Title | Matching Title | Identical |
Local Only Rows¶
| Field | BibTeX Value | API Value | Status |
|---|---|---|---|
| note | Custom note text | - | Local Only |
8. API and Data Sources¶
The BibTeX Validator integrates with multiple academic databases and metadata providers. Understanding each source helps you make informed decisions when reviewing suggestions.
Data Source Overview¶
graph TD
A[BibTeX Entry] --> B{Has DOI?}
B -->|Yes| C[Crossref API]
B -->|No| D{Has arXiv ID?}
D -->|Yes| E[arXiv API]
D -->|No| F{Has Title?}
F -->|Yes| G[Search APIs]
C --> H{Zenodo DOI?}
H -->|Yes| I[Zenodo API]
H -->|No| J[DataCite API]
G --> K[DBLP Search]
G --> L[Semantic Scholar]
G --> M[OpenAlex Search]
D -->|Yes| N[Extract DOI from arXiv]
N --> C
O[Has PMID?] --> P[PubMed API]
Q[Recursive Enrichment] --> R{Found New DOI?}
R -->|Yes| C
R -->|No| S{Found New arXiv ID?}
S -->|Yes| E
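The routing in the diagram above can be approximated in a few lines. This is a sketch under assumptions: the BibTeX field names (`doi`, `eprint`, `pmid`, `title`), the source labels, and the use of the `10.5281/` prefix (the prefix under which Zenodo mints DOIs) to detect a Zenodo DOI are choices made for the example, not a description of the validator's internals.

```python
def plan_lookups(entry: dict[str, str]) -> list[str]:
    """Return the sources to query for an entry, following the diagram (illustrative)."""
    lookups: list[str] = []
    doi = entry.get("doi", "").strip()
    if doi:
        lookups.append("crossref")
        # Zenodo DOIs go to Zenodo; other DOIs fall through to DataCite.
        lookups.append("zenodo" if doi.lower().startswith("10.5281/") else "datacite")
    elif entry.get("eprint"):
        lookups.append("arxiv")
    elif entry.get("title"):
        # No identifier at all: fall back to title search.
        lookups += ["dblp", "semantic_scholar", "openalex"]
    # PubMed is keyed on PMID and is independent of the branches above.
    if entry.get("pmid"):
        lookups.append("pubmed")
    return lookups


print(plan_lookups({"doi": "10.5281/zenodo.1234567"}))
print(plan_lookups({"title": "An Example Paper"}))
```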
Source Details¶
Crossref¶
API Endpoint: `https://api.crossref.org/works/{doi}`
Primary Use: DOI metadata for published papers
Data Quality: Very high - official DOI registry
Coverage: Journals, conferences, books, reports
Rate Limiting: Polite user-agent recommended
Best For: Published papers with DOIs
Fields Provided: title, author, journal, year, volume, pages, DOI, ISSN, entrytype
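For readers who want to query the endpoint directly, the sketch below builds a request with a "polite" User-Agent that carries a contact address. The function name, the User-Agent product string, and the example address are placeholders; the request is only constructed here, not sent.

```python
from urllib.parse import quote
from urllib.request import Request


def crossref_request(doi: str, mailto: str) -> Request:
    """Build (but do not send) a Crossref works lookup for a DOI."""
    # Keep the slash between DOI prefix and suffix; escape anything unusual.
    url = "https://api.crossref.org/works/" + quote(doi, safe="/")
    # Placeholder product name; the mailto lets Crossref contact the caller.
    user_agent = f"bibtex-validator-example/0.1 (mailto:{mailto})"
    return Request(url, headers={"User-Agent": user_agent})


req = crossref_request("10.1234/example", "you@example.org")
print(req.full_url)
```

Sending it with `urllib.request.urlopen(req)` returns JSON in which the record sits under the `message` key.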
arXiv¶
API Endpoint: `http://export.arxiv.org/api/query?id_list={arxiv_id}`
Primary Use: Preprint papers
Data Quality: High - official arXiv metadata
Coverage: Preprints, e-prints
Rate Limiting: 1 request per second recommended
Best For: Preprint papers, arXiv-hosted publications
Fields Provided: title, authors, year, arxiv_id, categories
Special: Automatically adds `eprint` and `eprinttype="arxiv"` fields
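One simple way to respect the recommended rate of one request per second is a minimum-interval throttle placed in front of each call. The class below is a generic sketch, not the validator's own mechanism; the injectable `clock` and `sleep` parameters exist only to make it testable.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Block until at least `min_interval` seconds have passed since the last call."""

    def __init__(
        self,
        min_interval: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous call, if any

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Usage: create one throttle per API and call wait() before every request.
arxiv_throttle = Throttle(min_interval=1.0)
arxiv_throttle.wait()  # the first call returns immediately
```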
Zenodo¶
API Endpoint: `https://zenodo.org/api/records/{record_id}`
Primary Use: General research repository
Data Quality: High - official repository
Coverage: Datasets, software, reports, presentations
Best For: Research outputs in Zenodo
Fields Provided: title, authors, year, publisher, DOI, URL (often GitHub links)
Special: Extracts GitHub repository URLs from related identifiers
DataCite¶
API Endpoint: `https://api.datacite.org/dois/{doi}`
Primary Use: Data DOI registry
Data Quality: High - official data DOI registry
Coverage: Research datasets, data publications
Best For: Datasets and data publications
Fields Provided: title, creators, year, publisher, DOI, type, URL
DBLP¶
API Endpoint: `https://dblp.org/search/publ/api`
Primary Use: Computer science bibliography
Data Quality: Very high for CS publications
Coverage: CS conferences, journals, proceedings
Search Method: Title + author search
Best For: Computer science papers, especially conference proceedings
Fields Provided: title, authors, year, venue (journal/conference)
Semantic Scholar¶
API Endpoint: `https://api.semanticscholar.org/graph/v1/paper/search`
Primary Use: AI-powered academic search
Data Quality: Good - AI-enhanced but may have errors
Coverage: Broad academic literature
Search Method: Title + author search
Best For: Finding missing DOIs or metadata for obscure papers
Fields Provided: title, authors, year, venue, DOI, externalIds
PubMed¶
API Endpoint: `https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi`
Primary Use: Biomedical and life sciences literature
Data Quality: Very high for medical publications
Coverage: Medical journals, biomedical research
Requires: PMID (PubMed ID) in BibTeX entry
Best For: Papers with PMID identifiers
Fields Provided: title, authors, year, journal
OpenAlex¶
API Endpoint: `https://api.openalex.org/works`
Primary Use: Comprehensive academic metadata
Data Quality: Good - comprehensive but may be less precise
Coverage: Very broad - aggregates multiple sources
Search Method: DOI lookup or title search
Best For: Fallback when other sources fail, or for comprehensive metadata
Fields Provided: title, authors, publication_year, venue, DOI, volume, issue, pages
Special: Provides detailed bibliographic information (volume, issue, pages)
Priority Order and Conflict Resolution¶
When multiple sources provide data for the same field, the validator uses this priority order:
Crossref (highest priority - most authoritative for DOIs)
Zenodo (high priority - official repository)
arXiv (high priority - official preprint source)
DBLP (medium priority - excellent for CS)
DataCite (medium priority - official data registry)
PubMed (medium priority - excellent for medical)
Semantic Scholar (lower priority - AI-enhanced)
OpenAlex (lowest priority - comprehensive fallback)
Rationale: Crossref and Zenodo, as the official registry and repository, rank first; the domain-specific and data sources (arXiv, DBLP, DataCite, PubMed) follow; general search engines (Semantic Scholar, OpenAlex) come last.
Recursive Enrichment¶
The validator implements a “recursive enrichment” feature that discovers missing identifiers:
Process:
If an entry has no DOI, search secondary sources (DBLP, Semantic Scholar, OpenAlex, PubMed)
If a DOI is found in secondary sources, fetch from Crossref/Zenodo/DataCite
If an entry has no arXiv ID, check if secondary sources mention one
If an arXiv ID is found, fetch from arXiv API
Example Flow:
Entry (no DOI, has title)
→ Search DBLP → Found DOI: 10.1234/example
→ Fetch Crossref → Got full metadata
→ Check Crossref → Found arXiv ID: 1234.5678
→ Fetch arXiv → Got preprint metadata
This ensures maximum metadata coverage even when identifiers are missing.
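The process above can be sketched as a small loop that follows each newly discovered identifier once. Everything here is illustrative: the three fetcher callables are hypothetical stand-ins for the real API clients, and `eprint` is assumed to be the field that holds the arXiv ID.

```python
from typing import Callable, Dict, Optional

Metadata = Dict[str, str]
Fetcher = Callable[[str], Optional[Metadata]]


def enrich(entry: Metadata, search_by_title: Fetcher,
           fetch_by_doi: Fetcher, fetch_by_arxiv: Fetcher) -> Metadata:
    """Fill in missing fields by chasing identifiers found along the way."""
    known = dict(entry)
    followed = set()  # identifiers already used for a fetch

    def merge(found: Optional[Metadata]) -> None:
        # Only fill gaps; never overwrite a value that is already present.
        for field, value in (found or {}).items():
            if not known.get(field):
                known[field] = value

    # No DOI: ask the title-search sources whether they know one.
    if not known.get("doi") and known.get("title"):
        merge(search_by_title(known["title"]))

    progressed = True
    while progressed:  # each identifier is followed at most once, so this ends
        progressed = False
        if known.get("doi") and "doi" not in followed:
            followed.add("doi")
            merge(fetch_by_doi(known["doi"]))
            progressed = True
        if known.get("eprint") and "eprint" not in followed:
            followed.add("eprint")
            merge(fetch_by_arxiv(known["eprint"]))
            progressed = True
    return known


# Stub fetchers that mimic the example flow above.
result = enrich(
    {"title": "An Example Paper"},
    search_by_title=lambda title: {"doi": "10.1234/example"},
    fetch_by_doi=lambda doi: {"journal": "Example Journal", "eprint": "1234.5678"},
    fetch_by_arxiv=lambda arxiv_id: {"eprinttype": "arxiv"},
)
print(result)
```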
9. Data Flow and Workflow¶
Understanding the complete data flow helps you use the GUI effectively.
Validation to Display Flow¶
sequenceDiagram
participant User
participant GUI
participant Backend
participant APIs as External APIs
participant File as BibTeX File
User->>GUI: Launch --gui
GUI->>Backend: Load BibTeX file
Backend->>Backend: Parse entries
Backend->>APIs: Fetch metadata (parallel)
APIs-->>Backend: Return metadata
Backend->>Backend: Compare & validate
Backend->>Backend: Generate ValidationResult
Backend-->>GUI: Return results
GUI->>GUI: Display summary & entries
User->>GUI: Select entry
GUI->>Backend: GET /api/entry/{key}
Backend-->>GUI: Return comparison data
GUI->>GUI: Render comparison table
User->>GUI: Accept field
GUI->>Backend: POST /api/save
Backend->>Backend: Update entry
Backend->>File: Save to .bib file
Backend-->>GUI: Success response
GUI->>GUI: Reload entry & update stats
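The two HTTP calls in the diagram could be scripted against the local server as sketched below. The paths and the default port come from this guide; the JSON body of the save request (`key`, `field`, `value`) is a guess made for illustration and may not match the real payload. The requests are only built, not sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:8010"  # default port mentioned in this guide


def entry_request(key: str) -> Request:
    """Build GET /api/entry/{key} for one BibTeX entry."""
    return Request(f"{BASE_URL}/api/entry/{quote(key, safe='')}", method="GET")


def save_request(key: str, field: str, value: str) -> Request:
    """Build POST /api/save; the payload field names are hypothetical."""
    body = json.dumps({"key": key, "field": field, "value": value}).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/save",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


print(entry_request("smith2023").full_url)
print(save_request("smith2023", "year", "2024").get_method())
```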
Typical Workflow¶
Launch and Validate
Run `bibtex-validator references.bib --gui`
Validator fetches metadata from all available sources
Summary displays global statistics
Review Summary
Check attention pie chart for overall status
Review statistics (reviews, conflicts, differences, identical)
Decide whether to use “Accept All Entries” or review individually
Navigate Entries
Use dropdown, arrow keys, or Previous/Next buttons
Focus on entries with badges (`+N`, `!N`)
Review Comparison Table
Check each field’s status badge
Compare BibTeX vs API values
Select source if multiple available (dropdown)
Make Decisions
Accept: Click Accept button for individual fields
Reject: Click Reject button to keep original
Bulk Actions: Use Accept All / Reject All for entry
Global: Use Accept All Entries for everything
Verify Changes
Changes are saved immediately upon acceptance
Entry reloads to show updated state
Statistics update in real-time
10. Tips and Best Practices¶
Efficient Review Process¶
Start with Summary: Use the attention pie chart to gauge overall quality
Focus on Conflicts: Prioritize entries with conflict badges (`!N`)
Use Keyboard Navigation: Arrow keys and PageUp/PageDown speed up navigation
Batch When Safe: Use “Accept All” for entries where you trust the API sources
Verify Critical Fields: Always manually review conflicts in title, author, and year
Source Selection Strategy¶
Trust Official Sources: Crossref and Zenodo are most reliable
Domain-Specific: Use DBLP for CS, PubMed for medical
Multiple Sources: When multiple sources agree, confidence is higher
Source Disagreement: If sources conflict, prefer Crossref for published papers
Handling Conflicts¶
Year Mismatches: Often indicate preprint vs published version - verify publication date
Author Differences: Check for name variations, middle initials, or order
Journal Names: Abbreviations vs full names are common - usually safe to accept API version
Title Differences: Usually minor (capitalization, punctuation) - review carefully
Performance Considerations¶
Large Bibliographies: Use keyboard shortcuts for faster navigation
Network Delays: API fetching happens during validation, not in GUI
Save Frequency: Each Accept action saves immediately - no batch save needed
Conclusion¶
The BibTeX Validator GUI provides a powerful, intuitive interface for reviewing and enriching your bibliography. By understanding the badge systems, keyboard shortcuts, and data sources, you can efficiently validate large bibliographies while maintaining control over every change.
For command-line usage, see the Usage Guide. For details on the validation logic, see the Internal Logic documentation.