Neurostimulation & Language Database — Elements of the Neurobiology of Language
Language Elements is a living systematic review database of neurostimulation studies examining the causal role of brain regions in language processing. It compiles findings from transcranial magnetic stimulation (TMS), transcranial electrical stimulation (tES), and direct electrical stimulation (DES) studies — the only techniques capable of establishing causal brain–language relationships rather than correlational ones.
The database is built around elementalism: a theoretical framework that characterises brain regions by the minimal computational operations they causally support, inferred bottom-up across heterogeneous tasks (following Genon et al., 2018). This high-specificity approach distinguishes Language Elements from existing resources, which typically catalogue findings at the level of broad linguistic domains.
The database is designed to serve both basic research — supporting experimental planning and meta-analytic synthesis — and intraoperative clinical decision-making, providing evidence-based task recommendations for awake craniotomy language mapping.
The systematic review (PROSPERO: CRD42024602006) searched PubMed, Scopus, Embase, and PsycInfo from October 2024 to January 2025, returning 12,763 records. After deduplication and screening, the current database includes:
The database is live and updated as screening and extraction continue.
A paper describing the database and the elementalism framework is currently in submission. In the meantime, please cite the PROSPERO registration:
Language Elements is an international collaboration between research teams in the UK and Germany.
This work was supported by the British Council Going Global Partnerships Springboard Programme (UK–Germany).
For queries about the database, the systematic review, or potential collaborations, please contact T. R. Williamson at t.williamson@uwe.ac.uk.
Neuroscientist Mode searches the database directly. Every result is drawn verbatim from the systematic review dataset — no inference, no generation. Free-text search and filters (stimulation type, linguistic area, hemisphere, inhibition/facilitation) operate on the raw data. The brain visualiser plots MNI coordinates extracted directly from the included papers.
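The verbatim search behaviour described above can be illustrated with a simple filter over record dictionaries. This is a rough sketch only: the field names (`text`, `stimulation`, `hemisphere`) are hypothetical, not the database's actual schema.

```python
def filter_records(records, query="", stim_type=None, hemisphere=None):
    """Filter systematic-review records verbatim: substring match on the
    free-text field plus exact field filters. No inference, no generation.
    Field names are illustrative assumptions, not the real schema."""
    q = query.lower()
    results = []
    for record in records:
        if q and q not in record["text"].lower():
            continue  # free-text query not found in this record
        if stim_type and record["stimulation"] != stim_type:
            continue  # stimulation-type filter (e.g. TMS, tES, DES)
        if hemisphere and record["hemisphere"] != hemisphere:
            continue  # hemisphere filter
        results.append(record)
    return results
```

Because every result is a record returned as-is, anything the user sees can be traced back to an included paper.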
Type a brain region name to begin. The tool runs in two stages: first characterising the functional profile of the region, then generating intraoperative task recommendations based on that characterisation.
When you search a region, the tool identifies all neurostimulation studies in the database targeting that region and extracts the linguistic processes they implicate. These processes are then grouped into operation-function labels by one of three layers, tried in order, each falling back to the next:
Processes are matched against a curated controlled vocabulary of operation-function labels. If coverage is sufficient, results are grouped instantly using this vocabulary alone — no network request is made. This layer is fully deterministic and reproducible across sessions.
When controlled vocabulary coverage is insufficient, an AI model (Claude, Anthropic) infers operation-function labels from the identified processes, following the bottom-up characterisation framework of Genon et al. (2018). The model receives only the list of processes and the region name — it does not access individual paper content. Results may vary slightly between sessions due to the probabilistic nature of language models.
If neither layer 1 nor layer 2 succeeds — controlled-vocabulary coverage is insufficient and the AI call fails or no internet connection is available — the tool displays all identified processes in a flat list ranked by study count. All data remains drawn directly from the systematic review.
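The three-layer fallback above can be sketched roughly as follows. Everything here is an illustrative assumption rather than the tool's actual implementation: the coverage threshold, the `infer_with_ai` callable, and the return shapes are all hypothetical.

```python
def group_processes(processes, vocabulary, study_counts,
                    infer_with_ai=None, coverage_threshold=0.8):
    """Group linguistic processes into operation-function labels,
    falling through the three layers described above. The threshold
    and data shapes are assumptions for illustration."""
    # Layer 1: deterministic controlled-vocabulary match (no network request)
    matched = {p: vocabulary[p] for p in processes if p in vocabulary}
    if processes and len(matched) / len(processes) >= coverage_threshold:
        return {"layer": 1, "labels": matched}

    # Layer 2: AI inference over the process list only (no paper content)
    if infer_with_ai is not None:
        try:
            return {"layer": 2, "labels": infer_with_ai(processes)}
        except Exception:
            pass  # fall through on network error, timeout, or service failure

    # Layer 3: flat list of processes, ranked by study count
    ranked = sorted(processes, key=lambda p: study_counts.get(p, 0),
                    reverse=True)
    return {"layer": 3, "ranked": ranked}
```

Note that layer 1 alone is fully deterministic; only layer 2 introduces session-to-session variability, and layer 3 guarantees the tool still returns review data when both earlier layers are unavailable.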
Once a region has been characterised, clicking Generate Recommendations runs a three-step process:
All tasks used in neurostimulation studies of this region are retrieved from the database and organised by the processes they target. Tasks are deduplicated and prepared for the AI, which receives the full task evidence for this region alongside the functional characterisation from the previous stage.
An AI model selects and ranks tasks from the database evidence, applying clinical constraints from two peer-reviewed guardrail documents loaded at runtime:
The AI is instructed to recommend tasks from database evidence only, flag data gaps explicitly, label every recommendation by evidence type and DES compatibility confidence, and never extrapolate beyond the evidence base. All recommendations cite their source papers.
If the AI call fails — due to a network error, a timeout, or a service interruption — you will see: "Failed to generate recommendations. Please check your API key and try again." The tool does not produce unsupported output. No recommendations are shown unless they are grounded in the database evidence and the clinical guardrail documents. In this case, the functional characterisation from the previous stage remains visible and can still be used to inform clinical judgement.
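The retrieval, recommendation, and failure-path steps above can be sketched as a single flow. The data shapes (`process`/`name` fields), the `call_ai` callable, and the return structure are assumptions for illustration, not the tool's real interface.

```python
FAILURE_MESSAGE = ("Failed to generate recommendations. "
                   "Please check your API key and try again.")

def recommend_tasks(region_tasks, characterisation, guardrails, call_ai):
    """Sketch of the three-step recommendation flow described above.
    Field names and call signatures are hypothetical."""
    # Step 1: deduplicate tasks and organise them by targeted process
    by_process = {}
    for task in region_tasks:
        by_process.setdefault(task["process"], set()).add(task["name"])
    evidence = {proc: sorted(names) for proc, names in by_process.items()}

    # Step 2: the AI selects and ranks tasks under the clinical guardrails,
    # receiving the full task evidence plus the functional characterisation
    try:
        return call_ai(evidence, characterisation, guardrails)
    except Exception:
        # Failure path: no unsupported output is produced; the earlier
        # characterisation remains available to inform clinical judgement
        return {"error": FAILURE_MESSAGE,
                "characterisation": characterisation}
```

The key design point is that the failure path returns the error message and the prior characterisation only; it never substitutes generated recommendations for database-grounded ones.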
The confidence dots summarise relative evidence strength; they are not validated clinical risk scores and should not replace professional surgical judgement. The raw confidence score ranges from 1 to 7 points (up to 5 for study count, +1 if any study used DES, +1 if total N > 50); this is scaled to a 1–5 dot display. A score of ●●●●● therefore represents strong relative evidence within this dataset, not perfect or complete evidence.
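The scoring rule above can be written out directly. The 1–7 point formula follows the description; the exact 7-to-5 scaling is not specified here, so the linear rounding below is an assumption.

```python
def confidence_score(study_count, any_des, total_n):
    """Raw evidence score, 1-7 points, per the rule described above."""
    points = min(study_count, 5)        # up to 5 points for study count
    points += 1 if any_des else 0       # +1 if any study used DES
    points += 1 if total_n > 50 else 0  # +1 if total N > 50
    return points

def to_dots(points):
    """Scale the 1-7 raw score to a 1-5 dot display.
    The linear mapping is an illustrative assumption."""
    dots = max(1, min(5, round(points * 5 / 7)))
    return "●" * dots + "○" * (5 - dots)
```

Under this sketch, only a region with five or more studies, at least one DES study, and a combined N above 50 reaches the full ●●●●● display.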