Research Graph Schema (Version 3.5)

Research Objects

The Research Graph is composed of five primary node types, also known as research objects:

  • Researcher: An individual who conducts scientific or scholarly study, often within a specific field of knowledge.
  • Publications: Academic publications include journal articles, books, patents, conference abstracts, and theses, serving as the primary medium for disseminating research findings.
  • Research Data: Data that is used or generated in the course of academic studies or research projects.
  • Grants: Awards and grants are financial support provided by funding bodies (like government agencies, foundations, or institutions) to researchers or research projects.
  • Organisations: This encompasses institutions like universities, research institutes, and academic societies.

The following tables describe the metadata (properties, types, and constraints) for each node (research object). Each table is followed by an example JSON representation of the research object.

Researcher

Property Type Description
key ◆ string Unique URI for research object identification
📐 Format: @source/@local_id
source ◆ string Metadata source identifier
💡 Use domain name without extension
📝 Examples: orcid, datacite, nih
local_id ◆ string Record identifier in source database
last_updated ◆ dateTime Update timestamp in Research Graph
full_name ◆ string Complete name when parts unavailable
â„šī¸ Fallback for missing first/last names
first_name string First name (English preferred)
last_name string Last name (English preferred)
url string Resource URL
orcid string ORCID ID (xxxx-xxxx-xxxx-xxxx)
🔗 Structure documentation
scopus_author_id string Scopus author identifier

◆ Required field

Example Record

{
  "key": "orcid/0000-0002-4259-9774",
  "source": "orcid",
  "local_id": "0000-0002-4259-9774",
  "last_updated": "2024-03-15T14:30:00Z",
  "full_name": "Amir Aryani",
  "first_name": "Amir",
  "last_name": "Aryani",
  "url": "https://orcid.org/0000-0002-4259-9774",
  "orcid": "0000-0002-4259-9774",
  "scopus_author_id": "35068996400"
}
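
The required properties and the key format described above can be checked mechanically. The following is a minimal sketch in Python and is not part of the schema itself: the function name `validate_researcher` and its error messages are hypothetical, and the ORCID pattern assumes that the final character is either a digit or `X`.

```python
import re
from datetime import datetime

# Properties marked ◆ (required) in the Researcher table.
REQUIRED = ("key", "source", "local_id", "last_updated", "full_name")

# Assumed ORCID shape: xxxx-xxxx-xxxx-xxxx, last character a digit or X.
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def validate_researcher(record: dict) -> list:
    """Return a list of problems found in a Researcher record (empty if none)."""
    errors = [f"missing required property: {name}"
              for name in REQUIRED if not record.get(name)]
    if errors:
        return errors

    # key must follow the documented format @source/@local_id.
    expected_key = f"{record['source']}/{record['local_id']}"
    if record["key"] != expected_key:
        errors.append(f"key should be {expected_key!r}")

    # last_updated is a dateTime; accept the trailing 'Z' used in the example.
    try:
        datetime.fromisoformat(record["last_updated"].replace("Z", "+00:00"))
    except ValueError:
        errors.append("last_updated is not a valid dateTime")

    # orcid is optional, so it is only checked when present.
    orcid = record.get("orcid")
    if orcid is not None and not ORCID_PATTERN.match(orcid):
        errors.append("orcid does not match xxxx-xxxx-xxxx-xxxx")
    return errors


example = {
    "key": "orcid/0000-0002-4259-9774",
    "source": "orcid",
    "local_id": "0000-0002-4259-9774",
    "last_updated": "2024-03-15T14:30:00Z",
    "full_name": "Amir Aryani",
    "orcid": "0000-0002-4259-9774",
}
print(validate_researcher(example))  # prints []
```

The sketch assumes that all property values are already strings, as in the example record; a production validator would also check types.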

Publication

Property Type Description
key ◆ string Unique URI for research object identification
📐 Format: researchgraph.ai/@source/@local_id
source ◆ string Metadata source identifier
💡 Use domain name without extension
📝 Examples: orcid, datacite, nih
local_id ◆ string Record identifier in source database
last_updated ◆ dateTime Update timestamp in Research Graph database
title ◆ string Publication title
author_list ◆ string Comma-separated list of co-authors
â„šī¸ Complements title for easy recognition
publication_type string Publication type
🔗 ORCID work types
url string Resource URL
doi string Digital Object Identifier
📐 Format: doi_prefix/doi_suffix
isbn string International Standard Book Number
publication_year integer Publication year (YYYY format)
scopus_eid integer Scopus EID
Unique academic work identifier in Scopus

◆ Required field

Example Record

{
 "key": "crossref/10.1038/sdata.2018.99",
 "source": "crossref",
 "local_id": "10.1038/sdata.2018.99",
 "last_updated": "2024-03-15T14:30:00Z",
 "title": "A Research Graph dataset for connecting research data repositories using RD-Switchboard",
 "author_list": "Aryani, A., Poblet, M., Unsworth, K., Wang, J., Evans, B., Devaraju, A., Hausstein, B., Klas, C.-P., Zapilko, B., Kaplun, S.",
 "publication_type": "journal-article",
 "url": "https://doi.org/10.1038/sdata.2018.99",
 "doi": "10.1038/sdata.2018.99",
 "publication_year": 2018
}
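
The `doi` property is stored in the bare `doi_prefix/doi_suffix` form, without a resolver URL. As an illustration only, the sketch below splits a DOI into its two parts and rejects values that are not in that form. The helper name `split_doi` is hypothetical, and the check relies on the fact that DOI prefixes begin with `10.`.

```python
def split_doi(doi: str) -> tuple:
    """Split a DOI into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = doi.partition("/")
    # DOI prefixes start with the directory indicator "10."
    if not sep or not suffix or not prefix.startswith("10."):
        raise ValueError(f"not in doi_prefix/doi_suffix form: {doi!r}")
    return prefix, suffix


print(split_doi("10.1038/sdata.2018.99"))  # prints ('10.1038', 'sdata.2018.99')
```

A value such as `https://doi.org/10.1038/sdata.2018.99` raises `ValueError` in this sketch, because the resolver URL belongs in the `url` property rather than in `doi`.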

Dataset

Property Type Description
key ◆ string Unique URI for research object identification
📐 Format: researchgraph.ai/@source/@local_id
source ◆ string Metadata source identifier
💡 Use domain name without extension
📝 Examples: orcid, datacite, nih
local_id ◆ string Record identifier in source database
last_updated ◆ dateTime Update timestamp in Research Graph database
title ◆ string Dataset title
author_list ◆ string Comma-separated list of creators or curators
â„šī¸ Complements title for easy recognition
url string Resource URL
doi string Digital Object Identifier
📐 Format: doi_prefix/doi_suffix
publication_year integer Publication year (YYYY format)

◆ Required field

Example Record

{
 "key": "zenodo/10.5281/zenodo.4939953",
 "source": "zenodo",
 "local_id": "10.5281/zenodo.4939953",
 "last_updated": "2024-03-15T14:30:00Z",
 "title": "People with disability participated in labour force 2018",
 "author_list": "Woo, J., Aryani, A.",
 "url": "https://doi.org/10.5281/zenodo.4939953",
 "doi": "10.5281/zenodo.4939953",
 "publication_year": 2021
}

Grant

Property Type Description
key ◆ string Unique URI for research object identification
📐 Format: @source/@local_id
source ◆ string Metadata source identifier
💡 Use domain name without extension
📝 Examples: orcid, datacite, nih
local_id ◆ string Record identifier in source database
💡 Best practice: grant_number
last_updated ◆ dateTime Update timestamp in Research Graph database
title ◆ string Grant title
url string Resource URL
purl string Persistent URL
📝 Example: http://purl.org/au-research/grants/arc/SR0567397
doi string Digital Object Identifier
📐 Format: doi_prefix/doi_suffix
publication_year integer Publication year (YYYY format)
funder string Funding organisation
💡 Best practice: funder domain address
📝 Example: arc.gov.au
funding_amount long Funding amount
📝 Example: 450000
funding_currency string Currency code
💡 Use ISO 4217 codes
🔗 ISO 4217 reference
start_year integer Grant start year (YYYY)
end_year integer Grant end year (YYYY)

◆ Required field

Example Record

{
 "key": "arc/DP210103512",
 "source": "arc",
 "local_id": "DP210103512",
 "last_updated": "2024-03-15T14:30:00Z",
 "title": "Advanced Machine Learning Techniques for Climate Change Prediction",
 "url": "https://dataportal.arc.gov.au/NCGP/Web/Grant/Grant/DP210103512",
 "purl": "http://purl.org/au-research/grants/arc/DP210103512",
 "doi": "10.13039/501100000923",
 "publication_year": 2021,
 "funder": "arc.gov.au",
 "funding_amount": 875000,
 "funding_currency": "AUD",
 "start_year": 2021,
 "end_year": 2024
}
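
The funding properties of a grant lend themselves to a few simple consistency checks. The sketch below is one possible approach under stated assumptions, not a rule defined by the schema: the function name `check_grant_funding` is hypothetical, the currency check only tests the three-uppercase-letter shape of ISO 4217 alphabetic codes (it does not look the code up in the ISO list), and treating `start_year > end_year` or a negative amount as an error is an assumption.

```python
import re

# ISO 4217 alphabetic codes are three uppercase letters (e.g. AUD).
CURRENCY_CODE = re.compile(r"^[A-Z]{3}$")


def check_grant_funding(grant: dict) -> list:
    """Return a list of problems in the optional funding properties."""
    problems = []
    amount = grant.get("funding_amount")
    currency = grant.get("funding_currency")

    if amount is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(amount, int) or isinstance(amount, bool) or amount < 0:
            problems.append("funding_amount should be a non-negative integer")
        if currency is None:
            problems.append("funding_amount given without funding_currency")

    if currency is not None and not (
        isinstance(currency, str) and CURRENCY_CODE.match(currency)
    ):
        problems.append("funding_currency is not a three-letter code")

    start, end = grant.get("start_year"), grant.get("end_year")
    if start is not None and end is not None and start > end:
        problems.append("start_year is after end_year")
    return problems


grant = {"funding_amount": 875000, "funding_currency": "AUD",
         "start_year": 2021, "end_year": 2024}
print(check_grant_funding(grant))  # prints []
```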

Organisation

Property Type Description
key ◆ string Unique URI for research object identification
📐 Format: @source/@local_id
source ◆ string Metadata source identifier
💡 Use domain name without extension
📝 Examples: orcid, datacite, nih
local_id ◆ string Record identifier in source database
last_updated ◆ dateTime Update timestamp in Research Graph database
name ◆ string Organisation name
💡 Best practice: use English label from WikiData records
url string Organisation URL
📝 Example: http://www.monash.edu/
grid string GRID identifier
📝 Example: grid.1002.3
doi string Digital Object Identifier
📐 Format: doi_prefix/doi_suffix
ror string ROR identifier
📝 Example: 02bfwt286
isni string ISNI identifier
📝 Example: 0000000419367857
wikidata string Wikidata identifier
📝 Example: Q598841
country string Country
📝 Example: Australia
city string City
📝 Example: Melbourne
latitude float Latitude coordinate
📝 Example: -37.908333333333
longitude float Longitude coordinate
📝 Example: 145.13805555556

◆ Required field

Example Record

{
 "key": "ror/02bfwt286",
 "source": "ror",
 "local_id": "02bfwt286",
 "last_updated": "2024-03-15T14:30:00Z",
 "name": "Monash University",
 "url": "http://www.monash.edu/",
 "grid": "grid.1002.3",
 "doi": "10.13039/501100001779",
 "ror": "02bfwt286",
 "isni": "0000000419367857",
 "wikidata": "Q598841",
 "country": "Australia",
 "city": "Melbourne",
 "latitude": -37.908333333333,
 "longitude": 145.13805555556
}
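
The `latitude` and `longitude` properties are plain floats, so a range check is a cheap way to catch swapped or malformed coordinates before loading a record. The following sketch is illustrative only: the function name `has_valid_coordinates` is hypothetical, and rejecting a record that supplies only one of the two coordinates is an assumption rather than a schema rule.

```python
def has_valid_coordinates(org: dict) -> bool:
    """Check the optional latitude/longitude pair of an Organisation record."""
    lat, lon = org.get("latitude"), org.get("longitude")
    if lat is None and lon is None:
        return True   # both properties are optional
    if lat is None or lon is None:
        return False  # assumption: one coordinate without the other is invalid
    # Latitude ranges over [-90, 90] degrees, longitude over [-180, 180].
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


monash = {"latitude": -37.908333333333, "longitude": 145.13805555556}
print(has_valid_coordinates(monash))  # prints True
```

Swapping the two values of the example record fails the check, because 145.13805555556 is outside the valid latitude range.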

Design Principles 

The Research Graph schema is designed to enable the transformation of scholarly information into accessible and interoperable networks. It follows these design principles:

  • Ease of use: The schema makes graph databases easy for software developers to collect, connect, and work with.
  • Highly interoperable with PID systems: The schema enables Research Graph content to be easily linked to PID systems such as ORCID, Crossref, DataCite, and other providers of persistent identifiers. It also enables the Research Graph to be easily translated to other schemas using the PID ecosystem.
  • Highly computable: The schema keeps Research Graph content light and scalable, providing the foundation for storing and computing very large graphs using only personal computers instead of high-performance computing.

In line with these principles, the schema is deliberately not comprehensive: it does not attempt to cover a complete set of metadata for all possible data sources. Instead, it encourages Research Graph systems to use external databases to augment the graph data when more comprehensive analysis is needed.