Mar 2026

The Shifting Landscape of Research Analytics: From Data Access to Trustworthy Insights


I watched the "Vibe Coding with OpenAlex" video recently, and it's been rattling around in my head since.

The barrier has dropped

A researcher who used to need a vendor's tools or a specialist analyst can now point an LLM at a CSV of publications and get something meaningful in an afternoon. NotebookLM, Cursor, ChatGPT with code interpreter - these tools have commoditised a lot of what used to be specialist craft. Knowing how to wrangle data, write the right pandas operations, build a decent visualisation: it's still a skill, but it's no longer a bottleneck.
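
To make that concrete, here is roughly the kind of thing an LLM will hand back in seconds. It's a minimal sketch using only Python's standard library; the column names (`year`, `open_access`), the inline sample rows and the function name are all made up for illustration, not taken from any real export.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for a publications CSV export.
SAMPLE_CSV = """title,year,open_access
Paper A,2023,gold
Paper B,2023,closed
Paper C,2024,green
Paper D,2024,gold
Paper E,2024,closed
"""

def open_access_share_by_year(csv_text):
    """Return {year: fraction of rows whose open_access value is not 'closed'}."""
    totals = Counter()
    open_counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        year = int(row["year"])
        totals[year] += 1
        # Assumption: anything other than 'closed' counts as open access.
        if row["open_access"].strip().lower() != "closed":
            open_counts[year] += 1
    return {year: open_counts[year] / totals[year] for year in sorted(totals)}

shares = open_access_share_by_year(SAMPLE_CSV)
for year, share in shares.items():
    print(year, f"{share:.0%}")
```

Nothing here is hard, which is the point: the code is the easy part, and the result is only as good as the `open_access` column it reads from.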

The community is noticing. People at ISSI or STI who wouldn't have called themselves coders are now shipping notebooks.

What vendors still have

The traditional pitch - "we make the data accessible and usable" - is eroding. But it hasn't gone. The remaining advantages are real, just narrower:

  • Data quality and curation are still genuinely hard. An LLM can't conjure a correctly disambiguated author profile from messy affiliation strings. The judgement calls embedded in our data - what counts as the same person, how to handle name variants, how to classify access type - represent accumulated domain knowledge. You can't replicate that just by having a model and a raw corpus.
  • Coverage and timeliness still matter. The gap between Dimensions and something patched together from Unpaywall + OpenAlex + Crossref is real, especially at scale and for recent data.
  • Trustworthiness as infrastructure. Institutions and funders need something they can cite in a report to government. That requires provenance, audit trails, a vendor who can be held accountable. An ad-hoc notebook doesn't cut it.
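
The disambiguation point in the first bullet is easy to see with a toy example. This is a sketch only, assuming a deliberately naive matching key (surname plus first initial, accents stripped) and invented author records; it is not how any vendor actually does it, it just shows where the obvious approach breaks.

```python
import unicodedata
from collections import defaultdict

def naive_key(name):
    """Surname + first initial, accents stripped, lowercased. Deliberately naive."""
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    )
    if "," in ascii_name:
        # "Surname, Given" form
        surname, given = [part.strip() for part in ascii_name.split(",", 1)]
    else:
        # "Given Surname" form: assume the last token is the surname
        parts = ascii_name.split()
        surname, given = parts[-1], " ".join(parts[:-1])
    return f"{surname.lower()}_{given[:1].lower()}"

# Invented records: (name as printed, affiliation string).
records = [
    ("Müller, Anna", "Univ. of Vienna"),
    ("Anna Muller", "University of Vienna, Austria"),
    ("A. Müller", "Universität Wien"),
    ("Muller, Andreas", "ETH Zurich"),
]

clusters = defaultdict(list)
for name, affiliation in records:
    clusters[naive_key(name)].append((name, affiliation))

for key, members in clusters.items():
    print(key, [name for name, _ in members])
```

The key happily merges the three spellings of Anna, and then merges Andreas in with her too. Telling those apart takes the affiliation strings, co-authors, dates and a lot of judgement, which is exactly the curation work the bullet is talking about.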

Where the value is moving

The role of a data vendor is genuinely shifting. Clients increasingly don't want a database to query - they want answers and frameworks.

That means becoming more opinionated: providing benchmarks, recommended metrics, pre-classified outputs. Less neutral data pipe, more methodology partner. The value-add is less about retrieval and more about knowing what to measure and why.

Something interesting is also happening with risk. Clients used to question their own capability to analyse data. Now many feel they have that capability - so the concern has shifted to whether the underlying data is trustworthy enough to build on. That's a subtle inversion, and it increases scrutiny on data quality. Pressure for some vendors, opportunity for others.

The vendors who thrive in five years will be the ones who made their data legible to AI workflows, not just to human analysts.

Cite this blog post:


Michele Pasin. The Shifting Landscape of Research Analytics: From Data Access to Trustworthy Insights. Blog post on www.michelepasin.org. Published on March 3, 2026.
