The Shifting Landscape of Research Analytics: From Data Access to Trustworthy Insights

Mar 2026

I watched the "Vibe Coding with OpenAlex" video recently, and it's been rattling around in my head since.

The barrier has dropped

A researcher who used to need a vendor's tools or a specialist analyst can now point an LLM at a CSV of publications and get something meaningful in an afternoon. NotebookLM, Cursor, ChatGPT with code interpreter - these tools have commoditised a lot of what used to be specialist craft. Knowing how to wrangle data, write the right pandas operations, build a decent visualisation: it's still a skill, but it's no longer a bottleneck.
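To make that concrete, here is a minimal sketch of the kind of afternoon analysis I mean: publications per year and open access share from a CSV export. The column names (doi, year, open_access) and the sample rows are made up for illustration, and I've stuck to the standard library so it runs anywhere; in pandas the same thing is a short groupby.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample standing in for a CSV export of publications.
SAMPLE_CSV = """doi,year,open_access
10.1000/a1,2023,True
10.1000/a2,2023,False
10.1000/a3,2024,True
10.1000/a4,2024,True
10.1000/a5,2024,False
"""


def summarise_by_year(csv_file):
    """Count publications per year and compute the open access share."""
    counts = defaultdict(int)
    oa_counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        year = int(row["year"])
        counts[year] += 1
        if row["open_access"].strip().lower() == "true":
            oa_counts[year] += 1
    return {
        year: {
            "publications": counts[year],
            "oa_share": oa_counts[year] / counts[year],
        }
        for year in sorted(counts)
    }


summary = summarise_by_year(io.StringIO(SAMPLE_CSV))
for year, stats in summary.items():
    print(year, stats["publications"], f"{stats['oa_share']:.0%}")
```

Swap the io.StringIO wrapper for open("your_export.csv", newline="") to run it against a real file with the same columns.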

The community is noticing. People at ISSI or STI who wouldn't have called themselves coders are now shipping notebooks.

What vendors still have

The traditional pitch - "we make the data accessible and usable" - is eroding. But it hasn't gone. The remaining advantages are real, just narrower:

Data quality and curation is still genuinely hard. An LLM can't conjure a correctly disambiguated author profile from messy affiliation strings. The judgement calls embedded in our data - what counts as the same person, how to handle name variants, how to classify access type - represent accumulated domain knowledge. You can't replicate that just by having a model and a raw corpus.
Coverage and timeliness still matter. The gap between Dimensions and something patched together from Unpaywall + OpenAlex + Crossref is real, especially at scale and for recent data.
Trustworthiness as infrastructure. Institutions and funders need something they can cite in a report to government. That requires provenance, audit trails, a vendor who can be held accountable. An ad-hoc notebook doesn't cut it.
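The disambiguation point above is easy to underestimate, so here is a toy illustration. This is a deliberately naive sketch, not how any vendor actually does it: it reduces each name to "surname + first initial", which is roughly what a first attempt tends to look like. The names and the naive_author_key helper are invented for the example.

```python
import unicodedata
from collections import defaultdict


def naive_author_key(name):
    """Reduce a name to 'surname initial' (hypothetical, deliberately naive)."""
    # Strip accents so that 'Müller' and 'Muller' compare equal.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    ascii_name = ascii_name.replace(".", " ").strip()
    if "," in ascii_name:
        # 'Surname, Given' form
        surname, given = [part.strip() for part in ascii_name.split(",", 1)]
    else:
        # 'Given Surname' form: treat the last token as the surname
        parts = ascii_name.split()
        surname, given = parts[-1], " ".join(parts[:-1])
    return f"{surname.lower()} {given[:1].lower()}"


# Invented name strings: three plausible variants of one person, plus one
# different person who happens to share a surname and first initial.
names = ["Müller, Anna", "A. Muller", "Anna Müller", "Andreas Müller"]

clusters = defaultdict(list)
for name in names:
    clusters[naive_author_key(name)].append(name)

for key, members in clusters.items():
    print(key, "->", members)
```

All four strings land in the same cluster, so the variants of Anna are merged correctly while Andreas is wrongly folded in with her. Telling them apart takes extra evidence (affiliations, co-authors, identifiers) and a decision about what counts as enough, and that judgement is the curated part.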

Where the value is moving

The role of a data vendor is genuinely shifting. Clients increasingly don't want a database to query - they want answers and frameworks.

That means becoming more opinionated: providing benchmarks, recommended metrics, pre-classified outputs. Less neutral data pipe, more methodology partner. The value-add is less about retrieval and more about knowing what to measure and why.

Something interesting is also happening with risk. Clients used to question their own capability to analyse data. Now many feel they have that capability - so the concern has shifted to whether the underlying data is trustworthy enough to build on. That's a subtle inversion, and it increases scrutiny on data quality. Pressure for some vendors, opportunity for others.

The vendors who thrive in five years will be the ones who made their data legible to AI workflows, not just to human analysts.

Cite this blog post:

Michele Pasin. The Shifting Landscape of Research Analytics: From Data Access to Trustworthy Insights. Blog post on www.michelepasin.org. Published on March 3, 2026.
