Words | Michele Pasin

blog AI Skills Are a Cottage Industry - And That Might Be Permanent.

Jun 2026

Everyone in my organisation is building their own AI skills. Prompts for summarising documents, triaging emails, drafting updates, analysing data. Each one slightly different. Each one optimised for the person who built it.

blog What the data doesn't show: presenting at CLACSO's funding flows project.

Apr 2026

I was in Buenos Aires last week for the closing event of Tracking research funding flows in the Global South, a project run by CLACSO in partnership with CWTS/Leiden University and IDRC. I'd been invited to present on Dimensions as a funding data infrastructure - roughly fifteen minutes to explain what we collect, how it's structured, and where we stand on openness.

blog The Shifting Landscape of Research Analytics: From Data Access to Trustworthy Insights.

Mar 2026

I watched the "Vibe Coding with Open Alex" video recently, and it's been rattling around in my head since.

blog Event: Hayden Thorpe live in Pordenone.

Oct 2025

I went to see Hayden Thorpe for the first time the other night at Pordenone's Ex-Convento Live venue (Italy). Funnily enough, I had never heard of this musician before last week. I ran into him on Spotify and immediately found his music introspective and inspiring, full of simple yet deep melodies. I loved the performance and am looking forward to seeing more of him!

paper The Dimensions API: a domain specific language for scientometrics research.

Oct 2025 Frontiers in Research Metrics and Analytics, Oct 2025. https://doi.org/10.3389/frma.2025.1514938

We describe the Dimensions Search Language (DSL), a domain-specific language for bibliographic and scientometrics analysis. The DSL is the main component of the Dimensions API (version 2.12.0), which provides end-users with a powerful, yet simple-to-learn and use, tool to search, filter, and analyze the Dimensions database using a single entry point and query language. The DSL is the result of an effort to model the way researchers and analysts describe research questions in this domain, as opposed to using established paradigms commonly used by software developers e.g., REST or SOAP. In this article, we describe the API architecture, the DSL main features, and the core data model. We describe how it is used by researchers and analysts in academic and business settings alike to carry out complex research analytics tasks, like calculating the H-index of a researcher or generating a publications' citation network.

paper Enhancing the Accessibility of ORCID Public Data, now additionally hosted on Google BigQuery.

Jun 2025 4th International Conference on the Science of Science and Innovation, Copenhagen, Denmark, Jun 2025.

ORCID is committed to openness, exemplified by the annual release of its Public Data File since 2012. This dataset, encompassing all public ORCID records, has been downloaded over 190,000 times and serves as a resource for analyzing research community dynamics, scientific migrations, collaboration networks, and ORCID adoption trends. However, the file’s substantial size poses challenges for users lacking advanced data management skills, hindering exploratory analyses.

blog Event: Installation at Sydney Data Arena.

Mar 2025

My piece, "Dreamy Pianos - Study No 1 in C Minor," is part of a new sonic installation by Simon Porter, currently showcased at the University of Technology Sydney.

paper Alleanze Ingannevoli: Svelare il lato nascosto della ricerca.

Jan 2025 1° Congresso Nazionale sull’Integrità nella Ricerca, Rome, Italy, Jan 2025.

Introduced to the research world in 2024, forensic scientometrics (FoSci) is a new discipline developed to facilitate the analysis of publication data, co-authorship networks, institutional collaborations, and more. FoSci techniques bring to light aspects of scientific research that indicate potential risks, such as hidden participation in compromised research networks or ties to individuals or groups known for disseminating scientific output of dubious or fraudulent quality.

blog Unpacking OpenAlex topics classification.

Sep 2024

In this post I have taken a closer look at the classification of scientific disciplines in OpenAlex, a recently developed database of scientific works. The topics classification has been entirely generated computationally using a mix of citation clustering techniques and LLM-based labeling. The results, although not always so precise, are definitely worth exploring further.

paper Dimensions: Calculating Disruption Indices at Scale.

Sep 2024 Quantitative Science Studies, Sep 2024. https://doi.org/10.48550/arXiv.2309.06120

Evaluating the disruptive nature of academic ideas is a new area of research evaluation that moves beyond standard citation-based metrics by taking into account the broader citation context of publications or patents. The "CD index" and a number of related indicators have been proposed in order to characterise mathematically the disruptiveness of scientific publications or patents. This research area has generated a lot of attention in recent years, yet there is no general consensus on the significance and reliability of disruption indices. More experimentation and evaluation would be desirable, but it is hampered by the fact that these indicators are expensive and time-consuming to calculate, especially if done at scale on large citation networks. We present a novel method to calculate disruption indices that leverages the Dimensions cloud-based research infrastructure and reduces the computational time taken to produce such indices by an order of magnitude, as well as making available such functionalities within an online environment that requires no set-up efforts. We explain the novel algorithm and describe how its results align with preexisting implementations of disruption indicators. This method will enable researchers to develop, validate and improve mathematical disruption models more quickly and with more precision, thus contributing to the development of this new research area.