I've been using Google Colab on a regular basis during the last few months, as I was curious to see whether I could make the switch to it from a more traditional Jupyter/JupyterLab environment. As it turns out, Colab is pretty amazing in many respects, but there are still situations where a local Jupyter notebook is my first choice. Keep reading to discover why!
Google Colaboratory (also known as Colab, see the FAQs) is a free Jupyter notebook environment that runs in the cloud and stores its notebooks on Google Drive.
Colab has become extremely popular with data scientists, particularly those working on machine learning tasks. Part of this popularity is due to Colab's deep integration with Google's ML tools (e.g., TensorFlow); in fact, Colab even lets you switch to a Tensor Processing Unit (TPU) when running your notebook. For FREE. Which, by itself, is pretty remarkable already.
There are tons of videos on YouTube and tutorials on Medium, so I'm not going to describe Colab any further here; there is no shortage of learning materials if you want to find out more about it.
I normally turn to notebooks because I need to demonstrate real-world applications of APIs to a (sometimes not-so-technical) audience. A lot of the work I've been doing lately has crystallized into the 'Dimensions API Labs' portal. This is essentially a collection of notebooks aimed at making it easier for people to extract and process the many kinds of data my company's APIs can deliver, and to turn them into actionable insights.
My usual workflow:

- Getting data by calling APIs, sometimes using custom-built Python packages
- Processing the data using pandas or built-in Python libraries
- Building visualizations and summaries using tools like Plotly
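To make this pipeline a bit more concrete, here is a minimal sketch of the kind of notebook cell I'd typically write. The API endpoint, query parameters, and field names are hypothetical placeholders, not the actual Dimensions API:

```python
import requests
import pandas as pd
import plotly.express as px

# 1. Get data by calling an API (hypothetical endpoint and response format)
response = requests.get(
    "https://api.example.com/v1/publications",
    params={"q": "machine learning", "limit": 100},
)
records = response.json()["results"]

# 2. Process the data with pandas
df = pd.DataFrame(records)
summary = df.groupby("year").size().reset_index(name="publications")

# 3. Build a quick visualization with Plotly
fig = px.bar(summary, x="year", y="publications",
             title="Publications per year (sample query)")
fig.show()
```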
My target audience:

- Data scientists and developers who want to become proficient with our APIs
- Analysts and domain experts who are less technically advanced, but have the capacity to turn interesting research questions into queries and API-based workflows
Read on to find out how Colab ticked a lot of the boxes for this kind of work.
In general, Jupyter notebooks are an ideal tool for showcasing API functionalities and data features. The ability to pack together code, images, and text within a single runnable file makes the end result intuitive yet powerful.
Google Colab brings a number of extra benefits to the table:
No install setup. That was a massive selling point for me. If I have to share an API recipe with just about anyone, Colab allows me to do that very quickly, even with non-technical users. They just have to open up a webpage, hit 'play', and run the notebook. Moreover, Colab includes many popular Python libraries by default and, if you need to, you can pip-install your own favorite ones too. Neat!
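For example, installing an extra package takes a single cell (the package name below is just an example), and many common libraries are already there:

```python
# Install an extra package on the fly from a notebook cell
# (the package name is just an example)
!pip install dimcli

# Many popular libraries are preinstalled, so often no installation is needed
import pandas as pd
print(pd.__version__)
```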
It scales well. I ran a couple of workshops recently with 30+ users, without any performance issues. Compared to setting up a JupyterHub server, it's much easier and cheaper too, of course. Plus, people can go home and re-run the same notebooks in virtually the same environment. No need to fiddle with Python, Docker, or Jupyter packages.
Sharing and commenting. The collaborative features of Colab need no introduction. Just think of how easy it is to share a Google Doc with your colleagues—only in this case you'd do it with a notebook!
Playground mode. Colab introduced the notion of playground mode, which essentially allows you to open a notebook in read-only mode (trying to save throws the error "This notebook is in playground mode. Changes will not be saved unless you make a copy of the notebook."). I find this feature extremely handy for demos, or in situations where one needs to experiment with a notebook without the risk of overwriting its 'stable' state.
Snippets. Colab includes a sidebar with many useful code snippets by default. You can extend that easily by creating your own 'snippets' notebook, going to Tools > Preferences, pasting the snippets notebook URL in Custom snippet notebook URL, and saving. Simple and effective. The new snippets can be shared with teammates too!
Extra UI components. The Colab folks developed a syntax for generating Forms components from specially formatted code comments. This is very cool because it lets you generate simple input boxes, which can be used, for example, by non-technical people to enter data into a notebook. Also worth pointing out that forms are created using comment-like code (e.g., #@param {type:"string"}), so they don't interfere with the notebook if you open it within a traditional Jupyter environment.
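Here is a minimal example of what such a form looks like in a code cell; the variable names and values are just illustrative:

```python
#@title A simple input form
query = "machine learning"  #@param {type:"string"}
max_results = 100  #@param {type:"integer"}
include_preprints = False  #@param {type:"boolean"}

# In Colab each #@param line renders as an input widget; in plain Jupyter
# these are just comments, so the cell still runs unchanged.
print(f"Searching for '{query}' (max {max_results}, preprints={include_preprints})")
```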
The Google ecosystem. The integration with the rest of the G-Suite is unsurprisingly amazing, so pulling/putting data in and out of Drive, Sheets, or BigQuery is quick, easy, and well-documented.
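For instance, mounting your Google Drive into the Colab runtime takes a couple of lines (the CSV path at the end is a hypothetical example):

```python
# Mount Google Drive inside the Colab runtime
# (the first run prompts you to authorize access)
from google.colab import drive
drive.mount('/content/drive')

# Drive files then appear on the local file system
import pandas as pd
df = pd.read_csv('/content/drive/My Drive/mydata.csv')  # hypothetical file
```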
Performance limitations. Of course, the performance will never be as good as running things locally (having said that, you can even use GPUs for free, but I haven't tried that yet). So for bigger projects involving complex algorithms or very large datasets, other data science platforms are probably better, e.g., Gigantum.
Interface learning curve. You have to get used to the Colab interface. It somehow still feels a bit more fiddly than JupyterLab, to me. Keyboard shortcuts can be a problem too: you can customize them in Colab, but I couldn't replicate all of my (rather heavily customized) JupyterLab ones, due to conflicts with other default ones in Colab. So some muscle-memory pain there.
Exporting to HTML is not that good. Being able to turn Jupyter notebooks into a simple HTML file is pretty handy, but Colab can't do that. You can, of course, download the .ipynb file and then export locally (via nbconvert), but that doesn't always produce the results you'd expect either. For example, Plotly visualizations (like this one) don't render properly unless I run the whole notebook locally in JupyterLab before exporting.
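For reference, the local conversion step is a one-liner (the filename is just an example), run from a terminal or a local notebook cell:

```python
# Convert a downloaded notebook to HTML locally with nbconvert
!jupyter nbconvert --to html my_notebook.ipynb
```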
Some Python libraries won't work out of the box. For example, I have a Python library called dimcli that builds on the latest prompt-toolkit. Turns out that Colab, by default, runs IPython 5.5.0 (latest version is 7), which is incompatible with prompt-toolkit. You can, of course, upgrade everything on Colab (e.g., pip install --upgrade --force-reinstall library-name), which is great; however, that may lead to further dependency errors, and so on.
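A quick way to check what the runtime is using, and to force an upgrade if you're willing to accept the risk of dependency clashes (prompt-toolkit here is just the example from my own case):

```python
# Check the IPython version shipped with the Colab runtime
import IPython
print(IPython.__version__)

# Force-upgrade a library and its dependencies; this may break other
# preinstalled packages, and usually requires restarting the runtime
!pip install --upgrade --force-reinstall prompt-toolkit
```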
Project versioning. Colab includes a built-in revision history tool, and it can integrate with GitHub too. Yet, I often end up creating multiple copies/versions of a notebook, instead of relying on the revisions system. I wish there was a better way to do this.
The Google ecosystem (again). As much as this can be a massive plus for some people (see above), it can also be a massive problem for others. Some customers I work with don't have access to G-Suite, full stop. That's not so uncommon, especially with large enterprises that are concerned about data privacy.
Google Colab is simply great for small/medium data projects. Hats off to the developers who built it. Some features are totally neat, and whenever I intend to share what I'm doing with more than one person, I immediately hit my New Colab Document shortcut.
Nonetheless, I still use JupyterLab a lot, for a variety of projects. For example, for quick personal data investigations. Or for projects that I know will be shared only with other data scientists (who need no guidance in order to run them). Or for projects with long-running processes and high memory consumption.
So the two things need to coexist.
The main challenge I'm facing now is: how to seamlessly move from one environment to the other? Here's what I learned so far:
- Check whether 'google.colab' is in sys.modules to run code selectively based on the platform, as in the sketch below.
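In practice, the check looks something like this; the Colab-only branch (installing an extra package) is just an illustrative example:

```python
import sys

# True only when the notebook is running on Google Colab
IN_COLAB = 'google.colab' in sys.modules

if IN_COLAB:
    # Colab-only setup, e.g. install packages that aren't preinstalled
    # (the package name is just an example)
    !pip install dimcli
else:
    # Running locally (e.g. JupyterLab): assume the environment is already set up
    pass

print("Running on Colab" if IN_COLAB else "Running locally")
```

Makes sense? If you know of a better way to do this, I'd love to know!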