Who I am, what I do and what I like


My name is Angelica Lo Duca, alias alod83. I’m a Computer Scientist with a PhD in Computer Science. I work in the Research and Technology fields at the Institute of Informatics and Telematics of the National Research Council of Italy.

My research interests include: Data Science, Machine Learning, Text Analytics, Data Visualisation…

Data Analysis

Ready-to-run code covering preprocessing, parameter tuning, and model training and evaluation.

Image by Buffik from Pixabay

In this short tutorial I illustrate a complete data analysis process that uses the scikit-learn Python library. The process includes:

  • preprocessing, which includes feature selection, normalization and balancing
  • model selection with parameter tuning
  • model evaluation

The code of this tutorial can be downloaded from my GitHub repository.
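The three steps listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the tutorial's own code: the dataset is synthetic, the choice of `SelectKBest`, `StandardScaler` and `LogisticRegression` is hypothetical, and balancing is approximated with `class_weight="balanced"` rather than resampling.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced dataset (an assumption; the tutorial loads its own data).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5,
    weights=[0.8, 0.2], random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Preprocessing and model in one pipeline, so each cross-validation fold
# fits the scaler and the feature selector on its own training part only.
pipeline = Pipeline([
    ("scale", StandardScaler()),                    # normalization
    ("select", SelectKBest(score_func=f_classif)),  # feature selection
    # class_weight="balanced" stands in for an explicit balancing step.
    ("model", LogisticRegression(class_weight="balanced", max_iter=1000)),
])

# Model selection with parameter tuning: exhaustive search over a small grid.
param_grid = {
    "select__k": [5, 10, 20],
    "model__C": [0.1, 1.0, 10.0],
}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)

# Model evaluation on data that was not used during tuning.
best_params = search.best_params_
test_f1 = search.score(X_test, y_test)  # F1, because scoring="f1"
print(best_params, round(test_f1, 3))
```

Keeping the preprocessing inside the pipeline is a design choice: it prevents information from the validation folds leaking into the scaler or the feature selector during tuning.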

Load Dataset

Firstly, I load…

Machine Learning

An overview of how to integrate Comet in GitLab, with a practical example to make them work together

Photo by Lopez Robin on Unsplash

Comet is a meta Machine Learning experimentation platform, providing many features to track, compare, optimize and monitor experiments and models. In practice, Comet permits tracking datasets and code changes. It also provides tools to improve productivity and collaboration, including panels and reports.

Recently, Comet has been integrated with GitLab, a…

Data Science Discussions

Give your tired eyes a break and let AI efficiently auto-generate reports you don’t need to read

Photo by Andy Kelly on Unsplash

Stop wasting time rewriting the same report over and over again. With the spread of AI writing tools, reports get written automatically without human interaction, saving you time and effort. The task takes just a few seconds to set up, so you can stop writing from scratch every time.


Data Manipulation

An overview of the GeoPandas Python library, with a step-by-step example

Photo by GeoJango Maps on Unsplash

Data science applications often require working with data in geographic space. Shapefiles are files that store geospatial data organized using a file-based database. Shapefiles are used by GIS professionals, local government agencies, and businesses for mapping and analysis.

In this blog post I will describe an elegant way of working…

Machine Learning

Some basic concepts of overfitting in Machine Learning and some tips to mitigate it.

Photo by h heyerlein on Unsplash

If you’ve invested some time in learning Machine Learning, you’ve likely come across the term overfitting. Overfitting is a common problem and no single framework is immune. It’s not uncommon for algorithms to overfit from their first implementation. However, we can prevent this from happening by knowing how overfitting affects…
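To make the idea concrete, here is a minimal sketch of overfitting, assuming scikit-learn and a synthetic noisy dataset (both are my assumptions, not taken from the article). An unrestricted decision tree memorizes the training set, while a depth-limited tree cannot, so comparing training and test accuracy exposes the gap.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with 20% of the labels flipped at random (hypothetical setup).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5,
    flip_y=0.2, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# No depth limit: the tree keeps splitting until it fits the noise as well.
deep_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
# Limiting the depth is one simple way to constrain model complexity.
shallow_tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

deep_train = deep_tree.score(X_train, y_train)
deep_test = deep_tree.score(X_test, y_test)
shallow_train = shallow_tree.score(X_train, y_train)
shallow_test = shallow_tree.score(X_test, y_test)

print(f"deep tree:    train={deep_train:.2f}  test={deep_test:.2f}")
print(f"shallow tree: train={shallow_train:.2f}  test={shallow_test:.2f}")
```

A large difference between training and test accuracy is the usual symptom of overfitting; the depth limit here is only one of several possible mitigations.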

In 2021 I wrote 95 articles, gained 1192 followers, and got to know many people who share my interests.

Photo by Jingda Chen on Unsplash

I started blogging for fun two years ago, in February 2020. I had no idea what I was doing. I still remember my first article on Towards Data Science, entitled Dataset Manipulation with Open Refine. You can’t imagine my joy when Ben Huberman and his staff accepted my first article!

Data Visualization

Some tricks and tips on how to improve the readability of a bar chart through the popular Python library for Data Viz

Photo by Nathan Dumlao on Unsplash

I recently read a very interesting book by Jose Berengueres, entitled Introduction to Data Visualization & Storytelling: A Guide For The Data Scientist. In this book, the author describes many techniques to extract a very interesting story from a dataset. …

Angelica Lo Duca

4x Top 1000 Medium Writer | +40k monthly views | I write on Data Science, Python, Tutorials and, occasionally, Web Applications.
