Who I am, what I do and what I like


My name is Angelica Lo Duca, alias alod83. I’m a Computer Scientist with a PhD in Computer Science. I work in research and technology at the Institute of Informatics and Telematics of the National Research Council, Italy.

My research interests include: Data Science, Machine Learning, Text Analytics, Data Visualisation…

Data Analysis

Ready-to-run code covering preprocessing, parameter tuning, model training, and evaluation.

Image by Buffik from Pixabay

In this short tutorial I illustrate a complete data analysis process built on the scikit-learn Python library. The process includes:

  • preprocessing, which includes feature selection, normalization and balancing
  • model selection with parameter tuning
  • model evaluation

The code for this tutorial can be downloaded from my GitHub repository.
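As a rough illustration of the three steps listed above (this is not the code from the repository), the sketch below chains feature selection, normalization, and a classifier in a scikit-learn `Pipeline`, tunes it with `GridSearchCV`, and evaluates it on a held-out test set. The synthetic dataset, the choice of `LogisticRegression`, the parameter grid, and the use of `class_weight="balanced"` as a stand-in for balancing are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic, imbalanced dataset standing in for the tutorial's data (assumption).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=5,
    weights=[0.8, 0.2], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Preprocessing and model in one pipeline, so every step is refit per CV fold.
pipeline = Pipeline([
    ("select", SelectKBest(score_func=f_classif)),   # feature selection
    ("scale", MinMaxScaler()),                       # normalization to [0, 1]
    # class_weight="balanced" reweights classes instead of resampling (assumption).
    ("model", LogisticRegression(class_weight="balanced", max_iter=1000)),
])

# Parameter tuning: a small, hypothetical grid over two pipeline steps.
param_grid = {"select__k": [4, 8, 12], "model__C": [0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)

# Evaluation on data the search never saw.
best_params = search.best_params_
report = classification_report(y_test, search.predict(X_test))
print(best_params)
print(report)
```

Keeping the preprocessing inside the pipeline means the selector and scaler are fitted only on the training folds during the search, which avoids leaking information from the validation data.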

Load Dataset

Firstly, I load…


A step-by-step tutorial on how to install Trino, connect it to a SQL server, and write a simple Python Client.

Photo by Hermes Rivera on Unsplash

Trino is a distributed, open source SQL query engine for Big Data Analytics. It runs queries in a distributed and parallel fashion, which makes it very fast. Trino can run both on-premises and in cloud environments, such as Google, Azure, and Amazon.

In this tutorial, I describe how to install Trino locally…


A tutorial on how to build and query a hierarchical table using a relational database.

Photo by benjamin lehman on Unsplash

Given its flat nature, a relational database is not well suited to representing hierarchical data. However, with a few tricks, you can turn a relational database into good storage for hierarchical data.

In this article, I cover the following topics:

  • definition of hierarchical data
  • how to convert hierarchical data into a…
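One common way to store a hierarchy in a flat table is an adjacency list, where each row points to its parent. The sketch below is a minimal example of that idea using Python's built-in `sqlite3` module and a recursive query. The `category` table, its columns, and the sample rows are hypothetical, and the article itself may use a different representation.

```python
import sqlite3

# In-memory database; table name, columns and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE category (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        parent_id INTEGER REFERENCES category(id)  -- NULL marks a root node
    )
    """
)
conn.executemany(
    "INSERT INTO category VALUES (?, ?, ?)",
    [
        (1, "Electronics", None),
        (2, "Computers", 1),
        (3, "Laptops", 2),
        (4, "Phones", 1),
        (5, "Books", None),
    ],
)

# Recursive common table expression: start at one node, then repeatedly
# join children onto the rows found so far, tracking the depth.
descendants_query = """
WITH RECURSIVE subtree(id, name, depth) AS (
    SELECT id, name, 0 FROM category WHERE id = ?
    UNION ALL
    SELECT c.id, c.name, s.depth + 1
    FROM category AS c
    JOIN subtree AS s ON c.parent_id = s.id
)
SELECT name, depth FROM subtree ORDER BY depth, name
"""
rows = conn.execute(descendants_query, (1,)).fetchall()
for name, depth in rows:
    print("  " * depth + name)
```

The query returns the node with id 1 and everything beneath it; unrelated roots such as "Books" are left out.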

Machine Learning

The second episode of the scikit-learn series, which explains the well-known Python Library for Machine Learning

Tuning parameters in sklearn.cluster
Image by Author

Clustering is an unsupervised Machine Learning technique, where there is neither a training set nor predefined classes. Clustering is used when there are many records, which should be grouped according to similarity criteria, such as distance.

A clustering algorithm takes a dataset as input and returns a list of labels…
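To make the input and output of a clustering algorithm concrete, here is a minimal sketch using `KMeans` from `sklearn.cluster`. The two synthetic groups of points and the choice of `n_clusters=2` are assumptions made for the example; the article may cover other algorithms and settings.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated synthetic groups of 2-D points (illustrative data).
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.5, size=(20, 2))
group_b = rng.normal(loc=10.0, scale=0.5, size=(20, 2))
X = np.vstack([group_a, group_b])

# No training labels are supplied: the algorithm only sees the records.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(X)

# One label per input record; records in the same group share a label.
print(labels)
```

The label values themselves are arbitrary identifiers: which group gets 0 and which gets 1 carries no meaning, only the grouping does.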

Some tips on how to become a successful Artificial Intelligence blogger, signed by one of the Medium Top Writers.

Photo by Maranda Vandergriff on Unsplash

Writing a blog article could be a simple thing. But writing a successful article is not quite as simple. I had the pleasure of interviewing Dario Radečić, one of the most popular Medium bloggers on the subject of Artificial Intelligence. …

Machine Learning

Some tips on how to optimize the development process of a Machine Learning model in order to avoid surprises during the deployment phase.

Photo by Nick Owuor (astro.nic.visuals) on Unsplash

Eventually I was able to breathe a sigh of relief: my Machine Learning model works perfectly on both the training set and the test set. All the metrics I use to measure its performance score very high.

Web Programming

Learn arrays and loops in PHP by implementing a simple service that allows users to log in to a system and to read articles.

Photo by Florian Olivo on Unsplash

In this tutorial I illustrate a practical example to learn arrays in PHP. The idea is to implement a service that allows users to log in to a system and to view articles. The service receives the user’s username and password via GET. Do not worry if you do not…

Angelica Lo Duca

Top 1000 Medium Writer in May, June and July 2021. I write on Data Science, Python, Tutorials and, occasionally, Web Applications.
