☕
Humans love coffee, or at least the coolest ones do. You probably wake up every day, and one of the first things you do automatically is prepare a good cup of coffee. You may not care about the story behind it or what kind it is; as long as it gives you that caffeine hit you need, you're good to go. But if you're as curious as I am, you might have wondered: Where does this coffee come from? Which countries consume the most coffee? What type of coffee is the best-selling?
The purpose of this page is to illustrate statistics on Colombian coffee exports.
The data comes directly from the National Federation of Coffee Growers of Colombia.
These visualizations aim to provide a more detailed insight into international trade patterns.
Colombia is the third-largest coffee exporter in the world, accounting for 8% of global exports.
There are many different types of coffee beans and products made from them. Here, we’re focusing only on the top exported coffee products.
Green coffee refers to the raw seeds of coffee cherries that have been separated, or "processed," but not yet roasted. These beans develop their intense aroma, acidity, body, and flavor during the roasting process. This is the one you probably drink at Starbucks.
Soluble: Also known as instant coffee, this is coffee that has been brewed and then dehydrated so that it dissolves easily. Not my favorite.
Roasted Coffee: This is just raw green coffee beans dropped into loaders and then into a rotating drum. The drum is pre-heated to a temperature of 240 degrees Celsius.
Extracts: A product made by combining coffee beans and alcohol to create a concentrated coffee flavoring that can be used in baked goods, ice cream, and cocktails.
The United States is by far the country that imports the most coffee from Colombia. In 2023, Colombia exported $1.35 billion worth of coffee to the US. Info here
Coffee exports have fluctuated over the years, showing no strong upward or downward trend, but rather seasonal variations. Export volumes mostly range between 800K and 1.4M bags.
🤖☕📈
I like coffee, but I also enjoy math and discovering interesting insights hidden in data. In this case, I want to apply a few machine learning concepts to predict future trends and detect outliers in coffee exports.
Machine Learning (ML) is a branch of artificial intelligence (AI) that enables computers to learn patterns from data and make predictions or decisions without being explicitly programmed.
Time series forecasting is used to predict future values based on trends in past data. The mathematics behind it relies on statistical and machine learning models.
For anomaly detection, we will identify unexpected spikes or drops in coffee exports. To achieve this, we'll use Isolation Forest, a machine learning model designed to detect outliers (anomalies), applied here to the monthly export time series.
The latest data we have is from December 2024, and we want to predict coffee exports for the next five months.
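To make the idea of a five-month forecast concrete, here is a minimal sketch in Python. The page does not say which model produced its forecast, so this is only a stand-in: a seasonal naive baseline that repeats the value seen in the same month one year earlier. The function name seasonal_naive_forecast is hypothetical, and the monthly figures are made-up placeholders within the 800K to 1.4M range, not FNC data.

```python
def seasonal_naive_forecast(history, horizon, season_length=12):
    """Forecast `horizon` steps ahead by repeating the value from one season earlier."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    start = len(history) - season_length  # first index of the most recent season
    # For each future step, reuse the value at the same position in the last season.
    return [history[start + step % season_length] for step in range(horizon)]


# Placeholder monthly exports in bags, January 2023 through December 2024.
# These numbers are illustrative only and are NOT real FNC figures.
history = [
    1_050_000, 980_000, 1_020_000, 870_000, 900_000, 1_010_000,
    940_000, 990_000, 1_060_000, 1_100_000, 1_180_000, 1_250_000,
    1_070_000, 1_000_000, 1_040_000, 890_000, 930_000, 1_030_000,
    960_000, 1_010_000, 1_080_000, 1_120_000, 1_210_000, 1_300_000,
]

# Forecast January to May 2025 (five months after the last observation).
forecast = seasonal_naive_forecast(history, horizon=5)
for month, value in zip(["Jan", "Feb", "Mar", "Apr", "May"], forecast):
    print(f"{month} 2025: {value:,} bags")
```

A baseline like this is mostly useful as a yardstick: a more elaborate statistical or machine learning model is only worth its complexity if it beats the baseline on data it has not seen.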
The error percentage between the forecasted value (964,207) and the actual value (1,150,000) for January 2025 is approximately 16.16%.
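For anyone who wants to check the arithmetic, the figure above is the absolute difference between the forecast and the actual value, divided by the actual value. A small Python sketch, using only the two numbers quoted on this page (the function name percentage_error is just an illustrative choice):

```python
def percentage_error(forecast, actual):
    """Absolute forecast error expressed as a percentage of the actual value."""
    return abs(actual - forecast) / actual * 100


# Values quoted above for January 2025.
error = percentage_error(forecast=964_207, actual=1_150_000)
print(f"{error:.2f}%")  # prints 16.16%
```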
Our prediction isn’t too bad, but it could be significantly better.
There are external factors that this model does not account for, such as demand, policies, weather conditions, and global trade dynamics.
Anomaly detection helps us identify outliers. The most obvious one, just by looking at the graph, is the export volume in May 2021, with only 345,017 bags exported.
This represents a 52% decrease compared to the amount exported just one month earlier. This drop occurred due to road blockades across the country as part of the National Strike, which lasted for over a month.
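To show why an Isolation Forest singles out a month like May 2021, here is a from-scratch sketch in Python. It is not the code behind this page; in practice you would more likely reach for a library implementation such as the IsolationForest class in scikit-learn. The idea is that random cut points isolate an unusual value in very few splits, so a short average path through the trees means a high anomaly score. All function names are hypothetical. Only the May 2021 value comes from this page; April is roughly what the 52% drop implies, and the other months are made-up placeholders.

```python
import math
import random


def avg_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup among n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(values, rng, depth, max_depth):
    """Split the values at random cut points until isolated or the depth limit is hit."""
    if depth >= max_depth or len(values) <= 1 or min(values) == max(values):
        return {"size": len(values)}  # leaf: remember how many points ended up here
    split = rng.uniform(min(values), max(values))
    return {
        "split": split,
        "left": build_tree([v for v in values if v < split], rng, depth + 1, max_depth),
        "right": build_tree([v for v in values if v >= split], rng, depth + 1, max_depth),
    }


def path_length(tree, x, depth=0):
    """Splits needed to reach x's leaf, plus a correction for leaves holding several points."""
    if "size" in tree:
        return depth + avg_path_length(tree["size"])
    branch = "left" if x < tree["split"] else "right"
    return path_length(tree[branch], x, depth + 1)


def anomaly_scores(values, n_trees=200, seed=0):
    """Return a score in (0, 1) for each value; higher means more anomalous."""
    rng = random.Random(seed)
    max_depth = math.ceil(math.log2(len(values)))
    trees = [build_tree(values, rng, 0, max_depth) for _ in range(n_trees)]
    norm = avg_path_length(len(values))
    scores = []
    for x in values:
        mean_depth = sum(path_length(tree, x) for tree in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / norm))
    return scores


months = [f"2021-{m:02d}" for m in range(1, 13)]
# Bags exported per month. Only May (345,017) is taken from this page;
# April is approximately implied by the 52% drop; the rest are placeholders.
exports = [
    1_080_000, 1_130_000, 1_010_000, 719_000, 345_017, 1_190_000,
    1_090_000, 1_040_000, 1_150_000, 980_000, 1_120_000, 1_280_000,
]

scores = anomaly_scores(exports)
most_anomalous = months[scores.index(max(scores))]
print("Most anomalous month:", most_anomalous)
```

Because the sample is tiny, every tree here is built on all twelve points; the original algorithm subsamples the data for each tree, which only matters for larger datasets.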
This project also aims to showcase how an ETL data pipeline works, along with the technologies and methodologies behind it. If you're interested in the nerdy details, you can read more by clicking here!
Preliminary figures correspond to the provisional declared value in the export declaration. The FNC (NFC) publishes these figures for illustrative purposes and they should not be understood as an official source for the value of coffee exports. The official source is the consolidated and final figures regularly published by the DIAN.