On 2nd Feb, I launched a web dashboard for tracking the spread of the recent coronavirus (COVID-19) outbreak, which provides a real-time view of global confirmed, recovered, and death cases. So far it has attracted more than 17,000 active users and has been shared almost 3,500 times on social media. I am really glad that I could contribute my skills to help people in this global emergency, and I especially want to thank
Correlation is one of the most fundamental statistical concepts, used in almost every sector.
For example, in portfolio management, correlation is often used to measure the amount of diversification among the assets in a portfolio. Choosing assets with low or negative correlation with each other can help to reduce the risk of a portfolio. In marketing research, correlations also give insights into marketing strategies and business outcomes, which help marketers make actionable decisions and, ultimately, grow businesses.
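To make the diversification point concrete, here is a minimal sketch that computes the Pearson correlation coefficient between pairs of return series using only the standard library. The function name `pearson_correlation` and the return figures are assumptions invented for illustration; they are not real market data and not taken from the post.

```python
import math


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily returns, made up purely for illustration.
asset_a = [0.01, -0.02, 0.015, 0.03, -0.01]
asset_b = [0.012, -0.018, 0.01, 0.025, -0.008]  # tends to move with asset_a
asset_c = [-0.01, 0.02, -0.015, -0.03, 0.01]    # exact mirror image of asset_a

corr_ab = pearson_correlation(asset_a, asset_b)
corr_ac = pearson_correlation(asset_a, asset_c)
print(f"a vs b: {corr_ab:.3f}")
print(f"a vs c: {corr_ac:.3f}")
```

In this toy data, `asset_a` and `asset_b` are strongly positively correlated, so holding both adds little diversification, whereas `asset_c` moves in the opposite direction to `asset_a` and would offset its swings.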
Python is slow.
I bet you have encountered this argument against using Python many times, especially from people who come from the C, C++, or Java world. It is true in many cases: for instance, looping over or sorting Python arrays, lists, or dictionaries can sometimes be slow. After all, Python was developed to make programming fun and easy, so its gains in succinctness and readability come at a cost in performance.
The prerequisite for any data-related operation in Python, such as data cleansing, data aggregation, data transformation, and data visualisation, is to load the data into Python. Depending on the type of data file (e.g. .csv, .txt, .tsv, .html, .json, Excel spreadsheets, relational databases, etc.) and its size, different methods should be applied to handle this initial step. In this post, I will list some common methods for importing data in Python.
Last week, I shared how to make a dashboard to track the spread of coronavirus using Dash in Python, which gives you a real-time overview of the number of global coronavirus cases, including confirmed, recovered, and death cases, and their distribution on a world map.
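As a rough illustration of how the loading method depends on the file type, the sketch below reads comma-separated, tab-separated, and JSON data with the standard library's `csv` and `json` modules. The sample data, the in-memory strings standing in for files, and the helper name `read_delimited` are all assumptions made for this example; the post itself may use other tools.

```python
import csv
import io
import json

# Small in-memory samples standing in for files on disk (hypothetical data).
csv_text = "name,cases\nA,10\nB,25\n"
tsv_text = "name\tcases\nA\t10\nB\t25\n"
json_text = '[{"name": "A", "cases": 10}, {"name": "B", "cases": 25}]'


def read_delimited(text, delimiter=","):
    """Parse delimited text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


csv_rows = read_delimited(csv_text)                  # .csv: comma-separated
tsv_rows = read_delimited(tsv_text, delimiter="\t")  # .tsv: tab-separated
json_rows = json.loads(json_text)                    # .json: keeps number types

print(csv_rows)
print(json_rows)
```

Note that the `csv` module returns every field as a string, so numeric columns need an explicit conversion, whereas `json.loads` preserves numbers as Python `int` or `float`. For a real file, `io.StringIO(text)` would be replaced by an opened file object.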
In the first version, we implemented basic Dash functions and obtained a static application interface. In other words, except for the native interactions offered by plotly (e.
Last month, I published four posts to share my experience of using matplotlib. Because it offers full control over the elements of a graph, matplotlib is regarded as a fundamental Python library for data visualisation and is used by many other libraries (e.g. seaborn and pandas) as their plotting module. This is also why I think learning matplotlib is essential for a data science practitioner: it helps to build an in-depth understanding of the logic behind data visualisation tools.
From my previous posts about the hierarchical structure of matplotlib plotting and the many ways to instantiate axes, we can see that these features give matplotlib great potential for creating highly complex and customisable visualisations. To demonstrate this and also improve my own understanding of matplotlib, I set out to make an infographic using matplotlib this week.
An infographic generally combines visual imagery, data charts, and minimal text together.
Although matplotlib is extremely powerful and the only limitation might be our imagination, it can be challenging for new users to find the right path, as there is always more than one way to achieve the same goal in matplotlib. Calling axes is one such case.
Let’s say you have just decided to make plots using the object-oriented interface (aka artist layer plotting) in matplotlib. I bet you will soon run into problems when trying to instantiate axes to start plotting.
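To illustrate why this step can be confusing, here is a minimal sketch of three common ways to instantiate axes with the object-oriented interface. It assumes matplotlib is installed and selects the non-interactive `Agg` backend so that it runs without a display; the variable names and grid layouts are arbitrary choices for this example, not necessarily the ones the post goes on to use.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; no window is opened
import matplotlib.pyplot as plt

# 1. plt.subplots: create the figure and a grid of axes in one call.
fig1, axes = plt.subplots(nrows=1, ncols=2)

# 2. Figure.add_subplot: add axes one at a time on a (rows, cols, index) grid.
fig2 = plt.figure()
ax_left = fig2.add_subplot(1, 2, 1)
ax_right = fig2.add_subplot(1, 2, 2)

# 3. Figure.add_axes: place axes freely with [left, bottom, width, height],
#    each given as a fraction of the figure size.
fig3 = plt.figure()
ax_free = fig3.add_axes([0.1, 0.1, 0.8, 0.8])

# Whichever route is taken, plotting then happens through the Axes object.
ax_free.plot([0, 1, 2], [0, 1, 4])
ax_free.set_title("Axes created with add_axes")
```

All three routes end with the same kind of `Axes` object. They differ mainly in how much of the layout matplotlib manages: `plt.subplots` suits regular grids, while `add_axes` gives full manual control over position.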