I have always wanted to create a lottery generator, either in Excel, Python, or R. I managed to find this tutorial (first published in 2011) on YouTube which creates a lottery generator using Microsoft Excel and basic VBA code. This tutorial is really well explained, and I was able to create a lottery generator in no time at all.
The image below is my version of the tutorial above. It has been updated, though, because the lottery now lets you pick from 59 numbers rather than the original 49.
However, I wanted to create a EuroMillions generator, which is based on choosing five numbers (from 1 to 50) and two numbers (from 1 to 12) called Lucky Stars. This means that a number can appear twice on one ticket, so the Excel spreadsheet needed a modification to account for this.
This is possible by creating a second rand() column for the Lucky Stars numbers. The tweak to the VBA links the Lucky Stars numbers to the columns in the generator tab, using the following code to produce the numbers:
I’ll use this generator to produce some statistics based on actual lotto results.
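For anyone who would rather try the idea in Python than in Excel, here is a minimal sketch of the same logic. It is not the VBA from the tutorial: the function name euromillions_ticket and its rng parameter are my own assumptions, and it simply draws the two groups from separate pools, which is why the same value can show up once in the main numbers and once in the Lucky Stars.

```python
import random


def euromillions_ticket(rng=None):
    """Return (main_numbers, lucky_stars) for one EuroMillions line.

    Hypothetical helper for illustration, not the tutorial's VBA.
    """
    rng = rng or random.Random()
    # Five distinct main numbers from 1 to 50.
    main_numbers = sorted(rng.sample(range(1, 51), 5))
    # Two distinct Lucky Stars from 1 to 12, drawn from a separate pool,
    # so a value such as 7 can appear in both groups on the same ticket.
    lucky_stars = sorted(rng.sample(range(1, 13), 2))
    return main_numbers, lucky_stars


if __name__ == "__main__":
    main_numbers, lucky_stars = euromillions_ticket()
    print("Main numbers:", main_numbers)
    print("Lucky Stars:", lucky_stars)
```

Passing a seeded random.Random(...) as rng makes the draw repeatable, which is handy when checking the generator against real results.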
Data analyst positions require some knowledge of Power BI or Tableau as a way to visualise your data. I typically used maps and reports to present my research, but when pivoting into a data role it is necessary to learn a visualisation tool so that your output can be seen by others. Power BI is a Microsoft product used by many companies, and whilst the software is free to download, publishing your work requires a licence. In contrast, Tableau has a paid and a free (public) option, so users can get to grips with the software and publish a dashboard for free! So I decided to finally try to create a Tableau Public dashboard using the tutorial published on YouTube shown below. Credit to Nestor Adrianzen for publishing this content.
Here is the result of my Tableau Public dashboard.
I love coding, but I have always stuck to R as my main language because I felt so comfortable with it. Although I learnt Python during the lockdown of 2020 and uploaded projects using it to my GitHub last year, I have tended not to use it much and I really needed a refresher. Whilst more people sign up to Coursera or Udemy, I have found the courses at freeCodeCamp to be quite solid in their teaching. Therefore, I have begun the Data Analysis with Python certification.
Alongside this, I am refreshing my basic GCSE maths, so I thought it would be quite cool to code in Python alongside it. Today I have uploaded the first of my Jupyter notebooks (there are more to come!).
The following link goes to the GitHub repo containing the first notebook.
I have always loved looking at maps. I find them fascinating, as they come in many forms (street, geology, land cover etc.) that can show a great deal of spatial information. It was this passion which persuaded me to study Geography and Geology at university, and I loved specialising in Geographical Information Systems as it allowed me to create my own maps in a professional environment. Since my undergraduate days I have relied too much on the Esri ArcPro environment, which is only available under licence and can be quite expensive for making maps as a hobby. Yes, there is QGIS, but I really wanted to learn how to map geographical data through Python code. The major benefit of coding is that you have complete control over the process. I first experienced coding with Python when I attended two NCAS scientific computing courses in 2018. Since then I didn’t have access to the resources to carry on coding (Linux in particular) until this year, when I purchased a desktop PC for myself (waahey!).
Refreshing my Linux and Python knowledge
So before I actually jumped into coding maps with Python, I had to refresh my knowledge and relearn the Linux environment. I first had to run a Linux environment from my Windows OS and then install Anaconda from the command line. I used the tutorial below to run Kali Linux on my system.
Refreshing my NCAS training
The next step was to refresh my NCAS training. This was a great exercise to do and I really enjoyed my time on the course. The Jupyter notebooks are available on my GitHub:
The next task was to refresh my basic Python knowledge. During lockdown I learnt basic Python via the freeCodeCamp YouTube channel. I created a Python repo which will be updated with random Python scripts to support my ongoing learning.
Now that I had refreshed my knowledge of using Python for geographic data, it was time to apply it to my own dataset. I love Eurovision and I really wanted to show geographically how many times each country has won the contest. Whilst the statistical analysis was done with a PostgreSQL database and R, the map production work was done in a Python environment using the geopandas package! I was so happy to produce the following map:
Now there is still room for improvement (the labelling!), but I was so happy to be able to create this from scratch, and it looks like a pretty decent map. During this project I learnt how to merge tables, reproject map projections and create polygons with code: basic GIS processes that I was so used to doing in Esri ArcPro. The repo for this project is available at:
Eurovision 2021 has just been held in Rotterdam and Italy have won for the first time in over 30 years. The winning song was sung in Italian, and I began to wonder which languages and countries have won throughout Eurovision history. Since 1999 countries have been able to send songs in any language, so I thought it would be a great idea to look at data from 1956 to 1998 and from 1999 to 2021. I therefore created a project which used R, Python and Docker to answer the following questions:
How many participants have taken part since Eurovision began?
What languages were the winning songs from 1956 to 1998?
What languages were the winning songs from 1999 to 2021?
What was the frequency of countries winning from 1956 to 1998?
What was the frequency of countries winning from 1999 to 2021?
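The language and country questions above come down to splitting the winners at the 1999 rule change and counting each group. Here is a minimal Python sketch of that idea, not the project's actual R/Python code. The five rows are a small hand-picked sample of winners rather than the full dataset, and the names WINNERS, RULE_CHANGE_YEAR and count_by_era are my own for illustration.

```python
from collections import Counter

# Small hand-picked sample of winners, NOT the project's full dataset.
WINNERS = [
    {"year": 1956, "country": "Switzerland", "language": "French"},
    {"year": 1998, "country": "Israel", "language": "Hebrew"},
    {"year": 1999, "country": "Sweden", "language": "English"},
    {"year": 2017, "country": "Portugal", "language": "Portuguese"},
    {"year": 2021, "country": "Italy", "language": "Italian"},
]

# First contest year in which any language was allowed.
RULE_CHANGE_YEAR = 1999


def count_by_era(rows, field):
    """Count values of `field` before and from the rule change year.

    Hypothetical helper: returns (counts_1956_to_1998, counts_from_1999).
    """
    before, after = Counter(), Counter()
    for row in rows:
        target = before if row["year"] < RULE_CHANGE_YEAR else after
        target[row[field]] += 1
    return before, after


if __name__ == "__main__":
    languages_before, languages_after = count_by_era(WINNERS, "language")
    countries_before, countries_after = count_by_era(WINNERS, "country")
    print("Languages 1956 to 1998:", languages_before.most_common())
    print("Languages from 1999:", languages_after.most_common())
    print("Countries 1956 to 1998:", countries_before.most_common())
    print("Countries from 1999:", countries_after.most_common())
```

The same helper answers both the language and the country questions by changing the field argument, so the two eras are always split in exactly the same way.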
The number of participants has been rising since the start of the contest. There was a big jump in the number of participating countries after the contest introduced semi-finals in 2004. The maximum number of participating countries is capped at 46 according to the EBU rules, and only members can participate, with the exception of Australia, which is an associate member. With the inclusion of the semi-finals, the number of participating countries remains high, as the contest is as popular as ever. This also means winning has become harder, as countries that are not part of the ‘Big 5’ first have to qualify for the final.
The next figure shows how often a song in a particular language has won the contest. I have presented data from 1956 to 1998, as this was when countries were required to sing in a native language. There was a brief time in the 1970s when countries could sing in another language, but this was reversed. The figure below shows that the winning songs were in a diverse range of languages, with English and French being the most popular.
The next figure shows how often each country has won Eurovision. The most notable result is how often Ireland, the United Kingdom, France and Luxembourg have won; they sing in English and French, which explains the language data in the above figure.
The next figure shows the number of times a particular language has featured in the Eurovision Song Contest since the language rule change in 1999. With the number of participants remaining high, I guess singing in a widely spoken language has become important. However, I’d argue that sending a song in a native language is still important: the most recent winners, Italy, sang in Italian, and the Portuguese entry in 2017 remains a popular winning song despite Portuguese not being widely spoken elsewhere in Europe.
The figure below shows the winning countries from 1999 to 2021, with a greater range of countries winning the contest. This is possibly down to two reasons. The first is that the rise in participating countries means a greater diversity of songs and winners. The second is the reduction in bloc voting that favours certain countries over others. Whilst this still does happen (hello Greece and Cyprus!), each country has an equal chance of winning if its song is popular with the public, once it has got through to the final of course! This might also explain the rise of English as the dominant language in the contest: countries want to connect and win votes, which is easier to do in English than in a native tongue that is only understood by your own community. Despite this, the recent winning songs in Italian and Portuguese make the argument that a song which is unique in a sea of English songs can get people voting!
What can be taken away from this?
If you want to stand out then sing in your own language!
I learnt to create, modify and commit files to Git using Microsoft's Visual Studio Code editor and its terminal console. I created a public demo-repo and made various changes and commits to the README.md and index.html files. The tutorial was posted on the freeCodeCamp.org YouTube channel in the following video: