Data Science
Journey to Data Science
Samuel Surulere

After three frustrating years of trying to land an academic position (teaching or research) in North America, Europe and even Asia, I came to realize that the chances of landing a job in academia are very slim. Even after landing a research position, rising to a tenure-track position would be harder still because of the stiff competition and highly constrained demand. This became even clearer to me after relocating to Canada in January 2023.

In order to avoid career stagnation and eventual resentment of my existence, I sat and reflected on which career direction to pursue next. I have always had huge respect and admiration for programmers and those working in the tech field. Since there is a natural alignment between Mathematics, Statistics and Data Science, I felt strongly inclined to pursue a career in that line. During coffee meetings, most of the people I chatted with (including a University of Calgary professor) advised me to consider a career in Data Science. All of this happened between January and March.

I began the transition by taking a free course on YouTube (Data Analyst Bootcamp for beginners by Alex the Analyst), where I picked up some basic skills in Power BI, SQL, Excel and Python. I also took a few more free courses, but this was in no way sufficient to start a data science career. So I searched the internet for fully or partially funded data science bootcamps. Among the many interesting ones I came across (including Applied Data Science Lab at Cybera, AltaML at AMII, NPower Canada, TECH Careers, General Assembly, BlackTECH and InceptionU), Lighthouse Labs invited me for an interview. During the interview, I was informed that I qualified for some government funding. This caught my interest, as I couldn’t afford to pay tens of thousands of dollars for a bootcamp or master's degree in data science at that time.

I wrote and passed the assessments (coding and logic), then quit all my commitments to focus on acquiring data science skills. The bootcamp was quite a unique experience for me (and I’m sure it is for anyone who attends it). Within 12 weeks, we had to complete seven projects: SQL, Statistical Modelling, Tableau, a mid-term project, Supervised Learning, Unsupervised Learning and the capstone. People who have attended other bootcamps attested that Lighthouse Labs is the most intensive one. The pressure to complete daily tasks and submit projects, coupled with sleepless nights, was immense. I picked up skills in querying databases, statistical analysis, data visualization, data cleaning, wrangling and preprocessing, some feature engineering, and machine learning (which was the most vital part for me). By the end of the 12 weeks, I had acquired foundational skills in traditional machine learning for tabular/structured data, plus some experience working with unstructured data (NLP and computer vision). These skills were sufficient to launch me into further personal study, with emphasis on the skills the modern industry needs. After graduation, I began to have coffee meetings with data professionals to make sure that my learning (and projects) were heading in the right direction.

The first three months after graduation were not pleasant in any way. Despite all my job applications, I never received a single call for an interview. Apparently, having a PhD (and no data science related experience) automatically disqualified me from entry-level jobs. This meant that landing a position would mostly come through a recommendation or by meeting decision makers directly. This was a huge twist in my path to building a career in data.

Luckily for me, I had joined Careers In Technology and Innovation (CITI) months before starting the bootcamp. Serene Yew organizes the biggest hackathon in Calgary, and Zackary Novac ensures that CITI members are fully represented every year. So this year (last week of January 2024), we had two teams from CITI. My team built an app for newcomers with an integrated chatbot (named Bostie). Bostie answers questions about the steps new immigrants need to take to fully settle into Calgary. The idea behind the app was a curated one-stop resource to provide extensive assistance to newcomers (permanent residents, refugees, visitors). I was responsible for building Bostie, and used LangChain and the OpenAI GPT API as the main tech stack. Bostie was built as a standard RAG (retrieval-augmented generation) architecture, and its knowledge base consisted of official government documents stored in a vector database. Working on this project within such a short time was the highlight of my transition. It was the first time I had worked on a Generative AI powered backend. I had watched some YouTube videos on it the previous year but hadn't implemented any project. My team clinched second place at the award ceremony.

Some weeks later, Serene needed someone to work on a machine learning project and contacted Lighthouse Labs. My assigned advisor recommended me to her (and probably some other graduates). I tried my best at the interview, but the fact that I had participated in her hackathon and built something significant was probably the major influence on her decision to hire me. And that was how I got to this point. Yaaaaa!!!
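For readers curious about what a "standard RAG architecture" involves, here is a minimal sketch of the retrieval step, written in plain Python so it runs without any libraries. It is not Bostie's actual code: the word-count "embedding", the `retrieve` and `build_prompt` helpers, and the placeholder snippets are all assumptions made for illustration. A real system would use an embedding model and a vector database, and would send the final prompt to a language model.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder snippets standing in for document chunks.
# They are NOT quotes from any official document.
DOCUMENTS = [
    "Apply for a Social Insurance Number before starting a job.",
    "Register for provincial health care coverage after you arrive.",
    "Exchange your foreign driver's licence at a registry office.",
]


def embed(text):
    """Toy 'embedding': lowercase word counts (assumption; real systems use a model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, index, k=1):
    """Return the k chunks whose vectors are most similar to the question."""
    query_vec = embed(question)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Combine retrieved chunks and the question into one prompt string."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# "Indexing": store each chunk next to its vector (the vector database's role).
INDEX = [(doc, embed(doc)) for doc in DOCUMENTS]

question = "How do I get health care coverage?"
top_chunks = retrieve(question, INDEX, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

The key idea is that the model only sees the chunks that were retrieved for the question, which keeps its answers grounded in the stored documents.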

I have been working as a junior data scientist at Pixeltree Inc. for four months now. Since this is my first time working in industry, all I can say is that it has been quite an interesting (and fulfilling) ride. Compared to academia, where you work and deliver results at your own pace, industry needs results as soon as possible. Sprint planning sets strict timelines that must be adhered to. During the bootcamp, we were told that the learning cycle never ends in tech jobs. I realized this within the first two weeks of starting the position. Up until then, I was quite familiar with binary and multiclass classification models, but not with multilabel and multioutput multiclass classification models. The first project I worked on was a multilabel classification problem, which made the following two weeks the steepest learning curve for me. I managed to complete the model training, evaluation and testing in a Jupyter notebook, then built the production pipeline (Python scripts) together with Serene. We included a CI/CD process and test-driven development (TDD) to ensure conformity with the required industry standard. The other projects I have worked on have also involved a learning curve; there is always something new to learn or work on. Although the journey can be long, frustrating and discouraging at times, transitioning to data is something I would recommend anytime. Using data and science to solve problems is captivating and exciting. I have included some links for those interested in transitioning to data.
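As an aside on the multilabel problem mentioned above: in multiclass classification each sample gets exactly one label, while in multilabel classification each sample can carry zero, one or several labels at once. The sketch below shows how that changes the target encoding and the evaluation, using plain Python. The label names and samples are hypothetical, and this is not the model or data from my project.

```python
# Hypothetical label vocabulary, for illustration only.
LABELS = ["billing", "bug", "feature_request"]


def multi_hot(label_set, labels=LABELS):
    """Encode a set of labels as a 0/1 vector with one position per label."""
    return [1 if name in label_set else 0 for name in labels]


def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that were predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose full label set was predicted exactly."""
    exact = sum(true_row == pred_row for true_row, pred_row in zip(y_true, y_pred))
    return exact / len(y_true)


# Hypothetical ground truth and predictions: each sample has a SET of labels.
true_sets = [{"bug"}, {"billing", "bug"}, set(), {"feature_request"}]
pred_sets = [{"bug"}, {"bug"}, set(), {"feature_request", "bug"}]

y_true = [multi_hot(s) for s in true_sets]
y_pred = [multi_hot(s) for s in pred_sets]

print("Hamming loss:", hamming_loss(y_true, y_pred))
print("Subset accuracy:", subset_accuracy(y_true, y_pred))
```

Subset accuracy is strict because one wrong label makes the whole sample count as wrong, whereas Hamming loss gives partial credit per label, so it can help to look at both.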

Links to the bootcamps and other resources mentioned in the blog:

  1. Data Analyst bootcamp
  2. Lighthouse Labs
  3. Cybera Data Science lab
  4. AMII AltaML careers
  5. NPower Canada
  6. Manpower TECH Careers
  7. BlackTECH
  8. General Assembly
  9. InceptionU
  10. CITI website

© 2024, All Rights Reserved.