Data Analyst Bootcamp at MySkill

Ramadhian Ekaputra
20 min read · Aug 11, 2024

--

For 16 sessions, I embarked on an intensive journey through the Data Analyst Bootcamp at MySkill. Each session was filled with new challenges that enriched my analytical skills, covering everything from data fundamentals to applying complex analysis techniques. Every stage of this journey, like the components of a fishbone diagram, enhanced my ability to solve problems using a data-driven approach, ensuring that each aspect of this learning experience was interconnected and significantly contributed to my development as a Data Analyst.

Bootcamp MySkill
├── Tools
│   ├── Software: Excel, SQL
│   └── Methods: Data Prep
└── Techniques
    ├── Analysis: Stats, ML
    └── Visualization: Looker

Kickstart Career as a Data Analyst

Course Summary

This is a summary of the slide presentation

Why do companies need data?

  1. Informed Decision-Making: Data provides insights that allow companies to make evidence-based decisions, reducing the reliance on guesswork and intuition.
  2. Identifying Trends and Patterns: Companies use data to identify trends and patterns that can inform strategy and operational changes.
  3. Improving Efficiency: Data helps companies streamline operations, identify inefficiencies, and optimize processes.
  4. Customer Insights: Understanding customer behavior and preferences through data allows companies to improve their products, services, and marketing strategies.
  5. Competitive Advantage: Leveraging data can give companies a competitive edge by enabling them to innovate and stay ahead of market trends.
  6. Risk Management: Data helps identify and mitigate potential risks before they become significant issues.

What’s the difference between data engineer, analyst, and scientist?

Analysis of “Habits of Being Happy”

The summary of this article:

Let’s study these cases

Case 2 — Baking Store

Shaenette loves baking so much that she is considering selling her pastries online. Do you think she needs to be data-driven? What is your advice to her?

Case 3 — Charity Industry

Haji Endo heads one of the largest charities in Yokohama. Fundraising and distribution have been run in the traditional fashion for years, but Haji Endo wants to make a breakthrough: to serve donors and recipients more personally. What can he do?

Intro to Data Analytics

Context: Data Analyst at an E-Commerce Company

Problem 1

Context:

  • At an Online Travel Agent company, you act as the data analyst for the Flight product
  • An anomaly was found: a spike in sales on 3 November 2023
  • The spike was 50% above the previous day and 25% above the same date last month
  • Meanwhile, two marketing promo campaigns are running:

○ Flight promo of 50% off, capped at IDR 200k, for new customers
○ Flight promo of 10% off, capped at IDR 500k, for existing customers

  • In addition, a referral promo is running this month:

○ Every user who joins / transacts for the first time earns IDR 50k, for both the referrer and the person referred

  • Users come to you to find out what caused the spike described above

Build an Analytical Thinking Framework for the case above that covers:

  • Background
  • Objective
  • Data
  • Points to be checked / Initial hypotheses

Problem 1: Sales Spike at an Online Travel Agent Company

Problem 2

Context:

  • At a Money Transfer company
  • The average transaction waiting time was found to have spiked
  • This caused many users to complain
  • The incident occurred during 1–5 Nov 2023
  • The cause of the incident is not yet known

Problem 2: Spike in Average Transaction Waiting Time at a Money Transfer Company

Problem 3

Context:

  • You are a Marketing Data Analyst at a tech company.
  • A drop in customer acquisition* was found, even though ad spend remained roughly flat.
  • This made the customer acquisition cost (CAC) more expensive.
  • The incident occurred throughout the 3rd and 4th weeks at the end of 2023.

*customer acquisition: the number of customers who register on the app / platform

Problem 3: Drop in Customer Acquisition at a Tech Company

Let’s Learn Basic Statistics

Course Summary

STATS — Task

You are part of the CRM team and have been asked to evaluate the vouchers redeemed during the day.

  • What are the mean, median, mode, standard deviation, and outlier threshold of last month's performance?
  • Is there any outlier in last month's performance?

Yes

  • If yes, how many?

One, at uid 18, with a voucher redeem value of 500
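
The answer above can be reproduced with a short sketch using Python's statistics module. The voucher values below are hypothetical stand-ins for the workbook data; only the 500 outlier mirrors the reported answer:

```python
import statistics

# Hypothetical voucher-redeem values; the real column is in the course workbook.
# The 500 mimics the outlier reported at uid 18.
values = [12, 15, 15, 18, 20, 22, 25, 500]

mean = statistics.mean(values)        # arithmetic average (pulled up by the outlier)
median = statistics.median(values)    # middle value, robust to the outlier
mode = statistics.mode(values)        # most frequent value
std_dev = statistics.stdev(values)    # sample standard deviation

# A common outlier threshold: 1.5 * IQR above the third quartile
q1, _, q3 = statistics.quantiles(values, n=4)
upper_threshold = q3 + 1.5 * (q3 - q1)
outliers = [v for v in values if v > upper_threshold]
print(median, mode, outliers)
```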

Standard Dev — Task

If you are the leader of an institution in Surabaya, how many chairs do you need to prepare so that at least 68% of all visitors will get a seat?

Assuming the data is normally distributed, preparing chairs up to two standard deviations above the mean covers roughly 97.7% of visitors, which comfortably exceeds the 68% target:

= mean + 2 * std. dev

= 6.43 + 2 * (2.14)

= 10.71

So we need to prepare 11 chairs.
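
The arithmetic can be checked with a couple of lines of Python:

```python
import math

mean_visitors = 6.43   # average visitors per day (from the task data)
std_dev = 2.14         # standard deviation (from the task data)

# Seats up to two standard deviations above the mean, rounded up to whole chairs
chairs = math.ceil(mean_visitors + 2 * std_dev)   # 6.43 + 4.28 = 10.71
print(chairs)  # 11
```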

Z Score — Task

Which offer should Udin take, considering the cost of living in each area?

Take Tuban: Tuban's salary z-score (2.59) is relatively better than Bekasi's (0.79), even though in absolute terms Tuban's salary is below Bekasi's.
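
The comparison rests on the standard score, z = (value − mean) / std dev. Here is a small sketch; the salary figures are illustrative only, chosen to reproduce the two scores above (the actual numbers are in the course worksheet):

```python
def z_score(value, mean, std_dev):
    """How many standard deviations a value sits above the local mean."""
    return (value - mean) / std_dev

# Illustrative figures only -- not the actual course data
z_tuban = z_score(5_000_000, 3_500_000, 580_000)      # salary vs. local market in Tuban
z_bekasi = z_score(7_000_000, 6_000_000, 1_270_000)   # salary vs. local market in Bekasi

print(round(z_tuban, 2), round(z_bekasi, 2))  # 2.59 0.79
```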

Percentile — Task

In one money transfer company, the expected P90 (90th percentile) SLA is under 5 minutes, to ensure customer satisfaction with the service provided.

50 transactions occurred; the SLA of each transaction is attached

  1. Has the company achieved the P90 Satisfaction Level Condition?
  2. What is your recommendation to the product team to solve this condition?

The client will not buy, because the actual P90 is 10 minutes while the promised P90 is 5 minutes (meaning that 90% of all transactions are supposed to complete in 5 minutes or less)
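
A nearest-rank P90 check can be sketched as follows; the durations are hypothetical stand-ins for the 50 attached transactions:

```python
import math

# Hypothetical SLA durations in minutes (stand-in for the 50 attached transactions)
slas = sorted([1, 2, 2, 3, 3, 4, 4, 5, 10, 12])

# Nearest-rank method: the P90 is the value at rank ceil(0.9 * n)
rank = math.ceil(0.9 * len(slas))
p90 = slas[rank - 1]

print(p90, "meets the 5-minute target" if p90 <= 5 else "misses the 5-minute target")
```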

STATS — Task

You are part of the CRM team and have been asked to evaluate the vouchers redeemed during the day.

  • What are the mean, median, mode, standard deviation, and outlier threshold of last month's performance?
  • Is there any outlier in last month's performance?

No

  • If yes, how many?

Not applicable (no outliers were found)

Linear Regression — Task

You are a museum manager and have been asked to estimate and visualize the number of potential visitors in Q1 2023

  • Make a forecast for Q1 2023
  • Make a visualization (a scatter plot)
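
A minimal forecasting sketch with NumPy's least-squares line fit, using hypothetical quarterly visitor counts (the real figures are in the task sheet):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line to show the window
import matplotlib.pyplot as plt

# Hypothetical quarterly visitor counts, Q1 2021 .. Q4 2022
quarters = np.arange(1, 9)
visitors = np.array([120, 135, 150, 160, 170, 185, 200, 210])

# Fit a straight line and extrapolate one quarter ahead (Q1 2023 = index 9)
slope, intercept = np.polyfit(quarters, visitors, 1)
forecast_q1_2023 = slope * 9 + intercept

# Scatter plot of the actuals plus the forecast point
plt.scatter(quarters, visitors, label="Actual")
plt.scatter([9], [forecast_q1_2023], label="Forecast Q1 2023")
plt.title("Museum Visitors: Actual vs. Forecast")
plt.xlabel("Quarter index")
plt.ylabel("Visitors")
plt.legend()
plt.savefig("visitor_forecast.png")
```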

Introduction to Problem Solving

Mini Task

Halojek is a company with 2 main products:

  • Halofood: a service to provide food delivery
  • Haloride: a service to provide mobility using motorcycles

The revenue in this quarter (Q3, 23) decreased by 50% compared to the same quarter last year (Q3, 22).

The cost for this quarter is as follows:

  • Marketing cost: IDR 800 Bio
  • Labor cost: IDR 750 Bio
  • Infrastructure cost: IDR 3,500 Bio

The cost for last quarter is as follows:

  • Marketing cost: IDR 600 Bio
  • Labor cost: IDR 600 Bio
  • Infrastructure cost: IDR 2,000 Bio

Objective: Develop a problem-solving framework for the case above. As the data analyst, deep dive into the root cause of this quarter's cost increase and provide alternative countermeasures.

Step 1: Clarify the Problem

Clue: Quantify the gap

Step 2: Break Down the Problem

Clue: Should be data-driven | 4W — What, Where, Who, When

Step 3: Set the Objective

Clue: The objective should be quantified and time-bound

Step 4: Define the root cause

Clue:

  • Why, why, why
  • Brainstorm is allowed

Step 5: Develop countermeasures

Consider possible alternatives based on preferred criteria

  • Feasibility
  • Cost
  • Practicality
  • Security
  • etc.

Working with Google Sheets: Extract — Format — Clean

Task

Create a new worksheet, then extract the raw Superstore data from the module folder into the sheet.
Change the format of the discount column to percentage.
Change the format of the sales and profit columns to currency in US dollars with at most two decimal places.
In the profit column, mark negative profits in red.
Convert the data in the Ship_Mode through Region columns to Proper Case.
Check whether the Superstore data contains duplicate transactions.
In the Segment column, apply data validation so that incoming data is one of the following options: Corporate, Consumer, Home Office.
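
The same Format–Clean steps can be sketched in pandas; the two rows below are a hypothetical stand-in for the Superstore extract:

```python
import pandas as pd

# Hypothetical two-row stand-in for the Superstore extract
df = pd.DataFrame({
    "Ship_Mode": ["second class", "STANDARD CLASS"],
    "Segment": ["Consumer", "Corporate"],
    "Discount": [0.20, 0.00],
    "Sales": [261.96, 731.94],
    "Profit": [41.91, -3.50],
})

# Format: discount as a percentage, money as dollars with two decimals
df["Discount_fmt"] = (df["Discount"] * 100).map("{:.0f}%".format)
df["Sales_fmt"] = df["Sales"].map("${:,.2f}".format)

# Clean: Proper Case for the text column, then check for duplicate transactions
df["Ship_Mode"] = df["Ship_Mode"].str.title()
duplicates = df.duplicated().sum()

# Validate: Segment must be one of the allowed values
allowed = {"Corporate", "Consumer", "Home Office"}
assert df["Segment"].isin(allowed).all()

print(df[["Ship_Mode", "Discount_fmt", "Sales_fmt"]])
```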

Link Spreadsheet:

SQL Basic 1

Create the Database

  1. Open pgAdmin 4
  2. Click Servers
  3. Click PostgreSQL 16
  4. Click Databases
  5. Click Create
  6. Click Database
  7. In the Database field, type the name “Tokopaedi”
  8. Click Save
  9. The database appears in the Object Explorer

Create the Table

  1. Open https://sqliteonline.com/
  2. Click the ≡ icon in the top-left corner
  3. On the PostgreSQL tab → Click to Connect
  4. Type the “CREATE TABLE orders ()” statement, with the column definitions inside the parentheses
  5. Select all of the highlighted code
  6. Click Run
  7. The table is saved in the database
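
The same table-creation flow can be sketched in Python with the standard sqlite3 module. The column list here is an illustrative guess; the actual schema comes from the course material:

```python
import sqlite3

# In-memory database; the column list is illustrative, not the official schema
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        order_id      TEXT PRIMARY KEY,
        order_date    TEXT,
        customer_name TEXT,
        segment       TEXT,
        city          TEXT,
        subcategory   TEXT,
        sales         REAL,
        profit        REAL
    )
""")

# Insert one sample row and read it back to confirm the table works
conn.execute(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    ("US-001", "2018-01-09", "Claire Gute", "Consumer", "Henderson",
     "Tables", 261.96, 41.91),
)
rows = conn.execute(
    "SELECT customer_name FROM orders WHERE segment = 'Consumer'"
).fetchall()
print(rows)  # [('Claire Gute',)]
```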

SQL Basic 2

Task

You now have a transaction table of customers who have transacted on Tokopaedi. Next, you are asked to:

  1. Display the names of customers in the ‘Consumer’ segment who have purchased a table.
  2. Display the names of customers from the ‘Corporate’ and ‘Home Office’ segments who are from Los Angeles and transacted during the year 2018.

Import the Table

  1. Go To https://sqliteonline.com/
  2. Click the Import button
  3. Click Open File
  4. Browse the ‘orders_new_edited.csv’
  5. Click Open
  6. On the “Custom name” choose ‘First line’
  7. Click Ok

CASE 1: Display the names of customers in the ‘Consumer’ segment who have purchased a table.

  1. Identify the table and columns. The relevant columns are customer_name, segment, and subcategory
  2. The next step is to formulate the query
SELECT DISTINCT customer_name, segment, subcategory
FROM orders_new_edited
WHERE segment = 'Consumer' AND subcategory = 'Tables';

CASE 2: Display the names of customers from the ‘Corporate’ and ‘Home Office’ segments who are from Los Angeles and transacted during the year 2018.

  1. Identify the table and columns. The relevant columns are customer_name, segment, order_date, and city
  2. The next step is to formulate the query
SELECT DISTINCT customer_name, order_date, city
FROM orders_new_edited
WHERE segment IN ('Corporate', 'Home Office')
AND city = 'Los Angeles'
AND order_date LIKE '%2018%';

SQL Basic 3

Task

  1. Retrieve transactions resulting in a loss in Los Angeles between 2018 and 2019, sorted by the largest loss
  2. Retrieve transactions resulting in a profit in Henderson during Q1 of 2018, ordered by the highest profit

Import the Table

  1. Go To https://cloud.google.com/bigquery
  2. Click the ‘Console’ Menu, Click ‘Big Query’, Click ‘Select a project’, Click ‘NEW PROJECT’
  3. Rename the ‘Project name’, click Create
  4. Right-click the ⁞ on ‘MySkill DA 16’, Create a dataset
  5. On the Create dataset table, rename the ‘Dataset ID’, and click ‘CREATE DATASET’
  6. Right-click the ⁞ on ‘tokopaedi’, Create a table
  7. On Source Table, open the ‘Create table from’ dropdown and choose ‘Upload’, then click ‘Browse’ and upload Orders.csv, which can be downloaded here: https://drive.google.com/drive/folders/14K1L1A5BLUyBoxWJW-976jyj8HvSwlhW?usp=sharing
  8. On the ‘Schema’ table, click ‘Auto detect’, and rename the table, click ‘CREATE TABLE’
  9. On ‘Get started’, click ‘COMPOSE A NEW QUERY’

CASE 1: Retrieve transactions resulting in a loss in Los Angeles between 2018 and 2019, sorted by the largest loss

  1. Arrange the transactions in ascending order of profit, so the most negative amount is listed first
  2. Formulate the query
  3. The query successfully identified the biggest loss in Los Angeles
  4. Sorting by profit in ascending order displays the biggest losses at the top
SELECT order_date, city, profit
FROM `tokopaedi.orders`
WHERE order_date BETWEEN '2018-01-01' AND '2019-12-31'
AND city = 'Los Angeles'
AND profit < 0
ORDER BY profit ASC;

CASE 2: Retrieve transactions resulting in a profit in Henderson during Q1 of 2018, ordered by the highest profit

  1. Identify the Table and Columns
  2. Sort the transactions by profit in descending order
  3. The next step is to formulate the query
  4. By executing the SQL query, we identified a single order, placed on January 9, 2018, by customer CC-12685 in Henderson, which led to multiple profitable transactions
  5. This indicates that customer CC-12685 made a substantial purchase, resulting in varying levels of profit from different items in Q1 2018
SELECT order_date, city, profit
FROM `tokopaedi.orders`
WHERE order_date BETWEEN '2018-01-01' AND '2018-03-31'
AND city = 'Henderson'
AND profit > 0
ORDER BY profit DESC;

SQL for Data Analysis

Task

  1. Determine which city has the highest revenue
  2. Calculate the average spending per customer in the city with the highest revenue
  3. Create a table listing the names of customers from the city with the highest revenue who have spending above the average

CASE 1: Determine which city has the highest revenue

  1. Identify the Table and Columns
  2. Formulate the query
  3. According to the query result, New York City has the highest total revenue for the year, amounting to $256,368.16. This indicates that New York is a significant market for the business, contributing the most in terms of sales revenue
SELECT city, SUM(sales) AS total_revenue
FROM `tokopaedi.orders`
GROUP BY city
ORDER BY total_revenue DESC
LIMIT 1;
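
As a cross-check, the same aggregation can be sketched in pandas on a tiny illustrative table (the real data sits in BigQuery):

```python
import pandas as pd

# Tiny illustrative stand-in for tokopaedi.orders
orders = pd.DataFrame({
    "city": ["New York City", "New York City", "Los Angeles"],
    "sales": [150.00, 106.37, 99.99],
})

# Total revenue per city, highest first -- mirrors GROUP BY + ORDER BY ... DESC
revenue = orders.groupby("city")["sales"].sum().sort_values(ascending=False)
top_city = revenue.index[0]
print(top_city, revenue.iloc[0])
```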

CASE 2: Calculate the average spending per customer in the city with the highest revenue

  1. Identify the Table and Columns
  2. Formulate the query
  3. In New York City, the average spending per customer is approximately $280.18. This figure represents the mean amount of money spent by each customer in the city, helping to understand the spending behavior and providing a benchmark for assessing customer value
WITH city_with_highest_revenue AS (
SELECT city
FROM `tokopaedi.orders`
GROUP BY city
ORDER BY SUM(sales) DESC
LIMIT 1
)
SELECT city, AVG(total_spending) AS average_spending_per_customer
FROM (
SELECT city, customer_id, SUM(sales) AS total_spending
FROM `tokopaedi.orders`
WHERE city = (SELECT city FROM city_with_highest_revenue)
GROUP BY city, customer_id
) AS per_customer
GROUP BY city;

CASE 3: Create a table listing the names of customers from the city with the highest revenue who have spending above the average

  1. Identify the Table and Columns
  2. Formulate the query
  3. The query identified several customers in New York City whose spending exceeded the average of $280.18. Customers such as Adam Bellavance, Mark Packer, Christine Phan, and others have spent more than the average amount
WITH city_with_highest_revenue AS (
SELECT city
FROM `tokopaedi.orders`
GROUP BY city
ORDER BY SUM(sales) DESC
LIMIT 1
),
average_spending AS (
SELECT AVG(total_spending) AS avg_spending
FROM (
SELECT customer_id, SUM(sales) AS total_spending
FROM `tokopaedi.orders`
WHERE city = (SELECT city FROM city_with_highest_revenue)
GROUP BY customer_id
) sub
),
customers_above_average AS (
SELECT customer_id, customer_name, SUM(sales) AS total_spending
FROM `tokopaedi.orders`
WHERE city = (SELECT city FROM city_with_highest_revenue)
GROUP BY customer_id, customer_name
HAVING SUM(sales) > (SELECT avg_spending FROM average_spending)
)
SELECT customer_name
FROM customers_above_average;
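
The HAVING filter in the last CTE mirrors this pandas sketch (illustrative rows only; the real numbers come from the BigQuery table):

```python
import pandas as pd

# Illustrative orders from a single city (stand-in for the top-revenue city)
orders = pd.DataFrame({
    "customer_name": ["Adam", "Adam", "Mark", "Nina"],
    "sales": [300.0, 250.0, 400.0, 50.0],
})

# Per-customer totals, then keep those above the average total
spending = orders.groupby("customer_name")["sales"].sum()
above_average = spending[spending > spending.mean()].index.tolist()
print(above_average)  # ['Adam', 'Mark']
```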

Introduction to Python

Why Learn Python?

Ease of Learning

  • Simple and readable syntax
  • Extensive documentation and community support

Versatility

  • Web Development (e.g., Django, Flask)
  • Data Science & Machine Learning (e.g., Pandas, Scikit-Learn)
  • Automation & Scripting
  • Game Development (e.g., Pygame)

Strong Community

  • A large number of libraries and frameworks
  • Active forums and communities (e.g., Stack Overflow, Reddit)

Career Opportunities

  • High demand in various industries
  • Competitive salaries

Python Syntax and Basics

Variables and Data Types:

  • Dynamic typing: ‘x = 10’
  • Common data types: int, float, str, list, dict

Control Structures:

  • Conditional Statements: ‘if’, ‘elif’, ‘else’
  • Loops: ‘for’, ‘while’

Functions:

  • Defining functions: ‘def my_function():’
  • Returning values: ‘return’

Mini Task — Create a complete biodata using a dictionary. Then, using a key, display your nickname!

Step 1: Creating a dictionary containing biodata

# My personal data
personal_data = {
    "Full Name": "Ramadhian Ekaputra",
    "Nickname": "Ian",
    "Gender": "Male",
    "Role": "Staff",
    "Hobby": "Trekking",
}

Create a Python Dictionary

  1. Start by creating a Python dictionary, A dictionary is a collection of data elements stored in key-value pairs.
  2. You can use the following syntax: personal_data={}

Add Key-Value Pairs

  1. Inside the dictionary's curly braces {}, begin adding your data
  2. Each piece of data is assigned a key in quotation marks, followed by a colon
  3. Then comes the corresponding value, in quotation marks (or without them if it is a number)

Step 2: Accessing the Nickname

nickname = personal_data["Nickname"]
print(nickname)

Accessing Information Using Keys

  1. Once you’ve created your dictionary with all your information, you can retrieve specific data using its key
  2. To retrieve your nickname, for instance, use the following syntax (note that dictionary keys are case-sensitive): nickname = personal_data["Nickname"]

Print the Retrieved Information

  1. You can display the retrieved data (nickname) using the print() function.

Python 2: Working with Pandas

Mini Task

  • Compare the number of immigrants from India and China
  • Compare the trend of the top 5 countries that contributed the most to immigration to Canada

Task 1: Compare the number of immigrants from India and China

df_CI = df_clean.loc[['India', 'China'], years]
df_CI
  1. Write the code to obtain the immigrant data of selected countries and years
  2. Run the code using a run button or Ctrl + Enter
  3. The code selects and displays immigration data for India and China from 1980 to 2013. The interpreted data for the years 1980 to 1999 indicates an overall increase in the number of immigrants from both countries, with China showing a more consistent and steeper rise compared to India.
df_CI = df_CI.transpose()
df_CI.head()
  1. This line plots the transposed DataFrame as a line plot using pandas’ plot method with kind='line'
  2. Remember that pandas plots the index on the x-axis and the columns as individual lines on the y-axis.
  3. Since df_CI is a DataFrame with country as the index and years as the columns, we must first transform it using the transpose() method to swap the rows and columns.
df_CI.plot(kind= 'line')
import matplotlib.pyplot as plt

df_CI.index = df_CI.index.map(int) # let's change the index values of df_CI to type integer for plotting
df_CI.plot(kind= 'line')

plt.title('Immigrants from China and India')
plt.ylabel('Number of Immigrants')
plt.xlabel('Years')

plt.show()
  1. Import the matplotlib.pyplot module, which is used for creating static, interactive, and animated visualizations in Python.
  2. Change the index values of df_CI to integers. This is important for plotting, as it ensures the years are treated as numerical values on the x-axis.
  3. Create a line plot of the DataFrame df_CI. Each line represents the number of immigrants from India and China over the years.
  4. Add a title to the plot and labels to the y-axis and x-axis, making the plot easier to understand
  5. Display the plot on the screen
  6. The plot indicates that both India and China experienced an increase in the number of immigrants from 1980 to 1999.
  7. However, China's increase is more pronounced, especially in the late 1990s. This could be due to various socio-economic factors influencing immigration patterns during that time.
  8. The graphical representation provides a clear visual comparison of immigration trends from these two countries over the given period.

Task 2: Compare the trend of the top 5 countries that contributed the most to immigration to Canada

#The correct answer is:
#Step 1: Get the dataset. Recall that we created a Total column that calculates immigration by country.
#We will sort on this column to get our top 5 countries using pandas sort_values() method.

import matplotlib.pyplot as plt

# inplace=True saves the changes to the original df_clean dataframe
df_clean.sort_values(by='Total', ascending=False, axis=0, inplace=True)

# get the top 5 entries
df_top5 = df_clean.head(5)

#transpose the dataframe
df_top5 = df_top5[years].transpose()

print(df_top5)

#Step 2: Plot the dataframe. To make the plot more readable, we will change the size using the 'figsize' parameter.
df_top5.index = df_top5.index.map(int) # let's change the index values of df_top5 to type integer for plotting
df_top5.plot(kind='line', figsize=(14, 8)) # pass a tuple (x,y) size

plt.title('Immigration Trend of Top 5 Countries')
plt.ylabel('Number of Immigrants')
plt.xlabel('Years')

plt.show()
  1. This code compares immigration trends of the top 5 countries contributing the most to Canadian immigration.
  2. First, it imports the necessary plotting library, sorts the DataFrame by the total number of immigrants, and selects the top 5 countries
  3. It then focuses on yearly data, converts the years to integers, and creates a line plot to visualize trends over time.
  4. The plot includes titles and labels for clarity

Mini Task Link

Python 3: Python Application for Data Analysis

Task: Analyzing Top 5 Cities by Profit in 2016 and Visualizing with Bar Diagram

Objective: To identify the top 5 cities with the highest profits in 2016 using sales data from the Tokopaedi dataset.

Methodology

  • Data filtering
  • Grouping
  • Visualization
import pandas as pd
import matplotlib.pyplot as plt

# Load the data
url = 'https://raw.githubusercontent.com/dataskillsboost/FinalProjectDA11/main/tokopaedi.csv'
data = pd.read_csv(url)

# Convert the 'order_date' column to datetime format
data['order_date'] = pd.to_datetime(data['order_date'])

# Filter the data for the year 2016
data_2016 = data[data['order_date'].dt.year == 2016]

# Group by city and calculate the total profit for each city
city_profit = data_2016.groupby('city')['profit'].sum().reset_index()

# Sort the cities by profit in descending order and get the top 5
top_5_cities = city_profit.sort_values(by='profit', ascending=False).head(5)

# Print the result
print(top_5_cities)

# Plot the bar diagram
plt.figure(figsize=(10, 6))
plt.bar(top_5_cities['city'], top_5_cities['profit'], color='skyblue')
plt.xlabel('City')
plt.ylabel('Profit')
plt.title('Top 5 Cities by Profit in 2016')
plt.xticks(rotation=45)
plt.show()

Data Preparation

Dataset Description: The dataset contains sales data including order dates, cities, and profit amounts.

Data Filtering: Filtered the data to include only records from the year 2016.

  • New York City: Achieved the highest profit, likely due to its large market size and high consumer spending.
  • Los Angeles: Significant sales, potentially driven by its diverse economy and large population.
  • Lafayette: Strong profit performance, possibly due to localized market strategies and effective sales campaigns.
  • Detroit: Steady profit, indicating a recovering market with growing consumer confidence.
  • Providence: Notable profit, suggesting effective niche marketing and strong customer loyalty.

Implications for Business Strategy:

  • Focus on Top-Performing Cities: Allocate more resources and tailor strategies to maximize profits in these regions.
  • Replicate Successful Strategies: Identify the key factors contributing to high profits in these cities and apply them to other regions.
  • Address Gaps: Investigate lower-performing cities to identify challenges and opportunities for growth.

Conclusion

Summary of Findings:

  • The top 5 cities by profit in 2016 were New York City, Los Angeles, Lafayette, Detroit, and Providence.
  • Key factors contributing to high profits include market size, consumer spending, effective marketing strategies, and localized sales efforts.

Data Visualization

Dataset

Objective:

  • Demonstrate the use of Google Looker for data analysis and visualization.

Dataset Source:

Google Looker Capabilities:

  • Transform raw data into meaningful insights.
  • Aid in better decision-making and strategy formulation.

Report Example:

  • View a similar dynamic report here.

Goal:

  • Provide a comprehensive overview of the data.
  • Highlight key trends and metrics.
  • Showcase Google Looker’s ability to convert data into actionable business intelligence.

Upload The Dataset

  1. Open the data source
  2. Click File → Make a copy → Choose the designated folder
  3. In Google Looker → Click Blank Report
  4. Under Google Connectors → Click Google Sheets
  5. Click Copy of Data Sales → Click Add
  6. On the page → Click ADD TO REPORT

Page 1 — Overall Performance

Page 1:

  1. What is the sales trend from year to year per quarter?
  2. How is the distribution of sales for each segment?
  3. How are sales distributed in each region?

Page 2 — Product Performance

Page 2:

  1. What is the weekly sales trend for each category?
  2. What were the total sales over the period?
  3. What is the total profit over the period?
  4. What is the average profit value for each customer?
  5. What is the average profit value for each product?
  6. Which subcategory provides the biggest profit?
  7. Which subcategory gives the biggest average profit?

Dashboard Google Looker

Outro

As my journey through the Data Analyst Bootcamp at MySkill ends, I find myself equipped with the technical skills and a deeper understanding of the data-driven mindset that is essential in today’s world. Each session was a building block, adding to a solid foundation that I now stand on as I move forward in my career. This boot camp has been more than just a learning experience; it has been a transformative process that has shaped how I approach problem-solving and decision-making. As I step into the next chapter of my professional life, I carry with me the knowledge, confidence, and passion that will undoubtedly drive my success in the field of data analytics.

[Closing illustration: a person holding a laptop and briefcase stands at the start of a pathway of blocks labeled SQL, Statistics, Machine Learning, Data Cleaning, Visualization, Z-Scores, Python, Regression Analysis, and Data Wrangling, leading toward an open door that represents new opportunities.]
