Hey, it's Bernie!

DataScience

As part of our team assignment for the big data module, I was looking at what kind of environmental data we can find so that we can practice using BigQuery and experience the full ELT pipeline.

Here's the prompt I used.

Hi Claude you are an environmental consultant tasked to study how air quality of a country is affected by the activities within in, and you are doing a big data analysis to understand, rank and visualize the top 5 industry polluters. You are also to compare these polluters across different countries and show how, if any, what quantifiable improvements made by each country and their results. Guide me through step by step using Google's Cloud Platform's Big Query to ingest a free dataset through API or other real-time data ingestion, preferably streaming, then use a suitable data warehouse to store the data and then do ELT pipelines, then create some prediction models and generate reports through the transformed data. Your final output should be a modern dashboard that can be presented to the United Nations meeting to share the challenges of balancing advancement vs air quality and how poor business decisions can lead to bad air quality and eventually poor health of the people.

So now it's day 2, and we are working on a workflow that uses Docker to run Spark and pull air quality data through an API. I'll keep you updated on what happens.
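To give a feel for the "load first, transform later" idea behind ELT, here's a rough sketch in plain Python (not our actual Spark job). It takes an API response, stamps each reading with an ingestion time, and writes newline-delimited JSON, which is one of the formats BigQuery can load. The payload shape, the field names and the `to_rows` and `to_ndjson` helpers are all assumptions I made up for illustration; a real air quality API will look different.

```python
import json

# Made-up sample payload. A real air quality API will use different field names.
SAMPLE_RESPONSE = {
    "results": [
        {"country": "SG", "parameter": "pm25", "value": 18.0,
         "unit": "ug/m3", "measured_at": "2026-01-05T08:00:00Z"},
        {"country": "SG", "parameter": "no2", "value": None,
         "unit": "ug/m3", "measured_at": "2026-01-05T08:00:00Z"},
        {"country": "MY", "parameter": "pm25", "value": 31.5,
         "unit": "ug/m3", "measured_at": "2026-01-05T08:00:00Z"},
    ]
}


def to_rows(payload, ingested_at):
    """Flatten the payload into one dict per reading, keeping the raw fields."""
    rows = []
    for reading in payload.get("results", []):
        row = dict(reading)               # copy the raw fields unchanged
        row["ingested_at"] = ingested_at  # record when we pulled the data
        rows.append(row)
    return rows


def to_ndjson(rows):
    """Serialize rows as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(row) for row in rows)


if __name__ == "__main__":
    rows = to_rows(SAMPLE_RESPONSE, ingested_at="2026-01-05T09:00:00Z")
    print(to_ndjson(rows))
```

Notice that the missing `no2` value is kept as-is. In an ELT setup, cleaning like that is left for the transform step inside the warehouse, after the raw data has landed.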

#AirQuality #DataScience #BigQuery #MachineLearning #EnvironmentalData #GoogleCloud #ClaudeAI #DataAnalysis #Sustainability #TechJourney

After two months of diving into machine learning, I’ve realized that data science is like performing an autopsy on a business problem to find the clues hidden in the facts. By treating data like a detective mystery, we can stop guessing and start building solutions based on real-world evidence and predictive confidence.

At the start of 2026, I jumped headfirst into the world of machine learning (ML) and data science (DS), and I realized that I'm starting to look at the world through a new lens, an add-on if you will, where every problem starts with data.

I know, some of you reading this might say, “No, Bernie, we always start by identifying the business problem and asking ourselves what we are trying to solve.” Yes, that too, but after that, you would hire a data scientist, right? That's where I'm starting my story.

Why Data Science is like a Medical Drama

The clues are in the data. With every business problem, the data is like a dead body on an autopsy table, ready for the data scientist to slice and dice, looking for clues as to why there's a problem in the first place. I've always loved a good detective and/or medical drama, and now I know why. ML and DS are like that too. Find the clues, solve the mystery. And after we solve the mystery, we can start building a solution based on facts, which can then point us in the right direction to collect not just more data, but the RIGHT data. Ultimately, we would like to be able to predict, with higher confidence, the consequences of our future decisions and actions, so we don't repeat the mistakes that caused those problems in the first place.

There are so many real-world examples, like:

  • Hotel Industry: Reducing “no-show” rates by identifying patterns in booking cancellations.
  • FinTech: Detecting fraudulent transactions before they clear by spotting anomalies in spending.
  • Education: Analyzing student performance data to intervene before grades drop.
  • Marketing: Auditing marketing spend to see exactly where the ROI is failing and why.

Data science can help us create systems and processes for better business ROI.

I've always wondered about such things. Why are things the way they are? Almost everything we see around us is a consequence of someone's decision and action. What made that person come to that choice? By understanding their motivations (through the data), whether the outcome was good or bad, we can make better decisions for the future that we want.

With the right dataset, and a big enough one, we can predict (and solve) almost anything.

#ai #data #reflection #MachineLearning #DataScience #BusinessROI