WEB SCRAPING, DATA VISUALIZATION, SELENIUM

Photo by Ilya Pavlov on Unsplash

CONTENT

  • What is Selenium?
  • Why Selenium?
  • Project Pre-requisites
  • Website
  • Code Snippets
  • Data Visualization

What is Selenium?

Selenium is an open-source tool for automating web browsers. Originally developed at ThoughtWorks as an in-house tool in 2004, it was eventually released as open source.

While Selenium is understood as a testing tool, some working on…


DATA VISUALIZATION

Data visualization provides visual context through maps or graphs. In doing so, it translates the data into a form that is more natural for the human mind to comprehend, making it easier to pick out patterns or points of interest. …


In-depth guide

Photo by Jesse Dodds on Unsplash

I was working with a team relatively new to business/data analytics when group discussions on feature engineering for machine learning came up. Understandably, the topic can be confusing and intimidating for people new to machine learning. Working in Jupyter Notebooks and Python, we would naturally refer to 1) the built-in…


PowerBI Dashboard Mini-Proj

CONTENT

Introduction

Data Sources

Building the Data Model

DAX

Putting It Together

Conclusion

Introduction

“Besides the figures reported in the local & regional news, can I have a personal dashboard to visualize what trends there might be?” From social distancing to staying home as much as possible to recovering from vaccination effects…


Sorting out statistical tests & usage one at a time

Foreword

This article stems from an effort to deepen my understanding of some common statistical methods and the contexts in which they apply, and to note them down for future reference. For each method, an example dataset is instantiated along with a code implementation. …


Everyone’s running sprints, try a marathon

Table of Contents

1. Introduction

2. Course Overview

3. The good

4. Areas of Improvement

5. Closing

Photo by Boitumelo Phetla on Unsplash

1. Introduction

Launched in late 2020, the IBM Data Science course is one of the few data science courses released by IBM to help graduates and working professionals seeking to pivot and break into data science. …


Be deliberate in the problem-solving process

The culmination of Coursera’s IBM Data Science Professional Course is a capstone project in which course participants identify a business problem that calls for the use of location data and neighborhood clustering. The ability to analyze business problems, cut through the noise, and identify the actual issue to be addressed…


Web Scraping, Data Visualization

As the COVID situation flared up once again in the city-state, we naturally cut down our time spent outdoors and tried to minimize time spent in crowded areas as far as possible. Nevertheless, the pantry requires periodic replenishment. Also, the Household Overlord has remarked that online grocery delivery…


Feature selection is the process of selecting the predictor variables that contribute most significantly to the prediction/classification of the target variable. In feature selection for linear regression models, we are concerned with four aspects of the variables. Framed as the mnemonic “LINE”, these are:

  1. Linearity. The selected variable…
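As a rough illustration of how the four aspects might be checked numerically, here is a minimal sketch in plain Python. Only "Linearity" is spelled out in the text above; the other three letters are taken here to be the common textbook reading of the mnemonic (Independence of errors, Normality of residuals, Equal variance), which is an assumption. The function names `fit_simple_ols` and `line_diagnostics`, the choice of diagnostics, and the synthetic data are all hypothetical and not taken from the article.

```python
import math
import random
import statistics


def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def line_diagnostics(x, y):
    """Hypothetical helper: one rough number per assumed LINE aspect."""
    n = len(x)
    intercept, slope = fit_simple_ols(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)

    # L (linearity): Pearson correlation between predictor and target.
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    pearson_r = slope * math.sqrt(sxx / syy)

    # I (independence): Durbin-Watson statistic on residuals in the order
    # the observations were given; values near 2 suggest little
    # first-order autocorrelation.
    durbin_watson = (
        sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, n)) / sse
    )

    # N (normality): skewness of the residuals; near 0 for symmetric errors.
    # OLS residuals with an intercept have mean 0, so no centering is needed.
    resid_sd = math.sqrt(sse / n)
    skewness = sum((e / resid_sd) ** 3 for e in residuals) / n

    # E (equal variance): residual variance over the larger half of x
    # divided by that over the smaller half; near 1 if the spread is constant.
    order = sorted(range(n), key=lambda i: x[i])
    low = [residuals[i] for i in order[: n // 2]]
    high = [residuals[i] for i in order[n // 2:]]
    variance_ratio = statistics.pvariance(high) / statistics.pvariance(low)

    return {
        "pearson_r": pearson_r,
        "durbin_watson": durbin_watson,
        "skewness": skewness,
        "variance_ratio": variance_ratio,
    }


# Synthetic example data (made up for illustration): a noisy straight line.
random.seed(0)
x = [float(i) for i in range(200)]
y = [2.0 + 3.0 * xi + random.gauss(0.0, 1.0) for xi in x]
print(line_diagnostics(x, y))
```

These numbers are only screening aids. In practice, residual plots and formal tests from a statistics library would usually complement them before a variable is kept or dropped.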


Photo by Markus Spiske on Unsplash

Data cleaning and Exploratory Data Analysis go hand-in-hand: with a better understanding of the data, one is better positioned to spot errors or outliers for mitigation.

Most of us do EDA through pandas functions, coupled with visualizations using matplotlib or seaborn. Occasionally, we define functions to do 1)…

ShengJun

Data Science Enthusiast, Analyst. Sharing insights from my own learning journey and pet projects in this space. LinkedIn Profile: www.linkedin.com/in/ShengJunAng
