Hi 
Welcome to my online portfolio and blog.
I’m Vincent, a results-oriented data scientist and machine learning engineer with a data-driven mindset and attention to detail. I am ready to work, willing to learn, and keen to take on new challenges in a diverse, fast-paced environment that values my skills and offers room for growth. I am competent in data mining, data preparation, exploratory data analysis, data storytelling, data visualization, feature engineering, machine learning modeling, and model deployment, with extensive experience in supervised and unsupervised machine learning algorithms and concepts. I am proficient in Windows and Linux operating systems. I am an all-round data analytics practitioner, passionate about artificial intelligence and Python programming, and I aim to become a world-class problem solver.
Here’s a little more about me:
Technologies 
My tech stack includes but is not limited to:
- Python programming
- SQL
- Git and GitHub
- Linux
Skills 
My knowledge is built around:
- Python programming:
  - numpy
  - pandas
  - matplotlib
  - seaborn
- Machine learning
  - sklearn:
    - Unsupervised learning:
      - KMeans
      - Principal Component Analysis (PCA) etc.
    - Supervised learning:
      - K Nearest Neighbors (KNN)
      - Logistic regression
      - Decision trees
      - Random forests
      - Support vector machines
      - Boosting algorithms etc.
- Data science
- Natural Language Processing (NLP)
- Git and GitHub
- Data analysis
- Data visualization
- Data scraping
- Data cleaning
- …(everything data really.)
Certifications 
- Feature Engineering for Time Series Forecasting
- Data Science
- Data Analysis with Python
- Machine Learning with Scikit Learn
- Unsupervised Learning in Python
- AWS Machine Learning Fundamentals
- Introduction to SQL, Advanced SQL
Recent Achievements 
Packaged machine learning code into a forecasting Python package, enabling the machine learning team to run forecasts with a few lines of code using a pipeline similar to scikit-learn's API.
Led a pilot project to predict estimated time of arrival for vessels ferrying shipments from Vietnam to Western Africa.
Built a stock price prediction and analysis app for Kenyan stocks. (Link)
Built my online portfolio website and blog. (Link)
Completed a sentiment analysis machine learning project covering the full project cycle, from data acquisition to model deployment: scraped data from Twitter, modeled the text using bag-of-words and TF-IDF vector representations, and hosted a web application on Streamlit Cloud.
Designed a Python package called datastand to help data scientists, machine learning engineers and data analysts better understand data. It gives quick insights about a given dataset: general statistics, size and shape, number of unique data types, number of numerical and non-numerical columns, a small overview of the dataset, missing data statistics and a missing data heatmap. It also provides methods to impute missing data.
Publications/Deployments
Stocks Watch
A stock price prediction and analysis app for Kenyan stocks.
The code is currently private (I might make it public once I have flagged everything as “refined”), and the web app is live here.
Re-invest
A collection of investment and trading calculators (compounding).
Code is available on GitHub and the web app is live here.
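For readers unfamiliar with compounding, here is a minimal sketch of the kind of calculation such a calculator performs. It is not the Re-invest code; the function name `compound_growth` and its parameters are hypothetical and chosen only for illustration. It assumes interest is applied once per period and any contribution is added at the end of each period.

```python
def compound_growth(principal, annual_rate, years, periods_per_year=12, contribution=0.0):
    """Return the balance after compounding (illustrative helper, not the Re-invest API).

    annual_rate is a fraction (0.12 means 12% per year). Interest is applied
    once per period, then the optional contribution is added.
    """
    rate_per_period = annual_rate / periods_per_year
    balance = principal
    for _ in range(years * periods_per_year):
        balance = balance * (1 + rate_per_period) + contribution
    return balance


# Example: 1,000 invested for 2 years at 10% per year, compounded annually.
final_balance = compound_growth(1000, 0.10, 2, periods_per_year=1)
print(round(final_balance, 2))
```

Writing the loop out period by period, rather than using the closed-form formula, keeps the sketch easy to extend with contributions or changing rates.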
Sentiment Analysis Web App
This project aimed to predict the sentiment of tweets during the COVID-19 pandemic. It covered the whole project cycle, from data acquisition to model deployment. The full project details and code are on my GitHub. The project’s web application is live here.
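To give a feel for how text becomes numbers before a sentiment model sees it, here is a small dependency-free sketch of bag-of-words counts and TF-IDF weights. It is not the project's code: the example tweets are made up, the helper names are hypothetical, and it uses the textbook weighting tf × log(N / df), which differs from the smoothed variant that scikit-learn applies by default.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return one term-count mapping per document."""
    return [Counter(tokenize(doc)) for doc in docs]


def tfidf(docs):
    """Return one {term: weight} mapping per document using tf * log(N / df)."""
    counts = bag_of_words(docs)
    n_docs = len(docs)
    doc_freq = Counter()
    for doc_counts in counts:
        doc_freq.update(doc_counts.keys())  # each document counts once per term
    weights = []
    for doc_counts in counts:
        total_terms = sum(doc_counts.values())
        weights.append({
            term: (n / total_terms) * math.log(n_docs / doc_freq[term])
            for term, n in doc_counts.items()
        })
    return weights


# Made-up example tweets, for illustration only.
tweets = ["stay safe stay home", "vaccine rollout begins", "stay positive"]
counts = bag_of_words(tweets)
weights = tfidf(tweets)
print(counts[0]["stay"])          # 2, since "stay" appears twice in the first tweet
print(sorted(weights[0].items()))
```

Terms that appear in every document get a weight of zero under this formula, which is why common words contribute little to the resulting vectors.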
Portfolio Website and Blog
I built this portfolio website and blog.
Python Package datastand
Datastand is a Python package designed to help data scientists, machine learning engineers and data analysts better understand data. It gives quick insights about a given dataset: general statistics, size and shape, number of unique data types, number of numerical and non-numerical columns, a small overview of the dataset, missing data statistics and a missing data heatmap. It also provides methods to impute missing data. The package is available on PyPI. I also made a guide to showcase how datastand works and help you get started here.
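As a rough idea of what "missing data statistics" and imputation look like in practice, here is a short sketch written with plain pandas. It does not use or reproduce datastand's API; the function names `missing_summary` and `impute` and the toy data are hypothetical. The imputation strategy shown (median for numeric columns, most frequent value otherwise) is one common choice, assumed here for illustration.

```python
import pandas as pd


def missing_summary(df):
    """Per-column count and percentage of missing values (illustrative helper)."""
    counts = df.isna().sum()
    percent = (counts / len(df) * 100).round(2)
    return pd.DataFrame({"missing": counts, "percent": percent})


def impute(df):
    """Fill numeric columns with the median and other columns with the mode."""
    out = df.copy()
    for col in out.columns:
        if out[col].isna().all():
            continue  # nothing to learn a fill value from
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
        else:
            out[col] = out[col].fillna(out[col].mode().iloc[0])
    return out


# Toy data, made up for this example.
data = pd.DataFrame({
    "price": [10.0, None, 14.0, 12.0],
    "sector": ["bank", "telecom", None, "bank"],
})
summary = missing_summary(data)
filled = impute(data)
print(summary)
print(filled)
```

For the real package, the guide linked above is the place to look for the actual function names and options.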
Technical Articles
Find technical articles written by me on the posts page here.