• Hi!
    I'm Bhuvaneshwari

    I am currently working as an NLP research assistant at Dalhousie University.

About Me

Who Am I?

Hi, I'm Bhuvaneshwari. I am a research assistant at Dalhousie University.

I am a Data Science researcher with six years of industry experience and an exciting mixture of software engineering, machine learning, deep learning, data visualization, and business analysis expertise.

In my professional experience, I have worked on end-to-end analytics projects that involved Data Analysis, Data Engineering, Data Visualization, Data Modeling, Machine Learning Model Deployment, Building Deep Neural Networks for NLP, System Design, and Analytics Framework Development for solving business problems.

I am currently working as an NLP Research Assistant at Dalhousie University, creating a robust and novel approach for interactive clustering, classification problems, and data visualization.

Also, as a Research Assistant, I have built custom NLP models for free text, as well as match-making software that uses sentiment analysis to identify sensitive and hidden information in unstructured text records.

Programming: Python (NumPy, pandas, NLTK, scikit-learn, Matplotlib, Gensim, Hugging Face, Flask, Dash), SQL, Power BI.

DevOps & Misc Tools: PyCharm, Jupyter Notebook, Git, Jira, Jenkins, Slack, Trello, Bitbucket, Airflow, and TIBCO.

Machine Learning: Regression Modeling, Decision Trees, Random Forest, kNN Classifier, K-means Clustering, Dimensionality Reduction (PCA, t-SNE, UMAP), Feature Extraction, Natural Language Processing (Text Analytics), Convolutional Neural Networks, Information Retrieval.

What Do I Do?

Here are some of my overall skills

Research

I have experience in research and in designing and developing ML solutions for real-world problems, using both traditional and cutting-edge techniques.

Data

Excellent knowledge of data exploration, data collection, data cleaning, data integration, data manipulation, generalization, and model evaluation.

Skills

Knowing that technology evolves fast has made me passionate about learning new concepts and skills, and about implementing novel ideas.

My Specialty

My Technical Skills

My skills are divided into three sections: Professional, Intermediate, and Familiar. The skills in the Professional section are the tools that I'm working with regularly.

Professional

Programming Languages: Python
Python Libraries: PyTorch, Keras, MLflow, Hugging Face Transformers, spaCy, NLTK, NumPy, scikit-learn, pandas, Matplotlib, Jupyter notebooks
Databases (DBMS): MySQL, SQL
Version Control: Git, GitHub, GitLab
Operating Systems: Windows, Linux
Writing Tools: Microsoft Office, LaTeX
Workflow: Agile Development & Scrum
Others: Slack, Trello

Intermediate

Programming Languages: Java, C, C++, PHP (CodeIgniter, Laravel), HTML, CSS, JavaScript (Bootstrap), MASM Assembly
Python Libraries: pytest, py2neo, Selenium, PyMongo
Cloud: Docker
Programming Platforms: Android
Databases (DBMS): PostgreSQL, SQL Server (familiar)
Databases (NoSQL): Neo4j, MongoDB
Graphic Design: Adobe Fireworks, Adobe Photoshop, Camtasia
Others: Jibble

Familiar

Programming Languages: MATLAB, R
Ontology: SPARQL, Protégé
HDL: Verilog

Education

M.S. Computer Science January 2021 – April 2023

Dalhousie University, Halifax, Canada.
GPA: 4.00 (out of 4.3), 12 credits
Thesis title: "Personalized Topic Modelling on Domain-Specific document collections"
Courses:

  • Advanced topics in NLP: A-
  • Deep Learning: A
  • Machine Learning: A+
  • Visual Analytics: A

Experience

Work Experience

Software Engineer Oct 2016 – Feb 2021

TCS, Chennai, India (Full-time/Remote)

  • Performed research on hospital clinical reports
  • Worked across development and research teams
  • Contributed to the de-identification development and research stack
  • Researched named entity recognition and measurement extraction using rule-based methods and SOTA language models

NLP Research Assistant Mar 2019 – Apr 2021

Dalhousie University, Halifax, Canada

  • Used deep language models to solve downstream NLP tasks.
  • Worked on both individual and team projects.
  • Leveraged my knowledge of Python to develop a CLI tool for multi-task learning.

Data Analyst Aug 2018 – Feb 2019

Tata Consultancy Services, Chennai, India
This company is mostly focused on software development and R&D projects.
Detailed achievements and working experience:

  • Part of the social network analysis team; mostly worked with Python, Neo4j, and MongoDB (via the py2neo and PyMongo libraries)
  • Worked within Scrum (a software development framework)

Application Support Engineer Jun 2015 – Aug 2016

TCS, Chennai, India
This company is mostly focused on programming boards (Raspberry Pi) and developing IoT projects.

  • Started as an intern and was then selected to lead the training sessions.
  • Topics included: UNIX vs Linux, file manipulation in the terminal, kernel description, file systems, shell scripting, system administration and network basics, and version control.

Experience

Teaching Experience

Databases Lab Assistant Feb 2017 – May 2017

Pondicherry University, Pondicherry, India

  • Facilitated tutorial classes.
  • Presented lectures on PostgreSQL
  • Held office hours to help students with Psycopg, SQLAlchemy and Django.

My Work

Projects

Query-focused Extractive Summarization using pre-trained models January 2020 - April 2020

  • Course Project for Natural Language Processing (NLP) Course
  • Team Project (collaboration with a PhD student)
  • Used different scoring functions to extract the most important sentences of a document
  • Download Project Report
Abstract: In the process of writing a research paper, researchers often spend a lot of time organizing and summarizing previous work related to the research. To help with this problem, the proposal of this project is to use Query-focused Extractive Summarization algorithms to produce relevant highlights in related research. For this project, the problem of insufficient labeled data was solved by using pre-trained models such as BERT and BioBERT to produce accurate representations of words. To measure the validity of the approaches, they were applied to the BioASQ dataset of medical articles and obtained results consistent with each other using Cosine similarity and Euclidean distance, each with several pre-trained models. One of the challenges of this project was that producing the embeddings of a lot of sentences with a pre-trained model is a very time-consuming task, so a scalable tool was developed to efficiently compute token embeddings for a variety of pre-trained models.
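
The scoring step mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration and not the project's code: the small hand-written vectors stand in for BERT or BioBERT sentence embeddings, and the function names `cosine_similarity`, `euclidean_distance`, and `rank_sentences` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as unrelated
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_sentences(query_vec, sentence_vecs, top_k=2, metric="cosine"):
    """Return the indices of the top_k sentences closest to the query."""
    indices = range(len(sentence_vecs))
    if metric == "cosine":
        # Higher similarity means more relevant.
        scores = [cosine_similarity(query_vec, v) for v in sentence_vecs]
        order = sorted(indices, key=lambda i: scores[i], reverse=True)
    elif metric == "euclidean":
        # Lower distance means more relevant.
        scores = [euclidean_distance(query_vec, v) for v in sentence_vecs]
        order = sorted(indices, key=lambda i: scores[i])
    else:
        raise ValueError(f"unknown metric: {metric}")
    return order[:top_k]


# Toy 3-dimensional vectors standing in for real embeddings (assumed values).
query = [1.0, 0.0, 1.0]
sentences = [
    [0.9, 0.1, 1.1],  # points in nearly the same direction as the query
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [1.0, 0.0, 0.8],  # also close to the query
]

cosine_top = rank_sentences(query, sentences, top_k=2, metric="cosine")
euclidean_top = rank_sentences(query, sentences, top_k=2, metric="euclidean")
print(cosine_top, euclidean_top)
```

Both metrics select the same two sentences here and leave out the unrelated one. The order within the selection can differ, since cosine similarity ignores vector length while Euclidean distance does not.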

Classification Of Imbalanced Dataset Using BERT Embeddings May 2019 - Aug 2019

  • Course Project for Deep Learning Course
  • Team Project (collaboration with 2 MCS students)
  • Used BERT embeddings to classify the type of harassment of each tweet in an imbalanced Twitter dataset.
  • Download Project Report
Abstract: Online harassment is becoming prevalent as a specific type of communication on Twitter. Considering the huge amount of user-generated tweets each day, the problem of detecting and possibly limiting these contents automatically in real-time is becoming a fundamental problem. But often real-world datasets are imbalanced, consisting predominantly of “normal” examples and far fewer “abnormal” ones, which causes the learning algorithm to simply generate a trivial classifier that classifies every example as the majority class. To tackle this problem, we use SMOTE to oversample the embeddings of the minority classes, where the embeddings are obtained from the BERT pretrained language model. Finally, we use these oversampled embeddings to train our bi-directional LSTM classifier model to categorize the tweets into four classes: non-harassment, sexual harassment, physical harassment and indirect harassment. Our experiments show that using SMOTE on the top layer representations of BERT improves the F1 score significantly more than merely adjusting the class weights.
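
The oversampling idea in the abstract can be illustrated with a simplified sketch of SMOTE-style interpolation. This is not the project's code, and it covers only the oversampling step, not BERT or the LSTM classifier. The function name `smote_oversample` and the 2-dimensional toy points are hypothetical stand-ins for minority-class embeddings; a real pipeline would more likely rely on an existing library implementation.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority points, nearest to the base first.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy 2-dimensional "embeddings" of a minority class (assumed values).
minority_embeddings = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_oversample(minority_embeddings, n_new=6)
print(len(new_points))
```

Each synthetic point lies on a line segment between two existing minority samples, so the new points stay inside the region the minority class already occupies.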

Visual Analysis Of Harassment Classification In Twitter May 2019 - Aug 2019

  • Course Project for Deep Learning Course
  • Team Project (Collaboration with 2 MCS students)
  • Developed a visualization system using D3 to demonstrate the semantic similarity of words.
  • Download Project Report
  • Although we used the same dataset for this project and the deep learning course, the methodologies and purposes of the projects are completely different.
Abstract: In our project, we develop a Deep Neural Network model to classify the tweets based on four classes: non-harassment, sexual harassment, physical harassment and indirect harassment. We then use Deep SHAP, a unified approach that explains the output of any machine learning model, to interpret the predictions of our classifier. We demonstrate the semantic similarity of words in the tweet by visualizing the embeddings in low dimensions using t-SNE. To provide a quick, high-level overview of the similarities and anomalies between the categories, the final predictions of the model are summarized into different styles of TreeMaps.

Custom project Dec 2017

Description: This project is designed to assist people who work in large kitchens, such as those of grand hotels, where temperature and heat are critical. Such kitchens have giant refrigerators, freezers, a large cooking area, and several storage rooms to look after. To make sure that each section stays in an appropriate condition and the systems are functioning correctly, the project monitors the environment to prevent harm to the workers. A temperature and humidity sensor is placed in the kitchen. Its readings are sent through a Wi-Fi module to a web server, which stores the data. Users can then view this information on the project website. The website also lets the user set thresholds that the kitchen temperature should not go above or below. If a reading falls outside the given thresholds, the user is immediately notified by email. The history of the kitchen's temperature is also available to view at any time on the website.
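
The threshold check described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the project's code: the names `check_reading` and `find_alerts`, the sample readings, and the limits are all hypothetical, and sending the email is left out.

```python
def check_reading(value, low, high):
    """Return an alert message if value is outside [low, high], else None."""
    if value < low:
        return f"Temperature {value:.1f} is below the minimum of {low:.1f}"
    if value > high:
        return f"Temperature {value:.1f} is above the maximum of {high:.1f}"
    return None


def find_alerts(readings, low, high):
    """Collect (timestamp, message) pairs for every out-of-range reading."""
    alerts = []
    for timestamp, value in readings:
        message = check_reading(value, low, high)
        if message is not None:
            alerts.append((timestamp, message))
    return alerts


# Hypothetical refrigerator readings as (time, degrees Celsius) pairs.
readings = [("08:00", 4.0), ("08:10", 9.5), ("08:20", 5.1), ("08:30", 1.2)]
alerts = find_alerts(readings, low=2.0, high=8.0)
for timestamp, message in alerts:
    print(timestamp, message)
```

In a full system, each alert would trigger the email notification, and the stored readings would also feed the temperature history shown on the website.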

Client and server project Jul 2017

  • Course project for Microprocessor Lab
  • Introduced to serial programming and to sending packets from client to server and vice versa.
  • Developed using Python (server) and C (client).

File manipulation simulation Jul 2015

  • Course project for Assembly.
  • Designed to work with system calls.
  • Developed using assembly 8086 - 32 bit.

CMD simulator Jan 2013

  • Course project for Fundamentals of Computer Programming.
  • Got familiar with advanced programming skills in Python.
  • Developed a simulation of the commands of CMD.

Presentations

Poster presentation Nov 2019

Basquarane F., Milios E., Matwin S. (2019, Nov). "Biomedical sentence-level and document-level representation learning". Poster session presented at the Canadian computing conference for Women in Technology, Mississauga, ON.

Awards

Awarded Best Presented Project in the researchNS and Volta Data Student Challenge Sep 2020

The Culture of Respect Committee in Computer Science (CoReCS) of Dalhousie Faculty of Computer Science awarded me the support to attend the virtual Grace Hopper Celebration (vGHC) 2020.

2nd place in Diversity and inclusion hackathon Feb 2021

Organizer: ShiftKey Labs and Faculty of Computer Science (Dalhousie University)
Our team won second place for presenting Real Align. Real Align is software that creates a safe engagement platform focused on peer-to-peer support, where people with disabilities can connect with each other.

Awarded the support for attending CAN-CWiC 2019 Nov 2022

I was awarded support by the Culture of Respect Committee in Computer Science (CoReCS) of the Dalhousie Faculty of Computer Science to attend and present a poster at the ACM Canadian Celebration of Women in Computing (CAN-CWiC).

Star of the year award Nov 2019

Nielsen.
Secured the Star of the Year Award for my work in Cloud Migration in 2020 from Nielsen Co.

Read

Recent Blog

Coming soon...

Get in Touch

Contact

Halifax, NS, Canada