As part of my MSc AI project in collaboration with Narosu Co., Ltd., I am developing an on-premise AI-powered product recommendation system tailored for independent online store operators. While large marketplaces benefit from advanced AI models, smaller vendors often lack such capabilities.
This project bridges that gap by delivering a localized, privacy-preserving, conversational AI chatbot that recommends products based on semantic understanding of customer input.
text-embedding-ada-002 (OpenAI); Qdrant; LLaMA 3.1 8B Instant; LangChain for RAG pipeline; Gradio; Gradio + FastAPI
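The core idea of recommending by semantic similarity can be illustrated with a minimal sketch. It is not the project's code: the three-dimensional vectors below are hypothetical stand-ins for real embedding output, the product names are invented, and a plain in-memory dictionary takes the place of the vector database. It only shows how a query vector is compared against product vectors and the closest matches are returned.

```python
import math

# Hypothetical toy "embeddings"; a real system would obtain these from an
# embedding model and store them in a vector database.
CATALOG = {
    "trail running shoes": [0.9, 0.1, 0.0],
    "waterproof hiking boots": [0.7, 0.3, 0.1],
    "stainless steel kettle": [0.0, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recommend(query_vector, catalog, top_k=2):
    """Return the names of the top_k products most similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), name)
        for name, vector in catalog.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Hypothetical embedding of a customer message such as "shoes for jogging".
query = [0.8, 0.2, 0.0]
print(recommend(query, CATALOG))
```

In a retrieval-augmented pipeline, the retrieved product entries would then be passed to the language model as context for generating the conversational reply.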
This project introduces two new Visual Studio Code (VSCode) extensions designed to enhance code completion capabilities for Microsoft Small Basic and C, addressing the limitations of traditional and AI-based tools like Copilot. Unlike standard VSCode extensions that rely on user-typed prefixes or language-specific grammar assumptions, the proposed extensions provide syntax-structure-based code suggestions without requiring initial input, thereby aiding users unfamiliar with programming grammar. Built using the YAPB parser builder tool and LR grammars, the system uses WithinTop3Guide to present the top three context-aware suggestions. It also features an interactive preview of candidates—displaying both identifier names and decomposable expressions—to improve comprehension and usability for both novice and advanced users. The extensions aim to provide grammar-compliant, generative AI-driven code completions and are fully open-source on GitHub.
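The idea of offering three ranked suggestions for a parser state, with no typed prefix required, can be sketched as follows. This is an assumption-laden illustration in Python, not the extensions' implementation and not the YAPB API: the state numbers, the candidate structures, and the frequency counts are all hypothetical, and ranking by how often a structure followed a state in sample programs is just one plausible scoring choice.

```python
from collections import Counter

# Hypothetical statistics: for each LR parser state, how often each syntactic
# structure followed that state in a collection of sample programs.
CANDIDATES_BY_STATE = {
    0: Counter({
        "ID = Expr": 120,
        "TextWindow.WriteLine(Expr)": 95,
        "If Expr Then": 40,
        "For ID = Expr To Expr": 25,
    }),
    7: Counter({"EndIf": 30, "Else": 12}),
}


def top_candidates(state, table, limit=3):
    """Return up to `limit` structures for a parser state, most frequent first."""
    counts = table.get(state, Counter())
    return [structure for structure, _ in counts.most_common(limit)]


print(top_candidates(0, CANDIDATES_BY_STATE))
```

Because the lookup key is the parser state rather than a typed prefix, suggestions are available even at an empty cursor position.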
The Cardio Vas. Disease Detection project aims to develop a machine learning model in Python that predicts the presence or absence of cardiovascular disease (CVD) from a set of input features. The project compares two tree-based algorithms, Decision Trees and Random Forest, for CVD prediction. A single decision tree is easy to interpret but prone to overfitting, whereas a random forest combines many randomized trees, which usually improves accuracy at the cost of interpretability and training time. The project uses a dataset of clinical and demographic features of individuals, such as age, gender, blood pressure, cholesterol levels, and smoking habits. Each instance in the dataset is labeled as either having a cardiovascular disease or being disease-free.
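A comparison of this kind could be set up roughly as below. This is a sketch under stated assumptions: it uses scikit-learn, synthetic data from `make_classification` in place of the project's clinical dataset, and 5-fold cross-validated accuracy as the metric. None of these choices is confirmed by the project description.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary-classification data; NOT the project's clinical dataset.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)

models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean accuracy over 5 cross-validation folds for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Evaluating both models on the same folds keeps the comparison fair, since each model sees identical training and validation splits.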
The Cleveland Heart Disease dataset contains a wide range of patient attributes, including clinical, demographic, and physiological features such as age, gender, cholesterol levels, blood pressure, and electrocardiogram measurements. Each instance in the dataset is labeled as either having heart disease or being disease-free. In this project, various machine learning algorithms will be applied to the dataset, including but not limited to decision trees, random forest, support vector machines (SVM), logistic regression, and neural networks. These algorithms offer different strengths and weaknesses, and by evaluating their performance, we can identify the most effective approach for heart disease prediction.
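To compare several classifiers on a binary disease label, their predictions need to be scored with the same metrics. The sketch below computes accuracy, precision, recall, and F1 from true and predicted labels in plain Python. The label lists are made-up illustration values, not results from the Cleveland dataset, and the choice of metrics is an assumption rather than something the project states.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def evaluate(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 as a dictionary."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels for illustration only (1 = disease, 0 = disease-free).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
print(evaluate(y_true, y_pred))
```

For a screening task, recall is often watched closely because a false negative means a missed case, so accuracy alone may not identify the most suitable model.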
The project utilizes NLP techniques to analyze the text content of social media posts and classify them as either cyberbullying or non-cyberbullying. NLP encompasses a range of methods and algorithms that enable computers to understand and process human language. By applying techniques such as text preprocessing, sentiment analysis, part-of-speech tagging, and machine learning, the project aims to extract meaningful features from text data and develop a robust model for cyberbullying detection. The development of the model involves several steps. Firstly, a comprehensive dataset of social media posts, labeled as either cyberbullying or non-cyberbullying, is collected and prepared for analysis. The dataset may include various forms of text data, such as tweets, comments, or forum posts, from different social media platforms. Next, the collected data is preprocessed to remove noise, handle punctuation, and transform the text into a format suitable for analysis. NLP techniques, such as tokenization, stemming, and
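The preprocessing steps named above (noise removal, punctuation handling, tokenization, and stemming) can be sketched with the standard library alone. This is an illustrative assumption, not the project's pipeline: the regular expressions, the example post, and the crude suffix-stripping stemmer are all hypothetical, and a real system would more likely use an established stemmer from an NLP library.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")
MENTION_PATTERN = re.compile(r"@\w+")
NON_LETTER_PATTERN = re.compile(r"[^a-z\s]")

# Crude, illustrative suffix list; not a substitute for a real stemmer.
SUFFIXES = ("ing", "ed", "ly", "s")


def crude_stem(token):
    """Strip the first matching suffix if at least 3 characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(post):
    """Lowercase, remove URLs, mentions, and punctuation, tokenize, stem."""
    text = post.lower()
    text = URL_PATTERN.sub(" ", text)
    text = MENTION_PATTERN.sub(" ", text)
    text = NON_LETTER_PATTERN.sub(" ", text)
    tokens = text.split()  # whitespace tokenization
    return [crude_stem(token) for token in tokens]


# Hypothetical example post.
print(preprocess("@sam You are SO annoying!!! Stop posting https://example.com"))
```

The resulting token lists would then be converted to numeric features before being passed to a classifier.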
The Iris flower dataset consists of measurements of sepal length, sepal width, petal length, and petal width for three different species of Iris flowers: Setosa, Versicolor, and Virginica. The goal of the project is to develop a deep learning model that can analyze these feature measurements and accurately predict the corresponding Iris flower species. In this project, a deep learning model will be designed and trained using the Iris flower dataset. The model will be built using popular deep learning frameworks such as TensorFlow or PyTorch, which provide a wide range of tools and functionalities for constructing and training neural networks.
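The shape of such a model, four input features mapped to three class probabilities, can be sketched as a small one-hidden-layer network. To stay self-contained, this sketch uses plain NumPy rather than TensorFlow or PyTorch, and it trains on synthetic clusters rather than the actual Iris measurements. The layer size, learning rate, and epoch count are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: 3 well-separated clusters, 4 features each.
# These are NOT real Iris measurements.
centers = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [3.0, 3.0, 3.0, 3.0],
    [-3.0, 3.0, -3.0, 3.0],
])
X = np.vstack([c + 0.5 * rng.standard_normal((30, 4)) for c in centers])
y = np.repeat(np.arange(3), 30)
Y = np.eye(3)[y]  # one-hot encoded labels

# One hidden layer with 8 tanh units, then a 3-way softmax output.
W1 = 0.1 * rng.standard_normal((4, 8))
b1 = np.zeros(8)
W2 = 0.1 * rng.standard_normal((8, 3))
b2 = np.zeros(3)


def forward(inputs):
    """Return hidden activations and softmax class probabilities."""
    hidden = np.tanh(inputs @ W1 + b1)
    logits = hidden @ W2 + b2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    return hidden, probs


def cross_entropy(probs, targets):
    """Mean cross-entropy between predicted probabilities and one-hot targets."""
    return -np.mean(np.sum(targets * np.log(probs + 1e-12), axis=1))


learning_rate = 0.1
losses = []
for epoch in range(1000):
    hidden, probs = forward(X)
    losses.append(cross_entropy(probs, Y))

    # Backpropagation for softmax + cross-entropy and a tanh hidden layer.
    grad_logits = (probs - Y) / len(X)
    grad_W2 = hidden.T @ grad_logits
    grad_b2 = grad_logits.sum(axis=0)
    grad_pre = (grad_logits @ W2.T) * (1.0 - hidden ** 2)
    grad_W1 = X.T @ grad_pre
    grad_b1 = grad_pre.sum(axis=0)

    # Full-batch gradient descent update.
    W1 -= learning_rate * grad_W1
    b1 -= learning_rate * grad_b1
    W2 -= learning_rate * grad_W2
    b2 -= learning_rate * grad_b2

_, probs = forward(X)
accuracy = (probs.argmax(axis=1) == y).mean()
print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}, "
      f"training accuracy {accuracy:.2f}")
```

A framework such as TensorFlow or PyTorch would replace the hand-written gradients with automatic differentiation, and a held-out test split would be needed to judge how well the model generalizes.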